Home | << 1 2 3 4 >> |
Records | |||||
---|---|---|---|---|---|
Author | Carola Figueroa Flores; Abel Gonzalez-Garcia; Joost Van de Weijer; Bogdan Raducanu | ||||
Title | Saliency for fine-grained object recognition in domains with scarce training data | Type | Journal Article | ||
Year | 2019 | Publication | Pattern Recognition | Abbreviated Journal | PR |
Volume | 94 | Issue | Pages | 62-73 | |
Keywords | |||||
Abstract | This paper investigates the role of saliency to improve the classification accuracy of a Convolutional Neural Network (CNN) for the case when scarce training data is available. Our approach consists in adding a saliency branch to an existing CNN architecture which is used to modulate the standard bottom-up visual features from the original image input, acting as an attentional mechanism that guides the feature extraction process. The main aim of the proposed approach is to enable the effective training of a fine-grained recognition model with limited training samples and to improve the performance on the task, thereby alleviating the need to annotate a large dataset. The vast majority of saliency methods are evaluated on their ability to generate saliency maps, and not on their functionality in a complete vision pipeline. Our proposed pipeline allows to evaluate saliency methods for the high-level task of object recognition. We perform extensive experiments on various fine-grained datasets (Flowers, Birds, Cars, and Dogs) under different conditions and show that saliency can considerably improve the network’s performance, especially for the case of scarce training data. Furthermore, our experiments show that saliency methods that obtain improved saliency maps (as measured by traditional saliency benchmarks) also translate to saliency methods that yield improved performance gains when applied in an object recognition pipeline. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | LAMP; 600.109; 600.141; 600.120 | Approved | no | ||
Call Number | Admin @ si @ FGW2019 | Serial | 3264 | ||
Permanent link to this record | |||||
Author | Jon Almazan; Albert Gordo; Alicia Fornes; Ernest Valveny | ||||
Title | Segmentation-free Word Spotting with Exemplar SVMs | Type | Journal Article | ||
Year | 2014 | Publication | Pattern Recognition | Abbreviated Journal | PR |
Volume | 47 | Issue | 12 | Pages | 3967–3978 |
Keywords | Word spotting; Segmentation-free; Unsupervised learning; Reranking; Query expansion; Compression | ||||
Abstract | In this paper we propose an unsupervised segmentation-free method for word spotting in document images. Documents are represented with a grid of HOG descriptors, and a sliding-window approach is used to locate the document regions that are most similar to the query. We use the Exemplar SVM framework to produce a better representation of the query in an unsupervised way. Then, we use a more discriminative representation based on Fisher Vector to rerank the best regions retrieved, and the most promising ones are used to expand the Exemplar SVM training set and improve the query representation. Finally, the document descriptors are precomputed and compressed with Product Quantization. This offers two advantages: first, a large number of documents can be kept in RAM memory at the same time. Second, the sliding window becomes significantly faster since distances between quantized HOG descriptors can be precomputed. Our results significantly outperform other segmentation-free methods in the literature, both in accuracy and in speed and memory usage. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG; 600.045; 600.056; 600.061; 602.006; 600.077 | Approved | no | ||
Call Number | Admin @ si @ AGF2014b | Serial | 2485 | ||
Permanent link to this record | |||||
Author | Mario Hernandez; Joao Sanchez; Jordi Vitria | ||||
Title | Selected papers from Iberian Conference on Pattern Recognition and Image Analysis | Type | Book Whole | ||
Year | 2012 | Publication | Pattern Recognition | Abbreviated Journal | |
Volume | 45 | Issue | 9 | Pages | 3047-3582 |
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0031-3203 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | OR;MV | Approved | no | ||
Call Number | Admin @ si @ HSV2012 | Serial | 2069 | ||
Permanent link to this record | |||||
Author | Parichehr Behjati; Pau Rodriguez; Carles Fernandez; Isabelle Hupont; Armin Mehri; Jordi Gonzalez | ||||
Title | Single image super-resolution based on directional variance attention network | Type | Journal Article | ||
Year | 2023 | Publication | Pattern Recognition | Abbreviated Journal | PR |
Volume | 133 | Issue | Pages | 108997 | |
Keywords | |||||
Abstract | Recent advances in single image super-resolution (SISR) explore the power of deep convolutional neural networks (CNNs) to achieve better performance. However, most of the progress has been made by scaling CNN architectures, which usually raise computational demands and memory consumption. This makes modern architectures less applicable in practice. In addition, most CNN-based SR methods do not fully utilize the informative hierarchical features that are helpful for final image recovery. In order to address these issues, we propose a directional variance attention network (DiVANet), a computationally efficient yet accurate network for SISR. Specifically, we introduce a novel directional variance attention (DiVA) mechanism to capture long-range spatial dependencies and exploit inter-channel dependencies simultaneously for more discriminative representations. Furthermore, we propose a residual attention feature group (RAFG) for parallelizing attention and residual block computation. The output of each residual block is linearly fused at the RAFG output to provide access to the whole feature hierarchy. In parallel, DiVA extracts most relevant features from the network for improving the final output and preventing information loss along the successive operations inside the network. Experimental results demonstrate the superiority of DiVANet over the state of the art in several datasets, while maintaining relatively low computation and memory footprint. The code is available at https://github.com/pbehjatii/DiVANet. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ISE | Approved | no | ||
Call Number | Admin @ si @ BPF2023 | Serial | 3861 | ||
Permanent link to this record | |||||
Author | Meysam Madadi; Hugo Bertiche; Sergio Escalera | ||||
Title | SMPLR: Deep learning based SMPL reverse for 3D human pose and shape recovery | Type | Journal Article | ||
Year | 2020 | Publication | Pattern Recognition | Abbreviated Journal | PR |
Volume | 106 | Issue | Pages | 107472 | |
Keywords | Deep learning; 3D Human pose; Body shape; SMPL; Denoising autoencoder; Volumetric stack hourglass | ||||
Abstract | In this paper we propose to embed SMPL within a deep-based model to accurately estimate 3D pose and shape from a still RGB image. We use CNN-based 3D joint predictions as an intermediate representation to regress SMPL pose and shape parameters. Later, 3D joints are reconstructed again in the SMPL output. This module can be seen as an autoencoder where the encoder is a deep neural network and the decoder is SMPL model. We refer to this as SMPL reverse (SMPLR). By implementing SMPLR as an encoder-decoder we avoid the need of complex constraints on pose and shape. Furthermore, given that in-the-wild datasets usually lack accurate 3D annotations, it is desirable to lift 2D joints to 3D without pairing 3D annotations with RGB images. Therefore, we also propose a denoising autoencoder (DAE) module between CNN and SMPLR, able to lift 2D joints to 3D and partially recover from structured error. We evaluate our method on SURREAL and Human3.6M datasets, showing improvement over SMPL-based state-of-the-art alternatives by about 4 and 12 mm, respectively. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HuPBA; no proj | Approved | no | ||
Call Number | Admin @ si @ MBE2020 | Serial | 3439 | ||
Permanent link to this record | |||||
Author | Debora Gil; Aura Hernandez-Sabate; Mireia Brunat;Steven Jansen; Jordi Martinez-Vilalta | ||||
Title | Structure-preserving smoothing of biomedical images | Type | Journal Article | ||
Year | 2011 | Publication | Pattern Recognition | Abbreviated Journal | PR |
Volume | 44 | Issue | 9 | Pages | 1842-1851 |
Keywords | Non-linear smoothing; Differential geometry; Anatomical structures; segmentation; Cardiac magnetic resonance; Computerized tomography | ||||
Abstract | Smoothing of biomedical images should preserve gray-level transitions between adjacent tissues, while restoring contours consistent with anatomical structures. Anisotropic diffusion operators are based on image appearance discontinuities (either local or contextual) and might fail at weak inter-tissue transitions. Meanwhile, the output of block-wise and morphological operations is prone to present a block structure due to the shape and size of the considered pixel neighborhood. In this contribution, we use differential geometry concepts to define a diffusion operator that restricts to image consistent level-sets. In this manner, the final state is a non-uniform intensity image presenting homogeneous inter-tissue transitions along anatomical structures, while smoothing intra-structure texture. Experiments on different types of medical images (magnetic resonance, computerized tomography) illustrate its benefit on a further process (such as segmentation) of images. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0031-3203 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | IAM; ADAS | Approved | no | ||
Call Number | IAM @ iam @ GHB2011 | Serial | 1526 | ||
Permanent link to this record | |||||
Author | Pau Riba; Lutz Goldmann; Oriol Ramos Terrades; Diede Rusticus; Alicia Fornes; Josep Llados | ||||
Title | Table detection in business document images by message passing networks | Type | Journal Article | ||
Year | 2022 | Publication | Pattern Recognition | Abbreviated Journal | PR |
Volume | 127 | Issue | Pages | 108641 | |
Keywords | |||||
Abstract | Tabular structures in business documents offer a complementary dimension to the raw textual data. For instance, there is information about the relationships among pieces of information. Nowadays, digital mailroom applications have become a key service for workflow automation. Therefore, the detection and interpretation of tables is crucial. With the recent advances in information extraction, table detection and recognition has gained interest in document image analysis, in particular, with the absence of rule lines and unknown information about rows and columns. However, business documents usually contain sensitive contents limiting the amount of public benchmarking datasets. In this paper, we propose a graph-based approach for detecting tables in document images which do not require the raw content of the document. Hence, the sensitive content can be previously removed and, instead of using the raw image or textual content, we propose a purely structural approach to keep sensitive data anonymous. Our framework uses graph neural networks (GNNs) to describe the local repetitive structures that constitute a table. In particular, our main application domain are business documents. We have carefully validated our approach in two invoice datasets and a modern document benchmark. Our experiments demonstrate that tables can be detected by purely structural approaches. | ||||
Address | July 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Elsevier | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG; 600.162; 600.121 | Approved | no | ||
Call Number | Admin @ si @ RGR2022 | Serial | 3729 | ||
Permanent link to this record | |||||
Author | Susana Alvarez; Maria Vanrell | ||||
Title | Texton theory revisited: a bag-of-words approach to combine textons | Type | Journal Article | ||
Year | 2012 | Publication | Pattern Recognition | Abbreviated Journal | PR |
Volume | 45 | Issue | 12 | Pages | 4312-4325 |
Keywords | |||||
Abstract | The aim of this paper is to revisit an old theory of texture perception and
update its computational implementation by extending it to colour. With this in mind we try to capture the optimality of perceptual systems. This is achieved in the proposed approach by sharing well-known early stages of the visual processes and extracting low-dimensional features that perfectly encode adequate properties for a large variety of textures without needing further learning stages. We propose several descriptors in a bag-of-words framework that are derived from different quantisation models on to the feature spaces. Our perceptual features are directly given by the shape and colour attributes of image blobs, which are the textons. In this way we avoid learning visual words and directly build the vocabularies on these lowdimensionaltexton spaces. Main differences between proposed descriptors rely on how co-occurrence of blob attributes is represented in the vocabularies. Our approach overcomes current state-of-art in colour texture description which is proved in several experiments on large texture datasets. |
||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0031-3203 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | CIC | Approved | no | ||
Call Number | Admin @ si @ AlV2012a | Serial | 2130 | ||
Permanent link to this record | |||||
Author | Lluis Gomez; Dimosthenis Karatzas | ||||
Title | TextProposals: a Text‐specific Selective Search Algorithm for Word Spotting in the Wild | Type | Journal Article | ||
Year | 2017 | Publication | Pattern Recognition | Abbreviated Journal | PR |
Volume | 70 | Issue | Pages | 60-74 | |
Keywords | |||||
Abstract | Motivated by the success of powerful while expensive techniques to recognize words in a holistic way (Goel et al., 2013; Almazán et al., 2014; Jaderberg et al., 2016) object proposals techniques emerge as an alternative to the traditional text detectors. In this paper we introduce a novel object proposals method that is specifically designed for text. We rely on a similarity based region grouping algorithm that generates a hierarchy of word hypotheses. Over the nodes of this hierarchy it is possible to apply a holistic word recognition method in an efficient way.
Our experiments demonstrate that the presented method is superior in its ability of producing good quality word proposals when compared with class-independent algorithms. We show impressive recall rates with a few thousand proposals in different standard benchmarks, including focused or incidental text datasets, and multi-language scenarios. Moreover, the combination of our object proposals with existing whole-word recognizers (Almazán et al., 2014; Jaderberg et al., 2016) shows competitive performance in end-to-end word spotting, and, in some benchmarks, outperforms previously published results. Concretely, in the challenging ICDAR2015 Incidental Text dataset, we overcome in more than 10% F-score the best-performing method in the last ICDAR Robust Reading Competition (Karatzas, 2015). Source code of the complete end-to-end system is available at https://github.com/lluisgomez/TextProposals. |
||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG; 600.084; 601.197; 600.121; 600.129 | Approved | no | ||
Call Number | Admin @ si @ GoK2017 | Serial | 2886 | ||
Permanent link to this record | |||||
Author | Veronica Romero; Alicia Fornes; Nicolas Serrano; Joan Andreu Sanchez; A.H. Toselli; Volkmar Frinken; E. Vidal; Josep Llados | ||||
Title | The ESPOSALLES database: An ancient marriage license corpus for off-line handwriting recognition | Type | Journal Article | ||
Year | 2013 | Publication | Pattern Recognition | Abbreviated Journal | PR |
Volume | 46 | Issue | 6 | Pages | 1658-1669 |
Keywords | |||||
Abstract | Historical records of daily activities provide intriguing insights into the life of our ancestors, useful for demography studies and genealogical research. Automatic processing of historical documents, however, has mostly been focused on single works of literature and less on social records, which tend to have a distinct layout, structure, and vocabulary. Such information is usually collected by expert demographers that devote a lot of time to manually transcribe them. This paper presents a new database, compiled from a marriage license books collection, to support research in automatic handwriting recognition for historical documents containing social records. Marriage license books are documents that were used for centuries by ecclesiastical institutions to register marriage licenses. Books from this collection are handwritten and span nearly half a millennium until the beginning of the 20th century. In addition, a study is presented about the capability of state-of-the-art handwritten text recognition systems, when applied to the presented database. Baseline results are reported for reference in future studies. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Elsevier Science Inc. New York, NY, USA | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0031-3203 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | DAG; 600.045; 602.006; 605.203 | Approved | no | ||
Call Number | Admin @ si @ RFS2013 | Serial | 2298 | ||
Permanent link to this record | |||||
Author | Estefania Talavera; Carolin Wuerich; Nicolai Petkov; Petia Radeva | ||||
Title | Topic modelling for routine discovery from egocentric photo-streams | Type | Journal Article | ||
Year | 2020 | Publication | Pattern Recognition | Abbreviated Journal | PR |
Volume | 104 | Issue | Pages | 107330 | |
Keywords | Routine; Egocentric vision; Lifestyle; Behaviour analysis; Topic modelling | ||||
Abstract | Developing tools to understand and visualize lifestyle is of high interest when addressing the improvement of habits and well-being of people. Routine, defined as the usual things that a person does daily, helps describe the individuals’ lifestyle. With this paper, we are the first ones to address the development of novel tools for automatic discovery of routine days of an individual from his/her egocentric images. In the proposed model, sequences of images are firstly characterized by semantic labels detected by pre-trained CNNs. Then, these features are organized in temporal-semantic documents to later be embedded into a topic models space. Finally, Dynamic-Time-Warping and Spectral-Clustering methods are used for final day routine/non-routine discrimination. Moreover, we introduce a new EgoRoutine-dataset, a collection of 104 egocentric days with more than 100.000 images recorded by 7 users. Results show that routine can be discovered and behavioural patterns can be observed. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB; no proj | Approved | no | ||
Call Number | Admin @ si @ TWP2020 | Serial | 3435 | ||
Permanent link to this record | |||||
Author | Jorge Bernal; F. Javier Sanchez; Fernando Vilariño | ||||
Title | Towards Automatic Polyp Detection with a Polyp Appearance Model | Type | Journal Article | ||
Year | 2012 | Publication | Pattern Recognition | Abbreviated Journal | PR |
Volume | 45 | Issue | 9 | Pages | 3166-3182 |
Keywords | Colonoscopy,PolypDetection,RegionSegmentation,SA-DOVA descriptot | ||||
Abstract | This work aims at the automatic polyp detection by using a model of polyp appearance in the context of the analysis of colonoscopy videos. Our method consists of three stages: region segmentation, region description and region classification. The performance of our region segmentation method guarantees that if a polyp is present in the image, it will be exclusively and totally contained in a single region. The output of the algorithm also defines which regions can be considered as non-informative. We define as our region descriptor the novel Sector Accumulation-Depth of Valleys Accumulation (SA-DOVA), which provides a necessary but not sufficient condition for the polyp presence. Finally, we classify our segmented regions according to the maximal values of the SA-DOVA descriptor. Our preliminary classification results are promising, especially when classifying those parts of the image that do not contain a polyp inside. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Elsevier | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0031-3203 | ISBN | Medium | ||
Area | 800 | Expedition | Conference | IbPRIA | |
Notes | MV;SIAI | Approved | no | ||
Call Number | Admin @ si @ BSV2012; IAM @ iam | Serial | 1997 | ||
Permanent link to this record | |||||
Author | Daniel Ponsa; Antonio Lopez | ||||
Title | Variance reduction techniques in particle-based visual contour Tracking | Type | Journal Article | ||
Year | 2009 | Publication | Pattern Recognition | Abbreviated Journal | PR |
Volume | 42 | Issue | 11 | Pages | 2372–2391 |
Keywords | Contour tracking; Active shape models; Kalman filter; Particle filter; Importance sampling; Unscented particle filter; Rao-Blackwellization; Partitioned sampling | ||||
Abstract | This paper presents a comparative study of three different strategies to improve the performance of particle filters, in the context of visual contour tracking: the unscented particle filter, the Rao-Blackwellized particle filter, and the partitioned sampling technique. The tracking problem analyzed is the joint estimation of the global and local transformation of the outline of a given target, represented following the active shape model approach. The main contributions of the paper are the novel adaptations of the considered techniques on this generic problem, and the quantitative assessment of their performance in extensive experimental work done. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ADAS | Approved | no | ||
Call Number | ADAS @ adas @ PoL2009a | Serial | 1168 | ||
Permanent link to this record | |||||
Author | Souhail Bakkali; Zuheng Ming; Mickael Coustaty; Marçal Rusiñol; Oriol Ramos Terrades | ||||
Title | VLCDoC: Vision-Language Contrastive Pre-Training Model for Cross-Modal Document Classification | Type | Journal Article | ||
Year | 2023 | Publication | Pattern Recognition | Abbreviated Journal | PR |
Volume | 139 | Issue | Pages | 109419 | |
Keywords | |||||
Abstract | Multimodal learning from document data has achieved great success lately as it allows to pre-train semantically meaningful features as a prior into a learnable downstream approach. In this paper, we approach the document classification problem by learning cross-modal representations through language and vision cues, considering intra- and inter-modality relationships. Instead of merging features from different modalities into a common representation space, the proposed method exploits high-level interactions and learns relevant semantic information from effective attention flows within and across modalities. The proposed learning objective is devised between intra- and inter-modality alignment tasks, where the similarity distribution per task is computed by contracting positive sample pairs while simultaneously contrasting negative ones in the common feature representation space}. Extensive experiments on public document classification datasets demonstrate the effectiveness and the generalization capacity of our model on both low-scale and large-scale datasets. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISSN 0031-3203 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | DAG; 600.140; 600.121 | Approved | no | ||
Call Number | Admin @ si @ BMC2023 | Serial | 3826 | ||
Permanent link to this record | |||||
Author | Albert Gordo; Alicia Fornes; Ernest Valveny | ||||
Title | Writer identification in handwritten musical scores with bags of notes | Type | Journal Article | ||
Year | 2013 | Publication | Pattern Recognition | Abbreviated Journal | PR |
Volume | 46 | Issue | 5 | Pages | 1337-1345 |
Keywords | |||||
Abstract | Writer Identification is an important task for the automatic processing of documents. However, the identification of the writer in graphical documents is still challenging. In this work, we adapt the Bag of Visual Words framework to the task of writer identification in handwritten musical scores. A vanilla implementation of this method already performs comparably to the state-of-the-art. Furthermore, we analyze the effect of two improvements of the representation: a Bhattacharyya embedding, which improves the results at virtually no extra cost, and a Fisher Vector representation that very significantly improves the results at the cost of a more complex and costly representation. Experimental evaluation shows results more than 20 points above the state-of-the-art in a new, challenging dataset. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0031-3203 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ GFV2013 | Serial | 2307 | ||
Permanent link to this record |