Publicacions CVC -- Query Results

Records
Author	Carola Figueroa Flores; Abel Gonzalez-Garcia; Joost Van de Weijer; Bogdan Raducanu
Title	Saliency for fine-grained object recognition in domains with scarce training data			Type	Journal Article
Year	2019	Publication	Pattern Recognition	Abbreviated Journal	PR
Volume	94	Issue		Pages	62-73
Keywords
Abstract	This paper investigates the role of saliency to improve the classification accuracy of a Convolutional Neural Network (CNN) for the case when scarce training data is available. Our approach consists of adding a saliency branch to an existing CNN architecture which is used to modulate the standard bottom-up visual features from the original image input, acting as an attentional mechanism that guides the feature extraction process. The main aim of the proposed approach is to enable the effective training of a fine-grained recognition model with limited training samples and to improve the performance on the task, thereby alleviating the need to annotate a large dataset. The vast majority of saliency methods are evaluated on their ability to generate saliency maps, and not on their functionality in a complete vision pipeline. Our proposed pipeline allows saliency methods to be evaluated on the high-level task of object recognition. We perform extensive experiments on various fine-grained datasets (Flowers, Birds, Cars, and Dogs) under different conditions and show that saliency can considerably improve the network’s performance, especially for the case of scarce training data. Furthermore, our experiments show that saliency methods that obtain improved saliency maps (as measured by traditional saliency benchmarks) also translate to saliency methods that yield improved performance gains when applied in an object recognition pipeline.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	LAMP; 600.109; 600.141; 600.120			Approved	no
Call Number	Admin @ si @ FGW2019			Serial	3264



Author	Jon Almazan; Albert Gordo; Alicia Fornes; Ernest Valveny
Title	Segmentation-free Word Spotting with Exemplar SVMs			Type	Journal Article
Year	2014	Publication	Pattern Recognition	Abbreviated Journal	PR
Volume	47	Issue	12	Pages	3967–3978
Keywords	Word spotting; Segmentation-free; Unsupervised learning; Reranking; Query expansion; Compression
Abstract	In this paper we propose an unsupervised segmentation-free method for word spotting in document images. Documents are represented with a grid of HOG descriptors, and a sliding-window approach is used to locate the document regions that are most similar to the query. We use the Exemplar SVM framework to produce a better representation of the query in an unsupervised way. Then, we use a more discriminative representation based on Fisher Vector to rerank the best regions retrieved, and the most promising ones are used to expand the Exemplar SVM training set and improve the query representation. Finally, the document descriptors are precomputed and compressed with Product Quantization. This offers two advantages: first, a large number of documents can be kept in RAM memory at the same time. Second, the sliding window becomes significantly faster since distances between quantized HOG descriptors can be precomputed. Our results significantly outperform other segmentation-free methods in the literature, both in accuracy and in speed and memory usage.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.045; 600.056; 600.061; 602.006; 600.077			Approved	no
Call Number	Admin @ si @ AGF2014b			Serial	2485



Author	Mario Hernandez; Joao Sanchez; Jordi Vitria
Title	Selected papers from Iberian Conference on Pattern Recognition and Image Analysis			Type	Book Whole
Year	2012	Publication	Pattern Recognition	Abbreviated Journal
Volume	45	Issue	9	Pages	3047-3582
Keywords
Abstract
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	0031-3203	ISBN		Medium
Area		Expedition		Conference
Notes	OR;MV			Approved	no
Call Number	Admin @ si @ HSV2012			Serial	2069



Author	Parichehr Behjati; Pau Rodriguez; Carles Fernandez; Isabelle Hupont; Armin Mehri; Jordi Gonzalez
Title	Single image super-resolution based on directional variance attention network			Type	Journal Article
Year	2023	Publication	Pattern Recognition	Abbreviated Journal	PR
Volume	133	Issue		Pages	108997
Keywords
Abstract	Recent advances in single image super-resolution (SISR) explore the power of deep convolutional neural networks (CNNs) to achieve better performance. However, most of the progress has been made by scaling CNN architectures, which usually raises computational demands and memory consumption. This makes modern architectures less applicable in practice. In addition, most CNN-based SR methods do not fully utilize the informative hierarchical features that are helpful for final image recovery. In order to address these issues, we propose a directional variance attention network (DiVANet), a computationally efficient yet accurate network for SISR. Specifically, we introduce a novel directional variance attention (DiVA) mechanism to capture long-range spatial dependencies and exploit inter-channel dependencies simultaneously for more discriminative representations. Furthermore, we propose a residual attention feature group (RAFG) for parallelizing attention and residual block computation. The output of each residual block is linearly fused at the RAFG output to provide access to the whole feature hierarchy. In parallel, DiVA extracts the most relevant features from the network for improving the final output and preventing information loss along the successive operations inside the network. Experimental results demonstrate the superiority of DiVANet over the state of the art in several datasets, while maintaining relatively low computation and memory footprint. The code is available at https://github.com/pbehjatii/DiVANet.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ISE			Approved	no
Call Number	Admin @ si @ BPF2023			Serial	3861



Author	Meysam Madadi; Hugo Bertiche; Sergio Escalera
Title	SMPLR: Deep learning based SMPL reverse for 3D human pose and shape recovery			Type	Journal Article
Year	2020	Publication	Pattern Recognition	Abbreviated Journal	PR
Volume	106	Issue		Pages	107472
Keywords	Deep learning; 3D Human pose; Body shape; SMPL; Denoising autoencoder; Volumetric stack hourglass
Abstract	In this paper we propose to embed SMPL within a deep-based model to accurately estimate 3D pose and shape from a still RGB image. We use CNN-based 3D joint predictions as an intermediate representation to regress SMPL pose and shape parameters. Later, 3D joints are reconstructed again in the SMPL output. This module can be seen as an autoencoder where the encoder is a deep neural network and the decoder is the SMPL model. We refer to this as SMPL reverse (SMPLR). By implementing SMPLR as an encoder-decoder we avoid the need for complex constraints on pose and shape. Furthermore, given that in-the-wild datasets usually lack accurate 3D annotations, it is desirable to lift 2D joints to 3D without pairing 3D annotations with RGB images. Therefore, we also propose a denoising autoencoder (DAE) module between CNN and SMPLR, able to lift 2D joints to 3D and partially recover from structured error. We evaluate our method on SURREAL and Human3.6M datasets, showing improvement over SMPL-based state-of-the-art alternatives by about 4 and 12 mm, respectively.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	HuPBA; no proj			Approved	no
Call Number	Admin @ si @ MBE2020			Serial	3439



Author	Debora Gil; Aura Hernandez-Sabate; Mireia Brunat; Steven Jansen; Jordi Martinez-Vilalta
Title	Structure-preserving smoothing of biomedical images			Type	Journal Article
Year	2011	Publication	Pattern Recognition	Abbreviated Journal	PR
Volume	44	Issue	9	Pages	1842-1851
Keywords	Non-linear smoothing; Differential geometry; Anatomical structures; segmentation; Cardiac magnetic resonance; Computerized tomography
Abstract	Smoothing of biomedical images should preserve gray-level transitions between adjacent tissues, while restoring contours consistent with anatomical structures. Anisotropic diffusion operators are based on image appearance discontinuities (either local or contextual) and might fail at weak inter-tissue transitions. Meanwhile, the output of block-wise and morphological operations is prone to present a block structure due to the shape and size of the considered pixel neighborhood. In this contribution, we use differential geometry concepts to define a diffusion operator that restricts to image consistent level-sets. In this manner, the final state is a non-uniform intensity image presenting homogeneous inter-tissue transitions along anatomical structures, while smoothing intra-structure texture. Experiments on different types of medical images (magnetic resonance, computerized tomography) illustrate its benefit on a further process (such as segmentation) of images.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	0031-3203	ISBN		Medium
Area		Expedition		Conference
Notes	IAM; ADAS			Approved	no
Call Number	IAM @ iam @ GHB2011			Serial	1526



Author	Pau Riba; Lutz Goldmann; Oriol Ramos Terrades; Diede Rusticus; Alicia Fornes; Josep Llados
Title	Table detection in business document images by message passing networks			Type	Journal Article
Year	2022	Publication	Pattern Recognition	Abbreviated Journal	PR
Volume	127	Issue		Pages	108641
Keywords
Abstract	Tabular structures in business documents offer a complementary dimension to the raw textual data. For instance, there is information about the relationships among pieces of information. Nowadays, digital mailroom applications have become a key service for workflow automation. Therefore, the detection and interpretation of tables is crucial. With the recent advances in information extraction, table detection and recognition have gained interest in document image analysis, in particular, with the absence of rule lines and unknown information about rows and columns. However, business documents usually contain sensitive contents limiting the amount of public benchmarking datasets. In this paper, we propose a graph-based approach for detecting tables in document images which does not require the raw content of the document. Hence, the sensitive content can be removed beforehand and, instead of using the raw image or textual content, we propose a purely structural approach to keep sensitive data anonymous. Our framework uses graph neural networks (GNNs) to describe the local repetitive structures that constitute a table. In particular, our main application domain is business documents. We have carefully validated our approach in two invoice datasets and a modern document benchmark. Our experiments demonstrate that tables can be detected by purely structural approaches.
Address	July 2022
Corporate Author				Thesis
Publisher	Elsevier	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.162; 600.121			Approved	no
Call Number	Admin @ si @ RGR2022			Serial	3729



Author	Susana Alvarez; Maria Vanrell
Title	Texton theory revisited: a bag-of-words approach to combine textons			Type	Journal Article
Year	2012	Publication	Pattern Recognition	Abbreviated Journal	PR
Volume	45	Issue	12	Pages	4312-4325
Keywords
Abstract	The aim of this paper is to revisit an old theory of texture perception and update its computational implementation by extending it to colour. With this in mind we try to capture the optimality of perceptual systems. This is achieved in the proposed approach by sharing well-known early stages of the visual processes and extracting low-dimensional features that perfectly encode adequate properties for a large variety of textures without needing further learning stages. We propose several descriptors in a bag-of-words framework that are derived from different quantisation models onto the feature spaces. Our perceptual features are directly given by the shape and colour attributes of image blobs, which are the textons. In this way we avoid learning visual words and directly build the vocabularies on these low-dimensional texton spaces. The main differences between the proposed descriptors lie in how co-occurrence of blob attributes is represented in the vocabularies. Our approach outperforms the current state of the art in colour texture description, as shown in several experiments on large texture datasets.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	0031-3203	ISBN		Medium
Area		Expedition		Conference
Notes	CIC			Approved	no
Call Number	Admin @ si @ AlV2012a			Serial	2130



Author	Lluis Gomez; Dimosthenis Karatzas
Title	TextProposals: a Text‐specific Selective Search Algorithm for Word Spotting in the Wild			Type	Journal Article
Year	2017	Publication	Pattern Recognition	Abbreviated Journal	PR
Volume	70	Issue		Pages	60-74
Keywords
Abstract	Motivated by the success of powerful yet expensive techniques to recognize words in a holistic way (Goel et al., 2013; Almazán et al., 2014; Jaderberg et al., 2016), object proposals techniques emerge as an alternative to the traditional text detectors. In this paper we introduce a novel object proposals method that is specifically designed for text. We rely on a similarity based region grouping algorithm that generates a hierarchy of word hypotheses. Over the nodes of this hierarchy it is possible to apply a holistic word recognition method in an efficient way. Our experiments demonstrate that the presented method is superior in its ability to produce good quality word proposals when compared with class-independent algorithms. We show impressive recall rates with a few thousand proposals in different standard benchmarks, including focused or incidental text datasets, and multi-language scenarios. Moreover, the combination of our object proposals with existing whole-word recognizers (Almazán et al., 2014; Jaderberg et al., 2016) shows competitive performance in end-to-end word spotting, and, in some benchmarks, outperforms previously published results. Concretely, in the challenging ICDAR2015 Incidental Text dataset, we outperform by more than 10% in F-score the best-performing method in the last ICDAR Robust Reading Competition (Karatzas, 2015). Source code of the complete end-to-end system is available at https://github.com/lluisgomez/TextProposals.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.084; 601.197; 600.121; 600.129			Approved	no
Call Number	Admin @ si @ GoK2017			Serial	2886



Author	Veronica Romero; Alicia Fornes; Nicolas Serrano; Joan Andreu Sanchez; A.H. Toselli; Volkmar Frinken; E. Vidal; Josep Llados
Title	The ESPOSALLES database: An ancient marriage license corpus for off-line handwriting recognition			Type	Journal Article
Year	2013	Publication	Pattern Recognition	Abbreviated Journal	PR
Volume	46	Issue	6	Pages	1658-1669
Keywords
Abstract	Historical records of daily activities provide intriguing insights into the life of our ancestors, useful for demography studies and genealogical research. Automatic processing of historical documents, however, has mostly been focused on single works of literature and less on social records, which tend to have a distinct layout, structure, and vocabulary. Such information is usually collected by expert demographers that devote a lot of time to manually transcribe them. This paper presents a new database, compiled from a marriage license books collection, to support research in automatic handwriting recognition for historical documents containing social records. Marriage license books are documents that were used for centuries by ecclesiastical institutions to register marriage licenses. Books from this collection are handwritten and span nearly half a millennium until the beginning of the 20th century. In addition, a study is presented about the capability of state-of-the-art handwritten text recognition systems, when applied to the presented database. Baseline results are reported for reference in future studies.
Address
Corporate Author				Thesis
Publisher	Elsevier Science Inc. New York, NY, USA	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	0031-3203	ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.045; 602.006; 605.203			Approved	no
Call Number	Admin @ si @ RFS2013			Serial	2298



Author	Estefania Talavera; Carolin Wuerich; Nicolai Petkov; Petia Radeva
Title	Topic modelling for routine discovery from egocentric photo-streams			Type	Journal Article
Year	2020	Publication	Pattern Recognition	Abbreviated Journal	PR
Volume	104	Issue		Pages	107330
Keywords	Routine; Egocentric vision; Lifestyle; Behaviour analysis; Topic modelling
Abstract	Developing tools to understand and visualize lifestyle is of high interest when addressing the improvement of habits and well-being of people. Routine, defined as the usual things that a person does daily, helps describe the individuals’ lifestyle. With this paper, we are the first to address the development of novel tools for automatic discovery of routine days of an individual from his/her egocentric images. In the proposed model, sequences of images are first characterized by semantic labels detected by pre-trained CNNs. Then, these features are organized in temporal-semantic documents to later be embedded into a topic models space. Finally, Dynamic-Time-Warping and Spectral-Clustering methods are used for final day routine/non-routine discrimination. Moreover, we introduce a new EgoRoutine-dataset, a collection of 104 egocentric days with more than 100,000 images recorded by 7 users. Results show that routine can be discovered and behavioural patterns can be observed.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	MILAB; no proj			Approved	no
Call Number	Admin @ si @ TWP2020			Serial	3435



Author	Jorge Bernal; F. Javier Sanchez; Fernando Vilariño
Title	Towards Automatic Polyp Detection with a Polyp Appearance Model			Type	Journal Article
Year	2012	Publication	Pattern Recognition	Abbreviated Journal	PR
Volume	45	Issue	9	Pages	3166-3182
Keywords	Colonoscopy; Polyp detection; Region segmentation; SA-DOVA descriptor
Abstract	This work aims at automatic polyp detection by using a model of polyp appearance in the context of the analysis of colonoscopy videos. Our method consists of three stages: region segmentation, region description and region classification. The performance of our region segmentation method guarantees that if a polyp is present in the image, it will be exclusively and totally contained in a single region. The output of the algorithm also defines which regions can be considered as non-informative. We define as our region descriptor the novel Sector Accumulation-Depth of Valleys Accumulation (SA-DOVA), which provides a necessary but not sufficient condition for the polyp presence. Finally, we classify our segmented regions according to the maximal values of the SA-DOVA descriptor. Our preliminary classification results are promising, especially when classifying those parts of the image that do not contain a polyp inside.
Address
Corporate Author				Thesis
Publisher	Elsevier	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	0031-3203	ISBN		Medium
Area	800	Expedition		Conference	IbPRIA
Notes	MV;SIAI			Approved	no
Call Number	Admin @ si @ BSV2012; IAM @ iam			Serial	1997



Author	Daniel Ponsa; Antonio Lopez
Title	Variance reduction techniques in particle-based visual contour Tracking			Type	Journal Article
Year	2009	Publication	Pattern Recognition	Abbreviated Journal	PR
Volume	42	Issue	11	Pages	2372–2391
Keywords	Contour tracking; Active shape models; Kalman filter; Particle filter; Importance sampling; Unscented particle filter; Rao-Blackwellization; Partitioned sampling
Abstract	This paper presents a comparative study of three different strategies to improve the performance of particle filters, in the context of visual contour tracking: the unscented particle filter, the Rao-Blackwellized particle filter, and the partitioned sampling technique. The tracking problem analyzed is the joint estimation of the global and local transformation of the outline of a given target, represented following the active shape model approach. The main contributions of the paper are the novel adaptations of the considered techniques to this generic problem, and the quantitative assessment of their performance through extensive experimental work.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ADAS			Approved	no
Call Number	ADAS @ adas @ PoL2009a			Serial	1168



Author	Souhail Bakkali; Zuheng Ming; Mickael Coustaty; Marçal Rusiñol; Oriol Ramos Terrades
Title	VLCDoC: Vision-Language Contrastive Pre-Training Model for Cross-Modal Document Classification			Type	Journal Article
Year	2023	Publication	Pattern Recognition	Abbreviated Journal	PR
Volume	139	Issue		Pages	109419
Keywords
Abstract	Multimodal learning from document data has achieved great success lately as it allows pre-training semantically meaningful features as a prior into a learnable downstream approach. In this paper, we approach the document classification problem by learning cross-modal representations through language and vision cues, considering intra- and inter-modality relationships. Instead of merging features from different modalities into a common representation space, the proposed method exploits high-level interactions and learns relevant semantic information from effective attention flows within and across modalities. The proposed learning objective is devised between intra- and inter-modality alignment tasks, where the similarity distribution per task is computed by contracting positive sample pairs while simultaneously contrasting negative ones in the common feature representation space. Extensive experiments on public document classification datasets demonstrate the effectiveness and the generalization capacity of our model on both low-scale and large-scale datasets.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	0031-3203	ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.140; 600.121			Approved	no
Call Number	Admin @ si @ BMC2023			Serial	3826



Author	Albert Gordo; Alicia Fornes; Ernest Valveny
Title	Writer identification in handwritten musical scores with bags of notes			Type	Journal Article
Year	2013	Publication	Pattern Recognition	Abbreviated Journal	PR
Volume	46	Issue	5	Pages	1337-1345
Keywords
Abstract	Writer Identification is an important task for the automatic processing of documents. However, the identification of the writer in graphical documents is still challenging. In this work, we adapt the Bag of Visual Words framework to the task of writer identification in handwritten musical scores. A vanilla implementation of this method already performs comparably to the state-of-the-art. Furthermore, we analyze the effect of two improvements of the representation: a Bhattacharyya embedding, which improves the results at virtually no extra cost, and a Fisher Vector representation that very significantly improves the results at the cost of a more complex and costly representation. Experimental evaluation shows results more than 20 points above the state-of-the-art in a new, challenging dataset.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	0031-3203	ISBN		Medium
Area		Expedition		Conference
Notes	DAG			Approved	no
Call Number	Admin @ si @ GFV2013			Serial	2307