Records | |||||
---|---|---|---|---|---|
Author | Jose Antonio Rodriguez; Florent Perronnin; Gemma Sanchez; Josep Llados | ||||
Title | Unsupervised writer adaptation of whole-word HMMs with application to word-spotting | Type | Journal Article | ||
Year | 2010 | Publication | Pattern Recognition Letters | Abbreviated Journal | PRL |
Volume | 31 | Issue | 8 | Pages | 742–749 |
Keywords | Word-spotting; Handwriting recognition; Writer adaptation; Hidden Markov model; Document analysis | ||||
Abstract | In this paper we propose a novel approach for writer adaptation in a handwritten word-spotting task. The method exploits the fact that the semi-continuous hidden Markov model separates the word model parameters into (i) a codebook of shapes and (ii) a set of word-specific parameters. Our main contribution is to employ this property to derive writer-specific word models by statistically adapting an initial universal codebook to each document. This process is unsupervised and does not even require the appearance of the keyword(s) in the searched document. Experimental results show an increase in performance when this adaptation technique is applied. To the best of our knowledge, this is the first work dealing with adaptation for word-spotting. The preliminary version of this paper obtained an IBM Best Student Paper Award at the 19th International Conference on Pattern Recognition. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Elsevier | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG | Approved | no | ||
Call Number | DAG @ dag @ RPS2010 | Serial | 1290 | ||
Permanent link to this record | |||||
Author | Josep Llados; Horst Bunke; Enric Marti | ||||
Title | Finding rotational symmetries by cyclic string matching | Type | Journal Article | ||
Year | 1997 | Publication | Pattern Recognition Letters | Abbreviated Journal | PRL |
Volume | 18 | Issue | 14 | Pages | 1435-1442 |
Keywords | Rotational symmetry; Reflectional symmetry; String matching | ||||
Abstract | Symmetry is an important shape feature. In this paper, a simple and fast method to detect perfect and distorted rotational symmetries of 2D objects is described. The boundary of a shape is polygonally approximated and represented as a string. Rotational symmetries are found by cyclic string matching between two identical copies of the shape string. The set of minimum cost edit sequences that transform the shape string to a cyclically shifted version of itself defines the rotational symmetry and its order. Finally, a modification of the algorithm is proposed to detect reflectional symmetries. Some experimental results are presented to show the reliability of the proposed algorithm. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Elsevier | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG;IAM; | Approved | no | ||
Call Number | IAM @ iam @ LBM1997a | Serial | 1562 | ||
Permanent link to this record | |||||
Author | Kai Wang; Joost Van de Weijer; Luis Herranz | ||||
Title | ACAE-REMIND for online continual learning with compressed feature replay | Type | Journal Article | ||
Year | 2021 | Publication | Pattern Recognition Letters | Abbreviated Journal | PRL |
Volume | 150 | Issue | Pages | 122-129 | |
Keywords | online continual learning; autoencoders; vector quantization | ||||
Abstract | Online continual learning aims to learn from a non-IID stream of data from a number of different tasks, where the learner is only allowed to consider data once. Methods are typically allowed to use a limited buffer to store some of the images in the stream. Recently, it was found that feature replay, where an intermediate layer representation of the image is stored (or generated), leads to better results than image replay while requiring less memory. Quantized exemplars can further reduce the memory usage. However, a drawback of these methods is that they use a fixed (or very intransigent) backbone network. This significantly limits the learning of representations that can discriminate between all tasks. To address this problem, we propose an auxiliary classifier auto-encoder (ACAE) module for feature replay at intermediate layers with high compression rates. The reduced memory footprint per image allows us to save more exemplars for replay. In our experiments, we conduct task-agnostic evaluation under the online continual learning setting and obtain state-of-the-art performance on the ImageNet-Subset, CIFAR100 and CIFAR10 datasets. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | LAMP; 600.147; 601.379; 600.120; 600.141 | Approved | no | ||
Call Number | Admin @ si @ WWH2021 | Serial | 3575 | ||
Permanent link to this record | |||||
Author | Lluis Gomez; Ali Furkan Biten; Ruben Tito; Andres Mafla; Marçal Rusiñol; Ernest Valveny; Dimosthenis Karatzas | ||||
Title | Multimodal grid features and cell pointers for scene text visual question answering | Type | Journal Article | ||
Year | 2021 | Publication | Pattern Recognition Letters | Abbreviated Journal | PRL |
Volume | 150 | Issue | Pages | 242-249 | |
Keywords | |||||
Abstract | This paper presents a new model for the task of scene text visual question answering. In this task, questions about a given image can only be answered by reading and understanding scene text. Current state-of-the-art models for this task make use of a dual attention mechanism in which one attention module attends to visual features while the other attends to textual features. A possible issue with this is that it makes it difficult for the model to reason jointly about both modalities. To fix this problem we propose a new model that is based on a single attention mechanism that attends to multi-modal features conditioned on the question. The output weights of this attention module over a grid of multi-modal spatial features are interpreted as the probability that a certain spatial location of the image contains the answer text to the given question. Our experiments demonstrate competitive performance on two standard datasets with a model that is faster than previous methods at inference time. Furthermore, we also provide a novel analysis of the ST-VQA dataset based on a human performance study. Supplementary material, code, and data are made available through this link. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG; 600.084; 600.121 | Approved | no | ||
Call Number | Admin @ si @ GBT2021 | Serial | 3620 | ||
Permanent link to this record | |||||
Author | M. Bressan; Jordi Vitria | ||||
Title | Nonparametric Discriminant Analysis and Nearest Neighbor Classification | Type | Journal Article | ||
Year | 2003 | Publication | Pattern Recognition Letters | Abbreviated Journal | PRL |
Volume | 24 | Issue | 15 | Pages | 2743–2749 |
Keywords | |||||
Abstract | IF: 0.809 | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | OR;MV | Approved | no | ||
Call Number | BCNPCL @ bcnpcl @ BrV2003b | Serial | 367 | ||
Permanent link to this record | |||||
Author | Manuel Carbonell; Alicia Fornes; Mauricio Villegas; Josep Llados | ||||
Title | A Neural Model for Text Localization, Transcription and Named Entity Recognition in Full Pages | Type | Journal Article | ||
Year | 2020 | Publication | Pattern Recognition Letters | Abbreviated Journal | PRL |
Volume | 136 | Issue | Pages | 219-227 | |
Keywords | |||||
Abstract | In recent years, the consolidation of deep neural network architectures for information extraction in document images has brought big improvements in the performance of each of the tasks involved in this process, consisting of text localization, transcription, and named entity recognition. However, this process is traditionally performed with separate methods for each task. In this work we propose an end-to-end model that combines a one-stage object detection network with branches for the recognition of text and named entities, respectively, in such a way that shared features can be learned simultaneously from the training error of each of the tasks. By doing so, the model jointly performs handwritten text detection, transcription, and named entity recognition at page level with a single feed-forward step. We exhaustively evaluate our approach on different datasets, discussing its advantages and limitations compared to sequential approaches. The results show that the model is capable of benefiting from shared features by simultaneously solving interdependent tasks. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG; 600.140; 601.311; 600.121 | Approved | no | ||
Call Number | Admin @ si @ CFV2020 | Serial | 3451 | ||
Permanent link to this record | |||||
Author | Marçal Rusiñol; Agnes Borras; Josep Llados | ||||
Title | Relational Indexing of Vectorial Primitives for Symbol Spotting in Line-Drawing Images | Type | Journal Article | ||
Year | 2010 | Publication | Pattern Recognition Letters | Abbreviated Journal | PRL |
Volume | 31 | Issue | 3 | Pages | 188–201 |
Keywords | Document image analysis and recognition; Graphics recognition; Symbol spotting; Vectorial representations; Line-drawings | ||||
Abstract | This paper presents a symbol spotting approach for indexing by content a database of line-drawing images. As line-drawings are digital-born documents designed with vector-graphics software, instead of using a pixel-based approach, we present a spotting method based on vector primitives. Graphical symbols are represented by a set of vectorial primitives which are described by an off-the-shelf shape descriptor. A relational indexing strategy aims to retrieve symbol locations in the target documents by using a combined numerical-relational description of 2D structures. The zones which are likely to contain the queried symbol are validated by a Hough-like voting scheme. In addition, a performance evaluation framework for symbol spotting in graphical documents is proposed. The presented methodology has been evaluated with a benchmarking set of architectural documents, achieving good performance results. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Elsevier | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG | Approved | no | ||
Call Number | DAG @ dag @ RBL2010 | Serial | 1177 | ||
Permanent link to this record | |||||
Author | Marco Pedersoli; Jordi Gonzalez; Andrew Bagdanov; Xavier Roca | ||||
Title | Efficient Discriminative Multiresolution Cascade for Real-Time Human Detection Applications | Type | Journal Article | ||
Year | 2011 | Publication | Pattern Recognition Letters | Abbreviated Journal | PRL |
Volume | 32 | Issue | 13 | Pages | 1581-1587 |
Keywords | |||||
Abstract | Human detection is fundamental in many machine vision applications, like video surveillance, driving assistance, action recognition and scene understanding. However, in most of these applications real-time performance is necessary, and this has not yet been achieved by current detection methods. This paper presents a new method for human detection based on a multiresolution cascade of Histograms of Oriented Gradients (HOG) that can greatly reduce the computational cost of detection search without affecting accuracy. The method consists of a cascade of sliding window detectors. Each detector is a linear Support Vector Machine (SVM) composed of HOG features at different resolutions, from coarse at the first level to fine at the last one. In contrast to previous methods, our approach uses a non-uniform stride of the sliding window that is defined by the feature resolution and allows the detection to be incrementally refined when moving from coarse to fine resolution. In this way, the speed-up of the cascade is due not only to the smaller number of features computed at the first levels of the cascade, but also to the reduced number of windows that need to be evaluated at the coarse resolution. Experimental results show that our method reaches a detection rate comparable with state-of-the-art detectors based on HOG features, while at the same time the detection search is up to 23 times faster. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ISE | Approved | no | ||
Call Number | Admin @ si @ PGB2011a | Serial | 1707 | ||
Permanent link to this record | |||||
Author | Meysam Madadi; Sergio Escalera; Jordi Gonzalez; Xavier Roca; Felipe Lumbreras | ||||
Title | Multi-part body segmentation based on depth maps for soft biometry analysis | Type | Journal Article | ||
Year | 2015 | Publication | Pattern Recognition Letters | Abbreviated Journal | PRL |
Volume | 56 | Issue | Pages | 14-21 | |
Keywords | 3D shape context; 3D point cloud alignment; Depth maps; Human body segmentation; Soft biometry analysis | ||||
Abstract | This paper presents a novel method for extracting biometric measures using depth sensors. Given multi-part labeled training data, a new subject is aligned to the best model of the dataset, and soft biometrics such as lengths or circumference sizes of limbs and body are computed. The process is performed by training relevant pose clusters, defining a representative model, and fitting a 3D shape context descriptor within an iterative matching procedure. We show robust measures by applying orthogonal plates to the body hull. We test our approach on a novel full-body RGB-Depth data set, showing accurate estimation of soft biometrics and better segmentation accuracy in comparison with a random forest approach, without requiring large training data. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HuPBA; ISE; ADAS; 600.076;600.049; 600.063; 600.054; 302.018;MILAB | Approved | no | ||
Call Number | Admin @ si @ MEG2015 | Serial | 2588 | ||
Permanent link to this record | |||||
Author | Miguel Angel Bautista; Sergio Escalera; Xavier Baro; Petia Radeva; Jordi Vitria; Oriol Pujol | ||||
Title | Minimal Design of Error-Correcting Output Codes | Type | Journal Article | ||
Year | 2011 | Publication | Pattern Recognition Letters | Abbreviated Journal | PRL |
Volume | 33 | Issue | 6 | Pages | 693-702 |
Keywords | Multi-class classification; Error-correcting output codes; Ensemble of classifiers | ||||
Abstract | IF JCR CCIA 1.303 2009 54/103. The classification of a large number of object categories is a challenging trend in the pattern recognition field. In the literature, this is often addressed using an ensemble of classifiers. In this scope, the Error-correcting output codes framework has proven to be a powerful tool for combining classifiers. However, most state-of-the-art ECOC approaches use a linear or exponential number of classifiers, making the discrimination of a large number of classes unfeasible. In this paper, we explore and propose a minimal design of ECOC in terms of the number of classifiers. Evolutionary computation is used for tuning the parameters of the classifiers and looking for the best minimal ECOC code configuration. The results over several public UCI datasets and different multi-class computer vision problems show that the proposed methodology obtains comparable or even better results than state-of-the-art ECOC methodologies with far fewer dichotomizers. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Elsevier | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0167-8655 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | MILAB; OR;HuPBA;MV | Approved | no | ||
Call Number | Admin @ si @ BEB2011a | Serial | 1800 | ||
Permanent link to this record | |||||
Author | Mikkel Thogersen; Sergio Escalera; Jordi Gonzalez; Thomas B. Moeslund | ||||
Title | Segmentation of RGB-D Indoor scenes by Stacking Random Forests and Conditional Random Fields | Type | Journal Article | ||
Year | 2016 | Publication | Pattern Recognition Letters | Abbreviated Journal | PRL |
Volume | 80 | Issue | Pages | 208–215 | |
Keywords | |||||
Abstract | This paper proposes a technique for RGB-D scene segmentation using the Multi-class Multi-scale Stacked Sequential Learning (MMSSL) paradigm. Following recent trends in the state of the art, a base classifier uses an initial SLIC segmentation to obtain superpixels, which provide a reduction of data while retaining object boundaries. A series of color and depth features are extracted from the superpixels and are used in a Conditional Random Field (CRF) to predict superpixel labels. Furthermore, a Random Forest (RF) classifier using random offset features is also used as an input to the CRF, acting as an initial prediction. As a stacked classifier, another Random Forest is used, acting on a spatial multi-scale decomposition of the CRF confidence map to correct the erroneous labels assigned by the previous classifier. The model is tested on the popular NYU-v2 dataset. The approach shows that simple multi-modal features with the power of the MMSSL paradigm can achieve better performance than state-of-the-art results on the same dataset. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HuPBA; ISE;MILAB; 600.098; 600.119 | Approved | no | ||
Call Number | Admin @ si @ TEG2016 | Serial | 2843 | ||
Permanent link to this record | |||||
Author | Miquel Ferrer; Ernest Valveny; F. Serratosa | ||||
Title | Median graph: A new exact algorithm using a distance based on the maximum common subgraph | Type | Journal Article | ||
Year | 2009 | Publication | Pattern Recognition Letters | Abbreviated Journal | PRL |
Volume | 30 | Issue | 5 | Pages | 579–588 |
Keywords | |||||
Abstract | Median graphs have been presented as a useful tool for capturing the essential information of a set of graphs. Nevertheless, computation of optimal solutions is a very hard problem. In this work we present a new and more efficient optimal algorithm for the median graph computation. With the use of a particular cost function that permits the definition of the graph edit distance in terms of the maximum common subgraph, and a prediction function in the backtracking algorithm, we reduce the size of the search space, avoiding the evaluation of a large number of states and still obtaining the exact median. We present a set of experiments comparing our new algorithm against the previously existing exact algorithm using synthetic data. In addition, we present the first application of the exact median graph computation to real data and we compare the results against an approximate algorithm based on genetic search. These experimental results show that our algorithm outperforms the previously existing exact algorithm and, in addition, show the potential applicability of the exact solutions to real problems. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Elsevier Science Inc. | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0167-8655 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | DAG | Approved | no | ||
Call Number | DAG @ dag @ FVS2009a | Serial | 1114 | ||
Permanent link to this record | |||||
Author | Mohamed Ali Souibgui; Alicia Fornes; Yousri Kessentini; Beata Megyesi | ||||
Title | Few shots are all you need: A progressive learning approach for low resource handwritten text recognition | Type | Journal Article | ||
Year | 2022 | Publication | Pattern Recognition Letters | Abbreviated Journal | PRL |
Volume | 160 | Issue | Pages | 43-49 | |
Keywords | |||||
Abstract | Handwritten text recognition in low resource scenarios, such as manuscripts with rare alphabets, is a challenging problem. In this paper, we propose a few-shot learning-based handwriting recognition approach that significantly reduces the human annotation process by requiring only a few images of each alphabet symbol. The method consists of detecting all the symbols of a given alphabet in a textline image and decoding the obtained similarity scores to the final sequence of transcribed symbols. Our model is first pretrained on synthetic line images generated from an alphabet, which could differ from the alphabet of the target domain. A second training step is then applied to reduce the gap between the source and the target data. Since this retraining would require annotation of thousands of handwritten symbols together with their bounding boxes, we propose to avoid such human effort through an unsupervised progressive learning approach that automatically assigns pseudo-labels to the unlabeled data. The evaluation on different datasets shows that our model can lead to competitive results with a significant reduction in human effort. The code will be publicly available in the following repository: https://github.com/dali92002/HTRbyMatching | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Elsevier | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG; 600.121; 600.162; 602.230 | Approved | no | ||
Call Number | Admin @ si @ SFK2022 | Serial | 3736 | ||
Permanent link to this record | |||||
Author | Oriol Ramos Terrades; Ernest Valveny | ||||
Title | A new use of the ridgelets transform for describing linear singularities in images | Type | Journal Article | ||
Year | 2006 | Publication | Pattern Recognition Letters | Abbreviated Journal | PRL |
Volume | 27 | Issue | 6 | Pages | 587–596 |
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG | Approved | no | ||
Call Number | DAG @ dag @ RaV2006a | Serial | 635 | ||
Permanent link to this record | |||||
Author | Pau Riba; Josep Llados; Alicia Fornes | ||||
Title | Hierarchical graphs for coarse-to-fine error tolerant matching | Type | Journal Article | ||
Year | 2020 | Publication | Pattern Recognition Letters | Abbreviated Journal | PRL |
Volume | 134 | Issue | Pages | 116-124 | |
Keywords | Hierarchical graph representation; Coarse-to-fine graph matching; Graph-based retrieval | ||||
Abstract | In recent years, graph-based representations have seen growing use in visual recognition and retrieval due to their ability to capture both structural and appearance-based information. Thus, they provide a greater representational power than classical statistical frameworks. However, graph-based representations lead to high computational complexities, usually dealt with by graph embeddings or approximated matching techniques. Despite their representational power, they are very sensitive to noise and small variations of the input image. With the aim of coping with the time complexity and the variability present in the generated graphs, in this paper we propose to construct a novel hierarchical graph representation. Graph clustering techniques adapted from social media analysis have been used in order to contract a graph at different abstraction levels while keeping information about the topology. Abstract node attributes summarise information about the contracted graph partition. For the proposed representations, a coarse-to-fine matching technique is defined. Hence, small graphs are used as a filter before more accurate matching methods are applied. This approach has been validated in real scenarios such as classification of colour images or retrieval of handwritten words (i.e. word spotting). | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG; 600.097; 601.302; 603.057; 600.140; 600.121 | Approved | no | ||
Call Number | Admin @ si @ RLF2020 | Serial | 3349 | ||
Permanent link to this record |