Author Hongxing Gao; Marçal Rusiñol; Dimosthenis Karatzas; Josep Llados
  Title Embedding Document Structure to Bag-of-Words through Pair-wise Stable Key-regions Type Conference Article
  Year 2014 Publication 22nd International Conference on Pattern Recognition Abbreviated Journal  
  Volume Issue Pages 2903-2908  
  Keywords  
  Abstract Since the document structure carries valuable discriminative information, considerable effort has been devoted to extracting and understanding document structure, with layout analysis approaches being the most commonly used. In this paper, Distance Transform based MSER (DTMSER) is employed to efficiently extract the document structure as a dendrogram of key-regions which roughly correspond to structural elements such as characters, words and paragraphs. Inspired by the Bag of Words (BoW) framework, we propose an efficient method for structural document matching by representing the document image as a histogram of key-region pairs encoding structural relationships. Applied to the scenario of document image retrieval, experimental results demonstrate a remarkable improvement when comparing the proposed method with typical BoW and pyramidal BoW methods.  
  Address Stockholm; Sweden; August 2014  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPR  
  Notes DAG; 600.056; 600.061; 600.077 Approved no  
  Call Number Admin @ si @ GRK2014b Serial 2497  
 

 
Author Josep Llados; Dimosthenis Karatzas; Joan Mas; Gemma Sanchez
  Title A Generic Architecture for the Conversion of Document Collections into Semantically Annotated Digital Archives Type Journal
  Year 2008 Publication Journal of Universal Computer Science Abbreviated Journal  
  Volume 14 Issue 18 Pages 2912–2935  
  Keywords Median Graph, Graph Embedding, Graph Matching, Structural Pattern Recognition  
  Abstract  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG Approved no  
  Call Number DAG @ dag @ LKM2008 Serial 1142  
 

 
Author Yunchao Gong; Svetlana Lazebnik; Albert Gordo; Florent Perronnin
  Title Iterative quantization: A procrustean approach to learning binary codes for Large-Scale Image Retrieval Type Journal Article
  Year 2012 Publication IEEE Transactions on Pattern Analysis and Machine Intelligence Abbreviated Journal TPAMI  
  Volume 35 Issue 12 Pages 2916-2929  
  Keywords  
  Abstract This paper addresses the problem of learning similarity-preserving binary codes for efficient similarity search in large-scale image collections. We formulate this problem in terms of finding a rotation of zero-centered data so as to minimize the quantization error of mapping this data to the vertices of a zero-centered binary hypercube, and propose a simple and efficient alternating minimization algorithm to accomplish this task. This algorithm, dubbed iterative quantization (ITQ), has connections to multi-class spectral clustering and to the orthogonal Procrustes problem, and it can be used both with unsupervised data embeddings such as PCA and supervised embeddings such as canonical correlation analysis (CCA). The resulting binary codes significantly outperform several other state-of-the-art methods. We also show that further performance improvements can result from transforming the data with a nonlinear kernel mapping prior to PCA or CCA. Finally, we demonstrate an application of ITQ to learning binary attributes or “classemes” on the ImageNet dataset.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 0162-8828 ISBN 978-1-4577-0394-2 Medium  
  Area Expedition Conference  
  Notes DAG Approved no  
  Call Number Admin @ si @ GLG 2012b Serial 2008  
 

 
Author Mohammed Al Rawi; Ernest Valveny
  Title Compact and Efficient Multitask Learning in Vision, Language and Speech Type Conference Article
  Year 2019 Publication IEEE International Conference on Computer Vision Workshops Abbreviated Journal  
  Volume Issue Pages 2933-2942  
  Keywords  
  Abstract Across-domain multitask learning is a challenging area of computer vision and machine learning due to the intra-similarities among class distributions. Addressing this problem to cope with the human cognition system by considering inter and intra-class categorization and recognition complicates the problem even further. We propose in this work an effective holistic and hierarchical learning by using a text embedding layer on top of a deep learning model. We also propose a novel sensory discriminator approach to resolve the collisions between different tasks and domains. We then train the model concurrently on textual sentiment analysis, speech recognition, image classification, action recognition from video, and handwriting word spotting of two different scripts (Arabic and English). The model we propose successfully learned different tasks across multiple domains.  
  Address Seoul; Korea; October 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICCVW  
  Notes DAG; 600.121; 600.129 Approved no  
  Call Number Admin @ si @ RaV2019 Serial 3365  
 

 
Author Albert Gordo; Jose Antonio Rodriguez; Florent Perronnin; Ernest Valveny
  Title Leveraging category-level labels for instance-level image retrieval Type Conference Article
  Year 2012 Publication 25th IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal  
  Volume Issue Pages 3045-3052  
  Keywords  
  Abstract In this article, we focus on the problem of large-scale instance-level image retrieval. For efficiency reasons, it is common to represent an image by a fixed-length descriptor which is subsequently encoded into a small number of bits. We note that most encoding techniques include an unsupervised dimensionality reduction step. Our goal in this work is to learn a better subspace in a supervised manner. We especially raise the following question: “can category-level labels be used to learn such a subspace?” To answer this question, we experiment with four learning techniques: the first one is based on a metric learning framework, the second one on attribute representations, the third one on Canonical Correlation Analysis (CCA) and the fourth one on Joint Subspace and Classifier Learning (JSCL). While the first three approaches have been applied in the past to the image retrieval problem, we believe we are the first to show the usefulness of JSCL in this context. In our experiments, we use ImageNet as a source of category-level labels and report retrieval results on two standard datasets: INRIA Holidays and the University of Kentucky benchmark. Our experimental study shows that metric learning and attributes do not lead to any significant improvement in retrieval accuracy, as opposed to CCA and JSCL. As an example, we report on Holidays an increase in accuracy from 39.3% to 48.6% with 32-dimensional representations. Overall JSCL is shown to yield the best results.  
  Address Providence, Rhode Island  
  Corporate Author Thesis  
  Publisher IEEE Xplore Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1063-6919 ISBN 978-1-4673-1226-4 Medium  
  Area Expedition Conference CVPR  
  Notes DAG Approved no  
  Call Number Admin @ si @ GRP2012 Serial 2050  
 

 
Author Jaume Gibert; Ernest Valveny; Horst Bunke
  Title Graph Embedding in Vector Spaces by Node Attribute Statistics Type Journal Article
  Year 2012 Publication Pattern Recognition Abbreviated Journal PR  
  Volume 45 Issue 9 Pages 3072-3083  
  Keywords Structural pattern recognition; Graph embedding; Data clustering; Graph classification  
  Abstract Graph-based representations are of broad use and applicability in pattern recognition. They exhibit, however, a major drawback with regards to the processing tools that are available in their domain. Graph embedding into vector spaces is a growing field among the structural pattern recognition community which aims at providing a feature vector representation for every graph, and thus enables classical statistical learning machinery to be used on graph-based input patterns. In this work, we propose a novel embedding methodology for graphs with continuous node attributes and unattributed edges. The approach presented in this paper is based on statistics of the node labels and the edges between them, based on their similarity to a set of representatives. We specifically deal with an important issue of this methodology, namely, the selection of a suitable set of representatives. In an experimental evaluation, we empirically show the advantages of this novel approach in the context of different classification problems using several databases of graphs.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 0031-3203 ISBN Medium  
  Area Expedition Conference  
  Notes DAG Approved no  
  Call Number Admin @ si @ GVB2012a Serial 1992  
 

 
Author P. Wang; V. Eglin; C. Garcia; C. Largeron; Josep Llados; Alicia Fornes
  Title A Coarse-to-Fine Word Spotting Approach for Historical Handwritten Documents Based on Graph Embedding and Graph Edit Distance Type Conference Article
  Year 2014 Publication 22nd International Conference on Pattern Recognition Abbreviated Journal  
  Volume Issue Pages 3074-3079  
  Keywords word spotting; coarse-to-fine mechanism; graph-based representation; graph embedding; graph edit distance  
  Abstract Effective information retrieval on handwritten document images has always been a challenging task, especially for historical documents. In this paper, we propose a coarse-to-fine handwritten word spotting approach based on graph representation. The presented model comprises both the topological and morphological signatures of the handwriting. Skeleton-based graphs with the Shape Context labelled vertices are established for connected components. Each word image is represented as a sequence of graphs. Aiming at developing a practical and efficient word spotting approach for large-scale historical handwritten documents, a fast and coarse comparison is first applied to prune the regions that are not similar to the query based on the graph embedding methodology. Afterwards, the query and regions of interest are compared by graph edit distance based on the Dynamic Time Warping alignment. The proposed approach is evaluated on a public dataset containing 50 pages of historical marriage license records. The results show that the proposed approach achieves a compromise between efficiency and accuracy.  
  Address Stockholm; Sweden; August 2014  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1051-4651 ISBN Medium  
  Area Expedition Conference ICPR  
  Notes DAG; 600.061; 602.006; 600.077 Approved no  
  Call Number Admin @ si @ WEG2014a Serial 2515  
 

 
Author Jon Almazan; Alicia Fornes; Ernest Valveny
  Title A non-rigid appearance model for shape description and recognition Type Journal Article
  Year 2012 Publication Pattern Recognition Abbreviated Journal PR  
  Volume 45 Issue 9 Pages 3105-3113  
  Keywords Shape recognition; Deformable models; Shape modeling; Hand-drawn recognition  
  Abstract In this paper we describe a framework to learn a model of shape variability in a set of patterns. The framework is based on the Active Appearance Model (AAM) and makes it possible to combine shape deformations with appearance variability. We have used two modifications of the Blurred Shape Model (BSM) descriptor as basic shape and appearance features to learn the model. These modifications make it possible to overcome the rigidity of the original BSM, adapting it to the deformations of the shape to be represented. We have applied this framework to the representation and classification of handwritten digits and symbols. We show that results of the proposed methodology outperform the original BSM approach.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 0031-3203 ISBN Medium  
  Area Expedition Conference  
  Notes DAG Approved no  
  Call Number DAG @ dag @ AFV2012 Serial 1982  
 

 
Author Lluis Gomez; Dimosthenis Karatzas
  Title MSER-based Real-Time Text Detection and Tracking Type Conference Article
  Year 2014 Publication 22nd International Conference on Pattern Recognition Abbreviated Journal  
  Volume Issue Pages 3110-3115  
  Keywords  
  Abstract We present a hybrid algorithm for detection and tracking of text in natural scenes that goes beyond the full-detection approaches in terms of time performance optimization. A state-of-the-art scene text detection module based on Maximally Stable Extremal Regions (MSER) is used to detect text asynchronously, while on a separate thread detected text objects are tracked by MSER propagation. The cooperation of these two modules yields real time video processing at high frame rates even on low-resource devices.  
  Address Stockholm; Sweden; August 2014  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1051-4651 ISBN Medium  
  Area Expedition Conference ICPR  
  Notes DAG; 600.056; 601.158; 601.197; 600.077 Approved no  
  Call Number Admin @ si @ GoK2014a Serial 2492  
 

 
Author Muhammad Muzzamil Luqman; Thierry Brouard; Jean-Yves Ramel; Josep Llados
  Title A Content Spotting System For Line Drawing Graphic Document Images Type Conference Article
  Year 2010 Publication 20th International Conference on Pattern Recognition Abbreviated Journal  
  Volume 20 Issue Pages 3420–3423  
  Keywords  
  Abstract We present a content spotting system for line drawing graphic document images. The proposed system is sufficiently domain independent and takes the keyword based information retrieval for graphic documents, one step forward, to Query By Example (QBE) and focused retrieval. During offline learning mode: we vectorize the documents in the repository, represent them by attributed relational graphs, extract regions of interest (ROIs) from them, convert each ROI to a fuzzy structural signature, cluster similar signatures to form ROI classes and build an index for the repository. During online querying mode: a Bayesian network classifier recognizes the ROIs in the query image and the corresponding documents are fetched by looking up in the repository index. Experimental results are presented for synthetic images of architectural and electronic documents.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1051-4651 ISBN 978-1-4244-7542-1 Medium  
  Area Expedition Conference ICPR  
  Notes DAG Approved no  
  Call Number DAG @ dag @ LBR2010b Serial 1460  