|   | 
Details
   web
Records
Author Hongxing Gao; Marçal Rusiñol; Dimosthenis Karatzas; Josep Llados
Title Fast Structural Matching for Document Image Retrieval through Spatial Databases Type (up) Conference Article
Year 2014 Publication Document Recognition and Retrieval XXI Abbreviated Journal
Volume 9021 Issue Pages
Keywords Document image retrieval; distance transform; MSER; spatial database
Abstract The structure of document images plays a signi cant role in document analysis thus considerable e orts have been made towards extracting and understanding document structure, usually in the form of layout analysis approaches. In this paper, we rst employ Distance Transform based MSER (DTMSER) to eciently extract stable document structural elements in terms of a dendrogram of key-regions. Then a fast structural matching method is proposed to query the structure of document (dendrogram) based on a spatial database which facilitates the formulation of advanced spatial queries. The experiments demonstrate a signi cant improvement in a document retrieval scenario when compared to the use of typical Bag of Words (BoW) and pyramidal BoW descriptors.
Address Amsterdam; September 2014
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference SPIE-DRR
Notes DAG; 600.056; 600.061; 600.077 Approved no
Call Number Admin @ si @ GRK2014a Serial 2496
Permanent link to this record
 

 
Author Hongxing Gao; Marçal Rusiñol; Dimosthenis Karatzas; Josep Llados
Title Embedding Document Structure to Bag-of-Words through Pair-wise Stable Key-regions Type (up) Conference Article
Year 2014 Publication 22nd International Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages 2903 - 2908
Keywords
Abstract Since the document structure carries valuable discriminative information, plenty of efforts have been made for extracting and understanding document structure among which layout analysis approaches are the most commonly used. In this paper, Distance Transform based MSER (DTMSER) is employed to efficiently extract the document structure as a dendrogram of key-regions which roughly correspond to structural elements such as characters, words and paragraphs. Inspired by the Bag
of Words (BoW) framework, we propose an efficient method for structural document matching by representing the document image as a histogram of key-region pairs encoding structural relationships.
Applied to the scenario of document image retrieval, experimental results demonstrate a remarkable improvement when comparing the proposed method with typical BoW and pyramidal BoW methods.
Address Stockholm; Sweden; August 2014
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICPR
Notes DAG; 600.056; 600.061; 600.077 Approved no
Call Number Admin @ si @ GRK2014b Serial 2497
Permanent link to this record