|
Albert Gordo, Jaume Gibert, Ernest Valveny, & Marçal Rusiñol. (2010). A Kernel-based Approach to Document Retrieval. In 9th IAPR International Workshop on Document Analysis Systems (377–384).
Abstract: In this paper we tackle the problem of document image retrieval by combining a similarity measure between documents and the probability that a given document belongs to a certain class. The membership probability to a specific class is computed using Support Vector Machines in conjunction with similarity measure based kernel applied to structural document representations. In the presented experiments, we use different document representations, both visual and structural, and we apply them to a database of historical documents. We show how our method based on similarity kernels outperforms the usual distance-based retrieval.
|
|
|
Farshad Nourbakhsh, Dimosthenis Karatzas, & Ernest Valveny. (2010). A polar-based logo representation based on topological and colour features. In 9th IAPR International Workshop on Document Analysis Systems (341–348).
Abstract: In this paper, we propose a novel rotation and scale invariant method for colour logo retrieval and classification, which involves performing a simple colour segmentation and subsequently describing each of the resultant colour components based on a set of topological and colour features. A polar representation is used to represent the logo and the subsequent logo matching is based on Cyclic Dynamic Time Warping (CDTW). We also show how combining information about the global distribution of the logo components and their local neighbourhood using the Delaunay triangulation allows to improve the results. All experiments are performed on a dataset of 2500 instances of 100 colour logo images in different rotations and scales.
|
|
|
Albert Gordo, Alicia Fornes, Ernest Valveny, & Josep Llados. (2010). A Bag of Notes Approach to Writer Identification in Old Handwritten Music Scores. In 9th IAPR International Workshop on Document Analysis Systems (247–254).
Abstract: Determining the authorship of a document, namely writer identification, can be an important source of information for document categorization. Contrary to text documents, the identification of the writer of graphical documents is still a challenge. In this paper we present a robust approach for writer identification in a particular kind of graphical documents, old music scores. This approach adapts the bag of visual terms method for coping with graphic documents. The identification is performed only using the graphical music notation. For this purpose, we generate a graphic vocabulary without recognizing any music symbols, and consequently, avoiding the difficulties in the recognition of hand-drawn symbols in old and degraded documents. The proposed method has been tested on a database of old music scores from the 17th to 19th centuries, achieving very high identification rates.
|
|
|
Partha Pratim Roy, Umapada Pal, & Josep Llados. (2010). Query Driven Word Retrieval in Graphical Documents. In 9th IAPR International Workshop on Document Analysis Systems (191–198).
Abstract: In this paper, we present an approach towards the retrieval of words from graphical document images. In graphical documents, due to presence of multi-oriented characters in non-structured layout, word indexing is a challenging task. The proposed approach uses recognition results of individual components to form character pairs with the neighboring components. An indexing scheme is designed to store the spatial description of components and to access them efficiently. Given a query text word (ascii/unicode format), the character pairs present in it are searched in the document. Next the retrieved character pairs are linked sequentially to form character string. Dynamic programming is applied to find different instances of query words. A string edit distance is used here to match the query word as the objective function. Recognition of multi-scale and multi-oriented character component is done using Support Vector Machine classifier. To consider multi-oriented character strings the features used in the SVM are invariant to character orientation. Experimental results show that the method is efficient to locate a query word from multi-oriented text in graphical documents.
|
|
|
Sebastien Mace, Herve Locteau, Ernest Valveny, & Salvatore Tabbone. (2010). A system to detect rooms in architectural floor plan images. In 9th IAPR International Workshop on Document Analysis Systems (167–174).
Abstract: In this article, a system to detect rooms in architectural floor plan images is described. We first present a primitive extraction algorithm for line detection. It is based on an original coupling of classical Hough transform with image vectorization in order to perform robust and efficient line detection. We show how the lines that satisfy some graphical arrangements are combined into walls. We also present the way we detect some door hypothesis thanks to the extraction of arcs. Walls and door hypothesis are then used by our room segmentation strategy; it consists in recursively decomposing the image until getting nearly convex regions. The notion of convexity is difficult to quantify, and the selection of separation lines between regions can also be rough. We take advantage of knowledge associated to architectural floor plans in order to obtain mostly rectangular rooms. Qualitative and quantitative evaluations performed on a corpus of real documents show promising results.
|
|
|
Antonio Clavelli, Dimosthenis Karatzas, & Josep Llados. (2010). A framework for the assessment of text extraction algorithms on complex colour images. In 9th IAPR International Workshop on Document Analysis Systems (19–26).
Abstract: The availability of open, ground-truthed datasets and clear performance metrics is a crucial factor in the development of an application domain. The domain of colour text image analysis (real scenes, Web and spam images, scanned colour documents) has traditionally suffered from a lack of a comprehensive performance evaluation framework. Such a framework is extremely difficult to specify, and corresponding pixel-level accurate information tedious to define. In this paper we discuss the challenges and technical issues associated with developing such a framework. Then, we describe a complete framework for the evaluation of text extraction methods at multiple levels, provide a detailed ground-truth specification and present a case study on how this framework can be used in a real-life situation.
|
|
|
Marçal Rusiñol, & Josep Llados. (2010). Efficient Logo Retrieval Through Hashing Shape Context Descriptors. In 9th IAPR International Workshop on Document Analysis Systems (215–222).
Abstract: In this paper, we present an approach towards the retrieval of words from graphical document images. In graphical documents, due to presence of multi-oriented characters in non-structured layout, word indexing is a challenging task. The proposed approach uses recognition results of individual components to form character pairs with the neighboring components. An indexing scheme is designed to store the spatial description of components and to access them efficiently. Given a query text word (ascii/unicode format), the character pairs present in it are searched in the document. Next the retrieved character pairs are linked sequentially to form character string. Dynamic programming is applied to find different instances of query words. A string edit distance is used here to match the query word as the objective function. Recognition of multi-scale and multi-oriented character component is done using Support Vector Machine classifier. To consider multi-oriented character strings the features used in the SVM are invariant to character orientation. Experimental results show that the method is efficient to locate a query word from multi-oriented text in graphical documents.
|
|