Veronica Romero and 7 others. 2013. The ESPOSALLES database: An ancient marriage license corpus for off-line handwriting recognition. PR, 46(6), 1658–1669.
Abstract: Historical records of daily activities provide intriguing insights into the life of our ancestors, useful for demography studies and genealogical research. Automatic processing of historical documents, however, has mostly been focused on single works of literature and less on social records, which tend to have a distinct layout, structure, and vocabulary. Such information is usually collected by expert demographers who devote a lot of time to transcribing it manually. This paper presents a new database, compiled from a marriage license books collection, to support research in automatic handwriting recognition for historical documents containing social records. Marriage license books are documents that were used for centuries by ecclesiastical institutions to register marriage licenses. Books from this collection are handwritten and span nearly half a millennium until the beginning of the 20th century. In addition, a study is presented about the capability of state-of-the-art handwritten text recognition systems when applied to the presented database. Baseline results are reported for reference in future studies.
Hongxing Gao and 6 others. 2013. Key-region detection for document images - applications to administrative document retrieval. 12th International Conference on Document Analysis and Recognition. 230–234.
Abstract: In this paper we argue that a key-region detector designed to take into account the special characteristics of document images can result in the detection of fewer and more meaningful key-regions. We propose a fast key-region detector able to capture aspects of the structural information of the document, and demonstrate its efficiency by comparing against standard detectors in an administrative document retrieval scenario. We show that using the proposed detector results in a smaller number of detected key-regions and higher performance without any drop in speed compared to standard state-of-the-art detectors.
Andreas Fischer, Volkmar Frinken, Horst Bunke and Ching Y. Suen. 2013. Improving HMM-Based Keyword Spotting with Character Language Models. 12th International Conference on Document Analysis and Recognition. 506–510.
Abstract: Facing high error rates and slow recognition speed for full text transcription of unconstrained handwriting images, keyword spotting is a promising alternative to locate specific search terms within scanned document images. We have previously proposed a learning-based method for keyword spotting using character hidden Markov models that showed a high performance when compared with traditional template image matching. In the lexicon-free approach pursued, only the text appearance was taken into account for recognition. In this paper, we integrate character n-gram language models into the spotting system in order to provide an additional language context. On the modern IAM database as well as the historical George Washington database, we demonstrate that character language models significantly improve the spotting performance.
Hongxing Gao, Marçal Rusiñol, Dimosthenis Karatzas, Apostolos Antonacopoulos and Josep Llados. 2013. An interactive appearance-based document retrieval system for historical newspapers. Proceedings of the International Conference on Computer Vision Theory and Applications. 84–87.
Abstract: In this paper we present a retrieval-based application aimed at assisting a user to semi-automatically segment an incoming flow of historical newspaper images by automatically detecting a particular type of page based on its appearance. A visual descriptor is used to assess page similarity, while a relevance feedback process allows the results to be refined iteratively. The application is tested on a large dataset of digitised historic newspapers.
Albert Gordo. 2009. A Cyclic Page Layout Descriptor for Document Classification & Retrieval. (Master's thesis, .)
Jaume Gibert, Ernest Valveny and Horst Bunke. 2013. Embedding of Graphs with Discrete Attributes Via Label Frequencies. IJPRAI, 27(3), 1360002–1360029.
Abstract: Graph-based representations of patterns are very flexible and powerful, but they are not easily processed due to the lack of learning algorithms in the domain of graphs. Embedding a graph into a vector space solves this problem since graphs are turned into feature vectors and thus all the statistical learning machinery becomes available for graph input patterns. In this work we present a new way of embedding discrete attributed graphs into vector spaces using node and edge label frequencies. The methodology is experimentally tested on graph classification problems, using patterns of different nature, and it is shown to be competitive with state-of-the-art classification algorithms for graphs, while being computationally much more efficient.
Keywords: Discrete attributed graphs; graph embedding; graph classification
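Illustration: the label-frequency idea in the abstract above can be sketched as follows. This is a minimal sketch, not the authors' exact formulation. It assumes undirected graphs, node labels drawn from a small known alphabet, and edges described only by the labels of their two endpoints. The function name `embed_graph` and its parameters are hypothetical.

```python
from collections import Counter
from itertools import combinations_with_replacement


def embed_graph(node_labels, edges, alphabet):
    """Map a discretely labelled graph to a fixed-length count vector.

    node_labels: dict mapping node id -> label (assumed input format)
    edges: iterable of (u, v) pairs, treated as undirected
    alphabet: collection of all possible labels
    """
    labels = sorted(alphabet)

    # How many nodes carry each label.
    node_counts = Counter(node_labels.values())

    # How many edges join each unordered pair of labels.
    edge_counts = Counter()
    for u, v in edges:
        pair = tuple(sorted((node_labels[u], node_labels[v])))
        edge_counts[pair] += 1

    # Fixed feature order: one entry per label, then one per label pair.
    pairs = list(combinations_with_replacement(labels, 2))
    return [node_counts[a] for a in labels] + [edge_counts[p] for p in pairs]


# Toy graph with two "C" nodes and one "O" node, connected in a chain.
vector = embed_graph({0: "C", 1: "C", 2: "O"}, [(0, 1), (1, 2)], ["C", "O"])
print(vector)  # [2, 1, 1, 1, 0]
```

Because every graph over the same alphabet maps to a vector of the same length, the output can be passed to any standard vector-based classifier.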
Albert Gordo. 2013. Document Image Representation, Classification and Retrieval in Large-Scale Domains. (Ph.D. thesis, Ediciones Graficas Rey.)
Abstract: Despite the “paperless office” ideal that started in the decade of the seventies, businesses still strive against an increasing amount of paper documentation. Companies still receive huge amounts of paper documentation that need to be analyzed and processed, mostly in a manual way. A solution for this task consists in, first, automatically scanning the incoming documents. Then, document images can be analyzed and information can be extracted from the data. Documents can also be automatically dispatched to the appropriate workflows, used to retrieve similar documents in the dataset to transfer information, etc.
Due to the nature of this “digital mailroom”, we need document representation methods to be general, i.e., able to cope with very different types of documents. We need the methods to be sound, i.e., able to cope with unexpected types of documents, noise, etc. And we need the methods to be scalable, i.e., able to cope with thousands or millions of documents that need to be processed, stored, and consulted. Unfortunately, current techniques of document representation, classification and retrieval are not apt for this digital mailroom framework, since they do not fulfill some or all of these requirements.
Through this thesis we focus on the problem of document representation aimed at classification and retrieval tasks under this digital mailroom framework. We first propose a novel document representation based on runlength histograms, and extend it to cope with more complex documents such as multiple-page documents, or documents that contain more sources of information such as extracted OCR text. Then we focus on the scalability requirements and propose a novel binarization method which we dubbed PCAE, as well as two general asymmetric distances between binary embeddings that can significantly improve the retrieval results at a minimal extra computational cost. Finally, we note the importance of supervised learning when performing large-scale retrieval, and study several approaches that can significantly boost the results at no extra cost at query time.
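Illustration: the abstract above mentions a document representation based on runlength histograms but gives no details. The sketch below shows only the general idea under stated assumptions: a binary image given as a list of rows, horizontal runs only, and logarithmically spaced bins. The function names and the `bin_edges` values are hypothetical and are not taken from the thesis.

```python
def run_lengths(row, value):
    """Return the lengths of consecutive runs of `value` in one image row."""
    runs = []
    count = 0
    for pixel in row:
        if pixel == value:
            count += 1
        elif count:
            runs.append(count)
            count = 0
    if count:
        runs.append(count)
    return runs


def runlength_histogram(image, value=1, bin_edges=(1, 2, 4, 8, 16)):
    """Normalised histogram of horizontal run lengths of `value`.

    bin_edges holds the lower edge of each bin (assumed values); the
    last bin collects every run at least as long as its edge.
    """
    hist = [0] * len(bin_edges)
    for row in image:
        for length in run_lengths(row, value):
            # Pick the last bin whose lower edge does not exceed the run.
            idx = max(i for i, edge in enumerate(bin_edges) if edge <= length)
            hist[idx] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Tiny binary "page": 1 marks ink, 0 marks background.
page = [
    [1, 1, 0, 1],
    [0, 1, 1, 1],
]
print(runlength_histogram(page))
```

In this toy example the ink runs have lengths 2, 1 and 3, so one run falls in the first bin and two in the second. A fuller descriptor would presumably combine several directions and both pixel values, but the abstract does not specify this.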
Marçal Rusiñol, R. Roset, Josep Llados and C. Montaner. 2011. Automatic Index Generation of Digitized Map Series by Coordinate Extraction and Interpretation. In Proceedings of the Sixth International Workshop on Digital Technologies in Cartographic Heritage.
Jon Almazan, Alicia Fornes and Ernest Valveny. 2012. A non-rigid appearance model for shape description and recognition. PR, 45(9), 3105–3113.
Abstract: In this paper we describe a framework to learn a model of shape variability in a set of patterns. The framework is based on the Active Appearance Model (AAM) and makes it possible to combine shape deformations with appearance variability. We have used two modifications of the Blurred Shape Model (BSM) descriptor as basic shape and appearance features to learn the model. These modifications overcome the rigidity of the original BSM, adapting it to the deformations of the shape to be represented. We have applied this framework to representation and classification of handwritten digits and symbols. We show that results of the proposed methodology outperform the original BSM approach.
Keywords: Shape recognition; Deformable models; Shape modeling; Hand-drawn recognition
Jaume Gibert, Ernest Valveny and Horst Bunke. 2012. Graph Embedding in Vector Spaces by Node Attribute Statistics. PR, 45(9), 3072–3083.
Abstract: Graph-based representations are of broad use and applicability in pattern recognition. They exhibit, however, a major drawback with regard to the processing tools that are available in their domain. Graph embedding into vector spaces is a growing field among the structural pattern recognition community which aims at providing a feature vector representation for every graph, and thus enables classical statistical learning machinery to be used on graph-based input patterns. In this work, we propose a novel embedding methodology for graphs with continuous node attributes and unattributed edges. The approach presented in this paper is based on statistics of the node labels and the edges between them, based on their similarity to a set of representatives. We specifically deal with an important issue of this methodology, namely, the selection of a suitable set of representatives. In an experimental evaluation, we empirically show the advantages of this novel approach in the context of different classification problems using several databases of graphs.
Keywords: Structural pattern recognition; Graph embedding; Data clustering; Graph classification