|
Josep Llados, Jaime Lopez-Krahe, Gemma Sanchez and Enric Marti. 2000. Interprétation de cartes et plans par mise en correspondance de graphes d'attributs. 12 Congrès Francophone AFRIF–AFIA. 225–234.
|
|
|
Oriol Ramos Terrades, N. Serrano, Albert Gordo, Ernest Valveny and Alfons Juan-Ciscar. 2010. Interactive-predictive detection of handwritten text blocks. 17th Document Recognition and Retrieval Conference, part of the IS&T-SPIE Electronic Imaging Symposium. 75340Q–75340Q–10.
Abstract: A method for text block detection is introduced for old handwritten documents. The proposed method takes advantage of sequential book structure, taking into account layout information from pages previously transcribed. This glance at the past is used to predict the position of text blocks in the current page with the help of conventional layout analysis methods. The method is integrated into the GIDOC prototype: a first attempt to provide integrated support for interactive-predictive page layout analysis, text line detection and handwritten text transcription. Results are given in a transcription task on a 764-page Spanish manuscript from 1891.
|
|
|
Marçal Rusiñol, David Aldavert, Dimosthenis Karatzas, Ricardo Toledo and Josep Llados. 2011. Interactive Trademark Image Retrieval by Fusing Semantic and Visual Content. Advances in Information Retrieval. In P. Clough and 6 others, eds. 33rd European Conference on Information Retrieval. Berlin, Springer, 314–325. (LNCS.)
Abstract: In this paper we propose an efficient query-by-example retrieval system which is able to retrieve trademark images by similarity from patent and trademark offices' digital libraries. Logo images are described by both their semantic content, by means of the Vienna codes, and their visual content, by using shape and color as visual cues. The trademark descriptors are then indexed by a locality-sensitive hashing data structure in order to perform approximate k-NN search in high-dimensional spaces in sub-linear time. The resulting ranked lists are combined by using the Condorcet method, and a relevance feedback step helps to iteratively revise the query and refine the obtained results. The experiments demonstrate the effectiveness and efficiency of this system on a realistic and large dataset.
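The rank-combination step mentioned in this abstract can be illustrated with a minimal sketch of Condorcet-style fusion. It assumes a Copeland-style scoring (one point per pairwise majority win) with alphabetical tie-breaking; the function name `condorcet_fuse` and the example identifiers are hypothetical, and the paper's exact procedure may differ.

```python
from itertools import combinations


def condorcet_fuse(rankings):
    """Fuse ranked lists by counting pairwise majority wins (Copeland-style).

    Assumption: an item absent from a list is treated as ranked after
    every item that the list does contain.
    """
    items = sorted({item for ranking in rankings for item in ranking})
    # One position lookup per input list.
    positions = [{item: i for i, item in enumerate(r)} for r in rankings]
    wins = {item: 0 for item in items}
    for a, b in combinations(items, 2):
        a_pref = sum(1 for p in positions if p.get(a, len(p)) < p.get(b, len(p)))
        b_pref = sum(1 for p in positions if p.get(b, len(p)) < p.get(a, len(p)))
        if a_pref > b_pref:
            wins[a] += 1
        elif b_pref > a_pref:
            wins[b] += 1
    # Most pairwise wins first; ties broken alphabetically for determinism.
    return sorted(items, key=lambda item: (-wins[item], item))


# Hypothetical ranked lists from the semantic, shape and color descriptors.
semantic = ["logo_a", "logo_b", "logo_c"]
shape = ["logo_b", "logo_a", "logo_c"]
color = ["logo_a", "logo_c", "logo_b"]
fused = condorcet_fuse([semantic, shape, color])
```

Here `logo_a` beats both other items in a majority of lists and `logo_b` beats `logo_c`, so the fused order is `logo_a`, `logo_b`, `logo_c`.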
|
|
|
Oriol Ramos Terrades, Alejandro Hector Toselli, Nicolas Serrano, Veronica Romero, Enrique Vidal and Alfons Juan. 2010. Interactive layout analysis and transcription systems for historic handwritten documents. 10th ACM Symposium on Document Engineering. 219–222.
Abstract: The amount of digitized legacy documents has risen dramatically in recent years, mainly due to the increasing number of online digital libraries publishing this kind of document, which wait to be classified and finally transcribed into a textual electronic format (such as ASCII or PDF). Nevertheless, most of the available fully automatic applications addressing this task are far from perfect, and heavy, inefficient human intervention is often required to check and correct the results of such systems. In contrast, multimodal interactive-predictive approaches may allow users to participate in the process, helping the system to improve overall performance. With this in mind, two sets of recent advances are introduced in this work: a novel interactive method for text block detection and two multimodal interactive handwritten text transcription systems which use active learning and interactive-predictive technologies in the recognition process.
Keywords: Handwriting recognition; Interactive predictive processing; Partial supervision; Interactive layout analysis
|
|
|
Ernest Valveny, Oriol Ramos Terrades, Joan Mas and Marçal Rusiñol. 2013. Interactive Document Retrieval and Classification. In Angel Sappa and Jordi Vitria, eds. Multimodal Interaction in Image and Video Applications. Springer Berlin Heidelberg, 17–30.
Abstract: In this chapter we describe a system for document retrieval and classification following the interactive-predictive framework. In particular, the system addresses two different scenarios of document analysis: document classification based on visual appearance and logo detection. These two classical problems of document analysis are formulated following the interactive-predictive model, taking the user interaction into account to ease the process of annotating and labelling the documents. A system implementing this model in a real scenario is presented and analyzed. This system also takes advantage of active learning techniques to speed up the task of labelling the documents.
|
|
|
David Aldavert, Marçal Rusiñol, Ricardo Toledo and Josep Llados. 2013. Integrating Visual and Textual Cues for Query-by-String Word Spotting. 12th International Conference on Document Analysis and Recognition. 511–515.
Abstract: In this paper, we present a word spotting framework that follows the query-by-string paradigm where word images are represented both by textual and visual representations. The textual representation is formulated in terms of character $n$-grams while the visual one is based on the bag-of-visual-words scheme. These two representations are merged together and projected to a sub-vector space. This transform makes it possible, given a textual query, to retrieve word instances that were only represented by the visual modality. Moreover, this statistical representation can be used together with state-of-the-art indexation structures in order to deal with large-scale scenarios. The proposed method is evaluated using a collection of historical documents, outperforming state-of-the-art methods.
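As a rough illustration of a character $n$-gram textual representation, the sketch below counts the $n$-grams of a query string. The boundary markers `^` and `$` and the function name `char_ngrams` are assumptions made for this example, not details taken from the paper.

```python
from collections import Counter


def char_ngrams(word, n=2):
    """Return a bag (multiset) of the character n-grams of `word`.

    Assumption: the word is padded with '^' and '$' so that n-grams
    at the start and end of the word are distinguishable.
    """
    padded = f"^{word}$"
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


# Bigrams of a short query string.
bigrams = char_ngrams("abba", n=2)
```

For the string `abba` this yields the five bigrams `^a`, `ab`, `bb`, `ba` and `a$`, each with count 1. Such a count vector is the kind of textual descriptor that could then be merged with a visual one.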
|
|
|
Youssef El Rhabi, Loic Simon, Luc Brun, Josep Llados and Felipe Lumbreras. 2016. Information Theoretic Rotationwise Robust Binary Descriptor Learning. Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR). 368–378.
Abstract: In this paper, we propose a new data-driven approach for binary descriptor selection. In order to draw a clear analysis of common designs, we present a general information-theoretic selection paradigm. It encompasses several standard binary descriptor construction schemes, including a recent state-of-the-art one named BOLD. We pursue the same endeavor to increase the stability of the produced descriptors with respect to rotations. To achieve this goal, we have designed a novel offline selection criterion which is better adapted to the online matching procedure. The effectiveness of our approach is demonstrated on two standard datasets, where our descriptor is compared to BOLD and to several classical descriptors. In particular, it emerges that our approach can achieve performance equivalent to, if not better than, that of BOLD while relying on descriptors half as long. Such an improvement can be influential for real-time applications.
|
|
|
Veronica Romero, Alicia Fornes, Enrique Vidal and Joan Andreu Sanchez. 2017. Information Extraction in Handwritten Marriage Licenses Books Using the MGGI Methodology. In L.A. Alexandre, J. Salvador Sanchez and Joao M. F. Rodriguez, eds. 8th Iberian Conference on Pattern Recognition and Image Analysis. 287–294. (LNCS.)
Abstract: Historical records of daily activities provide intriguing insights into the life of our ancestors, useful for demographic and genealogical research. For example, marriage license books have been used for centuries by ecclesiastical and secular institutions to register marriages. These books follow a simple structure of the text in the records with an evolutionary vocabulary, mainly composed of proper names that change over time. This distinct vocabulary makes automatic transcription and semantic information extraction difficult tasks. In previous works we studied the use of category-based language models and how a Grammatical Inference technique known as MGGI could improve the accuracy of these tasks. In this work we analyze the main causes of the semantic errors observed in previous results and apply a better implementation of the MGGI technique to solve these problems. Using the resulting language model, transcription and information extraction experiments have been carried out, and the results support our proposed approach.
Keywords: Handwritten Text Recognition; Information extraction; Language modeling; MGGI; Categories-based language model
|
|
|
Veronica Romero, Emilio Granell, Alicia Fornes, Enrique Vidal and Joan Andreu Sanchez. 2019. Information Extraction in Handwritten Marriage Licenses Books. 5th International Workshop on Historical Document Imaging and Processing. 66–71.
Abstract: Handwritten marriage licenses books are characterized by a simple structure of the text in the records with an evolutionary vocabulary, mainly composed of proper names that change over time. This distinct vocabulary makes automatic transcription and semantic information extraction difficult tasks. Previous works have shown that the use of category-based language models and a Grammatical Inference technique known as MGGI can improve the accuracy of these tasks. However, the application of the MGGI algorithm requires a priori knowledge to label the words of the training strings, which is not always easy to obtain. In this paper we study how to automatically obtain the information required by the MGGI algorithm using a technique based on Confusion Networks. Using the resulting language model, full handwritten text recognition and information extraction experiments have been carried out with results supporting the proposed approach.
|
|
|
Juan Ignacio Toledo, Manuel Carbonell, Alicia Fornes and Josep Llados. 2019. Information Extraction from Historical Handwritten Document Images with a Context-aware Neural Model. PR, 86, 27–36.
Abstract: Many historical manuscripts that hold trustworthy memories of past societies contain information organized in a structured layout (e.g. census, birth or marriage records). The precious information stored in these documents cannot be effectively used or accessed without costly annotation efforts. The transcription driven by the semantic categories of words is crucial for the subsequent access. In this paper we describe an approach to extract information from structured historical handwritten text images and build a knowledge representation for the extraction of meaning out of historical data. The method extracts information, such as named entities, without the need of an intermediate transcription step, thanks to the incorporation of context information through language models. Our system has two variants: the first one is based on bigrams, whereas the second one is based on recurrent neural networks. Concretely, our second architecture integrates a Convolutional Neural Network to model visual information from word images together with a Bidirectional Long Short Term Memory network to model the relation among the words. This integrated sequential approach is able to extract more information than just the semantic category (e.g. a semantic category can be associated to a person in a record). Our system is generic: it deals with out-of-vocabulary words by design, and it can be applied to structured handwritten texts from different domains. The method has been validated with the ICDAR IEHHR competition protocol, outperforming the existing approaches.
Keywords: Document image analysis; Handwritten documents; Named entity recognition; Deep neural networks
|
|