|
Dimosthenis Karatzas and 9 others. 2013. ICDAR 2013 Robust Reading Competition. 12th International Conference on Document Analysis and Recognition. 1484–1493.
Abstract: This report presents the final results of the ICDAR 2013 Robust Reading Competition. The competition is structured in three Challenges addressing text extraction in different application domains, namely born-digital images, real scene images and real-scene videos. The Challenges are organised around specific tasks covering text localisation, text segmentation and word recognition. The competition took place in the first quarter of 2013, and received a total of 42 submissions over the different tasks offered. This report describes the datasets and ground truth specification, details the performance evaluation protocols used and presents the final results along with a brief summary of the participating methods.
|
|
|
Lluis Gomez. 2012. Perceptual Organization for Text Extraction in Natural Scenes. (Master's thesis, .)
|
|
|
Lluis Gomez and Dimosthenis Karatzas. 2013. Multi-script Text Extraction from Natural Scenes. 12th International Conference on Document Analysis and Recognition. 467–471.
Abstract: Scene text extraction methodologies are usually based on classification of individual regions or patches, using a priori knowledge for a given script or language. Human perception of text, on the other hand, is based on perceptual organisation through which text emerges as a perceptually significant group of atomic objects. Therefore humans are able to detect text even in languages and scripts never seen before. In this paper, we argue that the text extraction problem could be posed as the detection of meaningful groups of regions. We present a method built around a perceptual organisation framework that exploits collaboration of proximity and similarity laws to create text-group hypotheses. Experiments demonstrate that our algorithm is competitive with state of the art approaches on a standard dataset covering text in variable orientations and two languages.
|
|
|
Albert Gordo, Florent Perronnin and Ernest Valveny. 2013. Large-scale document image retrieval and classification with runlength histograms and binary embeddings. PR, 46(7), 1898–1905.
Abstract: We present a new document image descriptor based on multi-scale runlength histograms. This descriptor does not rely on layout analysis and can be computed efficiently. We show how this descriptor can achieve state-of-the-art results on two very different public datasets in classification and retrieval tasks. Moreover, we show how we can compress and binarize these descriptors to make them suitable for large-scale applications. We can achieve state-of-the-art results in classification using binary descriptors of as few as 16 to 64 bits.
Keywords: visual document descriptor; compression; large-scale; retrieval; classification
|
|
|
Albert Gordo, Alicia Fornes and Ernest Valveny. 2013. Writer identification in handwritten musical scores with bags of notes. PR, 46(5), 1337–1345.
Abstract: Writer Identification is an important task for the automatic processing of documents. However, the identification of the writer in graphical documents is still challenging. In this work, we adapt the Bag of Visual Words framework to the task of writer identification in handwritten musical scores. A vanilla implementation of this method already performs comparably to the state-of-the-art. Furthermore, we analyze the effect of two improvements of the representation: a Bhattacharyya embedding, which improves the results at virtually no extra cost, and a Fisher Vector representation that very significantly improves the results at the cost of a more complex and costly representation. Experimental evaluation shows results more than 20 points above the state-of-the-art in a new, challenging dataset.
|
|
|
David Fernandez, Simone Marinai, Josep Llados and Alicia Fornes. 2013. Contextual Word Spotting in Historical Manuscripts using Markov Logic Networks. 2nd International Workshop on Historical Document Imaging and Processing. 36–43.
Abstract: Natural languages can often be modelled by suitable grammars whose knowledge can improve the word spotting results. The implicit contextual information is even more useful when dealing with information that is intrinsically described as one collection of records. In this paper, we present one approach to word spotting which uses the contextual information of records to improve the results. The method relies on Markov Logic Networks to probabilistically model the relational organization of handwritten records. The performance has been evaluated on the Barcelona Marriages Dataset that contains structured handwritten records that summarize marriage information.
|
|
|
Volkmar Frinken, Andreas Fischer, Markus Baumgartner and Horst Bunke. 2014. Keyword spotting for self-training of BLSTM NN based handwriting recognition systems. PR, 47(3), 1073–1082.
Abstract: The automatic transcription of unconstrained continuous handwritten text requires well trained recognition systems. The semi-supervised paradigm introduces the concept of not only using labeled data but also unlabeled data in the learning process. Unlabeled data can be gathered at little or no cost. Hence it has the potential to reduce the need for labeling training data, a tedious and costly process. Given a weak initial recognizer trained on labeled data, self-training can be used to recognize unlabeled data and add words that were recognized with high confidence to the training set for re-training. This process is not trivial and requires great care as far as selecting the elements that are to be added to the training set is concerned. In this paper, we propose to use a bidirectional long short-term memory neural network handwriting recognition system for keyword spotting in order to select new elements. A set of experiments shows the high potential of self-training for bootstrapping handwriting recognition systems, both for modern and historical handwritings, and demonstrates the benefits of using keyword spotting over previously published self-training schemes.
Keywords: Document retrieval; Keyword spotting; Handwriting recognition; Neural networks; Semi-supervised learning
|
|
|
Veronica Romero and 7 others. 2013. The ESPOSALLES database: An ancient marriage license corpus for off-line handwriting recognition. PR, 46(6), 1658–1669.
Abstract: Historical records of daily activities provide intriguing insights into the life of our ancestors, useful for demography studies and genealogical research. Automatic processing of historical documents, however, has mostly been focused on single works of literature and less on social records, which tend to have a distinct layout, structure, and vocabulary. Such information is usually collected by expert demographers that devote a lot of time to manually transcribe them. This paper presents a new database, compiled from a marriage license books collection, to support research in automatic handwriting recognition for historical documents containing social records. Marriage license books are documents that were used for centuries by ecclesiastical institutions to register marriage licenses. Books from this collection are handwritten and span nearly half a millennium until the beginning of the 20th century. In addition, a study is presented about the capability of state-of-the-art handwritten text recognition systems, when applied to the presented database. Baseline results are reported for reference in future studies.
|
|
|
Hongxing Gao and 6 others. 2013. Key-region detection for document images - applications to administrative document retrieval. 12th International Conference on Document Analysis and Recognition. 230–234.
Abstract: In this paper we argue that a key-region detector designed to take into account the special characteristics of document images can result in the detection of fewer and more meaningful key-regions. We propose a fast key-region detector able to capture aspects of the structural information of the document, and demonstrate its efficiency by comparing against standard detectors in an administrative document retrieval scenario. We show that using the proposed detector results in a smaller number of detected key-regions and higher performance without any drop in speed compared to standard state of the art detectors.
|
|
|
Andreas Fischer, Ching Y. Suen, Volkmar Frinken, Kaspar Riesen and Horst Bunke. 2013. A Fast Matching Algorithm for Graph-Based Handwriting Recognition. 9th IAPR – TC15 Workshop on Graph-based Representation in Pattern Recognition. Springer Berlin Heidelberg, 194–203. (LNCS.)
Abstract: The recognition of unconstrained handwriting images is usually based on vectorial representation and statistical classification. Despite their high representational power, graphs are rarely used in this field due to a lack of efficient graph-based recognition methods. Recently, graph similarity features have been proposed to bridge the gap between structural representation and statistical classification by means of vector space embedding. This approach has shown a high performance in terms of accuracy but had shortcomings in terms of computational speed. The time complexity of the Hungarian algorithm that is used to approximate the edit distance between two handwriting graphs is demanding for a real-world scenario. In this paper, we propose a faster graph matching algorithm which is derived from the Hausdorff distance. On the historical Parzival database it is demonstrated that the proposed method achieves a speedup factor of 12.9 without significant loss in recognition accuracy.
|
|