|
Volkmar Frinken, Alicia Fornes, Josep Llados and Jean-Marc Ogier. 2012. Bidirectional Language Model for Handwriting Recognition. Structural, Syntactic, and Statistical Pattern Recognition, Joint IAPR International Workshop. Springer Berlin Heidelberg, 611–619. (LNCS.)
Abstract: In order to improve the results of automatically recognized handwritten text, information about the language is commonly included in the recognition process. A common approach is to represent a text line as a sequence. It is processed in one direction and the language information via n-grams is directly included in the decoding. This approach, however, only uses context on one side to estimate a word’s probability. Therefore, we propose a bidirectional recognition in this paper, using distinct forward and backward language models. By combining decoding hypotheses from both directions, we achieve a significant increase in recognition accuracy for the off-line writer independent handwriting recognition task. Both language models are of the same type and can be estimated on the same corpus. Hence, the increase in recognition accuracy comes without any additional need for training data or language modeling complexity.
|
|
|
Anjan Dutta, Josep Llados and Umapada Pal. 2011. A Bag-of-Paths Based Serialized Subgraph Matching for Symbol Spotting in Line Drawings. In Jordi Vitria, Joao Miguel Raposo and Mario Hernandez, eds. 5th Iberian Conference on Pattern Recognition and Image Analysis. Berlin, Springer Berlin Heidelberg, 620–627. (LNCS.)
Abstract: In this paper we propose an error tolerant subgraph matching algorithm based on bag-of-paths for solving the problem of symbol spotting in line drawings. Bag-of-paths is a factorized representation of graphs where the factorization is done by considering all the acyclic paths between each pair of connected nodes. Similar paths within the whole collection of documents are clustered and organized in a lookup table for efficient indexing. The lookup table contains the index key of each cluster and the corresponding list of locations as a single entry. The mean path of each of the clusters serves as the index key for each table entry. The spotting method is then formulated as a spatial voting scheme over the lists of locations of the paths, which are selected by searching for paths similar to those that compose the query symbol. Efficient indexing of common substructures helps to reduce the computational burden of usual graph based methods. The proposed method can also be seen as a way to serialize graphs, which reduces the complexity of the subgraph isomorphism. We have encoded the paths in terms of both attributed strings and turning functions, and presented comparative results between them within the symbol spotting framework. Experiments on matching different shape silhouettes are also reported, and the method has been shown to work in noisy environments as well.
|
|
|
Albert Gordo, Marçal Rusiñol, Dimosthenis Karatzas and Andrew Bagdanov. 2013. Document Classification and Page Stream Segmentation for Digital Mailroom Applications. 12th International Conference on Document Analysis and Recognition. 621–625.
Abstract: In this paper we present a method for the segmentation of continuous page streams into multipage documents and the simultaneous classification of the resulting documents. We first present an approach to combine the multiple pages of a document into a single feature vector that represents the whole document. Despite its simplicity and low computational cost, the proposed representation yields results comparable to more complex methods in multipage document classification tasks. We then exploit this representation in the context of page stream segmentation. The most plausible segmentation of a page stream into a sequence of multipage documents is obtained by optimizing a statistical model that represents the probability of each segmented multipage document belonging to a particular class. Experimental results are reported on a large sample of real administrative multipage documents.
|
|
|
J. Chazalon, Marçal Rusiñol, Jean-Marc Ogier and Josep Llados. 2015. A Semi-Automatic Groundtruthing Tool for Mobile-Captured Document Segmentation. 13th International Conference on Document Analysis and Recognition ICDAR2015. 621–625.
Abstract: This paper presents a novel way to generate groundtruth data for the evaluation of mobile document capture systems, focusing on the first stage of the image processing pipeline involved: document object detection and segmentation in low-quality preview frames. We introduce and describe a simple, robust and fast technique based on color markers which enables a semi-automated annotation of page corners. We also detail a technique for marker removal. Methods and tools presented in the paper were successfully used to annotate, in a few hours, 24889 frames in 150 video files for the smartDOC competition at ICDAR 2015.
|
|
|
Nuria Cirera, Alicia Fornes and Josep Llados. 2015. Hidden Markov model topology optimization for handwriting recognition. 13th International Conference on Document Analysis and Recognition ICDAR2015. 626–630.
Abstract: In this paper we present a method to optimize the topology of linear left-to-right hidden Markov models. These models are very popular for sequential signal modeling on tasks such as handwriting recognition. Many topology definition methods select the number of states for a character model based on character length. This can be a drawback when characters are shorter than the minimum allowed by the model, since they cannot be properly trained or recognized. The proposed method optimizes the number of states per model by automatically including convenient skip-state transitions and therefore it avoids the aforementioned problem. We discuss and compare our method with other character length-based methods such as the Fixed, Bakis and Quantile methods. Our proposal performs well on the off-line handwriting recognition task.
|
|
|
David Fernandez, Josep Llados and Alicia Fornes. 2011. Handwritten Word Spotting in Old Manuscript Images Using a Pseudo-Structural Descriptor Organized in a Hash Structure. In Jordi Vitria, Joao Miguel Raposo and Mario Hernandez, eds. 5th Iberian Conference on Pattern Recognition and Image Analysis. 628–635.
Abstract: There are many historical handwritten documents with information that can be used for several studies and projects. The Document Image Analysis and Recognition community is interested in preserving these documents and extracting all the valuable information from them. Handwritten word-spotting is the pattern classification task which consists in detecting handwritten word images. In this work, we have used a query-by-example formalism: we have matched an input image with one or multiple images from handwritten documents to determine the distance that might indicate a correspondence. We have developed an approach based on characteristic Loci Features stored in a hash structure. Document images of the marriage licences of the Cathedral of Barcelona are used as the benchmarking database.
|
|
|
Alicia Fornes and Josep Llados. 2010. A Symbol-dependent Writer Identification Approach in Old Handwritten Music Scores. 12th International Conference on Frontiers in Handwriting Recognition. 634–639.
Abstract: Writer identification consists in determining the writer of a piece of handwriting from a set of writers. In this paper we introduce a symbol-dependent approach for identifying the writer of old music scores, which is based on two symbol recognition methods. The main idea is to use the Blurred Shape Model descriptor and a DTW-based method for detecting, recognizing and describing the music clefs and notes. The proposed approach has been evaluated on a database of old music scores, achieving very high writer identification rates.
|
|
|
Sophie Wuerger, Kaida Xiao, Chenyang Fu and Dimosthenis Karatzas. 2010. Colour-opponent mechanisms are not affected by age-related chromatic sensitivity changes. OPO, 30(5), 635–659.
Abstract: The purpose of this study was to assess whether age-related chromatic sensitivity changes are associated with corresponding changes in hue perception in a large sample of colour-normal observers over a wide age range (n = 185; age range: 18-75 years). In these observers we determined both the sensitivity along the protan, deutan and tritan line; and settings for the four unique hues, from which the characteristics of the higher-order colour mechanisms can be derived. We found a significant decrease in chromatic sensitivity due to ageing, in particular along the tritan line. From the unique hue settings we derived the cone weightings associated with the colour mechanisms that are at equilibrium for the four unique hues. We found that the relative cone weightings (w(L)/w(M) and w(L)/w(S)) associated with the unique hues were independent of age. Our results are consistent with previous findings that the unique hues are rather constant with age while chromatic sensitivity declines. They also provide evidence in favour of the hypothesis that higher-order colour mechanisms are equipped with flexible cone weightings, as opposed to fixed weights. The mechanism underlying this compensation is still poorly understood.
|
|
|
Ruben Tito, Minesh Mathew, C.V. Jawahar, Ernest Valveny and Dimosthenis Karatzas. 2021. ICDAR 2021 Competition on Document Visual Question Answering. 16th International Conference on Document Analysis and Recognition. 635–649.
Abstract: In this report we present results of the ICDAR 2021 edition of the Document Visual Question Challenges. This edition complements the previous tasks on Single Document VQA and Document Collection VQA with a newly introduced task on Infographics VQA. Infographics VQA is based on a new dataset of more than 5,000 infographics images and 30,000 question-answer pairs. The winning methods have scored 0.6120 ANLS in the Infographics VQA task, 0.7743 ANLSL in the Document Collection VQA task and 0.8705 ANLS in Single Document VQA. We present a summary of the datasets used for each task, a description of each of the submitted methods, and the results and analysis of their performance. A summary of the progress made on Single Document VQA since the first edition of the DocVQA 2020 challenge is also presented.
|
|
|
Robert Benavente, Gemma Sanchez, Ramon Baldrich, Maria Vanrell and Josep Llados. 2000. Normalized colour segmentation for human appearance description. 15th International Conference on Pattern Recognition. 637–641.
|
|