|
Marçal Rusiñol, Farshad Nourbakhsh, Dimosthenis Karatzas, Ernest Valveny and Josep Llados. 2010. Perceptual Image Retrieval by Adding Color Information to the Shape Context Descriptor. 20th International Conference on Pattern Recognition.1594–1597.
Abstract: In this paper we present a method for the retrieval of images in terms of perceptual similarity. Local color information is added to the shape context descriptor in order to obtain an object description integrating both shape and color as visual cues. We use a color naming algorithm in order to represent the color information from a perceptual point of view. The proposed method has been tested in two different applications, an object retrieval scenario based on color sketch queries and a color trademark retrieval problem. Experimental results show that the addition of the color information significantly outperforms the sole use of the shape context descriptor.
|
|
|
Muhammad Muzzamil Luqman, Thierry Brouard, Jean-Yves Ramel and Josep Llados. 2010. A Content Spotting System For Line Drawing Graphic Document Images. 20th International Conference on Pattern Recognition.3420–3423.
Abstract: We present a content spotting system for line drawing graphic document images. The proposed system is sufficiently domain independent and takes the keyword based information retrieval for graphic documents, one step forward, to Query By Example (QBE) and focused retrieval. During offline learning mode: we vectorize the documents in the repository, represent them by attributed relational graphs, extract regions of interest (ROIs) from them, convert each ROI to a fuzzy structural signature, cluster similar signatures to form ROI classes and build an index for the repository. During online querying mode: a Bayesian network classifier recognizes the ROIs in the query image and the corresponding documents are fetched by looking up in the repository index. Experimental results are presented for synthetic images of architectural and electronic documents.
|
|
|
Albert Gordo and Florent Perronnin. 2010. A Bag-of-Pages Approach to Unordered Multi-Page Document Classification. 20th International Conference on Pattern Recognition.1920–1923.
Abstract: We consider the problem of classifying documents containing multiple unordered pages. For this purpose, we propose a novel bag-of-pages document representation. To represent a document, one assigns every page to a prototype in a codebook of pages. This leads to a histogram representation which can then be fed to any discriminative classifier. We also consider several refinements over this initial approach. We show on two challenging datasets that the proposed approach significantly outperforms a baseline system.
|
|
|
Alicia Fornes and Josep Llados. 2010. A Symbol-dependent Writer Identifcation Approach in Old Handwritten Music Scores. 12th International Conference on Frontiers in Handwriting Recognition.634–639.
Abstract: Writer identification consists in determining the writer of a piece of handwriting from a set of writers. In this paper we introduce a symbol-dependent approach for identifying the writer of old music scores, which is based on two symbol recognition methods. The main idea is to use the Blurred Shape Model descriptor and a DTW-based method for detecting, recognizing and describing the music clefs and notes. The proposed approach has been evaluated in a database of old music scores, achieving very high writer identification rates.
|
|
|
Alicia Fornes, Volkmar Frinken, Andreas Fischer, Jon Almazan, G. Jackson and Horst Bunke. 2011. A Keyword Spotting Approach Using Blurred Shape Model-Based Descriptors. Proceedings of the 2011 Workshop on Historical Document Imaging and Processing. ACM, 83–90.
Abstract: The automatic processing of handwritten historical documents is considered a hard problem in pattern recognition. In addition to the challenges given by modern handwritten data, a lack of training data as well as effects caused by the degradation of documents can be observed. In this scenario, keyword spotting arises to be a viable solution to make documents amenable for searching and browsing. For this task we propose the adaptation of shape descriptors used in symbol recognition. By treating each word image as a shape, it can be represented using the Blurred Shape Model and the De-formable Blurred Shape Model. Experiments on the George Washington database demonstrate that this approach is able to outperform the commonly used Dynamic Time Warping approach.
|
|
|
Thanh Ha Do, Salvatore Tabbone and Oriol Ramos Terrades. 2013. Document noise removal using sparse representations over learned dictionary. Symposium on Document engineering.161–168.
Abstract: best paper award
In this paper, we propose an algorithm for denoising document images using sparse representations. Following a training set, this algorithm is able to learn the main document characteristics and also, the kind of noise included into the documents. In this perspective, we propose to model the noise energy based on the normalized cross-correlation between pairs of noisy and non-noisy documents. Experimental
results on several datasets demonstrate the robustness of our method compared with the state-of-the-art.
|
|
|
David Fernandez, Simone Marinai, Josep Llados and Alicia Fornes. 2013. Contextual Word Spotting in Historical Manuscripts using Markov Logic Networks. 2nd International Workshop on Historical Document Imaging and Processing.36–43.
Abstract: Natural languages can often be modelled by suitable grammars whose knowledge can improve the word spotting results. The implicit contextual information is even more useful when dealing with information that is intrinsically described as one collection of records. In this paper, we present one approach to word spotting which uses the contextual information of records to improve the results. The method relies on Markov Logic Networks to probabilistically model the relational organization of handwritten records. The performance has been evaluated on the Barcelona Marriages Dataset that contains structured handwritten records that summarize marriage information.
|
|
|
Volkmar Frinken, Andreas Fischer and Carlos David Martinez Hinarejos. 2013. Handwriting Recognition in Historical Documents using Very Large Vocabularies. 2nd International Workshop on Historical Document Imaging and Processing.67–72.
Abstract: Language models are used in automatic transcription system to resolve ambiguities. This is done by limiting the vocabulary of words that can be recognized as well as estimating the n-gram probability of the words in the given text. In the context of historical documents, a non-unified spelling and the limited amount of written text pose a substantial problem for the selection of the recognizable vocabulary as well as the computation of the word probabilities. In this paper we propose for the transcription of historical Spanish text to keep the corpus for the n-gram limited to a sample of the target text, but expand the vocabulary with words gathered from external resources. We analyze the performance of such a transcription system with different sizes of external vocabularies and demonstrate the applicability and the significant increase in recognition accuracy of using up to 300 thousand external words.
|
|
|
Ariel Amato, Angel Sappa, Alicia Fornes, Felipe Lumbreras and Josep Llados. 2013. Divide and Conquer: Atomizing and Parallelizing A Task in A Mobile Crowdsourcing Platform. 2nd International ACM Workshop on Crowdsourcing for Multimedia.21–22.
Abstract: In this paper we present some conclusions about the advantages of having an efficient task formulation when a crowdsourcing platform is used. In particular we show how the task atomization and distribution can help to obtain results in an efficient way. Our proposal is based on a recursive splitting of the original task into a set of smaller and simpler tasks. As a result both more accurate and faster solutions are obtained. Our evaluation is performed on a set of ancient documents that need to be digitized.
|
|
|
Alicia Fornes, Josep Llados, Joan Mas, Joana Maria Pujadas-Mora and Anna Cabre. 2014. A Bimodal Crowdsourcing Platform for Demographic Historical Manuscripts. Digital Access to Textual Cultural Heritage Conference.103–108.
Abstract: In this paper we present a crowdsourcing web-based application for extracting information from demographic handwritten document images. The proposed application integrates two points of view: the semantic information for demographic research, and the ground-truthing for document analysis research. Concretely, the application has the contents view, where the information is recorded into forms, and the labeling view, with the word labels for evaluating document analysis techniques. The crowdsourcing architecture allows to accelerate the information extraction (many users can work simultaneously), validate the information, and easily provide feedback to the users. We finally show how the proposed application can be extended to other kind of demographic historical manuscripts.
|
|