|
V. Chapaprieta and Ernest Valveny. 2001. Handwritten Digit Recognition Using Point Distribution Models.
|
|
|
Arnau Baro, Alicia Fornes and Carles Badal. 2020. Handwritten Historical Music Recognition by Sequence-to-Sequence with Attention Mechanism. 17th International Conference on Frontiers in Handwriting Recognition.
Abstract: Despite decades of research in Optical Music Recognition (OMR), the recognition of old handwritten music scores remains a challenge because of the variabilities in the handwriting styles, paper degradation, lack of standard notation, etc. Therefore, the research in OMR systems adapted to the particularities of old manuscripts is crucial to accelerate the conversion of music scores existing in archives into digital libraries, fostering the dissemination and preservation of our music heritage. In this paper we explore the adaptation of sequence-to-sequence models with attention mechanism (used in translation and handwritten text recognition) and the generation of specific synthetic data for recognizing old music scores. The experimental validation demonstrates that our approach is promising, especially when compared with long short-term memory neural networks.
|
|
|
Arnau Baro, Carles Badal, Pau Torras and Alicia Fornes. 2022. Handwritten Historical Music Recognition through Sequence-to-Sequence with Attention Mechanism. 3rd International Workshop on Reading Music Systems (WoRMS2021). 55–59.
Abstract: Despite decades of research in Optical Music Recognition (OMR), the recognition of old handwritten music scores remains a challenge because of the variabilities in the handwriting styles, paper degradation, lack of standard notation, etc. Therefore, the research in OMR systems adapted to the particularities of old manuscripts is crucial to accelerate the conversion of music scores existing in archives into digital libraries, fostering the dissemination and preservation of our music heritage. In this paper we explore the adaptation of sequence-to-sequence models with attention mechanism (used in translation and handwritten text recognition) and the generation of specific synthetic data for recognizing old music scores. The experimental validation demonstrates that our approach is promising, especially when compared with long short-term memory neural networks.
Keywords: Optical Music Recognition; Digits; Image Classification
|
|
|
Francisco Cruz and Oriol Ramos Terrades. 2013. Handwritten Line Detection via an EM Algorithm. 12th International Conference on Document Analysis and Recognition. 718–722.
Abstract: In this paper we present a handwritten line segmentation method devised to work on documents composed of several paragraphs with multiple line orientations. The method is based on a variation of the EM algorithm for the estimation of a set of regression lines between the connected components that compose the image. We evaluated our method on the ICDAR2009 handwriting segmentation contest dataset with promising results that overcome most of the presented methods. In addition, we prove the usability of the presented method by performing line segmentation on the George Washington database obtaining encouraging results.
|
|
|
Alicia Fornes, Sergio Escalera, Josep Llados, Gemma Sanchez, Petia Radeva and Oriol Pujol. 2007. Handwritten Symbol Recognition by a Boosted Blurred Shape Model with Error Correction. 3rd Iberian Conference on Pattern Recognition and Image Analysis (IbPRIA 2007), J. Marti et al. (Eds.) LNCS 4477:13–21.
|
|
|
Juan Ignacio Toledo, Sebastian Sudholt, Alicia Fornes, Jordi Cucurull, A. Fink and Josep Llados. 2016. Handwritten Word Image Categorization with Convolutional Neural Networks and Spatial Pyramid Pooling. Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR). Springer International Publishing, 543–552. (LNCS.)
Abstract: The extraction of relevant information from historical document collections is one of the key steps in order to make these documents available for access and searches. The usual approach combines transcription and grammars in order to extract semantically meaningful entities. In this paper, we describe a new method to obtain word categories directly from non-preprocessed handwritten word images. The method can be used to directly extract information, being an alternative to the transcription. Thus it can be used as a first step in any kind of syntactical analysis. The approach is based on Convolutional Neural Networks with a Spatial Pyramid Pooling layer to deal with the different shapes of the input images. We performed the experiments on a historical marriage record dataset, obtaining promising results.
Keywords: Document image analysis; Word image categorization; Convolutional neural networks; Named entity detection
|
|
|
Pau Riba, Josep Llados and Alicia Fornes. 2015. Handwritten Word Spotting by Inexact Matching of Grapheme Graphs. 13th International Conference on Document Analysis and Recognition ICDAR2015. 781–785.
Abstract: This paper presents a graph-based word spotting for handwritten documents. Contrary to most word spotting techniques, which use statistical representations, we propose a structural representation suitable to be robust to the inherent deformations of handwriting. Attributed graphs are constructed using a part-based approach. Graphemes extracted from shape convexities are used as stable units of handwriting, and are associated to graph nodes. Then, spatial relations between them determine graph edges. Spotting is defined in terms of an error-tolerant graph matching using bipartite-graph matching algorithm. To make the method usable in large datasets, a graph indexing approach that makes use of binary embeddings is used as preprocessing. Historical documents are used as experimental framework. The approach is comparable to statistical ones in terms of time and memory requirements, especially when dealing with large document collections.
|
|
|
David Fernandez, Josep Llados and Alicia Fornes. 2011. Handwritten Word Spotting in Old Manuscript Images Using a Pseudo-Structural Descriptor Organized in a Hash Structure. In Jordi Vitria, Joao Miguel Raposo and Mario Hernandez, eds. 5th Iberian Conference on Pattern Recognition and Image Analysis. 628–635.
Abstract: There are lots of historical handwritten documents with information that can be used for several studies and projects. The Document Image Analysis and Recognition community is interested in preserving these documents and extracting all the valuable information from them. Handwritten word-spotting is the pattern classification task which consists in detecting handwriting word images. In this work, we have used a query-by-example formalism: we have matched an input image with one or multiple images from handwritten documents to determine the distance that might indicate a correspondence. We have developed an approach based in characteristic Loci Features stored in a hash structure. Document images of the marriage licences of the Cathedral of Barcelona are used as the benchmarking database.
|
|
|
David Fernandez. 2010. Handwritten Word Spotting in Old Manuscript Images using Shape Descriptors. (Master's thesis.)
|
|
|
Jon Almazan, Albert Gordo, Alicia Fornes and Ernest Valveny. 2013. Handwritten Word Spotting with Corrected Attributes. 15th IEEE International Conference on Computer Vision. 1017–1024.
Abstract: We propose an approach to multi-writer word spotting, where the goal is to find a query word in a dataset comprised of document images. We propose an attributes-based approach that leads to a low-dimensional, fixed-length representation of the word images that is fast to compute and, especially, fast to compare. This approach naturally leads to an unified representation of word images and strings, which seamlessly allows one to indistinctly perform query-by-example, where the query is an image, and query-by-string, where the query is a string. We also propose a calibration scheme to correct the attributes scores based on Canonical Correlation Analysis that greatly improves the results on a challenging dataset. We test our approach on two public datasets showing state-of-the-art results.
|
|