|
Miquel Ferrer, I. Bardaji, Ernest Valveny, Dimosthenis Karatzas and Horst Bunke. 2013. Median Graph Computation by Means of Graph Embedding into Vector Spaces. In Yun Fu and Yungian Ma, eds. Graph Embedding for Pattern Analysis. Springer New York, 45–72.
Abstract: In pattern recognition [8, 14], a key issue to be addressed when designing a system is how to represent input patterns. Feature vectors is a common option. That is, a set of numerical features describing relevant properties of the pattern are computed and arranged in a vector form. The main advantages of this kind of representation are computational simplicity and a well sound mathematical foundation. Thus, a large number of operations are available to work with vectors and a large repository of algorithms for pattern analysis and classification exist. However, the simple structure of feature vectors might not be the best option for complex patterns where nonnumerical features or relations between different parts of the pattern become relevant.
|
|
|
Giuseppe De Gregorio and 6 others. 2022. A Few Shot Multi-representation Approach for N-Gram Spotting in Historical Manuscripts. Frontiers in Handwriting Recognition. International Conference on Frontiers in Handwriting Recognition (ICFHR2022).3–12. (LNCS.)
Abstract: Despite recent advances in automatic text recognition, the performance remains moderate when it comes to historical manuscripts. This is mainly because of the scarcity of available labelled data to train the data-hungry Handwritten Text Recognition (HTR) models. The Keyword Spotting System (KWS) provides a valid alternative to HTR due to the reduction in error rate, but it is usually limited to a closed reference vocabulary. In this paper, we propose a few-shot learning paradigm for spotting sequences of a few characters (N-gram) that requires a small amount of labelled training data. We exhibit that recognition of important n-grams could reduce the system’s dependency on vocabulary. In this case, an out-of-vocabulary (OOV) word in an input handwritten line image could be a sequence of n-grams that belong to the lexicon. An extensive experimental evaluation of our proposed multi-representation approach was carried out on a subset of Bentham’s historical manuscript collections to obtain some really promising results in this direction.
Keywords: N-gram spotting; Few-shot learning; Multimodal understanding; Historical handwritten collections
|
|
|
Arnau Baro, Pau Riba and Alicia Fornes. 2022. Musigraph: Optical Music Recognition Through Object Detection and Graph Neural Network. Frontiers in Handwriting Recognition. International Conference on Frontiers in Handwriting Recognition (ICFHR2022).171–184. (LNCS.)
Abstract: During the last decades, the performance of optical music recognition has been increasingly improving. However, and despite the 2-dimensional nature of music notation (e.g. notes have rhythm and pitch), most works treat musical scores as a sequence of symbols in one dimension, which make their recognition still a challenge. Thus, in this work we explore the use of graph neural networks for musical score recognition. First, because graphs are suited for n-dimensional representations, and second, because the combination of graphs with deep learning has shown a great performance in similar applications. Our methodology consists of: First, we will detect each isolated/atomic symbols (those that can not be decomposed in more graphical primitives) and the primitives that form a musical symbol. Then, we will build the graph taking as root node the notehead and as leaves those primitives or symbols that modify the note’s rhythm (stem, beam, flag) or pitch (flat, sharp, natural). Finally, the graph is translated into a human-readable character sequence for a final transcription and evaluation. Our method has been tested on more than five thousand measures, showing promising results.
Keywords: Object detection; Optical music recognition; Graph neural network
|
|
|
Utkarsh Porwal, Alicia Fornes and Faisal Shafait, eds. 2022. Frontiers in Handwriting Recognition. International Conference on Frontiers in Handwriting Recognition. 18th International Conference, ICFHR 2022. Springer. (LNCS.)
|
|
|
Gemma Sanchez, Josep Llados and K. Tombre. 2001. An Error-Correction Graph Grammar to Recognize Textured Symbols..
|
|
|
Debora Gil, Oriol Ramos Terrades and Raquel Perez. 2021. Topological Radiomics (TOPiomics): Early Detection of Genetic Abnormalities in Cancer Treatment Evolution. Extended Abstracts GEOMVAP 2019, Trends in Mathematics 15. Springer Nature, 89–93.
Abstract: Abnormalities in radiomic measures correlate to genomic alterations prone to alter the outcome of personalized anti-cancer treatments. TOPiomics is a new method for the early detection of variations in tumor imaging phenotype from a topological structure in multi-view radiomic spaces.
|
|
|
Ernest Valveny, Robert Benavente, Agata Lapedriza, Miquel Ferrer, Jaume Garcia and Gemma Sanchez. 2012. Adaptation of a computer programming course to the EXHE requirements: evaluation five years later.
|
|
|
Ali Furkan Biten, Ruben Tito, Lluis Gomez, Ernest Valveny and Dimosthenis Karatzas. 2022. OCR-IDL: OCR Annotations for Industry Document Library Dataset. ECCV Workshop on Text in Everything.
Abstract: Pretraining has proven successful in Document Intelligence tasks where deluge of documents are used to pretrain the models only later to be finetuned on downstream tasks. One of the problems of the pretraining approaches is the inconsistent usage of pretraining data with different OCR engines leading to incomparable results between models. In other words, it is not obvious whether the performance gain is coming from diverse usage of amount of data and distinct OCR engines or from the proposed models. To remedy the problem, we make public the OCR annotations for IDL documents using commercial OCR engine given their superior performance over open source OCR models. The contributed dataset (OCR-IDL) has an estimated monetary value over 20K US$. It is our hope that OCR-IDL can be a starting point for future works on Document Intelligence. All of our data and its collection process with the annotations can be found in this https URL.
|
|
|
Juan Ignacio Toledo, Jordi Cucurull, Jordi Puiggali, Alicia Fornes and Josep Llados. 2015. Document Analysis Techniques for Automatic Electoral Document Processing: A Survey. E-Voting and Identity, Proceedings of 5th international conference, VoteID 2015.139–141. (LNCS.)
Abstract: In this paper, we will discuss the most common challenges in electoral document processing and study the different solutions from the document analysis community that can be applied in each case. We will cover Optical Mark Recognition techniques to detect voter selections in the Australian Ballot, handwritten number recognition for preferential elections and handwriting recognition for write-in areas. We will also propose some particular adjustments that can be made to those general techniques in the specific context of electoral documents.
Keywords: Document image analysis; Computer vision; Paper ballots; Paper based elections; Optical scan; Tally
|
|
|
Marçal Rusiñol, R.Roset, Josep Llados and C.Montaner. 2011. Automatic Index Generation of Digitized Map Series by Coordinate Extraction and Interpretation.
Abstract: By means of computer vision algorithms scanned images of maps are processed in order to extract relevant geographic information from printed coordinate pairs. The meaningful information is then transformed into georeferencing information for each single map sheet, and the complete set is compiled to produce a graphical index sheet for the map series along with relevant metadata. The whole process is fully automated and trained to attain maximum effectivity and throughput.
|
|