|
Asma Bensalah, Antonio Parziale, Giuseppe De Gregorio, Angelo Marcelli, Alicia Fornes and Josep Llados. 2023. I Can’t Believe It’s Not Better: In-air Movement for Alzheimer Handwriting Synthetic Generation. 21st International Graphonomics Conference.136–148.
Abstract: During recent years, there here has been a boom in terms of deep learning use for handwriting analysis and recognition. One main application for handwriting analysis is early detection and diagnosis in the health field. Unfortunately, most real case problems still suffer a scarcity of data, which makes difficult the use of deep learning-based models. To alleviate this problem, some works resort to synthetic data generation. Lately, more works are directed towards guided data synthetic generation, a generation that uses the domain and data knowledge to generate realistic data that can be useful to train deep learning models. In this work, we combine the domain knowledge about the Alzheimer’s disease for handwriting and use it for a more guided data generation. Concretely, we have explored the use of in-air movements for synthetic data generation.
|
|
|
Arnau Baro, Pau Riba and Alicia Fornes. 2022. Musigraph: Optical Music Recognition Through Object Detection and Graph Neural Network. Frontiers in Handwriting Recognition. International Conference on Frontiers in Handwriting Recognition (ICFHR2022).171–184. (LNCS.)
Abstract: During the last decades, the performance of optical music recognition has been increasingly improving. However, and despite the 2-dimensional nature of music notation (e.g. notes have rhythm and pitch), most works treat musical scores as a sequence of symbols in one dimension, which make their recognition still a challenge. Thus, in this work we explore the use of graph neural networks for musical score recognition. First, because graphs are suited for n-dimensional representations, and second, because the combination of graphs with deep learning has shown a great performance in similar applications. Our methodology consists of: First, we will detect each isolated/atomic symbols (those that can not be decomposed in more graphical primitives) and the primitives that form a musical symbol. Then, we will build the graph taking as root node the notehead and as leaves those primitives or symbols that modify the note’s rhythm (stem, beam, flag) or pitch (flat, sharp, natural). Finally, the graph is translated into a human-readable character sequence for a final transcription and evaluation. Our method has been tested on more than five thousand measures, showing promising results.
Keywords: Object detection; Optical music recognition; Graph neural network
|
|
|
Pau Riba, Josep Llados and Alicia Fornes. 2020. Hierarchical graphs for coarse-to-fine error tolerant matching. PRL, 134, 116–124.
Abstract: During the last years, graph-based representations are experiencing a growing usage in visual recognition and retrieval due to their ability to capture both structural and appearance-based information. Thus, they provide a greater representational power than classical statistical frameworks. However, graph-based representations leads to high computational complexities usually dealt by graph embeddings or approximated matching techniques. Despite their representational power, they are very sensitive to noise and small variations of the input image. With the aim to cope with the time complexity and the variability present in the generated graphs, in this paper we propose to construct a novel hierarchical graph representation. Graph clustering techniques adapted from social media analysis have been used in order to contract a graph at different abstraction levels while keeping information about the topology. Abstract nodes attributes summarise information about the contracted graph partition. For the proposed representations, a coarse-to-fine matching technique is defined. Hence, small graphs are used as a filtering before more accurate matching methods are applied. This approach has been validated in real scenarios such as classification of colour images or retrieval of handwritten words (i.e. word spotting).
Keywords: Hierarchical graph representation; Coarse-to-fine graph matching; Graph-based retrieval
|
|
|
Jaume Gibert, Ernest Valveny, Oriol Ramos Terrades and Horst Bunke. 2011. Multiple Classifiers for Graph of Words Embedding. In Carlo Sansone, Josef Kittler and Fabio Roli, eds. 10th International Conference on Multiple Classifier Systems.36–45. (LNCS.)
Abstract: During the last years, there has been an increasing interest in applying the multiple classifier framework to the domain of structural pattern recognition. Constructing base classifiers when the input patterns are graph based representations is not an easy problem. In this work, we make use of the graph embedding methodology in order to construct different feature vector representations for graphs. The graph of words embedding assigns a feature vector to every graph by counting unary and binary relations between node representatives and combining these pieces of information into a single vector. Selecting different node representatives leads to different vectorial representations and therefore to different base classifiers that can be combined. We experimentally show how this methodology significantly improves the classification of graphs with respect to single base classifiers.
|
|
|
P. Wang, V. Eglin, C. Garcia, C. Largeron, Josep Llados and Alicia Fornes. 2014. Représentation par graphe de mots manuscrits dans les images pour la recherche par similarité. Colloque International Francophone sur l'Écrit et le Document.233–248.
Abstract: Effective information retrieval on handwritten document images has always been
a challenging task. In this paper, we propose a novel handwritten word spotting approach based on graph representation. The presented model comprises both topological and morphological signatures of handwriting. Skeleton-based graphs with the Shape Context labeled vertexes are established for connected components. Each word image is represented as a sequence of graphs. In order to be robust to the handwriting variations, an exhaustive merging process based on DTW alignment results introduced in the similarity measure between word images. With respect to the computation complexity, an approximate graph edit distance approach using bipartite matching is employed for graph matching. The experiments on the George Washington dataset and the marriage records from the Barcelona Cathedral dataset demonstrate that the proposed approach outperforms the state-of-the-art structural methods.
Keywords: word spotting; graph-based representation; shape context description; graph edit distance; DTW; block merging; query by example
|
|
|
P. Wang, V. Eglin, C. Garcia, C. Largeron, Josep Llados and Alicia Fornes. 2014. A Coarse-to-Fine Word Spotting Approach for Historical Handwritten Documents Based on Graph Embedding and Graph Edit Distance. 22nd International Conference on Pattern Recognition.3074–3079.
Abstract: Effective information retrieval on handwritten document images has always been a challenging task, especially historical ones. In the paper, we propose a coarse-to-fine handwritten word spotting approach based on graph representation. The presented model comprises both the topological and morphological signatures of the handwriting. Skeleton-based graphs with the Shape Context labelled vertexes are established for connected components. Each word image is represented as a sequence of graphs. Aiming at developing a practical and efficient word spotting approach for large-scale historical handwritten documents, a fast and coarse comparison is first applied to prune the regions that are not similar to the query based on the graph embedding methodology. Afterwards, the query and regions of interest are compared by graph edit distance based on the Dynamic Time Warping alignment. The proposed approach is evaluated on a public dataset containing 50 pages of historical marriage license records. The results show that the proposed approach achieves a compromise between efficiency and accuracy.
Keywords: word spotting; coarse-to-fine mechamism; graphbased representation; graph embedding; graph edit distance
|
|
|
P. Wang, V. Eglin, C. Garcia, C. Largeron, Josep Llados and Alicia Fornes. 2014. A Novel Learning-free Word Spotting Approach Based on Graph Representation. 11th IAPR International Workshop on Document Analysis and Systems.207–211.
Abstract: Effective information retrieval on handwritten document images has always been a challenging task. In this paper, we propose a novel handwritten word spotting approach based on graph representation. The presented model comprises both topological and morphological signatures of handwriting. Skeleton-based graphs with the Shape Context labelled vertexes are established for connected components. Each word image is represented as a sequence of graphs. In order to be robust to the handwriting variations, an exhaustive merging process based on DTW alignment result is introduced in the similarity measure between word images. With respect to the computation complexity, an approximate graph edit distance approach using bipartite matching is employed for graph matching. The experiments on the George Washington dataset and the marriage records from the Barcelona Cathedral dataset demonstrate that the proposed approach outperforms the state-of-the-art structural methods.
|
|
|
Anguelos Nicolaou, Sounak Dey, V.Christlein, A.Maier and Dimosthenis Karatzas. 2018. Non-deterministic Behavior of Ranking-based Metrics when Evaluating Embeddings. International Workshop on Reproducible Research in Pattern Recognition.71–82. (LNCS.)
Abstract: Embedding data into vector spaces is a very popular strategy of pattern recognition methods. When distances between embeddings are quantized, performance metrics become ambiguous. In this paper, we present an analysis of the ambiguity quantized distances introduce and provide bounds on the effect. We demonstrate that it can have a measurable effect in empirical data in state-of-the-art systems. We also approach the phenomenon from a computer security perspective and demonstrate how someone being evaluated by a third party can exploit this ambiguity and greatly outperform a random predictor without even access to the input data. We also suggest a simple solution making the performance metrics, which rely on ranking, totally deterministic and impervious to such exploits.
|
|
|
Josep Llados, Enric Marti and Jordi Regincos. 1993. Interpretación de diseños a mano alzada como técnica de entrada a un sistema CAD en un ámbito de arquitectura. III National Conference on Computer Graphics. Granada.
Abstract: En los últimos años, se ha introducido ámpliamente el uso de los sistemas CAD en dominios relacionados con la arquitectura. Dichos sistemas CAD son muy útiles para el arquitecto en el diseño de planos de plantas de edificios. Sin embargo, la utilización eficiente de un CAD requiere un tiempo de aprendizaje, en especial, en la etapa de creación y edición del diseño. Además, una vez familiarizado con un CAD, el arquitecto debe adaptarse a la simbología que éste le permite que, en algunos casos puede ser poco flexible.Con esta motivación, se propone una técnica alternativa de entrada de documentos en sistemas CAD. Dicha técnica se basa en el diseño del plano sobre papel mediante un dibujo lineal hecho a mano alzada a modo de boceto e introducido mediante scanner. Una vez interpretado este dibujo inicial e introducido en el CAD, el arquitecto sólo deber hacer sobre éste los retoques finales del documento.El sistema de entrada propuesto se compone de dos módulos principales: En primer lugar, la extracción de características (puntos característicos, rectas y arcos) de la imagen obtenida mediante scanner. En dicho módulo se aplican principalmente técnicas de procesamiento de imágenes obteniendo como resultado una representaci¢n del dibujo de entrada basada en grafos de atributos. El objetivo del segundo módulo es el de encontrar y reconocer las entidades integrantes del documento (puertas, mesas, etc.) en base a una biblioteca de símbolos definida en el sistema CAD. La implementación de dicho módulo se basa en técnicas de isomorfismo de grafos.El sistema propone una alternativa que permita, mediante el diseño a mano alzada, la introducción de la informaci¢n m s significativa del plano de forma rápida, sencilla y estandarizada por parte del usuario.
|
|
|
Mohamed Ali Souibgui, Alicia Fornes, Y.Kessentini and C.Tudor. 2021. A Few-shot Learning Approach for Historical Encoded Manuscript Recognition. 25th International Conference on Pattern Recognition.5413–5420.
Abstract: Encoded (or ciphered) manuscripts are a special type of historical documents that contain encrypted text. The automatic recognition of this kind of documents is challenging because: 1) the cipher alphabet changes from one document to another, 2) there is a lack of annotated corpus for training and 3) touching symbols make the symbol segmentation difficult and complex. To overcome these difficulties, we propose a novel method for handwritten ciphers recognition based on few-shot object detection. Our method first detects all symbols of a given alphabet in a line image, and then a decoding step maps the symbol similarity scores to the final sequence of transcribed symbols. By training on synthetic data, we show that the proposed architecture is able to recognize handwritten ciphers with unseen alphabets. In addition, if few labeled pages with the same alphabet are used for fine tuning, our method surpasses existing unsupervised and supervised HTR methods for ciphers recognition.
|
|