|
Thanh Ha Do, Salvatore Tabbone and Oriol Ramos Terrades. 2016. Spotting Symbol over Graphical Documents Via Sparsity in Visual Vocabulary. Recent Trends in Image Processing and Pattern Recognition.
|
|
|
Lluis Pere de las Heras, Oriol Ramos Terrades and Josep Llados. 2017. Ontology-Based Understanding of Architectural Drawings. International Workshop on Graphics Recognition. GREC 2015. Graphic Recognition. Current Trends and Challenges. 75–85. (LNCS.)
Abstract: In this paper we present a knowledge base of architectural documents aimed at improving existing methods of floor plan classification and understanding. It consists of an ontological definition of the domain and the inclusion of real instances coming from both automatically interpreted and manually labeled documents. The knowledge base has proven to be an effective tool to structure our knowledge and to easily maintain and upgrade it. Moreover, it is an appropriate means to automatically check the consistency of relational data and a convenient complement to hard-coded knowledge interpretation systems.
Keywords: Graphics recognition; Floor plan analysis; Domain ontology
|
|
|
Arnau Baro, Pau Riba, Jorge Calvo-Zaragoza and Alicia Fornes. 2018. Optical Music Recognition by Long Short-Term Memory Networks. In A. Fornes, B.L., ed. Graphics Recognition. Current Trends and Evolutions. Springer, 81–95. (LNCS.)
Abstract: Optical Music Recognition refers to the task of transcribing the image of a music score into a machine-readable format. Many music scores are written in a single staff, and therefore, they could be treated as a sequence. Therefore, this work explores the use of Long Short-Term Memory (LSTM) Recurrent Neural Networks for reading the music score sequentially, where the LSTM helps in keeping the context. For training, we have used a synthetic dataset of more than 40000 images, labeled at primitive level. The experimental results are promising, showing the benefits of our approach.
Keywords: Optical Music Recognition; Recurrent Neural Network; Long Short-Term Memory
|
|
|
Raul Gomez, Lluis Gomez, Jaume Gibert and Dimosthenis Karatzas. 2019. Self-Supervised Learning from Web Data for Multimodal Retrieval. Multi-Modal Scene Understanding Book. 279–306.
Abstract: Self-Supervised learning from multimodal image and text data allows deep neural networks to learn powerful features with no need of human annotated data. Web and Social Media platforms provide a virtually unlimited amount of this multimodal data. In this work we propose to exploit this free available data to learn a multimodal image and text embedding, aiming to leverage the semantic knowledge learnt in the text domain and transfer it to a visual model for semantic image retrieval. We demonstrate that the proposed pipeline can learn from images with associated text without supervision and analyze the semantic structure of the learnt joint image and text embedding space. We perform a thorough analysis and performance comparison of five different state of the art text embeddings in three different benchmarks. We show that the embeddings learnt with Web and Social Media data have competitive performances over supervised methods in the text based image retrieval task, and we clearly outperform state of the art in the MIRFlickr dataset when training in the target data. Further, we demonstrate how semantic multimodal image retrieval can be performed using the learnt embeddings, going beyond classical instance-level retrieval problems. Finally, we present a new dataset, InstaCities1M, composed by Instagram images and their associated texts that can be used for fair comparison of image-text embeddings.
Keywords: self-supervised learning; webly supervised learning; text embeddings; multimodal retrieval; multimodal embedding
|
|
|
Alicia Fornes, Josep Llados and Joana Maria Pujadas-Mora. 2020. Browsing of the Social Network of the Past: Information Extraction from Population Manuscript Images. Handwritten Historical Document Analysis, Recognition, and Retrieval – State of the Art and Future Trends. World Scientific.
|
|
|
Joana Maria Pujadas-Mora, Alicia Fornes, Josep Llados, Gabriel Brea-Martinez and Miquel Valls-Figols. 2019. The Baix Llobregat (BALL) Demographic Database, between Historical Demography and Computer Vision (nineteenth–twentieth centuries). Nominative Data in Demographic Research in the East and the West: monograph. 29–61.
Abstract: The Baix Llobregat (BALL) Demographic Database is an ongoing database project containing individual census data from the Catalan region of Baix Llobregat (Spain) during the nineteenth and twentieth centuries. The BALL Database is built within the project ‘NETWORKS: Technology and citizen innovation for building historical social networks to understand the demographic past’ directed by Alícia Fornés from the Center for Computer Vision and Joana Maria Pujadas-Mora from the Center for Demographic Studies, both at the Universitat Autònoma de Barcelona, funded by the Recercaixa program (2017–2019).
Its webpage is http://dag.cvc.uab.es/xarxes/. The aim of the project is to develop technologies facilitating massive digitalization of demographic sources, and more specifically the padrones (local censuses), in order to reconstruct historical ‘social’ networks employing computer vision technology. Such virtual networks can be created thanks to the linkage of nominative records compiled in the local censuses across time and space. Thus, digitized versions of individual and family lifespans are established, and individuals and families can be located spatially.
|
|
|
Lluis Gomez, Anguelos Nicolaou, Marçal Rusiñol and Dimosthenis Karatzas. 2020. 12 years of ICDAR Robust Reading Competitions: The evolution of reading systems for unconstrained text understanding. In K. Alahari and C.V. Jawahar, eds. Visual Text Interpretation – Algorithms and Applications in Scene Understanding and Document Analysis. Springer. (Series on Advances in Computer Vision and Pattern Recognition.)
|
|
|
Lluis Gomez, Dena Bazazian and Dimosthenis Karatzas. 2020. Historical review of scene text detection research. In K. Alahari and C.V. Jawahar, eds. Visual Text Interpretation – Algorithms and Applications in Scene Understanding and Document Analysis. Springer. (Series on Advances in Computer Vision and Pattern Recognition.)
|
|
|
Jon Almazan, Lluis Gomez, Suman Ghosh, Ernest Valveny and Dimosthenis Karatzas. 2020. WATTS: A common representation of word images and strings using embedded attributes for text recognition and retrieval. In K. Alahari and C.V. Jawahar, eds. Visual Text Interpretation – Algorithms and Applications in Scene Understanding and Document Analysis. Springer. (Series on Advances in Computer Vision and Pattern Recognition.)
|
|
|
Debora Gil, Oriol Ramos Terrades and Raquel Perez. 2021. Topological Radiomics (TOPiomics): Early Detection of Genetic Abnormalities in Cancer Treatment Evolution. Extended Abstracts GEOMVAP 2019, Trends in Mathematics 15. Springer Nature, 89–93.
Abstract: Abnormalities in radiomic measures correlate to genomic alterations prone to alter the outcome of personalized anti-cancer treatments. TOPiomics is a new method for the early detection of variations in tumor imaging phenotype from a topological structure in multi-view radiomic spaces.
|
|