|
Josep Llados, & Enric Marti. (1995). Interpretacio de dibuixos lineals mitjançant tècniques d'isomorfisme entre grafs. In Trobada de Joves Investigadors.
Abstract: Document analysis aims at the automatic interpretation of documents printed on paper, in order to obtain a symbolic description of them that allows their storage and subsequent computational processing. Techniques based on attributed relational graphs make it possible to represent compactly the information contained in line drawings and, by means of graph isomorphism mechanisms, to recognize certain structures in them and thereby interpret the document. This work gives an overview of graph techniques applied to visual object recognition in document analysis problems. These techniques are illustrated with an example of recognizing freehand-drawn plans. Finally, the use of Hough techniques is proposed as a mechanism to speed up the recognition process by applying some knowledge about the working domain.
|
|
|
Josep Llados, & Dorothea Blostein. (2007). Special Issue on Graphics Recognition. IJDAR - International Journal on Document Analysis and Recognition, 1–2.
|
|
|
Josep Llados, Dimosthenis Karatzas, Joan Mas, & Gemma Sanchez. (2008). A Generic Architecture for the Conversion of Document Collections into Semantically Annotated Digital Archives. Journal of Universal Computer Science, 2912–2935.
Keywords: Median Graph, Graph Embedding, Graph Matching, Structural Pattern Recognition
|
|
|
Josep Llados, Daniel Lopresti, & Seiichi Uchida (Eds.). (2021). 16th International Conference, 2021, Proceedings, Part III (Vol. 12823). LNCS. Springer Cham.
Abstract: This four-volume set of LNCS 12821, LNCS 12822, LNCS 12823 and LNCS 12824, constitutes the refereed proceedings of the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, held in Lausanne, Switzerland in September 2021. The 182 full papers were carefully reviewed and selected from 340 submissions, and are presented with 13 competition reports.
The papers are organized into the following topical sections: document analysis for literature search, document summarization and translation, multimedia document analysis, mobile text recognition, document analysis for social good, indexing and retrieval of documents, physical and logical layout analysis, recognition of tables and formulas, and natural language processing (NLP) for document understanding.
|
|
|
Josep Llados, Daniel Lopresti, & Seiichi Uchida (Eds.). (2021). 16th International Conference, 2021, Proceedings, Part IV (Vol. 12824). LNCS. Springer Cham.
Abstract: This four-volume set of LNCS 12821, LNCS 12822, LNCS 12823 and LNCS 12824, constitutes the refereed proceedings of the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, held in Lausanne, Switzerland in September 2021. The 182 full papers were carefully reviewed and selected from 340 submissions, and are presented with 13 competition reports.
The papers are organized into the following topical sections: document analysis for literature search, document summarization and translation, multimedia document analysis, mobile text recognition, document analysis for social good, indexing and retrieval of documents, physical and logical layout analysis, recognition of tables and formulas, and natural language processing (NLP) for document understanding.
|
|
|
Josep Llados, Daniel Lopresti, & Seiichi Uchida (Eds.). (2021). 16th International Conference, 2021, Proceedings, Part I (Vol. 12821). LNCS. Springer Cham.
Abstract: This four-volume set of LNCS 12821, LNCS 12822, LNCS 12823 and LNCS 12824, constitutes the refereed proceedings of the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, held in Lausanne, Switzerland in September 2021. The 182 full papers were carefully reviewed and selected from 340 submissions, and are presented with 13 competition reports.
The papers are organized into the following topical sections: historical document analysis, document analysis systems, handwriting recognition, scene text detection and recognition, document image processing, natural language processing (NLP) for document understanding, and graphics, diagram and math recognition.
|
|
|
Josep Llados, Daniel Lopresti, & Seiichi Uchida (Eds.). (2021). 16th International Conference, 2021, Proceedings, Part II (Vol. 12822). LNCS. Springer Cham.
Abstract: This four-volume set of LNCS 12821, LNCS 12822, LNCS 12823 and LNCS 12824, constitutes the refereed proceedings of the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, held in Lausanne, Switzerland in September 2021. The 182 full papers were carefully reviewed and selected from 340 submissions, and are presented with 13 competition reports.
The papers are organized into the following topical sections: document analysis for literature search, document summarization and translation, multimedia document analysis, mobile text recognition, document analysis for social good, indexing and retrieval of documents, physical and logical layout analysis, recognition of tables and formulas, and natural language processing (NLP) for document understanding.
|
|
|
Josep Llados. (1996). Interpretacio de dibuixos linials fets a ma alçada mitjançant isomorfisme entre subgrafs i transformacio de Hough.
|
|
|
Josep Llados. (2006). Perspectives on the Analysis of Graphical Documents.
|
|
|
Josep Llados. (2006). Computer Vision: Progress of Research and Development (J. Llados, Ed.).
|
|
|
Josep Llados. (2007). Advances in Graphics Recognition. In Digital Document Processing, Major Directions and Recent Advances, Advances in Pattern Recognition, B.B. Chaudhuri, ed., 281–304.
|
|
|
Josep Llados. (2021). The 5G of Document Intelligence. In 3rd Workshop on Future of Document Analysis and Recognition.
|
|
|
Josep Famadas, Meysam Madadi, Cristina Palmero, & Sergio Escalera. (2020). Generative Video Face Reenactment by AUs and Gaze Regularization. In 15th IEEE International Conference on Automatic Face and Gesture Recognition (pp. 444–451).
Abstract: In this work, we propose an encoder-decoder-like architecture to perform face reenactment in image sequences. Our goal is to transfer the training subject identity to a given test subject. We regularize face reenactment by facial action unit intensity and 3D gaze vector regression. This way, we enforce the network to transfer subtle facial expressions and eye dynamics, providing a more lifelike result. The proposed encoder-decoder receives as input the previous sequence frame stacked to the current frame image of facial landmarks. Thus, the generated frames benefit from appearance and geometry, while keeping temporal coherence for the generated sequence. At test stage, a new target subject with the facial performance of the source subject and the appearance of the training subject is reenacted. Principal component analysis is applied to project the test subject geometry to the closest training subject geometry before reenactment. Evaluation of our proposal shows faster convergence, and more accurate and realistic results in comparison to other architectures without action units and gaze regularization.
|
|
|
Josep Brugues Pujolras, Lluis Gomez, & Dimosthenis Karatzas. (2022). A Multilingual Approach to Scene Text Visual Question Answering. In Document Analysis Systems. 15th IAPR International Workshop (DAS2022) (pp. 65–79).
Abstract: Scene Text Visual Question Answering (ST-VQA) has recently emerged as a hot research topic in Computer Vision. Current ST-VQA models have great potential for many types of applications but do not perform well on more than one language at a time, owing to the lack of multilingual data and to the use of monolingual word embeddings for training. In this work, we explore the possibility of obtaining bilingual and multilingual VQA models. To that end, we take an already established VQA model that uses monolingual word embeddings as part of its pipeline and replace them with FastText and BPEmb multilingual word embeddings that have been aligned to English. Our experiments demonstrate that it is possible to obtain bilingual and multilingual VQA models with a minimal loss in performance in languages not used during training, as well as a multilingual model trained in multiple languages that matches the performance of the respective monolingual baselines.
Keywords: Scene text; Visual question answering; Multilingual word embeddings; Vision and language; Deep learning
|
|
|
Josefina Mauri, Eduard Fernandez-Nofrerias, B. Garcia del Blanco, E. Iraculis, J.A. Gomez-Hospital, J. Comin, et al. (2000). Moviment del vas en l'anàlisi d'imatges d'ecografia intracoronària: un model matemàtic. In Congrés de la Societat Catalana de Cardiologia.
|
|