Publicacions CVC -- Query Results


Details

	Records
	Author	Andres Mafla; Rafael S. Rezende; Lluis Gomez; Diana Larlus; Dimosthenis Karatzas
	Title	StacMR: Scene-Text Aware Cross-Modal Retrieval			Type	Conference Article
	Year	2021	Publication	IEEE Winter Conference on Applications of Computer Vision	Abbreviated Journal
	Volume		Issue		Pages	2219-2229
	Keywords
	Abstract
	Address	Virtual; January 2021
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	WACV
	Notes	DAG; 600.121			Approved	no
	Call Number	Admin @ si @ MRG2021a			Serial	3492



	Author	Lluis Gomez; Anguelos Nicolaou; Marçal Rusiñol; Dimosthenis Karatzas
	Title	12 years of ICDAR Robust Reading Competitions: The evolution of reading systems for unconstrained text understanding			Type	Book Chapter
	Year	2020	Publication	Visual Text Interpretation – Algorithms and Applications in Scene Understanding and Document Analysis	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract
	Address
	Corporate Author				Thesis
	Publisher	Springer	Place of Publication		Editor	K. Alahari; C.V. Jawahar
	Language		Summary Language		Original Title
	Series Editor		Series Title	Series on Advances in Computer Vision and Pattern Recognition	Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.121			Approved	no
	Call Number	GNR2020			Serial	3494



	Author	Lluis Gomez; Dena Bazazian; Dimosthenis Karatzas
	Title	Historical review of scene text detection research			Type	Book Chapter
	Year	2020	Publication	Visual Text Interpretation – Algorithms and Applications in Scene Understanding and Document Analysis	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract
	Address
	Corporate Author				Thesis
	Publisher	Springer	Place of Publication		Editor	K. Alahari; C.V. Jawahar
	Language		Summary Language		Original Title
	Series Editor		Series Title	Series on Advances in Computer Vision and Pattern Recognition	Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.121			Approved	no
	Call Number	Admin @ si @ GBK2020			Serial	3495



	Author	Jon Almazan; Lluis Gomez; Suman Ghosh; Ernest Valveny; Dimosthenis Karatzas
	Title	WATTS: A common representation of word images and strings using embedded attributes for text recognition and retrieval			Type	Book Chapter
	Year	2020	Publication	Visual Text Interpretation – Algorithms and Applications in Scene Understanding and Document Analysis	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract
	Address
	Corporate Author				Thesis
	Publisher	Springer	Place of Publication		Editor	K. Alahari; C.V. Jawahar
	Language		Summary Language		Original Title
	Series Editor		Series Title	Series on Advances in Computer Vision and Pattern Recognition	Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.121			Approved	no
	Call Number	Admin @ si @ AGG2020			Serial	3496



	Author	Raul Gomez; Yahui Liu; Marco de Nadai; Dimosthenis Karatzas; Bruno Lepri; Nicu Sebe
	Title	Retrieval Guided Unsupervised Multi-domain Image to Image Translation			Type	Conference Article
	Year	2020	Publication	28th ACM International Conference on Multimedia	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Image to image translation aims to learn a mapping that transforms an image from one visual domain to another. Recent works assume that image descriptors can be disentangled into a domain-invariant content representation and a domain-specific style representation. Thus, translation models seek to preserve the content of source images while changing the style to a target visual domain. However, synthesizing new images is extremely challenging, especially in multi-domain translations, as the network has to compose content and style to generate reliable and diverse images in multiple domains. In this paper we propose the use of an image retrieval system to assist the image-to-image translation task. First, we train an image-to-image translation model to map images to multiple domains. Then, we train an image retrieval model using real and generated images to find images similar to a query one in content but in a different domain. Finally, we exploit the image retrieval system to fine-tune the image-to-image translation model and generate higher quality images. Our experiments show the effectiveness of the proposed solution and highlight the contribution of the retrieval network, which can benefit from additional unlabeled data and help image-to-image translation models in the presence of scarce data.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ACM
	Notes	DAG; 600.121			Approved	no
	Call Number	Admin @ si @ GLN2020			Serial	3497



	Author	Minesh Mathew; Dimosthenis Karatzas; C.V. Jawahar
	Title	DocVQA: A Dataset for VQA on Document Images			Type	Conference Article
	Year	2021	Publication	IEEE Winter Conference on Applications of Computer Vision	Abbreviated Journal
	Volume		Issue		Pages	2200-2209
	Keywords
	Abstract	We present a new dataset for Visual Question Answering (VQA) on document images called DocVQA. The dataset consists of 50,000 questions defined on 12,000+ document images. Detailed analysis of the dataset in comparison with similar datasets for VQA and reading comprehension is presented. We report several baseline results by adopting existing VQA and reading comprehension models. Although the existing models perform reasonably well on certain types of questions, there is a large performance gap compared to human performance (94.36% accuracy). The models need to improve specifically on questions where understanding the structure of the document is crucial. The dataset, code and leaderboard are available at docvqa.org.
	Address	Virtual; January 2021
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	WACV
	Notes	DAG; 600.121			Approved	no
	Call Number	Admin @ si @ MKJ2021			Serial	3498



	Author	Manuel Carbonell; Pau Riba; Mauricio Villegas; Alicia Fornes; Josep Llados
	Title	Named Entity Recognition and Relation Extraction with Graph Neural Networks in Semi Structured Documents			Type	Conference Article
	Year	2020	Publication	25th International Conference on Pattern Recognition	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	The use of administrative documents to communicate and leave a record of business information requires methods able to automatically extract and understand the content of such documents in a robust and efficient way. In addition, the semi-structured nature of these reports is especially suited to the use of graph-based representations, which are flexible enough to adapt to the deformations from the different document templates. Moreover, Graph Neural Networks provide the proper methodology to learn relations among the data elements in these documents. In this work we study the use of Graph Neural Network architectures to tackle the problem of entity recognition and relation extraction in semi-structured documents. Our approach achieves state of the art results in the three tasks involved in the process. Additionally, the experimentation with two datasets of different nature demonstrates the good generalization ability of our approach.
	Address	Virtual; January 2021
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICPR
	Notes	DAG; 600.121			Approved	no
	Call Number	Admin @ si @ CRV2020			Serial	3509



	Author	Pau Riba; Andreas Fischer; Josep Llados; Alicia Fornes
	Title	Learning Graph Edit Distance by Graph Neural Networks			Type	Miscellaneous
	Year	2020	Publication	arXiv	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	The emergence of geometric deep learning as a novel framework to deal with graph-based representations has faded away traditional approaches in favor of completely new methodologies. In this paper, we propose a new framework able to combine the advances on deep metric learning with traditional approximations of the graph edit distance. Hence, we propose an efficient graph distance based on the novel field of geometric deep learning. Our method employs a message passing neural network to capture the graph structure, and thus, leveraging this information for its use on a distance computation. The performance of the proposed graph distance is validated on two different scenarios. On the one hand, in a graph retrieval of handwritten words, i.e., keyword spotting, showing its superior performance when compared with (approximate) graph edit distance benchmarks. On the other hand, demonstrating competitive results for graph similarity learning when compared with the current state-of-the-art on a recent benchmark dataset.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.121; 600.140; 601.302			Approved	no
	Call Number	Admin @ si @ RFL2020			Serial	3555



	Author	Klara Janouskova; Jiri Matas; Lluis Gomez; Dimosthenis Karatzas
	Title	Text Recognition – Real World Data and Where to Find Them			Type	Conference Article
	Year	2020	Publication	25th International Conference on Pattern Recognition	Abbreviated Journal
	Volume		Issue		Pages	4489-4496
	Keywords
	Abstract	We present a method for exploiting weakly annotated images to improve text extraction pipelines. The approach uses an arbitrary end-to-end text recognition system to obtain text region proposals and their, possibly erroneous, transcriptions. The method includes matching of imprecise transcriptions to weak annotations and an edit distance guided neighbourhood search. It produces nearly error-free, localised instances of scene text, which we treat as “pseudo ground truth” (PGT). The method is applied to two weakly-annotated datasets. Training with the extracted PGT consistently improves the accuracy of a state of the art recognition model, by 3.7% on average, across different benchmark datasets (image domains) and 24.5% on one of the weakly annotated datasets. Acknowledgements: The authors were supported by Czech Technical University student grant SGS20/171/0HK3/3TJ13, the MEYS VVV project CZ.02.1.01/0.010.0J16 019/0000765 Research Center for Informatics, the Spanish Research project TIN2017-89779-P and the CERCA Programme / Generalitat de Catalunya.
	Address	Virtual; January 2021
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICPR
	Notes	DAG; 600.121; 600.129			Approved	no
	Call Number	Admin @ si @ JMG2020			Serial	3557



	Author	Minesh Mathew; Ruben Tito; Dimosthenis Karatzas; R. Manmatha; C.V. Jawahar
	Title	Document Visual Question Answering Challenge 2020			Type	Conference Article
	Year	2020	Publication	33rd IEEE Conference on Computer Vision and Pattern Recognition – Short paper	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	This paper presents the results of the Document Visual Question Answering Challenge, organized as part of the “Text and Documents in the Deep Learning Era” workshop at CVPR 2020. The challenge introduces a new problem: Visual Question Answering on document images. The challenge comprised two tasks. The first task concerns asking questions on a single document image. The second task, on the other hand, is set as a retrieval task where the question is posed over a collection of images. For task 1, a new dataset is introduced comprising 50,000 question-answer(s) pairs defined over 12,767 document images. For task 2, another dataset has been created comprising 20 questions over 14,362 document images which share the same document template.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	CVPR
	Notes	DAG; 600.121			Approved	no
	Call Number	Admin @ si @ MTK2020			Serial	3558
