Publicacions CVC -- Query Results

[181–190] << 191 192 193 194 195 196 197 198 199 200 >> [201–210]

Details

Records
Author	Gabriel Villalonga; Antonio Lopez
Title	Co-Training for On-Board Deep Object Detection			Type	Journal Article
Year	2020	Publication	IEEE Access	Abbreviated Journal	ACCESS
Volume		Issue		Pages	194441 - 194456
Keywords
Abstract	Providing ground truth supervision to train visual models has been a bottleneck over the years, exacerbated by domain shifts which degenerate the performance of such models. This was the case when visual tasks relied on handcrafted features and shallow machine learning and, despite its unprecedented performance gains, the problem remains open within the deep learning paradigm due to its data-hungry nature. Best performing deep vision-based object detectors are trained in a supervised manner by relying on human-labeled bounding boxes which localize class instances (i.e. objects) within the training images. Thus, object detection is one of such tasks for which human labeling is a major bottleneck. In this article, we assess co-training as a semi-supervised learning method for self-labeling objects in unlabeled images, so reducing the human-labeling effort for developing deep object detectors. Our study pays special attention to a scenario involving domain shift; in particular, when we have automatically generated virtual-world images with object bounding boxes and we have real-world images which are unlabeled. Moreover, we are particularly interested in using co-training for deep object detection in the context of driver assistance systems and/or self-driving vehicles. Thus, using well-established datasets and protocols for object detection in these application contexts, we will show how co-training is a paradigm worth to pursue for alleviating object labeling, working both alone and together with task-agnostic domain adaptation.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ADAS; 600.118			Approved	no
Call Number	Admin @ si @ ViL2020			Serial	3488
Permanent link to this record



Author	Hannes Mueller; Andre Groger; Jonathan Hersh; Andrea Matranga; Joan Serrat
Title	Monitoring War Destruction from Space: A Machine Learning Approach			Type	Miscellaneous
Year	2020	Publication	Arxiv	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Existing data on building destruction in conflict zones rely on eyewitness reports or manual detection, which makes it generally scarce, incomplete and potentially biased. This lack of reliable data imposes severe limitations for media reporting, humanitarian relief efforts, human rights monitoring, reconstruction initiatives, and academic studies of violent conflict. This article introduces an automated method of measuring destruction in high-resolution satellite images using deep learning techniques combined with data augmentation to expand training samples. We apply this method to the Syrian civil war and reconstruct the evolution of damage in major cities across the country. The approach allows generating destruction data with unprecedented scope, resolution, and frequency – only limited by the available satellite imagery – which can alleviate data limitations decisively.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ADAS; 600.118			Approved	no
Call Number	Admin @ si @ MGH2020			Serial	3489
Permanent link to this record



Author	Yi Xiao; Felipe Codevilla; Akhil Gurram; Onay Urfalioglu; Antonio Lopez
Title	Multimodal end-to-end autonomous driving			Type	Journal Article
Year	2020	Publication	IEEE Transactions on Intelligent Transportation Systems	Abbreviated Journal	TITS
Volume		Issue		Pages	1-11
Keywords
Abstract	A crucial component of an autonomous vehicle (AV) is the artificial intelligence (AI) is able to drive towards a desired destination. Today, there are different paradigms addressing the development of AI drivers. On the one hand, we find modular pipelines, which divide the driving task into sub-tasks such as perception and maneuver planning and control. On the other hand, we find end-to-end driving approaches that try to learn a direct mapping from input raw sensor data to vehicle control signals. The later are relatively less studied, but are gaining popularity since they are less demanding in terms of sensor data annotation. This paper focuses on end-to-end autonomous driving. So far, most proposals relying on this paradigm assume RGB images as input sensor data. However, AVs will not be equipped only with cameras, but also with active sensors providing accurate depth information (e.g., LiDARs). Accordingly, this paper analyses whether combining RGB and depth modalities, i.e. using RGBD data, produces better end-to-end AI drivers than relying on a single modality. We consider multimodality based on early, mid and late fusion schemes, both in multisensory and single-sensor (monocular depth estimation) settings. Using the CARLA simulator and conditional imitation learning (CIL), we show how, indeed, early fusion multimodality outperforms single-modality.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ADAS			Approved	no
Call Number	Admin @ si @ XCG2020			Serial	3490
Permanent link to this record



Author	Andres Mafla; Sounak Dey; Ali Furkan Biten; Lluis Gomez; Dimosthenis Karatzas
Title	Multi-modal reasoning graph for scene-text based fine-grained image classification and retrieval			Type	Conference Article
Year	2021	Publication	IEEE Winter Conference on Applications of Computer Vision	Abbreviated Journal
Volume		Issue		Pages	4022-4032
Keywords
Abstract
Address	Virtual; January 2021
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	WACV
Notes	DAG; 600.121			Approved	no
Call Number	Admin @ si @ MDB2021			Serial	3491
Permanent link to this record



Author	Andres Mafla; Rafael S. Rezende; Lluis Gomez; Diana Larlus; Dimosthenis Karatzas
Title	StacMR: Scene-Text Aware Cross-Modal Retrieval			Type	Conference Article
Year	2021	Publication	IEEE Winter Conference on Applications of Computer Vision	Abbreviated Journal
Volume		Issue		Pages	2219-2229
Keywords
Abstract
Address	Virtual; January 2021
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	WACV
Notes	DAG; 600.121			Approved	no
Call Number	Admin @ si @ MRG2021a			Serial	3492
Permanent link to this record



Author	Andres Mafla; Ruben Tito; Sounak Dey; Lluis Gomez; Marçal Rusiñol; Ernest Valveny; Dimosthenis Karatzas
Title	Real-time Lexicon-free Scene Text Retrieval			Type	Journal Article
Year	2021	Publication	Pattern Recognition	Abbreviated Journal	PR
Volume	110	Issue		Pages	107656
Keywords
Abstract	In this work, we address the task of scene text retrieval: given a text query, the system returns all images containing the queried text. The proposed model uses a single shot CNN architecture that predicts bounding boxes and builds a compact representation of spotted words. In this way, this problem can be modeled as a nearest neighbor search of the textual representation of a query over the outputs of the CNN collected from the totality of an image database. Our experiments demonstrate that the proposed model outperforms previous state-of-the-art, while offering a significant increase in processing speed and unmatched expressiveness with samples never seen at training time. Several experiments to assess the generalization capability of the model are conducted in a multilingual dataset, as well as an application of real-time text spotting in videos.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.121; 600.129; 601.338			Approved	no
Call Number	Admin @ si @ MTD2021			Serial	3493
Permanent link to this record



Author	Lluis Gomez; Anguelos Nicolaou; Marçal Rusiñol; Dimosthenis Karatzas
Title	12 years of ICDAR Robust Reading Competitions: The evolution of reading systems for unconstrained text understanding			Type	Book Chapter
Year	2020	Publication	Visual Text Interpretation – Algorithms and Applications in Scene Understanding and Document Analysis	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract
Address
Corporate Author				Thesis
Publisher	Springer	Place of Publication		Editor	K. Alahari; C.V. Jawahar
Language		Summary Language		Original Title
Series Editor		Series Title	Series on Advances in Computer Vision and Pattern Recognition	Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.121			Approved	no
Call Number	GNR2020			Serial	3494
Permanent link to this record



Author	Lluis Gomez; Dena Bazazian; Dimosthenis Karatzas
Title	Historical review of scene text detection research			Type	Book Chapter
Year	2020	Publication	Visual Text Interpretation – Algorithms and Applications in Scene Understanding and Document Analysis	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract
Address
Corporate Author				Thesis
Publisher	Springer	Place of Publication		Editor	K. Alahari; C.V. Jawahar
Language		Summary Language		Original Title
Series Editor		Series Title	Series on Advances in Computer Vision and Pattern Recognition	Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.121			Approved	no
Call Number	Admin @ si @ GBK2020			Serial	3495
Permanent link to this record



Author	Jon Almazan; Lluis Gomez; Suman Ghosh; Ernest Valveny; Dimosthenis Karatzas
Title	WATTS: A common representation of word images and strings using embedded attributes for text recognition and retrieval			Type	Book Chapter
Year	2020	Publication	Visual Text Interpretation – Algorithms and Applications in Scene Understanding and Document Analysis	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract
Address
Corporate Author				Thesis
Publisher	Springer	Place of Publication		Editor	Analysis”, K. Alahari; C.V. Jawahar
Language		Summary Language		Original Title
Series Editor		Series Title	Series on Advances in Computer Vision and Pattern Recognition	Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.121			Approved	no
Call Number	Admin @ si @ AGG2020			Serial	3496
Permanent link to this record



Author	Raul Gomez; Yahui Liu; Marco de Nadai; Dimosthenis Karatzas; Bruno Lepri; Nicu Sebe
Title	Retrieval Guided Unsupervised Multi-domain Image to Image Translation			Type	Conference Article
Year	2020	Publication	28th ACM International Conference on Multimedia	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Image to image translation aims to learn a mapping that transforms an image from one visual domain to another. Recent works assume that images descriptors can be disentangled into a domain-invariant content representation and a domain-specific style representation. Thus, translation models seek to preserve the content of source images while changing the style to a target visual domain. However, synthesizing new images is extremely challenging especially in multi-domain translations, as the network has to compose content and style to generate reliable and diverse images in multiple domains. In this paper we propose the use of an image retrieval system to assist the image-to-image translation task. First, we train an image-to-image translation model to map images to multiple domains. Then, we train an image retrieval model using real and generated images to find images similar to a query one in content but in a different domain. Finally, we exploit the image retrieval system to fine-tune the image-to-image translation model and generate higher quality images. Our experiments show the effectiveness of the proposed solution and highlight the contribution of the retrieval network, which can benefit from additional unlabeled data and help image-to-image translation models in the presence of scarce data.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ACM
Notes	DAG; 600.121			Approved	no
Call Number	Admin @ si @ GLN2020			Serial	3497
Permanent link to this record



Author	Minesh Mathew; Dimosthenis Karatzas; C.V. Jawahar
Title	DocVQA: A Dataset for VQA on Document Images			Type	Conference Article
Year	2021	Publication	IEEE Winter Conference on Applications of Computer Vision	Abbreviated Journal
Volume		Issue		Pages	2200-2209
Keywords
Abstract	We present a new dataset for Visual Question Answering (VQA) on document images called DocVQA. The dataset consists of 50,000 questions defined on 12,000+ document images. Detailed analysis of the dataset in comparison with similar datasets for VQA and reading comprehension is presented. We report several baseline results by adopting existing VQA and reading comprehension models. Although the existing models perform reasonably well on certain types of questions, there is large performance gap compared to human performance (94.36% accuracy). The models need to improve specifically on questions where understanding structure of the document is crucial. The dataset, code and leaderboard are available at docvqa. org
Address	Virtual; January 2021
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	WACV
Notes	DAG; 600.121			Approved	no
Call Number	Admin @ si @ MKJ2021			Serial	3498
Permanent link to this record



Author	Tomas Sixta; Julio C. S. Jacques Junior; Pau Buch Cardona; Eduard Vazquez; Sergio Escalera
Title	FairFace Challenge at ECCV 2020: Analyzing Bias in Face Recognition			Type	Conference Article
Year	2020	Publication	ECCV Workshops	Abbreviated Journal
Volume	12540	Issue		Pages	463-481
Keywords
Abstract	This work summarizes the 2020 ChaLearn Looking at People Fair Face Recognition and Analysis Challenge and provides a description of the top-winning solutions and analysis of the results. The aim of the challenge was to evaluate accuracy and bias in gender and skin colour of submitted algorithms on the task of 1:1 face verification in the presence of other confounding attributes. Participants were evaluated using an in-the-wild dataset based on reannotated IJB-C, further enriched 12.5K new images and additional labels. The dataset is not balanced, which simulates a real world scenario where AI-based models supposed to present fair outcomes are trained and evaluated on imbalanced data. The challenge attracted 151 participants, who made more 1.8K submissions in total. The final phase of the challenge attracted 36 active teams out of which 10 exceeded 0.999 AUC-ROC while achieving very low scores in the proposed bias metrics. Common strategies by the participants were face pre-processing, homogenization of data distributions, the use of bias aware loss functions and ensemble models. The analysis of top-10 teams shows higher false positive rates (and lower false negative rates) for females with dark skin tone as well as the potential of eyeglasses and young age to increase the false positive rates too.
Address	Virtual; August 2020
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ECCVW
Notes	HUPBA			Approved	no
Call Number	Admin @ si @ SJB2020			Serial	3499
Permanent link to this record



Author	Zhengying Liu; Zhen Xu; Shangeth Rajaa; Meysam Madadi; Julio C. S. Jacques Junior; Sergio Escalera; Adrien Pavao; Sebastien Treguer; Wei-Wei Tu; Isabelle Guyon
Title	Towards Automated Deep Learning: Analysis of the AutoDL challenge series 2019			Type	Conference Article
Year	2020	Publication	Proceedings of Machine Learning Research	Abbreviated Journal
Volume	123	Issue		Pages	242-252
Keywords
Abstract	We present the design and results of recent competitions in Automated Deep Learning (AutoDL). In the AutoDL challenge series 2019, we organized 5 machine learning challenges: AutoCV, AutoCV2, AutoNLP, AutoSpeech and AutoDL. The first 4 challenges concern each a specific application domain, such as computer vision, natural language processing and speech recognition. At the time of March 2020, the last challenge AutoDL is still on-going and we only present its design. Some highlights of this work include: (1) a benchmark suite of baseline AutoML solutions, with emphasis on domains for which Deep Learning methods have had prior success (image, video, text, speech, etc); (2) a novel any-time learning framework, which opens doors for further theoretical consideration; (3) a repository of around 100 datasets (from all above domains) over half of which are released as public datasets to enable research on meta-learning; (4) analyses revealing that winning solutions generalize to new unseen datasets, validating progress towards universal AutoML solution; (5) open-sourcing of the challenge platform, the starting kit, the dataset formatting toolkit, and all winning solutions (All information available at {autodl.chalearn.org}).
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	NEURIPS
Notes	HUPBA			Approved	no
Call Number	Admin @ si @ LXR2020			Serial	3500
Permanent link to this record



Author	Albert Clapes; Julio C. S. Jacques Junior; Carla Morral; Sergio Escalera
Title	ChaLearn LAP 2020 Challenge on Identity-preserved Human Detection: Dataset and Results			Type	Conference Article
Year	2020	Publication	15th IEEE International Conference on Automatic Face and Gesture Recognition	Abbreviated Journal
Volume		Issue		Pages	801-808
Keywords
Abstract	This paper summarizes the ChaLearn Looking at People 2020 Challenge on Identity-preserved Human Detection (IPHD). For the purpose, we released a large novel dataset containing more than 112K pairs of spatiotemporally aligned depth and thermal frames (and 175K instances of humans) sampled from 780 sequences. The sequences contain hundreds of non-identifiable people appearing in a mix of in-the-wild and scripted scenarios recorded in public and private places. The competition was divided into three tracks depending on the modalities exploited for the detection: (1) depth, (2) thermal, and (3) depth-thermal fusion. Color was also captured but only used to facilitate the groundtruth annotation. Still the temporal synchronization of three sensory devices is challenging, so bad temporal matches across modalities can occur. Hence, the labels provided should considered “weak”, although test frames were carefully selected to minimize this effect and ensure the fairest comparison of the participants’ results. Despite this added difficulty, the results got by the participants demonstrate current fully-supervised methods can deal with that and achieve outstanding detection performance when measured in terms of AP@0.50.
Address	Virtual; November 2020
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	FG
Notes	HUPBA			Approved	no
Call Number	Admin @ si @ CJM2020			Serial	3501
Permanent link to this record



Author	Zhengying Liu; Adrien Pavao; Zhen Xu; Sergio Escalera; Isabelle Guyon; Julio C. S. Jacques Junior; Meysam Madadi; Sebastien Treguer
Title	How far are we from true AutoML: reflection from winning solutions and results of AutoDL challenge			Type	Conference Article
Year	2020	Publication	7th ICML Workshop on Automated Machine Learning	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Following the completion of the AutoDL challenge (the final challenge in the ChaLearn AutoDL challenge series 2019), we investigate winning solutions and challenge results to answer an important motivational question: how far are we from achieving true AutoML? On one hand, the winning solutions achieve good (accurate and fast) classification performance on unseen datasets. On the other hand, all winning solutions still contain a considerable amount of hard-coded knowledge on the domain (or modality) such as image, video, text, speech and tabular. This form of ad-hoc meta-learning could be replaced by more automated forms of meta-learning in the future. Organizing a meta-learning challenge could help forging AutoML solutions that generalize to new unseen domains (e.g. new types of sensor data) as well as gaining insights on the AutoML problem from a more fundamental point of view. The datasets of the AutoDL challenge are a resource that can be used for further benchmarks and the code of the winners has been outsourced, which is a big step towards “democratizing” Deep Learning.
Address	Virtual; July 2020
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICML
Notes	HUPBA			Approved	no
Call Number	Admin @ si @ LPX2020			Serial	3502
Permanent link to this record