Records | |||||
---|---|---|---|---|---|
Author | Gabriel Villalonga; Antonio Lopez | ||||
Title | Co-Training for On-Board Deep Object Detection | Type | Journal Article | ||
Year | 2020 | Publication | IEEE Access | Abbreviated Journal | ACCESS |
Volume | Issue | Pages | 194441 - 194456 | ||
Keywords | |||||
Abstract | Providing ground truth supervision to train visual models has been a bottleneck over the years, exacerbated by domain shifts which degrade the performance of such models. This was the case when visual tasks relied on handcrafted features and shallow machine learning and, despite its unprecedented performance gains, the problem remains open within the deep learning paradigm due to its data-hungry nature. The best-performing deep vision-based object detectors are trained in a supervised manner by relying on human-labeled bounding boxes which localize class instances (i.e. objects) within the training images. Thus, object detection is one such task for which human labeling is a major bottleneck. In this article, we assess co-training as a semi-supervised learning method for self-labeling objects in unlabeled images, thus reducing the human-labeling effort for developing deep object detectors. Our study pays special attention to a scenario involving domain shift; in particular, when we have automatically generated virtual-world images with object bounding boxes and we have real-world images which are unlabeled. Moreover, we are particularly interested in using co-training for deep object detection in the context of driver assistance systems and/or self-driving vehicles. Thus, using well-established datasets and protocols for object detection in these application contexts, we will show how co-training is a paradigm worth pursuing for alleviating object labeling, working both alone and together with task-agnostic domain adaptation. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ADAS; 600.118 | Approved | no | ||
Call Number | Admin @ si @ ViL2020 | Serial | 3488 | ||
Permanent link to this record | |||||
Author | Hannes Mueller; Andre Groger; Jonathan Hersh; Andrea Matranga; Joan Serrat | ||||
Title | Monitoring War Destruction from Space: A Machine Learning Approach | Type | Miscellaneous | ||
Year | 2020 | Publication | Arxiv | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Existing data on building destruction in conflict zones relies on eyewitness reports or manual detection, which makes it generally scarce, incomplete and potentially biased. This lack of reliable data imposes severe limitations for media reporting, humanitarian relief efforts, human rights monitoring, reconstruction initiatives, and academic studies of violent conflict. This article introduces an automated method of measuring destruction in high-resolution satellite images using deep learning techniques combined with data augmentation to expand training samples. We apply this method to the Syrian civil war and reconstruct the evolution of damage in major cities across the country. The approach allows generating destruction data with unprecedented scope, resolution, and frequency – only limited by the available satellite imagery – which can alleviate data limitations decisively. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ADAS; 600.118 | Approved | no | ||
Call Number | Admin @ si @ MGH2020 | Serial | 3489 | ||
Permanent link to this record | |||||
Author | Yi Xiao; Felipe Codevilla; Akhil Gurram; Onay Urfalioglu; Antonio Lopez | ||||
Title | Multimodal end-to-end autonomous driving | Type | Journal Article | ||
Year | 2020 | Publication | IEEE Transactions on Intelligent Transportation Systems | Abbreviated Journal | TITS |
Volume | Issue | Pages | 1-11 | ||
Keywords | |||||
Abstract | A crucial component of an autonomous vehicle (AV) is the artificial intelligence (AI) that is able to drive towards a desired destination. Today, there are different paradigms addressing the development of AI drivers. On the one hand, we find modular pipelines, which divide the driving task into sub-tasks such as perception and maneuver planning and control. On the other hand, we find end-to-end driving approaches that try to learn a direct mapping from input raw sensor data to vehicle control signals. The latter are relatively less studied, but are gaining popularity since they are less demanding in terms of sensor data annotation. This paper focuses on end-to-end autonomous driving. So far, most proposals relying on this paradigm assume RGB images as input sensor data. However, AVs will not be equipped only with cameras, but also with active sensors providing accurate depth information (e.g., LiDARs). Accordingly, this paper analyses whether combining RGB and depth modalities, i.e. using RGBD data, produces better end-to-end AI drivers than relying on a single modality. We consider multimodality based on early, mid and late fusion schemes, both in multisensory and single-sensor (monocular depth estimation) settings. Using the CARLA simulator and conditional imitation learning (CIL), we show how, indeed, early fusion multimodality outperforms single-modality. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ADAS | Approved | no | ||
Call Number | Admin @ si @ XCG2020 | Serial | 3490 | ||
Permanent link to this record | |||||
Author | Andres Mafla; Sounak Dey; Ali Furkan Biten; Lluis Gomez; Dimosthenis Karatzas | ||||
Title | Multi-modal reasoning graph for scene-text based fine-grained image classification and retrieval | Type | Conference Article | ||
Year | 2021 | Publication | IEEE Winter Conference on Applications of Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | 4022-4032 | ||
Keywords | |||||
Abstract | |||||
Address | Virtual; January 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | WACV | ||
Notes | DAG; 600.121 | Approved | no | ||
Call Number | Admin @ si @ MDB2021 | Serial | 3491 | ||
Permanent link to this record | |||||
Author | Andres Mafla; Rafael S. Rezende; Lluis Gomez; Diana Larlus; Dimosthenis Karatzas | ||||
Title | StacMR: Scene-Text Aware Cross-Modal Retrieval | Type | Conference Article | ||
Year | 2021 | Publication | IEEE Winter Conference on Applications of Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | 2219-2229 | ||
Keywords | |||||
Abstract | |||||
Address | Virtual; January 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | WACV | ||
Notes | DAG; 600.121 | Approved | no | ||
Call Number | Admin @ si @ MRG2021a | Serial | 3492 | ||
Permanent link to this record | |||||
Author | Andres Mafla; Ruben Tito; Sounak Dey; Lluis Gomez; Marçal Rusiñol; Ernest Valveny; Dimosthenis Karatzas | ||||
Title | Real-time Lexicon-free Scene Text Retrieval | Type | Journal Article | ||
Year | 2021 | Publication | Pattern Recognition | Abbreviated Journal | PR |
Volume | 110 | Issue | Pages | 107656 | |
Keywords | |||||
Abstract | In this work, we address the task of scene text retrieval: given a text query, the system returns all images containing the queried text. The proposed model uses a single shot CNN architecture that predicts bounding boxes and builds a compact representation of spotted words. In this way, this problem can be modeled as a nearest neighbor search of the textual representation of a query over the outputs of the CNN collected from the totality of an image database. Our experiments demonstrate that the proposed model outperforms previous state-of-the-art, while offering a significant increase in processing speed and unmatched expressiveness with samples never seen at training time. Several experiments to assess the generalization capability of the model are conducted in a multilingual dataset, as well as an application of real-time text spotting in videos. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG; 600.121; 600.129; 601.338 | Approved | no | ||
Call Number | Admin @ si @ MTD2021 | Serial | 3493 | ||
Permanent link to this record | |||||
Author | Lluis Gomez; Anguelos Nicolaou; Marçal Rusiñol; Dimosthenis Karatzas | ||||
Title | 12 years of ICDAR Robust Reading Competitions: The evolution of reading systems for unconstrained text understanding | Type | Book Chapter | ||
Year | 2020 | Publication | Visual Text Interpretation – Algorithms and Applications in Scene Understanding and Document Analysis | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer | Place of Publication | Editor | K. Alahari; C.V. Jawahar | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Series on Advances in Computer Vision and Pattern Recognition | Abbreviated Series Title | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG; 600.121 | Approved | no | ||
Call Number | Admin @ si @ GNR2020 | Serial | 3494 | ||
Permanent link to this record | |||||
Author | Lluis Gomez; Dena Bazazian; Dimosthenis Karatzas | ||||
Title | Historical review of scene text detection research | Type | Book Chapter | ||
Year | 2020 | Publication | Visual Text Interpretation – Algorithms and Applications in Scene Understanding and Document Analysis | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer | Place of Publication | Editor | K. Alahari; C.V. Jawahar | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Series on Advances in Computer Vision and Pattern Recognition | Abbreviated Series Title | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG; 600.121 | Approved | no | ||
Call Number | Admin @ si @ GBK2020 | Serial | 3495 | ||
Permanent link to this record | |||||
Author | Jon Almazan; Lluis Gomez; Suman Ghosh; Ernest Valveny; Dimosthenis Karatzas | ||||
Title | WATTS: A common representation of word images and strings using embedded attributes for text recognition and retrieval | Type | Book Chapter | ||
Year | 2020 | Publication | Visual Text Interpretation – Algorithms and Applications in Scene Understanding and Document Analysis | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer | Place of Publication | Editor | K. Alahari; C.V. Jawahar | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Series on Advances in Computer Vision and Pattern Recognition | Abbreviated Series Title | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG; 600.121 | Approved | no | ||
Call Number | Admin @ si @ AGG2020 | Serial | 3496 | ||
Permanent link to this record | |||||
Author | Raul Gomez; Yahui Liu; Marco de Nadai; Dimosthenis Karatzas; Bruno Lepri; Nicu Sebe | ||||
Title | Retrieval Guided Unsupervised Multi-domain Image to Image Translation | Type | Conference Article | ||
Year | 2020 | Publication | 28th ACM International Conference on Multimedia | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Image to image translation aims to learn a mapping that transforms an image from one visual domain to another. Recent works assume that image descriptors can be disentangled into a domain-invariant content representation and a domain-specific style representation. Thus, translation models seek to preserve the content of source images while changing the style to a target visual domain. However, synthesizing new images is extremely challenging, especially in multi-domain translations, as the network has to compose content and style to generate reliable and diverse images in multiple domains. In this paper we propose the use of an image retrieval system to assist the image-to-image translation task. First, we train an image-to-image translation model to map images to multiple domains. Then, we train an image retrieval model using real and generated images to find images similar to a query one in content but in a different domain. Finally, we exploit the image retrieval system to fine-tune the image-to-image translation model and generate higher quality images. Our experiments show the effectiveness of the proposed solution and highlight the contribution of the retrieval network, which can benefit from additional unlabeled data and help image-to-image translation models in the presence of scarce data. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ACM | ||
Notes | DAG; 600.121 | Approved | no | ||
Call Number | Admin @ si @ GLN2020 | Serial | 3497 | ||
Permanent link to this record | |||||
Author | Minesh Mathew; Dimosthenis Karatzas; C.V. Jawahar | ||||
Title | DocVQA: A Dataset for VQA on Document Images | Type | Conference Article | ||
Year | 2021 | Publication | IEEE Winter Conference on Applications of Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | 2200-2209 | ||
Keywords | |||||
Abstract | We present a new dataset for Visual Question Answering (VQA) on document images called DocVQA. The dataset consists of 50,000 questions defined on 12,000+ document images. Detailed analysis of the dataset in comparison with similar datasets for VQA and reading comprehension is presented. We report several baseline results by adopting existing VQA and reading comprehension models. Although the existing models perform reasonably well on certain types of questions, there is a large performance gap compared to human performance (94.36% accuracy). The models need to improve specifically on questions where understanding the structure of the document is crucial. The dataset, code and leaderboard are available at docvqa.org | ||||
Address | Virtual; January 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | WACV | ||
Notes | DAG; 600.121 | Approved | no | ||
Call Number | Admin @ si @ MKJ2021 | Serial | 3498 | ||
Permanent link to this record | |||||
Author | Tomas Sixta; Julio C. S. Jacques Junior; Pau Buch Cardona; Eduard Vazquez; Sergio Escalera | ||||
Title | FairFace Challenge at ECCV 2020: Analyzing Bias in Face Recognition | Type | Conference Article | ||
Year | 2020 | Publication | ECCV Workshops | Abbreviated Journal | |
Volume | 12540 | Issue | Pages | 463-481 | |
Keywords | |||||
Abstract | This work summarizes the 2020 ChaLearn Looking at People Fair Face Recognition and Analysis Challenge and provides a description of the top-winning solutions and analysis of the results. The aim of the challenge was to evaluate accuracy and bias in gender and skin colour of submitted algorithms on the task of 1:1 face verification in the presence of other confounding attributes. Participants were evaluated using an in-the-wild dataset based on reannotated IJB-C, further enriched with 12.5K new images and additional labels. The dataset is not balanced, which simulates a real world scenario where AI-based models supposed to present fair outcomes are trained and evaluated on imbalanced data. The challenge attracted 151 participants, who made more than 1.8K submissions in total. The final phase of the challenge attracted 36 active teams, out of which 10 exceeded 0.999 AUC-ROC while achieving very low scores in the proposed bias metrics. Common strategies by the participants were face pre-processing, homogenization of data distributions, the use of bias aware loss functions and ensemble models. The analysis of top-10 teams shows higher false positive rates (and lower false negative rates) for females with dark skin tone as well as the potential of eyeglasses and young age to increase the false positive rates too. | ||||
Address | Virtual; August 2020 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ECCVW | ||
Notes | HUPBA | Approved | no | ||
Call Number | Admin @ si @ SJB2020 | Serial | 3499 | ||
Permanent link to this record | |||||
Author | Zhengying Liu; Zhen Xu; Shangeth Rajaa; Meysam Madadi; Julio C. S. Jacques Junior; Sergio Escalera; Adrien Pavao; Sebastien Treguer; Wei-Wei Tu; Isabelle Guyon | ||||
Title | Towards Automated Deep Learning: Analysis of the AutoDL challenge series 2019 | Type | Conference Article | ||
Year | 2020 | Publication | Proceedings of Machine Learning Research | Abbreviated Journal | |
Volume | 123 | Issue | Pages | 242-252 | |
Keywords | |||||
Abstract | We present the design and results of recent competitions in Automated Deep Learning (AutoDL). In the AutoDL challenge series 2019, we organized 5 machine learning challenges: AutoCV, AutoCV2, AutoNLP, AutoSpeech and AutoDL. The first 4 challenges each concern a specific application domain, such as computer vision, natural language processing and speech recognition. As of March 2020, the last challenge, AutoDL, is still on-going and we only present its design. Some highlights of this work include: (1) a benchmark suite of baseline AutoML solutions, with emphasis on domains for which Deep Learning methods have had prior success (image, video, text, speech, etc); (2) a novel any-time learning framework, which opens doors for further theoretical consideration; (3) a repository of around 100 datasets (from all above domains) over half of which are released as public datasets to enable research on meta-learning; (4) analyses revealing that winning solutions generalize to new unseen datasets, validating progress towards a universal AutoML solution; (5) open-sourcing of the challenge platform, the starting kit, the dataset formatting toolkit, and all winning solutions (all information available at autodl.chalearn.org). | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | NEURIPS | ||
Notes | HUPBA | Approved | no | ||
Call Number | Admin @ si @ LXR2020 | Serial | 3500 | ||
Permanent link to this record | |||||
Author | Albert Clapes; Julio C. S. Jacques Junior; Carla Morral; Sergio Escalera | ||||
Title | ChaLearn LAP 2020 Challenge on Identity-preserved Human Detection: Dataset and Results | Type | Conference Article | ||
Year | 2020 | Publication | 15th IEEE International Conference on Automatic Face and Gesture Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 801-808 | ||
Keywords | |||||
Abstract | This paper summarizes the ChaLearn Looking at People 2020 Challenge on Identity-preserved Human Detection (IPHD). For this purpose, we released a large novel dataset containing more than 112K pairs of spatiotemporally aligned depth and thermal frames (and 175K instances of humans) sampled from 780 sequences. The sequences contain hundreds of non-identifiable people appearing in a mix of in-the-wild and scripted scenarios recorded in public and private places. The competition was divided into three tracks depending on the modalities exploited for the detection: (1) depth, (2) thermal, and (3) depth-thermal fusion. Color was also captured but only used to facilitate the groundtruth annotation. Still, the temporal synchronization of three sensory devices is challenging, so bad temporal matches across modalities can occur. Hence, the labels provided should be considered “weak”, although test frames were carefully selected to minimize this effect and ensure the fairest comparison of the participants’ results. Despite this added difficulty, the results obtained by the participants demonstrate that current fully-supervised methods can deal with it and achieve outstanding detection performance when measured in terms of AP@0.50. | ||||
Address | Virtual; November 2020 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | FG | ||
Notes | HUPBA | Approved | no | ||
Call Number | Admin @ si @ CJM2020 | Serial | 3501 | ||
Permanent link to this record | |||||
Author | Zhengying Liu; Adrien Pavao; Zhen Xu; Sergio Escalera; Isabelle Guyon; Julio C. S. Jacques Junior; Meysam Madadi; Sebastien Treguer | ||||
Title | How far are we from true AutoML: reflection from winning solutions and results of AutoDL challenge | Type | Conference Article | ||
Year | 2020 | Publication | 7th ICML Workshop on Automated Machine Learning | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Following the completion of the AutoDL challenge (the final challenge in the ChaLearn AutoDL challenge series 2019), we investigate winning solutions and challenge results to answer an important motivational question: how far are we from achieving true AutoML? On one hand, the winning solutions achieve good (accurate and fast) classification performance on unseen datasets. On the other hand, all winning solutions still contain a considerable amount of hard-coded knowledge on the domain (or modality) such as image, video, text, speech and tabular. This form of ad-hoc meta-learning could be replaced by more automated forms of meta-learning in the future. Organizing a meta-learning challenge could help forge AutoML solutions that generalize to new unseen domains (e.g. new types of sensor data) as well as gain insights on the AutoML problem from a more fundamental point of view. The datasets of the AutoDL challenge are a resource that can be used for further benchmarks and the code of the winners has been open-sourced, which is a big step towards “democratizing” Deep Learning. | ||||
Address | Virtual; July 2020 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICML | ||
Notes | HUPBA | Approved | no | ||
Call Number | Admin @ si @ LPX2020 | Serial | 3502 | ||
Permanent link to this record |