|   | 
Details
   web
Records
Author Riccardo Del Chiaro; Bartlomiej Twardowski; Andrew Bagdanov; Joost Van de Weijer
Title Recurrent attention to transient tasks for continual image captioning Type Conference Article
Year 2020 Publication 34th Conference on Neural Information Processing Systems Abbreviated Journal
Volume Issue Pages (down)
Keywords
Abstract Research on continual learning has led to a variety of approaches to mitigating catastrophic forgetting in feed-forward classification networks. Until now surprisingly little attention has been focused on continual learning of recurrent models applied to problems like image captioning. In this paper we take a systematic look at continual learning of LSTM-based models for image captioning. We propose an attention-based approach that explicitly accommodates the transient nature of vocabularies in continual image captioning tasks -- i.e. that task vocabularies are not disjoint. We call our method Recurrent Attention to Transient Tasks (RATT), and also show how to adapt continual learning approaches based on weight egularization and knowledge distillation to recurrent continual learning problems. We apply our approaches to incremental image captioning problem on two new continual learning benchmarks we define using the MS-COCO and Flickr30 datasets. Our results demonstrate that RATT is able to sequentially learn five captioning tasks while incurring no forgetting of previously learned ones.
Address virtual; December 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference NEURIPS
Notes LAMP; 600.120 Approved no
Call Number Admin @ si @ CTB2020 Serial 3484
Permanent link to this record
 

 
Author Yaxing Wang; Lu Yu; Joost Van de Weijer
Title DeepI2I: Enabling Deep Hierarchical Image-to-Image Translation by Transferring from GANs Type Conference Article
Year 2020 Publication 34th Conference on Neural Information Processing Systems Abbreviated Journal
Volume Issue Pages (down)
Keywords
Abstract Image-to-image translation has recently achieved remarkable results. But despite current success, it suffers from inferior performance when translations between classes require large shape changes. We attribute this to the high-resolution bottlenecks which are used by current state-of-the-art image-to-image methods. Therefore, in this work, we propose a novel deep hierarchical Image-to-Image Translation method, called DeepI2I. We learn a model by leveraging hierarchical features: (a) structural information contained in the shallow layers and (b) semantic information extracted from the deep layers. To enable the training of deep I2I models on small datasets, we propose a novel transfer learning method, that transfers knowledge from pre-trained GANs. Specifically, we leverage the discriminator of a pre-trained GANs (i.e. BigGAN or StyleGAN) to initialize both the encoder and the discriminator and the pre-trained generator to initialize the generator of our model. Applying knowledge transfer leads to an alignment problem between the encoder and generator. We introduce an adaptor network to address this. On many-class image-to-image translation on three datasets (Animal faces, Birds, and Foods) we decrease mFID by at least 35% when compared to the state-of-the-art. Furthermore, we qualitatively and quantitatively demonstrate that transfer learning significantly improves the performance of I2I systems, especially for small datasets. Finally, we are the first to perform I2I translations for domains with over 100 classes.
Address virtual; December 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference NEURIPS
Notes LAMP; 600.120 Approved no
Call Number Admin @ si @ WYW2020 Serial 3485
Permanent link to this record
 

 
Author Yaxing Wang; Salman Khan; Abel Gonzalez-Garcia; Joost Van de Weijer; Fahad Shahbaz Khan
Title Semi-supervised Learning for Few-shot Image-to-Image Translation Type Conference Article
Year 2020 Publication 33rd IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal
Volume Issue Pages (down)
Keywords
Abstract In the last few years, unpaired image-to-image translation has witnessed remarkable progress. Although the latest methods are able to generate realistic images, they crucially rely on a large number of labeled images. Recently, some methods have tackled the challenging setting of few-shot image-to-image translation, reducing the labeled data requirements for the target domain during inference. In this work, we go one step further and reduce the amount of required labeled data also from the source domain during training. To do so, we propose applying semi-supervised learning via a noise-tolerant pseudo-labeling procedure. We also apply a cycle consistency constraint to further exploit the information from unlabeled images, either from the same dataset or external. Additionally, we propose several structural modifications to facilitate the image translation task under these circumstances. Our semi-supervised method for few-shot image translation, called SEMIT, achieves excellent results on four different datasets using as little as 10% of the source labels, and matches the performance of the main fully-supervised competitor using only 20% labeled data. Our code and models are made public at: this https URL.
Address Virtual; June 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVPR
Notes LAMP; 600.120 Approved no
Call Number Admin @ si @ WKG2020 Serial 3486
Permanent link to this record
 

 
Author Yi Xiao; Felipe Codevilla; Christopher Pal; Antonio Lopez
Title Action-Based Representation Learning for Autonomous Driving Type Conference Article
Year 2020 Publication Conference on Robot Learning Abbreviated Journal
Volume Issue Pages (down)
Keywords
Abstract Human drivers produce a vast amount of data which could, in principle, be used to improve autonomous driving systems. Unfortunately, seemingly straightforward approaches for creating end-to-end driving models that map sensor data directly into driving actions are problematic in terms of interpretability, and typically have significant difficulty dealing with spurious correlations. Alternatively, we propose to use this kind of action-based driving data for learning representations. Our experiments show that an affordance-based driving model pre-trained with this approach can leverage a relatively small amount of weakly annotated imagery and outperform pure end-to-end driving models, while being more interpretable. Further, we demonstrate how this strategy outperforms previous methods based on learning inverse dynamics models as well as other methods based on heavy human supervision (ImageNet).
Address virtual; November 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CORL
Notes ADAS; 600.118 Approved no
Call Number Admin @ si @ XCP2020 Serial 3487
Permanent link to this record
 

 
Author Hannes Mueller; Andre Groger; Jonathan Hersh; Andrea Matranga; Joan Serrat
Title Monitoring War Destruction from Space: A Machine Learning Approach Type Miscellaneous
Year 2020 Publication Arxiv Abbreviated Journal
Volume Issue Pages (down)
Keywords
Abstract Existing data on building destruction in conflict zones rely on eyewitness reports or manual detection, which makes it generally scarce, incomplete and potentially biased. This lack of reliable data imposes severe limitations for media reporting, humanitarian relief efforts, human rights monitoring, reconstruction initiatives, and academic studies of violent conflict. This article introduces an automated method of measuring destruction in high-resolution satellite images using deep learning techniques combined with data augmentation to expand training samples. We apply this method to the Syrian civil war and reconstruct the evolution of damage in major cities across the country. The approach allows generating destruction data with unprecedented scope, resolution, and frequency – only limited by the available satellite imagery – which can alleviate data limitations decisively.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS; 600.118 Approved no
Call Number Admin @ si @ MGH2020 Serial 3489
Permanent link to this record
 

 
Author Lluis Gomez; Anguelos Nicolaou; Marçal Rusiñol; Dimosthenis Karatzas
Title 12 years of ICDAR Robust Reading Competitions: The evolution of reading systems for unconstrained text understanding Type Book Chapter
Year 2020 Publication Visual Text Interpretation – Algorithms and Applications in Scene Understanding and Document Analysis Abbreviated Journal
Volume Issue Pages (down)
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Springer Place of Publication Editor K. Alahari; C.V. Jawahar
Language Summary Language Original Title
Series Editor Series Title Series on Advances in Computer Vision and Pattern Recognition Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.121 Approved no
Call Number GNR2020 Serial 3494
Permanent link to this record
 

 
Author Lluis Gomez; Dena Bazazian; Dimosthenis Karatzas
Title Historical review of scene text detection research Type Book Chapter
Year 2020 Publication Visual Text Interpretation – Algorithms and Applications in Scene Understanding and Document Analysis Abbreviated Journal
Volume Issue Pages (down)
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Springer Place of Publication Editor K. Alahari; C.V. Jawahar
Language Summary Language Original Title
Series Editor Series Title Series on Advances in Computer Vision and Pattern Recognition Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.121 Approved no
Call Number Admin @ si @ GBK2020 Serial 3495
Permanent link to this record
 

 
Author Jon Almazan; Lluis Gomez; Suman Ghosh; Ernest Valveny; Dimosthenis Karatzas
Title WATTS: A common representation of word images and strings using embedded attributes for text recognition and retrieval Type Book Chapter
Year 2020 Publication Visual Text Interpretation – Algorithms and Applications in Scene Understanding and Document Analysis Abbreviated Journal
Volume Issue Pages (down)
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Springer Place of Publication Editor Analysis”, K. Alahari; C.V. Jawahar
Language Summary Language Original Title
Series Editor Series Title Series on Advances in Computer Vision and Pattern Recognition Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.121 Approved no
Call Number Admin @ si @ AGG2020 Serial 3496
Permanent link to this record
 

 
Author Raul Gomez; Yahui Liu; Marco de Nadai; Dimosthenis Karatzas; Bruno Lepri; Nicu Sebe
Title Retrieval Guided Unsupervised Multi-domain Image to Image Translation Type Conference Article
Year 2020 Publication 28th ACM International Conference on Multimedia Abbreviated Journal
Volume Issue Pages (down)
Keywords
Abstract Image to image translation aims to learn a mapping that transforms an image from one visual domain to another. Recent works assume that images descriptors can be disentangled into a domain-invariant content representation and a domain-specific style representation. Thus, translation models seek to preserve the content of source images while changing the style to a target visual domain. However, synthesizing new images is extremely challenging especially in multi-domain translations, as the network has to compose content and style to generate reliable and diverse images in multiple domains. In this paper we propose the use of an image retrieval system to assist the image-to-image translation task. First, we train an image-to-image translation model to map images to multiple domains. Then, we train an image retrieval model using real and generated images to find images similar to a query one in content but in a different domain. Finally, we exploit the image retrieval system to fine-tune the image-to-image translation model and generate higher quality images. Our experiments show the effectiveness of the proposed solution and highlight the contribution of the retrieval network, which can benefit from additional unlabeled data and help image-to-image translation models in the presence of scarce data.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ACM
Notes DAG; 600.121 Approved no
Call Number Admin @ si @ GLN2020 Serial 3497
Permanent link to this record
 

 
Author Zhengying Liu; Adrien Pavao; Zhen Xu; Sergio Escalera; Isabelle Guyon; Julio C. S. Jacques Junior; Meysam Madadi; Sebastien Treguer
Title How far are we from true AutoML: reflection from winning solutions and results of AutoDL challenge Type Conference Article
Year 2020 Publication 7th ICML Workshop on Automated Machine Learning Abbreviated Journal
Volume Issue Pages (down)
Keywords
Abstract Following the completion of the AutoDL challenge (the final challenge in the ChaLearn
AutoDL challenge series 2019), we investigate winning solutions and challenge results to
answer an important motivational question: how far are we from achieving true AutoML?
On one hand, the winning solutions achieve good (accurate and fast) classification performance on unseen datasets. On the other hand, all winning solutions still contain a
considerable amount of hard-coded knowledge on the domain (or modality) such as image,
video, text, speech and tabular. This form of ad-hoc meta-learning could be replaced by
more automated forms of meta-learning in the future. Organizing a meta-learning challenge could help forging AutoML solutions that generalize to new unseen domains (e.g.
new types of sensor data) as well as gaining insights on the AutoML problem from a more
fundamental point of view. The datasets of the AutoDL challenge are a resource that can
be used for further benchmarks and the code of the winners has been outsourced, which is
a big step towards “democratizing” Deep Learning.
Address Virtual; July 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICML
Notes HUPBA Approved no
Call Number Admin @ si @ LPX2020 Serial 3502
Permanent link to this record
 

 
Author Marc Masana; Bartlomiej Twardowski; Joost Van de Weijer
Title On Class Orderings for Incremental Learning Type Conference Article
Year 2020 Publication ICML Workshop on Continual Learning Abbreviated Journal
Volume Issue Pages (down)
Keywords
Abstract The influence of class orderings in the evaluation of incremental learning has received very little attention. In this paper, we investigate the impact of class orderings for incrementally learned classifiers. We propose a method to compute various orderings for a dataset. The orderings are derived by simulated annealing optimization from the confusion matrix and reflect different incremental learning scenarios, including maximally and minimally confusing tasks. We evaluate a wide range of state-of-the-art incremental learning methods on the proposed orderings. Results show that orderings can have a significant impact on performance and the ranking of the methods.
Address Virtual; July 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICMLW
Notes LAMP; 600.120 Approved no
Call Number Admin @ si @ MTW2020 Serial 3505
Permanent link to this record
 

 
Author David Berga; Marc Masana; Joost Van de Weijer
Title Disentanglement of Color and Shape Representations for Continual Learning Type Conference Article
Year 2020 Publication ICML Workshop on Continual Learning Abbreviated Journal
Volume Issue Pages (down)
Keywords
Abstract We hypothesize that disentangled feature representations suffer less from catastrophic forgetting. As a case study we perform explicit disentanglement of color and shape, by adjusting the network architecture. We tested classification accuracy and forgetting in a task-incremental setting with Oxford-102 Flowers dataset. We combine our method with Elastic Weight Consolidation, Learning without Forgetting, Synaptic Intelligence and Memory Aware Synapses, and show that feature disentanglement positively impacts continual learning performance.
Address Virtual; July 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICMLW
Notes LAMP; 600.120 Approved no
Call Number Admin @ si @ BMW2020 Serial 3506
Permanent link to this record
 

 
Author Manuel Carbonell; Pau Riba; Mauricio Villegas; Alicia Fornes; Josep Llados
Title Named Entity Recognition and Relation Extraction with Graph Neural Networks in Semi Structured Documents Type Conference Article
Year 2020 Publication 25th International Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages (down)
Keywords
Abstract The use of administrative documents to communicate and leave record of business information requires of methods
able to automatically extract and understand the content from
such documents in a robust and efficient way. In addition,
the semi-structured nature of these reports is specially suited
for the use of graph-based representations which are flexible
enough to adapt to the deformations from the different document
templates. Moreover, Graph Neural Networks provide the proper
methodology to learn relations among the data elements in
these documents. In this work we study the use of Graph
Neural Network architectures to tackle the problem of entity
recognition and relation extraction in semi-structured documents.
Our approach achieves state of the art results in the three
tasks involved in the process. Additionally, the experimentation
with two datasets of different nature demonstrates the good
generalization ability of our approach.
Address Virtual; January 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICPR
Notes DAG; 600.121 Approved no
Call Number Admin @ si @ CRV2020 Serial 3509
Permanent link to this record
 

 
Author Carlos Martin-Isla; Maryam Asadi-Aghbolaghi; Polyxeni Gkontra; Victor M. Campello; Sergio Escalera; Karim Lekadir
Title Stacked BCDU-net with semantic CMR synthesis: application to Myocardial Pathology Segmentation challenge Type Conference Article
Year 2020 Publication MYOPS challenge and workshop Abbreviated Journal
Volume Issue Pages (down)
Keywords
Abstract
Address Virtual; October 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference MICCAIW
Notes HUPBA Approved no
Call Number Admin @ si @ MAG2020 Serial 3518
Permanent link to this record
 

 
Author Hugo Bertiche; Meysam Madadi; Sergio Escalera
Title CLOTH3D: Clothed 3D Humans Type Conference Article
Year 2020 Publication 16th European Conference on Computer Vision Abbreviated Journal
Volume Issue Pages (down)
Keywords
Abstract This work presents CLOTH3D, the first big scale synthetic dataset of 3D clothed human sequences. CLOTH3D contains a large variability on garment type, topology, shape, size, tightness and fabric. Clothes are simulated on top of thousands of different pose sequences and body shapes, generating realistic cloth dynamics. We provide the dataset with a generative model for cloth generation. We propose a Conditional Variational Auto-Encoder (CVAE) based on graph convolutions (GCVAE) to learn garment latent spaces. This allows for realistic generation of 3D garments on top of SMPL model for any pose and shape.
Address Virtual; August 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ECCV
Notes HUPBA Approved no
Call Number Admin @ si @ BME2020 Serial 3519
Permanent link to this record