|
Records |
Links |
|
Author |
Josep Famadas; Meysam Madadi; Cristina Palmero; Sergio Escalera |
![download PDF file pdf](img/file_PDF.gif)
![find record details (via OpenURL) openurl](img/xref.gif)
|
|
Title |
Generative Video Face Reenactment by AUs and Gaze Regularization |
Type |
Conference Article |
|
Year |
2020 |
Publication |
15th IEEE International Conference on Automatic Face and Gesture Recognition |
Abbreviated Journal |
|
|
|
Volume ![sorted by Volume (numeric) field, ascending order (up)](img/sort_asc.gif) |
|
Issue |
|
Pages |
444-451 |
|
|
Keywords |
|
|
|
Abstract |
In this work, we propose an encoder-decoder-like architecture to perform face reenactment in image sequences. Our goal is to transfer the training subject identity to a given test subject. We regularize face reenactment by facial action unit intensity and 3D gaze vector regression. This way, we enforce the network to transfer subtle facial expressions and eye dynamics, providing a more lifelike result. The proposed encoder-decoder receives as input the previous sequence frame stacked to the current frame image of facial landmarks. Thus, the generated frames benefit from appearance and geometry, while keeping temporal coherence for the generated sequence. At test stage, a new target subject with the facial performance of the source subject and the appearance of the training subject is reenacted. Principal component analysis is applied to project the test subject geometry to the closest training subject geometry before reenactment. Evaluation of our proposal shows faster convergence, and more accurate and realistic results in comparison to other architectures without action units and gaze regularization. |
|
|
Address |
Virtual; November 2020 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
FG |
|
|
Notes |
HUPBA |
Approved |
no |
|
|
Call Number |
Admin @ si @ FMP2020 |
Serial |
3517 |
|
Permanent link to this record |
|
|
|
|
Author |
Carlos Martin-Isla; Maryam Asadi-Aghbolaghi; Polyxeni Gkontra; Victor M. Campello; Sergio Escalera; Karim Lekadir |
![download PDF file pdf](img/file_PDF.gif)
|
|
Title |
Stacked BCDU-net with semantic CMR synthesis: application to Myocardial Pathology Segmentation challenge |
Type |
Conference Article |
|
Year |
2020 |
Publication |
MYOPS challenge and workshop |
Abbreviated Journal |
|
|
|
Volume ![sorted by Volume (numeric) field, ascending order (up)](img/sort_asc.gif) |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Virtual; October 2020 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
MICCAIW |
|
|
Notes |
HUPBA |
Approved |
no |
|
|
Call Number |
Admin @ si @ MAG2020 |
Serial |
3518 |
|
Permanent link to this record |
|
|
|
|
Author |
Hugo Bertiche; Meysam Madadi; Sergio Escalera |
![download PDF file pdf](img/file_PDF.gif)
|
|
Title |
CLOTH3D: Clothed 3D Humans |
Type |
Conference Article |
|
Year |
2020 |
Publication |
16th European Conference on Computer Vision |
Abbreviated Journal |
|
|
|
Volume ![sorted by Volume (numeric) field, ascending order (up)](img/sort_asc.gif) |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
This work presents CLOTH3D, the first big scale synthetic dataset of 3D clothed human sequences. CLOTH3D contains a large variability on garment type, topology, shape, size, tightness and fabric. Clothes are simulated on top of thousands of different pose sequences and body shapes, generating realistic cloth dynamics. We provide the dataset with a generative model for cloth generation. We propose a Conditional Variational Auto-Encoder (CVAE) based on graph convolutions (GCVAE) to learn garment latent spaces. This allows for realistic generation of 3D garments on top of SMPL model for any pose and shape. |
|
|
Address |
Virtual; August 2020 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ECCV |
|
|
Notes |
HUPBA |
Approved |
no |
|
|
Call Number |
Admin @ si @ BME2020 |
Serial |
3519 |
|
Permanent link to this record |
|
|
|
|
Author |
Reza Azad; Maryam Asadi-Aghbolaghi; Mahmood Fathy; Sergio Escalera |
![find record details (via OpenURL) openurl](img/xref.gif)
|
|
Title |
Attention Deeplabv3+: Multi-level Context Attention Mechanism for Skin Lesion Segmentation |
Type |
Conference Article |
|
Year |
2020 |
Publication |
Bioimage computation workshop |
Abbreviated Journal |
|
|
|
Volume ![sorted by Volume (numeric) field, ascending order (up)](img/sort_asc.gif) |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Virtual; August 2020 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ECCVW |
|
|
Notes |
HUPBA |
Approved |
no |
|
|
Call Number |
Admin @ si @ AAF2020 |
Serial |
3520 |
|
Permanent link to this record |
|
|
|
|
Author |
Petia Radeva |
![find record details (via OpenURL) openurl](img/xref.gif)
|
|
Title |
Uncertainty Modeling within an End-to-end Framework for Food Image Analysis |
Type |
Conference Article |
|
Year |
2020 |
Publication |
1st DELTA |
Abbreviated Journal |
|
|
|
Volume ![sorted by Volume (numeric) field, ascending order (up)](img/sort_asc.gif) |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
DELTA |
|
|
Notes |
MILAB |
Approved |
no |
|
|
Call Number |
Admin @ si @ Rad2020 |
Serial |
3527 |
|
Permanent link to this record |
|
|
|
|
Author |
Mariona Caros; Maite Garolera; Petia Radeva; Xavier Giro |
![download PDF file pdf](img/file_PDF.gif)
![find record details (via OpenURL) openurl](img/xref.gif)
|
|
Title |
Automatic Reminiscence Therapy for Dementia |
Type |
Conference Article |
|
Year |
2020 |
Publication |
10th ACM International Conference on Multimedia Retrieval |
Abbreviated Journal |
|
|
|
Volume ![sorted by Volume (numeric) field, ascending order (up)](img/sort_asc.gif) |
|
Issue |
|
Pages |
383-387 |
|
|
Keywords |
|
|
|
Abstract |
With people living longer than ever, the number of cases with dementia such as Alzheimer's disease increases steadily. It affects more than 46 million people worldwide, and it is estimated that in 2050 more than 100 million will be affected. While there are not effective treatments for these terminal diseases, therapies such as reminiscence, that stimulate memories from the past are recommended. Currently, reminiscence therapy takes place in care homes and is guided by a therapist or a carer. In this work, we present an AI-based solution to automatize the reminiscence therapy, which consists in a dialogue system that uses photos as input to generate questions. We run a usability case study with patients diagnosed of mild cognitive impairment that shows they found the system very entertaining and challenging. Overall, this paper presents how reminiscence therapy can be automatized by using machine learning, and deployed to smartphones and laptops, making the therapy more accessible to every person affected by dementia. |
|
|
Address |
Virtual; October 2020 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICRM |
|
|
Notes |
|
Approved |
no |
|
|
Call Number |
Admin @ si @ CGR2020 |
Serial |
3529 |
|
Permanent link to this record |
|
|
|
|
Author |
Esmitt Ramirez; Carles Sanchez; Debora Gil |
![download PDF file pdf](img/file_PDF.gif)
![goto web page (via DOI) doi](img/doi.gif)
|
|
Title |
Localizing Pulmonary Lesions Using Fuzzy Deep Learning |
Type |
Conference Article |
|
Year |
2019 |
Publication |
21st International Symposium on Symbolic and Numeric Algorithms for Scientific Computing |
Abbreviated Journal |
|
|
|
Volume ![sorted by Volume (numeric) field, ascending order (up)](img/sort_asc.gif) |
|
Issue |
|
Pages |
290-294 |
|
|
Keywords |
|
|
|
Abstract |
The usage of medical images is part of the clinical daily in several healthcare centers around the world. Particularly, Computer Tomography (CT) images are an important key in the early detection of suspicious lung lesions. The CT image exploration allows the detection of lung lesions before any invasive procedure (e.g. bronchoscopy, biopsy). The effective localization of lesions is performed using different image processing and computer vision techniques. Lately, the usage of deep learning models into medical imaging from detection to prediction shown that is a powerful tool for Computer-aided software. In this paper, we present an approach to localize pulmonary lung lesion using fuzzy deep learning. Our approach uses a simple convolutional neural network based using the LIDC-IDRI dataset. Each image is divided into patches associated a probability vector (fuzzy) according their belonging to anatomical structures on a CT. We showcase our approach as part of a full CAD system to exploration, planning, guiding and detection of pulmonary lesions. |
|
|
Address |
Timisoara; Rumania; September 2019 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
SYNASC |
|
|
Notes |
IAM; 600.145; 600.140; 601.337; 601.323 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RSG2019 |
Serial |
3531 |
|
Permanent link to this record |
|
|
|
|
Author |
Cristina Palmero; Javier Selva; Sorina Smeureanu; Julio C. S. Jacques Junior; Albert Clapes; Alexa Mosegui; Zejian Zhang; David Gallardo; Georgina Guilera; David Leiva; Sergio Escalera |
![download PDF file pdf](img/file_PDF.gif)
![find record details (via OpenURL) openurl](img/xref.gif)
|
|
Title |
Context-Aware Personality Inference in Dyadic Scenarios: Introducing the UDIVA Dataset |
Type |
Conference Article |
|
Year |
2021 |
Publication |
IEEE Winter Conference on Applications of Computer Vision |
Abbreviated Journal |
|
|
|
Volume ![sorted by Volume (numeric) field, ascending order (up)](img/sort_asc.gif) |
|
Issue |
|
Pages |
1-12 |
|
|
Keywords |
|
|
|
Abstract |
This paper introduces UDIVA, a new non-acted dataset of face-to-face dyadic interactions, where interlocutors perform competitive and collaborative tasks with different behavior elicitation and cognitive workload. The dataset consists of 90.5 hours of dyadic interactions among 147 participants distributed in 188 sessions, recorded using multiple audiovisual and physiological sensors. Currently, it includes sociodemographic, self- and peer-reported personality, internal state, and relationship profiling from participants. As an initial analysis on UDIVA, we propose a
transformer-based method for self-reported personality inference in dyadic scenarios, which uses audiovisual data and different sources of context from both interlocutors to
regress a target person’s personality traits. Preliminary results from an incremental study show consistent improvements when using all available context information. |
|
|
Address |
Virtual; January 2021 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
WACV |
|
|
Notes |
HUPBA |
Approved |
no |
|
|
Call Number |
Admin @ si @ PSS2021 |
Serial |
3532 |
|
Permanent link to this record |
|
|
|
|
Author |
Julio C. S. Jacques Junior; Agata Lapedriza; Cristina Palmero; Xavier Baro; Sergio Escalera |
![download PDF file pdf](img/file_PDF.gif)
![find record details (via OpenURL) openurl](img/xref.gif)
|
|
Title |
Person Perception Biases Exposed: Revisiting the First Impressions Dataset |
Type |
Conference Article |
|
Year |
2021 |
Publication |
IEEE Winter Conference on Applications of Computer Vision |
Abbreviated Journal |
|
|
|
Volume ![sorted by Volume (numeric) field, ascending order (up)](img/sort_asc.gif) |
|
Issue |
|
Pages |
13-21 |
|
|
Keywords |
|
|
|
Abstract |
This work revisits the ChaLearn First Impressions database, annotated for personality perception using pairwise comparisons via crowdsourcing. We analyse for the first time the original pairwise annotations, and reveal existing person perception biases associated to perceived attributes like gender, ethnicity, age and face attractiveness.
We show how person perception bias can influence data labelling of a subjective task, which has received little attention from the computer vision and machine learning communities by now. We further show that the mechanism used to convert pairwise annotations to continuous values may magnify the biases if no special treatment is considered. The findings of this study are relevant for the computer vision community that is still creating new datasets on subjective tasks, and using them for practical applications, ignoring these perceptual biases. |
|
|
Address |
Virtual; January 2021 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
WACV |
|
|
Notes |
HUPBA |
Approved |
no |
|
|
Call Number |
Admin @ si @ JLP2021 |
Serial |
3533 |
|
Permanent link to this record |
|
|
|
|
Author |
Soumick Chatterjee; Fatima Saad; Chompunuch Sarasaen; Suhita Ghosh; Rupali Khatun; Petia Radeva; Georg Rose; Sebastian Stober; Oliver Speck; Andreas Nürnberger |
![download PDF file pdf](img/file_PDF.gif)
|
|
Title |
Exploration of Interpretability Techniques for Deep COVID-19 Classification using Chest X-ray Images |
Type |
Miscellaneous |
|
Year |
2020 |
Publication |
Arxiv |
Abbreviated Journal |
|
|
|
Volume ![sorted by Volume (numeric) field, ascending order (up)](img/sort_asc.gif) |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
CoRR abs/2006.02570
The outbreak of COVID-19 has shocked the entire world with its fairly rapid spread and has challenged different sectors. One of the most effective ways to limit its spread is the early and accurate diagnosis of infected patients. Medical imaging such as X-ray and Computed Tomography (CT) combined with the potential of Artificial Intelligence (AI) plays an essential role in supporting the medical staff in the diagnosis process. Thereby, the use of five different deep learning models (ResNet18, ResNet34, InceptionV3, InceptionResNetV2, and DenseNet161) and their Ensemble have been used in this paper, to classify COVID-19, pneumoniæ and healthy subjects using Chest X-Ray. Multi-label classification was performed to predict multiple pathologies for each patient, if present. Foremost, the interpretability of each of the networks was thoroughly studied using techniques like occlusion, saliency, input X gradient, guided backpropagation, integrated gradients, and DeepLIFT. The mean Micro-F1 score of the models for COVID-19 classifications ranges from 0.66 to 0.875, and is 0.89 for the Ensemble of the network models. The qualitative results depicted the ResNets to be the most interpretable model. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
MILAB |
Approved |
no |
|
|
Call Number |
Admin @ si @ CSS2020 |
Serial |
3534 |
|
Permanent link to this record |
|
|
|
|
Author |
Estefania Talavera; Andreea Glavan; Alina Matei; Petia Radeva |
![download PDF file pdf](img/file_PDF.gif)
|
|
Title |
Eating Habits Discovery in Egocentric Photo-streams |
Type |
Miscellaneous |
|
Year |
2020 |
Publication |
Arxiv |
Abbreviated Journal |
|
|
|
Volume ![sorted by Volume (numeric) field, ascending order (up)](img/sort_asc.gif) |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
CoRR abs/2009.07646
Eating habits are learned throughout the early stages of our lives. However, it is not easy to be aware of how our food-related routine affects our healthy living. In this work, we address the unsupervised discovery of nutritional habits from egocentric photo-streams. We build a food-related behavioural pattern discovery model, which discloses nutritional routines from the activities performed throughout the days. To do so, we rely on Dynamic-Time-Warping for the evaluation of similarity among the collected days. Within this framework, we present a simple, but robust and fast novel classification pipeline that outperforms the state-of-the-art on food-related image classification with a weighted accuracy and F-score of 70% and 63%, respectively. Later, we identify days composed of nutritional activities that do not describe the habits of the person as anomalies in the daily life of the user with the Isolation Forest method. Furthermore, we show an application for the identification of food-related scenes when the camera wearer eats in isolation. Results have shown the good performance of the proposed model and its relevance to visualize the nutritional habits of individuals. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
MILAB |
Approved |
no |
|
|
Call Number |
Admin @ si @ TGM2020 |
Serial |
3536 |
|
Permanent link to this record |
|
|
|
|
Author |
Marc Masana; Xialei Liu; Bartlomiej Twardowski; Mikel Menta; Andrew Bagdanov; Joost Van de Weijer |
![download PDF file pdf](img/file_PDF.gif)
![find record details (via OpenURL) openurl](img/xref.gif)
|
|
Title |
Class-incremental learning: survey and performance evaluation |
Type |
Journal Article |
|
Year |
2022 |
Publication |
IEEE Transactions on Pattern Analysis and Machine Intelligence |
Abbreviated Journal |
TPAMI |
|
|
Volume ![sorted by Volume (numeric) field, ascending order (up)](img/sort_asc.gif) |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
For future learning systems incremental learning is desirable, because it allows for: efficient resource usage by eliminating the need to retrain from scratch at the arrival of new data; reduced memory usage by preventing or limiting the amount of data required to be stored -- also important when privacy limitations are imposed; and learning that more closely resembles human learning. The main challenge for incremental learning is catastrophic forgetting, which refers to the precipitous drop in performance on previously learned tasks after learning a new one. Incremental learning of deep neural networks has seen explosive growth in recent years. Initial work focused on task incremental learning, where a task-ID is provided at inference time. Recently we have seen a shift towards class-incremental learning where the learner must classify at inference time between all classes seen in previous tasks without recourse to a task-ID. In this paper, we provide a complete survey of existing methods for incremental learning, and in particular we perform an extensive experimental evaluation on twelve class-incremental methods. We consider several new experimental scenarios, including a comparison of class-incremental methods on multiple large-scale datasets, investigation into small and large domain shifts, and comparison on various network architectures. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
LAMP; 600.120 |
Approved |
no |
|
|
Call Number |
Admin @ si @ MLT2022 |
Serial |
3538 |
|
Permanent link to this record |
|
|
|
|
Author |
Shiqi Yang; Yaxing Wang; Joost Van de Weijer; Luis Herranz |
![download PDF file pdf](img/file_PDF.gif)
|
|
Title |
Unsupervised Domain Adaptation without Source Data by Casting a BAIT |
Type |
Miscellaneous |
|
Year |
2020 |
Publication |
Arxiv |
Abbreviated Journal |
|
|
|
Volume ![sorted by Volume (numeric) field, ascending order (up)](img/sort_asc.gif) |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
arXiv:2010.12427
Unsupervised domain adaptation (UDA) aims to transfer the knowledge learned from a labeled source domain to an unlabeled target domain. Existing UDA methods require access to source data during adaptation, which may not be feasible in some real-world applications. In this paper, we address the source-free unsupervised domain adaptation (SFUDA) problem, where only the source model is available during the adaptation. We propose a method named BAIT to address SFUDA. Specifically, given only the source model, with the source classifier head fixed, we introduce a new learnable classifier. When adapting to the target domain, class prototypes of the new added classifier will act as a bait. They will first approach the target features which deviate from prototypes of the source classifier due to domain shift. Then those target features are pulled towards the corresponding prototypes of the source classifier, thus achieving feature alignment with the source classifier in the absence of source data. Experimental results show that the proposed method achieves state-of-the-art performance on several benchmark datasets compared with existing UDA and SFUDA methods. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
LAMP; 600.120 |
Approved |
no |
|
|
Call Number |
Admin @ si @ YWW2020 |
Serial |
3539 |
|
Permanent link to this record |
|
|
|
|
Author |
Shiqi Yang; Kai Wang; Luis Herranz; Joost Van de Weijer |
![download PDF file pdf](img/file_PDF.gif)
|
|
Title |
Simple and effective localized attribute representations for zero-shot learning |
Type |
Miscellaneous |
|
Year |
2020 |
Publication |
Arxiv |
Abbreviated Journal |
|
|
|
Volume ![sorted by Volume (numeric) field, ascending order (up)](img/sort_asc.gif) |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
arXiv:2006.05938
Zero-shot learning (ZSL) aims to discriminate images from unseen classes by exploiting relations to seen classes via their semantic descriptions. Some recent papers have shown the importance of localized features together with fine-tuning the feature extractor to obtain discriminative and transferable features. However, these methods require complex attention or part detection modules to perform explicit localization in the visual space. In contrast, in this paper we propose localizing representations in the semantic/attribute space, with a simple but effective pipeline where localization is implicit. Focusing on attribute representations, we show that our method obtains state-of-the-art performance on CUB and SUN datasets, and also achieves competitive results on AWA2 dataset, outperforming generally more complex methods with explicit localization in the visual space. Our method can be implemented easily, which can be used as a new baseline for zero shot-learning. In addition, our localized representations are highly interpretable as attribute-specific heatmaps. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
LAMP; 600.120 |
Approved |
no |
|
|
Call Number |
Admin @ si @ YWH2020 |
Serial |
3542 |
|
Permanent link to this record |
|
|
|
|
Author |
Kai Wang; Luis Herranz; Joost Van de Weijer |
![download PDF file pdf](img/file_PDF.gif)
![goto web page (via DOI) doi](img/doi.gif)
|
|
Title |
Continual learning in cross-modal retrieval |
Type |
Conference Article |
|
Year |
2021 |
Publication |
2nd CLVISION workshop |
Abbreviated Journal |
|
|
|
Volume ![sorted by Volume (numeric) field, ascending order (up)](img/sort_asc.gif) |
|
Issue |
|
Pages |
3628-3638 |
|
|
Keywords |
|
|
|
Abstract |
Multimodal representations and continual learning are two areas closely related to human intelligence. The former considers the learning of shared representation spaces where information from different modalities can be compared and integrated (we focus on cross-modal retrieval between language and visual representations). The latter studies how to prevent forgetting a previously learned task when learning a new one. While humans excel in these two aspects, deep neural networks are still quite limited. In this paper, we propose a combination of both problems into a continual cross-modal retrieval setting, where we study how the catastrophic interference caused by new tasks impacts the embedding spaces and their cross-modal alignment required for effective retrieval. We propose a general framework that decouples the training, indexing and querying stages. We also identify and study different factors that may lead to forgetting, and propose tools to alleviate it. We found that the indexing stage pays an important role and that simply avoiding reindexing the database with updated embedding networks can lead to significant gains. We evaluated our methods in two image-text retrieval datasets, obtaining significant gains with respect to the fine tuning baseline. |
|
|
Address |
Virtual; June 2021 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CVPRW |
|
|
Notes |
LAMP; 600.120; 600.141; 600.147; 601.379 |
Approved |
no |
|
|
Call Number |
Admin @ si @ WHW2021 |
Serial |
3566 |
|
Permanent link to this record |