Home | << 1 2 3 4 5 6 7 8 9 >> |
Records | |||||
---|---|---|---|---|---|
Author | Razieh Rastgoo; Kourosh Kiani; Sergio Escalera | ||||
Title | Sign Language Recognition: A Deep Survey | Type | Journal Article | ||
Year | 2021 | Publication | Expert Systems With Applications | Abbreviated Journal | ESWA |
Volume | 164 | Issue | Pages | 113794 | |
Keywords | |||||
Abstract | Sign language, as a different form of the communication language, is important to large groups of people in society. There are different signs in each sign language with variability in hand shape, motion profile, and position of the hand, face, and body parts contributing to each sign. So, visual sign language recognition is a complex research area in computer vision. Many models have been proposed by different researchers with significant improvement by deep learning approaches in recent years. In this survey, we review the vision-based proposed models of sign language recognition using deep learning approaches from the last five years. While the overall trend of the proposed models indicates a significant improvement in recognition accuracy in sign language recognition, there are some challenges yet that need to be solved. We present a taxonomy to categorize the proposed models for isolated and continuous sign language recognition, discussing applications, datasets, hybrid models, complexity, and future lines of research in the field. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HUPBA; no proj | Approved | no | ||
Call Number | Admin @ si @ RKE2021a | Serial | 3521 | ||
Permanent link to this record | |||||
Author | Giuseppe Pezzano; Vicent Ribas Ripoll; Petia Radeva | ||||
Title | CoLe-CNN: Context-learning convolutional neural network with adaptive loss function for lung nodule segmentation | Type | Journal Article | ||
Year | 2021 | Publication | Computer Methods and Programs in Biomedicine | Abbreviated Journal | CMPB |
Volume | 198 | Issue | Pages | 105792 | |
Keywords | |||||
Abstract | Background and objective:An accurate segmentation of lung nodules in computed tomography images is a crucial step for the physical characterization of the tumour. Being often completely manually accomplished, nodule segmentation turns to be a tedious and time-consuming procedure and this represents a high obstacle in clinical practice. In this paper, we propose a novel Convolutional Neural Network for nodule segmentation that combines a light and efficient architecture with innovative loss function and segmentation strategy. Methods:In contrast to most of the standard end-to-end architectures for nodule segmentation, our network learns the context of the nodules by producing two masks representing all the background and secondary-important elements in the Computed Tomography scan. The nodule is detected by subtracting the context from the original scan image. Additionally, we introduce an asymmetric loss function that automatically compensates for potential errors in the nodule annotations. We trained and tested our Neural Network on the public LIDC-IDRI database, compared it with the state of the art and run a pseudo-Turing test between four radiologists and the network. Results:The results proved that the behaviour of the algorithm is very near to the human performance and its segmentation masks are almost indistinguishable from the ones made by the radiologists. Our method clearly outperforms the state of the art on CT nodule segmentation in terms of F1 score and IoU of and respectively. Conclusions: The main structure of the network ensures all the properties of the UNet architecture, while the Multi Convolutional Layers give a more accurate pattern recognition. The newly adopted solutions also increase the details on the border of the nodule, even under the noisiest conditions. This method can be applied now for single CT slice nodule segmentation and it represents a starting point for the future development of a fully automatic 3D segmentation software. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB; no proj | Approved | no | ||
Call Number | Admin @ si @ PRR2021 | Serial | 3530 | ||
Permanent link to this record | |||||
Author | Cristina Palmero; Javier Selva; Sorina Smeureanu; Julio C. S. Jacques Junior; Albert Clapes; Alexa Mosegui; Zejian Zhang; David Gallardo; Georgina Guilera; David Leiva; Sergio Escalera | ||||
Title | Context-Aware Personality Inference in Dyadic Scenarios: Introducing the UDIVA Dataset | Type | Conference Article | ||
Year | 2021 | Publication | IEEE Winter Conference on Applications of Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | 1-12 | ||
Keywords | |||||
Abstract | This paper introduces UDIVA, a new non-acted dataset of face-to-face dyadic interactions, where interlocutors perform competitive and collaborative tasks with different behavior elicitation and cognitive workload. The dataset consists of 90.5 hours of dyadic interactions among 147 participants distributed in 188 sessions, recorded using multiple audiovisual and physiological sensors. Currently, it includes sociodemographic, self- and peer-reported personality, internal state, and relationship profiling from participants. As an initial analysis on UDIVA, we propose a
transformer-based method for self-reported personality inference in dyadic scenarios, which uses audiovisual data and different sources of context from both interlocutors to regress a target person’s personality traits. Preliminary results from an incremental study show consistent improvements when using all available context information. |
||||
Address | Virtual; January 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | WACV | ||
Notes | HUPBA | Approved | no | ||
Call Number | Admin @ si @ PSS2021 | Serial | 3532 | ||
Permanent link to this record | |||||
Author | Julio C. S. Jacques Junior; Agata Lapedriza; Cristina Palmero; Xavier Baro; Sergio Escalera | ||||
Title | Person Perception Biases Exposed: Revisiting the First Impressions Dataset | Type | Conference Article | ||
Year | 2021 | Publication | IEEE Winter Conference on Applications of Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | 13-21 | ||
Keywords | |||||
Abstract | This work revisits the ChaLearn First Impressions database, annotated for personality perception using pairwise comparisons via crowdsourcing. We analyse for the first time the original pairwise annotations, and reveal existing person perception biases associated to perceived attributes like gender, ethnicity, age and face attractiveness.
We show how person perception bias can influence data labelling of a subjective task, which has received little attention from the computer vision and machine learning communities by now. We further show that the mechanism used to convert pairwise annotations to continuous values may magnify the biases if no special treatment is considered. The findings of this study are relevant for the computer vision community that is still creating new datasets on subjective tasks, and using them for practical applications, ignoring these perceptual biases. |
||||
Address | Virtual; January 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | WACV | ||
Notes | HUPBA | Approved | no | ||
Call Number | Admin @ si @ JLP2021 | Serial | 3533 | ||
Permanent link to this record | |||||
Author | Carola Figueroa Flores; Bogdan Raducanu; David Berga; Joost Van de Weijer | ||||
Title | Hallucinating Saliency Maps for Fine-Grained Image Classification for Limited Data Domains | Type | Conference Article | ||
Year | 2021 | Publication | 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications | Abbreviated Journal | |
Volume | 4 | Issue | Pages | 163-171 | |
Keywords | |||||
Abstract | arXiv:2007.12562
Most of the saliency methods are evaluated on their ability to generate saliency maps, and not on their functionality in a complete vision pipeline, like for instance, image classification. In the current paper, we propose an approach which does not require explicit saliency maps to improve image classification, but they are learned implicitely, during the training of an end-to-end image classification task. We show that our approach obtains similar results as the case when the saliency maps are provided explicitely. Combining RGB data with saliency maps represents a significant advantage for object recognition, especially for the case when training data is limited. We validate our method on several datasets for fine-grained classification tasks (Flowers, Birds and Cars). In addition, we show that our saliency estimation method, which is trained without any saliency groundtruth data, obtains competitive results on real image saliency benchmark (Toronto), and outperforms deep saliency models with synthetic images (SID4VAM). |
||||
Address | Virtual; February 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | VISAPP | ||
Notes | LAMP | Approved | no | ||
Call Number | Admin @ si @ FRB2021c | Serial | 3540 | ||
Permanent link to this record | |||||
Author | Sudeep Katakol; Basem Elbarashy; Luis Herranz; Joost Van de Weijer; Antonio Lopez | ||||
Title | Distributed Learning and Inference with Compressed Images | Type | Journal Article | ||
Year | 2021 | Publication | IEEE Transactions on Image Processing | Abbreviated Journal | TIP |
Volume | 30 | Issue | Pages | 3069 - 3083 | |
Keywords | |||||
Abstract | Modern computer vision requires processing large amounts of data, both while training the model and/or during inference, once the model is deployed. Scenarios where images are captured and processed in physically separated locations are increasingly common (e.g. autonomous vehicles, cloud computing). In addition, many devices suffer from limited resources to store or transmit data (e.g. storage space, channel capacity). In these scenarios, lossy image compression plays a crucial role to effectively increase the number of images collected under such constraints. However, lossy compression entails some undesired degradation of the data that may harm the performance of the downstream analysis task at hand, since important semantic information may be lost in the process. Moreover, we may only have compressed images at training time but are able to use original images at inference time, or vice versa, and in such a case, the downstream model suffers from covariate shift. In this paper, we analyze this phenomenon, with a special focus on vision-based perception for autonomous driving as a paradigmatic scenario. We see that loss of semantic information and covariate shift do indeed exist, resulting in a drop in performance that depends on the compression rate. In order to address the problem, we propose dataset restoration, based on image restoration with generative adversarial networks (GANs). Our method is agnostic to both the particular image compression method and the downstream task; and has the advantage of not adding additional cost to the deployed models, which is particularly important in resource-limited devices. The presented experiments focus on semantic segmentation as a challenging use case, cover a broad range of compression rates and diverse datasets, and show how our method is able to significantly alleviate the negative effects of compression on the downstream visual task. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | LAMP; ADAS; 600.120; 600.118 | Approved | no | ||
Call Number | Admin @ si @ KEH2021 | Serial | 3543 | ||
Permanent link to this record | |||||
Author | Kai Wang; Luis Herranz; Joost Van de Weijer | ||||
Title | Continual learning in cross-modal retrieval | Type | Conference Article | ||
Year | 2021 | Publication | 2nd CLVISION workshop | Abbreviated Journal | |
Volume | Issue | Pages | 3628-3638 | ||
Keywords | |||||
Abstract | Multimodal representations and continual learning are two areas closely related to human intelligence. The former considers the learning of shared representation spaces where information from different modalities can be compared and integrated (we focus on cross-modal retrieval between language and visual representations). The latter studies how to prevent forgetting a previously learned task when learning a new one. While humans excel in these two aspects, deep neural networks are still quite limited. In this paper, we propose a combination of both problems into a continual cross-modal retrieval setting, where we study how the catastrophic interference caused by new tasks impacts the embedding spaces and their cross-modal alignment required for effective retrieval. We propose a general framework that decouples the training, indexing and querying stages. We also identify and study different factors that may lead to forgetting, and propose tools to alleviate it. We found that the indexing stage pays an important role and that simply avoiding reindexing the database with updated embedding networks can lead to significant gains. We evaluated our methods in two image-text retrieval datasets, obtaining significant gains with respect to the fine tuning baseline. | ||||
Address | Virtual; June 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVPRW | ||
Notes | LAMP; 600.120; 600.141; 600.147; 601.379 | Approved | no | ||
Call Number | Admin @ si @ WHW2021 | Serial | 3566 | ||
Permanent link to this record | |||||
Author | Vincenzo Lomonaco; Lorenzo Pellegrini; Andrea Cossu; Antonio Carta; Gabriele Graffieti; Tyler L. Hayes; Matthias De Lange; Marc Masana; Jary Pomponi; Gido van de Ven; Martin Mundt; Qi She; Keiland Cooper; Jeremy Forest; Eden Belouadah; Simone Calderara; German I. Parisi; Fabio Cuzzolin; Andreas Tolias; Simone Scardapane; Luca Antiga; Subutai Amhad; Adrian Popescu; Christopher Kanan; Joost Van de Weijer; Tinne Tuytelaars; Davide Bacciu; Davide Maltoni | ||||
Title | Avalanche: an End-to-End Library for Continual Learning | Type | Conference Article | ||
Year | 2021 | Publication | 34th IEEE Conference on Computer Vision and Pattern Recognition Workshops | Abbreviated Journal | |
Volume | Issue | Pages | 3595-3605 | ||
Keywords | |||||
Abstract | Learning continually from non-stationary data streams is a long-standing goal and a challenging problem in machine learning. Recently, we have witnessed a renewed and fast-growing interest in continual learning, especially within the deep learning community. However, algorithmic solutions are often difficult to re-implement, evaluate and port across different settings, where even results on standard benchmarks are hard to reproduce. In this work, we propose Avalanche, an open-source end-to-end library for continual learning research based on PyTorch. Avalanche is designed to provide a shared and collaborative codebase for fast prototyping, training, and reproducible evaluation of continual learning algorithms. | ||||
Address | Virtual; June 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVPRW | ||
Notes | LAMP; 600.120 | Approved | no | ||
Call Number | Admin @ si @ LPC2021 | Serial | 3567 | ||
Permanent link to this record | |||||
Author | Carola Figueroa Flores; David Berga; Joost Van de Weijer; Bogdan Raducanu | ||||
Title | Saliency for free: Saliency prediction as a side-effect of object recognition | Type | Journal Article | ||
Year | 2021 | Publication | Pattern Recognition Letters | Abbreviated Journal | PRL |
Volume | 150 | Issue | Pages | 1-7 | |
Keywords | Saliency maps; Unsupervised learning; Object recognition | ||||
Abstract | Saliency is the perceptual capacity of our visual system to focus our attention (i.e. gaze) on relevant objects instead of the background. So far, computational methods for saliency estimation required the explicit generation of a saliency map, process which is usually achieved via eyetracking experiments on still images. This is a tedious process that needs to be repeated for each new dataset. In the current paper, we demonstrate that is possible to automatically generate saliency maps without ground-truth. In our approach, saliency maps are learned as a side effect of object recognition. Extensive experiments carried out on both real and synthetic datasets demonstrated that our approach is able to generate accurate saliency maps, achieving competitive results when compared with supervised methods. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | LAMP; 600.147; 600.120 | Approved | no | ||
Call Number | Admin @ si @ FBW2021 | Serial | 3559 | ||
Permanent link to this record | |||||
Author | Idoia Ruiz; Lorenzo Porzi; Samuel Rota Bulo; Peter Kontschieder; Joan Serrat | ||||
Title | Weakly Supervised Multi-Object Tracking and Segmentation | Type | Conference Article | ||
Year | 2021 | Publication | IEEE Winter Conference on Applications of Computer Vision Workshops | Abbreviated Journal | |
Volume | Issue | Pages | 125-133 | ||
Keywords | |||||
Abstract | We introduce the problem of weakly supervised MultiObject Tracking and Segmentation, i.e. joint weakly supervised instance segmentation and multi-object tracking, in which we do not provide any kind of mask annotation.
To address it, we design a novel synergistic training strategy by taking advantage of multi-task learning, i.e. classification and tracking tasks guide the training of the unsupervised instance segmentation. For that purpose, we extract weak foreground localization information, provided by Grad-CAM heatmaps, to generate a partial ground truth to learn from. Additionally, RGB image level information is employed to refine the mask prediction at the edges of the objects. We evaluate our method on KITTI MOTS, the most representative benchmark for this task, reducing the performance gap on the MOTSP metric between the fully supervised and weakly supervised approach to just 12% and 12.7 % for cars and pedestrians, respectively. |
||||
Address | Virtual; January 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | WACVW | ||
Notes | ADAS; 600.118; 600.124 | Approved | no | ||
Call Number | Admin @ si @ RPR2021 | Serial | 3548 | ||
Permanent link to this record | |||||
Author | Henry Velesaca; Patricia Suarez; Raul Mira; Angel Sappa | ||||
Title | Computer Vision based Food Grain Classification: a Comprehensive Survey | Type | Journal Article | ||
Year | 2021 | Publication | Computers and Electronics in Agriculture | Abbreviated Journal | CEA |
Volume | 187 | Issue | Pages | 106287 | |
Keywords | |||||
Abstract | This manuscript presents a comprehensive survey on recent computer vision based food grain classification techniques. It includes state-of-the-art approaches intended for different grain varieties. The approaches proposed in the literature are analyzed according to the processing stages considered in the classification pipeline, making it easier to identify common techniques and comparisons. Additionally, the type of images considered by each approach (i.e., images from the: visible, infrared, multispectral, hyperspectral bands) together with the strategy used to generate ground truth data (i.e., real and synthetic images) are reviewed. Finally, conclusions highlighting future needs and challenges are presented. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MSIAU; 600.130; 600.122 | Approved | no | ||
Call Number | Admin @ si @ VSM2021 | Serial | 3576 | ||
Permanent link to this record | |||||
Author | Ozge Mercanoglu Sincan; Julio C. S. Jacques Junior; Sergio Escalera; Hacer Yalim Keles | ||||
Title | ChaLearn LAP Large Scale Signer Independent Isolated Sign Language Recognition Challenge: Design, Results and Future Research | Type | Conference Article | ||
Year | 2021 | Publication | Conference on Computer Vision and Pattern Recognition Workshops | Abbreviated Journal | |
Volume | Issue | Pages | 3467-3476 | ||
Keywords | |||||
Abstract | The performances of Sign Language Recognition (SLR) systems have improved considerably in recent years. However, several open challenges still need to be solved to allow SLR to be useful in practice. The research in the field is in its infancy in regards to the robustness of the models to a large diversity of signs and signers, and to fairness of the models to performers from different demographics. This work summarises the ChaLearn LAP Large Scale Signer Independent Isolated SLR Challenge, organised at CVPR 2021 with the goal of overcoming some of the aforementioned challenges. We analyse and discuss the challenge design, top winning solutions and suggestions for future research. The challenge attracted 132 participants in the RGB track and 59 in the RGB+Depth track, receiving more than 1.5K submissions in total. Participants were evaluated using a new large-scale multi-modal Turkish Sign Language (AUTSL) dataset, consisting of 226 sign labels and 36,302 isolated sign video samples performed by 43 different signers. Winning teams achieved more than 96% recognition rate, and their approaches benefited from pose/hand/face estimation, transfer learning, external data, fusion/ensemble of modalities and different strategies to model spatio-temporal information. However, methods still fail to distinguish among very similar signs, in particular those sharing similar hand trajectories. | ||||
Address | Virtual; June 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVPRW | ||
Notes | HuPBA; no proj | Approved | no | ||
Call Number | Admin @ si @ MJE2021 | Serial | 3560 | ||
Permanent link to this record | |||||
Author | Daniel Hernandez; Antonio Espinosa; David Vazquez; Antonio Lopez; Juan C. Moure | ||||
Title | 3D Perception With Slanted Stixels on GPU | Type | Journal Article | ||
Year | 2021 | Publication | IEEE Transactions on Parallel and Distributed Systems | Abbreviated Journal | TPDS |
Volume | 32 | Issue | 10 | Pages | 2434-2447 |
Keywords | Daniel Hernandez-Juarez; Antonio Espinosa; David Vazquez; Antonio M. Lopez; Juan C. Moure | ||||
Abstract | This article presents a GPU-accelerated software design of the recently proposed model of Slanted Stixels, which represents the geometric and semantic information of a scene in a compact and accurate way. We reformulate the measurement depth model to reduce the computational complexity of the algorithm, relying on the confidence of the depth estimation and the identification of invalid values to handle outliers. The proposed massively parallel scheme and data layout for the irregular computation pattern that corresponds to a Dynamic Programming paradigm is described and carefully analyzed in performance terms. Performance is shown to scale gracefully on current generation embedded GPUs. We assess the proposed methods in terms of semantic and geometric accuracy as well as run-time performance on three publicly available benchmark datasets. Our approach achieves real-time performance with high accuracy for 2048 × 1024 image sizes and 4 × 4 Stixel resolution on the low-power embedded GPU of an NVIDIA Tegra Xavier. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ADAS; 600.124; 600.118 | Approved | no | ||
Call Number | Admin @ si @ HEV2021 | Serial | 3561 | ||
Permanent link to this record | |||||
Author | Jose Luis Gomez; Gabriel Villalonga; Antonio Lopez | ||||
Title | Co-Training for Deep Object Detection: Comparing Single-Modal and Multi-Modal Approaches | Type | Journal Article | ||
Year | 2021 | Publication | Sensors | Abbreviated Journal | SENS |
Volume | 21 | Issue | 9 | Pages | 3185 |
Keywords | co-training; multi-modality; vision-based object detection; ADAS; self-driving | ||||
Abstract | Top-performing computer vision models are powered by convolutional neural networks (CNNs). Training an accurate CNN highly depends on both the raw sensor data and their associated ground truth (GT). Collecting such GT is usually done through human labeling, which is time-consuming and does not scale as we wish. This data-labeling bottleneck may be intensified due to domain shifts among image sensors, which could force per-sensor data labeling. In this paper, we focus on the use of co-training, a semi-supervised learning (SSL) method, for obtaining self-labeled object bounding boxes (BBs), i.e., the GT to train deep object detectors. In particular, we assess the goodness of multi-modal co-training by relying on two different views of an image, namely, appearance (RGB) and estimated depth (D). Moreover, we compare appearance-based single-modal co-training with multi-modal. Our results suggest that in a standard SSL setting (no domain shift, a few human-labeled data) and under virtual-to-real domain shift (many virtual-world labeled data, no human-labeled data) multi-modal co-training outperforms single-modal. In the latter case, by performing GAN-based domain translation both co-training modalities are on par, at least when using an off-the-shelf depth estimation model not specifically trained on the translated images. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ADAS; 600.118 | Approved | no | ||
Call Number | Admin @ si @ GVL2021 | Serial | 3562 | ||
Permanent link to this record | |||||
Author | Shiqi Yang; Kai Wang; Luis Herranz; Joost Van de Weijer | ||||
Title | On Implicit Attribute Localization for Generalized Zero-Shot Learning | Type | Journal Article | ||
Year | 2021 | Publication | IEEE Signal Processing Letters | Abbreviated Journal | |
Volume | 28 | Issue | Pages | 872 - 876 | |
Keywords | |||||
Abstract | Zero-shot learning (ZSL) aims to discriminate images from unseen classes by exploiting relations to seen classes via their attribute-based descriptions. Since attributes are often related to specific parts of objects, many recent works focus on discovering discriminative regions. However, these methods usually require additional complex part detection modules or attention mechanisms. In this paper, 1) we show that common ZSL backbones (without explicit attention nor part detection) can implicitly localize attributes, yet this property is not exploited. 2) Exploiting it, we then propose SELAR, a simple method that further encourages attribute localization, surprisingly achieving very competitive generalized ZSL (GZSL) performance when compared with more complex state-of-the-art methods. Our findings provide useful insight for designing future GZSL methods, and SELAR provides an easy to implement yet strong baseline. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | LAMP; 600.120 | Approved | no | ||
Call Number | YWH2021 | Serial | 3563 | ||
Permanent link to this record |