toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
  Records Links
Author Xiangyang Li; Luis Herranz; Shuqiang Jiang edit   pdf
url  openurl
  Title Multifaceted Analysis of Fine-Tuning in Deep Model for Visual Recognition Type Journal
  Year 2020 Publication ACM Transactions on Data Science Abbreviated Journal ACM  
  Volume Issue (up) Pages  
  Keywords  
  Abstract In recent years, convolutional neural networks (CNNs) have achieved impressive performance for various visual recognition scenarios. CNNs trained on large labeled datasets can not only obtain significant performance on most challenging benchmarks but also provide powerful representations, which can be used to a wide range of other tasks. However, the requirement of massive amounts of data to train deep neural networks is a major drawback of these models, as the data available is usually limited or imbalanced. Fine-tuning (FT) is an effective way to transfer knowledge learned in a source dataset to a target task. In this paper, we introduce and systematically investigate several factors that influence the performance of fine-tuning for visual recognition. These factors include parameters for the retraining procedure (e.g., the initial learning rate of fine-tuning), the distribution of the source and target data (e.g., the number of categories in the source dataset, the distance between the source and target datasets) and so on. We quantitatively and qualitatively analyze these factors, evaluate their influence, and present many empirical observations. The results reveal insights into what fine-tuning changes CNN parameters and provide useful and evidence-backed intuitions about how to implement fine-tuning for computer vision tasks.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; 600.141; 600.120 Approved no  
  Call Number Admin @ si @ LHJ2020 Serial 3423  
Permanent link to this record
 

 
Author Yaxing Wang; Luis Herranz; Joost Van de Weijer edit   pdf
url  doi
openurl 
  Title Mix and match networks: multi-domain alignment for unpaired image-to-image translation Type Journal Article
  Year 2020 Publication International Journal of Computer Vision Abbreviated Journal IJCV  
  Volume 128 Issue (up) Pages 2849–2872  
  Keywords  
  Abstract This paper addresses the problem of inferring unseen cross-modal image-to-image translations between multiple modalities. We assume that only some of the pairwise translations have been seen (i.e. trained) and infer the remaining unseen translations (where training pairs are not available). We propose mix and match networks, an approach where multiple encoders and decoders are aligned in such a way that the desired translation can be obtained by simply cascading the source encoder and the target decoder, even when they have not interacted during the training stage (i.e. unseen). The main challenge lies in the alignment of the latent representations at the bottlenecks of encoder-decoder pairs. We propose an architecture with several tools to encourage alignment, including autoencoders and robust side information and latent consistency losses. We show the benefits of our approach in terms of effectiveness and scalability compared with other pairwise image-to-image translation approaches. We also propose zero-pair cross-modal image translation, a challenging setting where the objective is inferring semantic segmentation from depth (and vice-versa) without explicit segmentation-depth pairs, and only from two (disjoint) segmentation-RGB and depth-RGB training sets. We observe that a certain part of the shared information between unseen modalities might not be reachable, so we further propose a variant that leverages pseudo-pairs which allows us to exploit this shared information between the unseen modalities  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; 600.109; 600.106; 600.141; 600.120 Approved no  
  Call Number Admin @ si @ WHW2020 Serial 3424  
Permanent link to this record
 

 
Author Yaxing Wang; Abel Gonzalez-Garcia; Luis Herranz; Joost Van de Weijer edit   pdf
url  openurl
  Title Controlling biases and diversity in diverse image-to-image translation Type Journal Article
  Year 2021 Publication Computer Vision and Image Understanding Abbreviated Journal CVIU  
  Volume 202 Issue (up) Pages 103082  
  Keywords  
  Abstract JCR 2019 Q2, IF=3.121
The task of unpaired image-to-image translation is highly challenging due to the lack of explicit cross-domain pairs of instances. We consider here diverse image translation (DIT), an even more challenging setting in which an image can have multiple plausible translations. This is normally achieved by explicitly disentangling content and style in the latent representation and sampling different styles codes while maintaining the image content. Despite the success of current DIT models, they are prone to suffer from bias. In this paper, we study the problem of bias in image-to-image translation. Biased datasets may add undesired changes (e.g. change gender or race in face images) to the output translations as a consequence of the particular underlying visual distribution in the target domain. In order to alleviate the effects of this problem we propose the use of semantic constraints that enforce the preservation of desired image properties. Our proposed model is a step towards unbiased diverse image-to-image translation (UDIT), and results in less unwanted changes in the translated images while still performing the wanted transformation. Experiments on several heavily biased datasets show the effectiveness of the proposed techniques in different domains such as faces, objects, and scenes.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; 600.141; 600.109; 600.147 Approved no  
  Call Number Admin @ si @ WGH2021 Serial 3464  
Permanent link to this record
 

 
Author Marc Masana; Xialei Liu; Bartlomiej Twardowski; Mikel Menta; Andrew Bagdanov; Joost Van de Weijer edit   pdf
doi  openurl
  Title Class-incremental learning: survey and performance evaluation Type Journal Article
  Year 2022 Publication IEEE Transactions on Pattern Analysis and Machine Intelligence Abbreviated Journal TPAMI  
  Volume Issue (up) Pages  
  Keywords  
  Abstract For future learning systems incremental learning is desirable, because it allows for: efficient resource usage by eliminating the need to retrain from scratch at the arrival of new data; reduced memory usage by preventing or limiting the amount of data required to be stored -- also important when privacy limitations are imposed; and learning that more closely resembles human learning. The main challenge for incremental learning is catastrophic forgetting, which refers to the precipitous drop in performance on previously learned tasks after learning a new one. Incremental learning of deep neural networks has seen explosive growth in recent years. Initial work focused on task incremental learning, where a task-ID is provided at inference time. Recently we have seen a shift towards class-incremental learning where the learner must classify at inference time between all classes seen in previous tasks without recourse to a task-ID. In this paper, we provide a complete survey of existing methods for incremental learning, and in particular we perform an extensive experimental evaluation on twelve class-incremental methods. We consider several new experimental scenarios, including a comparison of class-incremental methods on multiple large-scale datasets, investigation into small and large domain shifts, and comparison on various network architectures.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; 600.120 Approved no  
  Call Number Admin @ si @ MLT2022 Serial 3538  
Permanent link to this record
 

 
Author Sudeep Katakol; Basem Elbarashy; Luis Herranz; Joost Van de Weijer; Antonio Lopez edit   pdf
url  doi
openurl 
  Title Distributed Learning and Inference with Compressed Images Type Journal Article
  Year 2021 Publication IEEE Transactions on Image Processing Abbreviated Journal TIP  
  Volume 30 Issue (up) Pages 3069 - 3083  
  Keywords  
  Abstract Modern computer vision requires processing large amounts of data, both while training the model and/or during inference, once the model is deployed. Scenarios where images are captured and processed in physically separated locations are increasingly common (e.g. autonomous vehicles, cloud computing). In addition, many devices suffer from limited resources to store or transmit data (e.g. storage space, channel capacity). In these scenarios, lossy image compression plays a crucial role to effectively increase the number of images collected under such constraints. However, lossy compression entails some undesired degradation of the data that may harm the performance of the downstream analysis task at hand, since important semantic information may be lost in the process. Moreover, we may only have compressed images at training time but are able to use original images at inference time, or vice versa, and in such a case, the downstream model suffers from covariate shift. In this paper, we analyze this phenomenon, with a special focus on vision-based perception for autonomous driving as a paradigmatic scenario. We see that loss of semantic information and covariate shift do indeed exist, resulting in a drop in performance that depends on the compression rate. In order to address the problem, we propose dataset restoration, based on image restoration with generative adversarial networks (GANs). Our method is agnostic to both the particular image compression method and the downstream task; and has the advantage of not adding additional cost to the deployed models, which is particularly important in resource-limited devices. The presented experiments focus on semantic segmentation as a challenging use case, cover a broad range of compression rates and diverse datasets, and show how our method is able to significantly alleviate the negative effects of compression on the downstream visual task.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; ADAS; 600.120; 600.118 Approved no  
  Call Number Admin @ si @ KEH2021 Serial 3543  
Permanent link to this record
Select All    Deselect All
 |   | 
Details

Save Citations:
Export Records: