toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
   print
  Records Links
Author Mingyi Yang; Luis Herranz; Fei Yang; Luka Murn; Marc Gorriz Blanch; Shuai Wan; Fuzheng Yang; Marta Mrak edit  url
doi  openurl
  Title Semantic Preprocessor for Image Compression for Machines Type Conference Article
  Year 2023 Publication IEEE International Conference on Acoustics, Speech and Signal Processing Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Visual content is being increasingly transmitted and consumed by machines rather than humans to perform automated content analysis tasks. In this paper, we propose an image preprocessor that optimizes the input image for machine consumption prior to encoding by an off-the-shelf codec designed for human consumption. To achieve a better trade-off between the accuracy of the machine analysis task and bitrate, we propose leveraging pre-extracted semantic information to improve the preprocessor’s ability to accurately identify and filter out task-irrelevant information. Furthermore, we propose a two-part loss function to optimize the preprocessor, consisted of a rate-task performance loss and a semantic distillation loss, which helps the reconstructed image obtain more information that contributes to the accuracy of the task. Experiments show that the proposed preprocessor can save up to 48.83% bitrate compared with the method without the preprocessor, and save up to 36.24% bitrate compared to existing preprocessors for machine vision.  
  Address Rodhes Islands; Greece; June 2023  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICASSP  
  Notes (down) MACO; LAMP Approved no  
  Call Number Admin @ si @ YHY2023 Serial 3912  
Permanent link to this record
 

 
Author Jaykishan Patel; Alban Flachot; Javier Vazquez; David H. Brainard; Thomas S. A. Wallis; Marcus A. Brubaker; Richard F. Murray edit  url
openurl 
  Title A deep convolutional neural network trained to infer surface reflectance is deceived by mid-level lightness illusions Type Journal Article
  Year 2023 Publication Journal of Vision Abbreviated Journal JV  
  Volume 23 Issue 9 Pages 4817-4817  
  Keywords  
  Abstract A long-standing view is that lightness illusions are by-products of strategies employed by the visual system to stabilize its perceptual representation of surface reflectance against changes in illumination. Computationally, one such strategy is to infer reflectance from the retinal image, and to base the lightness percept on this inference. CNNs trained to infer reflectance from images have proven successful at solving this problem under limited conditions. To evaluate whether these CNNs provide suitable starting points for computational models of human lightness perception, we tested a state-of-the-art CNN on several lightness illusions, and compared its behaviour to prior measurements of human performance. We trained a CNN (Yu & Smith, 2019) to infer reflectance from luminance images. The network had a 30-layer hourglass architecture with skip connections. We trained the network via supervised learning on 100K images, rendered in Blender, each showing randomly placed geometric objects (surfaces, cubes, tori, etc.), with random Lambertian reflectance patterns (solid, Voronoi, or low-pass noise), under randomized point+ambient lighting. The renderer also provided the ground-truth reflectance images required for training. After training, we applied the network to several visual illusions. These included the argyle, Koffka-Adelson, snake, White’s, checkerboard assimilation, and simultaneous contrast illusions, along with their controls where appropriate. The CNN correctly predicted larger illusions in the argyle, Koffka-Adelson, and snake images than in their controls. It also correctly predicted an assimilation effect in White's illusion. It did not, however, account for the checkerboard assimilation or simultaneous contrast effects. These results are consistent with the view that at least some lightness phenomena are by-products of a rational approach to inferring stable representations of physical properties from intrinsically ambiguous retinal images. Furthermore, they suggest that CNN models may be a promising starting point for new models of human lightness perception.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes (down) MACO; CIC Approved no  
  Call Number Admin @ si @ PFV2023 Serial 3890  
Permanent link to this record
 

 
Author Marcos V Conde; Florin Vasluianu; Javier Vazquez; Radu Timofte edit   pdf
url  openurl
  Title Perceptual image enhancement for smartphone real-time applications Type Conference Article
  Year 2023 Publication Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision Abbreviated Journal  
  Volume Issue Pages 1848-1858  
  Keywords  
  Abstract Recent advances in camera designs and imaging pipelines allow us to capture high-quality images using smartphones. However, due to the small size and lens limitations of the smartphone cameras, we commonly find artifacts or degradation in the processed images. The most common unpleasant effects are noise artifacts, diffraction artifacts, blur, and HDR overexposure. Deep learning methods for image restoration can successfully remove these artifacts. However, most approaches are not suitable for real-time applications on mobile devices due to their heavy computation and memory requirements. In this paper, we propose LPIENet, a lightweight network for perceptual image enhancement, with the focus on deploying it on smartphones. Our experiments show that, with much fewer parameters and operations, our model can deal with the mentioned artifacts and achieve competitive performance compared with state-of-the-art methods on standard benchmarks. Moreover, to prove the efficiency and reliability of our approach, we deployed the model directly on commercial smartphones and evaluated its performance. Our model can process 2K resolution images under 1 second in mid-level commercial smartphones.  
  Address Waikoloa; Hawai; USA; January 2023  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WACV  
  Notes (down) MACO; CIC Approved no  
  Call Number Admin @ si @ CVV2023 Serial 3900  
Permanent link to this record
 

 
Author Yawei Li; Yulun Zhang; Radu Timofte; Luc Van Gool; Zhijun Tu; Kunpeng Du; Hailing Wang; Hanting Chen; Wei Li; Xiaofei Wang; Jie Hu; Yunhe Wang; Xiangyu Kong; Jinlong Wu; Dafeng Zhang; Jianxing Zhang; Shuai Liu; Furui Bai; Chaoyu Feng; Hao Wang; Yuqian Zhang; Guangqi Shao; Xiaotao Wang; Lei Lei; Rongjian Xu; Zhilu Zhang; Yunjin Chen; Dongwei Ren; Wangmeng Zuo; Qi Wu; Mingyan Han; Shen Cheng; Haipeng Li; Ting Jiang; Chengzhi Jiang; Xinpeng Li; Jinting Luo; Wenjie Lin; Lei Yu; Haoqiang Fan; Shuaicheng Liu; Aditya Arora; Syed Waqas Zamir; Javier Vazquez; Konstantinos G. Derpanis; Michael S. Brown; Hao Li; Zhihao Zhao; Jinshan Pan; Jiangxin Dong; Jinhui Tang; Bo Yang; Jingxiang Chen; Chenghua Li; Xi Zhang; Zhao Zhang; Jiahuan Ren; Zhicheng Ji; Kang Miao; Suiyi Zhao; Huan Zheng; YanYan Wei; Kangliang Liu; Xiangcheng Du; Sijie Liu; Yingbin Zheng; Xingjiao Wu; Cheng Jin; Rajeev Irny; Sriharsha Koundinya; Vighnesh Kamath; Gaurav Khandelwal; Sunder Ali Khowaja; Jiseok Yoon; Ik Hyun Lee; Shijie Chen; Chengqiang Zhao; Huabin Yang; Zhongjian Zhang; Junjia Huang; Yanru Zhang edit  url
doi  openurl
  Title NTIRE 2023 challenge on image denoising: Methods and results Type Conference Article
  Year 2023 Publication Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops Abbreviated Journal  
  Volume Issue Pages 1904-1920  
  Keywords  
  Abstract This paper reviews the NTIRE 2023 challenge on image denoising (σ = 50) with a focus on the proposed solutions and results. The aim is to obtain a network design capable to produce high-quality results with the best performance measured by PSNR for image denoising. Independent additive white Gaussian noise (AWGN) is assumed and the noise level is 50. The challenge had 225 registered participants, and 16 teams made valid submissions. They gauge the state-of-the-art for image denoising.  
  Address Vancouver; Canada; June 2023  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CVPRW  
  Notes (down) MACO; CIC Approved no  
  Call Number Admin @ si @ LZT2023 Serial 3910  
Permanent link to this record
 

 
Author Justine Giroux; Mohammad Reza Karimi Dastjerdi; Yannick Hold-Geoffroy; Javier Vazquez; Jean François Lalonde edit   pdf
url  openurl
  Title Towards a Perceptual Evaluation Framework for Lighting Estimation Type Conference Article
  Year 2024 Publication Arxiv Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract rogress in lighting estimation is tracked by computing existing image quality assessment (IQA) metrics on images from standard datasets. While this may appear to be a reasonable approach, we demonstrate that doing so does not correlate to human preference when the estimated lighting is used to relight a virtual scene into a real photograph. To study this, we design a controlled psychophysical experiment where human observers must choose their preference amongst rendered scenes lit using a set of lighting estimation algorithms selected from the recent literature, and use it to analyse how these algorithms perform according to human perception. Then, we demonstrate that none of the most popular IQA metrics from the literature, taken individually, correctly represent human perception. Finally, we show that by learning a combination of existing IQA metrics, we can more accurately represent human preference. This provides a new perceptual framework to help evaluate future lighting estimation algorithms.  
  Address Seattle; USA; June 2024  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CVPR  
  Notes (down) MACO; CIC Approved no  
  Call Number Admin @ si @ GDH2024 Serial 3999  
Permanent link to this record
 

 
Author Trevor Canham; Javier Vazquez; D Long; Richard F. Murray; Michael S Brown edit   pdf
openurl 
  Title Noise Prism: A Novel Multispectral Visualization Technique Type Journal Article
  Year 2021 Publication 31st Color and Imaging Conference Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract A novel technique for visualizing multispectral images is proposed. Inspired by how prisms work, our method spreads spectral information over a chromatic noise pattern. This is accomplished by populating the pattern with pixels representing each measurement band at a count proportional to its measured intensity. The method is advantageous because it allows for lightweight encoding and visualization of spectral information
while maintaining the color appearance of the stimulus. A four alternative forced choice (4AFC) experiment was conducted to validate the method’s information-carrying capacity in displaying metameric stimuli of varying colors and spectral basis functions. The scores ranged from 100% to 20% (less than chance given the 4AFC task), with many conditions falling somewhere in between at statistically significant intervals. Using this data, color and texture difference metrics can be evaluated and optimized to predict the legibility of the visualization technique.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CIC  
  Notes (down) MACO; CIC Approved no  
  Call Number Admin @ si @ CVL2021 Serial 4000  
Permanent link to this record
 

 
Author Zhaocheng Liu; Luis Herranz; Fei Yang; Saiping Zhang; Shuai Wan; Marta Mrak; Marc Gorriz edit   pdf
url  doi
openurl 
  Title Slimmable Video Codec Type Conference Article
  Year 2022 Publication CVPR 2022 Workshop and Challenge on Learned Image Compression (CLIC 2022, 5th Edition) Abbreviated Journal  
  Volume Issue Pages 1742-1746  
  Keywords  
  Abstract Neural video compression has emerged as a novel paradigm combining trainable multilayer neural net-works and machine learning, achieving competitive rate-distortion (RD) performances, but still remaining impractical due to heavy neural architectures, with large memory and computational demands. In addition, models are usually optimized for a single RD tradeoff. Recent slimmable image codecs can dynamically adjust their model capacity to gracefully reduce the memory and computation requirements, without harming RD performance. In this paper we propose a slimmable video codec (SlimVC), by integrating a slimmable temporal entropy model in a slimmable autoencoder. Despite a significantly more complex architecture, we show that slimming remains a powerful mechanism to control rate, memory footprint, computational cost and latency, all being important requirements for practical video compression.  
  Address Virtual; 19 June 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CVPRW  
  Notes (down) MACO; 601.379; 601.161 Approved no  
  Call Number Admin @ si @ LHY2022 Serial 3687  
Permanent link to this record
 

 
Author Danna Xue; Fei Yang; Pei Wang; Luis Herranz; Jinqiu Sun; Yu Zhu; Yanning Zhang edit   pdf
doi  isbn
openurl 
  Title SlimSeg: Slimmable Semantic Segmentation with Boundary Supervision Type Conference Article
  Year 2022 Publication 30th ACM International Conference on Multimedia Abbreviated Journal  
  Volume Issue Pages 6539-6548  
  Keywords  
  Abstract Accurate semantic segmentation models typically require significant computational resources, inhibiting their use in practical applications. Recent works rely on well-crafted lightweight models to achieve fast inference. However, these models cannot flexibly adapt to varying accuracy and efficiency requirements. In this paper, we propose a simple but effective slimmable semantic segmentation (SlimSeg) method, which can be executed at different capacities during inference depending on the desired accuracy-efficiency tradeoff. More specifically, we employ parametrized channel slimming by stepwise downward knowledge distillation during training. Motivated by the observation that the differences between segmentation results of each submodel are mainly near the semantic borders, we introduce an additional boundary guided semantic segmentation loss to further improve the performance of each submodel. We show that our proposed SlimSeg with various mainstream networks can produce flexible models that provide dynamic adjustment of computational cost and better performance than independent models. Extensive experiments on semantic segmentation benchmarks, Cityscapes and CamVid, demonstrate the generalization ability of our framework.  
  Address Lisboa, Portugal, October 2022  
  Corporate Author Thesis  
  Publisher Association for Computing Machinery Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-1-4503-9203-7 Medium  
  Area Expedition Conference MM  
  Notes (down) MACO; 600.161; 601.400 Approved no  
  Call Number Admin @ si @ XYW2022 Serial 3758  
Permanent link to this record
 

 
Author Saiping Zhang; Luis Herranz; Marta Mrak; Marc Gorriz Blanch; Shuai Wan; Fuzheng Yang edit   pdf
url  doi
openurl 
  Title DCNGAN: A Deformable Convolution-Based GAN with QP Adaptation for Perceptual Quality Enhancement of Compressed Video Type Conference Article
  Year 2022 Publication 47th International Conference on Acoustics, Speech, and Signal Processing Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract In this paper, we propose a deformable convolution-based generative adversarial network (DCNGAN) for perceptual quality enhancement of compressed videos. DCNGAN is also adaptive to the quantization parameters (QPs). Compared with optical flows, deformable convolutions are more effective and efficient to align frames. Deformable convolutions can operate on multiple frames, thus leveraging more temporal information, which is beneficial for enhancing the perceptual quality of compressed videos. Instead of aligning frames in a pairwise manner, the deformable convolution can process multiple frames simultaneously, which leads to lower computational complexity. Experimental results demonstrate that the proposed DCNGAN outperforms other state-of-the-art compressed video quality enhancement algorithms.  
  Address Virtual; May 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICASSP  
  Notes (down) MACO; 600.161; 601.379 Approved no  
  Call Number Admin @ si @ ZHM2022a Serial 3765  
Permanent link to this record
 

 
Author Chengyi Zou; Shuai Wan; Marta Mrak; Marc Gorriz Blanch; Luis Herranz; Tiannan Ji edit  url
doi  openurl
  Title Towards Lightweight Neural Network-based Chroma Intra Prediction for Video Coding Type Conference Article
  Year 2022 Publication 29th IEEE International Conference on Image Processing Abbreviated Journal  
  Volume Issue Pages  
  Keywords Video coding; Quantization (signal); Computational modeling; Neural networks; Predictive models; Video compression; Syntactics  
  Abstract In video compression the luma channel can be useful for predicting chroma channels (Cb, Cr), as has been demonstrated with the Cross-Component Linear Model (CCLM) used in Versatile Video Coding (VVC) standard. More recently, it has been shown that neural networks can even better capture the relationship among different channels. In this paper, a new attention-based neural network is proposed for cross-component intra prediction. With the goal to simplify neural network design, the new framework consists of four branches: boundary branch and luma branch for extracting features from reference samples, attention branch for fusing the first two branches, and prediction branch for computing the predicted chroma samples. The proposed scheme is integrated into VVC test model together with one additional binary block-level syntax flag which indicates whether a given block makes use of the proposed method. Experimental results demonstrate 0.31%/2.36%/2.00% BD-rate reductions on Y/Cb/Cr components, respectively, on top of the VVC Test Model (VTM) 7.0 which uses CCLM.  
  Address Bordeaux; France; October 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICIP  
  Notes (down) MACO Approved no  
  Call Number Admin @ si @ ZWM2022 Serial 3790  
Permanent link to this record
 

 
Author Javier Vazquez; Graham D. Finlayson; Luis Herranz edit  url
openurl 
  Title Improving the perception of low-light enhanced images Type Journal Article
  Year 2024 Publication Optics Express Abbreviated Journal  
  Volume 32 Issue 4 Pages 5174-5190  
  Keywords  
  Abstract Improving images captured under low-light conditions has become an important topic in computational color imaging, as it has a wide range of applications. Most current methods are either based on handcrafted features or on end-to-end training of deep neural networks that mostly focus on minimizing some distortion metric —such as PSNR or SSIM— on a set of training images. However, the minimization of distortion metrics does not mean that the results are optimal in terms of perception (i.e. perceptual quality). As an example, the perception-distortion trade-off states that, close to the optimal results, improving distortion results in worsening perception. This means that current low-light image enhancement methods —that focus on distortion minimization— cannot be optimal in the sense of obtaining a good image in terms of perception errors. In this paper, we propose a post-processing approach in which, given the original low-light image and the result of a specific method, we are able to obtain a result that resembles as much as possible that of the original method, but, at the same time, giving an improvement in the perception of the final image. More in detail, our method follows the hypothesis that in order to minimally modify the perception of an input image, any modification should be a combination of a local change in the shading across a scene and a global change in illumination color. We demonstrate the ability of our method quantitatively using perceptual blind image metrics such as BRISQUE, NIQE, or UNIQUE, and through user preference tests.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes (down) MACO Approved no  
  Call Number Admin @ si @ VFH2024 Serial 4018  
Permanent link to this record
 

 
Author Pedro Martins; Paulo Carvalho; Carlo Gatta edit   pdf
doi  openurl
  Title On the completeness of feature-driven maximally stable extremal regions Type Journal Article
  Year 2016 Publication Pattern Recognition Letters Abbreviated Journal PRL  
  Volume 74 Issue Pages 9-16  
  Keywords Local features; Completeness; Maximally Stable Extremal Regions  
  Abstract By definition, local image features provide a compact representation of the image in which most of the image information is preserved. This capability offered by local features has been overlooked, despite being relevant in many application scenarios. In this paper, we analyze and discuss the performance of feature-driven Maximally Stable Extremal Regions (MSER) in terms of the coverage of informative image parts (completeness). This type of features results from an MSER extraction on saliency maps in which features related to objects boundaries or even symmetry axes are highlighted. These maps are intended to be suitable domains for MSER detection, allowing this detector to provide a better coverage of informative image parts. Our experimental results, which were based on a large-scale evaluation, show that feature-driven MSER have relatively high completeness values and provide more complete sets than a traditional MSER detection even when sets of similar cardinality are considered.  
  Address  
  Corporate Author Thesis  
  Publisher Elsevier B.V. Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 0167-8655 ISBN Medium  
  Area Expedition Conference  
  Notes (down) LAMP;MILAB; Approved no  
  Call Number Admin @ si @ MCG2016 Serial 2748  
Permanent link to this record
 

 
Author Akshita Gupta; Sanath Narayan; Salman Khan; Fahad Shahbaz Khan; Ling Shao; Joost Van de Weijer edit  doi
openurl 
  Title Generative Multi-Label Zero-Shot Learning Type Journal Article
  Year 2023 Publication IEEE Transactions on Pattern Analysis and Machine Intelligence Abbreviated Journal TPAMI  
  Volume 45 Issue 12 Pages 14611-14624  
  Keywords Generalized zero-shot learning; Multi-label classification; Zero-shot object detection; Feature synthesis  
  Abstract Multi-label zero-shot learning strives to classify images into multiple unseen categories for which no data is available during training. The test samples can additionally contain seen categories in the generalized variant. Existing approaches rely on learning either shared or label-specific attention from the seen classes. Nevertheless, computing reliable attention maps for unseen classes during inference in a multi-label setting is still a challenge. In contrast, state-of-the-art single-label generative adversarial network (GAN) based approaches learn to directly synthesize the class-specific visual features from the corresponding class attribute embeddings. However, synthesizing multi-label features from GANs is still unexplored in the context of zero-shot setting. When multiple objects occur jointly in a single image, a critical question is how to effectively fuse multi-class information. In this work, we introduce different fusion approaches at the attribute-level, feature-level and cross-level (across attribute and feature-levels) for synthesizing multi-label features from their corresponding multi-label class embeddings. To the best of our knowledge, our work is the first to tackle the problem of multi-label feature synthesis in the (generalized) zero-shot setting. Our cross-level fusion-based generative approach outperforms the state-of-the-art on three zero-shot benchmarks: NUS-WIDE, Open Images and MS COCO. Furthermore, we show the generalization capabilities of our fusion approach in the zero-shot detection task on MS COCO, achieving favorable performance against existing methods.  
  Address December 2023  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes (down) LAMP; PID2021-128178OB-I00 Approved no  
  Call Number Admin @ si @ Serial 3853  
Permanent link to this record
 

 
Author Shiqi Yang; Yaxing Wang; Kai Wang; Shangling Jui; Joost Van de Weijer edit  openurl
  Title One Ring to Bring Them All: Towards Open-Set Recognition under Domain Shift Type Miscellaneous
  Year 2022 Publication Arxiv Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract In this paper, we investigate model adaptation under domain and category shift, where the final goal is to achieve
(SF-UNDA), which addresses the situation where there exist both domain and category shifts between source and target domains. Under the SF-UNDA setting, the model cannot access source data anymore during target adaptation, which aims to address data privacy concerns. We propose a novel training scheme to learn a (
+1)-way classifier to predict the
source classes and the unknown class, where samples of only known source categories are available for training. Furthermore, for target adaptation, we simply adopt a weighted entropy minimization to adapt the source pretrained model to the unlabeled target domain without source data. In experiments, we show:
After source training, the resulting source model can get excellent performance for
;
After target adaptation, our method surpasses current UNDA approaches which demand source data during adaptation. The versatility to several different tasks strongly proves the efficacy and generalization ability of our method.
When augmented with a closed-set domain adaptation approach during target adaptation, our source-free method further outperforms the current state-of-the-art UNDA method by 2.5%, 7.2% and 13% on Office-31, Office-Home and VisDA respectively.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes (down) LAMP; no proj Approved no  
  Call Number Admin @ si @ YWW2022c Serial 3818  
Permanent link to this record
 

 
Author Marco Cotogni; Fei Yang; Claudio Cusano; Andrew Bagdanov; Joost Van de Weijer edit   pdf
openurl 
  Title Gated Class-Attention with Cascaded Feature Drift Compensation for Exemplar-free Continual Learning of Vision Transformers Type Miscellaneous
  Year 2022 Publication Arxiv Abbreviated Journal  
  Volume Issue Pages  
  Keywords Marco Cotogni, Fei Yang, Claudio Cusano, Andrew D. Bagdanov, Joost van de Weijer  
  Abstract We propose a new method for exemplar-free class incremental training of ViTs. The main challenge of exemplar-free continual learning is maintaining plasticity of the learner without causing catastrophic forgetting of previously learned tasks. This is often achieved via exemplar replay which can help recalibrate previous task classifiers to the feature drift which occurs when learning new tasks. Exemplar replay, however, comes at the cost of retaining samples from previous tasks which for many applications may not be possible. To address the problem of continual ViT training, we first propose gated class-attention to minimize the drift in the final ViT transformer block. This mask-based gating is applied to class-attention mechanism of the last transformer block and strongly regulates the weights crucial for previous tasks. Importantly, gated class-attention does not require the task-ID during inference, which distinguishes it from other parameter isolation methods. Secondly, we propose a new method of feature drift compensation that accommodates feature drift in the backbone when learning new tasks. The combination of gated class-attention and cascaded feature drift compensation allows for plasticity towards new tasks while limiting forgetting of previous ones. Extensive experiments performed on CIFAR-100, Tiny-ImageNet and ImageNet100 demonstrate that our exemplar-free method obtains competitive results when compared to rehearsal based ViT methods.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes (down) LAMP; no proj Approved no  
  Call Number Admin @ si @ CYC2022 Serial 3827  
Permanent link to this record
Select All    Deselect All
 |   | 
Details
   print

Save Citations:
Export Records: