|   | 
Details
   web
Records
Author Ozge Mercanoglu Sincan; Julio C. S. Jacques Junior; Sergio Escalera; Hacer Yalim Keles
Title ChaLearn LAP Large Scale Signer Independent Isolated Sign Language Recognition Challenge: Design, Results and Future Research Type Conference Article
Year 2021 Publication Conference on Computer Vision and Pattern Recognition Workshops Abbreviated Journal
Volume Issue Pages 3467-3476
Keywords
Abstract The performances of Sign Language Recognition (SLR) systems have improved considerably in recent years. However, several open challenges still need to be solved to allow SLR to be useful in practice. The research in the field is in its infancy in regards to the robustness of the models to a large diversity of signs and signers, and to fairness of the models to performers from different demographics. This work summarises the ChaLearn LAP Large Scale Signer Independent Isolated SLR Challenge, organised at CVPR 2021 with the goal of overcoming some of the aforementioned challenges. We analyse and discuss the challenge design, top winning solutions and suggestions for future research. The challenge attracted 132 participants in the RGB track and 59 in the RGB+Depth track, receiving more than 1.5K submissions in total. Participants were evaluated using a new large-scale multi-modal Turkish Sign Language (AUTSL) dataset, consisting of 226 sign labels and 36,302 isolated sign video samples performed by 43 different signers. Winning teams achieved more than 96% recognition rate, and their approaches benefited from pose/hand/face estimation, transfer learning, external data, fusion/ensemble of modalities and different strategies to model spatio-temporal information. However, methods still fail to distinguish among very similar signs, in particular those sharing similar hand trajectories.
Address Virtual; June 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) CVPRW
Notes HuPBA; no proj Approved no
Call Number Admin @ si @ MJE2021 Serial 3560
Permanent link to this record
 

 
Author Marc Masana; Tinne Tuytelaars; Joost Van de Weijer
Title Ternary Feature Masks: zero-forgetting for task-incremental learning Type Conference Article
Year 2021 Publication 34th IEEE Conference on Computer Vision and Pattern Recognition Workshops Abbreviated Journal
Volume Issue Pages 3565-3574
Keywords
Abstract We propose an approach without any forgetting to continual learning for the task-aware regime, where at inference the task-label is known. By using ternary masks we can upgrade a model to new tasks, reusing knowledge from previous tasks while not forgetting anything about them. Using masks prevents both catastrophic forgetting and backward transfer. We argue -- and show experimentally -- that avoiding the former largely compensates for the lack of the latter, which is rarely observed in practice. In contrast to earlier works, our masks are applied to the features (activations) of each layer instead of the weights. This considerably reduces the number of mask parameters for each new task; with more than three orders of magnitude for most networks. The encoding of the ternary masks into two bits per feature creates very little overhead to the network, avoiding scalability issues. To allow already learned features to adapt to the current task without changing the behavior of these features for previous tasks, we introduce task-specific feature normalization. Extensive experiments on several finegrained datasets and ImageNet show that our method outperforms current state-of-the-art while reducing memory overhead in comparison to weight-based approaches.
Address Virtual; June 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) CVPRW
Notes LAMP; 600.120 Approved no
Call Number Admin @ si @ MTW2021 Serial 3565
Permanent link to this record
 

 
Author Sudeep Katakol; Luis Herranz; Fei Yang; Marta Mrak
Title DANICE: Domain adaptation without forgetting in neural image compression Type Conference Article
Year 2021 Publication Conference on Computer Vision and Pattern Recognition Workshops Abbreviated Journal
Volume Issue Pages 1921-1925
Keywords
Abstract Neural image compression (NIC) is a new coding paradigm where coding capabilities are captured by deep models learned from data. This data-driven nature enables new potential functionalities. In this paper, we study the adaptability of codecs to custom domains of interest. We show that NIC codecs are transferable and that they can be adapted with relatively few target domain images. However, naive adaptation interferes with the solution optimized for the original source domain, resulting in forgetting the original coding capabilities in that domain, and may even break the compatibility with previously encoded bitstreams. Addressing these problems, we propose Codec Adaptation without Forgetting (CAwF), a framework that can avoid these problems by adding a small amount of custom parameters, where the source codec remains embedded and unchanged during the adaptation process. Experiments demonstrate its effectiveness and provide useful insights on the characteristics of catastrophic interference in NIC.
Address Virtual; June 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) CVPRW
Notes LAMP; 600.120; 600.141; 601.379 Approved no
Call Number Admin @ si @ KHY2021 Serial 3568
Permanent link to this record
 

 
Author Rafael E. Rivadeneira; Angel Sappa; Boris X. Vintimilla; Sabari Nathan; Priya Kansal; Armin Mehri; Parichehr Behjati Ardakani; A.Dalal; A.Akula; D.Sharma; S.Pandey; B.Kumar; J.Yao; R.Wu; KFeng; N.Li; Y.Zhao; H.Patel; V. Chudasama; K.Pjajapati; A.Sarvaiya; K.Upla; K.Raja; R.Ramachandra; C.Bush; F.Almasri; T.Vandamme; O.Debeir; N.Gutierrez; Q.Nguyen; W.Beksi
Title Thermal Image Super-Resolution Challenge – PBVS 2021 Type Conference Article
Year 2021 Publication Conference on Computer Vision and Pattern Recognition Workshops Abbreviated Journal
Volume Issue Pages 4359-4367
Keywords
Abstract This paper presents results from the second Thermal Image Super-Resolution (TISR) challenge organized in the framework of the Perception Beyond the Visible Spectrum (PBVS) 2021 workshop. For this second edition, the same thermal image dataset considered during the first challenge has been used; only mid-resolution (MR) and high-resolution (HR) sets have been considered. The dataset consists of 951 training images and 50 testing images for each resolution. A set of 20 images for each resolution is kept aside for evaluation. The two evaluation methodologies proposed for the first challenge are also considered in this opportunity. The first evaluation task consists of measuring the PSNR and SSIM between the obtained SR image and the corresponding ground truth (i.e., the HR thermal image downsampled by four). The second evaluation also consists of measuring the PSNR and SSIM, but in this case, considers the x2 SR obtained from the given MR thermal image; this evaluation is performed between the SR image with respect to the semi-registered HR image, which has been acquired with another camera. The results outperformed those from the first challenge, thus showing an improvement in both evaluation metrics.
Address Virtual; June 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) CVPRW
Notes MSIAU; 600.130; 600.122 Approved no
Call Number Admin @ si @ RSV2021 Serial 3581
Permanent link to this record
 

 
Author Razieh Rastgoo; Kourosh Kiani; Sergio Escalera; Mohammad Sabokrou
Title Sign Language Production: A Review Type Conference Article
Year 2021 Publication Conference on Computer Vision and Pattern Recognition Workshops Abbreviated Journal
Volume Issue Pages 3472-3481
Keywords
Abstract Sign Language is the dominant yet non-primary form of communication language used in the deaf and hearing-impaired community. To make an easy and mutual communication between the hearing-impaired and the hearing communities, building a robust system capable of translating the spoken language into sign language and vice versa is fundamental. To this end, sign language recognition and production are two necessary parts for making such a two-way system. Sign language recognition and production need to cope with some critical challenges. In this survey, we review recent advances in Sign Language Production (SLP) and related areas using deep learning. This survey aims to briefly summarize recent achievements in SLP, discussing their advantages, limitations, and future directions of research.
Address Virtual; June 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) CVPRW
Notes HUPBA; no proj Approved no
Call Number Admin @ si @ RKE2021b Serial 3603
Permanent link to this record
 

 
Author Kai Wang; Xialei Liu; Andrew Bagdanov; Luis Herranz; Shangling Jui; Joost Van de Weijer
Title Incremental Meta-Learning via Episodic Replay Distillation for Few-Shot Image Recognition Type Conference Article
Year 2022 Publication CVPR 2022 Workshop on Continual Learning (CLVision, 3rd Edition) Abbreviated Journal
Volume Issue Pages 3728-3738
Keywords Training; Computer vision; Image recognition; Upper bound; Conferences; Pattern recognition; Task analysis
Abstract In this paper we consider the problem of incremental meta-learning in which classes are presented incrementally in discrete tasks. We propose Episodic Replay Distillation (ERD), that mixes classes from the current task with exemplars from previous tasks when sampling episodes for meta-learning. To allow the training to benefit from a large as possible variety of classes, which leads to more gener-
alizable feature representations, we propose the cross-task meta loss. Furthermore, we propose episodic replay distillation that also exploits exemplars for improved knowledge distillation. Experiments on four datasets demonstrate that ERD surpasses the state-of-the-art. In particular, on the more challenging one-shot, long task sequence scenarios, we reduce the gap between Incremental Meta-Learning and
the joint-training upper bound from 3.5% / 10.1% / 13.4% / 11.7% with the current state-of-the-art to 2.6% / 2.9% / 5.0% / 0.2% with our method on Tiered-ImageNet / Mini-ImageNet / CIFAR100 / CUB, respectively.
Address New Orleans, USA; 20 June 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) CVPRW
Notes LAMP; 600.147 Approved no
Call Number Admin @ si @ WLB2022 Serial 3686
Permanent link to this record
 

 
Author Zhaocheng Liu; Luis Herranz; Fei Yang; Saiping Zhang; Shuai Wan; Marta Mrak; Marc Gorriz
Title Slimmable Video Codec Type Conference Article
Year 2022 Publication CVPR 2022 Workshop and Challenge on Learned Image Compression (CLIC 2022, 5th Edition) Abbreviated Journal
Volume Issue Pages 1742-1746
Keywords
Abstract Neural video compression has emerged as a novel paradigm combining trainable multilayer neural net-works and machine learning, achieving competitive rate-distortion (RD) performances, but still remaining impractical due to heavy neural architectures, with large memory and computational demands. In addition, models are usually optimized for a single RD tradeoff. Recent slimmable image codecs can dynamically adjust their model capacity to gracefully reduce the memory and computation requirements, without harming RD performance. In this paper we propose a slimmable video codec (SlimVC), by integrating a slimmable temporal entropy model in a slimmable autoencoder. Despite a significantly more complex architecture, we show that slimming remains a powerful mechanism to control rate, memory footprint, computational cost and latency, all being important requirements for practical video compression.
Address Virtual; 19 June 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) CVPRW
Notes MACO; 601.379; 601.161 Approved no
Call Number Admin @ si @ LHY2022 Serial 3687
Permanent link to this record
 

 
Author Mohamed Ramzy Ibrahim; Robert Benavente; Felipe Lumbreras; Daniel Ponsa
Title 3DRRDB: Super Resolution of Multiple Remote Sensing Images using 3D Residual in Residual Dense Blocks Type Conference Article
Year 2022 Publication CVPR 2022 Workshop on IEEE Perception Beyond the Visible Spectrum workshop series (PBVS, 18th Edition) Abbreviated Journal
Volume Issue Pages
Keywords Training; Solid modeling; Three-dimensional displays; PSNR; Convolution; Superresolution; Pattern recognition
Abstract The rapid advancement of Deep Convolutional Neural Networks helped in solving many remote sensing problems, especially the problems of super-resolution. However, most state-of-the-art methods focus more on Single Image Super-Resolution neglecting Multi-Image Super-Resolution. In this work, a new proposed 3D Residual in Residual Dense Blocks model (3DRRDB) focuses on remote sensing Multi-Image Super-Resolution for two different single spectral bands. The proposed 3DRRDB model explores the idea of 3D convolution layers in deeply connected Dense Blocks and the effect of local and global residual connections with residual scaling in Multi-Image Super-Resolution. The model tested on the Proba-V challenge dataset shows a significant improvement above the current state-of-the-art models scoring a Corrected Peak Signal to Noise Ratio (cPSNR) of 48.79 dB and 50.83 dB for Near Infrared (NIR) and RED Bands respectively. Moreover, the proposed 3DRRDB model scores a Corrected Structural Similarity Index Measure (cSSIM) of 0.9865 and 0.9909 for NIR and RED bands respectively.
Address New Orleans, USA; 19 June 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) CVPRW
Notes MSIAU; 600.130 Approved no
Call Number Admin @ si @ IBL2022 Serial 3693
Permanent link to this record
 

 
Author Bojana Gajic; Ariel Amato; Ramon Baldrich; Joost Van de Weijer; Carlo Gatta
Title Area Under the ROC Curve Maximization for Metric Learning Type Conference Article
Year 2022 Publication CVPR 2022 Workshop on Efficien Deep Learning for Computer Vision (ECV 2022, 5th Edition) Abbreviated Journal
Volume Issue Pages
Keywords Training; Computer vision; Conferences; Area measurement; Benchmark testing; Pattern recognition
Abstract Most popular metric learning losses have no direct relation with the evaluation metrics that are subsequently applied to evaluate their performance. We hypothesize that training a metric learning model by maximizing the area under the ROC curve (which is a typical performance measure of recognition systems) can induce an implicit ranking suitable for retrieval problems. This hypothesis is supported by previous work that proved that a curve dominates in ROC space if and only if it dominates in Precision-Recall space. To test this hypothesis, we design and maximize an approximated, derivable relaxation of the area under the ROC curve. The proposed AUC loss achieves state-of-the-art results on two large scale retrieval benchmark datasets (Stanford Online Products and DeepFashion In-Shop). Moreover, the AUC loss achieves comparable performance to more complex, domain specific, state-of-the-art methods for vehicle re-identification.
Address New Orleans, USA; 20 June 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) CVPRW
Notes CIC; LAMP; Approved no
Call Number Admin @ si @ GAB2022 Serial 3700
Permanent link to this record
 

 
Author Alex Gomez-Villa; Bartlomiej Twardowski; Lu Yu; Andrew Bagdanov; Joost Van de Weijer
Title Continually Learning Self-Supervised Representations With Projected Functional Regularization Type Conference Article
Year 2022 Publication CVPR 2022 Workshop on Continual Learning (CLVision, 3rd Edition) Abbreviated Journal
Volume Issue Pages 3866-3876
Keywords Computer vision; Conferences; Self-supervised learning; Image representation; Pattern recognition
Abstract Recent self-supervised learning methods are able to learn high-quality image representations and are closing the gap with supervised approaches. However, these methods are unable to acquire new knowledge incrementally – they are, in fact, mostly used only as a pre-training phase over IID data. In this work we investigate self-supervised methods in continual learning regimes without any replay
mechanism. We show that naive functional regularization,also known as feature distillation, leads to lower plasticity and limits continual learning performance. Instead, we propose Projected Functional Regularization in which a separate temporal projection network ensures that the newly learned feature space preserves information of the previous one, while at the same time allowing for the learning of new features. This prevents forgetting while maintaining the plasticity of the learner. Comparison with other incremental learning approaches applied to self-supervision demonstrates that our method obtains competitive performance in
different scenarios and on multiple datasets.
Address New Orleans, USA; 20 June 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) CVPRW
Notes LAMP: 600.147; 600.120 Approved no
Call Number Admin @ si @ GTY2022 Serial 3704
Permanent link to this record
 

 
Author Bojana Gajic; Ramon Baldrich
Title Cross-domain fashion image retrieval Type Conference Article
Year 2018 Publication CVPR 2018 Workshop on Women in Computer Vision (WiCV 2018, 4th Edition) Abbreviated Journal
Volume Issue Pages 19500-19502
Keywords
Abstract Cross domain image retrieval is a challenging task that implies matching images from one domain to their pairs from another domain. In this paper we focus on fashion image retrieval, which involves matching an image of a fashion item taken by users, to the images of the same item taken in controlled condition, usually by professional photographer. When facing this problem, we have different products
in train and test time, and we use triplet loss to train the network. We stress the importance of proper training of simple architecture, as well as adapting general models to the specific task.
Address Salt Lake City, USA; 22 June 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) CVPRW
Notes CIC; 600.087 Approved no
Call Number Admin @ si @ Serial 3709
Permanent link to this record
 

 
Author Spencer Low; Oliver Nina; Angel Sappa; Erik Blasch; Nathan Inkawhich
Title Multi-Modal Aerial View Object Classification Challenge Results – PBVS 2022 Type Conference Article
Year 2022 Publication IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) Abbreviated Journal
Volume Issue Pages 350-358
Keywords
Abstract This paper details the results and main findings of the second iteration of the Multi-modal Aerial View Object Classification (MAVOC) challenge. The primary goal of both MAVOC challenges is to inspire research into methods for building recognition models that utilize both synthetic aperture radar (SAR) and electro-optical (EO) imagery. Teams are encouraged to develop multi-modal approaches that incorporate complementary information from both domains. While the 2021 challenge showed a proof of concept that both modalities could be used together, the 2022 challenge focuses on the detailed multi-modal methods. The 2022 challenge uses the same UNIfied Coincident Optical and Radar for recognitioN (UNICORN) dataset and competition format that was used in 2021. Specifically, the challenge focuses on two tasks, (1) SAR classification and (2) SAR + EO classification. The bulk of this document is dedicated to discussing the top performing methods and describing their performance on our blind test set. Notably, all of the top ten teams outperform a Resnet-18 baseline. For SAR classification, the top team showed a 129% improvement over baseline and an 8% average improvement from the 2021 winner. The top team for SAR + EO classification shows a 165% improvement with a 32% average improvement over 2021.
Address New Orleans; USA; June 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) CVPRW
Notes MSIAU Approved no
Call Number Admin @ si @ LNS2022 Serial 3768
Permanent link to this record
 

 
Author Aneesh Rangnekar; Zachary Mulhollan; Anthony Vodacek; Matthew Hoffman; Angel Sappa; Erik Blasch; Jun Yu; Liwen Zhang; Shenshen Du; Hao Chang; Keda Lu; Zhong Zhang; Fang Gao; Ye Yu; Feng Shuang; Lei Wang; Qiang Ling; Pranjay Shyam; Kuk-Jin Yoon; Kyung-Soo Kim
Title Semi-Supervised Hyperspectral Object Detection Challenge Results – PBVS 2022 Type Conference Article
Year 2022 Publication IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) Abbreviated Journal
Volume Issue Pages 390-398
Keywords Training; Computer visio; Conferences; Training data; Object detection; Semisupervised learning; Transformers
Abstract This paper summarizes the top contributions to the first semi-supervised hyperspectral object detection (SSHOD) challenge, which was organized as a part of the Perception Beyond the Visible Spectrum (PBVS) 2022 workshop at the Computer Vision and Pattern Recognition (CVPR) conference. The SSHODC challenge is a first-of-its-kind hyperspectral dataset with temporally contiguous frames collected from a university rooftop observing a 4-way vehicle intersection over a period of three days. The dataset contains a total of 2890 frames, captured at an average resolution of 1600 × 192 pixels, with 51 hyperspectral bands from 400nm to 900nm. SSHOD challenge uses 989 images as the training set, 605 images as validation set and 1296 images as the evaluation (test) set. Each set was acquired on a different day to maximize the variance in weather conditions. Labels are provided for 10% of the annotated data, hence formulating a semi-supervised learning task for the participants which is evaluated in terms of average precision over the entire set of classes, as well as individual moving object classes: namely vehicle, bus and bike. The challenge received participation registration from 38 individuals, with 8 participating in the validation phase and 3 participating in the test phase. This paper describes the dataset acquisition, with challenge formulation, proposed methods and qualitative and quantitative results.
Address New Orleans; USA; June 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) CVPRW
Notes MSIAU; no menciona Approved no
Call Number Admin @ si @ RMV2022 Serial 3774
Permanent link to this record
 

 
Author Rafael E. Rivadeneira; Angel Sappa; Boris X. Vintimilla; Jin Kim; Dogun Kim; Zhihao Li; Yingchun Jian; Bo Yan; Leilei Cao; Fengliang Qi; Hongbin Wang Rongyuan Wu; Lingchen Sun; Yongqiang Zhao; Lin Li; Kai Wang; Yicheng Wang; Xuanming Zhang; Huiyuan Wei; Chonghua Lv; Qigong Sun; Xiaolin Tian; Zhuang Jia; Jiakui Hu; Chenyang Wang; Zhiwei Zhong; Xianming Liu; Junjun Jiang
Title Thermal Image Super-Resolution Challenge Results – PBVS 2022 Type Conference Article
Year 2022 Publication IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) Abbreviated Journal
Volume Issue Pages 418-426
Keywords
Abstract This paper presents results from the third Thermal Image Super-Resolution (TISR) challenge organized in the Perception Beyond the Visible Spectrum (PBVS) 2022 workshop. The challenge uses the same thermal image dataset as the first two challenges, with 951 training images and 50 validation images at each resolution. A set of 20 images was kept aside for testing. The evaluation tasks were to measure the PSNR and SSIM between the SR image and the ground truth (HR thermal noisy image downsampled by four), and also to measure the PSNR and SSIM between the SR image and the semi-registered HR image (acquired with another camera). The results outperformed those from last year’s challenge, improving both evaluation metrics. This year, almost 100 teams participants registered for the challenge, showing the community’s interest in this hot topic.
Address New Orleans; USA; June 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) CVPRW
Notes MSIAU; no menciona Approved no
Call Number Admin @ si @ RSV2022c Serial 3775
Permanent link to this record
 

 
Author Francesco Pelosin; Saurav Jha; Andrea Torsello; Bogdan Raducanu; Joost Van de Weijer
Title Towards exemplar-free continual learning in vision transformers: an account of attention, functional and weight regularization Type Conference Article
Year 2022 Publication IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) Abbreviated Journal
Volume Issue Pages
Keywords Learning systems; Weight measurement; Image recognition; Surgery; Benchmark testing; Transformers; Stability analysis
Abstract In this paper, we investigate the continual learning of Vision Transformers (ViT) for the challenging exemplar-free scenario, with special focus on how to efficiently distill the knowledge of its crucial self-attention mechanism (SAM). Our work takes an initial step towards a surgical investigation of SAM for designing coherent continual learning methods in ViTs. We first carry out an evaluation of established continual learning regularization techniques. We then examine the effect of regularization when applied to two key enablers of SAM: (a) the contextualized embedding layers, for their ability to capture well-scaled representations with respect to the values, and (b) the prescaled attention maps, for carrying value-independent global contextual information. We depict the perks of each distilling strategy on two image recognition benchmarks (CIFAR100 and ImageNet-32) – while (a) leads to a better overall accuracy, (b) helps enhance the rigidity by maintaining competitive performances. Furthermore, we identify the limitation imposed by the symmetric nature of regularization losses. To alleviate this, we propose an asymmetric variant and apply it to the pooled output distillation (POD) loss adapted for ViTs. Our experiments confirm that introducing asymmetry to POD boosts its plasticity while retaining stability across (a) and (b). Moreover, we acknowledge low forgetting measures for all the compared methods, indicating that ViTs might be naturally inclined continual learners. 1
Address New Orleans; USA; June 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) CVPRW
Notes LAMP; 600.147 Approved no
Call Number Admin @ si @ PJT2022 Serial 3784
Permanent link to this record