Home | [131–140] << 141 142 143 144 145 146 147 148 149 150 >> [151–160] |
![]() |
Records | |||||
---|---|---|---|---|---|
Author | Yaxing Wang; Salman Khan; Abel Gonzalez-Garcia; Joost Van de Weijer; Fahad Shahbaz Khan | ||||
Title | Semi-supervised Learning for Few-shot Image-to-Image Translation | Type | Conference Article | ||
Year | 2020 | Publication | 33rd IEEE Conference on Computer Vision and Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | In the last few years, unpaired image-to-image translation has witnessed remarkable progress. Although the latest methods are able to generate realistic images, they crucially rely on a large number of labeled images. Recently, some methods have tackled the challenging setting of few-shot image-to-image translation, reducing the labeled data requirements for the target domain during inference. In this work, we go one step further and reduce the amount of required labeled data also from the source domain during training. To do so, we propose applying semi-supervised learning via a noise-tolerant pseudo-labeling procedure. We also apply a cycle consistency constraint to further exploit the information from unlabeled images, either from the same dataset or external. Additionally, we propose several structural modifications to facilitate the image translation task under these circumstances. Our semi-supervised method for few-shot image translation, called SEMIT, achieves excellent results on four different datasets using as little as 10% of the source labels, and matches the performance of the main fully-supervised competitor using only 20% labeled data. Our code and models are made public at: this https URL. | ||||
Address | Virtual; June 2020 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVPR | ||
Notes | LAMP; 600.120 | Approved | no | ||
Call Number ![]() |
Admin @ si @ WKG2020 | Serial | 3486 | ||
Permanent link to this record | |||||
Author | Kai Wang; Xialei Liu; Andrew Bagdanov; Luis Herranz; Shangling Jui; Joost Van de Weijer | ||||
Title | Incremental Meta-Learning via Episodic Replay Distillation for Few-Shot Image Recognition | Type | Conference Article | ||
Year | 2022 | Publication | CVPR 2022 Workshop on Continual Learning (CLVision, 3rd Edition) | Abbreviated Journal | |
Volume | Issue | Pages | 3728-3738 | ||
Keywords | Training; Computer vision; Image recognition; Upper bound; Conferences; Pattern recognition; Task analysis | ||||
Abstract | In this paper we consider the problem of incremental meta-learning in which classes are presented incrementally in discrete tasks. We propose Episodic Replay Distillation (ERD), that mixes classes from the current task with exemplars from previous tasks when sampling episodes for meta-learning. To allow the training to benefit from a large as possible variety of classes, which leads to more gener-
alizable feature representations, we propose the cross-task meta loss. Furthermore, we propose episodic replay distillation that also exploits exemplars for improved knowledge distillation. Experiments on four datasets demonstrate that ERD surpasses the state-of-the-art. In particular, on the more challenging one-shot, long task sequence scenarios, we reduce the gap between Incremental Meta-Learning and the joint-training upper bound from 3.5% / 10.1% / 13.4% / 11.7% with the current state-of-the-art to 2.6% / 2.9% / 5.0% / 0.2% with our method on Tiered-ImageNet / Mini-ImageNet / CIFAR100 / CUB, respectively. |
||||
Address | New Orleans, USA; 20 June 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVPRW | ||
Notes | LAMP; 600.147 | Approved | no | ||
Call Number ![]() |
Admin @ si @ WLB2022 | Serial | 3686 | ||
Permanent link to this record | |||||
Author | Maciej Wielgosz; Antonio Lopez; Muhamad Naveed Riaz | ||||
Title | CARLA-BSP: a simulated dataset with pedestrians | Type | Miscellaneous | ||
Year | 2023 | Publication | Arxiv | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | We present a sample dataset featuring pedestrians generated using the ARCANE framework, a new framework for generating datasets in CARLA (0.9.13). We provide use cases for pedestrian detection, autoencoding, pose estimation, and pose lifting. We also showcase baseline results. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ADAS | Approved | no | ||
Call Number ![]() |
Admin @ si @ WLN2023 | Serial | 3866 | ||
Permanent link to this record | |||||
Author | Pichao Wang; Wanqing Li; Philip Ogunbona; Jun Wan; Sergio Escalera | ||||
Title | RGB-D-based Human Motion Recognition with Deep Learning: A Survey | Type | Journal Article | ||
Year | 2018 | Publication | Computer Vision and Image Understanding | Abbreviated Journal | CVIU |
Volume | 171 | Issue | Pages | 118-139 | |
Keywords | Human motion recognition; RGB-D data; Deep learning; Survey | ||||
Abstract | Human motion recognition is one of the most important branches of human-centered research activities. In recent years, motion recognition based on RGB-D data has attracted much attention. Along with the development in artificial intelligence, deep learning techniques have gained remarkable success in computer vision. In particular, convolutional neural networks (CNN) have achieved great success for image-based tasks, and recurrent neural networks (RNN) are renowned for sequence-based problems. Specifically, deep learning methods based on the CNN and RNN architectures have been adopted for motion recognition using RGB-D data. In this paper, a detailed overview of recent advances in RGB-D-based motion recognition is presented. The reviewed methods are broadly categorized into four groups, depending on the modality adopted for recognition: RGB-based, depth-based, skeleton-based and RGB+D-based. As a survey focused on the application of deep learning to RGB-D-based motion recognition, we explicitly discuss the advantages and limitations of existing techniques. Particularly, we highlighted the methods of encoding spatial-temporal-structural information inherent in video sequence, and discuss potential directions for future research. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HUPBA; no proj | Approved | no | ||
Call Number ![]() |
Admin @ si @ WLO2018 | Serial | 3123 | ||
Permanent link to this record | |||||
Author | Yaxing Wang; Hector Laria Mantecon; Joost Van de Weijer; Laura Lopez-Fuentes; Bogdan Raducanu | ||||
Title | TransferI2I: Transfer Learning for Image-to-Image Translation from Small Datasets | Type | Conference Article | ||
Year | 2021 | Publication | 19th IEEE International Conference on Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | 13990-13999 | ||
Keywords | |||||
Abstract | Image-to-image (I2I) translation has matured in recent years and is able to generate high-quality realistic images. However, despite current success, it still faces important challenges when applied to small domains. Existing methods use transfer learning for I2I translation, but they still require the learning of millions of parameters from scratch. This drawback severely limits its application on small domains. In this paper, we propose a new transfer learning for I2I translation (TransferI2I). We decouple our learning process into the image generation step and the I2I translation step. In the first step we propose two novel techniques: source-target initialization and self-initialization of the adaptor layer. The former finetunes the pretrained generative model (e.g., StyleGAN) on source and target data. The latter allows to initialize all non-pretrained network parameters without the need of any data. These techniques provide a better initialization for the I2I translation step. In addition, we introduce an auxiliary GAN that further facilitates the training of deep I2I systems even from small datasets. In extensive experiments on three datasets, (Animal faces, Birds, and Foods), we show that we outperform existing methods and that mFID improves on several datasets with over 25 points. | ||||
Address | Virtual; October 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICCV | ||
Notes | LAMP; 600.147; 602.200; 600.120 | Approved | no | ||
Call Number ![]() |
Admin @ si @ WLW2021 | Serial | 3604 | ||
Permanent link to this record | |||||
Author | Jun Wan; Chi Lin; Longyin Wen; Yunan Li; Qiguang Miao; Sergio Escalera; Gholamreza Anbarjafari; Isabelle Guyon; Guodong Guo; Stan Z. Li | ||||
Title | ChaLearn Looking at People: IsoGD and ConGD Large-scale RGB-D Gesture Recognition | Type | Journal Article | ||
Year | 2022 | Publication | IEEE Transactions on Cybernetics | Abbreviated Journal | TCIBERN |
Volume | 52 | Issue | 5 | Pages | 3422-3433 |
Keywords | |||||
Abstract | The ChaLearn large-scale gesture recognition challenge has been run twice in two workshops in conjunction with the International Conference on Pattern Recognition (ICPR) 2016 and International Conference on Computer Vision (ICCV) 2017, attracting more than 200 teams round the world. This challenge has two tracks, focusing on isolated and continuous gesture recognition, respectively. This paper describes the creation of both benchmark datasets and analyzes the advances in large-scale gesture recognition based on these two datasets. We discuss the challenges of collecting large-scale ground-truth annotations of gesture recognition, and provide a detailed analysis of the current state-of-the-art methods for large-scale isolated and continuous gesture recognition based on RGB-D video sequences. In addition to recognition rate and mean jaccard index (MJI) as evaluation metrics used in our previous challenges, we also introduce the corrected segmentation rate (CSR) metric to evaluate the performance of temporal segmentation for continuous gesture recognition. Furthermore, we propose a bidirectional long short-term memory (Bi-LSTM) baseline method, determining the video division points based on the skeleton points extracted by convolutional pose machine (CPM). Experiments demonstrate that the proposed Bi-LSTM outperforms the state-of-the-art methods with an absolute improvement of 8.1% (from 0.8917 to 0.9639) of CSR. | ||||
Address | May 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HUPBA; no menciona | Approved | no | ||
Call Number ![]() |
Admin @ si @ WLW2022 | Serial | 3522 | ||
Permanent link to this record | |||||
Author | Yifan Wang; Luka Murn; Luis Herranz; Fei Yang; Marta Mrak; Wei Zhang; Shuai Wan; Marc Gorriz Blanch | ||||
Title | Efficient Super-Resolution for Compression Of Gaming Videos | Type | Conference Article | ||
Year | 2023 | Publication | IEEE International Conference on Acoustics, Speech and Signal Processing | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Due to the increasing demand for game-streaming services, efficient compression of computer-generated video is more critical than ever, especially when the available bandwidth is low. This paper proposes a super-resolution framework that improves the coding efficiency of computer-generated gaming videos at low bitrates. Most state-of-the-art super-resolution networks generalize over a variety of RGB inputs and use a unified network architecture for frames of different levels of degradation, leading to high complexity and redundancy. Since games usually consist of a limited number of fixed scenarios, we specialize one model for each scenario and assign appropriate network capacities for different QPs to perform super-resolution under the guidance of reconstructed high-quality luma components. Experimental results show that our framework achieves a superior quality-complexity trade-off compared to the ESRnet baseline, saving at most 93.59% parameters while maintaining comparable performance. The compression efficiency compared to HEVC is also improved by more than 17% BD-rate gain. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICASSP | ||
Notes | LAMP; MACO | Approved | no | ||
Call Number ![]() |
Admin @ si @ WMH2023 | Serial | 3911 | ||
Permanent link to this record | |||||
Author | Chenshen Wu | ||||
Title | Going beyond Classification Problems for the Continual Learning of Deep Neural Networks | Type | Book Whole | ||
Year | 2023 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Deep learning has made tremendous progress in the last decade due to the explosion of training data and computational power. Through end-to-end training on a
large dataset, image representations are more discriminative than the previously used hand-crafted features. However, for many real-world applications, training and testing on a single dataset is not realistic, as the test distribution may change over time. Continuous learning takes this situation into account, where the learner must adapt to a sequence of tasks, each with a different distribution. If you would naively continue training the model with a new task, the performance of the model would drop dramatically for the previously learned data. This phenomenon is known as catastrophic forgetting. Many approaches have been proposed to address this problem, which can be divided into three main categories: regularization-based approaches, rehearsal-based approaches, and parameter isolation-based approaches. However, most of the existing works focus on image classification tasks and many other computer vision tasks have not been well-explored in the continual learning setting. Therefore, in this thesis, we study continual learning for image generation, object re-identification, and object counting. For the image generation problem, since the model can generate images from the previously learned task, it is free to apply rehearsal without any limitation. We developed two methods based on generative replay. The first one uses the generated image for joint training together with the new data. The second one is based on output pixel-wise alignment. We extensively evaluate these methods on several benchmarks. Next, we study continual learning for object Re-Identification (ReID). Although most state-of-the-art methods of ReID and continual ReID use softmax-triplet loss, we found that it is better to solve the ReID problem from a meta-learning perspective because continual learning of reID can benefit a lot from the generalization of metalearning. We also propose a distillation loss and found that the removal of the positive pairs before the distillation loss is critical. Finally, we study continual learning for the counting problem. We study the mainstream method based on density maps and propose a new approach for density map distillation. We found that fixing the counter head is crucial for the continual learning of object counting. To further improve results, we propose an adaptor to adapt the changing feature extractor for the fixed counter head. Extensive evaluation shows that this results in improved continual learning performance. |
||||
Address | |||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | IMPRIMA | Place of Publication | Editor | Joost Van de Weijer;Bogdan Raducanu | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-84-126409-0-8 | Medium | ||
Area | Expedition | Conference | |||
Notes | LAMP | Approved | no | ||
Call Number ![]() |
Admin @ si @ Wu2023 | Serial | 3960 | ||
Permanent link to this record | |||||
Author | Chenshen Wu; Joost Van de Weijer | ||||
Title | Density Map Distillation for Incremental Object Counting | Type | Conference Article | ||
Year | 2023 | Publication | Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops | Abbreviated Journal | |
Volume | Issue | Pages | 2505-2514 | ||
Keywords | |||||
Abstract | We investigate the problem of incremental learning for object counting, where a method must learn to count a variety of object classes from a sequence of datasets. A naïve approach to incremental object counting would suffer from catastrophic forgetting, where it would suffer from a dramatic performance drop on previous tasks. In this paper, we propose a new exemplar-free functional regularization method, called Density Map Distillation (DMD). During training, we introduce a new counter head for each task and introduce a distillation loss to prevent forgetting of previous tasks. Additionally, we introduce a cross-task adaptor that projects the features of the current backbone to the previous backbone. This projector allows for the learning of new features while the backbone retains the relevant features for previous tasks. Finally, we set up experiments of incremental learning for counting new objects. Results confirm that our method greatly reduces catastrophic forgetting and outperforms existing methods. | ||||
Address | Vancouver; Canada; June 2023 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVPRW | ||
Notes | LAMP | Approved | no | ||
Call Number ![]() |
Admin @ si @ WuW2023 | Serial | 3916 | ||
Permanent link to this record | |||||
Author | Kai Wang; Chenshen Wu; Andrew Bagdanov; Xialei Liu; Shiqi Yang; Shangling Jui; Joost Van de Weijer | ||||
Title | Positive Pair Distillation Considered Harmful: Continual Meta Metric Learning for Lifelong Object Re-Identification | Type | Conference Article | ||
Year | 2022 | Publication | 33rd British Machine Vision Conference | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Lifelong object re-identification incrementally learns from a stream of re-identification tasks. The objective is to learn a representation that can be applied to all tasks and that generalizes to previously unseen re-identification tasks. The main challenge is that at inference time the representation must generalize to previously unseen identities. To address this problem, we apply continual meta metric learning to lifelong object re-identification. To prevent forgetting of previous tasks, we use knowledge distillation and explore the roles of positive and negative pairs. Based on our observation that the distillation and metric losses are antagonistic, we propose to remove positive pairs from distillation to robustify model updates. Our method, called Distillation without Positive Pairs (DwoPP), is evaluated on extensive intra-domain experiments on person and vehicle re-identification datasets, as well as inter-domain experiments on the LReID benchmark. Our experiments demonstrate that DwoPP significantly outperforms the state-of-the-art. | ||||
Address | London; UK; November 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | BMVC | ||
Notes | LAMP; 600.147 | Approved | no | ||
Call Number ![]() |
Admin @ si @ WWB2022 | Serial | 3794 | ||
Permanent link to this record | |||||
Author | Yaxing Wang; Chenshen Wu; Luis Herranz; Joost Van de Weijer; Abel Gonzalez-Garcia; Bogdan Raducanu | ||||
Title | Transferring GANs: generating images from limited data | Type | Conference Article | ||
Year | 2018 | Publication | 15th European Conference on Computer Vision | Abbreviated Journal | |
Volume | 11210 | Issue | Pages | 220-236 | |
Keywords | Generative adversarial networks; Transfer learning; Domain adaptation; Image generation | ||||
Abstract | ransferring knowledge of pre-trained networks to new domains by means of fine-tuning is a widely used practice for applications based on discriminative models. To the best of our knowledge this practice has not been studied within the context of generative deep networks. Therefore, we study domain adaptation applied to image generation with generative adversarial networks. We evaluate several aspects of domain adaptation, including the impact of target domain size, the relative distance between source and target domain, and the initialization of conditional GANs. Our results show that using knowledge from pre-trained networks can shorten the convergence time and can significantly improve the quality of the generated images, especially when target data is limited. We show that these conclusions can also be drawn for conditional GANs even when the pre-trained model was trained without conditioning. Our results also suggest that density is more important than diversity and a dataset with one or few densely sampled classes is a better source model than more diverse datasets such as ImageNet or Places. | ||||
Address | Munich; September 2018 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ECCV | ||
Notes | LAMP; 600.109; 600.106; 600.120 | Approved | no | ||
Call Number ![]() |
Admin @ si @ WWH2018a | Serial | 3130 | ||
Permanent link to this record | |||||
Author | Yaxing Wang; Joost Van de Weijer; Luis Herranz | ||||
Title | Mix and match networks: encoder-decoder alignment for zero-pair image translation | Type | Conference Article | ||
Year | 2018 | Publication | 31st IEEE Conference on Computer Vision and Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 5467 - 5476 | ||
Keywords | |||||
Abstract | We address the problem of image translation between domains or modalities for which no direct paired data is available (i.e. zero-pair translation). We propose mix and match networks, based on multiple encoders and decoders aligned in such a way that other encoder-decoder pairs can be composed at test time to perform unseen image translation tasks between domains or modalities for which explicit paired samples were not seen during training. We study the impact of autoencoders, side information and losses in improving the alignment and transferability of trained pairwise translation models to unseen translations. We show our approach is scalable and can perform colorization and style transfer between unseen combinations of domains. We evaluate our system in a challenging cross-modal setting where semantic segmentation is estimated from depth images, without explicit access to any depth-semantic segmentation training pairs. Our model outperforms baselines based on pix2pix and CycleGAN models. | ||||
Address | Salt Lake City; USA; June 2018 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVPR | ||
Notes | LAMP; 600.109; 600.106; 600.120 | Approved | no | ||
Call Number ![]() |
Admin @ si @ WWH2018b | Serial | 3131 | ||
Permanent link to this record | |||||
Author | Kai Wang; Joost Van de Weijer; Luis Herranz | ||||
Title | ACAE-REMIND for online continual learning with compressed feature replay | Type | Journal Article | ||
Year | 2021 | Publication | Pattern Recognition Letters | Abbreviated Journal | PRL |
Volume | 150 | Issue | Pages | 122-129 | |
Keywords | online continual learning; autoencoders; vector quantization | ||||
Abstract | Online continual learning aims to learn from a non-IID stream of data from a number of different tasks, where the learner is only allowed to consider data once. Methods are typically allowed to use a limited buffer to store some of the images in the stream. Recently, it was found that feature replay, where an intermediate layer representation of the image is stored (or generated) leads to superior results than image replay, while requiring less memory. Quantized exemplars can further reduce the memory usage. However, a drawback of these methods is that they use a fixed (or very intransigent) backbone network. This significantly limits the learning of representations that can discriminate between all tasks. To address this problem, we propose an auxiliary classifier auto-encoder (ACAE) module for feature replay at intermediate layers with high compression rates. The reduced memory footprint per image allows us to save more exemplars for replay. In our experiments, we conduct task-agnostic evaluation under online continual learning setting and get state-of-the-art performance on ImageNet-Subset, CIFAR100 and CIFAR10 dataset. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | LAMP; 600.147; 601.379; 600.120; 600.141 | Approved | no | ||
Call Number ![]() |
Admin @ si @ WWH2021 | Serial | 3575 | ||
Permanent link to this record | |||||
Author | Tao Wu; Kai Wang; Chuanming Tang; Jianlin Zhang | ||||
Title | Diffusion-based network for unsupervised landmark detection | Type | Journal Article | ||
Year | 2024 | Publication | Knowledge-Based Systems | Abbreviated Journal | |
Volume | 292 | Issue | Pages | 111627 | |
Keywords | |||||
Abstract | Landmark detection is a fundamental task aiming at identifying specific landmarks that serve as representations of distinct object features within an image. However, the present landmark detection algorithms often adopt complex architectures and are trained in a supervised manner using large datasets to achieve satisfactory performance. When faced with limited data, these algorithms tend to experience a notable decline in accuracy. To address these drawbacks, we propose a novel diffusion-based network (DBN) for unsupervised landmark detection, which leverages the generation ability of the diffusion models to detect the landmark locations. In particular, we introduce a dual-branch encoder (DualE) for extracting visual features and predicting landmarks. Additionally, we lighten the decoder structure for faster inference, referred to as LightD. By this means, we avoid relying on extensive data comparison and the necessity of designing complex architectures as in previous methods. Experiments on CelebA, AFLW, 300W and Deepfashion benchmarks have shown that DBN performs state-of-the-art compared to the existing methods. Furthermore, DBN shows robustness even when faced with limited data cases. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | LAMP | Approved | no | ||
Call Number ![]() |
Admin @ si @ WWT2024 | Serial | 4024 | ||
Permanent link to this record | |||||
Author | Fei Yang; Kai Wang; Joost Van de Weijer | ||||
Title | ScrollNet: DynamicWeight Importance for Continual Learning | Type | Conference Article | ||
Year | 2023 | Publication | Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops | Abbreviated Journal | |
Volume | Issue | Pages | 3345-3355 | ||
Keywords | |||||
Abstract | The principle underlying most existing continual learning (CL) methods is to prioritize stability by penalizing changes in parameters crucial to old tasks, while allowing for plasticity in other parameters. The importance of weights for each task can be determined either explicitly through learning a task-specific mask during training (e.g., parameter isolation-based approaches) or implicitly by introducing a regularization term (e.g., regularization-based approaches). However, all these methods assume that the importance of weights for each task is unknown prior to data exposure. In this paper, we propose ScrollNet as a scrolling neural network for continual learning. ScrollNet can be seen as a dynamic network that assigns the ranking of weight importance for each task before data exposure, thus achieving a more favorable stability-plasticity tradeoff during sequential task learning by reassigning this ranking for different tasks. Additionally, we demonstrate that ScrollNet can be combined with various CL methods, including regularization-based and replay-based approaches. Experimental results on CIFAR100 and TinyImagenet datasets show the effectiveness of our proposed method. | ||||
Address | Paris; France; October 2023 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICCVW | ||
Notes | LAMP | Approved | no | ||
Call Number ![]() |
Admin @ si @ WWW2023 | Serial | 3945 | ||
Permanent link to this record |