|   | 
Details
   web
Records
Author Mateusz Pyla; Kamil Deja; Bartłomiej Twardowski; Tomasz Trzcinski
Title (down) Bayesian Flow Networks in Continual Learning Type Miscellaneous
Year 2023 Publication arxiv Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Bayesian Flow Networks (BFNs) has been recently proposed as one of the most promising direction to universal generative modelling, having ability to learn any of the data type. Their power comes from the expressiveness of neural networks and Bayesian inference which make them suitable in the context of continual learning. We delve into the mechanics behind BFNs and conduct the experiments to empirically verify the generative capabilities on non-stationary data.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP Approved no
Call Number Admin @ si @ PDT2023 Serial 3972
Permanent link to this record
 

 
Author Yuyang Liu; Yang Cong; Dipam Goswami; Xialei Liu; Joost Van de Weijer
Title (down) Augmented Box Replay: Overcoming Foreground Shift for Incremental Object Detection Type Conference Article
Year 2023 Publication 20th IEEE International Conference on Computer Vision Abbreviated Journal
Volume Issue Pages 11367-11377
Keywords
Abstract In incremental learning, replaying stored samples from previous tasks together with current task samples is one of the most efficient approaches to address catastrophic forgetting. However, unlike incremental classification, image replay has not been successfully applied to incremental object detection (IOD). In this paper, we identify the overlooked problem of foreground shift as the main reason for this. Foreground shift only occurs when replaying images of previous tasks and refers to the fact that their background might contain foreground objects of the current task. To overcome this problem, a novel and efficient Augmented Box Replay (ABR) method is developed that only stores and replays foreground objects and thereby circumvents the foreground shift problem. In addition, we propose an innovative Attentive RoI Distillation loss that uses spatial attention from region-of-interest (RoI) features to constrain current model to focus on the most important information from old model. ABR significantly reduces forgetting of previous classes while maintaining high plasticity in current classes. Moreover, it considerably reduces the storage requirements when compared to standard image replay. Comprehensive experiments on Pascal-VOC and COCO datasets support the state-of-the-art performance of our model.
Address Paris; France; October 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICCV
Notes LAMP Approved no
Call Number Admin @ si @ LCG2023 Serial 3949
Permanent link to this record
 

 
Author Marcin Przewiezlikowski; Mateusz Pyla; Bartosz Zielinski; Bartłomiej Twardowski; Jacek Tabor; Marek Smieja
Title (down) Augmentation-aware Self-supervised Learning with Guided Projector Type Miscellaneous
Year 2023 Publication arxiv Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Self-supervised learning (SSL) is a powerful technique for learning robust representations from unlabeled data. By learning to remain invariant to applied data augmentations, methods such as SimCLR and MoCo are able to reach quality on par with supervised approaches. However, this invariance may be harmful to solving some downstream tasks which depend on traits affected by augmentations used during pretraining, such as color. In this paper, we propose to foster sensitivity to such characteristics in the representation space by modifying the projector network, a common component of self-supervised architectures. Specifically, we supplement the projector with information about augmentations applied to images. In order for the projector to take advantage of this auxiliary conditioning when solving the SSL task, the feature extractor learns to preserve the augmentation information in its representations. Our approach, coined Conditional Augmentation-aware Self-supervised Learning (CASSLE), is directly applicable to typical joint-embedding SSL methods regardless of their objective functions. Moreover, it does not require major changes in the network architecture or prior knowledge of downstream tasks. In addition to an analysis of sensitivity towards different data augmentations, we conduct a series of experiments, which show that CASSLE improves over various SSL methods, reaching state-of-the-art performance in multiple downstream tasks.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP Approved no
Call Number Admin @ si @ PPZ2023 Serial 3971
Permanent link to this record
 

 
Author Dipam Goswami; J Schuster; Joost Van de Weijer; Didier Stricker
Title (down) Attribution-aware Weight Transfer: A Warm-Start Initialization for Class-Incremental Semantic Segmentation Type Conference Article
Year 2023 Publication Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision Abbreviated Journal
Volume Issue Pages 3195-3204
Keywords
Abstract Attribution-aware Weight Transfer: A Warm-Start Initialization for Class-Incremental Semantic Segmentation. D Goswami, R Schuster, J van de Weijer, D Stricker. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2023, pp. 3195-3204
Address Waikoloa; Hawai; USA; January 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference WACV
Notes LAMP Approved no
Call Number Admin @ si @ GSW2023 Serial 3901
Permanent link to this record
 

 
Author Artur Xarles; Sergio Escalera; Thomas B. Moeslund; Albert Clapes
Title (down) ASTRA: An Action Spotting TRAnsformer for Soccer Videos Type Conference Article
Year 2023 Publication Proceedings of the 6th International Workshop on Multimedia Content Analysis in Sports Abbreviated Journal
Volume Issue Pages 93–102
Keywords
Abstract In this paper, we introduce ASTRA, a Transformer-based model designed for the task of Action Spotting in soccer matches. ASTRA addresses several challenges inherent in the task and dataset, including the requirement for precise action localization, the presence of a long-tail data distribution, non-visibility in certain actions, and inherent label noise. To do so, ASTRA incorporates (a) a Transformer encoder-decoder architecture to achieve the desired output temporal resolution and to produce precise predictions, (b) a balanced mixup strategy to handle the long-tail distribution of the data, (c) an uncertainty-aware displacement head to capture the label variability, and (d) input audio signal to enhance detection of non-visible actions. Results demonstrate the effectiveness of ASTRA, achieving a tight Average-mAP of 66.82 on the test set. Moreover, in the SoccerNet 2023 Action Spotting challenge, we secure the 3rd position with an Average-mAP of 70.21 on the challenge set.
Address Otawa; Canada; October 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference MMSports
Notes HUPBA Approved no
Call Number Admin @ si @ XEM2023 Serial 3970
Permanent link to this record
 

 
Author Damian Sojka; Sebastian Cygert; Bartlomiej Twardowski; Tomasz Trzcinski
Title (down) AR-TTA: A Simple Method for Real-World Continual Test-Time Adaptation Type Conference Article
Year 2023 Publication Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops Abbreviated Journal
Volume Issue Pages 3491-3495
Keywords
Abstract Test-time adaptation is a promising research direction that allows the source model to adapt itself to changes in data distribution without any supervision. Yet, current methods are usually evaluated on benchmarks that are only a simplification of real-world scenarios. Hence, we propose to validate test-time adaptation methods using the recently introduced datasets for autonomous driving, namely CLAD-C and SHIFT. We observe that current test-time adaptation methods struggle to effectively handle varying degrees of domain shift, often resulting in degraded performance that falls below that of the source model. We noticed that the root of the problem lies in the inability to preserve the knowledge of the source model and adapt to dynamically changing, temporally correlated data streams. Therefore, we enhance well-established self-training framework by incorporating a small memory buffer to increase model stability and at the same time perform dynamic adaptation based on the intensity of domain shift. The proposed method, named AR-TTA, outperforms existing approaches on both synthetic and more real-world benchmarks and shows robustness across a variety of TTA scenarios.
Address Paris; France; October 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICCVW
Notes LAMP Approved no
Call Number Admin @ si @ SCT2023 Serial 3943
Permanent link to this record
 

 
Author Gisel Bastidas-Guacho; Patricio Moreno; Boris X. Vintimilla; Angel Sappa
Title (down) Application on the Loop of Multimodal Image Fusion: Trends on Deep-Learning Based Approaches Type Conference Article
Year 2023 Publication 13th International Conference on Pattern Recognition Systems Abbreviated Journal
Volume 14234 Issue Pages 25–36
Keywords
Abstract Multimodal image fusion allows the combination of information from different modalities, which is useful for tasks such as object detection, edge detection, and tracking, to name a few. Using the fused representation for applications results in better task performance. There are several image fusion approaches, which have been summarized in surveys. However, the existing surveys focus on image fusion approaches where the application on the loop of multimodal image fusion is not considered. On the contrary, this study summarizes deep learning-based multimodal image fusion for computer vision (e.g., object detection) and image processing applications (e.g., semantic segmentation), that is, approaches where the application module leverages the multimodal fusion process to enhance the final result. Firstly, we introduce image fusion and the existing general frameworks for image fusion tasks such as multifocus, multiexposure and multimodal. Then, we describe the multimodal image fusion approaches. Next, we review the state-of-the-art deep learning multimodal image fusion approaches for vision applications. Finally, we conclude our survey with the trends of task-driven multimodal image fusion.
Address Guayaquil; Ecuador; July 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICPRS
Notes MSIAU Approved no
Call Number Admin @ si @ BMV2023 Serial 3932
Permanent link to this record
 

 
Author Luca Ginanni Corradini; Simone Balocco; Luciano Maresca; Silvio Vitale; Matteo Stefanini
Title (down) Anatomical Modifications After Stent Implantation: A Comparative Analysis Between CGuard, Wallstent, and Roadsaver Carotid Stents Type Journal Article
Year 2023 Publication Journal of Endovascular Therapy Abbreviated Journal
Volume 30 Issue 1 Pages 18-24
Keywords Ginanni Corradini L, Balocco S, Maresca L, Vitale S, Stefanini M.
Abstract Abstract
Purpose:
Carotid revascularization can be associated with modifications of the vascular geometry, which may lead to complications. The changes on the vessel angulation before and after a carotid WallStent (WS) implantation are compared against 2 new dual-layer devices, CGuard (CG) and RoadSaver (RS).
Materials and Methods:
The study prospectively recruited 217 consecutive patients (112 GC, 73 WS, and 32 RS, respectively). Angiography projections were explored and the one having a higher arterial angle was selected as a basal view. After stent implantation, a stent control angiography was performed selecting the projection having the maximal angle. The same procedure is followed in all the 3 stent types to guarantee comparable conditions. The angulation changes on the stented segments were quantified from both angiographies. The statistical analysis quantitatively compared the pre-and post-angles for the 3 stent types. The results are qualitatively illustrated using boxplots. Finally, the relation between pre- and post-angles measurements is analyzed using linear regression.
Results:
For CG, no statistical difference in the axial vessel geometry between the basal and postprocedural angles was found. For WS and RS, statistical difference was found between pre- and post-angles. The regression analysis shows that CG induces lower changes from the original curvature with respect to WS and RS.
Conclusion:
Based on our results, CG determines minor changes over the basal morphology than WS and RS stents. Hence, CG respects better the native vessel anatomy than the other stents.
Level of Evidence: Level 4, Case Series.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes xxx Approved no
Call Number Admin @ si @ GBM2023 Serial 4006
Permanent link to this record
 

 
Author Qingshan Chen; Zhenzhen Quan; Yujun Li; Chao Zhai; Mikhail Mozerov
Title (down) An Unsupervised Domain Adaption Approach for Cross-Modality RGB-Infrared Person Re-Identification Type Journal Article
Year 2023 Publication IEEE Sensors Journal Abbreviated Journal IEEE-SENS
Volume 23 Issue 24 Pages
Keywords Q. Chen, Z. Quan, Y. Li, C. Zhai and M. G. Mozerov
Abstract Dual-camera systems commonly employed in surveillance serve as the foundation for RGB-infrared (IR) cross-modality person re-identification (ReID). However, significant modality differences give rise to inferior performance compared to single-modality scenarios. Furthermore, most existing studies in this area rely on supervised training with meticulously labeled datasets. Labeling RGB-IR image pairs is more complex than labeling conventional image data, and deploying pretrained models on unlabeled datasets can lead to catastrophic performance degradation. In contrast to previous solutions that focus solely on cross-modality or domain adaptation issues, this article presents an end-to-end unsupervised domain adaptation (UDA) framework for the cross-modality person ReID, which can simultaneously address both of these challenges. This model employs source domain classes, target domain clusters, and unclustered instance samples for the training, maximizing the comprehensive use of the dataset. Moreover, it addresses the problem of mismatched clustering labels between the two modalities in the target domain by incorporating a label matching module that reassigns reliable clusters with labels, ensuring correspondence between different modality labels. We construct the loss function by incorporating distinctiveness loss and multiplicity loss, both of which are determined by the similarity of neighboring features in the predicted feature space and the difference between distant features. This approach enables efficient feature clustering and cluster class assignment to occur concurrently. Eight UDA cross-modality person ReID experiments are conducted on three real datasets and six synthetic datasets. The experimental results unequivocally demonstrate that the proposed model outperforms the existing state-of-the-art algorithms to a significant degree. Notably, in RegDB → RegDB_light, the Rank-1 accuracy exhibits a remarkable improvement of 8.24%.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP Approved no
Call Number Admin @ si @ CQL2023 Serial 3884
Permanent link to this record
 

 
Author Mohamed Ali Souibgui; Pau Torras; Jialuo Chen; Alicia Fornes
Title (down) An Evaluation of Handwritten Text Recognition Methods for Historical Ciphered Manuscripts Type Conference Article
Year 2023 Publication 7th International Workshop on Historical Document Imaging and Processing Abbreviated Journal
Volume Issue Pages 7-12
Keywords
Abstract This paper investigates the effectiveness of different deep learning HTR families, including LSTM, Seq2Seq, and transformer-based approaches with self-supervised pretraining, in recognizing ciphered manuscripts from different historical periods and cultures. The goal is to identify the most suitable method or training techniques for recognizing ciphered manuscripts and to provide insights into the challenges and opportunities in this field of research. We evaluate the performance of these models on several datasets of ciphered manuscripts and discuss their results. This study contributes to the development of more accurate and efficient methods for recognizing historical manuscripts for the preservation and dissemination of our cultural heritage.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference HIP
Notes DAG Approved no
Call Number Admin @ si @ STC2023 Serial 3849
Permanent link to this record
 

 
Author Jose Luis Gomez; Manuel Silva; Antonio Seoane; Agnes Borras; Mario Noriega; German Ros; Jose Antonio Iglesias; Antonio Lopez
Title (down) All for One, and One for All: UrbanSyn Dataset, the third Musketeer of Synthetic Driving Scenes Type Miscellaneous
Year 2023 Publication Arxiv Abbreviated Journal
Volume Issue Pages
Keywords
Abstract We introduce UrbanSyn, a photorealistic dataset acquired through semi-procedurally generated synthetic urban driving scenarios. Developed using high-quality geometry and materials, UrbanSyn provides pixel-level ground truth, including depth, semantic segmentation, and instance segmentation with object bounding boxes and occlusion degree. It complements GTAV and Synscapes datasets to form what we coin as the 'Three Musketeers'. We demonstrate the value of the Three Musketeers in unsupervised domain adaptation for image semantic segmentation. Results on real-world datasets, Cityscapes, Mapillary Vistas, and BDD100K, establish new benchmarks, largely attributed to UrbanSyn. We make UrbanSyn openly and freely accessible (this http URL).
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS Approved no
Call Number Admin @ si @ GSS2023 Serial 4015
Permanent link to this record
 

 
Author Yi Xiao
Title (down) Advancing Vision-based End-to-End Autonomous Driving Type Book Whole
Year 2023 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract In autonomous driving, artificial intelligence (AI) processes the traffic environment to drive the vehicle to a desired destination. Currently, there are different paradigms that address the development of AI-enabled drivers. On the one hand, we find modular pipelines, which divide the driving task into sub-tasks such as perception, maneuver planning, and control. On the other hand, we find end-to-end driving approaches that attempt to learn the direct mapping of raw data from input sensors to vehicle control signals. The latter are relatively less studied but are gaining popularity as they are less demanding in terms of data labeling. Therefore, in this thesis, our goal is to investigate end-to-end autonomous driving.
We propose to evaluate three approaches to tackle the challenge of end-to-end
autonomous driving. First, we focus on the input, considering adding depth information as complementary to RGB data, in order to mimic the human being’s
ability to estimate the distance to obstacles. Notice that, in the real world, these depth maps can be obtained either from a LiDAR sensor, or a trained monocular
depth estimation module, where human labeling is not needed. Then, based on
the intuition that the latent space of end-to-end driving models encodes relevant
information for driving, we use it as prior knowledge for training an affordancebased driving model. In this case, the trained affordance-based model can achieve good performance while requiring less human-labeled data, and it can provide interpretability regarding driving actions. Finally, we present a new pure vision-based end-to-end driving model termed CIL++, which is trained by imitation learning.
CIL++ leverages modern best practices, such as a large horizontal field of view and
a self-attention mechanism, which are contributing to the agent’s understanding of
the driving scene and bringing a better imitation of human drivers. Using training
data without any human labeling, our model yields almost expert performance in
the CARLA NoCrash benchmark and could rival SOTA models that require large amounts of human-labeled data.
Address
Corporate Author Thesis Ph.D. thesis
Publisher IMPRIMA Place of Publication Editor Antonio Lopez
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-126409-4-6 Medium
Area Expedition Conference
Notes ADAS Approved no
Call Number Admin @ si @ Xia2023 Serial 3964
Permanent link to this record
 

 
Author Jun Wan; Guodong Guo; Sergio Escalera; Hugo Jair Escalante; Stan Z Li
Title (down) Advances in Face Presentation Attack Detection Type Book Whole
Year 2023 Publication Advances in Face Presentation Attack Detection Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HUPBA Approved no
Call Number Admin @ si @ WGE2023a Serial 3955
Permanent link to this record
 

 
Author Filip Szatkowski; Mateusz Pyla; Marcin Przewięzlikowski; Sebastian Cygert; Bartłomiej Twardowski; Tomasz Trzcinski
Title (down) Adapt Your Teacher: Improving Knowledge Distillation for Exemplar-Free Continual Learning Type Conference Article
Year 2023 Publication Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops Abbreviated Journal
Volume Issue Pages 3512-3517
Keywords
Abstract In this work, we investigate exemplar-free class incremental learning (CIL) with knowledge distillation (KD) as a regularization strategy, aiming to prevent forgetting. KD-based methods are successfully used in CIL, but they often struggle to regularize the model without access to exemplars of the training data from previous tasks. Our analysis reveals that this issue originates from substantial representation shifts in the teacher network when dealing with out-of-distribution data. This causes large errors in the KD loss component, leading to performance degradation in CIL. Inspired by recent test-time adaptation methods, we introduce Teacher Adaptation (TA), a method that concurrently updates the teacher and the main model during incremental training. Our method seamlessly integrates with KD-based CIL approaches and allows for consistent enhancement of their performance across multiple exemplar-free CIL benchmarks.
Address Paris; France; October 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICCVW
Notes LAMP Approved no
Call Number Admin @ si @ Serial 3944
Permanent link to this record
 

 
Author Sergi Garcia Bordils; Dimosthenis Karatzas; Marçal Rusiñol
Title (down) Accelerating Transformer-Based Scene Text Detection and Recognition via Token Pruning Type Conference Article
Year 2023 Publication 17th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume 14192 Issue Pages 106-121
Keywords Scene Text Detection; Scene Text Recognition; Transformer Acceleration
Abstract Scene text detection and recognition is a crucial task in computer vision with numerous real-world applications. Transformer-based approaches are behind all current state-of-the-art models and have achieved excellent performance. However, the computational requirements of the transformer architecture makes training these methods slow and resource heavy. In this paper, we introduce a new token pruning strategy that significantly decreases training and inference times without sacrificing performance, striking a balance between accuracy and speed. We have applied this pruning technique to our own end-to-end transformer-based scene text understanding architecture. Our method uses a separate detection branch to guide the pruning of uninformative image features, which significantly reduces the number of tokens at the input of the transformer. Experimental results show how our network is able to obtain competitive results on multiple public benchmarks while running at significantly higher speeds.
Address San Jose; CA; USA; August 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICDAR
Notes DAG Approved no
Call Number Admin @ si @ GKR2023a Serial 3907
Permanent link to this record