Publicacions CVC -- Query Results

<< 1 2 3 4 5 6 7 8 9 10 >> [11–11]

Details

Records
Author	Mateusz Pyla; Kamil Deja; Bartłomiej Twardowski; Tomasz Trzcinski
Title	Bayesian Flow Networks in Continual Learning			Type	Miscellaneous
Year	2023	Publication	arxiv	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Bayesian Flow Networks (BFNs) has been recently proposed as one of the most promising direction to universal generative modelling, having ability to learn any of the data type. Their power comes from the expressiveness of neural networks and Bayesian inference which make them suitable in the context of continual learning. We delve into the mechanics behind BFNs and conduct the experiments to empirically verify the generative capabilities on non-stationary data.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	LAMP			Approved	no
Call Number	Admin @ si @ PDT2023			Serial	3972
Permanent link to this record



Author	Yuyang Liu; Yang Cong; Dipam Goswami; Xialei Liu; Joost Van de Weijer
Title	Augmented Box Replay: Overcoming Foreground Shift for Incremental Object Detection			Type	Conference Article
Year	2023	Publication	20th IEEE International Conference on Computer Vision	Abbreviated Journal
Volume		Issue		Pages	11367-11377
Keywords
Abstract	In incremental learning, replaying stored samples from previous tasks together with current task samples is one of the most efficient approaches to address catastrophic forgetting. However, unlike incremental classification, image replay has not been successfully applied to incremental object detection (IOD). In this paper, we identify the overlooked problem of foreground shift as the main reason for this. Foreground shift only occurs when replaying images of previous tasks and refers to the fact that their background might contain foreground objects of the current task. To overcome this problem, a novel and efficient Augmented Box Replay (ABR) method is developed that only stores and replays foreground objects and thereby circumvents the foreground shift problem. In addition, we propose an innovative Attentive RoI Distillation loss that uses spatial attention from region-of-interest (RoI) features to constrain current model to focus on the most important information from old model. ABR significantly reduces forgetting of previous classes while maintaining high plasticity in current classes. Moreover, it considerably reduces the storage requirements when compared to standard image replay. Comprehensive experiments on Pascal-VOC and COCO datasets support the state-of-the-art performance of our model.
Address	Paris; France; October 2023
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICCV
Notes	LAMP			Approved	no
Call Number	Admin @ si @ LCG2023			Serial	3949
Permanent link to this record



Author	Marcin Przewiezlikowski; Mateusz Pyla; Bartosz Zielinski; Bartłomiej Twardowski; Jacek Tabor; Marek Smieja
Title	Augmentation-aware Self-supervised Learning with Guided Projector			Type	Miscellaneous
Year	2023	Publication	arxiv	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Self-supervised learning (SSL) is a powerful technique for learning robust representations from unlabeled data. By learning to remain invariant to applied data augmentations, methods such as SimCLR and MoCo are able to reach quality on par with supervised approaches. However, this invariance may be harmful to solving some downstream tasks which depend on traits affected by augmentations used during pretraining, such as color. In this paper, we propose to foster sensitivity to such characteristics in the representation space by modifying the projector network, a common component of self-supervised architectures. Specifically, we supplement the projector with information about augmentations applied to images. In order for the projector to take advantage of this auxiliary conditioning when solving the SSL task, the feature extractor learns to preserve the augmentation information in its representations. Our approach, coined Conditional Augmentation-aware Self-supervised Learning (CASSLE), is directly applicable to typical joint-embedding SSL methods regardless of their objective functions. Moreover, it does not require major changes in the network architecture or prior knowledge of downstream tasks. In addition to an analysis of sensitivity towards different data augmentations, we conduct a series of experiments, which show that CASSLE improves over various SSL methods, reaching state-of-the-art performance in multiple downstream tasks.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	LAMP			Approved	no
Call Number	Admin @ si @ PPZ2023			Serial	3971
Permanent link to this record



Author	Dipam Goswami; J Schuster; Joost Van de Weijer; Didier Stricker
Title	Attribution-aware Weight Transfer: A Warm-Start Initialization for Class-Incremental Semantic Segmentation			Type	Conference Article
Year	2023	Publication	Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision	Abbreviated Journal
Volume		Issue		Pages	3195-3204
Keywords
Abstract	Attribution-aware Weight Transfer: A Warm-Start Initialization for Class-Incremental Semantic Segmentation. D Goswami, R Schuster, J van de Weijer, D Stricker. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2023, pp. 3195-3204
Address	Waikoloa; Hawai; USA; January 2023
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	WACV
Notes	LAMP			Approved	no
Call Number	Admin @ si @ GSW2023			Serial	3901
Permanent link to this record



Author	Artur Xarles; Sergio Escalera; Thomas B. Moeslund; Albert Clapes
Title	ASTRA: An Action Spotting TRAnsformer for Soccer Videos			Type	Conference Article
Year	2023	Publication	Proceedings of the 6th International Workshop on Multimedia Content Analysis in Sports	Abbreviated Journal
Volume		Issue		Pages	93–102
Keywords
Abstract	In this paper, we introduce ASTRA, a Transformer-based model designed for the task of Action Spotting in soccer matches. ASTRA addresses several challenges inherent in the task and dataset, including the requirement for precise action localization, the presence of a long-tail data distribution, non-visibility in certain actions, and inherent label noise. To do so, ASTRA incorporates (a) a Transformer encoder-decoder architecture to achieve the desired output temporal resolution and to produce precise predictions, (b) a balanced mixup strategy to handle the long-tail distribution of the data, (c) an uncertainty-aware displacement head to capture the label variability, and (d) input audio signal to enhance detection of non-visible actions. Results demonstrate the effectiveness of ASTRA, achieving a tight Average-mAP of 66.82 on the test set. Moreover, in the SoccerNet 2023 Action Spotting challenge, we secure the 3rd position with an Average-mAP of 70.21 on the challenge set.
Address	Otawa; Canada; October 2023
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	MMSports
Notes	HUPBA			Approved	no
Call Number	Admin @ si @ XEM2023			Serial	3970
Permanent link to this record



Author	Damian Sojka; Sebastian Cygert; Bartlomiej Twardowski; Tomasz Trzcinski
Title	AR-TTA: A Simple Method for Real-World Continual Test-Time Adaptation			Type	Conference Article
Year	2023	Publication	Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops	Abbreviated Journal
Volume		Issue		Pages	3491-3495
Keywords
Abstract	Test-time adaptation is a promising research direction that allows the source model to adapt itself to changes in data distribution without any supervision. Yet, current methods are usually evaluated on benchmarks that are only a simplification of real-world scenarios. Hence, we propose to validate test-time adaptation methods using the recently introduced datasets for autonomous driving, namely CLAD-C and SHIFT. We observe that current test-time adaptation methods struggle to effectively handle varying degrees of domain shift, often resulting in degraded performance that falls below that of the source model. We noticed that the root of the problem lies in the inability to preserve the knowledge of the source model and adapt to dynamically changing, temporally correlated data streams. Therefore, we enhance well-established self-training framework by incorporating a small memory buffer to increase model stability and at the same time perform dynamic adaptation based on the intensity of domain shift. The proposed method, named AR-TTA, outperforms existing approaches on both synthetic and more real-world benchmarks and shows robustness across a variety of TTA scenarios.
Address	Paris; France; October 2023
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICCVW
Notes	LAMP			Approved	no
Call Number	Admin @ si @ SCT2023			Serial	3943
Permanent link to this record



Author	Gisel Bastidas-Guacho; Patricio Moreno; Boris X. Vintimilla; Angel Sappa
Title	Application on the Loop of Multimodal Image Fusion: Trends on Deep-Learning Based Approaches			Type	Conference Article
Year	2023	Publication	13th International Conference on Pattern Recognition Systems	Abbreviated Journal
Volume	14234	Issue		Pages	25–36
Keywords
Abstract	Multimodal image fusion allows the combination of information from different modalities, which is useful for tasks such as object detection, edge detection, and tracking, to name a few. Using the fused representation for applications results in better task performance. There are several image fusion approaches, which have been summarized in surveys. However, the existing surveys focus on image fusion approaches where the application on the loop of multimodal image fusion is not considered. On the contrary, this study summarizes deep learning-based multimodal image fusion for computer vision (e.g., object detection) and image processing applications (e.g., semantic segmentation), that is, approaches where the application module leverages the multimodal fusion process to enhance the final result. Firstly, we introduce image fusion and the existing general frameworks for image fusion tasks such as multifocus, multiexposure and multimodal. Then, we describe the multimodal image fusion approaches. Next, we review the state-of-the-art deep learning multimodal image fusion approaches for vision applications. Finally, we conclude our survey with the trends of task-driven multimodal image fusion.
Address	Guayaquil; Ecuador; July 2023
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICPRS
Notes	MSIAU			Approved	no
Call Number	Admin @ si @ BMV2023			Serial	3932
Permanent link to this record



Author	Luca Ginanni Corradini; Simone Balocco; Luciano Maresca; Silvio Vitale; Matteo Stefanini
Title	Anatomical Modifications After Stent Implantation: A Comparative Analysis Between CGuard, Wallstent, and Roadsaver Carotid Stents			Type	Journal Article
Year	2023	Publication	Journal of Endovascular Therapy	Abbreviated Journal
Volume	30	Issue	1	Pages	18-24
Keywords	Ginanni Corradini L, Balocco S, Maresca L, Vitale S, Stefanini M.
Abstract	Abstract Purpose: Carotid revascularization can be associated with modifications of the vascular geometry, which may lead to complications. The changes on the vessel angulation before and after a carotid WallStent (WS) implantation are compared against 2 new dual-layer devices, CGuard (CG) and RoadSaver (RS). Materials and Methods: The study prospectively recruited 217 consecutive patients (112 GC, 73 WS, and 32 RS, respectively). Angiography projections were explored and the one having a higher arterial angle was selected as a basal view. After stent implantation, a stent control angiography was performed selecting the projection having the maximal angle. The same procedure is followed in all the 3 stent types to guarantee comparable conditions. The angulation changes on the stented segments were quantified from both angiographies. The statistical analysis quantitatively compared the pre-and post-angles for the 3 stent types. The results are qualitatively illustrated using boxplots. Finally, the relation between pre- and post-angles measurements is analyzed using linear regression. Results: For CG, no statistical difference in the axial vessel geometry between the basal and postprocedural angles was found. For WS and RS, statistical difference was found between pre- and post-angles. The regression analysis shows that CG induces lower changes from the original curvature with respect to WS and RS. Conclusion: Based on our results, CG determines minor changes over the basal morphology than WS and RS stents. Hence, CG respects better the native vessel anatomy than the other stents. Level of Evidence: Level 4, Case Series.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	xxx			Approved	no
Call Number	Admin @ si @ GBM2023			Serial	4006
Permanent link to this record



Author	Qingshan Chen; Zhenzhen Quan; Yujun Li; Chao Zhai; Mikhail Mozerov
Title	An Unsupervised Domain Adaption Approach for Cross-Modality RGB-Infrared Person Re-Identification			Type	Journal Article
Year	2023	Publication	IEEE Sensors Journal	Abbreviated Journal	IEEE-SENS
Volume	23	Issue	24	Pages
Keywords	Q. Chen, Z. Quan, Y. Li, C. Zhai and M. G. Mozerov
Abstract	Dual-camera systems commonly employed in surveillance serve as the foundation for RGB-infrared (IR) cross-modality person re-identification (ReID). However, significant modality differences give rise to inferior performance compared to single-modality scenarios. Furthermore, most existing studies in this area rely on supervised training with meticulously labeled datasets. Labeling RGB-IR image pairs is more complex than labeling conventional image data, and deploying pretrained models on unlabeled datasets can lead to catastrophic performance degradation. In contrast to previous solutions that focus solely on cross-modality or domain adaptation issues, this article presents an end-to-end unsupervised domain adaptation (UDA) framework for the cross-modality person ReID, which can simultaneously address both of these challenges. This model employs source domain classes, target domain clusters, and unclustered instance samples for the training, maximizing the comprehensive use of the dataset. Moreover, it addresses the problem of mismatched clustering labels between the two modalities in the target domain by incorporating a label matching module that reassigns reliable clusters with labels, ensuring correspondence between different modality labels. We construct the loss function by incorporating distinctiveness loss and multiplicity loss, both of which are determined by the similarity of neighboring features in the predicted feature space and the difference between distant features. This approach enables efficient feature clustering and cluster class assignment to occur concurrently. Eight UDA cross-modality person ReID experiments are conducted on three real datasets and six synthetic datasets. The experimental results unequivocally demonstrate that the proposed model outperforms the existing state-of-the-art algorithms to a significant degree. Notably, in RegDB → RegDB_light, the Rank-1 accuracy exhibits a remarkable improvement of 8.24%.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	LAMP			Approved	no
Call Number	Admin @ si @ CQL2023			Serial	3884
Permanent link to this record



Author	Mohamed Ali Souibgui; Pau Torras; Jialuo Chen; Alicia Fornes
Title	An Evaluation of Handwritten Text Recognition Methods for Historical Ciphered Manuscripts			Type	Conference Article
Year	2023	Publication	7th International Workshop on Historical Document Imaging and Processing	Abbreviated Journal
Volume		Issue		Pages	7-12
Keywords
Abstract	This paper investigates the effectiveness of different deep learning HTR families, including LSTM, Seq2Seq, and transformer-based approaches with self-supervised pretraining, in recognizing ciphered manuscripts from different historical periods and cultures. The goal is to identify the most suitable method or training techniques for recognizing ciphered manuscripts and to provide insights into the challenges and opportunities in this field of research. We evaluate the performance of these models on several datasets of ciphered manuscripts and discuss their results. This study contributes to the development of more accurate and efficient methods for recognizing historical manuscripts for the preservation and dissemination of our cultural heritage.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	HIP
Notes	DAG			Approved	no
Call Number	Admin @ si @ STC2023			Serial	3849
Permanent link to this record



Author	Jose Luis Gomez; Manuel Silva; Antonio Seoane; Agnes Borras; Mario Noriega; German Ros; Jose Antonio Iglesias; Antonio Lopez
Title	All for One, and One for All: UrbanSyn Dataset, the third Musketeer of Synthetic Driving Scenes			Type	Miscellaneous
Year	2023	Publication	Arxiv	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	We introduce UrbanSyn, a photorealistic dataset acquired through semi-procedurally generated synthetic urban driving scenarios. Developed using high-quality geometry and materials, UrbanSyn provides pixel-level ground truth, including depth, semantic segmentation, and instance segmentation with object bounding boxes and occlusion degree. It complements GTAV and Synscapes datasets to form what we coin as the 'Three Musketeers'. We demonstrate the value of the Three Musketeers in unsupervised domain adaptation for image semantic segmentation. Results on real-world datasets, Cityscapes, Mapillary Vistas, and BDD100K, establish new benchmarks, largely attributed to UrbanSyn. We make UrbanSyn openly and freely accessible (this http URL).
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ADAS			Approved	no
Call Number	Admin @ si @ GSS2023			Serial	4015
Permanent link to this record



Author	Yi Xiao
Title	Advancing Vision-based End-to-End Autonomous Driving			Type	Book Whole
Year	2023	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	In autonomous driving, artificial intelligence (AI) processes the traffic environment to drive the vehicle to a desired destination. Currently, there are different paradigms that address the development of AI-enabled drivers. On the one hand, we find modular pipelines, which divide the driving task into sub-tasks such as perception, maneuver planning, and control. On the other hand, we find end-to-end driving approaches that attempt to learn the direct mapping of raw data from input sensors to vehicle control signals. The latter are relatively less studied but are gaining popularity as they are less demanding in terms of data labeling. Therefore, in this thesis, our goal is to investigate end-to-end autonomous driving. We propose to evaluate three approaches to tackle the challenge of end-to-end autonomous driving. First, we focus on the input, considering adding depth information as complementary to RGB data, in order to mimic the human being’s ability to estimate the distance to obstacles. Notice that, in the real world, these depth maps can be obtained either from a LiDAR sensor, or a trained monocular depth estimation module, where human labeling is not needed. Then, based on the intuition that the latent space of end-to-end driving models encodes relevant information for driving, we use it as prior knowledge for training an affordancebased driving model. In this case, the trained affordance-based model can achieve good performance while requiring less human-labeled data, and it can provide interpretability regarding driving actions. Finally, we present a new pure vision-based end-to-end driving model termed CIL++, which is trained by imitation learning. CIL++ leverages modern best practices, such as a large horizontal field of view and a self-attention mechanism, which are contributing to the agent’s understanding of the driving scene and bringing a better imitation of human drivers. Using training data without any human labeling, our model yields almost expert performance in the CARLA NoCrash benchmark and could rival SOTA models that require large amounts of human-labeled data.
Address
Corporate Author				Thesis	Ph.D. thesis
Publisher	IMPRIMA	Place of Publication		Editor	Antonio Lopez
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN	978-84-126409-4-6	Medium
Area		Expedition		Conference
Notes	ADAS			Approved	no
Call Number	Admin @ si @ Xia2023			Serial	3964
Permanent link to this record



Author	Jun Wan; Guodong Guo; Sergio Escalera; Hugo Jair Escalante; Stan Z Li
Title	Advances in Face Presentation Attack Detection			Type	Book Whole
Year	2023	Publication	Advances in Face Presentation Attack Detection	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	HUPBA			Approved	no
Call Number	Admin @ si @ WGE2023a			Serial	3955
Permanent link to this record



Author	Filip Szatkowski; Mateusz Pyla; Marcin Przewięzlikowski; Sebastian Cygert; Bartłomiej Twardowski; Tomasz Trzcinski
Title	Adapt Your Teacher: Improving Knowledge Distillation for Exemplar-Free Continual Learning			Type	Conference Article
Year	2023	Publication	Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops	Abbreviated Journal
Volume		Issue		Pages	3512-3517
Keywords
Abstract	In this work, we investigate exemplar-free class incremental learning (CIL) with knowledge distillation (KD) as a regularization strategy, aiming to prevent forgetting. KD-based methods are successfully used in CIL, but they often struggle to regularize the model without access to exemplars of the training data from previous tasks. Our analysis reveals that this issue originates from substantial representation shifts in the teacher network when dealing with out-of-distribution data. This causes large errors in the KD loss component, leading to performance degradation in CIL. Inspired by recent test-time adaptation methods, we introduce Teacher Adaptation (TA), a method that concurrently updates the teacher and the main model during incremental training. Our method seamlessly integrates with KD-based CIL approaches and allows for consistent enhancement of their performance across multiple exemplar-free CIL benchmarks.
Address	Paris; France; October 2023
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICCVW
Notes	LAMP			Approved	no
Call Number	Admin @ si @			Serial	3944
Permanent link to this record



Author	Sergi Garcia Bordils; Dimosthenis Karatzas; Marçal Rusiñol
Title	Accelerating Transformer-Based Scene Text Detection and Recognition via Token Pruning			Type	Conference Article
Year	2023	Publication	17th International Conference on Document Analysis and Recognition	Abbreviated Journal
Volume	14192	Issue		Pages	106-121
Keywords	Scene Text Detection; Scene Text Recognition; Transformer Acceleration
Abstract	Scene text detection and recognition is a crucial task in computer vision with numerous real-world applications. Transformer-based approaches are behind all current state-of-the-art models and have achieved excellent performance. However, the computational requirements of the transformer architecture makes training these methods slow and resource heavy. In this paper, we introduce a new token pruning strategy that significantly decreases training and inference times without sacrificing performance, striking a balance between accuracy and speed. We have applied this pruning technique to our own end-to-end transformer-based scene text understanding architecture. Our method uses a separate detection branch to guide the pruning of uninformative image features, which significantly reduces the number of tokens at the input of the transformer. Experimental results show how our network is able to obtain competitive results on multiple public benchmarks while running at significantly higher speeds.
Address	San Jose; CA; USA; August 2023
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICDAR
Notes	DAG			Approved	no
Call Number	Admin @ si @ GKR2023a			Serial	3907
Permanent link to this record