Records | |||||
---|---|---|---|---|---|
Author | Yi Xiao; Felipe Codevilla; Diego Porres; Antonio Lopez | ||||
Title | Scaling Vision-Based End-to-End Autonomous Driving with Multi-View Attention Learning | Type | Conference Article | ||
Year | 2023 | Publication | International Conference on Intelligent Robots and Systems | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | In end-to-end driving, human driving demonstrations are used to train perception-based driving models by imitation learning. This process is supervised by vehicle signals (e.g., steering angle, acceleration) but does not require extra costly supervision (human labeling of sensor data). As a representative vision-based end-to-end driving model, CILRS is commonly used as a baseline against which new driving models are compared. So far, some recent models achieve better performance than CILRS by using expensive sensor suites and/or large amounts of human-labeled data for training. Given this performance gap, one may think that vision-based pure end-to-end driving is not worth pursuing. However, we argue that this approach still has great value and potential considering cost and maintenance. In this paper, we present CIL++, which improves on CILRS by both processing higher-resolution images, using a human-inspired horizontal field of view (HFOV) as an inductive bias, and incorporating a proper attention mechanism. CIL++ achieves competitive performance compared to models that are more costly to develop. We propose to replace CILRS with CIL++ as a strong vision-based pure end-to-end driving baseline, supervised only by vehicle signals and trained by conditional imitation learning. | ||||
Address | Detroit; USA; October 2023 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | IROS | ||
Notes | ADAS | Approved | no | ||
Call Number | Admin @ si @ XCP2023 | Serial | 3930 | ||
Author | Weijia Wu; Yuzhong Zhao; Zhuang Li; Jiahong Li; Mike Zheng Shou; Umapada Pal; Dimosthenis Karatzas; Xiang Bai | ||||
Title | ICDAR 2023 Competition on Video Text Reading for Dense and Small Text | Type | Conference Article | ||
Year | 2023 | Publication | 17th International Conference on Document Analysis and Recognition | Abbreviated Journal | |
Volume | 14188 | Issue | Pages | 405–419 | |
Keywords | Video Text Spotting; Small Text; Text Tracking; Dense Text | ||||
Abstract | Recently, video text detection, tracking, and recognition in natural scenes have become very popular in the computer vision community. However, most existing algorithms and benchmarks focus on common text cases (e.g., normal size and density) and a single scenario, while ignoring extreme video text challenges, i.e., dense and small text in various scenarios. In this competition report, we establish a video text reading benchmark, named DSText, which focuses on the dense and small text reading challenge in videos with various scenarios. Compared with previous datasets, the proposed dataset mainly includes three new challenges: 1) dense video texts, a new challenge for video text spotters; 2) high-proportioned small texts; 3) various new scenarios, e.g., ‘Game’, ‘Sports’, etc. The proposed DSText includes 100 video clips from 12 open scenarios, supporting two tasks: video text tracking (Task 1) and end-to-end video text spotting (Task 2). During the competition period (opened on 15th February 2023 and closed on 20th March 2023), a total of 24 teams participated in the proposed tasks with around 30 valid submissions. In this article, we describe detailed statistical information of the dataset, the tasks, the evaluation protocols, and the results summary of the ICDAR 2023 DSText competition. Moreover, we hope the benchmark will promote video text research in the community. | ||||
Address | San Jose; CA; USA; August 2023 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ WZL2023 | Serial | 3898 | ||
Author | Kai Wang; Fei Yang; Shiqi Yang; Muhammad Atif Butt; Joost Van de Weijer | ||||
Title | Dynamic Prompt Learning: Addressing Cross-Attention Leakage for Text-Based Image Editing | Type | Conference Article | ||
Year | 2023 | Publication | 37th Annual Conference on Neural Information Processing Systems | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Poster | ||||
Address | New Orleans; USA; December 2023 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | NEURIPS | ||
Notes | LAMP | Approved | no | ||
Call Number | Admin @ si @ WYY2023 | Serial | 3935 | ||
Author | Fei Yang; Kai Wang; Joost Van de Weijer | ||||
Title | ScrollNet: Dynamic Weight Importance for Continual Learning | Type | Conference Article | ||
Year | 2023 | Publication | Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops | Abbreviated Journal | |
Volume | Issue | Pages | 3345-3355 | ||
Keywords | |||||
Abstract | The principle underlying most existing continual learning (CL) methods is to prioritize stability by penalizing changes in parameters crucial to old tasks, while allowing for plasticity in other parameters. The importance of weights for each task can be determined either explicitly, by learning a task-specific mask during training (e.g., parameter isolation-based approaches), or implicitly, by introducing a regularization term (e.g., regularization-based approaches). However, all these methods assume that the importance of weights for each task is unknown prior to data exposure. In this paper, we propose ScrollNet, a scrolling neural network for continual learning. ScrollNet can be seen as a dynamic network that assigns the ranking of weight importance for each task before data exposure, thus achieving a more favorable stability-plasticity trade-off during sequential task learning by reassigning this ranking for different tasks. Additionally, we demonstrate that ScrollNet can be combined with various CL methods, including regularization-based and replay-based approaches. Experimental results on the CIFAR100 and TinyImageNet datasets show the effectiveness of our proposed method. | ||||
Address | Paris; France; October 2023 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICCVW | ||
Notes | LAMP | Approved | no | ||
Call Number | Admin @ si @ WWW2023 | Serial | 3945 | ||
Author | Chenshen Wu; Joost Van de Weijer | ||||
Title | Density Map Distillation for Incremental Object Counting | Type | Conference Article | ||
Year | 2023 | Publication | Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops | Abbreviated Journal | |
Volume | Issue | Pages | 2505-2514 | ||
Keywords | |||||
Abstract | We investigate the problem of incremental learning for object counting, where a method must learn to count a variety of object classes from a sequence of datasets. A naïve approach to incremental object counting would suffer from catastrophic forgetting: a dramatic performance drop on previous tasks. In this paper, we propose a new exemplar-free functional regularization method called Density Map Distillation (DMD). During training, we introduce a new counter head for each task and a distillation loss to prevent forgetting of previous tasks. Additionally, we introduce a cross-task adaptor that projects the features of the current backbone to the previous backbone. This projector allows for the learning of new features while the backbone retains the relevant features for previous tasks. Finally, we set up experiments on incremental learning for counting new objects. Results confirm that our method greatly reduces catastrophic forgetting and outperforms existing methods. | ||||
Address | Vancouver; Canada; June 2023 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVPRW | ||
Notes | LAMP | Approved | no | ||
Call Number | Admin @ si @ WuW2023 | Serial | 3916 | ||
Author | Chenshen Wu | ||||
Title | Going beyond Classification Problems for the Continual Learning of Deep Neural Networks | Type | Book Whole | ||
Year | 2023 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Deep learning has made tremendous progress in the last decade due to the explosion of training data and computational power. Through end-to-end training on a large dataset, image representations are more discriminative than the previously used hand-crafted features. However, for many real-world applications, training and testing on a single dataset is not realistic, as the test distribution may change over time. Continual learning takes this situation into account: the learner must adapt to a sequence of tasks, each with a different distribution. If one naively continues training the model on a new task, the model's performance drops dramatically on the previously learned data. This phenomenon is known as catastrophic forgetting. Many approaches have been proposed to address this problem, which can be divided into three main categories: regularization-based, rehearsal-based, and parameter isolation-based approaches. However, most existing works focus on image classification tasks, and many other computer vision tasks have not been well explored in the continual learning setting. Therefore, in this thesis, we study continual learning for image generation, object re-identification, and object counting. For the image generation problem, since the model can generate images from the previously learned tasks, it is free to apply rehearsal without any limitation. We developed two methods based on generative replay. The first one uses the generated images for joint training together with the new data. The second one is based on output pixel-wise alignment. We extensively evaluate these methods on several benchmarks. Next, we study continual learning for object re-identification (ReID). Although most state-of-the-art methods for ReID and continual ReID use a softmax-triplet loss, we found that it is better to approach the ReID problem from a meta-learning perspective, because continual learning of ReID can benefit greatly from the generalization of meta-learning. We also propose a distillation loss and found that removing the positive pairs before applying the distillation loss is critical. Finally, we study continual learning for the counting problem. We study the mainstream method based on density maps and propose a new approach for density map distillation. We found that fixing the counter head is crucial for the continual learning of object counting. To further improve results, we propose an adaptor that adapts the changing feature extractor to the fixed counter head. Extensive evaluation shows that this results in improved continual learning performance. | ||||
Address | |||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | IMPRIMA | Place of Publication | Editor | Joost Van de Weijer; Bogdan Raducanu |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-84-126409-0-8 | Medium | ||
Area | Expedition | Conference | |||
Notes | LAMP | Approved | no | ||
Call Number | Admin @ si @ Wu2023 | Serial | 3960 | ||
Author | Yifan Wang; Luka Murn; Luis Herranz; Fei Yang; Marta Mrak; Wei Zhang; Shuai Wan; Marc Gorriz Blanch | ||||
Title | Efficient Super-Resolution for Compression Of Gaming Videos | Type | Conference Article | ||
Year | 2023 | Publication | IEEE International Conference on Acoustics, Speech and Signal Processing | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Due to the increasing demand for game-streaming services, efficient compression of computer-generated video is more critical than ever, especially when the available bandwidth is low. This paper proposes a super-resolution framework that improves the coding efficiency of computer-generated gaming videos at low bitrates. Most state-of-the-art super-resolution networks generalize over a variety of RGB inputs and use a unified network architecture for frames with different levels of degradation, leading to high complexity and redundancy. Since games usually consist of a limited number of fixed scenarios, we specialize one model for each scenario and assign appropriate network capacities for different QPs to perform super-resolution under the guidance of reconstructed high-quality luma components. Experimental results show that our framework achieves a superior quality-complexity trade-off compared to the ESRnet baseline, saving up to 93.59% of parameters while maintaining comparable performance. Compression efficiency over HEVC is also improved, with a BD-rate gain of more than 17%. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICASSP | ||
Notes | LAMP; MACO | Approved | no | ||
Call Number | Admin @ si @ WMH2023 | Serial | 3911 | ||
Author | Maciej Wielgosz; Antonio Lopez; Muhamad Naveed Riaz | ||||
Title | CARLA-BSP: a simulated dataset with pedestrians | Type | Miscellaneous | ||
Year | 2023 | Publication | arXiv | Abbreviated Journal |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | We present a sample dataset featuring pedestrians generated using the ARCANE framework, a new framework for generating datasets in CARLA (0.9.13). We provide use cases for pedestrian detection, autoencoding, pose estimation, and pose lifting. We also showcase baseline results. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ADAS | Approved | no | ||
Call Number | Admin @ si @ WLN2023 | Serial | 3866 | ||
Author | Dong Wang; Jia Guo; Qiqi Shao; Haochi He; Zhian Chen; Chuanbao Xiao; Ajian Liu; Sergio Escalera; Hugo Jair Escalante; Zhen Lei; Jun Wan; Jiankang Deng | ||||
Title | Wild Face Anti-Spoofing Challenge 2023: Benchmark and Results | Type | Conference Article | ||
Year | 2023 | Publication | Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops | Abbreviated Journal | |
Volume | Issue | Pages | 6379-6390 | ||
Keywords | |||||
Abstract | Face anti-spoofing (FAS) is an essential mechanism for safeguarding the integrity of automated face recognition systems. Despite substantial advancements, the generalization of existing approaches to real-world applications remains challenging. This limitation can be attributed to the scarcity and lack of diversity in publicly available FAS datasets, which often leads to overfitting during training or saturation during testing. In terms of quantity, the number of spoof subjects is a critical determinant; most datasets comprise fewer than 2,000 subjects. With regard to diversity, the majority of datasets consist of spoof samples collected in controlled environments using repetitive, mechanical processes. This data collection methodology results in homogenized samples and a dearth of scenario diversity. To address these shortcomings, we introduce the Wild Face Anti-Spoofing (WFAS) dataset, a large-scale, diverse FAS dataset collected in unconstrained settings. Our dataset encompasses 853,729 images of 321,751 spoof subjects and 529,571 images of 148,169 live subjects, representing a substantial increase in quantity. Moreover, our dataset incorporates spoof data obtained from the internet, spanning a wide array of scenarios and various commercial sensors, including 17 presentation attacks (PAs) that encompass both 2D and 3D forms. This novel data collection strategy markedly enhances FAS data diversity. Leveraging the WFAS dataset and Protocol 1 (Known-Type), we host the Wild Face Anti-Spoofing Challenge at the CVPR 2023 workshop. Additionally, we meticulously evaluate representative methods using Protocol 1 and Protocol 2 (Unknown-Type). Through an in-depth examination of the challenge outcomes and benchmark baselines, we provide insightful analyses and propose potential avenues for future research. The dataset is released under InsightFace. | ||||
Address | Vancouver; Canada; June 2023 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVPRW | ||
Notes | HUPBA | Approved | no | ||
Call Number | Admin @ si @ WGS2023 | Serial | 3919 | ||
Author | Jun Wan; Guodong Guo; Sergio Escalera; Hugo Jair Escalante; Stan Z Li | ||||
Title | Best Solutions Proposed in the Context of the Face Anti-spoofing Challenge Series | Type | Book Chapter | ||
Year | 2023 | Publication | Advances in Face Presentation Attack Detection | Abbreviated Journal | |
Volume | Issue | Pages | 37–78 | ||
Keywords | |||||
Abstract | The PAD competitions we organized attracted more than 835 teams from home and abroad, most of them from industry, which shows that the topic of face anti-spoofing is closely related to daily life and that there is an urgent need for advanced algorithms to meet its application needs. Specifically, the ChaLearn LAP multi-modal face anti-spoofing attack detection challenge attracted more than 300 teams for the development phase, with a total of 13 teams qualifying for the final round; the ChaLearn Face Anti-spoofing Attack Detection Challenge attracted 340 teams in the development stage, and finally 11 and 8 teams submitted their code in the single-modal and multi-modal face anti-spoofing recognition challenges, respectively; the 3D High-Fidelity Mask Face Presentation Attack Detection Challenge attracted 195 teams for the development phase, with a total of 18 teams qualifying for the final round. All the results were verified and re-run by the organizing team, and the results were used for the final ranking. In this chapter, we briefly review the methods developed by the teams participating in each competition and describe the algorithms of the top-three ranked teams in detail. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HUPBA | Approved | no | ||
Call Number | Admin @ si @ WGE2023d | Serial | 3958 | ||
Author | Jun Wan; Guodong Guo; Sergio Escalera; Hugo Jair Escalante; Stan Z Li | ||||
Title | Face Anti-spoofing Progress Driven by Academic Challenges | Type | Book Chapter | ||
Year | 2023 | Publication | Advances in Face Presentation Attack Detection | Abbreviated Journal | |
Volume | Issue | Pages | 1–15 | ||
Keywords | |||||
Abstract | With the ubiquity of facial authentication systems and the prevalence of security cameras around the world, the impact that facial presentation attack techniques may have is huge. However, research progress in this field has been slowed by a number of factors, including the lack of appropriate and realistic datasets, ethical and privacy issues that prevent the recording and distribution of facial images, and the little attention that the community has given to potential ethnic biases, among others. This chapter provides an overview of contributions derived from the organization of academic challenges in the context of face anti-spoofing detection. Specifically, we discuss the limitations of benchmarks and summarize our efforts to boost research in the community via participation in academic challenges. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | SLCV | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HUPBA | Approved | no | ||
Call Number | Admin @ si @ WGE2023c | Serial | 3957 | ||
Author | Jun Wan; Guodong Guo; Sergio Escalera; Hugo Jair Escalante; Stan Z Li | ||||
Title | Face Presentation Attack Detection (PAD) Challenges | Type | Book Chapter | ||
Year | 2023 | Publication | Advances in Face Presentation Attack Detection | Abbreviated Journal | |
Volume | Issue | Pages | 17–35 | ||
Keywords | |||||
Abstract | In recent years, the security of face recognition systems has been increasingly threatened. Face anti-spoofing (FAS) is essential for securing face recognition systems against various attacks. In order to attract researchers and push forward the state of the art in face presentation attack detection (PAD), we organized three editions of the Face Anti-spoofing Workshop and Competition at CVPR 2019, CVPR 2020, and ICCV 2021, which have attracted more than 800 teams from academia and industry and have greatly promoted algorithms that overcome many challenging problems. In this chapter, we introduce the detailed competition process, including the challenge phases, timeline, and evaluation metrics. Along with each workshop, we introduce the corresponding dataset for each competition, including data acquisition details, data processing, statistics, and evaluation protocol. Finally, we provide links to download the datasets used in the challenges. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | SLCV | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HUPBA | Approved | no | ||
Call Number | Admin @ si @ WGE2023b | Serial | 3956 | ||
Author | Jun Wan; Guodong Guo; Sergio Escalera; Hugo Jair Escalante; Stan Z Li | ||||
Title | Advances in Face Presentation Attack Detection | Type | Book Whole | ||
Year | 2023 | Publication | Advances in Face Presentation Attack Detection | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HUPBA | Approved | no | ||
Call Number | Admin @ si @ WGE2023a | Serial | 3955 | ||
Author | Hao Wu; Alejandro Ariza-Casabona; Bartłomiej Twardowski; Tri Kurniawan Wijaya | ||||
Title | MM-GEF: Multi-modal representation meet collaborative filtering | Type | Miscellaneous | ||
Year | 2023 | Publication | arXiv | Abbreviated Journal |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | In modern e-commerce, item content features in various modalities offer accurate yet comprehensive information to recommender systems. The majority of previous work either focuses on learning effective item representations while modelling user-item interactions, or explores item-item relationships by analysing multi-modal features. Those methods, however, fail to incorporate the collaborative item-user-item relationships into the multi-modal feature-based item structure. In this work, we propose a graph-based item structure enhancement method, MM-GEF: Multi-Modal recommendation with Graph Early-Fusion, which effectively combines the latent item structure underlying multi-modal contents with collaborative signals. Instead of processing the content features in different modalities separately, we show that early fusion of multi-modal features provides significant improvement. MM-GEF learns refined item representations by injecting structural information obtained from both multi-modal and collaborative signals. Through extensive experiments on four publicly available datasets, we demonstrate systematic improvements of our method over state-of-the-art multi-modal recommendation methods. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | LAMP | Approved | no | ||
Call Number | Admin @ si @ WAT2023 | Serial | 3988 | ||
Author | Diego Velazquez; Pau Rodriguez; Alexandre Lacoste; Issam H. Laradji; Xavier Roca; Jordi Gonzalez | ||||
Title | Evaluating Counterfactual Explainers | Type | Journal | ||
Year | 2023 | Publication | Transactions on Machine Learning Research | Abbreviated Journal | TMLR |
Volume | Issue | Pages | |||
Keywords | Explainability; Counterfactuals; XAI | ||||
Abstract | Explainability methods have been widely used to provide insight into the decisions made by statistical models, thus facilitating their adoption in various domains within the industry. Counterfactual explanation methods aim to improve our understanding of a model by perturbing samples in a way that would alter its response in an unexpected manner. This information is helpful for users and for machine learning practitioners to understand and improve their models. Given the value provided by counterfactual explanations, there is a growing interest in the research community to investigate and propose new methods. However, we identify two issues that could hinder the progress in this field. (1) Existing metrics do not accurately reflect the value of an explainability method for the users. (2) Comparisons between methods are usually performed with datasets like CelebA, where images are annotated with attributes that do not fully describe them and with subjective attributes such as “Attractive”. In this work, we address these problems by proposing an evaluation method with a principled metric to evaluate and compare different counterfactual explanation methods. The evaluation method is based on a synthetic dataset where images are fully described by their annotated attributes. As a result, we are able to perform a fair comparison of multiple explainability methods in the recent literature, obtaining insights about their performance. We make the code public for the benefit of the research community. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ISE | Approved | no | ||
Call Number | Admin @ si @ VRL2023 | Serial | 3891 | ||