|   | 
Details
   web
Records
Author Soumya Jahagirdar; Minesh Mathew; Dimosthenis Karatzas; CV Jawahar
Title Understanding Video Scenes Through Text: Insights from Text-Based Video Question Answering Type Conference Article
Year 2023 Publication Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Researchers have extensively studied the field of vision and language, discovering that both visual and textual content is crucial for understanding scenes effectively. Particularly, comprehending text in videos holds great significance, requiring both scene text understanding and temporal reasoning. This paper focuses on exploring two recently introduced datasets, NewsVideoQA and M4-ViteVQA, which aim to address video question answering based on textual content. The NewsVideoQA dataset contains question-answer pairs related to the text in news videos, while M4- ViteVQA comprises question-answer pairs from diverse categories like vlogging, traveling, and shopping. We provide an analysis of the formulation of these datasets on various levels, exploring the degree of visual understanding and multi-frame comprehension required for answering the questions. Additionally, the study includes experimentation with BERT-QA, a text-only model, which demonstrates comparable performance to the original methods on both datasets, indicating the shortcomings in the formulation of these datasets. Furthermore, we also look into the domain adaptation aspect by examining the effectiveness of training on M4-ViteVQA and evaluating on NewsVideoQA and vice-versa, thereby shedding light on the challenges and potential benefits of out-of-domain training.
Address Paris; France; October 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language (down) Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICCVW
Notes DAG Approved no
Call Number Admin @ si @ JMK2023 Serial 3946
Permanent link to this record
 

 
Author Dawid Rymarczyk; Joost van de Weijer; Bartosz Zielinski; Bartlomiej Twardowski
Title ICICLE: Interpretable Class Incremental Continual Learning. Dawid Rymarczyk Type Conference Article
Year 2023 Publication 20th IEEE International Conference on Computer Vision Abbreviated Journal
Volume Issue Pages 1887-1898
Keywords
Abstract Continual learning enables incremental learning of new tasks without forgetting those previously learned, resulting in positive knowledge transfer that can enhance performance on both new and old tasks. However, continual learning poses new challenges for interpretability, as the rationale behind model predictions may change over time, leading to interpretability concept drift. We address this problem by proposing Interpretable Class-InCremental LEarning (ICICLE), an exemplar-free approach that adopts a prototypical part-based approach. It consists of three crucial novelties: interpretability regularization that distills previously learned concepts while preserving user-friendly positive reasoning; proximity-based prototype initialization strategy dedicated to the fine-grained setting; and task-recency bias compensation devoted to prototypical parts. Our experimental results demonstrate that ICICLE reduces the interpretability concept drift and outperforms the existing exemplar-free methods of common class-incremental learning when applied to concept-based models.
Address Paris; France; October 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language (down) Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICCV
Notes LAMP Approved no
Call Number Admin @ si @ RWZ2023 Serial 3947
Permanent link to this record
 

 
Author Jordy Van Landeghem; Ruben Tito; Lukasz Borchmann; Michal Pietruszka; Pawel Joziak; Rafal Powalski; Dawid Jurkiewicz; Mickael Coustaty; Bertrand Anckaert; Ernest Valveny; Matthew Blaschko; Sien Moens; Tomasz Stanislawek
Title Document Understanding Dataset and Evaluation (DUDE) Type Conference Article
Year 2023 Publication 20th IEEE International Conference on Computer Vision Abbreviated Journal
Volume Issue Pages 19528-19540
Keywords
Abstract We call on the Document AI (DocAI) community to re-evaluate current methodologies and embrace the challenge of creating more practically-oriented benchmarks. Document Understanding Dataset and Evaluation (DUDE) seeks to remediate the halted research progress in understanding visually-rich documents (VRDs). We present a new dataset with novelties related to types of questions, answers, and document layouts based on multi-industry, multi-domain, and multi-page VRDs of various origins and dates. Moreover, we are pushing the boundaries of current methods by creating multi-task and multi-domain evaluation setups that more accurately simulate real-world situations where powerful generalization and adaptation under low-resource settings are desired. DUDE aims to set a new standard as a more practical, long-standing benchmark for the community, and we hope that it will lead to future extensions and contributions that address real-world challenges. Finally, our work illustrates the importance of finding more efficient ways to model language, images, and layout in DocAI.
Address Paris; France; October 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language (down) Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICCV
Notes DAG Approved no
Call Number Admin @ si @ LTB2023 Serial 3948
Permanent link to this record
 

 
Author Yuyang Liu; Yang Cong; Dipam Goswami; Xialei Liu; Joost Van de Weijer
Title Augmented Box Replay: Overcoming Foreground Shift for Incremental Object Detection Type Conference Article
Year 2023 Publication 20th IEEE International Conference on Computer Vision Abbreviated Journal
Volume Issue Pages 11367-11377
Keywords
Abstract In incremental learning, replaying stored samples from previous tasks together with current task samples is one of the most efficient approaches to address catastrophic forgetting. However, unlike incremental classification, image replay has not been successfully applied to incremental object detection (IOD). In this paper, we identify the overlooked problem of foreground shift as the main reason for this. Foreground shift only occurs when replaying images of previous tasks and refers to the fact that their background might contain foreground objects of the current task. To overcome this problem, a novel and efficient Augmented Box Replay (ABR) method is developed that only stores and replays foreground objects and thereby circumvents the foreground shift problem. In addition, we propose an innovative Attentive RoI Distillation loss that uses spatial attention from region-of-interest (RoI) features to constrain current model to focus on the most important information from old model. ABR significantly reduces forgetting of previous classes while maintaining high plasticity in current classes. Moreover, it considerably reduces the storage requirements when compared to standard image replay. Comprehensive experiments on Pascal-VOC and COCO datasets support the state-of-the-art performance of our model.
Address Paris; France; October 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language (down) Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICCV
Notes LAMP Approved no
Call Number Admin @ si @ LCG2023 Serial 3949
Permanent link to this record
 

 
Author Guillermo Torres; Debora Gil; Antoni Rosell; S. Mena; Carles Sanchez
Title Virtual Radiomics Biopsy for the Histological Diagnosis of Pulmonary Nodules Type Conference Article
Year 2023 Publication 37th International Congress and Exhibition is organized by Computer Assisted Radiology and Surgery Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Pòster
Address Munich; Germany; June 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language (down) Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CARS
Notes IAM Approved no
Call Number Admin @ si @ TGR2023a Serial 3950
Permanent link to this record
 

 
Author Sonia Baeza; Debora Gil; Carles Sanchez; Guillermo Torres; Ignasi Garcia Olive; Ignasi Guasch; Samuel Garcia Reina; Felipe Andreo; Jose Luis Mate; Jose Luis Vercher; Antonio Rosell
Title Biopsia virtual radiomica para el diagnóstico histológico de nódulos pulmonares – Resultados intermedios del proyecto Radiolung Type Conference Article
Year 2023 Publication SEPAR Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Pòster
Address Granada; Spain; June 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language (down) Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference SEPAR
Notes IAM Approved no
Call Number Admin @ si @ BGS2023 Serial 3951
Permanent link to this record
 

 
Author Debora Gil; Guillermo Torres; Carles Sanchez
Title Transforming radiomic features into radiological words Type Conference Article
Year 2023 Publication IEEE International Symposium on Biomedical Imaging Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Pòster
Address Cartagena de Indias; Colombia; April 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language (down) Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ISBI
Notes IAM Approved no
Call Number Admin @ si @ GTS2023 Serial 3952
Permanent link to this record
 

 
Author Pau Cano; Debora Gil; Eva Musulen
Title Towards automatic detection of helicobacter pylori in histological samples of gastric tissue Type Conference Article
Year 2023 Publication IEEE International Symposium on Biomedical Imaging Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address Cartagena de Indias; Colombia; April 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language (down) Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ISBI
Notes IAM Approved no
Call Number Admin @ si @ CGM2023 Serial 3953
Permanent link to this record
 

 
Author Guillermo Torres; Debora Gil; Antonio Rosell; Sonia Baeza; Carles Sanchez
Title A radiomic biopsy for virtual histology of pulmonary nodules Type Conference Article
Year 2023 Publication IEEE International Symposium on Biomedical Imaging Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Pòster
Address Cartagena de Indias; Colombia; April 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language (down) Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ISBI
Notes IAM Approved no
Call Number Admin @ si @ TGR2023b Serial 3954
Permanent link to this record
 

 
Author Jun Wan; Guodong Guo; Sergio Escalera; Hugo Jair Escalante; Stan Z Li
Title Advances in Face Presentation Attack Detection Type Book Whole
Year 2023 Publication Advances in Face Presentation Attack Detection Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language (down) Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HUPBA Approved no
Call Number Admin @ si @ WGE2023a Serial 3955
Permanent link to this record
 

 
Author Jun Wan; Guodong Guo; Sergio Escalera; Hugo Jair Escalante; Stan Z Li
Title Face Presentation Attack Detection (PAD) Challenges Type Book Chapter
Year 2023 Publication Advances in Face Presentation Attack Detection Abbreviated Journal
Volume Issue Pages 17–35
Keywords
Abstract In recent years, the security of face recognition systems has been increasingly threatened. Face Anti-spoofing (FAS) is essential to secure face recognition systems primarily from various attacks. In order to attract researchers and push forward the state of the art in Face Presentation Attack Detection (PAD), we organized three editions of Face Anti-spoofing Workshop and Competition at CVPR 2019, CVPR 2020, and ICCV 2021, which have attracted more than 800 teams from academia and industry, and greatly promoted the algorithms to overcome many challenging problems. In this chapter, we introduce the detailed competition process, including the challenge phases, timeline and evaluation metrics. Along with the workshop, we will introduce the corresponding dataset for each competition including data acquisition details, data processing, statistics, and evaluation protocol. Finally, we provide the available link to download the datasets used in the challenges.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language (down) Original Title
Series Editor Series Title Abbreviated Series Title SLCV
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HUPBA Approved no
Call Number Admin @ si @ WGE2023b Serial 3956
Permanent link to this record
 

 
Author Jun Wan; Guodong Guo; Sergio Escalera; Hugo Jair Escalante; Stan Z Li
Title Best Solutions Proposed in the Context of the Face Anti-spoofing Challenge Series Type Book Chapter
Year 2023 Publication Advances in Face Presentation Attack Detection Abbreviated Journal
Volume Issue Pages 37–78
Keywords
Abstract The PAD competitions we organized attracted more than 835 teams from home and abroad, most of them from the industry, which shows that the topic of face anti-spoofing is closely related to daily life, and there is an urgent need for advanced algorithms to solve its application needs. Specifically, the Chalearn LAP multi-modal face anti-spoofing attack detection challenge attracted more than 300 teams for the development phase with a total of 13 teams qualifying for the final round; the Chalearn Face Anti-spoofing Attack Detection Challenge attracted 340 teams in the development stage, and finally, 11 and 8 teams have submitted their codes in the single-modal and multi-modal face anti-spoofing recognition challenges, respectively; the 3D High-Fidelity Mask Face Presentation Attack Detection Challenge attracted 195 teams for the development phase with a total of 18 teams qualifying for the final round. All the results were verified and re-run by the organizing team, and the results were used for the final ranking. In this chapter, we briefly the methods developed by the teams participating in each competition, and introduce the algorithm details of the top-three ranked teams in detail.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language (down) Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HUPBA Approved no
Call Number Admin @ si @ WGE2023d Serial 3958
Permanent link to this record
 

 
Author Jun Wan; Guodong Guo; Sergio Escalera; Hugo Jair Escalante; Stan Z Li
Title Face Anti-spoofing Progress Driven by Academic Challenges Type Book Chapter
Year 2023 Publication Advances in Face Presentation Attack Detection Abbreviated Journal
Volume Issue Pages 1–15
Keywords
Abstract With the ubiquity of facial authentication systems and the prevalence of security cameras around the world, the impact that facial presentation attack techniques may have is huge. However, research progress in this field has been slowed by a number of factors, including the lack of appropriate and realistic datasets, ethical and privacy issues that prevent the recording and distribution of facial images, the little attention that the community has given to potential ethnic biases among others. This chapter provides an overview of contributions derived from the organization of academic challenges in the context of face anti-spoofing detection. Specifically, we discuss the limitations of benchmarks and summarize our efforts in trying to boost research by the community via the participation in academic challenges
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language (down) Original Title
Series Editor Series Title Abbreviated Series Title SLCV
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HUPBA Approved no
Call Number Admin @ si @ WGE2023c Serial 3957
Permanent link to this record
 

 
Author Armin Mehri
Title Deep learning based architectures for cross-domain image processing Type Book Whole
Year 2023 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Human vision is restricted to the visual-optical spectrum. Machine vision is not.
Cameras sensitive to diverse infrared spectral bands can improve the capacities of
autonomous systems and provide a comprehensive view. Relevant scene content
can be made visible, particularly in situations when sensors of other modalities,
such as a visual-optical camera, require a source of illumination. As a result, increasing the level of automation not only avoids human errors but also reduces
machine-induced errors. Furthermore, multi-spectral sensor systems with infrared
imagery as one modality are a rich source of information and can conceivably
increase the robustness of many autonomous systems. Robotics, automobiles,
biometrics, security, surveillance, and the military are some examples of fields
that can profit from the use of infrared imagery in their respective applications.
Although multimodal spectral sensors have come a long way, there are still several
bottlenecks that prevent us from combining their output information and using
them as comprehensive images. The primary issue with infrared imaging is the lack
of potential benefits due to their cost influence on sensor resolution, which grows
exponentially with greater resolution. Due to the more costly sensor technology
required for their development, their resolutions are substantially lower than thoseof regular digital cameras.
This thesis aims to improve beyond-visible-spectrum machine vision by integrating multi-modal spectral sensors. The emphasis is on transforming the produced images to enhance their resolution to match expected human perception, bring the color representation close to human understanding of natural color, and improve machine vision application performance. This research focuses mainly on two tasks, image Colorization and Image Super resolution for both single- and cross-domain problems. We first start with an extensive review of the state of the art in both tasks, point out the shortcomings of existing approaches, and then present our solutions to address their limitations. Our solutions demonstrate that low-cost channel information (i.e., visible image) can be used to improve expensive channel
information (i.e., infrared image), resulting in images with higher quality and closer to human perception at a lower cost than a high-cost infrared camera.
Address
Corporate Author Thesis Ph.D. thesis
Publisher IMPRIMA Place of Publication Editor Angel Sappa
Language Summary Language (down) Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-126409-1-5 Medium
Area Expedition Conference
Notes MSIAU Approved no
Call Number Admin @ si @ Meh2023 Serial 3959
Permanent link to this record
 

 
Author Chenshen Wu
Title Going beyond Classification Problems for the Continual Learning of Deep Neural Networks Type Book Whole
Year 2023 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Deep learning has made tremendous progress in the last decade due to the explosion of training data and computational power. Through end-to-end training on a
large dataset, image representations are more discriminative than the previously
used hand-crafted features. However, for many real-world applications, training
and testing on a single dataset is not realistic, as the test distribution may change over time. Continuous learning takes this situation into account, where the learner must adapt to a sequence of tasks, each with a different distribution. If you would naively continue training the model with a new task, the performance of the model would drop dramatically for the previously learned data. This phenomenon is known as catastrophic forgetting.
Many approaches have been proposed to address this problem, which can be divided into three main categories: regularization-based approaches, rehearsal-based
approaches, and parameter isolation-based approaches. However, most of the existing works focus on image classification tasks and many other computer vision tasks
have not been well-explored in the continual learning setting. Therefore, in this
thesis, we study continual learning for image generation, object re-identification,
and object counting.
For the image generation problem, since the model can generate images from the previously learned task, it is free to apply rehearsal without any limitation. We developed two methods based on generative replay. The first one uses the generated image for joint training together with the new data. The second one is based on
output pixel-wise alignment. We extensively evaluate these methods on several
benchmarks.
Next, we study continual learning for object Re-Identification (ReID). Although
most state-of-the-art methods of ReID and continual ReID use softmax-triplet loss,
we found that it is better to solve the ReID problem from a meta-learning perspective because continual learning of reID can benefit a lot from the generalization of metalearning. We also propose a distillation loss and found that the removal of the positive pairs before the distillation loss is critical.
Finally, we study continual learning for the counting problem. We study the mainstream method based on density maps and propose a new approach for density
map distillation. We found that fixing the counter head is crucial for the continual learning of object counting. To further improve results, we propose an adaptor to adapt the changing feature extractor for the fixed counter head. Extensive evaluation shows that this results in improved continual learning performance.
Address
Corporate Author Thesis Ph.D. thesis
Publisher IMPRIMA Place of Publication Editor Joost Van de Weijer;Bogdan Raducanu
Language Summary Language (down) Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-126409-0-8 Medium
Area Expedition Conference
Notes LAMP Approved no
Call Number Admin @ si @ Wu2023 Serial 3960
Permanent link to this record