Publicacions CVC -- Query Results

Galadrielle Humblot-Renaux, Sergio Escalera, & Thomas B. Moeslund. (2023). Beyond AUROC & co. for evaluating out-of-distribution detection performance. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (pp. 3880–3889). Abstract: While there has been a growing research interest in developing out-of-distribution (OOD) detection methods, there has been comparably little discussion around how these methods should be evaluated. Given their relevance for safe(r) AI, it is important to examine whether the basis for comparing OOD detection methods is consistent with practical needs. In this work, we take a closer look at the go-to metrics for evaluating OOD detection, and question the approach of exclusively reducing OOD detection to a binary classification task with little consideration for the detection threshold. We illustrate the limitations of current metrics (AUROC & its friends) and propose a new metric – Area Under the Threshold Curve (AUTC), which explicitly penalizes poor separation between ID and OOD samples. Scripts and data are available at https://github.com/glhr/beyond-auroc http://refbase.cvc.uab.es/show.php?record=3918
Dong Wang, Jia Guo, Qiqi Shao, Haochi He, Zhian Chen, Chuanbao Xiao, et al. (2023). Wild Face Anti-Spoofing Challenge 2023: Benchmark and Results. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (pp. 6379–6390). Abstract: Face anti-spoofing (FAS) is an essential mechanism for safeguarding the integrity of automated face recognition systems. Despite substantial advancements, the generalization of existing approaches to real-world applications remains challenging. This limitation can be attributed to the scarcity and lack of diversity in publicly available FAS datasets, which often leads to overfitting during training or saturation during testing. In terms of quantity, the number of spoof subjects is a critical determinant. Most datasets comprise fewer than 2,000 subjects. With regard to diversity, the majority of datasets consist of spoof samples collected in controlled environments using repetitive, mechanical processes. This data collection methodology results in homogenized samples and a dearth of scenario diversity. To address these shortcomings, we introduce the Wild Face Anti-Spoofing (WFAS) dataset, a large-scale, diverse FAS dataset collected in unconstrained settings. Our dataset encompasses 853,729 images of 321,751 spoof subjects and 529,571 images of 148,169 live subjects, representing a substantial increase in quantity. Moreover, our dataset incorporates spoof data obtained from the internet, spanning a wide array of scenarios and various commercial sensors, including 17 presentation attacks (PAs) that encompass both 2D and 3D forms. This novel data collection strategy markedly enhances FAS data diversity. Leveraging the WFAS dataset and Protocol 1 (Known-Type), we host the Wild Face Anti-Spoofing Challenge at the CVPR2023 workshop. Additionally, we meticulously evaluate representative methods using Protocol 1 and Protocol 2 (Unknown-Type). Through an in-depth examination of the challenge outcomes and benchmark baselines, we provide insightful analyses and propose potential avenues for future research. The dataset is released under Insightface. http://refbase.cvc.uab.es/show.php?record=3919
Senmao Li, Joost Van de Weijer, Yaxing Wang, Fahad Shahbaz Khan, Meiqin Liu, & Jian Yang. (2023). 3D-Aware Multi-Class Image-to-Image Translation with NeRFs. In 36th IEEE Conference on Computer Vision and Pattern Recognition (pp. 12652–12662). Abstract: Recent advances in 3D-aware generative models (3D-aware GANs) combined with Neural Radiance Fields (NeRF) have achieved impressive results. However, no prior works investigate 3D-aware GANs for 3D consistent multiclass image-to-image (3D-aware I2I) translation. Naively using 2D-I2I translation methods suffers from unrealistic shape/identity change. To perform 3D-aware multiclass I2I translation, we decouple this learning process into a multiclass 3D-aware GAN step and a 3D-aware I2I translation step. In the first step, we propose two novel techniques: a new conditional architecture and an effective training strategy. In the second step, based on the well-trained multiclass 3D-aware GAN architecture, which preserves view-consistency, we construct a 3D-aware I2I translation system. To further reduce the view-consistency problems, we propose several new techniques, including a U-net-like adaptor network design, a hierarchical representation constraint and a relative regularization loss. In extensive experiments on two datasets, quantitative and qualitative results demonstrate that we successfully perform 3D-aware I2I translation with multi-view consistency. Code is available in 3DI2I. http://refbase.cvc.uab.es/show.php?record=3920
Hugo Bertiche, Niloy J Mitra, Kuldeep Kulkarni, Chun Hao Paul Huang, Tuanfeng Y Wang, Meysam Madadi, et al. (2023). Blowing in the Wind: CycleNet for Human Cinemagraphs from Still Images. In 36th IEEE Conference on Computer Vision and Pattern Recognition (pp. 459–468). Abstract: Cinemagraphs are short looping videos created by adding subtle motions to a static image. This kind of media is popular and engaging. However, automatic generation of cinemagraphs is an underexplored area and current solutions require tedious low-level manual authoring by artists. In this paper, we present an automatic method that allows generating human cinemagraphs from single RGB images. We investigate the problem in the context of dressed humans under the wind. At the core of our method is a novel cyclic neural network that produces looping cinemagraphs for the target loop duration. To circumvent the problem of collecting real data, we demonstrate that it is possible, by working in the image normal space, to learn garment motion dynamics on synthetic data and generalize to real data. We evaluate our method on both synthetic and real data and demonstrate that it is possible to create compelling and plausible cinemagraphs from single RGB images. http://refbase.cvc.uab.es/show.php?record=3921
Fernando Vilariño, Dan Norton, & Onur Ferhat. (2015). Memory Fields: DJs in the Library. In 21st Symposium of Electronic Arts. http://refbase.cvc.uab.es/show.php?record=2800
X. Orriols, & X. Binefa. (2001). An EM Algorithm for Video Summarization, Generative Model Approach. http://refbase.cvc.uab.es/show.php?record=199
Rafael E. Rivadeneira, Angel Sappa, & Boris X. Vintimilla. (2020). Thermal Image Super-resolution: A Novel Architecture and Dataset. In 15th International Conference on Computer Vision Theory and Applications (pp. 111–119). Abstract: This paper proposes a novel CycleGAN architecture for thermal image super-resolution, together with a large dataset consisting of thermal images at different resolutions. The dataset has been acquired using three thermal cameras at different resolutions, which acquire images from the same scenario at the same time. The thermal cameras are mounted in a rig that minimizes the baseline distance, which eases the registration problem. The proposed architecture is based on ResNet6 as the generator and PatchGAN as the discriminator. The proposed unsupervised super-resolution training (CycleGAN) is possible thanks to the aforementioned thermal images, which show the same scenario at different resolutions. The proposed approach is evaluated on the dataset and compared with classical bicubic interpolation. The dataset and the network are available. http://refbase.cvc.uab.es/show.php?record=3432
Jorge Charco, Angel Sappa, Boris X. Vintimilla, & Henry Velesaca. (2020). Transfer Learning from Synthetic Data in the Camera Pose Estimation Problem. In 15th International Conference on Computer Vision Theory and Applications. Abstract: This paper presents a novel Siamese network architecture, as a variant of ResNet-50, to estimate the relative camera pose in multi-view environments. In order to improve the performance of the proposed model, a transfer learning strategy based on synthetic images obtained from a virtual world is considered. The transfer learning consists of first training the network using pairs of images from the virtual-world scenario considering different conditions (i.e., weather, illumination, objects, buildings, etc.); then, the learned weights of the network are transferred to the real case, where images from real-world scenarios are considered. Experimental results and comparisons with the state of the art show both improvements in the relative pose estimation accuracy using the proposed model and further improvements when the transfer learning strategy (from synthetic-world data to real-world data) is used to tackle the limitation on training caused by the reduced number of pairs of real images in most public data sets. http://refbase.cvc.uab.es/show.php?record=3433
C. Santa-Marta, Jaume Garcia, A. Bajo, J.J. Vaquero, M. Ledesma-Carbayo, & Debora Gil. (2008). Influence of the Temporal Resolution on the Quantification of Displacement Fields in Cardiac Magnetic Resonance Tagged Images. In S. A. Roberto Hornero (Ed.), XXVI Congreso Anual de la Sociedad Española de Ingenieria Biomedica (pp. 352–353). Abstract: It is difficult to acquire tagged cardiac MR images with a high temporal and spatial resolution using clinical MR scanners. However, if such images are used for quantifying scores based on motion, a resolution as high as possible is essential. This paper explores the influence of the temporal resolution of a tagged series on the quantification of myocardial dynamic parameters. For this purpose we have designed a SPAMM (Spatial Modulation of Magnetization) sequence allowing acquisition of sequences at single and double temporal resolution. Sequences are processed to compute myocardial motion by an automatic technique based on the tracking of the harmonic phase of tagged images (the Harmonic Phase Flow, HPF). The results have been compared to manual tracking of myocardial tags. The error in displacement fields for double resolution sequences is reduced by 17%. http://refbase.cvc.uab.es/show.php?record=1033
F. Javier Sanchez, & Jorge Bernal. (2018). Use of Software Tools for Real-time Monitoring of Learning Processes: Application to Compilers subject. In 4th International Conference of Higher Education Advances (pp. 1359–1366). Abstract: The effective implementation of the Higher European Education Area has meant a change in the focus of the learning process, with the student now at its very center. This shift of focus requires strong involvement and fluent communication between teachers and students to succeed. Considering the difficulties associated with motivating students to take a more active role in the learning process, we explore how the use of a software tool can help both actors to improve the learning experience. We present a tool that can help students obtain instantaneous feedback on their progress in the subject, while providing teachers with useful information about the evolution of knowledge acquisition in each of the subject areas. We compare the performance achieved by students in two academic years: results show an improvement in overall performance which, after observing graphs provided by our tool, can be associated with an increase in students' interest in the subject. Keywords: Monitoring; Evaluation tool; Gamification; Student motivation http://refbase.cvc.uab.es/show.php?record=3165
Ana Maria Ares, Jorge Bernal, Maria Jesus Nozal, F. Javier Sanchez, & Jose Bernal. (2018). Results of the use of Kahoot! gamification tool in a course of Chemistry. In 4th International Conference on Higher Education Advances (pp. 1215–1222). Abstract: The present study examines the use of Kahoot! as a gamification tool to explore mixed learning strategies. We analyze its use in two different groups of a theoretical subject of the third course of the Degree in Chemistry. An empirical-analytical methodology was applied, using Kahoot! in two different groups of students, with different frequencies. The academic results of these two groups of students were compared with each other and with those obtained in the previous course, in which Kahoot! was not employed, with the aim of measuring the evolution in the students' knowledge. The results showed, in all cases, that the use of Kahoot! has led to a significant increase in the overall marks, and in the number of students who passed the subject. Moreover, some differences were also observed in students' academic performance according to the group. Finally, it can be concluded that the use of a gamification tool (Kahoot!) in a university classroom generally improved students' learning and marks, and that this improvement is more prevalent in those students who achieved a better Kahoot! performance. http://refbase.cvc.uab.es/show.php?record=3246
Karla Lizbeth Caballero, Joel Barajas, Oriol Pujol, J. Mauri, & Petia Radeva. (2006). Using Radio Frequency Reconstructed IVUS Images in Tissue Classification. http://refbase.cvc.uab.es/show.php?record=761
David Rotger, Petia Radeva, & Oriol Rodriguez. (2006). Vessel Tortuosity Extraction from IVUS Images. http://refbase.cvc.uab.es/show.php?record=762
Robert Benavente, Ernest Valveny, Jaume Garcia, Agata Lapedriza, Miquel Ferrer, & Gemma Sanchez. (2008). Una experiencia de adaptacion al EEES de las asignaturas de programacion en Ingenieria Informatica. http://refbase.cvc.uab.es/show.php?record=1031
Monica Piñol, Angel Sappa, Angeles Lopez, & Ricardo Toledo. (2012). Feature Selection Based on Reinforcement Learning for Object Recognition. In Adaptive Learning Agents Workshop (pp. 33–39). http://refbase.cvc.uab.es/show.php?record=2018
