Publicacions CVC -- Query Results

Dimosthenis Karatzas, Sergi Robles, & Lluis Gomez. (2014). An on-line platform for ground truthing and performance evaluation of text extraction systems. In 11th IAPR International Workshop on Document Analysis and Systems (pp. 242–246). Abstract: This paper presents a set of on-line software tools for creating ground truth and calculating performance evaluation metrics for text extraction tasks such as localization, segmentation and recognition. The platform supports the definition of comprehensive ground truth information at different text representation levels while it offers centralised management and quality control of the ground truthing effort. It implements a range of state of the art performance evaluation algorithms and offers functionality for the definition of evaluation scenarios, on-line calculation of various performance metrics and visualisation of the results. The presented platform, which comprises the backbone of the ICDAR 2011 (challenge 1) and 2013 (challenges 1 and 2) Robust Reading competitions, is now made available for public use. http://refbase.cvc.uab.es/show.php?record=2491
Dimosthenis Karatzas, V. Poulain d'Andecy, & Marçal Rusiñol. (2016). Human-Document Interaction – a new frontier for document image analysis. In 12th IAPR Workshop on Document Analysis Systems (pp. 369–374). Abstract: All indications show that paper documents will not cede in favour of their digital counterparts, but will instead be used increasingly in conjunction with digital information. An open challenge is how to seamlessly link the physical with the digital – how to continue taking advantage of the important affordances of paper, without missing out on digital functionality. This paper presents the authors’ experience with developing systems for Human-Document Interaction based on augmented document interfaces and examines new challenges and opportunities arising for the document image analysis field in this area. The system presented combines state of the art camera-based document image analysis techniques with a range of complementary technologies to offer fluid Human-Document Interaction. Both fixed and nomadic setups are discussed that have gone through user testing in real-life environments, and use cases are presented that span the spectrum from business to educational application. http://refbase.cvc.uab.es/show.php?record=2756
Dimosthenis Karatzas, & Ch. Lioutas. (1998). Software Package Development for Electron Diffraction Image Analysis. In Proceedings of the XIV Solid State Physics National Conference. http://refbase.cvc.uab.es/show.php?record=2045
Dipam Goswami, R Schuster, Joost Van de Weijer, & Didier Stricker. (2023). Attribution-aware Weight Transfer: A Warm-Start Initialization for Class-Incremental Semantic Segmentation. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (pp. 3195–3204). http://refbase.cvc.uab.es/show.php?record=3901
Dipam Goswami, Yuyang Liu, Bartlomiej Twardowski, & Joost Van de Weijer. (2023). FeCAM: Exploiting the Heterogeneity of Class Distributions in Exemplar-Free Continual Learning. In 37th Annual Conference on Neural Information Processing Systems. Abstract: Poster http://refbase.cvc.uab.es/show.php?record=3934
Domicele Jonauskaite, Lucia Camenzind, C. Alejandro Parraga, Cecile N Diouf, Mathieu Mercapide Ducommun, Lauriane Müller, et al. (2021). Colour-emotion associations in individuals with red-green colour blindness. PeerJ, 9, e11180. Abstract: Colours and emotions are associated in languages and traditions. Some of us may convey sadness by saying feeling blue or by wearing black clothes at funerals. The first example is a conceptual experience of colour and the second example is an immediate perceptual experience of colour. To investigate whether one or the other type of experience more strongly drives colour-emotion associations, we tested 64 congenitally red-green colour-blind men and 66 non-colour-blind men. All participants associated 12 colours, presented as terms or patches, with 20 emotion concepts, and rated intensities of the associated emotions. We found that colour-blind and non-colour-blind men associated similar emotions with colours, irrespective of whether colours were conveyed via terms (r = .82) or patches (r = .80). The colour-emotion associations and the emotion intensities were not modulated by participants' severity of colour blindness. Hinting at some additional, although minor, role of actual colour perception, the consistencies in associations for colour terms and patches were higher in non-colour-blind than colour-blind men. Together, these results suggest that colour-emotion associations in adults do not require immediate perceptual colour experiences, as conceptual experiences are sufficient. Keywords: Affect; Chromotherapy; Colour cognition; Colour vision deficiency; Cross-modal correspondences; Daltonism; Deuteranopia; Dichromatic; Emotion; Protanopia. http://refbase.cvc.uab.es/show.php?record=3564
Domicele Jonauskaite, Nele Dael, C. Alejandro Parraga, Laetitia Chevre, Alejandro Garcia Sanchez, & Christine Mohr. (2018). Stripping #The Dress: The importance of contextual information on inter-individual differences in colour perception. PSYCHO R - Psychological Research, 1–15. Abstract: In 2015, a picture of a Dress (henceforth the Dress) triggered popular and scientific interest; some reported seeing the Dress in white and gold (W&G) and others in blue and black (B&B). We aimed to describe the phenomenon and investigate the role of contextualization. A few days after the Dress had appeared on the Internet, we projected it to 240 students on two large screens in the classroom. Participants reported seeing the Dress in B&B (48%), W&G (38%), or blue and brown (B&Br; 7%). Amongst numerous socio-demographic variables, we only observed that W&G viewers were most likely to have always seen the Dress as W&G. In the laboratory, we tested how much contextual information is necessary for the phenomenon to occur. Fifty-seven participants selected colours most precisely matching predominant colours of parts or the full Dress. We presented, in this order, small squares (a), vertical strips (b), and the full Dress (c). We found that (1) B&B, B&Br, and W&G viewers had selected colours differing in lightness and chroma levels for contextualized images only (b, c conditions) and hue for fully contextualized condition only (c) and (2) B&B viewers selected colours most closely matching displayed colours of the Dress. Thus, the Dress phenomenon emerges due to inter-individual differences in subjectively perceived lightness, chroma, and hue, at least when all aspects of the picture need to be integrated. Our results support the previous conclusions that contextual information is key to colour perception; it should be important to understand how this actually happens. http://refbase.cvc.uab.es/show.php?record=3149
Dong Wang, Jia Guo, Qiqi Shao, Haochi He, Zhian Chen, Chuanbao Xiao, et al. (2023). Wild Face Anti-Spoofing Challenge 2023: Benchmark and Results. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (pp. 6379–6390). Abstract: Face anti-spoofing (FAS) is an essential mechanism for safeguarding the integrity of automated face recognition systems. Despite substantial advancements, the generalization of existing approaches to real-world applications remains challenging. This limitation can be attributed to the scarcity and lack of diversity in publicly available FAS datasets, which often leads to overfitting during training or saturation during testing. In terms of quantity, the number of spoof subjects is a critical determinant. Most datasets comprise fewer than 2,000 subjects. With regard to diversity, the majority of datasets consist of spoof samples collected in controlled environments using repetitive, mechanical processes. This data collection methodology results in homogenized samples and a dearth of scenario diversity. To address these shortcomings, we introduce the Wild Face Anti-Spoofing (WFAS) dataset, a large-scale, diverse FAS dataset collected in unconstrained settings. Our dataset encompasses 853,729 images of 321,751 spoof subjects and 529,571 images of 148,169 live subjects, representing a substantial increase in quantity. Moreover, our dataset incorporates spoof data obtained from the internet, spanning a wide array of scenarios and various commercial sensors, including 17 presentation attacks (PAs) that encompass both 2D and 3D forms. This novel data collection strategy markedly enhances FAS data diversity. Leveraging the WFAS dataset and Protocol 1 (Known-Type), we host the Wild Face Anti-Spoofing Challenge at the CVPR2023 workshop. Additionally, we meticulously evaluate representative methods using Protocol 1 and Protocol 2 (Unknown-Type). Through an in-depth examination of the challenge outcomes and benchmark baselines, we provide insightful analyses and propose potential avenues for future research. The dataset is released under Insightface. http://refbase.cvc.uab.es/show.php?record=3919
Dorota Kaminska, Kadir Aktas, Davit Rizhinashvili, Danila Kuklyanov, Abdallah Hussein Sham, Sergio Escalera, et al. (2021). Two-stage Recognition and Beyond for Compound Facial Emotion Recognition. ELEC - Electronics, 10(22), 2847. Abstract: Facial emotion recognition is an inherently complex problem due to individual diversity in facial features and racial and cultural differences. Moreover, facial expressions typically reflect the mixture of people’s emotional statuses, which can be expressed using compound emotions. Compound facial emotion recognition makes the problem even more difficult because the discrimination between dominant and complementary emotions is usually weak. We have created a database that includes 31,250 facial images with different emotions of 115 subjects whose gender distribution is almost uniform to address compound emotion recognition. In addition, we have organized a competition based on the proposed dataset, held at FG workshop 2020. This paper analyzes the winner’s approach—a two-stage recognition method (1st stage, coarse recognition; 2nd stage, fine recognition), which enhances the classification of symmetrical emotion labels. Keywords: compound emotion recognition; facial expression recognition; dominant and complementary emotion recognition; deep learning http://refbase.cvc.uab.es/show.php?record=3642
Dustin Carrion Ojeda, Hong Chen, Adrian El Baz, Sergio Escalera, Chaoyu Guan, Isabelle Guyon, et al. (2022). NeurIPS’22 Cross-Domain MetaDL competition: Design and baseline results. In Understanding Social Behavior in Dyadic and Small Group Interactions (Vol. 191, pp. 24–37). Abstract: We present the design and baseline results for a new challenge in the ChaLearn meta-learning series, accepted at NeurIPS'22, focusing on “cross-domain” meta-learning. Meta-learning aims to leverage experience gained from previous tasks to solve new tasks efficiently (i.e., with better performance, little training data, and/or modest computational resources). While previous challenges in the series focused on within-domain few-shot learning problems, with the aim of learning efficiently N-way k-shot tasks (i.e., N class classification problems with k training examples), this competition challenges the participants to solve “any-way” and “any-shot” problems drawn from various domains (healthcare, ecology, biology, manufacturing, and others), chosen for their humanitarian and societal impact. To that end, we created Meta-Album, a meta-dataset of 40 image classification datasets from 10 domains, from which we carve out tasks with any number of “ways” (within the range 2-20) and any number of “shots” (within the range 1-20). The competition is with code submission, fully blind-tested on the CodaLab challenge platform. The code of the winners will be open-sourced, enabling the deployment of automated machine learning solutions for few-shot image classification across several domains. http://refbase.cvc.uab.es/show.php?record=3802
E Fernandez-Nofrerias, J. Mauri, A. Tovar, L. Cano, E. Martinez, C. Julia, et al. (2001). Correspondencia de las imagenes de angiografia y ecografia intracoronaria: La fusion. http://refbase.cvc.uab.es/show.php?record=91
E. Barakova, Maya Dimitrova, T. Lorents, & Petia Radeva. (2004). The Web as an “Autobiographical Agent”. http://refbase.cvc.uab.es/show.php?record=475
E. Bondi, L. Seidenari, Andrew Bagdanov, & Alberto del Bimbo. (2014). Real-time people counting from depth imagery of crowded environments. In 11th IEEE International Conference on Advanced Video and Signal based Surveillance (pp. 337–342). Abstract: In this paper we describe a system for automatic people counting in crowded environments. The approach we propose is a counting-by-detection method based on depth imagery. It is designed to be deployed as an autonomous appliance for crowd analysis in video surveillance application scenarios. Our system performs foreground/background segmentation on depth image streams in order to coarsely segment persons, then depth information is used to localize head candidates which are then tracked in time on an automatically estimated ground plane. The system runs in real-time, at a frame-rate of about 20 fps. We collected a dataset of RGB-D sequences representing three typical and challenging surveillance scenarios, including crowds, queuing and groups. An extensive comparative evaluation is given between our system and more complex, Latent SVM-based head localization for person counting applications. http://refbase.cvc.uab.es/show.php?record=2540
E. Ceron. (2000). Programacion visual de tareas de pick and place: modulo de vision 3D. http://refbase.cvc.uab.es/show.php?record=348
E. Pastor, A. Agueda, Juan Andrade, M. Muñoz, Y. Perez, & E. Planas. (2006). Computing the rate of spread of linear flame fronts by thermal image processing. Fire Safety Journal, 41(8), 569–579. http://refbase.cvc.uab.es/show.php?record=679
