Antonio Lopez, Joan Serrat, J. Saludes, Cristina Cañero, Felipe Lumbreras, & T. Graf. (2005). Ridgeness for Detecting Lane Markings.
Antonio Lopez, Ricardo Toledo, Joan Serrat, & Juan J. Villanueva. (1999). Extraction of vessel centerlines from 2D coronary angiographies.
Antonio Lopez, W. Niessen, Joan Serrat, K. Nicolay, Bart M. Ter Haar Romeny, Juan J. Villanueva, et al. (2000). New improvements in the multiscale analysis of trabecular bone patterns.
Antonio Lopez, W. Niessen, Joan Serrat, K. Nicolay, Bart M. Ter Haar Romeny, Juan J. Villanueva, et al. (1999). New improvements in the multiscale analysis of trabecular bone patterns.
Arya Farkhondeh, Cristina Palmero, Simone Scardapane, & Sergio Escalera. (2022). Towards Self-Supervised Gaze Estimation.
Abstract: Recent joint embedding-based self-supervised methods have surpassed standard supervised approaches on various image recognition tasks such as image classification. These self-supervised methods aim at maximizing agreement between features extracted from two differently transformed views of the same image, which results in learning an invariant representation with respect to appearance and geometric image transformations. However, the effectiveness of these approaches remains unclear in the context of gaze estimation, a structured regression task that requires equivariance under geometric transformations (e.g., rotations, horizontal flip). In this work, we propose SwAT, an equivariant version of the online clustering-based self-supervised approach SwAV, to learn more informative representations for gaze estimation. We demonstrate that SwAT, with ResNet-50 and supported with uncurated unlabeled face images, outperforms state-of-the-art gaze estimation methods and supervised baselines in various experiments. In particular, we achieve up to 57% and 25% improvements in cross-dataset and within-dataset evaluation tasks on existing benchmarks (ETH-XGaze, Gaze360, and MPIIFaceGaze).
Ayan Banerjee, Sanket Biswas, Josep Llados, & Umapada Pal. (2024). GraphKD: Exploring Knowledge Distillation Towards Document Object Detection with Structured Graph Creation.
Abstract: Object detection in documents is a key step to automate the structural elements identification process in a digital or scanned document through understanding the hierarchical structure and relationships between different elements. Large and complex models, while achieving high accuracy, can be computationally expensive and memory-intensive, making them impractical for deployment on resource-constrained devices. Knowledge distillation allows us to create small and more efficient models that retain much of the performance of their larger counterparts. Here we present a graph-based knowledge distillation framework to correctly identify and localize the document objects in a document image. We design a structured graph with nodes containing proposal-level features and edges representing the relationship between the different proposal regions. Also, to reduce text bias, an adaptive node sampling strategy is designed to prune the weight distribution and put more weight on non-text nodes. We encode the complete graph as a knowledge representation and transfer it from the teacher to the student through the proposed distillation loss by effectively capturing both local and global information concurrently. Extensive experimentation on competitive benchmarks demonstrates that the proposed framework outperforms the current state-of-the-art approaches. The code will be available at: this https URL.
Azadeh S. Mozafari, David Vazquez, Mansour Jamzad, & Antonio Lopez. (2016). Node-Adapt, Path-Adapt and Tree-Adapt: Model-Transfer Domain Adaptation for Random Forest.
Abstract: Random Forest (RF) is a successful paradigm for learning classifiers due to its ability to learn from large feature spaces and seamlessly integrate multi-class classification, as well as the achieved accuracy and processing efficiency. However, like many other classifiers, RF requires domain adaptation (DA) when there is a mismatch between the training (source) and testing (target) domains, which degrades classification. Consequently, different RF-DA methods have been proposed, which require not only target-domain samples but also revisiting the source-domain ones. As a novelty, we propose three inherently different methods (Node-Adapt, Path-Adapt and Tree-Adapt) that only require the learned source-domain RF and relatively few target-domain samples for DA, i.e. source-domain samples do not need to be available. To assess the performance of our proposals we focus on image-based object detection, using the pedestrian detection problem as a challenging proof-of-concept. Moreover, we use the RF with expert nodes because it is a competitive patch-based pedestrian model. We test our Node-, Path- and Tree-Adapt methods on standard benchmarks, showing that DA is largely achieved.
Keywords: Domain Adaptation; Pedestrian detection; Random Forest
B. Moghaddam, David Guillamet, & Jordi Vitria. (2003). Local Appearance-Based Models using High-Order Statistics of Image Features.
B. Moghaddam, David Guillamet, & Jordi Vitria. (2003). Local Appearance-Based Models using High-Order Statistics of Image Features.
Bart M. Ter Haar Romeny, W. Niessen, J. Weickert, P. Van Roermund, W. Van Enk, Antonio Lopez, et al. (1996). Orientation detection of trabecular bone.
Bhaskar Chakraborty. (2008). View-Invariant Human-Body Detection with Extension to Human Action Recognition using Component Wise HMM of Body Parts.
Bogdan Raducanu, & Jordi Vitria. (2006). Aprendiendo a Aprender: de Maquinas Listas a Maquinas Inteligentes.
Bogdan Raducanu, & Jordi Vitria. (2005). Real-Time Face Tracking for Context-Aware Computing.
Bonifaz Stuhr, Jurgen Brauer, Bernhard Schick, & Jordi Gonzalez. (2023). Masked Discriminators for Content-Consistent Unpaired Image-to-Image Translation.
Abstract: A common goal of unpaired image-to-image translation is to preserve content consistency between source images and translated images while mimicking the style of the target domain. Due to biases between the datasets of both domains, many methods suffer from inconsistencies caused by the translation process. Most approaches introduced to mitigate these inconsistencies do not constrain the discriminator, leading to an even more ill-posed training setup. Moreover, none of these approaches is designed for larger crop sizes. In this work, we show that masking the inputs of a global discriminator for both domains with a content-based mask is sufficient to reduce content inconsistencies significantly. However, this strategy leads to artifacts that can be traced back to the masking process. To reduce these artifacts, we introduce a local discriminator that operates on pairs of small crops selected with a similarity sampling strategy. Furthermore, we apply this sampling strategy to sample global input crops from the source and target dataset. In addition, we propose feature-attentive denormalization to selectively incorporate content-based statistics into the generator stream. In our experiments, we show that our method achieves state-of-the-art performance in photorealistic sim-to-real translation and weather translation and also performs well in day-to-night translation. Additionally, we propose the cKVD metric, which builds on the sKVD metric and enables the examination of translation quality at the class or category level.
C. Mariño, M.G. Penas, M. Penedo, David Lloret, & M.J. Carreira. (2001). Integration of Mutual Information and Creaseness Based Methods for the Automatic Registration of SLO Sequences.