|
Gabriel Villalonga and Antonio Lopez. 2020. Co-Training for On-Board Deep Object Detection. ACCESS, 194441–194456.
Abstract: Providing ground truth supervision to train visual models has been a bottleneck over the years, exacerbated by domain shifts which degenerate the performance of such models. This was the case when visual tasks relied on handcrafted features and shallow machine learning and, despite its unprecedented performance gains, the problem remains open within the deep learning paradigm due to its data-hungry nature. Best performing deep vision-based object detectors are trained in a supervised manner by relying on human-labeled bounding boxes which localize class instances (i.e. objects) within the training images. Thus, object detection is one of such tasks for which human labeling is a major bottleneck. In this article, we assess co-training as a semi-supervised learning method for self-labeling objects in unlabeled images, thus reducing the human-labeling effort for developing deep object detectors. Our study pays special attention to a scenario involving domain shift; in particular, when we have automatically generated virtual-world images with object bounding boxes and we have real-world images which are unlabeled. Moreover, we are particularly interested in using co-training for deep object detection in the context of driver assistance systems and/or self-driving vehicles. Thus, using well-established datasets and protocols for object detection in these application contexts, we will show how co-training is a paradigm worth pursuing for alleviating object labeling, working both alone and together with task-agnostic domain adaptation.
|
|
|
Jiaolong Xu, Liang Xiao and Antonio Lopez. 2019. Self-supervised Domain Adaptation for Computer Vision Tasks. ACCESS, 7, 156694–156706.
Abstract: Recent progress of self-supervised visual representation learning has achieved remarkable success on many challenging computer vision benchmarks. However, whether these techniques can be used for domain adaptation has not been explored. In this work, we propose a generic method for self-supervised domain adaptation, using object recognition and semantic segmentation of urban scenes as use cases. Focusing on simple pretext/auxiliary tasks (e.g. image rotation prediction), we assess different learning strategies to improve domain adaptation effectiveness by self-supervision. Additionally, we propose two complementary strategies to further boost the domain adaptation accuracy on semantic segmentation within our method, consisting of prediction layer alignment and batch normalization calibration. The experimental results show adaptation levels comparable to most studied domain adaptation methods, thus bringing self-supervision as a new alternative for reaching domain adaptation. The code is available at https://github.com/Jiaolong/self-supervised-da.
|
|
|
Katerine Diaz, Jesus Martinez del Rincon, Aura Hernandez-Sabate and Debora Gil. 2018. Continuous head pose estimation using manifold subspace embedding and multivariate regression. ACCESS, 6, 18325–18334.
Abstract: In this paper, a continuous head pose estimation system is proposed to estimate yaw and pitch head angles from raw facial images. Our approach is based on manifold learning-based methods, due to their promising generalization properties shown for face modelling from images. The method combines histograms of oriented gradients, generalized discriminative common vectors and continuous local regression to achieve successful performance. Our proposal was tested on multiple standard face datasets, as well as in a realistic scenario. Results show a considerable performance improvement and a higher consistency of our model in comparison with other state-of-the-art methods, with angular errors varying between 9 and 17 degrees.
Keywords: Head Pose estimation; HOG features; Generalized Discriminative Common Vectors; B-splines; Multiple linear regression
|
|
|
A.F. Sole, S. Ngan, G. Sapiro, X. Hu and Antonio Lopez. 2001. Anisotropic 2-D and 3-D Averaging of fMRI Signals. IEEE Transactions on Medical Imaging, 20(2), 86–93.
|
|
|
J. Pladellorens, Joan Serrat, A. Castell and M.J. Yzuel. 1993. Using mathematical morphology to determine left ventricular contours.
|
|
|
Carme Julia, Angel Sappa, Felipe Lumbreras, Joan Serrat and Antonio Lopez. 2008. Rank Estimation in 3D Multibody Motion Segmentation. Electronics Letters, 44(4), 279–280.
Abstract: A novel technique for rank estimation in 3D multibody motion segmentation is proposed. It is based on the study of the frequency spectra of moving rigid objects and does not use or assume prior knowledge of the objects contained in the scene (i.e. number of objects and motion). The significance of rank estimation on multibody motion segmentation results is shown by using two motion segmentation algorithms over both synthetic and real data.
|
|
|
A. Pujol, Jordi Vitria, Felipe Lumbreras and Juan J. Villanueva. 2001. Topological principal component analysis for face encoding and recognition. PRL, 22(6-7), 769–776.
|
|
|
Angel Sappa and 6 others. 2016. Monocular visual odometry: A cross-spectral image fusion based approach. RAS, 85, 26–36.
Abstract: This manuscript evaluates the usage of fused cross-spectral images in a monocular visual odometry approach. Fused images are obtained through a Discrete Wavelet Transform (DWT) scheme, where the best setup is empirically obtained by means of a mutual information based evaluation metric. The objective is to have a flexible scheme where fusion parameters are adapted according to the characteristics of the given images. Visual odometry is computed from the fused monocular images using an off-the-shelf approach. Experimental results using data sets obtained with two different platforms are presented. Additionally, comparisons with a previous approach as well as with monocular-visible/infrared spectra are provided, showing the advantages of the proposed scheme.
Keywords: Monocular visual odometry; LWIR-RGB cross-spectral imaging; Image fusion
|
|
|
Miguel Oliveira, Victor Santos, Angel Sappa, P. Dias and A. Moreira. 2016. Incremental Scenario Representations for Autonomous Driving using Geometric Polygonal Primitives. RAS, 83, 312–325.
Abstract: When an autonomous vehicle is traveling through some scenario it receives a continuous stream of sensor data. This sensor data arrives in an asynchronous fashion and often contains overlapping or redundant information. Thus, it is not trivial how a representation of the environment observed by the vehicle can be created and updated over time. This paper presents a novel methodology to compute an incremental 3D representation of a scenario from 3D range measurements. We propose to use macro scale polygonal primitives to model the scenario. This means that the representation of the scene is given as a list of large scale polygons that describe the geometric structure of the environment. Furthermore, we propose mechanisms designed to update the geometric polygonal primitives over time whenever fresh sensor data is collected. Results show that the approach is capable of producing accurate descriptions of the scene, and that it is computationally very efficient when compared to other reconstruction techniques.
Keywords: Incremental scene reconstruction; Point clouds; Autonomous vehicles; Polygonal primitives
|
|
|
Meysam Madadi, Sergio Escalera, Jordi Gonzalez, Xavier Roca and Felipe Lumbreras. 2015. Multi-part body segmentation based on depth maps for soft biometry analysis. PRL, 56, 14–21.
Abstract: This paper presents a novel method for extracting biometric measures using depth sensors. Given multi-part labeled training data, a new subject is aligned to the best model of the dataset, and soft biometrics such as lengths or circumference sizes of limbs and body are computed. The process is performed by training relevant pose clusters, defining a representative model, and fitting a 3D shape context descriptor within an iterative matching procedure. We show robust measures by applying orthogonal plates to the body hull. We test our approach on a novel full-body RGB-Depth data set, showing accurate estimation of soft biometrics and better segmentation accuracy in comparison with a random forest approach, without requiring large training data.
Keywords: 3D shape context; 3D point cloud alignment; Depth maps; Human body segmentation; Soft biometry analysis
|
|