|
Fadi Dornaika and Angel Sappa. 2009. A Featureless and Stochastic Approach to On-board Stereo Vision System Pose. IMAVIS, 27(9), 1382–1393.
Abstract: This paper presents a direct and stochastic technique for real-time estimation of an on-board stereo head’s position and orientation. Unlike existing works, which rely on feature extraction either in the image domain or in 3D space, our proposed approach directly estimates the unknown parameters from the stream of stereo pairs’ brightness. The pose parameters are tracked using the particle filtering framework, which implicitly enforces the smoothness constraints on the estimated parameters. The proposed technique can be used with driver assistance applications as well as with augmented reality applications. Extended experiments on urban environments with different road geometries are presented. Comparisons with a 3D data-based approach are presented. Moreover, we provide a performance study aiming at evaluating the accuracy of the proposed approach.
Keywords: On-board stereo vision system; Pose estimation; Featureless approach; Particle filtering; Image warping
|
|
|
Debora Gil, Aura Hernandez-Sabate, Mireia Brunat, Steven Jansen and Jordi Martinez-Vilalta. 2011. Structure-preserving smoothing of biomedical images. PR, 44(9), 1842–1851.
Abstract: Smoothing of biomedical images should preserve gray-level transitions between adjacent tissues, while restoring contours consistent with anatomical structures. Anisotropic diffusion operators are based on image appearance discontinuities (either local or contextual) and might fail at weak inter-tissue transitions. Meanwhile, the output of block-wise and morphological operations is prone to present a block structure due to the shape and size of the considered pixel neighborhood. In this contribution, we use differential geometry concepts to define a diffusion operator that restricts to image-consistent level-sets. In this manner, the final state is a non-uniform intensity image presenting homogeneous inter-tissue transitions along anatomical structures, while smoothing intra-structure texture. Experiments on different types of medical images (magnetic resonance, computerized tomography) illustrate its benefit for subsequent processing of the images, such as segmentation.
Keywords: Non-linear smoothing; Differential geometry; Anatomical structures; Segmentation; Cardiac magnetic resonance; Computerized tomography
|
|
|
Cristhian Aguilera, Fernando Barrera, Felipe Lumbreras, Angel Sappa and Ricardo Toledo. 2012. Multispectral Image Feature Points. SENS, 12(9), 12661–12672.
Abstract: This paper presents a feature point descriptor for the multispectral image case: Far-Infrared and Visible Spectrum images. It allows matching interest points on images of the same scene but acquired in different spectral bands. Initially, points of interest are detected on both images through a SIFT-like scale space representation. Then, these points are characterized using an Edge Oriented Histogram (EOH) descriptor. Finally, points of interest from multispectral images are matched by finding nearest couples using the information from the descriptor. The provided experimental results and comparisons with similar methods show both the validity of the proposed approach and the improvements it offers with respect to the current state-of-the-art.
Keywords: multispectral image descriptor; color and infrared images; feature point descriptor
|
|
|
Naveen Onkarappa and Angel Sappa. 2015. Synthetic sequences and ground-truth flow field generation for algorithm validation. MTAP, 74(9), 3121–3135.
Abstract: Research in computer vision advances through the availability of good datasets that help to improve algorithms, validate results and obtain comparative analyses. The datasets can be real or synthetic. For some computer vision problems, such as optical flow, it is not possible to obtain ground-truth optical flow with high accuracy in natural outdoor real scenarios directly with any sensor, although it is possible to obtain ground-truth data of real scenarios in a laboratory setup with limited motion. In this difficult situation, computer graphics offers a viable option for creating realistic virtual scenarios. In the current work we present a framework to design virtual scenes and generate sequences as well as ground-truth flow fields. In particular, we generate a dataset containing sequences of driving scenarios. The sequences in the dataset vary in the speed of the on-board vision system, the road texture, the complexity of the vehicle motion and the presence of independently moving vehicles in the scene. This dataset enables the analysis and adaptation of existing optical flow methods, and can lead to new approaches, particularly for driver assistance systems.
Keywords: Ground-truth optical flow; Synthetic sequence; Algorithm validation
|
|
|
Adrien Gaidon, Antonio Lopez and Florent Perronnin. 2018. The Reasonable Effectiveness of Synthetic Visual Data. IJCV, 126(9), 899–901.
|
|
|
Jose L. Gomez, Gabriel Villalonga and Antonio Lopez. 2021. Co-Training for Deep Object Detection: Comparing Single-Modal and Multi-Modal Approaches. SENS, 21(9), 3185.
Abstract: Top-performing computer vision models are powered by convolutional neural networks (CNNs). Training an accurate CNN highly depends on both the raw sensor data and their associated ground truth (GT). Collecting such GT is usually done through human labeling, which is time-consuming and does not scale as we wish. This data-labeling bottleneck may be intensified due to domain shifts among image sensors, which could force per-sensor data labeling. In this paper, we focus on the use of co-training, a semi-supervised learning (SSL) method, for obtaining self-labeled object bounding boxes (BBs), i.e., the GT to train deep object detectors. In particular, we assess the goodness of multi-modal co-training by relying on two different views of an image, namely, appearance (RGB) and estimated depth (D). Moreover, we compare appearance-based single-modal co-training with multi-modal. Our results suggest that in a standard SSL setting (no domain shift, a few human-labeled data) and under virtual-to-real domain shift (many virtual-world labeled data, no human-labeled data) multi-modal co-training outperforms single-modal. In the latter case, by performing GAN-based domain translation both co-training modalities are on par, at least when using an off-the-shelf depth estimation model not specifically trained on the translated images.
Keywords: co-training; multi-modality; vision-based object detection; ADAS; self-driving
|
|
|
Monica Piñol, Angel Sappa and Ricardo Toledo. 2015. Adaptive Feature Descriptor Selection based on a Multi-Table Reinforcement Learning Strategy. NEUCOM, 150(A), 106–115.
Abstract: This paper presents and evaluates a framework to improve the performance of visual object classification methods that use image feature descriptors as inputs. The goal of the proposed framework is to learn the best descriptor for each image in a given database. This goal is reached by means of a reinforcement learning process using the minimum information. The visual classification system used to demonstrate the proposed framework is based on a bag of features scheme, and the reinforcement learning technique is implemented through the Q-learning approach. The behavior of the reinforcement learning with different state definitions is evaluated. Additionally, a method that combines all these states is formulated in order to select the optimal state. Finally, the chosen actions are obtained from the best set of image descriptors in the literature: PHOW, SIFT, C-SIFT, SURF and Spin. Experimental results using two public databases (ETH and COIL) are provided, showing both the validity of the proposed approach and comparisons with the state of the art. In all cases the best results are obtained with the proposed approach.
Keywords: Reinforcement learning; Q-learning; Bag of features; Descriptors
|
|