|
Records |
Links |
|
Author |
Monica Piñol; Angel Sappa; Ricardo Toledo |
|
|
Title |
Adaptive Feature Descriptor Selection based on a Multi-Table Reinforcement Learning Strategy |
Type |
Journal Article |
|
Year |
2015 |
Publication |
Neurocomputing |
Abbreviated Journal |
NEUCOM |
|
|
Volume |
150 |
Issue |
A |
Pages |
106–115 |
|
|
Keywords |
Reinforcement learning; Q-learning; Bag of features; Descriptors |
|
|
Abstract |
This paper presents and evaluates a framework to improve the performance of visual object classification methods, which are based on the usage of image feature descriptors as inputs. The goal of the proposed framework is to learn the best descriptor for each image in a given database. This goal is reached by means of a reinforcement learning process using the minimum information. The visual classification system used to demonstrate the proposed framework is based on a bag of features scheme, and the reinforcement learning technique is implemented through the Q-learning approach. The behavior of the reinforcement learning with different state definitions is evaluated. Additionally, a method that combines all these states is formulated in order to select the optimal state. Finally, the chosen actions are obtained from the best set of image descriptors in the literature: PHOW, SIFT, C-SIFT, SURF and Spin. Experimental results using two public databases (ETH and COIL) are provided showing both the validity of the proposed approach and comparisons with state of the art. In all the cases the best results are obtained with the proposed approach. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ADAS; 600.055; 600.076 |
Approved |
no |
|
|
Call Number |
Admin @ si @ PST2015 |
Serial |
2473 |
|
Permanent link to this record |
|
|
|
|
Author |
Marçal Rusiñol; J. Chazalon; Katerine Diaz |
|
|
Title |
Augmented Songbook: an Augmented Reality Educational Application for Raising Music Awareness |
Type |
Journal Article |
|
Year |
2018 |
Publication |
Multimedia Tools and Applications |
Abbreviated Journal |
MTAP |
|
|
Volume |
77 |
Issue |
11 |
Pages |
13773-13798 |
|
|
Keywords |
Augmented reality; Document image matching; Educational applications |
|
|
Abstract |
This paper presents the development of an Augmented Reality mobile application which aims at sensibilizing young children to abstract concepts of music. Such concepts are, for instance, the musical notation or the idea of rhythm. Recent studies in Augmented Reality for education suggest that such technologies have multiple benefits for students, including younger ones. As mobile document image acquisition and processing gains maturity on mobile platforms, we explore how it is possible to build a markerless and real-time application to augment the physical documents with didactic animations and interactive virtual content. Given a standard image processing pipeline, we compare the performance of different local descriptors at two key stages of the process. Results suggest alternatives to the SIFT local descriptors, regarding result quality and computational efficiency, both for document model identification and perspective transform estimation. All experiments are performed on an original and public dataset we introduce here. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; ADAS; 600.084; 600.121; 600.118; 600.129 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RCD2018 |
Serial |
2996 |
|
Permanent link to this record |
|
|
|
|
Author |
Fernando Barrera; Felipe Lumbreras; Angel Sappa |
|
|
Title |
Multispectral Piecewise Planar Stereo using Manhattan-World Assumption |
Type |
Journal Article |
|
Year |
2013 |
Publication |
Pattern Recognition Letters |
Abbreviated Journal |
PRL |
|
|
Volume |
34 |
Issue |
1 |
Pages |
52-61 |
|
|
Keywords |
Multispectral stereo rig; Dense disparity maps from multispectral stereo; Color and infrared images |
|
|
Abstract |
This paper proposes a new framework for extracting dense disparity maps from a multispectral stereo rig. The system is constructed with an infrared and a color camera. It is intended to explore novel multispectral stereo matching approaches that will allow further extraction of semantic information. The proposed framework consists of three stages. Firstly, an initial sparse disparity map is generated by using a cost function based on feature matching in a multiresolution scheme. Then, by looking at the color image, a set of planar hypotheses is defined to describe the surfaces on the scene. Finally, the previous stages are combined by reformulating the disparity computation as a global minimization problem. The paper has two main contributions. The first contribution combines mutual information with a shape descriptor based on gradient in a multiresolution scheme. The second contribution, which is based on the Manhattan-world assumption, extracts a dense disparity representation using the graph cut algorithm. Experimental results in outdoor scenarios are provided showing the validity of the proposed framework. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ADAS; 600.054; 600.055; 605.203 |
Approved |
no |
|
|
Call Number |
Admin @ si @ BLS2013 |
Serial |
2245 |
|
Permanent link to this record |
|
|
|
|
Author |
Fadi Dornaika; Jose Manuel Alvarez; Angel Sappa; Antonio Lopez |
|
|
Title |
A New Framework for Stereo Sensor Pose through Road Segmentation and Registration |
Type |
Journal Article |
|
Year |
2011 |
Publication |
IEEE Transactions on Intelligent Transportation Systems |
Abbreviated Journal |
TITS |
|
|
Volume |
12 |
Issue |
4 |
Pages |
954-966 |
|
|
Keywords |
road detection |
|
|
Abstract |
This paper proposes a new framework for real-time estimation of the onboard stereo head's position and orientation relative to the road surface, which is required for any advanced driver-assistance application. This framework can be used with all road types: highways, urban, etc. Unlike existing works that rely on feature extraction in either the image domain or 3-D space, we propose a framework that directly estimates the unknown parameters from the stream of stereo pairs' brightness. The proposed approach consists of two stages that are invoked for every stereo frame. The first stage segments the road region in one monocular view. The second stage estimates the camera pose using a featureless registration between the segmented monocular road region and the other view in the stereo pair. This paper has two main contributions. The first contribution combines a road segmentation algorithm with a registration technique to estimate the online stereo camera pose. The second contribution solves the registration using a featureless method, which is carried out using two different optimization techniques: 1) the differential evolution algorithm and 2) the Levenberg-Marquardt (LM) algorithm. We provide experiments and evaluations of performance. The results presented show the validity of our proposed framework. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1524-9050 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ADAS |
Approved |
no |
|
|
Call Number |
Admin @ si @ DAS2011; ADAS @ adas @ das2011a |
Serial |
1833 |
|
Permanent link to this record |
|
|
|
|
Author |
Fernando Barrera; Felipe Lumbreras; Angel Sappa |
|
|
Title |
Multimodal Stereo Vision System: 3D Data Extraction and Algorithm Evaluation |
Type |
Journal Article |
|
Year |
2012 |
Publication |
IEEE Journal of Selected Topics in Signal Processing |
Abbreviated Journal |
J-STSP |
|
|
Volume |
6 |
Issue |
5 |
Pages |
437-446 |
|
|
Keywords |
|
|
|
Abstract |
This paper proposes an imaging system for computing sparse depth maps from multispectral images. A special stereo head consisting of an infrared and a color camera defines the proposed multimodal acquisition system. The cameras are rigidly attached so that their image planes are parallel. Details about the calibration and image rectification procedure are provided. Sparse disparity maps are obtained by the combined use of mutual information enriched with gradient information. The proposed approach is evaluated using a Receiver Operating Characteristics curve. Furthermore, a multispectral dataset, color and infrared images, together with their corresponding ground truth disparity maps, is generated and used as a test bed. Experimental results in real outdoor scenarios are provided showing its viability and that the proposed approach is not restricted to a specific domain. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1932-4553 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ADAS |
Approved |
no |
|
|
Call Number |
Admin @ si @ BLS2012b |
Serial |
2155 |
|
Permanent link to this record |
|
|
|
|
Author |
Daniel Hernandez; Lukas Schneider; P. Cebrian; A. Espinosa; David Vazquez; Antonio Lopez; Uwe Franke; Marc Pollefeys; Juan Carlos Moure |
|
|
Title |
Slanted Stixels: A way to represent steep streets |
Type |
Journal Article |
|
Year |
2019 |
Publication |
International Journal of Computer Vision |
Abbreviated Journal |
IJCV |
|
|
Volume |
127 |
Issue |
|
Pages |
1643–1658 |
|
|
Keywords |
|
|
|
Abstract |
This work presents and evaluates a novel compact scene representation based on Stixels that infers geometric and semantic information. Our approach overcomes the previous rather restrictive geometric assumptions for Stixels by introducing a novel depth model to account for non-flat roads and slanted objects. Both semantic and depth cues are used jointly to infer the scene representation in a sound global energy minimization formulation. Furthermore, a novel approximation scheme is introduced in order to significantly reduce the computational complexity of the Stixel algorithm, and then achieve real-time computation capabilities. The idea is to first perform an over-segmentation of the image, discarding the unlikely Stixel cuts, and apply the algorithm only on the remaining Stixel cuts. This work presents a novel over-segmentation strategy based on a fully convolutional network, which outperforms an approach based on using local extrema of the disparity map. We evaluate the proposed methods in terms of semantic and geometric accuracy as well as run-time on four publicly available benchmark datasets. Our approach maintains accuracy on flat road scene datasets while improving substantially on a novel non-flat road dataset. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ADAS; 600.118; 600.124 |
Approved |
no |
|
|
Call Number |
Admin @ si @ HSC2019 |
Serial |
3304 |
|
Permanent link to this record |
|
|
|
|
Author |
Jose Luis Gomez; Gabriel Villalonga; Antonio Lopez |
|
|
Title |
Co-Training for Deep Object Detection: Comparing Single-Modal and Multi-Modal Approaches |
Type |
Journal Article |
|
Year |
2021 |
Publication |
Sensors |
Abbreviated Journal |
SENS |
|
|
Volume |
21 |
Issue |
9 |
Pages |
3185 |
|
|
Keywords |
co-training; multi-modality; vision-based object detection; ADAS; self-driving |
|
|
Abstract |
Top-performing computer vision models are powered by convolutional neural networks (CNNs). Training an accurate CNN highly depends on both the raw sensor data and their associated ground truth (GT). Collecting such GT is usually done through human labeling, which is time-consuming and does not scale as we wish. This data-labeling bottleneck may be intensified due to domain shifts among image sensors, which could force per-sensor data labeling. In this paper, we focus on the use of co-training, a semi-supervised learning (SSL) method, for obtaining self-labeled object bounding boxes (BBs), i.e., the GT to train deep object detectors. In particular, we assess the goodness of multi-modal co-training by relying on two different views of an image, namely, appearance (RGB) and estimated depth (D). Moreover, we compare appearance-based single-modal co-training with multi-modal. Our results suggest that in a standard SSL setting (no domain shift, a few human-labeled data) and under virtual-to-real domain shift (many virtual-world labeled data, no human-labeled data) multi-modal co-training outperforms single-modal. In the latter case, by performing GAN-based domain translation both co-training modalities are on par, at least when using an off-the-shelf depth estimation model not specifically trained on the translated images. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ADAS; 600.118 |
Approved |
no |
|
|
Call Number |
Admin @ si @ GVL2021 |
Serial |
3562 |
|
Permanent link to this record |
|
|
|
|
Author |
Naveen Onkarappa; Angel Sappa |
|
|
Title |
A Novel Space Variant Image Representation |
Type |
Journal Article |
|
Year |
2013 |
Publication |
Journal of Mathematical Imaging and Vision |
Abbreviated Journal |
JMIV |
|
|
Volume |
47 |
Issue |
1-2 |
Pages |
48-59 |
|
|
Keywords |
Space-variant representation; Log-polar mapping; Onboard vision applications |
|
|
Abstract |
Traditionally, in machine vision images are represented using cartesian coordinates with uniform sampling along the axes. On the contrary, biological vision systems represent images using polar coordinates with non-uniform sampling. For various advantages provided by space-variant representations many researchers are interested in space-variant computer vision. In this direction the current work proposes a novel and simple space variant representation of images. The proposed representation is compared with the classical log-polar mapping. The log-polar representation is motivated by biological vision having the characteristic of higher resolution at the fovea and reduced resolution at the periphery. On the contrary to the log-polar, the proposed new representation has higher resolution at the periphery and lower resolution at the fovea. Our proposal is proved to be a better representation in navigational scenarios such as driver assistance systems and robotics. The experimental results involve analysis of optical flow fields computed on both proposed and log-polar representations. Additionally, an egomotion estimation application is also shown as an illustrative example. The experimental analysis comprises results from synthetic as well as real sequences. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer US |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0924-9907 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ADAS; 600.055; 605.203; 601.215 |
Approved |
no |
|
|
Call Number |
Admin @ si @ OnS2013a |
Serial |
2243 |
|
Permanent link to this record |
|
|
|
|
Author |
Fei Yang; Luis Herranz; Joost Van de Weijer; Jose Antonio Iglesias; Antonio Lopez; Mikhail Mozerov |
|
|
Title |
Variable Rate Deep Image Compression with Modulated Autoencoder |
Type |
Journal Article |
|
Year |
2020 |
Publication |
IEEE Signal Processing Letters |
Abbreviated Journal |
SPL |
|
|
Volume |
27 |
Issue |
|
Pages |
331-335 |
|
|
Keywords |
|
|
|
Abstract |
Variable rate is a requirement for flexible and adaptable image and video compression. However, deep image compression methods (DIC) are optimized for a single fixed rate-distortion (R-D) tradeoff. While this can be addressed by training multiple models for different tradeoffs, the memory requirements increase proportionally to the number of models. Scaling the bottleneck representation of a shared autoencoder can provide variable rate compression with a single shared autoencoder. However, the R-D performance using this simple mechanism degrades in low bitrates, and also shrinks the effective range of bitrates. To address these limitations, we formulate the problem of variable R-D optimization for DIC, and propose modulated autoencoders (MAEs), where the representations of a shared autoencoder are adapted to the specific R-D tradeoff via a modulation network. Jointly training this modulated autoencoder and the modulation network provides an effective way to navigate the R-D operational curve. Our experiments show that the proposed method can achieve almost the same R-D performance of independent models with significantly fewer parameters. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
LAMP; ADAS; 600.141; 600.120; 600.118 |
Approved |
no |
|
|
Call Number |
Admin @ si @ YHW2020 |
Serial |
3346 |
|
Permanent link to this record |
|
|
|
|
Author |
Ferran Diego; Joan Serrat; Antonio Lopez |
|
|
Title |
Joint spatio-temporal alignment of sequences |
Type |
Journal Article |
|
Year |
2013 |
Publication |
IEEE Transactions on Multimedia |
Abbreviated Journal |
TMM |
|
|
Volume |
15 |
Issue |
6 |
Pages |
1377-1387 |
|
|
Keywords |
video alignment |
|
|
Abstract |
Video alignment is important in different areas of computer vision such as wide baseline matching, action recognition, change detection, video copy detection and frame dropping prevention. Current video alignment methods usually deal with a relatively simple case of fixed or rigidly attached cameras or simultaneous acquisition. Therefore, in this paper we propose a joint video alignment for bringing two video sequences into a spatio-temporal alignment. Specifically, the novelty of the paper is to formulate the video alignment to fold the spatial and temporal alignment into a single alignment framework. This simultaneously satisfies a frame-correspondence and frame-alignment similarity; exploiting the knowledge among neighbor frames by a standard pairwise Markov random field (MRF). This new formulation is able to handle the alignment of sequences recorded at different times by independent moving cameras that follows a similar trajectory, and also generalizes the particular cases that of fixed geometric transformation and/or linear temporal mapping. We conduct experiments on different scenarios such as sequences recorded simultaneously or by moving cameras to validate the robustness of the proposed approach. The proposed method provides the highest video alignment accuracy compared to the state-of-the-art methods on sequences recorded from vehicles driving along the same track at different times. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1520-9210 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ADAS |
Approved |
no |
|
|
Call Number |
Admin @ si @ DSL2013; ADAS @ adas @ |
Serial |
2228 |
|
Permanent link to this record |