|
Felipe Codevilla, Antonio Lopez, Vladlen Koltun, & Alexey Dosovitskiy. (2018). On Offline Evaluation of Vision-based Driving Models. In 15th European Conference on Computer Vision (Vol. 11219, pp. 246–262). LNCS.
Abstract: Autonomous driving models should ideally be evaluated by deploying them on a fleet of physical vehicles in the real world. Unfortunately, this approach is not practical for the vast majority of researchers. An attractive alternative is to evaluate models offline, on a pre-collected validation dataset with ground truth annotation. In this paper, we investigate the relation between various online and offline metrics for evaluation of autonomous driving models. We find that offline prediction error is not necessarily correlated with driving quality, and two models with identical prediction error can differ dramatically in their driving performance. We show that the correlation of offline evaluation with driving quality can be significantly improved by selecting an appropriate validation dataset and suitable offline metrics.
Keywords: Autonomous driving; deep learning
|
|
|
Felipe Codevilla, Eder Santana, Antonio Lopez, & Adrien Gaidon. (2019). Exploring the Limitations of Behavior Cloning for Autonomous Driving. In 18th IEEE International Conference on Computer Vision (pp. 9328–9337).
Abstract: Driving requires reacting to a wide variety of complex environment conditions and agent behaviors. Explicitly modeling each possible scenario is unrealistic. In contrast, imitation learning can, in theory, leverage data from large fleets of human-driven cars. Behavior cloning in particular has been successfully used to learn simple visuomotor policies end-to-end, but scaling to the full spectrum of driving behaviors remains an unsolved problem. In this paper, we propose a new benchmark to experimentally investigate the scalability and limitations of behavior cloning. We show that behavior cloning leads to state-of-the-art results, executing complex lateral and longitudinal maneuvers, even in unseen environments, without being explicitly programmed to do so. However, we confirm some limitations of the behavior cloning approach: some well-known limitations (e.g., dataset bias and overfitting), new generalization issues (e.g., dynamic objects and the lack of causal modeling), and training instabilities, all requiring further research before behavior cloning can graduate to real-world driving. The code, dataset, benchmark, and agent studied in this paper can be found at GitHub.
|
|
|
Felipe Codevilla, Matthias Muller, Antonio Lopez, Vladlen Koltun, & Alexey Dosovitskiy. (2018). End-to-end Driving via Conditional Imitation Learning. In IEEE International Conference on Robotics and Automation (pp. 4693–4700).
Abstract: Deep networks trained on demonstrations of human driving have learned to follow roads and avoid obstacles. However, driving policies trained via imitation learning cannot be controlled at test time. A vehicle trained end-to-end to imitate an expert cannot be guided to take a specific turn at an upcoming intersection. This limits the utility of such systems. We propose to condition imitation learning on high-level command input. At test time, the learned driving policy functions as a chauffeur that handles sensorimotor coordination but continues to respond to navigational commands. We evaluate different architectures for conditional imitation learning in vision-based driving. We conduct experiments in realistic three-dimensional simulations of urban driving and on a 1/5 scale robotic truck that is trained to drive in a residential area. Both systems drive based on visual input yet remain responsive to high-level navigational commands. The supplementary video can be viewed at this https URL
|
|
|
Felipe Lumbreras. (2001). Segmentation, classification and modelization of textures by means of multiresolution decomposition techniques.
|
|
|
Felipe Lumbreras, & Joan Serrat. (1996). Wavelet filtering for the segmentation of marble images.
|
|
|
Felipe Lumbreras, & Joan Serrat. (1996). Segmentation of petrographical images of marbles. Computers and Geosciences, 22(5):547–558.
|
|
|
Felipe Lumbreras, & Joan Serrat. (1996). Segmentation of petrographical image of marbles.
|
|
|
Felipe Lumbreras, Joan Serrat, Ramon Baldrich, Maria Vanrell, & Juan J. Villanueva. (2001). Color Texture Recognition Through Multiresolution Features.
|
|
|
Felipe Lumbreras, Ramon Baldrich, Maria Vanrell, Joan Serrat, & Juan J. Villanueva. (1999). Multiresolution colour texture representations for tile classification.
|
|
|
Felipe Lumbreras, Ramon Baldrich, Maria Vanrell, Joan Serrat, & Juan J. Villanueva. (1999). Multiresolution texture classification of ceramic tiles. In Recent Research developments in optical engineering, Research Signpost, 2: 213–228.
|
|
|
Felipe Lumbreras, Xavier Roca, Daniel Ponsa, Robert Benavente, J. Martinez, Silvia Sanchez, et al. (2001). Visual Inspection of Safety Belts. In International Conference on Quality Control by Artificial Vision (Vol. 2, pp. 526–531).
|
|
|
Fernando Alonso, Xavier Baro, Sergio Escalera, Jordi Gonzalez, Martha Mackay, & Anna Serrahima. (2016). CARE RESPITE: TAKING CARE OF THE CAREGIVERS, Theme 5 The Strategic use of Mobile and Digital Health and Care Solutions. In 16th International Conference for Integrated Care.
|
|
|
Fernando Barrera. (2012). Multimodal Stereo from Thermal Infrared and Visible Spectrum (Felipe Lumbreras, & Angel Sappa, Eds.). Ph.D. thesis, Ediciones Graficas Rey.
Abstract: Recent advances in thermal infrared imaging (LWIR) have allowed its use in applications beyond the military domain. Nowadays, this new family of sensors is included in different technical and scientific applications. These sensors offer features that facilitate tasks such as the detection of pedestrians, hot spots, and differences in temperature, among others, which can significantly improve the performance of a system where people are expected to play the principal role, for instance video surveillance, monitoring, and pedestrian detection applications.
This dissertation poses the following question: could a pair of sensors measuring different bands of the electromagnetic spectrum, such as the visible and thermal infrared, be used to extract depth information? Although it is a complex question, we show that a system of these characteristics is possible, and we discuss its advantages, drawbacks, and potential opportunities.
The matching and fusion of data coming from different sensors, such as the emissions registered at visible and infrared bands, represents a special challenge, because it has been shown that these signals are weakly correlated. Therefore, many traditional image processing and computer vision techniques are not helpful and require adjustments to perform correctly in each modality.
In this research, an experimental study comparing different cost functions and matching approaches is performed in order to build a multimodal stereovision system. Furthermore, the common problems in infrared/visible stereo, especially in outdoor scenes, are identified. Our framework summarizes the architecture of a generic stereo algorithm at different levels (computational, functional, and structural), which can be extended toward high-level fusion (semantic) and high-order (prior). The proposed framework is intended to explore novel multimodal stereo matching approaches, going from sparse to dense representations (both disparity and depth maps). Moreover, context information is added in the form of priors and assumptions. Finally, this dissertation shows a promising way toward the integration of multiple sensors for recovering three-dimensional information.
|
|
|
Fernando Barrera, Felipe Lumbreras, & Angel Sappa. (2010). Multimodal Template Matching based on Gradient and Mutual Information using Scale-Space. In 17th IEEE International Conference on Image Processing (pp. 2749–2752).
Abstract: This paper presents the combined use of gradient and mutual information for matching infrared and intensity templates. We propose to join: (i) feature matching in a multiresolution context and (ii) information propagation through scale-space representations. Our method consists of combining mutual information with a gradient-based shape descriptor and propagating them following a coarse-to-fine strategy. The main contributions of this work are: to offer a theoretical formulation towards multimodal stereo matching; to show that gradient and mutual information can be reinforced while they are propagated between consecutive levels; and to show that they are valid cost functions in multimodal template matching. Comparisons are presented showing the improvements and viability of the proposed approach.
|
|