Felipe Codevilla, Eder Santana, Antonio Lopez, & Adrien Gaidon. (2019). Exploring the Limitations of Behavior Cloning for Autonomous Driving. In 18th IEEE International Conference on Computer Vision (pp. 9328–9337).
Abstract: Driving requires reacting to a wide variety of complex environment conditions and agent behaviors. Explicitly modeling each possible scenario is unrealistic. In contrast, imitation learning can, in theory, leverage data from large fleets of human-driven cars. Behavior cloning in particular has been successfully used to learn simple visuomotor policies end-to-end, but scaling to the full spectrum of driving behaviors remains an unsolved problem. In this paper, we propose a new benchmark to experimentally investigate the scalability and limitations of behavior cloning. We show that behavior cloning leads to state-of-the-art results, executing complex lateral and longitudinal maneuvers, even in unseen environments, without being explicitly programmed to do so. However, we confirm some limitations of the behavior cloning approach: some well-known limitations (eg, dataset bias and overfitting), new generalization issues (eg, dynamic objects and the lack of a causal modeling), and training instabilities, all requiring further research before behavior cloning can graduate to real-world driving. The code, dataset, benchmark, and agent studied in this paper can be found at github.
|
Felipe Codevilla, Matthias Muller, Antonio Lopez, Vladlen Koltun, & Alexey Dosovitskiy. (2018). End-to-end Driving via Conditional Imitation Learning. In IEEE International Conference on Robotics and Automation (pp. 4693–4700).
Abstract: Deep networks trained on demonstrations of human driving have learned to follow roads and avoid obstacles. However, driving policies trained via imitation learning cannot be controlled at test time. A vehicle trained end-to-end to imitate an expert cannot be guided to take a specific turn at an upcoming intersection. This limits the utility of such systems. We propose to condition imitation learning on high-level command input. At test time, the learned driving policy functions as a chauffeur that handles sensorimotor coordination but continues to respond to navigational commands. We evaluate different architectures for conditional imitation learning in vision-based driving. We conduct experiments in realistic three-dimensional simulations of urban driving and on a 1/5 scale robotic truck that is trained to drive in a residential area. Both systems drive based on visual input yet remain responsive to high-level navigational commands. The supplementary video can be viewed at this https URL
|
Felipe Lumbreras. (2001). Segmentation, classification and modelization of textures by means of multiresolution decomposition techniques..
|
Felipe Lumbreras, & Joan Serrat. (1996). Wavelet filtering for the segmentation of marble images..
|
Felipe Lumbreras, & Joan Serrat. (1996). Segmentation of petrographical images of marbles. Computers and Geosciences. 22(5):547–558, .
|
Felipe Lumbreras, & Joan Serrat. (1996). Wavelet filtering for the segmentation of marble images.
|
Felipe Lumbreras, & Joan Serrat. (1996). Segmentation of petrographical image of marbles.
|
Felipe Lumbreras, Joan Serrat, Ramon Baldrich, Maria Vanrell, & Juan J. Villanueva. (2001). Color Texture Recognition Through Multiresolution Features.
|
Felipe Lumbreras, Ramon Baldrich, Maria Vanrell, Joan Serrat, & Juan J. Villanueva. (1999). Multiresolution colour texture representations for tile classification.
|
Felipe Lumbreras, Ramon Baldrich, Maria Vanrell, Joan Serrat, & Juan J. Villanueva. (1999). Multiresolution texture classification of ceramic tiles. In Recent Research developments in optical engineering, Research Signpost, 2: 213–228.
|
Felipe Lumbreras, Xavier Roca, Daniel Ponsa, Robert Benavente, Judit Martinez, Silvia Sanchez, et al. (2001). Visual Inspection of Safety Belts. In International Conference on Quality Control by Artificial Vision (Vol. 2, 526–531).
|
Fernando Alonso, Xavier Baro, Sergio Escalera, Jordi Gonzalez, Martha Mackay, & Anna Serrahima. (2016). CARE RESPITE: TAKING CARE OF THE CAREGIVERS, Theme 5 The Strategic use of Mobile and Digital Health and Care Solutions. In 16th International Conference for Integrated Care.
|
Fernando Barrera. (2012). Multimodal Stereo from Thermal Infrared and Visible Spectrum (Felipe Lumbreras, & Angel Sappa, Eds.). Ph.D. thesis, Ediciones Graficas Rey, .
Abstract: Recent advances in thermal infrared imaging (LWIR) has allowed its use in applications beyond of the military domain. Nowadays, this new family of sensors is included in different technical and scientific applications. They offer features that facilitate tasks, such as detection of pedestrians, hot spots, differences in temperature, among others, which can significantly improve the performance of a system where the persons are expected to play the principal role. For instance, video surveillance applications, monitoring, and pedestrian detection.
During this dissertation the next question is stated: Could a couple of sensors measuring different bands of the electromagnetic spectrum, as the visible and thermal infrared, be used to extract depth information? Although it is a complex question, we shows that a system of these characteristics is possible as well as their advantages, drawbacks, and potential opportunities.
The matching and fusion of data coming from different sensors, as the emissions registered at visible and infrared bands, represents a special challenge, because it has been showed that theses signals are weak correlated. Therefore, many traditional techniques of image processing and computer vision are not helpful, requiring adjustments for their correct performance in every modality.
In this research an experimental study that compares different cost functions and matching approaches is performed, in order to build a multimodal stereovision system. Furthermore, the common problems in infrared/visible stereo, specially in the outdoor scenes are identified. Our framework summarizes the architecture of a generic stereo algorithm, at different levels: computational, functional, and structural, which can be extended toward high-level fusion (semantic) and high-order (prior).The proposed framework is intended to explore novel multimodal stereo matching approaches, going from sparse to dense representations (both disparity and depth maps). Moreover, context information is added in form of priors and assumptions. Finally, this dissertation shows a promissory way toward the integration of multiple sensors for recovering three-dimensional information.
|
Fernando Barrera, Felipe Lumbreras, & Angel Sappa. (2010). Multimodal Template Matching based on Gradient and Mutual Information using Scale-Space. In 17th IEEE International Conference on Image Processing (2749–2752).
Abstract: This paper presents the combined use of gradient and mutual information for infrared and intensity templates matching. We propose to joint: (i) feature matching in a multiresolution context and (ii) information propagation through scale-space representations. Our method consists in combining mutual information with a shape descriptor based on gradient, and propagate them following a coarse-to-fine strategy. The main contributions of this work are: to offer a theoretical formulation towards a multimodal stereo matching; to show that gradient and mutual information can be reinforced while they are propagated between consecutive levels; and to show that they are valid cost functions in multimodal template matchings. Comparisons are presented showing the improvements and viability of the proposed approach.
|
Fernando Barrera, Felipe Lumbreras, & Angel Sappa. (2012). Evaluation of Similarity Functions in Multimodal Stereo. In 9th International Conference on Image Analysis and Recognition (Vol. 7324, pp. 320–329). LNCS. Springer Berlin Heidelberg.
Abstract: This paper presents an evaluation framework for multimodal stereo matching, which allows to compare the performance of four similarity functions. Additionally, it presents details of a multimodal stereo head that supply thermal infrared and color images, as well as, aspects of its calibration and rectification. The pipeline includes a novel method for the disparity selection, which is suitable for evaluating the similarity functions. Finally, a benchmark for comparing different initializations of the proposed framework is presented. Similarity functions are based on mutual information, gradient orientation and scale space representations. Their evaluation is performed using two metrics: i) disparity error, and ii) number of correct matches on planar regions. In addition to the proposed evaluation, the current paper also shows that 3D sparse representations can be recovered from such a multimodal stereo head.
Keywords: Aveiro, Portugal
|