|
Jose Manuel Alvarez and Antonio Lopez. 2008. Novel Index for Objective Evaluation of Road Detection Algorithms. 11th International IEEE Conference on Intelligent Transportation Systems, 815–820.
|
|
|
Fadi Dornaika and Angel Sappa. 2007. Real-time Vehicle Ego-Motion using Stereo Pairs and Particle Filters. Int. Conf. on Image Analysis and Recognition, 469–480. (LNCS.)
|
|
|
G.D. Evangelidis, Ferran Diego, Joan Serrat and Antonio Lopez. 2011. Slice Matching for Accurate Spatio-Temporal Alignment. In ICCV Workshop on Visual Surveillance.
Abstract: Video synchronization and alignment is a rather recent topic in computer vision. It usually deals with the problem of aligning sequences recorded simultaneously by static, jointly- or independently-moving cameras. In this paper, we investigate the more difficult problem of matching videos captured at different times from independently-moving cameras whose trajectories are approximately coincident or parallel. To this end, we propose a novel method that aligns videos pixel-wise and thus allows their differences to be highlighted automatically. This primarily targets visual surveillance, but the method can be adopted as is by other related video applications, such as object transfer (augmented reality) or high dynamic range video. We build upon a slice matching scheme to first synchronize the sequences, and we develop a spatio-temporal alignment scheme to spatially register corresponding frames and refine the temporal mapping. We investigate the performance of the proposed method on videos recorded from vehicles driven along different types of roads and compare it with related previous work.
Keywords: video alignment
|
|
|
Joan Serrat, Ferran Diego, Jose Manuel Alvarez and Felipe Lumbreras. 2007. Alignment of Videos Recorded from Moving Vehicles. In 14th International Conference on Image Analysis and Processing, 512–517.
|
|
|
Fahad Shahbaz Khan, Muhammad Anwer Rao, Joost Van de Weijer, Michael Felsberg and J. Laaksonen. 2015. Deep semantic pyramids for human attributes and action recognition. Image Analysis, Proceedings of 19th Scandinavian Conference, SCIA 2015. Springer International Publishing, 341–353.
Abstract: Describing persons and their actions is a challenging problem due to variations in pose, scale and viewpoint in real-world images. Recently, the semantic pyramids approach [1] for pose normalization has been shown to provide excellent results for gender and action recognition. The performance of the semantic pyramids approach relies on robust image description and is therefore limited by its use of shallow local features. In the context of object recognition [2] and object detection [3], convolutional neural networks (CNNs), or deep features, have been shown to improve performance over conventional shallow features.
We propose deep semantic pyramids for human attributes and action recognition. The method works by constructing spatial pyramids based on CNNs of different part locations. These pyramids are then combined to obtain a single semantic representation. We validate our approach on the Berkeley and 27 Human Attributes datasets for attribute classification. For action recognition, we perform experiments on two challenging datasets: Willow and PASCAL VOC 2010. The proposed deep semantic pyramids provide a significant gain of 17.2%, 13.9%, 24.3% and 22.6% compared to the standard shallow semantic pyramids on the Berkeley, 27 Human Attributes, Willow and PASCAL VOC 2010 datasets, respectively. Our results also show that deep semantic pyramids outperform conventional CNNs based on the full bounding box of the person. Finally, we compare our approach with state-of-the-art methods and show a gain in performance over the best methods in the literature.
Keywords: Action recognition; Human attributes; Semantic pyramids
|
|
|
German Ros, Angel Sappa, Daniel Ponsa and Antonio Lopez. 2012. Visual SLAM for Driverless Cars: A Brief Survey. IEEE Workshop on Navigation, Perception, Accurate Positioning and Mapping for Intelligent Vehicles.
|
|
|
Idoia Ruiz, Lorenzo Porzi, Samuel Rota Bulo, Peter Kontschieder and Joan Serrat. 2021. Weakly Supervised Multi-Object Tracking and Segmentation. IEEE Winter Conference on Applications of Computer Vision Workshops, 125–133.
Abstract: We introduce the problem of weakly supervised Multi-Object Tracking and Segmentation, i.e. joint weakly supervised instance segmentation and multi-object tracking, in which we do not provide any kind of mask annotation. To address it, we design a novel synergistic training strategy that takes advantage of multi-task learning, i.e. classification and tracking tasks guide the training of the unsupervised instance segmentation. For that purpose, we extract weak foreground localization information, provided by Grad-CAM heatmaps, to generate a partial ground truth to learn from. Additionally, RGB image-level information is employed to refine the mask prediction at the edges of the objects. We evaluate our method on KITTI MOTS, the most representative benchmark for this task, reducing the performance gap on the MOTSP metric between the fully supervised and weakly supervised approaches to just 12% and 12.7% for cars and pedestrians, respectively.
|
|
|
German Ros, Sebastian Ramos, Manuel Granados, Amir Bakhtiary, David Vazquez and Antonio Lopez. 2015. Vision-based Offline-Online Perception Paradigm for Autonomous Driving. IEEE Winter Conference on Applications of Computer Vision, 231–238.
Abstract: Autonomous driving is a key factor for future mobility. Properly perceiving the environment of the vehicle is essential for safe driving, which requires computing accurate geometric and semantic information in real time. In this paper, we challenge state-of-the-art computer vision algorithms for building a perception system for autonomous driving. An inherent drawback in the computation of visual semantics is the trade-off between accuracy and computational cost. We propose to circumvent this problem by following an offline-online strategy. During the offline stage, dense 3D semantic maps are created. In the online stage, the current driving area is recognized in the maps via a re-localization process, which allows the pre-computed accurate semantics and 3D geometry to be retrieved in real time. Then, by detecting the dynamic obstacles, we obtain a rich understanding of the current scene. We quantitatively evaluate our proposal on the KITTI dataset and discuss the related open challenges for the computer vision community.
Keywords: Autonomous Driving; Scene Understanding; SLAM; Semantic Segmentation
|
|
|
Daniel Hernandez, Antonio Espinosa, David Vazquez, Antonio Lopez and Juan Carlos Moure. 2017. GPU-accelerated real-time stixel computation. IEEE Winter Conference on Applications of Computer Vision, 1054–1062.
Abstract: The Stixel World is a medium-level, compact representation of road scenes that abstracts millions of disparity pixels into hundreds or thousands of stixels. The goal of this work is to implement and evaluate a complete multi-stixel estimation pipeline on an embedded, energy-efficient, GPU-accelerated device. This work presents a full GPU-accelerated implementation of stixel estimation that produces reliable results at 26 frames per second (real-time) on the Tegra X1 for disparity images of 1024×440 pixels and stixel widths of 5 pixels, and achieves more than 400 frames per second on a high-end Titan X GPU card.
Keywords: Autonomous Driving; GPU; Stixel
|
|
|
Patricia Suarez, Angel Sappa and Boris X. Vintimilla. 2017. Cross-Spectral Image Patch Similarity using Convolutional Neural Network. IEEE International Workshop of Electronics, Control, Measurement, Signals and their application to Mechatronics.
Abstract: The ability to compare image regions (patches) has been the basis of many approaches to core computer vision problems, including object, texture and scene categorization. Hence, developing representations for image patches has been of interest in several works. The current work focuses on learning similarity between cross-spectral image patches with a 2-channel convolutional neural network (CNN) model. The proposed approach is an adaptation of a previous work, aiming to obtain results similar to the state of the art but with low-cost hardware. The obtained results are compared with both classical approaches, showing improvements, and a state-of-the-art CNN-based approach.
|
|