|
Records |
Links |
|
Author |
Fahad Shahbaz Khan; Muhammad Anwer Rao; Joost Van de Weijer; Michael Felsberg; J.Laaksonen |
|
|
Title |
Deep semantic pyramids for human attributes and action recognition |
Type |
Conference Article |
|
Year |
2015 |
Publication |
Image Analysis, Proceedings of 19th Scandinavian Conference , SCIA 2015 |
Abbreviated Journal |
|
|
|
Volume |
9127 |
Issue |
|
Pages |
341-353 |
|
|
Keywords |
Action recognition; Human attributes; Semantic pyramids |
|
|
Abstract |
Describing persons and their actions is a challenging problem due to variations in pose, scale and viewpoint in real-world images. Recently, semantic pyramids approach [1] for pose normalization has shown to provide excellent results for gender and action recognition. The performance of semantic pyramids approach relies on robust image description and is therefore limited due to the use of shallow local features. In the context of object recognition [2] and object detection [3], convolutional neural networks (CNNs) or deep features have shown to improve the performance over the conventional shallow features.
We propose deep semantic pyramids for human attributes and action recognition. The method works by constructing spatial pyramids based on CNNs of different part locations. These pyramids are then combined to obtain a single semantic representation. We validate our approach on the Berkeley and 27 Human Attributes datasets for attributes classification. For action recognition, we perform experiments on two challenging datasets: Willow and PASCAL VOC 2010. The proposed deep semantic pyramids provide a significant gain of 17.2%, 13.9%, 24.3% and 22.6% compared to the standard shallow semantic pyramids on Berkeley, 27 Human Attributes, Willow and PASCAL VOC 2010 datasets respectively. Our results also show that deep semantic pyramids outperform conventional CNNs based on the full bounding box of the person. Finally, we compare our approach with state-of-the-art methods and show a gain in performance compared to best methods in literature. |
|
|
Address |
Denmark; Copenhagen; June 2015 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer International Publishing |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0302-9743 |
ISBN |
978-3-319-19664-0 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
SCIA |
|
|
Notes |
LAMP; 600.068; 600.079;ADAS;CIC |
Approved |
no |
|
|
Call Number |
Admin @ si @ KRW2015b |
Serial |
2672 |
|
Permanent link to this record |
|
|
|
|
Author |
Marçal Rusiñol; David Aldavert; Ricardo Toledo; Josep Llados |
|
|
Title |
Towards Query-by-Speech Handwritten Keyword Spotting |
Type |
Conference Article |
|
Year |
2015 |
Publication |
13th International Conference on Document Analysis and Recognition ICDAR2015 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
501-505 |
|
|
Keywords |
|
|
|
Abstract |
In this paper, we present a new querying paradigm for handwritten keyword spotting. We propose to represent handwritten word images both by visual and audio representations, enabling a query-by-speech keyword spotting system. The two representations are merged together and projected to a common sub-space in the training phase. This transform allows to, given a spoken query, retrieve word instances that were only represented by the visual modality. In addition, the same method can be used backwards at no additional cost to produce a handwritten text-tospeech system. We present our first results on this new querying mechanism using synthetic voices over the George Washington
dataset. |
|
|
Address |
Nancy; France; August 2015 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG; 600.084; 600.061; 601.223; 600.077;ADAS |
Approved |
no |
|
|
Call Number |
Admin @ si @ RAT2015b |
Serial |
2682 |
|
Permanent link to this record |
|
|
|
|
Author |
Jiaolong Xu; David Vazquez; Krystian Mikolajczyk; Antonio Lopez |
|
|
Title |
Hierarchical online domain adaptation of deformable part-based models |
Type |
Conference Article |
|
Year |
2016 |
Publication |
IEEE International Conference on Robotics and Automation |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
5536-5541 |
|
|
Keywords |
Domain Adaptation; Pedestrian Detection |
|
|
Abstract |
We propose an online domain adaptation method for the deformable part-based model (DPM). The online domain adaptation is based on a two-level hierarchical adaptation tree, which consists of instance detectors in the leaf nodes and a category detector at the root node. Moreover, combined with a multiple object tracking procedure (MOT), our proposal neither requires target-domain annotated data nor revisiting the source-domain data for performing the source-to-target domain adaptation of the DPM. From a practical point of view this means that, given a source-domain DPM and new video for training on a new domain without object annotations, our procedure outputs a new DPM adapted to the domain represented by the video. As proof-of-concept we apply our proposal to the challenging task of pedestrian detection. In this case, each instance detector is an exemplar classifier trained online with only one pedestrian per frame. The pedestrian instances are collected by MOT and the hierarchical model is constructed dynamically according to the pedestrian trajectories. Our experimental results show that the adapted detector achieves the accuracy of recent supervised domain adaptation methods (i.e., requiring manually annotated targetdomain data), and improves the source detector more than 10 percentage points. |
|
|
Address |
Stockholm; Sweden; May 2016 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICRA |
|
|
Notes |
ADAS; 600.085; 600.082; 600.076 |
Approved |
no |
|
|
Call Number |
Admin @ si @ XVM2016 |
Serial |
2728 |
|
Permanent link to this record |
|
|
|
|
Author |
Victor Campmany; Sergio Silva; Juan Carlos Moure; Toni Espinosa; David Vazquez; Antonio Lopez |
|
|
Title |
GPU-based pedestrian detection for autonomous driving |
Type |
Conference Article |
|
Year |
2016 |
Publication |
GPU Technology Conference |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
Pedestrian Detection; GPU |
|
|
Abstract |
Pedestrian detection for autonomous driving is one of the hardest tasks within computer vision, and involves huge computational costs. Obtaining acceptable real-time performance, measured in frames per second (fps), for the most advanced algorithms is nowadays a hard challenge. Taking the work in [1] as our baseline, we propose a CUDA implementation of a pedestrian detection system that includes LBP and HOG as feature descriptors and SVM and Random forest as classifiers. We introduce significant algorithmic adjustments and optimizations to adapt the problem to the NVIDIA GPU architecture. The aim is to deploy a real-time system providing reliable results. |
|
|
Address |
Silicon Valley; San Francisco; USA; April 2016 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
GTC |
|
|
Notes |
ADAS; 600.085; 600.082; 600.076 |
Approved |
no |
|
|
Call Number |
ADAS @ adas @ CSM2016 |
Serial |
2737 |
|
Permanent link to this record |
|
|
|
|
Author |
Daniel Hernandez; Juan Carlos Moure; Toni Espinosa; Alejandro Chacon; David Vazquez; Antonio Lopez |
|
|
Title |
Real-time 3D Reconstruction for Autonomous Driving via Semi-Global Matching |
Type |
Conference Article |
|
Year |
2016 |
Publication |
GPU Technology Conference |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
Stereo; Autonomous Driving; GPU; 3d reconstruction |
|
|
Abstract |
Robust and dense computation of depth information from stereo-camera systems is a computationally demanding requirement for real-time autonomous driving. Semi-Global Matching (SGM) [1] approximates heavy-computation global algorithms results but with lower computational complexity, therefore it is a good candidate for a real-time implementation. SGM minimizes energy along several 1D paths across the image. The aim of this work is to provide a real-time system producing reliable results on energy-efficient hardware. Our design runs on a NVIDIA Titan X GPU at 104.62 FPS and on a NVIDIA Drive PX at 6.7 FPS, promising for real-time platforms |
|
|
Address |
Silicon Valley; San Francisco; USA; April 2016 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
GTC |
|
|
Notes |
ADAS; 600.085; 600.082; 600.076 |
Approved |
no |
|
|
Call Number |
ADAS @ adas @ HME2016 |
Serial |
2738 |
|
Permanent link to this record |
|
|
|
|
Author |
German Ros; Laura Sellart; Joanna Materzynska; David Vazquez; Antonio Lopez |
|
|
Title |
The SYNTHIA Dataset: A Large Collection of Synthetic Images for Semantic Segmentation of Urban Scenes |
Type |
Conference Article |
|
Year |
2016 |
Publication |
29th IEEE Conference on Computer Vision and Pattern Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
3234-3243 |
|
|
Keywords |
Domain Adaptation; Autonomous Driving; Virtual Data; Semantic Segmentation |
|
|
Abstract |
Vision-based semantic segmentation in urban scenarios is a key functionality for autonomous driving. The irruption of deep convolutional neural networks (DCNNs) allows to foresee obtaining reliable classifiers to perform such a visual task. However, DCNNs require to learn many parameters from raw images; thus, having a sufficient amount of diversified images with this class annotations is needed. These annotations are obtained by a human cumbersome labour specially challenging for semantic segmentation, since pixel-level annotations are required. In this paper, we propose to use a virtual world for automatically generating realistic synthetic images with pixel-level annotations. Then, we address the question of how useful can be such data for the task of semantic segmentation; in particular, when using a DCNN paradigm. In order to answer this question we have generated a synthetic diversified collection of urban images, named SynthCity, with automatically generated class annotations. We use SynthCity in combination with publicly available real-world urban images with manually provided annotations. Then, we conduct experiments on a DCNN setting that show how the inclusion of SynthCity in the training stage significantly improves the performance of the semantic segmentation task |
|
|
Address |
Las Vegas; USA; June 2016 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CVPR |
|
|
Notes |
ADAS; 600.085; 600.082; 600.076 |
Approved |
no |
|
|
Call Number |
ADAS @ adas @ RSM2016 |
Serial |
2739 |
|
Permanent link to this record |
|
|
|
|
Author |
Daniel Hernandez; Alejandro Chacon; Antonio Espinosa; David Vazquez; Juan Carlos Moure; Antonio Lopez |
|
|
Title |
Embedded real-time stereo estimation via Semi-Global Matching on the GPU |
Type |
Conference Article |
|
Year |
2016 |
Publication |
16th International Conference on Computational Science |
Abbreviated Journal |
|
|
|
Volume |
80 |
Issue |
|
Pages |
143-153 |
|
|
Keywords |
Autonomous Driving; Stereo; CUDA; 3d reconstruction |
|
|
Abstract |
Dense, robust and real-time computation of depth information from stereo-camera systems is a computationally demanding requirement for robotics, advanced driver assistance systems (ADAS) and autonomous vehicles. Semi-Global Matching (SGM) is a widely used algorithm that propagates consistency constraints along several paths across the image. This work presents a real-time system producing reliable disparity estimation results on the new embedded energy-efficient GPU devices. Our design runs on a Tegra X1 at 41 frames per second for an image size of 640x480, 128 disparity levels, and using 4 path directions for the SGM method. |
|
|
Address |
San Diego; CA; USA; June 2016 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICCS |
|
|
Notes |
ADAS; 600.085; 600.082; 600.076 |
Approved |
no |
|
|
Call Number |
ADAS @ adas @ HCE2016a |
Serial |
2740 |
|
Permanent link to this record |
|
|
|
|
Author |
Victor Campmany; Sergio Silva; Antonio Espinosa; Juan Carlos Moure; David Vazquez; Antonio Lopez |
|
|
Title |
GPU-based pedestrian detection for autonomous driving |
Type |
Conference Article |
|
Year |
2016 |
Publication |
16th International Conference on Computational Science |
Abbreviated Journal |
|
|
|
Volume |
80 |
Issue |
|
Pages |
2377-2381 |
|
|
Keywords |
Pedestrian detection; Autonomous Driving; CUDA |
|
|
Abstract |
We propose a real-time pedestrian detection system for the embedded Nvidia Tegra X1 GPU-CPU hybrid platform. The pipeline is composed by the following state-of-the-art algorithms: Histogram of Local Binary Patterns (LBP) and Histograms of Oriented Gradients (HOG) features extracted from the input image; Pyramidal Sliding Window technique for foreground segmentation; and Support Vector Machine (SVM) for classification. Results show a 8x speedup in the target Tegra X1 platform and a better performance/watt ratio than desktop CUDA platforms in study. |
|
|
Address |
San Diego; CA; USA; June 2016 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICCS |
|
|
Notes |
ADAS; 600.085; 600.082; 600.076 |
Approved |
no |
|
|
Call Number |
ADAS @ adas @ CSE2016 |
Serial |
2741 |
|
Permanent link to this record |
|
|
|
|
Author |
Eugenio Alcala; Laura Sellart; Vicenc Puig; Joseba Quevedo; Jordi Saludes; David Vazquez; Antonio Lopez |
|
|
Title |
Comparison of two non-linear model-based control strategies for autonomous vehicles |
Type |
Conference Article |
|
Year |
2016 |
Publication |
24th Mediterranean Conference on Control and Automation |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
846-851 |
|
|
Keywords |
Autonomous Driving; Control |
|
|
Abstract |
This paper presents the comparison of two nonlinear model-based control strategies for autonomous cars. A control oriented model of vehicle based on a bicycle model is used. The two control strategies use a model reference approach. Using this approach, the error dynamics model is developed. Both controllers receive as input the longitudinal, lateral and orientation errors generating as control outputs the steering angle and the velocity of the vehicle. The first control approach is based on a non-linear control law that is designed by means of the Lyapunov direct approach. The second approach is based on a sliding mode-control that defines a set of sliding surfaces over which the error trajectories will converge. The main advantage of the sliding-control technique is the robustness against non-linearities and parametric uncertainties in the model. However, the main drawback of first order sliding mode is the chattering, so it has been implemented a high order sliding mode control. To test and compare the proposed control strategies, different path following scenarios are used in simulation. |
|
|
Address |
Athens; Greece; June 2016 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
MED |
|
|
Notes |
ADAS; 600.085; 600.082; 600.076 |
Approved |
no |
|
|
Call Number |
ADAS @ adas @ ASP2016 |
Serial |
2750 |
|
Permanent link to this record |
|
|
|
|
Author |
Daniel Hernandez; Antonio Espinosa; David Vazquez; Antonio Lopez; Juan Carlos Moure |
|
|
Title |
GPU-accelerated real-time stixel computation |
Type |
Conference Article |
|
Year |
2017 |
Publication |
IEEE Winter Conference on Applications of Computer Vision |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
1054-1062 |
|
|
Keywords |
Autonomous Driving; GPU; Stixel |
|
|
Abstract |
The Stixel World is a medium-level, compact representation of road scenes that abstracts millions of disparity pixels into hundreds or thousands of stixels. The goal of this work is to implement and evaluate a complete multi-stixel estimation pipeline on an embedded, energyefficient, GPU-accelerated device. This work presents a full GPU-accelerated implementation of stixel estimation that produces reliable results at 26 frames per second (real-time) on the Tegra X1 for disparity images of 1024×440 pixels and stixel widths of 5 pixels, and achieves more than 400 frames per second on a high-end Titan X GPU card. |
|
|
Address |
Santa Rosa; CA; USA; March 2017 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
WACV |
|
|
Notes |
ADAS; 600.118 |
Approved |
no |
|
|
Call Number |
ADAS @ adas @ HEV2017b |
Serial |
2812 |
|
Permanent link to this record |