toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
  Records Links
Author Akhil Gurram; Onay Urfalioglu; Ibrahim Halfaoui; Fahd Bouzaraa; Antonio Lopez edit  url
doi  openurl
  Title Semantic Monocular Depth Estimation Based on Artificial Intelligence Type Journal Article
  Year (down) 2020 Publication IEEE Intelligent Transportation Systems Magazine Abbreviated Journal ITSM  
  Volume Issue Pages  
  Keywords  
  Abstract Depth estimation provides essential information to perform autonomous driving and driver assistance. A promising line of work consists of introducing additional semantic information about the traffic scene when training CNNs for depth estimation. In practice, this means that the depth data used for CNN training is complemented with images having pixel-wise semantic labels where the same raw training data is associated with both types of ground truth, i.e., depth and semantic labels. The main contribution of this paper is to show that this hard constraint can be circumvented, i.e., that we can train CNNs for depth estimation by leveraging the depth and semantic information coming from heterogeneous datasets. In order to illustrate the benefits of our approach, we combine KITTI depth and Cityscapes semantic segmentation datasets, outperforming state-of-the-art results on monocular depth estimation.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS; 600.124 Approved no  
  Call Number Admin @ si @ GUH2019 Serial 3306  
Permanent link to this record
 

 
Author Fei Yang; Luis Herranz; Joost van de Weijer; Jose Antonio Iglesias; Antonio Lopez; Mikhail Mozerov edit  url
doi  openurl
  Title Variable Rate Deep Image Compression with Modulated Autoencoder Type Journal Article
  Year (down) 2020 Publication IEEE Signal Processing Letters Abbreviated Journal SPL  
  Volume 27 Issue Pages 331-335  
  Keywords  
  Abstract Variable rate is a requirement for flexible and adaptable image and video compression. However, deep image compression methods (DIC) are optimized for a single fixed rate-distortion (R-D) tradeoff. While this can be addressed by training multiple models for different tradeoffs, the memory requirements increase proportionally to the number of models. Scaling the bottleneck representation of a shared autoencoder can provide variable rate compression with a single shared autoencoder. However, the R-D performance using this simple mechanism degrades in low bitrates, and also shrinks the effective range of bitrates. To address these limitations, we formulate the problem of variable R-D optimization for DIC, and propose modulated autoencoders (MAEs), where the representations of a shared autoencoder are adapted to the specific R-D tradeoff via a modulation network. Jointly training this modulated autoencoder and the modulation network provides an effective way to navigate the R-D operational curve. Our experiments show that the proposed method can achieve almost the same R-D performance of independent models with significantly fewer parameters.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; ADAS; no menciona Approved no  
  Call Number Admin @ si @ YHW2020 Serial 3346  
Permanent link to this record
 

 
Author Gabriel Villalonga; Joost van de Weijer; Antonio Lopez edit  url
doi  openurl
  Title Recognizing new classes with synthetic data in the loop: application to traffic sign recognition Type Journal Article
  Year (down) 2020 Publication Sensors Abbreviated Journal SENS  
  Volume 20 Issue 3 Pages 583  
  Keywords  
  Abstract On-board vision systems may need to increase the number of classes that can be recognized in a relatively short period. For instance, a traffic sign recognition system may suddenly be required to recognize new signs. Since collecting and annotating samples of such new classes may need more time than we wish, especially for uncommon signs, we propose a method to generate these samples by combining synthetic images and Generative Adversarial Network (GAN) technology. In particular, the GAN is trained on synthetic and real-world samples from known classes to perform synthetic-to-real domain adaptation, but applied to synthetic samples of the new classes. Using the Tsinghua dataset with a synthetic counterpart, SYNTHIA-TS, we have run an extensive set of experiments. The results show that the proposed method is indeed effective, provided that we use a proper Convolutional Neural Network (CNN) to perform the traffic sign recognition (classification) task as well as a proper GAN to transform the synthetic images. Here, a ResNet101-based classifier and domain adaptation based on CycleGAN performed extremely well for a ratio∼ 1/4 for new/known classes; even for more challenging ratios such as∼ 4/1, the results are also very positive.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; ADAS; no proj Approved no  
  Call Number Admin @ si @ VWL2020 Serial 3405  
Permanent link to this record
 

 
Author Katerine Diaz; Jesus Martinez del Rincon; Marçal Rusiñol; Aura Hernandez-Sabate edit  doi
openurl 
  Title Feature Extraction by Using Dual-Generalized Discriminative Common Vectors Type Journal Article
  Year (down) 2019 Publication Journal of Mathematical Imaging and Vision Abbreviated Journal JMIV  
  Volume 61 Issue 3 Pages 331-351  
  Keywords Online feature extraction; Generalized discriminative common vectors; Dual learning; Incremental learning; Decremental learning  
  Abstract In this paper, a dual online subspace-based learning method called dual-generalized discriminative common vectors (Dual-GDCV) is presented. The method extends incremental GDCV by exploiting simultaneously both the concepts of incremental and decremental learning for supervised feature extraction and classification. Our methodology is able to update the feature representation space without recalculating the full projection or accessing the previously processed training data. It allows both adding information and removing unnecessary data from a knowledge base in an efficient way, while retaining the previously acquired knowledge. The proposed method has been theoretically proved and empirically validated in six standard face recognition and classification datasets, under two scenarios: (1) removing and adding samples of existent classes, and (2) removing and adding new classes to a classification problem. Results show a considerable computational gain without compromising the accuracy of the model in comparison with both batch methodologies and other state-of-art adaptive methods.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; ADAS; 600.084 Approved no  
  Call Number Admin @ si @ DRR2019 Serial 3172  
Permanent link to this record
 

 
Author Jiaolong Xu; Liang Xiao; Antonio Lopez edit  doi
openurl 
  Title Self-supervised Domain Adaptation for Computer Vision Tasks Type Journal Article
  Year (down) 2019 Publication IEEE ACCESS Abbreviated Journal ACCESS  
  Volume 7 Issue Pages 156694 - 156706  
  Keywords  
  Abstract Recent progress of self-supervised visual representation learning has achieved remarkable success on many challenging computer vision benchmarks. However, whether these techniques can be used for domain adaptation has not been explored. In this work, we propose a generic method for self-supervised domain adaptation, using object recognition and semantic segmentation of urban scenes as use cases. Focusing on simple pretext/auxiliary tasks (e.g. image rotation prediction), we assess different learning strategies to improve domain adaptation effectiveness by self-supervision. Additionally, we propose two complementary strategies to further boost the domain adaptation accuracy on semantic segmentation within our method, consisting of prediction layer alignment and batch normalization calibration. The experimental results show adaptation levels comparable to most studied domain adaptation methods, thus, bringing self-supervision as a new alternative for reaching domain adaptation. The code is available at this link. https://github.com/Jiaolong/self-supervised-da.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS; no proj Approved no  
  Call Number Admin @ si @ XXL2019 Serial 3302  
Permanent link to this record
 

 
Author Cesar de Souza; Adrien Gaidon; Yohann Cabon; Naila Murray; Antonio Lopez edit  doi
openurl 
  Title Generating Human Action Videos by Coupling 3D Game Engines and Probabilistic Graphical Models Type Journal Article
  Year (down) 2019 Publication International Journal of Computer Vision Abbreviated Journal IJCV  
  Volume 128 Issue Pages 1505–1536  
  Keywords Procedural generation; Human action recognition; Synthetic data; Physics  
  Abstract Deep video action recognition models have been highly successful in recent years but require large quantities of manually-annotated data, which are expensive and laborious to obtain. In this work, we investigate the generation of synthetic training data for video action recognition, as synthetic data have been successfully used to supervise models for a variety of other computer vision tasks. We propose an interpretable parametric generative model of human action videos that relies on procedural generation, physics models and other components of modern game engines. With this model we generate a diverse, realistic, and physically plausible dataset of human action videos, called PHAV for “Procedural Human Action Videos”. PHAV contains a total of 39,982 videos, with more than 1000 examples for each of 35 action categories. Our video generation approach is not limited to existing motion capture sequences: 14 of these 35 categories are procedurally-defined synthetic actions. In addition, each video is represented with 6 different data modalities, including RGB, optical flow and pixel-level semantic labels. These modalities are generated almost simultaneously using the Multiple Render Targets feature of modern GPUs. In order to leverage PHAV, we introduce a deep multi-task (i.e. that considers action classes from multiple datasets) representation learning architecture that is able to simultaneously learn from synthetic and real video datasets, even when their action categories differ. Our experiments on the UCF-101 and HMDB-51 benchmarks suggest that combining our large set of synthetic videos with small real-world datasets can boost recognition performance. Our approach also significantly outperforms video representations produced by fine-tuning state-of-the-art unsupervised generative models of videos.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS; 600.124 Approved no  
  Call Number Admin @ si @ SGC2019 Serial 3303  
Permanent link to this record
 

 
Author Daniel Hernandez; Lukas Schneider; P. Cebrian; A. Espinosa; David Vazquez; Antonio Lopez; Uwe Franke; Marc Pollefeys; Juan Carlos Moure edit  url
openurl 
  Title Slanted Stixels: A way to represent steep streets Type Journal Article
  Year (down) 2019 Publication International Journal of Computer Vision Abbreviated Journal IJCV  
  Volume 127 Issue Pages 1643–1658  
  Keywords  
  Abstract This work presents and evaluates a novel compact scene representation based on Stixels that infers geometric and semantic information. Our approach overcomes the previous rather restrictive geometric assumptions for Stixels by introducing a novel depth model to account for non-flat roads and slanted objects. Both semantic and depth cues are used jointly to infer the scene representation in a sound global energy minimization formulation. Furthermore, a novel approximation scheme is introduced in order to significantly reduce the computational complexity of the Stixel algorithm, and then achieve real-time computation capabilities. The idea is to first perform an over-segmentation of the image, discarding the unlikely Stixel cuts, and apply the algorithm only on the remaining Stixel cuts. This work presents a novel over-segmentation strategy based on a fully convolutional network, which outperforms an approach based on using local extrema of the disparity map. We evaluate the proposed methods in terms of semantic and geometric accuracy as well as run-time on four publicly available benchmark datasets. Our approach maintains accuracy on flat road scene datasets while improving substantially on a novel non-flat road dataset.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS; 600.118; 600.124 Approved no  
  Call Number Admin @ si @ HSC2019 Serial 3304  
Permanent link to this record
 

 
Author Zhijie Fang; Antonio Lopez edit  url
doi  openurl
  Title Intention Recognition of Pedestrians and Cyclists by 2D Pose Estimation Type Journal Article
  Year (down) 2019 Publication IEEE Transactions on Intelligent Transportation Systems Abbreviated Journal TITS  
  Volume Issue Pages  
  Keywords  
  Abstract Anticipating the intentions of vulnerable road users (VRUs) such as pedestrians and cyclists is critical for performing safe and comfortable driving maneuvers. This is the case for human driving and, thus, should be taken into account by systems providing any level of driving assistance, from advanced driver assistant systems (ADAS) to fully autonomous vehicles (AVs). In this paper, we show how the latest advances on monocular vision-based human pose estimation, i.e. those relying on deep Convolutional Neural Networks (CNNs), enable to recognize the intentions of such VRUs. In the case of cyclists, we assume that they follow traffic rules to indicate future maneuvers with arm signals. In the case of pedestrians, no indications can be assumed. Instead, we hypothesize that the walking pattern of a pedestrian allows to determine if he/she has the intention of crossing the road in the path of the ego-vehicle, so that the ego-vehicle must maneuver accordingly (e.g. slowing down or stopping). In this paper, we show how the same methodology can be used for recognizing pedestrians and cyclists' intentions. For pedestrians, we perform experiments on the JAAD dataset. For cyclists, we did not found an analogous dataset, thus, we created our own one by acquiring and annotating videos which we share with the research community. Overall, the proposed pipeline provides new state-of-the-art results on the intention recognition of VRUs.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS; no proj Approved no  
  Call Number Admin @ si @ FaL2019 Serial 3305  
Permanent link to this record
 

 
Author Marçal Rusiñol; J. Chazalon; Katerine Diaz edit   pdf
doi  openurl
  Title Augmented Songbook: an Augmented Reality Educational Application for Raising Music Awareness Type Journal Article
  Year (down) 2018 Publication Multimedia Tools and Applications Abbreviated Journal MTAP  
  Volume 77 Issue 11 Pages 13773-13798  
  Keywords Augmented reality; Document image matching; Educational applications  
  Abstract This paper presents the development of an Augmented Reality mobile application which aims at sensibilizing young children to abstract concepts of music. Such concepts are, for instance, the musical notation or the idea of rhythm. Recent studies in Augmented Reality for education suggest that such technologies have multiple benefits for students, including younger ones. As mobile document image acquisition and processing gains maturity on mobile platforms, we explore how it is possible to build a markerless and real-time application to augment the physical documents with didactic animations and interactive virtual content. Given a standard image processing pipeline, we compare the performance of different local descriptors at two key stages of the process. Results suggest alternatives to the SIFT local descriptors, regarding result quality and computational efficiency, both for document model identification and perspective transform estimation. All experiments are performed on an original and public dataset we introduce here.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; ADAS; 600.084; 600.121; 600.118 Approved no  
  Call Number Admin @ si @ RCD2018 Serial 2996  
Permanent link to this record
 

 
Author Katerine Diaz; Jesus Martinez del Rincon; Aura Hernandez-Sabate; Marçal Rusiñol; Francesc J. Ferri edit   pdf
doi  openurl
  Title Fast Kernel Generalized Discriminative Common Vectors for Feature Extraction Type Journal Article
  Year (down) 2018 Publication Journal of Mathematical Imaging and Vision Abbreviated Journal JMIV  
  Volume 60 Issue 4 Pages 512-524  
  Keywords  
  Abstract This paper presents a supervised subspace learning method called Kernel Generalized Discriminative Common Vectors (KGDCV), as a novel extension of the known Discriminative Common Vectors method with Kernels. Our method combines the advantages of kernel methods to model complex data and solve nonlinear
problems with moderate computational complexity, with the better generalization properties of generalized approaches for large dimensional data. These attractive combination makes KGDCV specially suited for feature extraction and classification in computer vision, image processing and pattern recognition applications. Two different approaches to this generalization are proposed, a first one based on the kernel trick (KT) and a second one based on the nonlinear projection trick (NPT) for even higher efficiency. Both methodologies
have been validated on four different image datasets containing faces, objects and handwritten digits, and compared against well known non-linear state-of-art methods. Results show better discriminant properties than other generalized approaches both linear or kernel. In addition, the KGDCV-NPT approach presents a considerable computational gain, without compromising the accuracy of the model.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; ADAS; 600.086; 600.130; 600.121; 600.118 Approved no  
  Call Number Admin @ si @ DMH2018a Serial 3062  
Permanent link to this record
Select All    Deselect All
 |   | 
Details

Save Citations:
Export Records: