Author Sudeep Katakol; Basem Elbarashy; Luis Herranz; Joost Van de Weijer; Antonio Lopez
  Title Distributed Learning and Inference with Compressed Images Type Journal Article
  Year 2021 Publication IEEE Transactions on Image Processing Abbreviated Journal TIP  
  Volume 30 Issue Pages 3069 - 3083  
  Abstract Modern computer vision requires processing large amounts of data, both while training the model and during inference, once the model is deployed. Scenarios where images are captured and processed in physically separated locations are increasingly common (e.g. autonomous vehicles, cloud computing). In addition, many devices have limited resources to store or transmit data (e.g. storage space, channel capacity). In these scenarios, lossy image compression plays a crucial role in effectively increasing the number of images collected under such constraints. However, lossy compression entails some undesired degradation of the data that may harm the performance of the downstream analysis task at hand, since important semantic information may be lost in the process. Moreover, we may only have compressed images at training time but be able to use original images at inference time, or vice versa; in such a case, the downstream model suffers from covariate shift. In this paper, we analyze this phenomenon, with a special focus on vision-based perception for autonomous driving as a paradigmatic scenario. We see that loss of semantic information and covariate shift do indeed exist, resulting in a drop in performance that depends on the compression rate. In order to address the problem, we propose dataset restoration, based on image restoration with generative adversarial networks (GANs). Our method is agnostic to both the particular image compression method and the downstream task, and has the advantage of adding no extra cost to the deployed models, which is particularly important in resource-limited devices. The presented experiments focus on semantic segmentation as a challenging use case, cover a broad range of compression rates and diverse datasets, and show how our method is able to significantly alleviate the negative effects of compression on the downstream visual task.  
  Notes LAMP; ADAS; 600.120; 600.118 Approved no  
  Call Number Admin @ si @ KEH2021 Serial 3543  
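
Below is a minimal sketch (not the authors' code) of the dataset-restoration step described in the abstract above: the compressed training images are restored once, offline, by a pre-trained GAN generator, so the deployed downstream model pays no extra cost at inference time. The generator, folder layout and file formats are assumptions for illustration.

# Assumption: `generator` is a pre-trained restoration network (e.g. a GAN
# generator) loaded as a torch.nn.Module; `src_dir` holds JPEG-compressed
# training images. Restoration happens once, offline, on the training set.
import torch
from pathlib import Path
from PIL import Image
from torchvision.transforms.functional import to_tensor, to_pil_image

@torch.no_grad()
def restore_dataset(generator: torch.nn.Module, src_dir: str, dst_dir: str) -> None:
    generator.eval()
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for path in sorted(Path(src_dir).glob("*.jpg")):
        x = to_tensor(Image.open(path).convert("RGB")).unsqueeze(0)  # 1 x 3 x H x W in [0, 1]
        y = generator(x).clamp(0.0, 1.0)                             # restored image
        to_pil_image(y.squeeze(0)).save(dst / (path.stem + ".png"))

# The downstream task (e.g. semantic segmentation) is then trained on dst_dir
# exactly as it would be on the original data; the approach is agnostic to the
# codec and to the downstream architecture.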
 

 
Author Yi Xiao; Felipe Codevilla; Akhil Gurram; Onay Urfalioglu; Antonio Lopez
  Title Multimodal end-to-end autonomous driving Type Journal Article
  Year 2020 Publication IEEE Transactions on Intelligent Transportation Systems Abbreviated Journal TITS  
  Volume Issue Pages 1-11  
  Abstract A crucial component of an autonomous vehicle (AV) is the artificial intelligence (AI) that is able to drive towards a desired destination. Today, there are different paradigms addressing the development of AI drivers. On the one hand, we find modular pipelines, which divide the driving task into sub-tasks such as perception, maneuver planning and control. On the other hand, we find end-to-end driving approaches that try to learn a direct mapping from input raw sensor data to vehicle control signals. The latter are relatively less studied, but are gaining popularity since they are less demanding in terms of sensor data annotation. This paper focuses on end-to-end autonomous driving. So far, most proposals relying on this paradigm assume RGB images as input sensor data. However, AVs will not be equipped only with cameras, but also with active sensors providing accurate depth information (e.g., LiDARs). Accordingly, this paper analyses whether combining RGB and depth modalities, i.e. using RGBD data, produces better end-to-end AI drivers than relying on a single modality. We consider multimodality based on early, mid and late fusion schemes, both in multisensory and single-sensor (monocular depth estimation) settings. Using the CARLA simulator and conditional imitation learning (CIL), we show how, indeed, early fusion multimodality outperforms single-modality.  
  Notes ADAS Approved no  
  Call Number Admin @ si @ XCG2020 Serial 3490  
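
A minimal sketch of the early-fusion scheme mentioned in the abstract above: RGB and depth are concatenated into a single 4-channel input to one CNN that regresses the control signals. The layer sizes and the three-output control head are illustrative assumptions, not the architecture used in the paper.

import torch
import torch.nn as nn

class EarlyFusionDriver(nn.Module):
    """Toy end-to-end driver: early fusion of RGB (3 ch) and depth (1 ch)."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(64, 3)  # e.g. steering, throttle, brake (illustrative)

    def forward(self, rgb: torch.Tensor, depth: torch.Tensor) -> torch.Tensor:
        x = torch.cat([rgb, depth], dim=1)  # early fusion: B x 4 x H x W
        return self.head(self.backbone(x))

# Mid fusion would instead run separate encoders on rgb and depth and merge
# their feature maps; late fusion would merge the outputs of two independent
# networks. Conditional imitation learning additionally conditions the head on
# a high-level navigation command.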
 

 
Author Fei Yang; Luis Herranz; Joost Van de Weijer; Jose Antonio Iglesias; Antonio Lopez; Mikhail Mozerov
  Title Variable Rate Deep Image Compression with Modulated Autoencoder Type Journal Article
  Year 2020 Publication IEEE Signal Processing Letters Abbreviated Journal SPL  
  Volume 27 Issue Pages 331-335  
  Abstract Variable rate is a requirement for flexible and adaptable image and video compression. However, deep image compression methods (DIC) are optimized for a single fixed rate-distortion (R-D) tradeoff. While this can be addressed by training multiple models for different tradeoffs, the memory requirements increase proportionally to the number of models. Scaling the bottleneck representation of a shared autoencoder can provide variable rate compression with a single shared autoencoder. However, the R-D performance using this simple mechanism degrades at low bitrates, and the effective range of bitrates also shrinks. To address these limitations, we formulate the problem of variable R-D optimization for DIC, and propose modulated autoencoders (MAEs), where the representations of a shared autoencoder are adapted to the specific R-D tradeoff via a modulation network. Jointly training this modulated autoencoder and the modulation network provides an effective way to navigate the R-D operational curve. Our experiments show that the proposed method can achieve almost the same R-D performance as independent models, with significantly fewer parameters.  
  Notes LAMP; ADAS; 600.141; 600.120; 600.118 Approved no  
  Call Number Admin @ si @ YHW2020 Serial 3346  
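
A minimal sketch of the modulated-autoencoder idea summarized above: a small modulation network maps the desired R-D tradeoff to per-channel gains that scale the bottleneck of a shared autoencoder. The toy encoder/decoder and the rounding stand-in for quantization are assumptions, not the paper's networks.

import torch
import torch.nn as nn

class ModulatedAE(nn.Module):
    def __init__(self, channels: int = 64):
        super().__init__()
        self.encoder = nn.Conv2d(3, channels, 5, stride=2, padding=2)
        self.decoder = nn.ConvTranspose2d(channels, 3, 5, stride=2, padding=2, output_padding=1)
        # modulation network: maps a scalar tradeoff parameter to per-channel gains
        self.modulator = nn.Sequential(
            nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, channels), nn.Softplus()
        )

    def forward(self, x: torch.Tensor, lam: torch.Tensor) -> torch.Tensor:
        z = self.encoder(x)
        gain = self.modulator(lam.view(-1, 1)).unsqueeze(-1).unsqueeze(-1)  # B x C x 1 x 1
        z_hat = torch.round(z * gain)      # modulate the shared bottleneck, toy quantization
        return self.decoder(z_hat / gain)  # demodulate and decode

# One set of autoencoder weights thus serves many rate points instead of one
# model per R-D tradeoff; actual training would replace the hard rounding with
# a differentiable proxy (e.g. additive uniform noise) plus a rate term.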
 

 
Author Akhil Gurram; Onay Urfalioglu; Ibrahim Halfaoui; Fahd Bouzaraa; Antonio Lopez
  Title Semantic Monocular Depth Estimation Based on Artificial Intelligence Type Journal Article
  Year 2020 Publication IEEE Intelligent Transportation Systems Magazine Abbreviated Journal ITSM  
  Volume 13 Issue 4 Pages 99-103  
  Abstract Depth estimation provides essential information to perform autonomous driving and driver assistance. A promising line of work consists of introducing additional semantic information about the traffic scene when training CNNs for depth estimation. In practice, this means that the depth data used for CNN training is complemented with images having pixel-wise semantic labels where the same raw training data is associated with both types of ground truth, i.e., depth and semantic labels. The main contribution of this paper is to show that this hard constraint can be circumvented, i.e., that we can train CNNs for depth estimation by leveraging the depth and semantic information coming from heterogeneous datasets. In order to illustrate the benefits of our approach, we combine KITTI depth and Cityscapes semantic segmentation datasets, outperforming state-of-the-art results on monocular depth estimation.  
  Notes ADAS; 600.124; 600.118 Approved no  
  Call Number Admin @ si @ GUH2019 Serial 3306  
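
A minimal sketch of the heterogeneous-supervision idea described above: a shared encoder with a depth head and a semantic head, trained by alternating batches whose ground truth comes from different datasets (e.g. KITTI depth, Cityscapes semantics). The tiny networks and hyperparameters are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU())
depth_head = nn.Conv2d(64, 1, 3, padding=1)        # per-pixel depth
semantic_head = nn.Conv2d(64, 19, 3, padding=1)    # e.g. 19 Cityscapes classes
params = list(encoder.parameters()) + list(depth_head.parameters()) + list(semantic_head.parameters())
optimizer = torch.optim.Adam(params, lr=1e-4)

def train_step(batch, task: str) -> float:
    """batch = (images, target); target is a depth map or a semantic label map."""
    images, target = batch
    features = encoder(images)
    if task == "depth":
        loss = F.l1_loss(depth_head(features), target)
    else:  # "semantics"
        loss = F.cross_entropy(semantic_head(features), target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Alternating train_step(depth_batch, "depth") and train_step(city_batch, "semantics")
# lets the shared features benefit from semantic supervision even though no single
# image carries both kinds of ground truth; only the depth head is used at test time.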
 

 
Author Zhijie Fang; Antonio Lopez
  Title Intention Recognition of Pedestrians and Cyclists by 2D Pose Estimation Type Journal Article
  Year 2019 Publication IEEE Transactions on Intelligent Transportation Systems Abbreviated Journal TITS  
  Volume 21 Issue 11 Pages 4773 - 4783  
  Abstract Anticipating the intentions of vulnerable road users (VRUs) such as pedestrians and cyclists is critical for performing safe and comfortable driving maneuvers. This is the case for human driving and, thus, should be taken into account by systems providing any level of driving assistance, from advanced driver assistance systems (ADAS) to fully autonomous vehicles (AVs). In this paper, we show how the latest advances in monocular vision-based human pose estimation, i.e. those relying on deep Convolutional Neural Networks (CNNs), make it possible to recognize the intentions of such VRUs. In the case of cyclists, we assume that they follow traffic rules and indicate future maneuvers with arm signals. In the case of pedestrians, no such indications can be assumed. Instead, we hypothesize that the walking pattern of a pedestrian allows us to determine whether he/she intends to cross the road in the path of the ego-vehicle, so that the ego-vehicle must maneuver accordingly (e.g. slowing down or stopping). In this paper, we show how the same methodology can be used for recognizing the intentions of both pedestrians and cyclists. For pedestrians, we perform experiments on the JAAD dataset. For cyclists, we did not find an analogous dataset, so we created our own by acquiring and annotating videos, which we share with the research community. Overall, the proposed pipeline provides new state-of-the-art results on the intention recognition of VRUs.  
  Notes ADAS; 600.118 Approved no  
  Call Number Admin @ si @ FaL2019 Serial 3305  
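
A minimal sketch of the kind of pipeline described above: per-frame 2D body keypoints from an off-the-shelf pose estimator are stacked over a short temporal window and fed to a classifier that predicts the VRU's intention. The window length, feature normalization and random-forest classifier are assumptions for illustration, not the paper's exact setup.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

N_JOINTS, WINDOW = 18, 14  # 18 keypoints per frame, 14-frame window (illustrative)

def window_to_feature(keypoints: np.ndarray) -> np.ndarray:
    """keypoints: (WINDOW, N_JOINTS, 2) image coordinates from a pose estimator."""
    centered = keypoints - keypoints.mean(axis=1, keepdims=True)  # drop global position per frame
    return centered.reshape(-1)                                   # flatten to one feature vector

# X_train: one row per window, y_train: intention label per window (e.g. crossing /
# not crossing, or a cyclist arm signal):
# clf = RandomForestClassifier(n_estimators=200).fit(X_train, y_train)
# intention = clf.predict(window_to_feature(current_window)[None, :])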
 

 
Author Jose L. Gomez; Gabriel Villalonga; Antonio Lopez
  Title Co-Training for Deep Object Detection: Comparing Single-Modal and Multi-Modal Approaches Type Journal Article
  Year 2021 Publication Sensors Abbreviated Journal SENS  
  Volume 21 Issue 9 Pages 3185  
  Keywords co-training; multi-modality; vision-based object detection; ADAS; self-driving  
  Abstract Top-performing computer vision models are powered by convolutional neural networks (CNNs). Training an accurate CNN highly depends on both the raw sensor data and their associated ground truth (GT). Collecting such GT is usually done through human labeling, which is time-consuming and does not scale as we would wish. This data-labeling bottleneck may be intensified by domain shifts among image sensors, which could force per-sensor data labeling. In this paper, we focus on the use of co-training, a semi-supervised learning (SSL) method, for obtaining self-labeled object bounding boxes (BBs), i.e., the GT to train deep object detectors. In particular, we assess the goodness of multi-modal co-training by relying on two different views of an image, namely, appearance (RGB) and estimated depth (D). Moreover, we compare appearance-based single-modal co-training with the multi-modal variant. Our results suggest that in a standard SSL setting (no domain shift, a few human-labeled data) and under virtual-to-real domain shift (many virtual-world labeled data, no human-labeled data), multi-modal co-training outperforms single-modal. In the latter case, by performing GAN-based domain translation, both co-training modalities are on par, at least when using an off-the-shelf depth estimation model not specifically trained on the translated images.  
  Notes ADAS; 600.118 Approved no  
  Call Number Admin @ si @ GVL2021 Serial 3562  
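
A minimal sketch of a multi-modal co-training loop in the spirit of the abstract above: an appearance (RGB) detector and a depth (D) detector take turns self-labeling the unlabeled pool, and the confident detections of each view are added to the training set of the other. The helper functions are placeholders supplied by the caller, not the paper's implementation.

def cotrain(labeled, unlabeled, train_detector, detect, estimate_depth,
            cycles=5, conf_thr=0.8):
    """train_detector(samples, modality) -> model; detect(model, img) -> boxes,
    each with a .score; estimate_depth(img) -> depth map. All three callables
    are placeholders provided by the caller."""
    rgb_set, depth_set = list(labeled), list(labeled)
    rgb_model = depth_model = None
    for _ in range(cycles):
        rgb_model = train_detector(rgb_set, modality="rgb")
        depth_model = train_detector(depth_set, modality="depth")
        for image in unlabeled:
            rgb_boxes = [b for b in detect(rgb_model, image) if b.score > conf_thr]
            d_boxes = [b for b in detect(depth_model, estimate_depth(image)) if b.score > conf_thr]
            if rgb_boxes:                          # each view teaches the other
                depth_set.append((image, rgb_boxes))
            if d_boxes:
                rgb_set.append((image, d_boxes))
    return rgb_model, depth_model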
 

 
Author Xavier Soria; Angel Sappa; Riad I. Hammoud
  Title Wide-Band Color Imagery Restoration for RGB-NIR Single Sensor Images Type Journal Article
  Year 2018 Publication Sensors Abbreviated Journal SENS  
  Volume 18 Issue 7 Pages 2059  
  Keywords RGB-NIR sensor; multispectral imaging; deep learning; CNNs  
  Abstract Multi-spectral RGB-NIR sensors have become ubiquitous in recent years. These sensors allow the visible and near-infrared spectral bands of a given scene to be captured at the same time. With such cameras, the acquired imagery has a compromised RGB color representation due to the near-infrared bands (700–1100 nm) cross-talking with the visible bands (400–700 nm). This paper proposes two deep learning-based architectures to recover the full RGB color images, thus removing the NIR information from the visible bands. The proposed approaches directly restore the high-resolution RGB image by means of convolutional neural networks. They are evaluated on several outdoor images; both architectures reach a similar performance when evaluated in different scenarios and using different similarity metrics. Both of them improve on state-of-the-art approaches.  
  Notes ADAS; MSIAU; 600.086; 600.130; 600.122; 600.118 Approved no  
  Call Number Admin @ si @ SSH2018 Serial 3145  
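
A minimal sketch of the restoration setting described above: a convolutional network maps an RGB image contaminated by NIR cross-talk to a clean RGB image of the same resolution. The tiny residual architecture is an illustrative assumption, not one of the two architectures proposed in the paper.

import torch
import torch.nn as nn

class RGBNIRRestorer(nn.Module):
    def __init__(self, width: int = 32, depth: int = 4):
        super().__init__()
        layers = [nn.Conv2d(3, width, 3, padding=1), nn.ReLU()]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(width, width, 3, padding=1), nn.ReLU()]
        layers += [nn.Conv2d(width, 3, 3, padding=1)]
        self.body = nn.Sequential(*layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: contaminated RGB in [0, 1]; the network predicts a residual correction
        return (x + self.body(x)).clamp(0.0, 1.0)

# Training would minimize an image-similarity loss (e.g. L1) between the output
# and a reference RGB image free of NIR contamination.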
 

 
Author David Vazquez; Jorge Bernal; F. Javier Sanchez; Gloria Fernandez Esparrach; Antonio Lopez; Adriana Romero; Michal Drozdzal; Aaron Courville
  Title A Benchmark for Endoluminal Scene Segmentation of Colonoscopy Images Type Journal Article
  Year 2017 Publication Journal of Healthcare Engineering Abbreviated Journal JHCE  
  Volume Issue Pages 2040-2295  
  Keywords Colonoscopy images; Deep Learning; Semantic Segmentation  
  Abstract Colorectal cancer (CRC) is the third cause of cancer death worldwide. Currently, the standard approach to reduce CRC-related mortality is to perform regular screening in search for polyps, and colonoscopy is the screening tool of choice. The main limitations of this screening procedure are the polyp miss rate and the inability to perform a visual assessment of polyp malignancy. These drawbacks can be reduced by designing Decision Support Systems (DSS) aiming to help clinicians in the different stages of the procedure by providing endoluminal scene segmentation. Thus, in this paper, we introduce an extended benchmark of colonoscopy image segmentation, with the hope of establishing a new strong benchmark for colonoscopy image analysis research. The proposed dataset consists of 4 relevant classes to inspect the endoluminal scene, targeting different clinical needs. Together with the dataset and taking advantage of advances in the semantic segmentation literature, we provide new baselines by training standard fully convolutional networks (FCN). We perform a comparative study to show that FCNs significantly outperform, without any further post-processing, prior results in endoluminal scene segmentation, especially with respect to polyp segmentation and localization.  
  Notes ADAS; MV; 600.075; 600.085; 600.076; 601.281; 600.118 Approved no  
  Call Number VBS2017b Serial 2940  
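
A minimal sketch of the kind of fully convolutional baseline discussed above, assuming torchvision's FCN-ResNet50 with 5 output channels (4 endoluminal classes plus background); the exact backbone and class count benchmarked in the paper may differ.

import torch
import torch.nn.functional as F
from torchvision.models.segmentation import fcn_resnet50

model = fcn_resnet50(weights=None, num_classes=5)  # assumed class count, see lead-in
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(images: torch.Tensor, masks: torch.Tensor) -> float:
    """images: B x 3 x H x W, masks: B x H x W with integer class ids per pixel."""
    logits = model(images)["out"]            # B x 5 x H x W per-pixel scores
    loss = F.cross_entropy(logits, masks)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()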
 

 
Author Hannes Mueller; Andre Groeger; Jonathan Hersh; Andrea Matranga; Joan Serrat
  Title Monitoring war destruction from space using machine learning Type Journal Article
  Year 2021 Publication Proceedings of the National Academy of Sciences of the United States of America Abbreviated Journal PNAS  
  Volume 118 Issue 23 Pages e2025400118  
  Abstract Existing data on building destruction in conflict zones rely on eyewitness reports or manual detection, which makes them generally scarce, incomplete, and potentially biased. This lack of reliable data imposes severe limitations for media reporting, humanitarian relief efforts, human-rights monitoring, reconstruction initiatives, and academic studies of violent conflict. This article introduces an automated method for measuring destruction in high-resolution satellite images using deep-learning techniques combined with label augmentation and spatial and temporal smoothing, which exploit the underlying spatial and temporal structure of destruction. As a proof of concept, we apply this method to the Syrian civil war and reconstruct the evolution of damage in major cities across the country. Our approach allows generating destruction data with unprecedented scope, resolution, and frequency, and makes use of the ever-higher frequency at which satellite imagery becomes available.  
  Notes ADAS; 600.118 Approved no  
  Call Number Admin @ si @ MGH2021 Serial 3584  
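
A minimal sketch of the spatial and temporal smoothing mentioned above: noisy per-tile, per-date destruction scores from a classifier are averaged over neighbouring tiles and dates before thresholding. The uniform filter and window sizes are illustrative assumptions, not the paper's exact smoothing.

import numpy as np
from scipy.ndimage import uniform_filter

def smooth_scores(scores: np.ndarray) -> np.ndarray:
    """scores: (T, H, W) array of destruction probabilities per date and tile."""
    # size=(3, 5, 5): average over 3 dates and a 5x5 block of neighbouring tiles
    return uniform_filter(scores, size=(3, 5, 5), mode="nearest")

# Example: flag a tile as destroyed once its smoothed score exceeds a threshold
# destroyed = smooth_scores(raw_scores) > 0.5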
 

 
Author Miguel Oliveira; Victor Santos; Angel Sappa; P. Dias; A. Moreira
  Title Incremental texture mapping for autonomous driving Type Journal Article
  Year 2016 Publication Robotics and Autonomous Systems Abbreviated Journal RAS  
  Volume 84 Issue Pages 113-128  
  Keywords Scene reconstruction; Autonomous driving; Texture mapping  
  Abstract Autonomous vehicles have a large number of on-board sensors, not only for providing coverage all around the vehicle, but also to ensure multi-modality in the observation of the scene. Because of this, it is not trivial to come up with a single, unique representation that feeds from the data given by all these sensors. We propose an algorithm which is capable of mapping texture collected from vision based sensors onto a geometric description of the scenario constructed from data provided by 3D sensors. The algorithm uses a constrained Delaunay triangulation to produce a mesh which is updated using a specially devised sequence of operations. These enforce a partial configuration of the mesh that avoids bad quality textures and ensures that there are no gaps in the texture. Results show that this algorithm is capable of producing fine quality textures.  
  Notes ADAS; 600.086 Approved no  
  Call Number Admin @ si @ OSS2016b Serial 2912  
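
A minimal sketch of the incremental texture-update idea described above: each mesh triangle keeps the texture from the camera that has seen it best so far, scored here simply by how frontally the camera views the face. This scoring rule and data layout are assumptions for illustration; the paper's operations on a constrained Delaunay mesh are not reproduced.

import numpy as np

def update_textures(normals, centroids, best_score, best_cam, cam_position, cam_id):
    """normals, centroids: (N, 3) per triangle; best_score, best_cam: (N,) running state."""
    view_dirs = cam_position - centroids
    view_dirs /= np.linalg.norm(view_dirs, axis=1, keepdims=True)
    score = np.einsum("ij,ij->i", normals, view_dirs)  # 1.0 means the face is seen head-on
    improved = score > best_score
    best_score[improved] = score[improved]
    best_cam[improved] = cam_id                        # re-texture only the improved faces
    return best_score, best_cam

# Called once per incoming camera image, this keeps the texture assignment
# consistent and gap-free while only updating faces whose view quality improved.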