toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
   print
  Records Links
Author Akhil Gurram; Ahmet Faruk Tuna; Fengyi Shen; Onay Urfalioglu; Antonio Lopez edit   pdf
doi  openurl
  Title Monocular Depth Estimation through Virtual-world Supervision and Real-world SfM Self-Supervision Type Journal Article
  Year 2021 Publication IEEE Transactions on Intelligent Transportation Systems Abbreviated Journal TITS  
  Volume 23 Issue 8 Pages 12738-12751  
  Keywords  
  Abstract Depth information is essential for on-board perception in autonomous driving and driver assistance. Monocular depth estimation (MDE) is very appealing since it allows for appearance and depth being on direct pixelwise correspondence without further calibration. Best MDE models are based on Convolutional Neural Networks (CNNs) trained in a supervised manner, i.e., assuming pixelwise ground truth (GT). Usually, this GT is acquired at training time through a calibrated multi-modal suite of sensors. However, also using only a monocular system at training time is cheaper and more scalable. This is possible by relying on structure-from-motion (SfM) principles to generate self-supervision. Nevertheless, problems of camouflaged objects, visibility changes, static-camera intervals, textureless areas, and scale ambiguity, diminish the usefulness of such self-supervision. In this paper, we perform monocular depth estimation by virtual-world supervision (MonoDEVS) and real-world SfM self-supervision. We compensate the SfM self-supervision limitations by leveraging virtual-world images with accurate semantic and depth supervision and addressing the virtual-to-real domain gap. Our MonoDEVSNet outperforms previous MDE CNNs trained on monocular and even stereo sequences.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes (down) ADAS; 600.118 Approved no  
  Call Number Admin @ si @ GTS2021 Serial 3598  
Permanent link to this record
 

 
Author Gabriel Villalonga edit  isbn
openurl 
  Title Leveraging Synthetic Data to Create Autonomous Driving Perception Systems Type Book Whole
  Year 2021 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Manually annotating images to develop vision models has been a major bottleneck
since computer vision and machine learning started to walk together. This has
been more evident since computer vision falls on the shoulders of data-hungry
deep learning techniques. When addressing on-board perception for autonomous
driving, the curse of data annotation is exacerbated due to the use of additional
sensors such as LiDAR. Therefore, any approach aiming at reducing such a timeconsuming and costly work is of high interest for addressing autonomous driving
and, in fact, for any application requiring some sort of artificial perception. In the
last decade, it has been shown that leveraging from synthetic data is a paradigm
worth to pursue in order to minimizing manual data annotation. The reason is
that the automatic process of generating synthetic data can also produce different
types of associated annotations (e.g. object bounding boxes for synthetic images
and LiDAR pointclouds, pixel/point-wise semantic information, etc.). Directly
using synthetic data for training deep perception models may not be the definitive
solution in all circumstances since it can appear a synth-to-real domain shift. In
this context, this work focuses on leveraging synthetic data to alleviate manual
annotation for three perception tasks related to driving assistance and autonomous
driving. In all cases, we assume the use of deep convolutional neural networks
(CNNs) to develop our perception models.
The first task addresses traffic sign recognition (TSR), a kind of multi-class
classification problem. We assume that the number of sign classes to be recognized
must be suddenly increased without having annotated samples to perform the
corresponding TSR CNN re-training. We show that leveraging synthetic samples of
such new classes and transforming them by a generative adversarial network (GAN)
trained on the known classes (i.e. without using samples from the new classes), it is
possible to re-train the TSR CNN to properly classify all the signs for a ∼ 1/4 ratio of
new/known sign classes. The second task addresses on-board 2D object detection,
focusing on vehicles and pedestrians. In this case, we assume that we receive a set
of images without the annotations required to train an object detector, i.e. without
object bounding boxes. Therefore, our goal is to self-annotate these images so
that they can later be used to train the desired object detector. In order to reach
this goal, we leverage from synthetic data and propose a semi-supervised learning
approach based on the co-training idea. In fact, we use a GAN to reduce the synthto-real domain shift before applying co-training. Our quantitative results show
that co-training and GAN-based image-to-image translation complement each
other up to allow the training of object detectors without manual annotation, and still almost reaching the upper-bound performances of the detectors trained from
human annotations. While in previous tasks we focus on vision-based perception,
the third task we address focuses on LiDAR pointclouds. Our initial goal was to
develop a 3D object detector trained on synthetic LiDAR-style pointclouds. While
for images we may expect synth/real-to-real domain shift due to differences in
their appearance (e.g. when source and target images come from different camera
sensors), we did not expect so for LiDAR pointclouds since these active sensors
factor out appearance and provide sampled shapes. However, in practice, we have
seen that it can be domain shift even among real-world LiDAR pointclouds. Factors
such as the sampling parameters of the LiDARs, the sensor suite configuration onboard the ego-vehicle, and the human annotation of 3D bounding boxes, do induce
a domain shift. We show it through comprehensive experiments with different
publicly available datasets and 3D detectors. This redirected our goal towards the
design of a GAN for pointcloud-to-pointcloud translation, a relatively unexplored
topic.
Finally, it is worth to mention that all the synthetic datasets used for these three
tasks, have been designed and generated in the context of this PhD work and will
be publicly released. Overall, we think this PhD presents several steps forward to
encourage leveraging synthetic data for developing deep perception models in the
field of driving assistance and autonomous driving.
 
  Address February 2021  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Antonio Lopez;German Ros  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-122714-2-3 Medium  
  Area Expedition Conference  
  Notes (down) ADAS; 600.118 Approved no  
  Call Number Admin @ si @ Vil2021 Serial 3599  
Permanent link to this record
 

 
Author Yi Xiao; Felipe Codevilla; Christopher Pal; Antonio Lopez edit   pdf
openurl 
  Title Action-Based Representation Learning for Autonomous Driving Type Conference Article
  Year 2020 Publication Conference on Robot Learning Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Human drivers produce a vast amount of data which could, in principle, be used to improve autonomous driving systems. Unfortunately, seemingly straightforward approaches for creating end-to-end driving models that map sensor data directly into driving actions are problematic in terms of interpretability, and typically have significant difficulty dealing with spurious correlations. Alternatively, we propose to use this kind of action-based driving data for learning representations. Our experiments show that an affordance-based driving model pre-trained with this approach can leverage a relatively small amount of weakly annotated imagery and outperform pure end-to-end driving models, while being more interpretable. Further, we demonstrate how this strategy outperforms previous methods based on learning inverse dynamics models as well as other methods based on heavy human supervision (ImageNet).  
  Address virtual; November 2020  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CORL  
  Notes (down) ADAS; 600.118 Approved no  
  Call Number Admin @ si @ XCP2020 Serial 3487  
Permanent link to this record
 

 
Author Gabriel Villalonga; Antonio Lopez edit   pdf
doi  openurl
  Title Co-Training for On-Board Deep Object Detection Type Journal Article
  Year 2020 Publication IEEE Access Abbreviated Journal ACCESS  
  Volume Issue Pages 194441 - 194456  
  Keywords  
  Abstract Providing ground truth supervision to train visual models has been a bottleneck over the years, exacerbated by domain shifts which degenerate the performance of such models. This was the case when visual tasks relied on handcrafted features and shallow machine learning and, despite its unprecedented performance gains, the problem remains open within the deep learning paradigm due to its data-hungry nature. Best performing deep vision-based object detectors are trained in a supervised manner by relying on human-labeled bounding boxes which localize class instances (i.e. objects) within the training images. Thus, object detection is one of such tasks for which human labeling is a major bottleneck. In this article, we assess co-training as a semi-supervised learning method for self-labeling objects in unlabeled images, so reducing the human-labeling effort for developing deep object detectors. Our study pays special attention to a scenario involving domain shift; in particular, when we have automatically generated virtual-world images with object bounding boxes and we have real-world images which are unlabeled. Moreover, we are particularly interested in using co-training for deep object detection in the context of driver assistance systems and/or self-driving vehicles. Thus, using well-established datasets and protocols for object detection in these application contexts, we will show how co-training is a paradigm worth to pursue for alleviating object labeling, working both alone and together with task-agnostic domain adaptation.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes (down) ADAS; 600.118 Approved no  
  Call Number Admin @ si @ ViL2020 Serial 3488  
Permanent link to this record
 

 
Author Hannes Mueller; Andre Groger; Jonathan Hersh; Andrea Matranga; Joan Serrat edit   pdf
url  openurl
  Title Monitoring War Destruction from Space: A Machine Learning Approach Type Miscellaneous
  Year 2020 Publication Arxiv Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Existing data on building destruction in conflict zones rely on eyewitness reports or manual detection, which makes it generally scarce, incomplete and potentially biased. This lack of reliable data imposes severe limitations for media reporting, humanitarian relief efforts, human rights monitoring, reconstruction initiatives, and academic studies of violent conflict. This article introduces an automated method of measuring destruction in high-resolution satellite images using deep learning techniques combined with data augmentation to expand training samples. We apply this method to the Syrian civil war and reconstruct the evolution of damage in major cities across the country. The approach allows generating destruction data with unprecedented scope, resolution, and frequency – only limited by the available satellite imagery – which can alleviate data limitations decisively.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes (down) ADAS; 600.118 Approved no  
  Call Number Admin @ si @ MGH2020 Serial 3489  
Permanent link to this record
 

 
Author Jose Luis Gomez; Gabriel Villalonga; Antonio Lopez edit   pdf
url  openurl
  Title Co-Training for Deep Object Detection: Comparing Single-Modal and Multi-Modal Approaches Type Journal Article
  Year 2021 Publication Sensors Abbreviated Journal SENS  
  Volume 21 Issue 9 Pages 3185  
  Keywords co-training; multi-modality; vision-based object detection; ADAS; self-driving  
  Abstract Top-performing computer vision models are powered by convolutional neural networks (CNNs). Training an accurate CNN highly depends on both the raw sensor data and their associated ground truth (GT). Collecting such GT is usually done through human labeling, which is time-consuming and does not scale as we wish. This data-labeling bottleneck may be intensified due to domain shifts among image sensors, which could force per-sensor data labeling. In this paper, we focus on the use of co-training, a semi-supervised learning (SSL) method, for obtaining self-labeled object bounding boxes (BBs), i.e., the GT to train deep object detectors. In particular, we assess the goodness of multi-modal co-training by relying on two different views of an image, namely, appearance (RGB) and estimated depth (D). Moreover, we compare appearance-based single-modal co-training with multi-modal. Our results suggest that in a standard SSL setting (no domain shift, a few human-labeled data) and under virtual-to-real domain shift (many virtual-world labeled data, no human-labeled data) multi-modal co-training outperforms single-modal. In the latter case, by performing GAN-based domain translation both co-training modalities are on par, at least when using an off-the-shelf depth estimation model not specifically trained on the translated images.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes (down) ADAS; 600.118 Approved no  
  Call Number Admin @ si @ GVL2021 Serial 3562  
Permanent link to this record
 

 
Author Hannes Mueller; Andre Groeger; Jonathan Hersh; Andrea Matranga; Joan Serrat edit   pdf
url  doi
openurl 
  Title Monitoring war destruction from space using machine learning Type Journal Article
  Year 2021 Publication Proceedings of the National Academy of Sciences of the United States of America Abbreviated Journal PNAS  
  Volume 118 Issue 23 Pages e2025400118  
  Keywords  
  Abstract Existing data on building destruction in conflict zones rely on eyewitness reports or manual detection, which makes it generally scarce, incomplete, and potentially biased. This lack of reliable data imposes severe limitations for media reporting, humanitarian relief efforts, human-rights monitoring, reconstruction initiatives, and academic studies of violent conflict. This article introduces an automated method of measuring destruction in high-resolution satellite images using deep-learning techniques combined with label augmentation and spatial and temporal smoothing, which exploit the underlying spatial and temporal structure of destruction. As a proof of concept, we apply this method to the Syrian civil war and reconstruct the evolution of damage in major cities across the country. Our approach allows generating destruction data with unprecedented scope, resolution, and frequency—and makes use of the ever-higher frequency at which satellite imagery becomes available.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes (down) ADAS; 600.118 Approved no  
  Call Number Admin @ si @ MGH2021 Serial 3584  
Permanent link to this record
 

 
Author Felipe Codevilla; Matthias Muller; Antonio Lopez; Vladlen Koltun; Alexey Dosovitskiy edit   pdf
doi  openurl
  Title End-to-end Driving via Conditional Imitation Learning Type Conference Article
  Year 2018 Publication IEEE International Conference on Robotics and Automation Abbreviated Journal  
  Volume Issue Pages 4693 - 4700  
  Keywords  
  Abstract Deep networks trained on demonstrations of human driving have learned to follow roads and avoid obstacles. However, driving policies trained via imitation learning cannot be controlled at test time. A vehicle trained end-to-end to imitate an expert cannot be guided to take a specific turn at an upcoming intersection. This limits the utility of such systems. We propose to condition imitation learning on high-level command input. At test time, the learned driving policy functions as a chauffeur that handles sensorimotor coordination but continues to respond to navigational commands. We evaluate different architectures for conditional imitation learning in vision-based driving. We conduct experiments in realistic three-dimensional simulations of urban driving and on a 1/5 scale robotic truck that is trained to drive in a residential area. Both systems drive based on visual input yet remain responsive to high-level navigational commands. The supplementary video can be viewed at this https URL  
  Address Brisbane; Australia; May 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICRA  
  Notes (down) ADAS; 600.116; 600.124; 600.118 Approved no  
  Call Number Admin @ si @ CML2018 Serial 3108  
Permanent link to this record
 

 
Author Gemma Rotger; Francesc Moreno-Noguer; Felipe Lumbreras; Antonio Agudo edit  doi
openurl 
  Title Single view facial hair 3D reconstruction Type Conference Article
  Year 2019 Publication 9th Iberian Conference on Pattern Recognition and Image Analysis Abbreviated Journal  
  Volume 11867 Issue Pages 423-436  
  Keywords 3D Vision; Shape Reconstruction; Facial Hair Modeling  
  Abstract n this work, we introduce a novel energy-based framework that addresses the challenging problem of 3D reconstruction of facial hair from a single RGB image. To this end, we identify hair pixels over the image via texture analysis and then determine individual hair fibers that are modeled by means of a parametric hair model based on 3D helixes. We propose to minimize an energy composed of several terms, in order to adapt the hair parameters that better fit the image detections. The final hairs respond to the resulting fibers after a post-processing step where we encourage further realism. The resulting approach generates realistic facial hair fibers from solely an RGB image without assuming any training data nor user interaction. We provide an experimental evaluation on real-world pictures where several facial hair styles and image conditions are observed, showing consistent results and establishing a comparison with respect to competing approaches.  
  Address Madrid; July 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference IbPRIA  
  Notes (down) ADAS; 600.086; 600.130; 600.122 Approved no  
  Call Number Admin @ si @ Serial 3707  
Permanent link to this record
 

 
Author Gemma Rotger; Francesc Moreno-Noguer; Felipe Lumbreras; Antonio Agudo edit  url
openurl 
  Title Detailed 3D face reconstruction from a single RGB image Type Journal
  Year 2019 Publication Journal of WSCG Abbreviated Journal JWSCG  
  Volume 27 Issue 2 Pages 103-112  
  Keywords 3D Wrinkle Reconstruction; Face Analysis, Optimization.  
  Abstract This paper introduces a method to obtain a detailed 3D reconstruction of facial skin from a single RGB image.
To this end, we propose the exclusive use of an input image without requiring any information about the observed material nor training data to model the wrinkle properties. They are detected and characterized directly from the image via a simple and effective parametric model, determining several features such as location, orientation, width, and height. With these ingredients, we propose to minimize a photometric error to retrieve the final detailed 3D map, which is initialized by current techniques based on deep learning. In contrast with other approaches, we only require estimating a depth parameter, making our approach fast and intuitive. Extensive experimental evaluation is presented in a wide variety of synthetic and real images, including different skin properties and facial
expressions. In all cases, our method outperforms the current approaches regarding 3D reconstruction accuracy, providing striking results for both large and fine wrinkles.
 
  Address 2019/11  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes (down) ADAS; 600.086; 600.130; 600.122 Approved no  
  Call Number Admin @ si @ Serial 3708  
Permanent link to this record
 

 
Author Gemma Rotger; Felipe Lumbreras; Francesc Moreno-Noguer; Antonio Agudo edit   pdf
doi  openurl
  Title 2D-to-3D Facial Expression Transfer Type Conference Article
  Year 2018 Publication 24th International Conference on Pattern Recognition Abbreviated Journal  
  Volume Issue Pages 2008 - 2013  
  Keywords  
  Abstract Automatically changing the expression and physical features of a face from an input image is a topic that has been traditionally tackled in a 2D domain. In this paper, we bring this problem to 3D and propose a framework that given an
input RGB video of a human face under a neutral expression, initially computes his/her 3D shape and then performs a transfer to a new and potentially non-observed expression. For this purpose, we parameterize the rest shape –obtained from standard factorization approaches over the input video– using a triangular
mesh which is further clustered into larger macro-segments. The expression transfer problem is then posed as a direct mapping between this shape and a source shape, such as the blend shapes of an off-the-shelf 3D dataset of human facial expressions. The mapping is resolved to be geometrically consistent between 3D models by requiring points in specific regions to map on semantic
equivalent regions. We validate the approach on several synthetic and real examples of input faces that largely differ from the source shapes, yielding very realistic expression transfers even in cases with topology changes, such as a synthetic video sequence of a single-eyed cyclops.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPR  
  Notes (down) ADAS; 600.086; 600.130; 600.118 Approved no  
  Call Number Admin @ si @ RLM2018 Serial 3232  
Permanent link to this record
 

 
Author Cristhian A. Aguilera-Carrasco; Angel Sappa; Cristhian Aguilera; Ricardo Toledo edit   pdf
doi  openurl
  Title Cross-Spectral Local Descriptors via Quadruplet Network Type Journal Article
  Year 2017 Publication Sensors Abbreviated Journal SENS  
  Volume 17 Issue 4 Pages 873  
  Keywords  
  Abstract This paper presents a novel CNN-based architecture, referred to as Q-Net, to learn local feature descriptors that are useful for matching image patches from two different spectral bands. Given correctly matched and non-matching cross-spectral image pairs, a quadruplet network is trained to map input image patches to a common Euclidean space, regardless of the input spectral band. Our approach is inspired by the recent success of triplet networks in the visible spectrum, but adapted for cross-spectral scenarios, where, for each matching pair, there are always two possible non-matching patches: one for each spectrum. Experimental evaluations on a public cross-spectral VIS-NIR dataset shows that the proposed approach improves the state-of-the-art. Moreover, the proposed technique can also be used in mono-spectral settings, obtaining a similar performance to triplet network descriptors, but requiring less training data.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes (down) ADAS; 600.086; 600.118 Approved no  
  Call Number Admin @ si @ ASA2017 Serial 2914  
Permanent link to this record
 

 
Author Patricia Suarez; Angel Sappa; Boris X. Vintimilla edit   pdf
doi  openurl
  Title Cross-Spectral Image Patch Similarity using Convolutional Neural Network Type Conference Article
  Year 2017 Publication IEEE International Workshop of Electronics, Control, Measurement, Signals and their application to Mechatronics Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract The ability to compare image regions (patches) has been the basis of many approaches to core computer vision problems, including object, texture and scene categorization. Hence, developing representations for image patches have been of interest in several works. The current work focuses on learning similarity between cross-spectral image patches with a 2 channel convolutional neural network (CNN) model. The proposed approach is an adaptation of a previous work, trying to obtain similar results than the state of the art but with a lowcost hardware. Hence, obtained results are compared with both
classical approaches, showing improvements, and a state of the art CNN based approach.
 
  Address San Sebastian; Spain; May 2017  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ECMSM  
  Notes (down) ADAS; 600.086; 600.118 Approved no  
  Call Number Admin @ si @ SSV2017a Serial 2916  
Permanent link to this record
 

 
Author Angel Valencia; Roger Idrovo; Angel Sappa; Douglas Plaza; Daniel Ochoa edit   pdf
openurl 
  Title A 3D Vision Based Approach for Optimal Grasp of Vacuum Grippers Type Conference Article
  Year 2017 Publication IEEE International Workshop of Electronics, Control, Measurement, Signals and their application to Mechatronics Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract In general, robot grasping approaches are based on the usage of multi-finger grippers. However, when large size objects need to be manipulated vacuum grippers are preferred, instead of finger based grippers. This paper aims to estimate the best picking place for a two suction cups vacuum gripper,
when planar objects with an unknown size and geometry are considered. The approach is based on the estimation of geometric properties of object’s shape from a partial cloud of points (a single 3D view), in such a way that combine with considerations of a theoretical model to generate an optimal contact point
that minimizes the vacuum force needed to guarantee a grasp.
Experimental results in real scenarios are presented to show the validity of the proposed approach.
 
  Address San Sebastian; Spain; May 2017  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ECMSM  
  Notes (down) ADAS; 600.086; 600.118 Approved no  
  Call Number Admin @ si @ VIS2017 Serial 2917  
Permanent link to this record
 

 
Author Patricia Suarez; Angel Sappa; Boris X. Vintimilla edit   pdf
doi  openurl
  Title Infrared Image Colorization based on a Triplet DCGAN Architecture Type Conference Article
  Year 2017 Publication IEEE Conference on Computer Vision and Pattern Recognition Workshops Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract This paper proposes a novel approach for colorizing near infrared (NIR) images using Deep Convolutional Generative Adversarial Network (GAN) architectures. The proposed approach is based on the usage of a triplet model for learning each color channel independently, in a more homogeneous way. It allows a fast convergence during the training, obtaining a greater similarity between the given NIR image and the corresponding ground truth. The proposed approach has been evaluated with a large data set of NIR images and compared with a recent approach, which is also based on a GAN architecture but in this case all the
color channels are obtained at the same time.
 
  Address Honolulu; Hawaii; USA; July 2017  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CVPRW  
  Notes (down) ADAS; 600.086; 600.118 Approved no  
  Call Number Admin @ si @ SSV2017b Serial 2920  
Permanent link to this record
Select All    Deselect All
 |   | 
Details
   print

Save Citations:
Export Records: