Home | << 1 2 3 4 5 6 7 8 9 10 >> [11–11] |
Records | |||||
---|---|---|---|---|---|
Author | Ishaan Gulrajani; Kundan Kumar; Faruk Ahmed; Adrien Ali Taiga; Francesco Visin; David Vazquez; Aaron Courville | ||||
Title | PixelVAE: A Latent Variable Model for Natural Images | Type | Conference Article | ||
Year | 2017 | Publication | 5th International Conference on Learning Representations | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | Deep Learning; Unsupervised Learning | ||||
Abstract | Natural image modeling is a landmark challenge of unsupervised learning. Variational Autoencoders (VAEs) learn a useful latent representation and generate samples that preserve global structure but tend to suffer from image blurriness. PixelCNNs model sharp contours and details very well, but lack an explicit latent representation and have difficulty modeling large-scale structure in a computationally efficient way. In this paper, we present PixelVAE, a VAE model with an autoregressive decoder based on PixelCNN. The resulting architecture achieves state-of-the-art log-likelihood on binarized MNIST. We extend PixelVAE to a hierarchy of multiple latent variables at different scales; this hierarchical model achieves competitive likelihood on 64x64 ImageNet and generates high-quality samples on LSUN bedrooms. | ||||
Address | Toulon; France; April 2017 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICLR | ||
Notes | ADAS; 600.085; 600.076; 601.281; 600.118 | Approved | no | ||
Call Number | ADAS @ adas @ GKA2017 | Serial | 2815 | ||
Permanent link to this record | |||||
Author | Xavier Perez Sala; Fernando De la Torre; Laura Igual; Sergio Escalera; Cecilio Angulo | ||||
Title | Subspace Procrustes Analysis | Type | Journal Article | ||
Year | 2017 | Publication | International Journal of Computer Vision | Abbreviated Journal | IJCV |
Volume | 121 | Issue | 3 | Pages | 327–343 |
Keywords | |||||
Abstract | Procrustes Analysis (PA) has been a popular technique to align and build 2-D statistical models of shapes. Given a set of 2-D shapes PA is applied to remove rigid transformations. Then, a non-rigid 2-D model is computed by modeling (e.g., PCA) the residual. Although PA has been widely used, it has several limitations for modeling 2-D shapes: occluded landmarks and missing data can result in local minima solutions, and there is no guarantee that the 2-D shapes provide a uniform sampling of the 3-D space of rotations for the object. To address previous issues, this paper proposes Subspace PA (SPA). Given several
instances of a 3-D object, SPA computes the mean and a 2-D subspace that can simultaneously model all rigid and non-rigid deformations of the 3-D object. We propose a discrete (DSPA) and continuous (CSPA) formulation for SPA, assuming that 3-D samples of an object are provided. DSPA extends the traditional PA, and produces unbiased 2-D models by uniformly sampling different views of the 3-D object. CSPA provides a continuous approach to uniformly sample the space of 3-D rotations, being more efficient in space and time. Experiments using SPA to learn 2-D models of bodies from motion capture data illustrate the benefits of our approach. |
||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB; HuPBA; no proj | Approved | no | ||
Call Number | Admin @ si @ PTI2017 | Serial | 2841 | ||
Permanent link to this record | |||||
Author | Frederic Sampedro; Anna Domenech; Sergio Escalera; Ignasi Carrio | ||||
Title | Computing quantitative indicators of structural renal damage in pediatric DMSA scans | Type | Journal Article | ||
Year | 2017 | Publication | Revista Española de Medicina Nuclear e Imagen Molecular | Abbreviated Journal | REMNIM |
Volume | 36 | Issue | 2 | Pages | 72-77 |
Keywords | |||||
Abstract | OBJECTIVES:
The proposal and implementation of a computational framework for the quantification of structural renal damage from 99mTc-dimercaptosuccinic acid (DMSA) scans. The aim of this work is to propose, implement, and validate a computational framework for the quantification of structural renal damage from DMSA scans and in an observer-independent manner. MATERIALS AND METHODS: From a set of 16 pediatric DMSA-positive scans and 16 matched controls and using both expert-guided and automatic approaches, a set of image-derived quantitative indicators was computed based on the relative size, intensity and histogram distribution of the lesion. A correlation analysis was conducted in order to investigate the association of these indicators with other clinical data of interest in this scenario, including C-reactive protein (CRP), white cell count, vesicoureteral reflux, fever, relative perfusion, and the presence of renal sequelae in a 6-month follow-up DMSA scan. RESULTS: A fully automatic lesion detection and segmentation system was able to successfully classify DMSA-positive from negative scans (AUC=0.92, sensitivity=81% and specificity=94%). The image-computed relative size of the lesion correlated with the presence of fever and CRP levels (p<0.05), and a measurement derived from the distribution histogram of the lesion obtained significant performance results in the detection of permanent renal damage (AUC=0.86, sensitivity=100% and specificity=75%). CONCLUSIONS: The proposal and implementation of a computational framework for the quantification of structural renal damage from DMSA scans showed a promising potential to complement visual diagnosis and non-imaging indicators. |
||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HuPBA;MILAB; no menciona | Approved | no | ||
Call Number | Admin @ si @ SDE2017 | Serial | 2842 | ||
Permanent link to this record | |||||
Author | Jose Garcia-Rodriguez; Isabelle Guyon; Sergio Escalera; Alexandra Psarrou; Andrew Lewis; Miguel Cazorla | ||||
Title | Editorial: Special Issue on Computational Intelligence for Vision and Robotics | Type | Journal Article | ||
Year | 2017 | Publication | Neural Computing and Applications | Abbreviated Journal | Neural Computing and Applications |
Volume | 28 | Issue | 5 | Pages | 853–854 |
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HuPBA;MILAB; no menciona | Approved | no | ||
Call Number | Admin @ si @ GGE2017 | Serial | 2845 | ||
Permanent link to this record | |||||
Author | Arash Akbarinia; Karl R. Gegenfurtner | ||||
Title | Metameric Mismatching in Natural and Artificial Reflectances | Type | Journal Article | ||
Year | 2017 | Publication | Journal of Vision | Abbreviated Journal | JV |
Volume | 17 | Issue | 10 | Pages | 390-390 |
Keywords | Metamer; colour perception; spectral discrimination; photoreceptors | ||||
Abstract | The human visual system and most digital cameras sample the continuous spectral power distribution through three classes of receptors. This implies that two distinct spectral reflectances can result in identical tristimulus values under one illuminant and differ under another – the problem of metamer mismatching. It is still debated how frequent this issue arises in the real world, using naturally occurring reflectance functions and common illuminants.
We gathered more than ten thousand spectral reflectance samples from various sources, covering a wide range of environments (e.g., flowers, plants, Munsell chips) and evaluated their responses under a number of natural and artificial source of lights. For each pair of reflectance functions, we estimated the perceived difference using the CIE-defined distance ΔE2000 metric in Lab color space. The degree of metamer mismatching depended on the lower threshold value l when two samples would be considered to lead to equal sensor excitations (ΔE < l), and on the higher threshold value h when they would be considered different. For example, for l=h=1, we found that 43.129 comparisons out of a total of 6×107 pairs would be considered metameric (1 in 104). For l=1 and h=5, this number reduced to 705 metameric pairs (2 in 106). Extreme metamers, for instance l=1 and h=10, were rare (22 pairs or 6 in 108), as were instances where the two members of a metameric pair would be assigned to different color categories. Not unexpectedly, we observed variations among different reflectance databases and illuminant spectra with more frequency under artificial illuminants than natural ones. Overall, our numbers are not very different from those obtained earlier (Foster et al, JOSA A, 2006). However, our results also show that the degree of metamerism is typically not very strong and that category switches hardly ever occur. |
||||
Address | Florida, USA; May 2017 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | NEUROBIT; no menciona | Approved | no | ||
Call Number | Admin @ si @ AkG2017 | Serial | 2899 | ||
Permanent link to this record | |||||
Author | Marta Diez-Ferrer; Debora Gil; Elena Carreño; Susana Padrones; Samantha Aso | ||||
Title | Positive Airway Pressure-Enhanced CT to Improve Virtual Bronchoscopic Navigation | Type | Journal Article | ||
Year | 2017 | Publication | Journal of Thoracic Oncology | Abbreviated Journal | JTO |
Volume | 12 | Issue | 1S | Pages | S596-S597 |
Keywords | Thorax CT; diagnosis; Peripheral Pulmonary Nodule | ||||
Abstract | A main weakness of virtual bronchoscopic navigation (VBN) is unsuccessful segmentation of distal branches approaching peripheral pulmonary nodules (PPN). CT scan acquisition protocol is pivotal for segmentation covering the utmost periphery. We hypothesize that application of continuous positive airway pressure (CPAP) during CT acquisition could improve visualization and segmentation of peripheral bronchi. The purpose of the present pilot study is to compare quality of segmentations under 4 CT acquisition modes: inspiration (INSP), expiration (EXP) and both with CPAP (INSP-CPAP and EXP-CPAP). | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | IAM; 600.096; 600.075; 600.145 | Approved | no | ||
Call Number | Admin @ si @ DGC2017a | Serial | 2883 | ||
Permanent link to this record | |||||
Author | Simon Jégou; Michal Drozdzal; David Vazquez; Adriana Romero; Yoshua Bengio | ||||
Title | The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation | Type | Conference Article | ||
Year | 2017 | Publication | IEEE Conference on Computer Vision and Pattern Recognition Workshops | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | Semantic Segmentation | ||||
Abstract | State-of-the-art approaches for semantic image segmentation are built on Convolutional Neural Networks (CNNs). The typical segmentation architecture is composed of (a) a downsampling path responsible for extracting coarse semantic features, followed by (b) an upsampling path trained to recover the input image resolution at the output of the model and, optionally, (c) a post-processing module (e.g. Conditional Random Fields) to refine the model predictions.
Recently, a new CNN architecture, Densely Connected Convolutional Networks (DenseNets), has shown excellent results on image classification tasks. The idea of DenseNets is based on the observation that if each layer is directly connected to every other layer in a feed-forward fashion then the network will be more accurate and easier to train. In this paper, we extend DenseNets to deal with the problem of semantic segmentation. We achieve state-of-the-art results on urban scene benchmark datasets such as CamVid and Gatech, without any further post-processing module nor pretraining. Moreover, due to smart construction of the model, our approach has much less parameters than currently published best entries for these datasets. |
||||
Address | Honolulu; USA; July 2017 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVPRW | ||
Notes | MILAB; ADAS; 600.076; 600.085; 601.281 | Approved | no | ||
Call Number | ADAS @ adas @ JDV2016 | Serial | 2866 | ||
Permanent link to this record | |||||
Author | Pau Riba; Josep Llados; Alicia Fornes | ||||
Title | Error-tolerant coarse-to-fine matching model for hierarchical graphs | Type | Conference Article | ||
Year | 2017 | Publication | 11th IAPR-TC-15 International Workshop on Graph-Based Representations in Pattern Recognition | Abbreviated Journal | |
Volume | 10310 | Issue | Pages | 107-117 | |
Keywords | Graph matching; Hierarchical graph; Graph-based representation; Coarse-to-fine matching | ||||
Abstract | Graph-based representations are effective tools to capture structural information from visual elements. However, retrieving a query graph from a large database of graphs implies a high computational complexity. Moreover, these representations are very sensitive to noise or small changes. In this work, a novel hierarchical graph representation is designed. Using graph clustering techniques adapted from graph-based social media analysis, we propose to generate a hierarchy able to deal with different levels of abstraction while keeping information about the topology. For the proposed representations, a coarse-to-fine matching method is defined. These approaches are validated using real scenarios such as classification of colour images and handwritten word spotting. | ||||
Address | Anacapri; Italy; May 2017 | ||||
Corporate Author | Thesis | ||||
Publisher | Springer International Publishing | Place of Publication | Editor | Pasquale Foggia; Cheng-Lin Liu; Mario Vento | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | GbRPR | ||
Notes | DAG; 600.097; 601.302; 600.121 | Approved | no | ||
Call Number | Admin @ si @ RLF2017a | Serial | 2951 | ||
Permanent link to this record | |||||
Author | Veronica Romero; Alicia Fornes; Enrique Vidal; Joan Andreu Sanchez | ||||
Title | Information Extraction in Handwritten Marriage Licenses Books Using the MGGI Methodology | Type | Conference Article | ||
Year | 2017 | Publication | 8th Iberian Conference on Pattern Recognition and Image Analysis | Abbreviated Journal | |
Volume | 10255 | Issue | Pages | 287-294 | |
Keywords | Handwritten Text Recognition; Information extraction; Language modeling; MGGI; Categories-based language model | ||||
Abstract | Historical records of daily activities provide intriguing insights into the life of our ancestors, useful for demographic and genealogical research. For example, marriage license books have been used for centuries by ecclesiastical and secular institutions to register marriages. These books follow a simple structure of the text in the records with a evolutionary vocabulary, mainly composed of proper names that change along the time. This distinct vocabulary makes automatic transcription and semantic information extraction difficult tasks. In previous works we studied the use of category-based language models and how a Grammatical Inference technique known as MGGI could improve the accuracy of these tasks. In this work we analyze the main causes of the semantic errors observed in previous results and apply a better implementation of the MGGI technique to solve these problems. Using the resulting language model, transcription and information extraction experiments have been carried out, and the results support our proposed approach. | ||||
Address | Faro; Portugal; June 2017 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | L.A. Alexandre; J.Salvador Sanchez; Joao M. F. Rodriguez | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-3-319-58837-7 | Medium | ||
Area | Expedition | Conference | IbPRIA | ||
Notes | DAG; 602.006; 600.097; 600.121 | Approved | no | ||
Call Number | Admin @ si @ RFV2017 | Serial | 2952 | ||
Permanent link to this record | |||||
Author | Antonio Lopez; Jiaolong Xu; Jose Luis Gomez; David Vazquez; German Ros | ||||
Title | From Virtual to Real World Visual Perception using Domain Adaptation -- The DPM as Example | Type | Book Chapter | ||
Year | 2017 | Publication | Domain Adaptation in Computer Vision Applications | Abbreviated Journal | |
Volume | Issue | 13 | Pages | 243-258 | |
Keywords | Domain Adaptation | ||||
Abstract | Supervised learning tends to produce more accurate classifiers than unsupervised learning in general. This implies that training data is preferred with annotations. When addressing visual perception challenges, such as localizing certain object classes within an image, the learning of the involved classifiers turns out to be a practical bottleneck. The reason is that, at least, we have to frame object examples with bounding boxes in thousands of images. A priori, the more complex the model is regarding its number of parameters, the more annotated examples are required. This annotation task is performed by human oracles, which ends up in inaccuracies and errors in the annotations (aka ground truth) since the task is inherently very cumbersome and sometimes ambiguous. As an alternative we have pioneered the use of virtual worlds for collecting such annotations automatically and with high precision. However, since the models learned with virtual data must operate in the real world, we still need to perform domain adaptation (DA). In this chapter we revisit the DA of a deformable part-based model (DPM) as an exemplifying case of virtual- to-real-world DA. As a use case, we address the challenge of vehicle detection for driver assistance, using different publicly available virtual-world data. While doing so, we investigate questions such as: how does the domain gap behave due to virtual-vs-real data with respect to dominant object appearance per domain, as well as the role of photo-realism in the virtual world. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer | Place of Publication | Editor | Gabriela Csurka | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ADAS; 600.085; 601.223; 600.076; 600.118 | Approved | no | ||
Call Number | ADAS @ adas @ LXG2017 | Serial | 2872 | ||
Permanent link to this record | |||||
Author | Pau Riba; Josep Llados; Alicia Fornes; Anjan Dutta | ||||
Title | Large-scale graph indexing using binary embeddings of node contexts for information spotting in document image databases | Type | Journal Article | ||
Year | 2017 | Publication | Pattern Recognition Letters | Abbreviated Journal | PRL |
Volume | 87 | Issue | Pages | 203-211 | |
Keywords | |||||
Abstract | Graph-based representations are experiencing a growing usage in visual recognition and retrieval due to their representational power in front of classical appearance-based representations. However, retrieving a query graph from a large dataset of graphs implies a high computational complexity. The most important property for a large-scale retrieval is the search time complexity to be sub-linear in the number of database examples. With this aim, in this paper we propose a graph indexation formalism applied to visual retrieval. A binary embedding is defined as hashing keys for graph nodes. Given a database of labeled graphs, graph nodes are complemented with vectors of attributes representing their local context. Then, each attribute vector is converted to a binary code applying a binary-valued hash function. Therefore, graph retrieval is formulated in terms of finding target graphs in the database whose nodes have a small Hamming distance from the query nodes, easily computed with bitwise logical operators. As an application example, we validate the performance of the proposed methods in different real scenarios such as handwritten word spotting in images of historical documents or symbol spotting in architectural floor plans. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG; 600.097; 602.006; 603.053; 600.121 | Approved | no | ||
Call Number | RLF2017b | Serial | 2873 | ||
Permanent link to this record | |||||
Author | Daniel Hernandez; Antonio Espinosa; David Vazquez; Antonio Lopez; Juan Carlos Moure | ||||
Title | Embedded Real-time Stixel Computation | Type | Conference Article | ||
Year | 2017 | Publication | GPU Technology Conference | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | GPU; CUDA; Stixels; Autonomous Driving | ||||
Abstract | |||||
Address | Silicon Valley; USA; May 2017 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | GTC | ||
Notes | ADAS; 600.118 | Approved | no | ||
Call Number | ADAS @ adas @ HEV2017a | Serial | 2879 | ||
Permanent link to this record | |||||
Author | David Vazquez; Jorge Bernal; F. Javier Sanchez; Gloria Fernandez Esparrach; Antonio Lopez; Adriana Romero; Michal Drozdzal; Aaron Courville | ||||
Title | A Benchmark for Endoluminal Scene Segmentation of Colonoscopy Images | Type | Conference Article | ||
Year | 2017 | Publication | 31st International Congress and Exhibition on Computer Assisted Radiology and Surgery | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | Deep Learning; Medical Imaging | ||||
Abstract | Colorectal cancer (CRC) is the third cause of cancer death worldwide. Currently, the standard approach to reduce CRC-related mortality is to perform regular screening in search for polyps and colonoscopy is the screening tool of choice. The main limitations of this screening procedure are polyp miss-rate and inability to perform visual assessment of polyp malignancy. These drawbacks can be reduced by designing Decision Support Systems (DSS) aiming to help clinicians in the different stages of the procedure by providing endoluminal scene segmentation. Thus, in this paper, we introduce an extended benchmark of colonoscopy image, with the hope of establishing a new strong benchmark for colonoscopy image analysis research. We provide new baselines on this dataset by training standard fully convolutional networks (FCN) for semantic segmentation and significantly outperforming, without any further post-processing, prior results in endoluminal scene segmentation. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CARS | ||
Notes | ADAS; MV; 600.075; 600.085; 600.076; 601.281; 600.118 | Approved | no | ||
Call Number | ADAS @ adas @ VBS2017a | Serial | 2880 | ||
Permanent link to this record | |||||
Author | David Geronimo; David Vazquez; Arturo de la Escalera | ||||
Title | Vision-Based Advanced Driver Assistance Systems | Type | Book Chapter | ||
Year | 2017 | Publication | Computer Vision in Vehicle Technology: Land, Sea, and Air | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | ADAS; Autonomous Driving | ||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ADAS; 600.118 | Approved | no | ||
Call Number | ADAS @ adas @ GVE2017 | Serial | 2881 | ||
Permanent link to this record | |||||
Author | German Ros; Laura Sellart; Gabriel Villalonga; Elias Maidanik; Francisco Molero; Marc Garcia; Adriana Cedeño; Francisco Perez; Didier Ramirez; Eduardo Escobar; Jose Luis Gomez; David Vazquez; Antonio Lopez | ||||
Title | Semantic Segmentation of Urban Scenes via Domain Adaptation of SYNTHIA | Type | Book Chapter | ||
Year | 2017 | Publication | Domain Adaptation in Computer Vision Applications | Abbreviated Journal | |
Volume | 12 | Issue | Pages | 227-241 | |
Keywords | SYNTHIA; Virtual worlds; Autonomous Driving | ||||
Abstract | Vision-based semantic segmentation in urban scenarios is a key functionality for autonomous driving. Recent revolutionary results of deep convolutional neural networks (DCNNs) foreshadow the advent of reliable classifiers to perform such visual tasks. However, DCNNs require learning of many parameters from raw images; thus, having a sufficient amount of diverse images with class annotations is needed. These annotations are obtained via cumbersome, human labour which is particularly challenging for semantic segmentation since pixel-level annotations are required. In this chapter, we propose to use a combination of a virtual world to automatically generate realistic synthetic images with pixel-level annotations, and domain adaptation to transfer the models learnt to correctly operate in real scenarios. We address the question of how useful synthetic data can be for semantic segmentation – in particular, when using a DCNN paradigm. In order to answer this question we have generated a synthetic collection of diverse urban images, named SYNTHIA, with automatically generated class annotations and object identifiers. We use SYNTHIA in combination with publicly available real-world urban images with manually provided annotations. Then, we conduct experiments with DCNNs that show that combining SYNTHIA with simple domain adaptation techniques in the training stage significantly improves performance on semantic segmentation. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer | Place of Publication | Editor | Gabriela Csurka | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ADAS; 600.085; 600.082; 600.076; 600.118 | Approved | no | ||
Call Number | ADAS @ adas @ RSV2017 | Serial | 2882 | ||
Permanent link to this record |