Author |
German Ros; Laura Sellart; Gabriel Villalonga; Elias Maidanik; Francisco Molero; Marc Garcia; Adriana Cedeño; Francisco Perez; Didier Ramirez; Eduardo Escobar; Jose Luis Gomez; David Vazquez; Antonio Lopez |
|
|
Title |
Semantic Segmentation of Urban Scenes via Domain Adaptation of SYNTHIA |
Type |
Book Chapter |
|
Year |
2017 |
Publication |
Domain Adaptation in Computer Vision Applications |
Abbreviated Journal |
|
|
|
Volume |
12 |
Issue |
|
Pages |
227-241 |
|
|
Keywords |
SYNTHIA; Virtual worlds; Autonomous Driving |
|
|
Abstract |
Vision-based semantic segmentation in urban scenarios is a key functionality for autonomous driving. Recent revolutionary results of deep convolutional neural networks (DCNNs) foreshadow the advent of reliable classifiers to perform such visual tasks. However, DCNNs require learning many parameters from raw images; thus, a sufficient number of diverse images with class annotations is needed. These annotations are obtained via cumbersome human labour, which is particularly challenging for semantic segmentation since pixel-level annotations are required. In this chapter, we propose to use a combination of a virtual world to automatically generate realistic synthetic images with pixel-level annotations, and domain adaptation to transfer the learnt models so that they operate correctly in real scenarios. We address the question of how useful synthetic data can be for semantic segmentation – in particular, when using a DCNN paradigm. To answer this question, we have generated a synthetic collection of diverse urban images, named SYNTHIA, with automatically generated class annotations and object identifiers. We use SYNTHIA in combination with publicly available real-world urban images with manually provided annotations. Then, we conduct experiments with DCNNs that show that combining SYNTHIA with simple domain adaptation techniques in the training stage significantly improves performance on semantic segmentation. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer |
Place of Publication |
|
Editor |
Gabriela Csurka |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ADAS; 600.085; 600.082; 600.076; 600.118 |
Approved |
no |
|
|
Call Number |
ADAS @ adas @ RSV2017 |
Serial |
2882 |
|
|
|
|
|
Author |
H. Martin Kjer; Jens Fagertun; Sergio Vera; Debora Gil |
|
|
Title |
Medial structure generation for registration of anatomical structures |
Type |
Book Chapter |
|
Year |
2017 |
Publication |
Skeletonization, Theory, Methods and Applications |
Abbreviated Journal |
|
|
|
Volume |
11 |
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
IAM; 600.096; 600.075; 600.145 |
Approved |
no |
|
|
Call Number |
Admin @ si @ MFV2017a |
Serial |
2935 |
|
|
|
|
|
Author |
Jean-Pascal Jacob; Mariella Dimiccoli; L. Moisan |
|
|
Title |
Active skeleton for bacteria modelling |
Type |
Journal Article |
|
Year |
2017 |
Publication |
Computer Methods in Biomechanics and Biomedical Engineering: Imaging and Visualization |
Abbreviated Journal |
CMBBE |
|
|
Volume |
5 |
Issue |
4 |
Pages |
274-286 |
|
|
Keywords |
|
|
|
Abstract |
The investigation of spatio-temporal dynamics of bacterial cells and their molecular components requires automated image analysis tools to track cell shape properties and molecular component locations inside the cells. In the study of bacteria aging, the molecular components of interest are protein aggregates accumulated near bacteria boundaries. This particular location makes the correspondence between aggregates and cells very ambiguous, since accurately computing bacteria boundaries in phase-contrast time-lapse imaging is a challenging task. This paper proposes an active skeleton formulation for bacteria modelling which provides several advantages: an easy computation of shape properties (perimeter, length, thickness and orientation), an improved boundary accuracy in noisy images and a natural bacteria-centred coordinate system that permits the intrinsic location of molecular components inside the cell. Starting from an initial skeleton estimate, the medial axis of the bacterium is obtained by minimising an energy function which incorporates bacteria shape constraints. Experimental results on biological images and a comparative evaluation of performance validate the proposed approach for modelling cigar-shaped bacteria like Escherichia coli. The Image-J plugin of the proposed method can be found online at http://fluobactracker.inrialpes.fr. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Taylor & Francis Group |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
MILAB |
Approved |
no |
|
|
Call Number |
Admin @ si @ JDM2017 |
Serial |
2784 |
|
|
|
|
|
Author |
Daniel Hernandez; Antonio Espinosa; David Vazquez; Antonio Lopez; Juan Carlos Moure |
|
|
Title |
GPU-accelerated real-time stixel computation |
Type |
Conference Article |
|
Year |
2017 |
Publication |
IEEE Winter Conference on Applications of Computer Vision |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
1054-1062 |
|
|
Keywords |
Autonomous Driving; GPU; Stixel |
|
|
Abstract |
The Stixel World is a medium-level, compact representation of road scenes that abstracts millions of disparity pixels into hundreds or thousands of stixels. The goal of this work is to implement and evaluate a complete multi-stixel estimation pipeline on an embedded, energy-efficient, GPU-accelerated device. This work presents a full GPU-accelerated implementation of stixel estimation that produces reliable results at 26 frames per second (real-time) on the Tegra X1 for disparity images of 1024×440 pixels and stixel widths of 5 pixels, and achieves more than 400 frames per second on a high-end Titan X GPU card. |
|
|
Address |
Santa Rosa; CA; USA; March 2017 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
WACV |
|
|
Notes |
ADAS; 600.118 |
Approved |
no |
|
|
Call Number |
ADAS @ adas @ HEV2017b |
Serial |
2812 |
|
|
|
|
|
Author |
Daniel Hernandez; Lukas Schneider; Antonio Espinosa; David Vazquez; Antonio Lopez; Uwe Franke; Marc Pollefeys; Juan C. Moure |
|
|
Title |
Slanted Stixels: Representing San Francisco's Steepest Streets |
Type |
Conference Article |
|
Year |
2017 |
Publication |
28th British Machine Vision Conference |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
In this work we present a novel compact scene representation based on Stixels that infers geometric and semantic information. Our approach overcomes the previous rather restrictive geometric assumptions for Stixels by introducing a novel depth model to account for non-flat roads and slanted objects. Both semantic and depth cues are used jointly to infer the scene representation in a sound global energy minimization formulation. Furthermore, a novel approximation scheme is introduced that uses an extremely efficient over-segmentation. In doing so, the computational complexity of the Stixel inference algorithm is reduced significantly, achieving real-time computation capabilities with only a slight drop in accuracy. We evaluate the proposed approach in terms of semantic and geometric accuracy as well as run-time on four publicly available benchmark datasets. Our approach maintains accuracy on flat road scene datasets while improving substantially on a novel non-flat road dataset. |
|
|
Address |
London; UK; September 2017 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
BMVC |
|
|
Notes |
ADAS; 600.118 |
Approved |
no |
|
|
Call Number |
ADAS @ adas @ HSE2017a |
Serial |
2945 |
|
|
|
|
|
Author |
Ozan Caglayan; Walid Aransa; Adrien Bardet; Mercedes Garcia-Martinez; Fethi Bougares; Loic Barrault; Marc Masana; Luis Herranz; Joost Van de Weijer |
|
|
Title |
LIUM-CVC Submissions for WMT17 Multimodal Translation Task |
Type |
Conference Article |
|
Year |
2017 |
Publication |
2nd Conference on Machine Translation |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
This paper describes the monomodal and multimodal Neural Machine Translation systems developed by LIUM and CVC for the WMT17 Shared Task on Multimodal Translation. We mainly explored two multimodal architectures in which either global visual features or convolutional feature maps are integrated in order to benefit from visual context. Our final systems ranked first for both En-De and En-Fr language pairs according to the automatic evaluation metrics METEOR and BLEU. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
WMT |
|
|
Notes |
LAMP; 600.106; 600.120 |
Approved |
no |
|
|
Call Number |
Admin @ si @ CAB2017 |
Serial |
3035 |
|
|
|
|
|
Author |
Ishaan Gulrajani; Kundan Kumar; Faruk Ahmed; Adrien Ali Taiga; Francesco Visin; David Vazquez; Aaron Courville |
|
|
Title |
PixelVAE: A Latent Variable Model for Natural Images |
Type |
Conference Article |
|
Year |
2017 |
Publication |
5th International Conference on Learning Representations |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
Deep Learning; Unsupervised Learning |
|
|
Abstract |
Natural image modeling is a landmark challenge of unsupervised learning. Variational Autoencoders (VAEs) learn a useful latent representation and generate samples that preserve global structure but tend to suffer from image blurriness. PixelCNNs model sharp contours and details very well, but lack an explicit latent representation and have difficulty modeling large-scale structure in a computationally efficient way. In this paper, we present PixelVAE, a VAE model with an autoregressive decoder based on PixelCNN. The resulting architecture achieves state-of-the-art log-likelihood on binarized MNIST. We extend PixelVAE to a hierarchy of multiple latent variables at different scales; this hierarchical model achieves competitive likelihood on 64x64 ImageNet and generates high-quality samples on LSUN bedrooms. |
|
|
Address |
Toulon; France; April 2017 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICLR |
|
|
Notes |
ADAS; 600.085; 600.076; 601.281; 600.118 |
Approved |
no |
|
|
Call Number |
ADAS @ adas @ GKA2017 |
Serial |
2815 |
|
|
|
|
|
Author |
Simon Jégou; Michal Drozdzal; David Vazquez; Adriana Romero; Yoshua Bengio |
|
|
Title |
The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation |
Type |
Conference Article |
|
Year |
2017 |
Publication |
IEEE Conference on Computer Vision and Pattern Recognition Workshops |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
Semantic Segmentation |
|
|
Abstract |
State-of-the-art approaches for semantic image segmentation are built on Convolutional Neural Networks (CNNs). The typical segmentation architecture is composed of (a) a downsampling path responsible for extracting coarse semantic features, followed by (b) an upsampling path trained to recover the input image resolution at the output of the model and, optionally, (c) a post-processing module (e.g. Conditional Random Fields) to refine the model predictions.
Recently, a new CNN architecture, Densely Connected Convolutional Networks (DenseNets), has shown excellent results on image classification tasks. The idea of DenseNets is based on the observation that if each layer is directly connected to every other layer in a feed-forward fashion then the network will be more accurate and easier to train.
In this paper, we extend DenseNets to deal with the problem of semantic segmentation. We achieve state-of-the-art results on urban scene benchmark datasets such as CamVid and Gatech, without any further post-processing module or pretraining. Moreover, due to the smart construction of the model, our approach has far fewer parameters than the currently published best entries for these datasets. |
|
|
Address |
Honolulu; USA; July 2017 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CVPRW |
|
|
Notes |
MILAB; ADAS; 600.076; 600.085; 601.281 |
Approved |
no |
|
|
Call Number |
ADAS @ adas @ JDV2016 |
Serial |
2866 |
|
|
|
|
|
Author |
Antonio Lopez; Jiaolong Xu; Jose Luis Gomez; David Vazquez; German Ros |
|
|
Title |
From Virtual to Real World Visual Perception using Domain Adaptation -- The DPM as Example |
Type |
Book Chapter |
|
Year |
2017 |
Publication |
Domain Adaptation in Computer Vision Applications |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
13 |
Pages |
243-258 |
|
|
Keywords |
Domain Adaptation |
|
|
Abstract |
Supervised learning tends to produce more accurate classifiers than unsupervised learning in general. This implies that annotated training data is preferred. When addressing visual perception challenges, such as localizing certain object classes within an image, the learning of the involved classifiers turns out to be a practical bottleneck. The reason is that, at the very least, we have to frame object examples with bounding boxes in thousands of images. A priori, the more complex the model is regarding its number of parameters, the more annotated examples are required. This annotation task is performed by human oracles, which results in inaccuracies and errors in the annotations (aka ground truth) since the task is inherently very cumbersome and sometimes ambiguous. As an alternative, we have pioneered the use of virtual worlds for collecting such annotations automatically and with high precision. However, since the models learned with virtual data must operate in the real world, we still need to perform domain adaptation (DA). In this chapter we revisit the DA of a deformable part-based model (DPM) as an exemplifying case of virtual-to-real-world DA. As a use case, we address the challenge of vehicle detection for driver assistance, using different publicly available virtual-world data. While doing so, we investigate questions such as how the domain gap between virtual and real data behaves with respect to the dominant object appearance per domain, as well as the role of photo-realism in the virtual world. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer |
Place of Publication |
|
Editor |
Gabriela Csurka |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ADAS; 600.085; 601.223; 600.076; 600.118 |
Approved |
no |
|
|
Call Number |
ADAS @ adas @ LXG2017 |
Serial |
2872 |
|
|
|
|
|
Author |
Daniel Hernandez; Antonio Espinosa; David Vazquez; Antonio Lopez; Juan Carlos Moure |
|
|
Title |
Embedded Real-time Stixel Computation |
Type |
Conference Article |
|
Year |
2017 |
Publication |
GPU Technology Conference |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
GPU; CUDA; Stixels; Autonomous Driving |
|
|
Abstract |
|
|
|
Address |
Silicon Valley; USA; May 2017 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
GTC |
|
|
Notes |
ADAS; 600.118 |
Approved |
no |
|
|
Call Number |
ADAS @ adas @ HEV2017a |
Serial |
2879 |
|
|
|
|
|
Author |
David Vazquez; Jorge Bernal; F. Javier Sanchez; Gloria Fernandez Esparrach; Antonio Lopez; Adriana Romero; Michal Drozdzal; Aaron Courville |
|
|
Title |
A Benchmark for Endoluminal Scene Segmentation of Colonoscopy Images |
Type |
Conference Article |
|
Year |
2017 |
Publication |
31st International Congress and Exhibition on Computer Assisted Radiology and Surgery |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
Deep Learning; Medical Imaging |
|
|
Abstract |
Colorectal cancer (CRC) is the third leading cause of cancer death worldwide. Currently, the standard approach to reduce CRC-related mortality is to perform regular screening in search of polyps, and colonoscopy is the screening tool of choice. The main limitations of this screening procedure are the polyp miss-rate and the inability to perform visual assessment of polyp malignancy. These drawbacks can be reduced by designing Decision Support Systems (DSS) that aim to help clinicians in the different stages of the procedure by providing endoluminal scene segmentation. Thus, in this paper, we introduce an extended benchmark of colonoscopy images, with the hope of establishing a new strong benchmark for colonoscopy image analysis research. We provide new baselines on this dataset by training standard fully convolutional networks (FCN) for semantic segmentation, significantly outperforming prior results in endoluminal scene segmentation without any further post-processing. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CARS |
|
|
Notes |
ADAS; MV; 600.075; 600.085; 600.076; 601.281; 600.118 |
Approved |
no |
|
|
Call Number |
ADAS @ adas @ VBS2017a |
Serial |
2880 |
|
|
|
|
|
Author |
David Geronimo; David Vazquez; Arturo de la Escalera |
|
|
Title |
Vision-Based Advanced Driver Assistance Systems |
Type |
Book Chapter |
|
Year |
2017 |
Publication |
Computer Vision in Vehicle Technology: Land, Sea, and Air |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
ADAS; Autonomous Driving |
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ADAS; 600.118 |
Approved |
no |
|
|
Call Number |
ADAS @ adas @ GVE2017 |
Serial |
2881 |
|
|
|
|
|
Author |
Lluis Gomez; Y. Patel; Marçal Rusiñol; C.V. Jawahar; Dimosthenis Karatzas |
|
|
Title |
Self‐supervised learning of visual features through embedding images into text topic spaces |
Type |
Conference Article |
|
Year |
2017 |
Publication |
30th IEEE Conference on Computer Vision and Pattern Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
End-to-end training from scratch of current deep architectures for new computer vision problems would require ImageNet-scale datasets, and this is not always possible. In this paper we present a method that is able to take advantage of freely available multi-modal content to train computer vision algorithms without human supervision. We put forward the idea of performing self-supervised learning of visual features by mining a large-scale corpus of multi-modal (text and image) documents. We show that discriminative visual features can be learnt efficiently by training a CNN to predict the semantic context in which a particular image is more likely to appear as an illustration. For this we leverage the hidden semantic structures discovered in the text corpus with a well-known topic modeling technique. Our experiments demonstrate state-of-the-art performance in image classification, object detection, and multi-modal retrieval compared to recent self-supervised or natural-supervised approaches. |
|
|
Address |
Honolulu; Hawaii; July 2017 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CVPR |
|
|
Notes |
DAG; 600.084; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ GPR2017 |
Serial |
2889 |
|
|
|
|
|
Author |
Victor Vaquero; German Ros; Francesc Moreno-Noguer; Antonio Lopez; Alberto Sanfeliu |
|
|
Title |
Joint coarse-and-fine reasoning for deep optical flow |
Type |
Conference Article |
|
Year |
2017 |
Publication |
24th International Conference on Image Processing |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
2558-2562 |
|
|
Keywords |
|
|
|
Abstract |
We propose a novel representation for dense pixel-wise estimation tasks using CNNs that boosts accuracy and reduces training time, by explicitly exploiting joint coarse-and-fine reasoning. The coarse reasoning is performed over a discrete classification space to obtain a general rough solution, while the fine details of the solution are obtained over a continuous regression space. In our approach both components are jointly estimated, which proved to be beneficial for improving estimation accuracy. Additionally, we propose a new network architecture, which combines coarse and fine components by treating the fine estimation as a refinement built on top of the coarse solution, and therefore adding details to the general prediction. We apply our approach to the challenging problem of optical flow estimation and empirically validate it against state-of-the-art CNN-based solutions trained from scratch and tested on large optical flow datasets. |
|
|
Address |
Beijing; China; September 2017 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICIP |
|
|
Notes |
ADAS; 600.118 |
Approved |
no |
|
|
Call Number |
Admin @ si @ VRM2017 |
Serial |
2898 |
|
|
|
|
|
Author |
Patricia Suarez; Angel Sappa; Boris X. Vintimilla |
|
|
Title |
Cross-Spectral Image Patch Similarity using Convolutional Neural Network |
Type |
Conference Article |
|
Year |
2017 |
Publication |
IEEE International Workshop of Electronics, Control, Measurement, Signals and their application to Mechatronics |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
The ability to compare image regions (patches) has been the basis of many approaches to core computer vision problems, including object, texture and scene categorization. Hence, developing representations for image patches has been of interest in several works. The current work focuses on learning similarity between cross-spectral image patches with a 2-channel convolutional neural network (CNN) model. The proposed approach is an adaptation of a previous work, aiming to obtain results similar to the state of the art but with low-cost hardware. The obtained results are compared with both classical approaches, showing improvements, and a state-of-the-art CNN-based approach. |
|
|
Address |
San Sebastian; Spain; May 2017 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ECMSM |
|
|
Notes |
ADAS; 600.086; 600.118 |
Approved |
no |
|
|
Call Number |
Admin @ si @ SSV2017a |
Serial |
2916 |
|