|
Records |
Links |
|
Author |
Marc Bolaños; Mariella Dimiccoli; Petia Radeva |
|
|
Title |
Towards Storytelling from Visual Lifelogging: An Overview |
Type |
Journal Article |
|
Year |
2017 |
Publication |
IEEE Transactions on Human-Machine Systems |
Abbreviated Journal |
THMS |
|
|
Volume |
47 |
Issue |
1 |
Pages |
77 - 90 |
|
|
Keywords |
|
|
|
Abstract |
Visual lifelogging consists of acquiring images that capture the daily experiences of the user by wearing a camera over a long period of time. The pictures taken offer considerable potential for knowledge mining concerning how people live their lives, hence, they open up new opportunities for many potential applications in fields including healthcare, security, leisure and
the quantified self. However, automatically building a story from a huge collection of unstructured egocentric data presents major challenges. This paper provides a thorough review of advances made so far in egocentric data analysis, and in view of the current state of the art, indicates new lines of research to move us towards storytelling from visual lifelogging. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
MILAB; 601.235 |
Approved |
no |
|
|
Call Number |
Admin @ si @ BDR2017 |
Serial |
2712 |
|
Permanent link to this record |
|
|
|
|
Author |
Mariella Dimiccoli; Marc Bolaños; Estefania Talavera; Maedeh Aghaei; Stavri G. Nikolov; Petia Radeva |
|
|
Title |
SR-Clustering: Semantic Regularized Clustering for Egocentric Photo Streams Segmentation |
Type |
Journal Article |
|
Year |
2017 |
Publication |
Computer Vision and Image Understanding |
Abbreviated Journal |
CVIU |
|
|
Volume |
155 |
Issue |
|
Pages |
55-69 |
|
|
Keywords |
|
|
|
Abstract |
While wearable cameras are becoming increasingly popular, locating relevant information in large unstructured collections of egocentric images is still a tedious and time consuming processes. This paper addresses the problem of organizing egocentric photo streams acquired by a wearable camera into semantically meaningful segments. First, contextual and semantic information is extracted for each image by employing a Convolutional Neural Networks approach. Later, by integrating language processing, a vocabulary of concepts is defined in a semantic space. Finally, by exploiting the temporal coherence in photo streams, images which share contextual and semantic attributes are grouped together. The resulting temporal segmentation is particularly suited for further analysis, ranging from activity and event recognition to semantic indexing and summarization. Experiments over egocentric sets of nearly 17,000 images, show that the proposed approach outperforms state-of-the-art methods. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
MILAB; 601.235 |
Approved |
no |
|
|
Call Number |
Admin @ si @ DBT2017 |
Serial |
2714 |
|
Permanent link to this record |
|
|
|
|
Author |
Hugo Jair Escalante; Victor Ponce; Sergio Escalera; Xavier Baro; Alicia Morales-Reyes; Jose Martinez-Carranza |
|
|
Title |
Evolving weighting schemes for the Bag of Visual Words |
Type |
Journal Article |
|
Year |
2017 |
Publication |
Neural Computing and Applications |
Abbreviated Journal |
Neural Computing and Applications |
|
|
Volume |
28 |
Issue |
5 |
Pages |
925–939 |
|
|
Keywords |
Bag of Visual Words; Bag of features; Genetic programming; Term-weighting schemes; Computer vision |
|
|
Abstract |
The Bag of Visual Words (BoVW) is an established representation in computer vision. Taking inspiration from text mining, this representation has proved
to be very effective in many domains. However, in most cases, standard term-weighting schemes are adopted (e.g.,term-frequency or TF-IDF). It remains open the question of whether alternative weighting schemes could boost the
performance of methods based on BoVW. More importantly, it is unknown whether it is possible to automatically learn and determine effective weighting schemes from
scratch. This paper brings some light into both of these unknowns. On the one hand, we report an evaluation of the most common weighting schemes used in text mining, but rarely used in computer vision tasks. Besides, we propose an evolutionary algorithm capable of automatically learning weighting schemes for computer vision problems. We report empirical results of an extensive study in several computer vision problems. Results show the usefulness of the proposed method. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
Springer |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
HUPBA;MV; no menciona |
Approved |
no |
|
|
Call Number |
Admin @ si @ EPE2017 |
Serial |
2743 |
|
Permanent link to this record |
|
|
|
|
Author |
Cristina Palmero; Jordi Esquirol; Vanessa Bayo; Miquel Angel Cos; Pouya Ahmadmonfared; Joan Salabert; David Sanchez; Sergio Escalera |
|
|
Title |
Automatic Sleep System Recommendation by Multi-modal RBG-Depth-Pressure Anthropometric Analysis |
Type |
Journal Article |
|
Year |
2017 |
Publication |
International Journal of Computer Vision |
Abbreviated Journal |
IJCV |
|
|
Volume |
122 |
Issue |
2 |
Pages |
212–227 |
|
|
Keywords |
Sleep system recommendation; RGB-Depth data Pressure imaging; Anthropometric landmark extraction; Multi-part human body segmentation |
|
|
Abstract |
This paper presents a novel system for automatic sleep system recommendation using RGB, depth and pressure information. It consists of a validated clinical knowledge-based model that, along with a set of prescription variables extracted automatically, obtains a personalized bed design recommendation. The automatic process starts by performing multi-part human body RGB-D segmentation combining GrabCut, 3D Shape Context descriptor and Thin Plate Splines, to then extract a set of anthropometric landmark points by applying orthogonal plates to the segmented human body. The extracted variables are introduced to the computerized clinical model to calculate body circumferences, weight, morphotype and Body Mass Index categorization. Furthermore, pressure image analysis is performed to extract pressure values and at-risk points, which are also introduced to the model to eventually obtain the final prescription of mattress, topper, and pillow. We validate the complete system in a set of 200 subjects, showing accurate category classification and high correlation results with respect to manual measures. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
HuPBA;MILAB; 303.100 |
Approved |
no |
|
|
Call Number |
Admin @ si @ PEB2017 |
Serial |
2765 |
|
Permanent link to this record |
|
|
|
|
Author |
Debora Gil; Sergio Vera; Agnes Borras; Albert Andaluz; Miguel Angel Gonzalez Ballester |
|
|
Title |
Anatomical Medial Surfaces with Efficient Resolution of Branches Singularities |
Type |
Journal Article |
|
Year |
2017 |
Publication |
Medical Image Analysis |
Abbreviated Journal |
MIA |
|
|
Volume |
35 |
Issue |
|
Pages |
390-402 |
|
|
Keywords |
Medial Representations; Shape Recognition; Medial Branching Stability ; Singular Points |
|
|
Abstract |
Medial surfaces are powerful tools for shape description, but their use has been limited due to the sensibility existing methods to branching artifacts. Medial branching artifacts are associated to perturbations of the object boundary rather than to geometric features. Such instability is a main obstacle for a condent application in shape recognition and description. Medial branches correspond to singularities of the medial surface and, thus, they are problematic for existing morphological and energy-based algorithms. In this paper, we use algebraic geometry concepts in an energy-based approach to compute a medial surface presenting a stable branching topology. We also present an ecient GPU-CPU implementation using standard image processing tools. We show the method computational eciency and quality on a custom made synthetic database. Finally, we present some results on a medical imaging application for localization of abdominal pathologies. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Elsevier B.V. |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
IAM; 600.060; 600.096; 600.075; 600.145 |
Approved |
no |
|
|
Call Number |
Admin @ si @ GVB2017 |
Serial |
2775 |
|
Permanent link to this record |
|
|
|
|
Author |
Jean-Pascal Jacob; Mariella Dimiccoli; L. Moisan |
|
|
Title |
Active skeleton for bacteria modelling |
Type |
Journal Article |
|
Year |
2017 |
Publication |
Computer Methods in Biomechanics and Biomedical Engineering: Imaging and Visualization |
Abbreviated Journal |
CMBBE |
|
|
Volume |
5 |
Issue |
4 |
Pages |
274-286 |
|
|
Keywords |
|
|
|
Abstract |
The investigation of spatio-temporal dynamics of bacterial cells and their molecular components requires automated image analysis tools to track cell shape properties and molecular component locations inside the cells. In the study of bacteria aging, the molecular components of interest are protein aggregates accumulated near bacteria boundaries. This particular location makes very ambiguous the correspondence between aggregates and cells, since computing accurately bacteria boundaries in phase-contrast time-lapse imaging is a challenging task. This paper proposes an active skeleton formulation for bacteria modelling which provides several advantages: an easy computation of shape properties (perimeter, length, thickness and orientation), an improved boundary accuracy in noisy images and a natural bacteria-centred coordinate system that permits the intrinsic location of molecular components inside the cell. Starting from an initial skeleton estimate, the medial axis of the bacterium is obtained by minimising an energy function which incorporates bacteria shape constraints. Experimental results on biological images and comparative evaluation of the performances validate the proposed approach for modelling cigar-shaped bacteria like Escherichia coli. The Image-J plugin of the proposed method can be found online at http://fluobactracker.inrialpes.fr. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Taylor & Francis Group |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
MILAB; |
Approved |
no |
|
|
Call Number |
Admin @ si @JDM2017 |
Serial |
2784 |
|
Permanent link to this record |
|
|
|
|
Author |
Alejandro Gonzalez Alzate; David Vazquez; Antonio Lopez; Jaume Amores |
|
|
Title |
On-Board Object Detection: Multicue, Multimodal, and Multiview Random Forest of Local Experts |
Type |
Journal Article |
|
Year |
2017 |
Publication |
IEEE Transactions on cybernetics |
Abbreviated Journal |
Cyber |
|
|
Volume |
47 |
Issue |
11 |
Pages |
3980 - 3990 |
|
|
Keywords |
Multicue; multimodal; multiview; object detection |
|
|
Abstract |
Despite recent significant advances, object detection continues to be an extremely challenging problem in real scenarios. In order to develop a detector that successfully operates under these conditions, it becomes critical to leverage upon multiple cues, multiple imaging modalities, and a strong multiview (MV) classifier that accounts for different object views and poses. In this paper, we provide an extensive evaluation that gives insight into how each of these aspects (multicue, multimodality, and strong MV classifier) affect accuracy both individually and when integrated together. In the multimodality component, we explore the fusion of RGB and depth maps obtained by high-definition light detection and ranging, a type of modality that is starting to receive increasing attention. As our analysis reveals, although all the aforementioned aspects significantly help in improving the accuracy, the fusion of visible spectrum and depth information allows to boost the accuracy by a much larger margin. The resulting detector not only ranks among the top best performers in the challenging KITTI benchmark, but it is built upon very simple blocks that are easy to implement and computationally efficient. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
2168-2267 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ADAS; 600.085; 600.082; 600.076; 600.118 |
Approved |
no |
|
|
Call Number |
Admin @ si @ |
Serial |
2810 |
|
Permanent link to this record |
|
|
|
|
Author |
Daniel Hernandez; Antonio Espinosa; David Vazquez; Antonio Lopez; Juan Carlos Moure |
|
|
Title |
GPU-accelerated real-time stixel computation |
Type |
Conference Article |
|
Year |
2017 |
Publication |
IEEE Winter Conference on Applications of Computer Vision |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
1054-1062 |
|
|
Keywords |
Autonomous Driving; GPU; Stixel |
|
|
Abstract |
The Stixel World is a medium-level, compact representation of road scenes that abstracts millions of disparity pixels into hundreds or thousands of stixels. The goal of this work is to implement and evaluate a complete multi-stixel estimation pipeline on an embedded, energyefficient, GPU-accelerated device. This work presents a full GPU-accelerated implementation of stixel estimation that produces reliable results at 26 frames per second (real-time) on the Tegra X1 for disparity images of 1024×440 pixels and stixel widths of 5 pixels, and achieves more than 400 frames per second on a high-end Titan X GPU card. |
|
|
Address |
Santa Rosa; CA; USA; March 2017 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
WACV |
|
|
Notes |
ADAS; 600.118 |
Approved |
no |
|
|
Call Number |
ADAS @ adas @ HEV2017b |
Serial |
2812 |
|
Permanent link to this record |
|
|
|
|
Author |
Ishaan Gulrajani; Kundan Kumar; Faruk Ahmed; Adrien Ali Taiga; Francesco Visin; David Vazquez; Aaron Courville |
|
|
Title |
PixelVAE: A Latent Variable Model for Natural Images |
Type |
Conference Article |
|
Year |
2017 |
Publication |
5th International Conference on Learning Representations |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
Deep Learning; Unsupervised Learning |
|
|
Abstract |
Natural image modeling is a landmark challenge of unsupervised learning. Variational Autoencoders (VAEs) learn a useful latent representation and generate samples that preserve global structure but tend to suffer from image blurriness. PixelCNNs model sharp contours and details very well, but lack an explicit latent representation and have difficulty modeling large-scale structure in a computationally efficient way. In this paper, we present PixelVAE, a VAE model with an autoregressive decoder based on PixelCNN. The resulting architecture achieves state-of-the-art log-likelihood on binarized MNIST. We extend PixelVAE to a hierarchy of multiple latent variables at different scales; this hierarchical model achieves competitive likelihood on 64x64 ImageNet and generates high-quality samples on LSUN bedrooms. |
|
|
Address |
Toulon; France; April 2017 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICLR |
|
|
Notes |
ADAS; 600.085; 600.076; 601.281; 600.118 |
Approved |
no |
|
|
Call Number |
ADAS @ adas @ GKA2017 |
Serial |
2815 |
|
Permanent link to this record |
|
|
|
|
Author |
Xavier Perez Sala; Fernando De la Torre; Laura Igual; Sergio Escalera; Cecilio Angulo |
|
|
Title |
Subspace Procrustes Analysis |
Type |
Journal Article |
|
Year |
2017 |
Publication |
International Journal of Computer Vision |
Abbreviated Journal |
IJCV |
|
|
Volume |
121 |
Issue |
3 |
Pages |
327–343 |
|
|
Keywords |
|
|
|
Abstract |
Procrustes Analysis (PA) has been a popular technique to align and build 2-D statistical models of shapes. Given a set of 2-D shapes PA is applied to remove rigid transformations. Then, a non-rigid 2-D model is computed by modeling (e.g., PCA) the residual. Although PA has been widely used, it has several limitations for modeling 2-D shapes: occluded landmarks and missing data can result in local minima solutions, and there is no guarantee that the 2-D shapes provide a uniform sampling of the 3-D space of rotations for the object. To address previous issues, this paper proposes Subspace PA (SPA). Given several
instances of a 3-D object, SPA computes the mean and a 2-D subspace that can simultaneously model all rigid and non-rigid deformations of the 3-D object. We propose a discrete (DSPA) and continuous (CSPA) formulation for SPA, assuming that 3-D samples of an object are provided. DSPA extends the traditional PA, and produces unbiased 2-D models by uniformly sampling different views of the 3-D object. CSPA provides a continuous approach to uniformly sample the space of 3-D rotations, being more efficient in space and time. Experiments using SPA to learn 2-D models of bodies from motion capture data illustrate the benefits of our approach. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
MILAB; HuPBA; no proj |
Approved |
no |
|
|
Call Number |
Admin @ si @ PTI2017 |
Serial |
2841 |
|
Permanent link to this record |
|
|
|
|
Author |
Frederic Sampedro; Anna Domenech; Sergio Escalera; Ignasi Carrio |
|
|
Title |
Computing quantitative indicators of structural renal damage in pediatric DMSA scans |
Type |
Journal Article |
|
Year |
2017 |
Publication |
Revista Española de Medicina Nuclear e Imagen Molecular |
Abbreviated Journal |
REMNIM |
|
|
Volume |
36 |
Issue |
2 |
Pages |
72-77 |
|
|
Keywords |
|
|
|
Abstract |
OBJECTIVES:
The proposal and implementation of a computational framework for the quantification of structural renal damage from 99mTc-dimercaptosuccinic acid (DMSA) scans. The aim of this work is to propose, implement, and validate a computational framework for the quantification of structural renal damage from DMSA scans and in an observer-independent manner.
MATERIALS AND METHODS:
From a set of 16 pediatric DMSA-positive scans and 16 matched controls and using both expert-guided and automatic approaches, a set of image-derived quantitative indicators was computed based on the relative size, intensity and histogram distribution of the lesion. A correlation analysis was conducted in order to investigate the association of these indicators with other clinical data of interest in this scenario, including C-reactive protein (CRP), white cell count, vesicoureteral reflux, fever, relative perfusion, and the presence of renal sequelae in a 6-month follow-up DMSA scan.
RESULTS:
A fully automatic lesion detection and segmentation system was able to successfully classify DMSA-positive from negative scans (AUC=0.92, sensitivity=81% and specificity=94%). The image-computed relative size of the lesion correlated with the presence of fever and CRP levels (p<0.05), and a measurement derived from the distribution histogram of the lesion obtained significant performance results in the detection of permanent renal damage (AUC=0.86, sensitivity=100% and specificity=75%).
CONCLUSIONS:
The proposal and implementation of a computational framework for the quantification of structural renal damage from DMSA scans showed a promising potential to complement visual diagnosis and non-imaging indicators. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
HuPBA;MILAB; no menciona |
Approved |
no |
|
|
Call Number |
Admin @ si @ SDE2017 |
Serial |
2842 |
|
Permanent link to this record |
|
|
|
|
Author |
Jose Garcia-Rodriguez; Isabelle Guyon; Sergio Escalera; Alexandra Psarrou; Andrew Lewis; Miguel Cazorla |
|
|
Title |
Editorial: Special Issue on Computational Intelligence for Vision and Robotics |
Type |
Journal Article |
|
Year |
2017 |
Publication |
Neural Computing and Applications |
Abbreviated Journal |
Neural Computing and Applications |
|
|
Volume |
28 |
Issue |
5 |
Pages |
853–854 |
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
HuPBA;MILAB; no menciona |
Approved |
no |
|
|
Call Number |
Admin @ si @ GGE2017 |
Serial |
2845 |
|
Permanent link to this record |
|
|
|
|
Author |
Simon Jégou; Michal Drozdzal; David Vazquez; Adriana Romero; Yoshua Bengio |
|
|
Title |
The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation |
Type |
Conference Article |
|
Year |
2017 |
Publication |
IEEE Conference on Computer Vision and Pattern Recognition Workshops |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
Semantic Segmentation |
|
|
Abstract |
State-of-the-art approaches for semantic image segmentation are built on Convolutional Neural Networks (CNNs). The typical segmentation architecture is composed of (a) a downsampling path responsible for extracting coarse semantic features, followed by (b) an upsampling path trained to recover the input image resolution at the output of the model and, optionally, (c) a post-processing module (e.g. Conditional Random Fields) to refine the model predictions.
Recently, a new CNN architecture, Densely Connected Convolutional Networks (DenseNets), has shown excellent results on image classification tasks. The idea of DenseNets is based on the observation that if each layer is directly connected to every other layer in a feed-forward fashion then the network will be more accurate and easier to train.
In this paper, we extend DenseNets to deal with the problem of semantic segmentation. We achieve state-of-the-art results on urban scene benchmark datasets such as CamVid and Gatech, without any further post-processing module nor pretraining. Moreover, due to smart construction of the model, our approach has much less parameters than currently published best entries for these datasets. |
|
|
Address |
Honolulu; USA; July 2017 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CVPRW |
|
|
Notes |
MILAB; ADAS; 600.076; 600.085; 601.281 |
Approved |
no |
|
|
Call Number |
ADAS @ adas @ JDV2016 |
Serial |
2866 |
|
Permanent link to this record |
|
|
|
|
Author |
Antonio Lopez; Jiaolong Xu; Jose Luis Gomez; David Vazquez; German Ros |
|
|
Title |
From Virtual to Real World Visual Perception using Domain Adaptation -- The DPM as Example |
Type |
Book Chapter |
|
Year |
2017 |
Publication |
Domain Adaptation in Computer Vision Applications |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
13 |
Pages |
243-258 |
|
|
Keywords |
Domain Adaptation |
|
|
Abstract |
Supervised learning tends to produce more accurate classifiers than unsupervised learning in general. This implies that training data is preferred with annotations. When addressing visual perception challenges, such as localizing certain object classes within an image, the learning of the involved classifiers turns out to be a practical bottleneck. The reason is that, at least, we have to frame object examples with bounding boxes in thousands of images. A priori, the more complex the model is regarding its number of parameters, the more annotated examples are required. This annotation task is performed by human oracles, which ends up in inaccuracies and errors in the annotations (aka ground truth) since the task is inherently very cumbersome and sometimes ambiguous. As an alternative we have pioneered the use of virtual worlds for collecting such annotations automatically and with high precision. However, since the models learned with virtual data must operate in the real world, we still need to perform domain adaptation (DA). In this chapter we revisit the DA of a deformable part-based model (DPM) as an exemplifying case of virtual- to-real-world DA. As a use case, we address the challenge of vehicle detection for driver assistance, using different publicly available virtual-world data. While doing so, we investigate questions such as: how does the domain gap behave due to virtual-vs-real data with respect to dominant object appearance per domain, as well as the role of photo-realism in the virtual world. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer |
Place of Publication |
|
Editor |
Gabriela Csurka |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ADAS; 600.085; 601.223; 600.076; 600.118 |
Approved |
no |
|
|
Call Number |
ADAS @ adas @ LXG2017 |
Serial |
2872 |
|
Permanent link to this record |
|
|
|
|
Author |
Pau Riba; Josep Llados; Alicia Fornes; Anjan Dutta |
|
|
Title |
Large-scale graph indexing using binary embeddings of node contexts for information spotting in document image databases |
Type |
Journal Article |
|
Year |
2017 |
Publication |
Pattern Recognition Letters |
Abbreviated Journal |
PRL |
|
|
Volume |
87 |
Issue |
|
Pages |
203-211 |
|
|
Keywords |
|
|
|
Abstract |
Graph-based representations are experiencing a growing usage in visual recognition and retrieval due to their representational power in front of classical appearance-based representations. However, retrieving a query graph from a large dataset of graphs implies a high computational complexity. The most important property for a large-scale retrieval is the search time complexity to be sub-linear in the number of database examples. With this aim, in this paper we propose a graph indexation formalism applied to visual retrieval. A binary embedding is defined as hashing keys for graph nodes. Given a database of labeled graphs, graph nodes are complemented with vectors of attributes representing their local context. Then, each attribute vector is converted to a binary code applying a binary-valued hash function. Therefore, graph retrieval is formulated in terms of finding target graphs in the database whose nodes have a small Hamming distance from the query nodes, easily computed with bitwise logical operators. As an application example, we validate the performance of the proposed methods in different real scenarios such as handwritten word spotting in images of historical documents or symbol spotting in architectural floor plans. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.097; 602.006; 603.053; 600.121 |
Approved |
no |
|
|
Call Number |
RLF2017b |
Serial |
2873 |
|
Permanent link to this record |