Publicacions CVC -- Query Results

Records
Author	Simon Jégou; Michal Drozdzal; David Vazquez; Adriana Romero; Yoshua Bengio
Title	The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation			Type	Conference Article
Year	2017	Publication	IEEE Conference on Computer Vision and Pattern Recognition Workshops	Abbreviated Journal
Volume		Issue		Pages
Keywords	Semantic Segmentation
Abstract	State-of-the-art approaches for semantic image segmentation are built on Convolutional Neural Networks (CNNs). The typical segmentation architecture is composed of (a) a downsampling path responsible for extracting coarse semantic features, followed by (b) an upsampling path trained to recover the input image resolution at the output of the model and, optionally, (c) a post-processing module (e.g. Conditional Random Fields) to refine the model predictions. Recently, a new CNN architecture, Densely Connected Convolutional Networks (DenseNets), has shown excellent results on image classification tasks. The idea of DenseNets is based on the observation that if each layer is directly connected to every other layer in a feed-forward fashion then the network will be more accurate and easier to train. In this paper, we extend DenseNets to deal with the problem of semantic segmentation. We achieve state-of-the-art results on urban scene benchmark datasets such as CamVid and Gatech, without any further post-processing module or pretraining. Moreover, due to smart construction of the model, our approach has far fewer parameters than the currently published best entries for these datasets.
Address	Honolulu; USA; July 2017
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	CVPRW
Notes	MILAB; ADAS; 600.076; 600.085; 601.281			Approved	no
Call Number	ADAS @ adas @ JDV2016			Serial	2866
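
The dense connectivity described in the abstract of this record (each layer connected to every other layer in a feed-forward fashion) can be illustrated with a small bookkeeping sketch. It only tracks how many feature channels each layer would receive if every layer's output is concatenated to what came before. The function name `dense_block` and the parameters `input_channels`, `num_layers`, and `growth_rate` are illustrative assumptions, not identifiers or values taken from the paper.

```python
def dense_block(input_channels, num_layers, growth_rate):
    """Track channel counts through a densely connected block.

    Assumption for illustration: each layer receives the concatenation of
    the block input and all earlier layer outputs, and contributes
    `growth_rate` new feature maps.
    """
    seen_by_layer = []
    channels = input_channels
    for _ in range(num_layers):
        seen_by_layer.append(channels)  # channels this layer receives
        channels += growth_rate         # its output is concatenated on
    return seen_by_layer, channels


# Hypothetical sizes, chosen only to make the arithmetic easy to follow.
inputs_per_layer, output_channels = dense_block(48, 4, 16)
print(inputs_per_layer, output_channels)
```

Because each layer adds only `growth_rate` maps while reusing everything before it, the per-layer width can stay small, which is one plausible reading of the parameter savings the abstract mentions.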



Author	Pau Riba; Josep Llados; Alicia Fornes; Anjan Dutta
Title	Large-scale graph indexing using binary embeddings of node contexts for information spotting in document image databases			Type	Journal Article
Year	2017	Publication	Pattern Recognition Letters	Abbreviated Journal	PRL
Volume	87	Issue		Pages	203-211
Keywords
Abstract	Graph-based representations are experiencing a growing usage in visual recognition and retrieval due to their representational power compared with classical appearance-based representations. However, retrieving a query graph from a large dataset of graphs implies a high computational complexity. The most important property for large-scale retrieval is that the search time complexity be sub-linear in the number of database examples. With this aim, in this paper we propose a graph indexation formalism applied to visual retrieval. A binary embedding is defined as hashing keys for graph nodes. Given a database of labeled graphs, graph nodes are complemented with vectors of attributes representing their local context. Then, each attribute vector is converted to a binary code by applying a binary-valued hash function. Therefore, graph retrieval is formulated in terms of finding target graphs in the database whose nodes have a small Hamming distance from the query nodes, easily computed with bitwise logical operators. As an application example, we validate the performance of the proposed methods in different real scenarios such as handwritten word spotting in images of historical documents or symbol spotting in architectural floor plans.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.097; 602.006; 603.053; 600.121			Approved	no
Call Number	RLF2017b			Serial	2873
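
The Hamming-distance comparison with bitwise operators mentioned in the abstract of this record can be sketched as follows. Binary codes are stored as Python integers, XOR marks the differing bits, and a bit count gives the distance. The helper names, the 8-bit toy codes, and the node identifiers are hypothetical. The lookup below is a plain linear scan for clarity, so it does not reproduce the sub-linear indexing the paper proposes.

```python
def hamming_distance(code_a, code_b):
    """Number of differing bits between two integer-encoded binary codes."""
    return bin(code_a ^ code_b).count("1")


def nearest_nodes(query_code, database, max_distance):
    """Return (node_id, distance) pairs within max_distance, closest first."""
    hits = []
    for node_id, code in database.items():
        distance = hamming_distance(query_code, code)
        if distance <= max_distance:
            hits.append((node_id, distance))
    return sorted(hits, key=lambda hit: (hit[1], hit[0]))


# Toy database of node codes (made-up values for illustration only).
database = {"n1": 0b10110010, "n2": 0b10110011, "n3": 0b01001101}
matches = nearest_nodes(0b10110110, database, max_distance=2)
print(matches)
```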



Author	Daniel Hernandez; Antonio Espinosa; David Vazquez; Antonio Lopez; Juan Carlos Moure
Title	Embedded Real-time Stixel Computation			Type	Conference Article
Year	2017	Publication	GPU Technology Conference	Abbreviated Journal
Volume		Issue		Pages
Keywords	GPU; CUDA; Stixels; Autonomous Driving
Abstract
Address	Silicon Valley; USA; May 2017
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	GTC
Notes	ADAS; 600.118			Approved	no
Call Number	ADAS @ adas @ HEV2017a			Serial	2879



Author	David Vazquez; Jorge Bernal; F. Javier Sanchez; Gloria Fernandez Esparrach; Antonio Lopez; Adriana Romero; Michal Drozdzal; Aaron Courville
Title	A Benchmark for Endoluminal Scene Segmentation of Colonoscopy Images			Type	Conference Article
Year	2017	Publication	31st International Congress and Exhibition on Computer Assisted Radiology and Surgery	Abbreviated Journal
Volume		Issue		Pages
Keywords	Deep Learning; Medical Imaging
Abstract	Colorectal cancer (CRC) is the third leading cause of cancer death worldwide. Currently, the standard approach to reduce CRC-related mortality is to perform regular screening in search of polyps, and colonoscopy is the screening tool of choice. The main limitations of this screening procedure are the polyp miss-rate and the inability to perform visual assessment of polyp malignancy. These drawbacks can be reduced by designing Decision Support Systems (DSS) aiming to help clinicians in the different stages of the procedure by providing endoluminal scene segmentation. Thus, in this paper, we introduce an extended benchmark of colonoscopy images, with the hope of establishing a new strong benchmark for colonoscopy image analysis research. We provide new baselines on this dataset by training standard fully convolutional networks (FCN) for semantic segmentation and significantly outperforming, without any further post-processing, prior results in endoluminal scene segmentation.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	CARS
Notes	ADAS; MV; 600.075; 600.085; 600.076; 601.281; 600.118			Approved	no
Call Number	ADAS @ adas @ VBS2017a			Serial	2880



Author	David Geronimo; David Vazquez; Arturo de la Escalera
Title	Vision-Based Advanced Driver Assistance Systems			Type	Book Chapter
Year	2017	Publication	Computer Vision in Vehicle Technology: Land, Sea, and Air	Abbreviated Journal
Volume		Issue		Pages
Keywords	ADAS; Autonomous Driving
Abstract
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ADAS; 600.118			Approved	no
Call Number	ADAS @ adas @ GVE2017			Serial	2881



Author	Lluis Gomez; Dimosthenis Karatzas
Title	TextProposals: a Text‐specific Selective Search Algorithm for Word Spotting in the Wild			Type	Journal Article
Year	2017	Publication	Pattern Recognition	Abbreviated Journal	PR
Volume	70	Issue		Pages	60-74
Keywords
Abstract	Motivated by the success of powerful but expensive techniques to recognize words in a holistic way (Goel et al., 2013; Almazán et al., 2014; Jaderberg et al., 2016), object proposal techniques emerge as an alternative to traditional text detectors. In this paper we introduce a novel object proposals method that is specifically designed for text. We rely on a similarity based region grouping algorithm that generates a hierarchy of word hypotheses. Over the nodes of this hierarchy it is possible to apply a holistic word recognition method in an efficient way. Our experiments demonstrate that the presented method is superior in its ability to produce good quality word proposals when compared with class-independent algorithms. We show impressive recall rates with a few thousand proposals in different standard benchmarks, including focused or incidental text datasets, and multi-language scenarios. Moreover, the combination of our object proposals with existing whole-word recognizers (Almazán et al., 2014; Jaderberg et al., 2016) shows competitive performance in end-to-end word spotting, and, in some benchmarks, outperforms previously published results. Concretely, in the challenging ICDAR2015 Incidental Text dataset, we outperform by more than 10% in F-score the best-performing method in the last ICDAR Robust Reading Competition (Karatzas, 2015). Source code of the complete end-to-end system is available at https://github.com/lluisgomez/TextProposals.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.084; 601.197; 600.121; 600.129			Approved	no
Call Number	Admin @ si @ GoK2017			Serial	2886



Author	Lluis Gomez; Anguelos Nicolaou; Dimosthenis Karatzas
Title	Improving patch‐based scene text script identification with ensembles of conjoined networks			Type	Journal Article
Year	2017	Publication	Pattern Recognition	Abbreviated Journal	PR
Volume	67	Issue		Pages	85-96
Keywords
Abstract
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.084; 600.121; 600.129			Approved	no
Call Number	Admin @ si @ GNK2017			Serial	2887



Author	Lluis Gomez; Y. Patel; Marçal Rusiñol; C.V. Jawahar; Dimosthenis Karatzas
Title	Self‐supervised learning of visual features through embedding images into text topic spaces			Type	Conference Article
Year	2017	Publication	30th IEEE Conference on Computer Vision and Pattern Recognition	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	End-to-end training from scratch of current deep architectures for new computer vision problems would require ImageNet-scale datasets, and this is not always possible. In this paper we present a method that is able to take advantage of freely available multi-modal content to train computer vision algorithms without human supervision. We put forward the idea of performing self-supervised learning of visual features by mining a large scale corpus of multi-modal (text and image) documents. We show that discriminative visual features can be learnt efficiently by training a CNN to predict the semantic context in which a particular image is more likely to appear as an illustration. For this we leverage the hidden semantic structures discovered in the text corpus with a well-known topic modeling technique. Our experiments demonstrate state-of-the-art performance in image classification, object detection, and multi-modal retrieval compared to recent self-supervised or natural-supervised approaches.
Address	Honolulu; Hawaii; July 2017
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	CVPR
Notes	DAG; 600.084; 600.121			Approved	no
Call Number	Admin @ si @ GPR2017			Serial	2889
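
The training signal described in the abstract of this record (a CNN predicting the semantic context, given by a topic model, in which an image appears) can be sketched as a loss between the network's output and a topic distribution. The abstract does not state which loss the authors use, so the soft-target cross-entropy below is an assumption, and the names `softmax` and `soft_cross_entropy` as well as the three-topic example are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (shifted for numerical stability)."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_cross_entropy(logits, topic_probs):
    """Cross-entropy between predicted probabilities and a topic distribution."""
    predicted = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(topic_probs, predicted) if t > 0)


# Made-up topic distribution for the text accompanying one image.
target = [0.7, 0.2, 0.1]
aligned = soft_cross_entropy([2.0, 0.5, 0.0], target)     # scores favor topic 0
misaligned = soft_cross_entropy([0.0, 0.5, 2.0], target)  # scores favor topic 2
print(aligned < misaligned)
```

Under this assumed loss, predictions that put more mass on the topics of the accompanying text are penalized less, which is what lets the text act as supervision without human labels.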



Author	Ivet Rafegas; Javier Vazquez; Robert Benavente; Maria Vanrell; Susana Alvarez
Title	Enhancing spatio-chromatic representation with more-than-three color coding for image description			Type	Journal Article
Year	2017	Publication	Journal of the Optical Society of America A	Abbreviated Journal	JOSA A
Volume	34	Issue	5	Pages	827-837
Keywords
Abstract	Extraction of spatio-chromatic features from color images is usually performed independently on each color channel. Usual 3D color spaces, such as RGB, present a high inter-channel correlation for natural images. This correlation can be reduced using color-opponent representations, but the spatial structure of regions with small color differences is not fully captured in two generic Red-Green and Blue-Yellow channels. To overcome these problems, we propose a new color coding that is adapted to the specific content of each image. Our proposal is based on two steps: (a) setting the number of channels to the number of distinctive colors we find in each image (avoiding the problem of channel correlation), and (b) building a channel representation that maximizes contrast differences within each color channel (avoiding the problem of low local contrast). We call this approach more-than-three color coding (MTT) to emphasize the fact that the number of channels is adapted to the image content. The higher the color complexity of an image, the more channels can be used to represent it. Here we select distinctive colors as the most predominant in the image, which we call color pivots, and we build the new color coding using these color pivots as a basis. To evaluate the proposed approach we measure its efficiency in an image categorization task. We show how a generic descriptor improves its performance at the description level when applied on the MTT coding.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	CIC; 600.087			Approved	no
Call Number	Admin @ si @ RVB2017			Serial	2892



Author	Victor Vaquero; German Ros; Francesc Moreno-Noguer; Antonio Lopez; Alberto Sanfeliu
Title	Joint coarse-and-fine reasoning for deep optical flow			Type	Conference Article
Year	2017	Publication	24th International Conference on Image Processing	Abbreviated Journal
Volume		Issue		Pages	2558-2562
Keywords
Abstract	We propose a novel representation for dense pixel-wise estimation tasks using CNNs that boosts accuracy and reduces training time, by explicitly exploiting joint coarse-and-fine reasoning. The coarse reasoning is performed over a discrete classification space to obtain a general rough solution, while the fine details of the solution are obtained over a continuous regression space. In our approach both components are jointly estimated, which proved to be beneficial for improving estimation accuracy. Additionally, we propose a new network architecture, which combines coarse and fine components by treating the fine estimation as a refinement built on top of the coarse solution, and therefore adding details to the general prediction. We apply our approach to the challenging problem of optical flow estimation and empirically validate it against state-of-the-art CNN-based solutions trained from scratch and tested on large optical flow datasets.
Address	Beijing; China; September 2017
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICIP
Notes	ADAS; 600.118			Approved	no
Call Number	Admin @ si @ VRM2017			Serial	2898
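
The joint coarse-and-fine representation described in the abstract of this record can be illustrated by splitting one continuous value (for example, a single optical flow component) into a class index over a discrete space plus a continuous residual. The abstract does not specify how the discrete space is built, so the uniform binning, the bin-center residual, and the names `encode_coarse_fine` and `decode_coarse_fine` are assumptions made for this sketch.

```python
def encode_coarse_fine(value, low, high, num_bins):
    """Split a value into a coarse bin index and a fine offset from the bin center."""
    width = (high - low) / num_bins
    index = int((value - low) / width)
    index = max(0, min(index, num_bins - 1))  # clamp values outside [low, high)
    center = low + (index + 0.5) * width
    return index, value - center


def decode_coarse_fine(index, offset, low, high, num_bins):
    """Rebuild the continuous value from its coarse and fine parts."""
    width = (high - low) / num_bins
    return low + (index + 0.5) * width + offset


# Hypothetical range of [-8, 8) split into 16 unit-width bins.
index, offset = encode_coarse_fine(3.7, -8.0, 8.0, 16)
restored = decode_coarse_fine(index, offset, -8.0, 8.0, 16)
print(index, round(offset, 3), round(restored, 3))
```

In a learned setting, the index would be the target of a classification output and the offset the target of a regression output, so the regression only has to model a small residual.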



Author	Cristhian A. Aguilera-Carrasco; Angel Sappa; Cristhian Aguilera; Ricardo Toledo
Title	Cross-Spectral Local Descriptors via Quadruplet Network			Type	Journal Article
Year	2017	Publication	Sensors	Abbreviated Journal	SENS
Volume	17	Issue	4	Pages	873
Keywords
Abstract	This paper presents a novel CNN-based architecture, referred to as Q-Net, to learn local feature descriptors that are useful for matching image patches from two different spectral bands. Given correctly matched and non-matching cross-spectral image pairs, a quadruplet network is trained to map input image patches to a common Euclidean space, regardless of the input spectral band. Our approach is inspired by the recent success of triplet networks in the visible spectrum, but adapted for cross-spectral scenarios, where, for each matching pair, there are always two possible non-matching patches: one for each spectrum. Experimental evaluations on a public cross-spectral VIS-NIR dataset show that the proposed approach improves the state-of-the-art. Moreover, the proposed technique can also be used in mono-spectral settings, obtaining a similar performance to triplet network descriptors, but requiring less training data.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ADAS; 600.086; 600.118			Approved	no
Call Number	Admin @ si @ ASA2017			Serial	2914
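
The quadruplet idea in the abstract of this record (a matching cross-spectral pair plus one non-matching patch per spectrum, all mapped to a common Euclidean space) can be sketched with a margin-based loss over descriptor vectors. The abstract does not give the loss used by Q-Net, so the hinge form, the choice of the hardest negative, the default margin, and every name below are assumptions for illustration only.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def quadruplet_margin_loss(vis_match, nir_match, vis_neg, nir_neg, margin=1.0):
    """Hinge loss on a matching VIS/NIR pair and one negative per spectrum.

    Assumed form: the matching pair should be closer than the nearest
    cross-spectral negative by at least `margin`.
    """
    positive = euclidean(vis_match, nir_match)
    hardest_negative = min(
        euclidean(vis_match, nir_neg),  # NIR negative compared with the VIS patch
        euclidean(nir_match, vis_neg),  # VIS negative compared with the NIR patch
    )
    return max(0.0, margin + positive - hardest_negative)


# Toy 2-D descriptors (made-up values).
loss_small_margin = quadruplet_margin_loss([0, 0], [0, 1], [3, 1], [0, 5], margin=1.0)
loss_large_margin = quadruplet_margin_loss([0, 0], [0, 1], [3, 1], [0, 5], margin=3.0)
print(loss_small_margin, loss_large_margin)
```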



Author	Patricia Suarez; Angel Sappa; Boris X. Vintimilla
Title	Cross-Spectral Image Patch Similarity using Convolutional Neural Network			Type	Conference Article
Year	2017	Publication	IEEE International Workshop of Electronics, Control, Measurement, Signals and their application to Mechatronics	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	The ability to compare image regions (patches) has been the basis of many approaches to core computer vision problems, including object, texture and scene categorization. Hence, developing representations for image patches has been of interest in several works. The current work focuses on learning similarity between cross-spectral image patches with a 2 channel convolutional neural network (CNN) model. The proposed approach is an adaptation of a previous work, trying to obtain results similar to the state of the art but with low-cost hardware. Hence, obtained results are compared with both classical approaches, showing improvements, and a state-of-the-art CNN-based approach.
Address	San Sebastian; Spain; May 2017
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ECMSM
Notes	ADAS; 600.086; 600.118			Approved	no
Call Number	Admin @ si @ SSV2017a			Serial	2916



Author	Angel Valencia; Roger Idrovo; Angel Sappa; Douglas Plaza; Daniel Ochoa
Title	A 3D Vision Based Approach for Optimal Grasp of Vacuum Grippers			Type	Conference Article
Year	2017	Publication	IEEE International Workshop of Electronics, Control, Measurement, Signals and their application to Mechatronics	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	In general, robot grasping approaches are based on the usage of multi-finger grippers. However, when large objects need to be manipulated, vacuum grippers are preferred over finger-based grippers. This paper aims to estimate the best picking place for a two-suction-cup vacuum gripper, when planar objects with an unknown size and geometry are considered. The approach is based on the estimation of geometric properties of the object’s shape from a partial cloud of points (a single 3D view), which are combined with considerations from a theoretical model to generate an optimal contact point that minimizes the vacuum force needed to guarantee a grasp. Experimental results in real scenarios are presented to show the validity of the proposed approach.
Address	San Sebastian; Spain; May 2017
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ECMSM
Notes	ADAS; 600.086; 600.118			Approved	no
Call Number	Admin @ si @ VIS2017			Serial	2917



Author	Cristhian Aguilera; Xavier Soria; Angel Sappa; Ricardo Toledo
Title	RGBN Multispectral Images: a Novel Color Restoration Approach			Type	Conference Article
Year	2017	Publication	15th International Conference on Practical Applications of Agents and Multi-Agent Systems	Abbreviated Journal
Volume		Issue		Pages
Keywords	Multispectral Imaging; Free Sensor Model; Neural Network
Abstract	This paper describes a color restoration technique used to remove NIR information from single sensor cameras where color and near-infrared images are simultaneously acquired, referred to in the literature as RGBN images. The proposed approach is based on a neural network architecture that learns the NIR information contained in the RGBN images. The proposed approach is evaluated on real images obtained by using a pair of RGBN cameras. Additionally, qualitative comparisons with a naive color correction technique based on mean square error minimization are provided.
Address	Porto; Portugal; June 2017
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	PAAMS
Notes	ADAS; MSIAU; 600.118; 600.122			Approved	no
Call Number	Admin @ si @ ASS2017			Serial	2918
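
The abstract of this record compares against a color correction baseline based on mean square error minimization, without describing its form. As one hypothetical instance of such a baseline, the sketch below assumes each observed color channel equals the true channel plus a scaled copy of the NIR channel, and fits that scale in closed form by least squares. The linear model, the function names, and the sample values are assumptions and are not taken from the paper.

```python
def fit_nir_weight(observed, nir, target):
    """Least-squares weight w minimizing sum((target - (observed - w * nir))^2)."""
    numerator = sum(n * (o - t) for o, n, t in zip(observed, nir, target))
    denominator = sum(n * n for n in nir)
    return numerator / denominator


def remove_nir(observed, nir, weight):
    """Subtract the estimated NIR contribution from one color channel."""
    return [o - weight * n for o, n in zip(observed, nir)]


# Made-up pixel samples for a single channel: observed = target + 0.5 * nir.
target = [10.0, 20.0, 30.0]
nir = [4.0, 8.0, 2.0]
observed = [12.0, 24.0, 31.0]

weight = fit_nir_weight(observed, nir, target)
restored = remove_nir(observed, nir, weight)
print(weight, restored)
```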



Author	Patricia Suarez; Angel Sappa; Boris X. Vintimilla
Title	Learning to Colorize Infrared Images			Type	Conference Article
Year	2017	Publication	15th International Conference on Practical Applications of Agents and Multi-Agent Systems	Abbreviated Journal
Volume		Issue		Pages
Keywords	CNN in multispectral imaging; Image colorization
Abstract	This paper focuses on near infrared (NIR) image colorization by using a Generative Adversarial Network (GAN) architecture model. The proposed architecture consists of two stages. Firstly, it learns to colorize the given input, resulting in an RGB image. Then, in the second stage, a discriminative model is used to estimate the probability that the generated image came from the training dataset, rather than being automatically generated. The proposed model starts the learning process from scratch, because our set of images is very different from the dataset used in existing pre-trained models, so transfer learning strategies cannot be used. Infrared image colorization is an important problem when human perception needs to be considered, e.g., in remote sensing applications. Experimental results with a large set of real images are provided showing the validity of the proposed approach.
Address	Porto; Portugal; June 2017
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	PAAMS
Notes	ADAS; MSIAU; 600.086; 600.122; 600.118			Approved	no
Call Number	Admin @ si @			Serial	2919