Publicacions CVC -- Query Results

<< 1 2 3 4 5 6 7 8 9 10 >> [11–11]

Details

Records
Author	Arnau Baro; Pau Riba; Jorge Calvo-Zaragoza; Alicia Fornes
Title	Optical Music Recognition by Recurrent Neural Networks			Type	Conference Article
Year	2017	Publication	14th IAPR International Workshop on Graphics Recognition	Abbreviated Journal
Volume		Issue		Pages	25-26
Keywords	Optical Music Recognition; Recurrent Neural Network; Long Short-Term Memory
Abstract	Optical Music Recognition is the task of transcribing a music score into a machine readable format. Many music scores are written in a single staff, and therefore, they could be treated as a sequence. Therefore, this work explores the use of Long Short-Term Memory (LSTM) Recurrent Neural Networks for reading the music score sequentially, where the LSTM helps in keeping the context. For training, we have used a synthetic dataset of more than 40000 images, labeled at primitive level
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICDAR
Notes	DAG; 600.097; 601.302; 600.121			Approved	no
Call Number	Admin @ si @ BRC2017			Serial	3056
Permanent link to this record



Author	Arash Akbarinia; Raquel Gil Rodriguez; C. Alejandro Parraga
Title	Colour Constancy: Biologically-inspired Contrast Variant Pooling Mechanism			Type	Conference Article
Year	2017	Publication	28th British Machine Vision Conference	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Pooling is a ubiquitous operation in image processing algorithms that allows for higher-level processes to collect relevant low-level features from a region of interest. Currently, max-pooling is one of the most commonly used operators in the computational literature. However, it can lack robustness to outliers due to the fact that it relies merely on the peak of a function. Pooling mechanisms are also present in the primate visual cortex where neurons of higher cortical areas pool signals from lower ones. The receptive fields of these neurons have been shown to vary according to the contrast by aggregating signals over a larger region in the presence of low contrast stimuli. We hypothesise that this contrast-variant-pooling mechanism can address some of the shortcomings of maxpooling. We modelled this contrast variation through a histogram clipping in which the percentage of pooled signal is inversely proportional to the local contrast of an image. We tested our hypothesis by applying it to the phenomenon of colour constancy where a number of popular algorithms utilise a max-pooling step (e.g. White-Patch, Grey-Edge and Double-Opponency). For each of these methods, we investigated the consequences of replacing their original max-pooling by the proposed contrast-variant-pooling. Our experiments on three colour constancy benchmark datasets suggest that previous results can significantly improve by adopting a contrast-variant-pooling mechanism.
Address	London; September 2017
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	BMVC
Notes	NEUROBIT; 600.068; 600.072			Approved	no
Call Number	Admin @ si @ AGP2017			Serial	2992
Permanent link to this record



Author	Arash Akbarinia; Karl R. Gegenfurtner
Title	Metameric Mismatching in Natural and Artificial Reflectances			Type	Journal Article
Year	2017	Publication	Journal of Vision	Abbreviated Journal	JV
Volume	17	Issue	10	Pages	390-390
Keywords	Metamer; colour perception; spectral discrimination; photoreceptors
Abstract	The human visual system and most digital cameras sample the continuous spectral power distribution through three classes of receptors. This implies that two distinct spectral reflectances can result in identical tristimulus values under one illuminant and differ under another – the problem of metamer mismatching. It is still debated how frequent this issue arises in the real world, using naturally occurring reflectance functions and common illuminants. We gathered more than ten thousand spectral reflectance samples from various sources, covering a wide range of environments (e.g., flowers, plants, Munsell chips) and evaluated their responses under a number of natural and artificial source of lights. For each pair of reflectance functions, we estimated the perceived difference using the CIE-defined distance ΔE2000 metric in Lab color space. The degree of metamer mismatching depended on the lower threshold value l when two samples would be considered to lead to equal sensor excitations (ΔE < l), and on the higher threshold value h when they would be considered different. For example, for l=h=1, we found that 43.129 comparisons out of a total of 6×107 pairs would be considered metameric (1 in 104). For l=1 and h=5, this number reduced to 705 metameric pairs (2 in 106). Extreme metamers, for instance l=1 and h=10, were rare (22 pairs or 6 in 108), as were instances where the two members of a metameric pair would be assigned to different color categories. Not unexpectedly, we observed variations among different reflectance databases and illuminant spectra with more frequency under artificial illuminants than natural ones. Overall, our numbers are not very different from those obtained earlier (Foster et al, JOSA A, 2006). However, our results also show that the degree of metamerism is typically not very strong and that category switches hardly ever occur.
Address	Florida, USA; May 2017
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	NEUROBIT; no menciona			Approved	no
Call Number	Admin @ si @ AkG2017			Serial	2899
Permanent link to this record



Author	Arash Akbarinia; C. Alejandro Parraga; Marta Exposito; Bogdan Raducanu; Xavier Otazu
Title	Can biological solutions help computers detect symmetry?			Type	Conference Article
Year	2017	Publication	40th European Conference on Visual Perception	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract
Address	Berlin; Germany; August 2017
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ECVP
Notes	NEUROBIT			Approved	no
Call Number	Admin @ si @ APE2017			Serial	2995
Permanent link to this record



Author	Arash Akbarinia
Title	Computational Model of Visual Perception: From Colour to Form			Type	Book Whole
Year	2017	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	The original idea of this project was to study the role of colour in the challenging task of object recognition. We started by extending previous research on colour naming showing that it is feasible to capture colour terms through parsimonious ellipsoids. Although, the results of our model exceeded state-of-the-art in two benchmark datasets, we realised that the two phenomena of metameric lights and colour constancy must be addressed prior to any further colour processing. Our investigation of metameric pairs reached the conclusion that they are infrequent in real world scenarios. Contrary to that, the illumination of a scene often changes dramatically. We addressed this issue by proposing a colour constancy model inspired by the dynamical centre-surround adaptation of neurons in the visual cortex. This was implemented through two overlapping asymmetric Gaussians whose variances and heights are adjusted according to the local contrast of pixels. We complemented this model with a generic contrast-variant pooling mechanism that inversely connect the percentage of pooled signal to the local contrast of a region. The results of our experiments on four benchmark datasets were indeed promising: the proposed model, although simple, outperformed even learning-based approaches in many cases. Encouraged by the success of our contrast-variant surround modulation, we extended this approach to detect boundaries of objects. We proposed an edge detection model based on the first derivative of the Gaussian kernel. We incorporated four types of surround: full, far, iso- and orthogonal-orientation. Furthermore, we accounted for the pooling mechanism at higher cortical areas and the shape feedback sent to lower areas. Our results in three benchmark datasets showed significant improvement over non-learning algorithms. To summarise, we demonstrated that biologically-inspired models offer promising solutions to computer vision problems, such as, colour naming, colour constancy and edge detection. We believe that the greatest contribution of this Ph.D dissertation is modelling the concept of dynamic surround modulation that shows the significance of contrast-variant surround integration. The models proposed here are grounded on only a portion of what we know about the human visual system. Therefore, it is only natural to complement them accordingly in future works.
Address	October 2017
Corporate Author				Thesis	Ph.D. thesis
Publisher	Ediciones Graficas Rey	Place of Publication		Editor	C. Alejandro Parraga
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN	978-84-945373-4-9	Medium
Area		Expedition		Conference
Notes	NEUROBIT			Approved	no
Call Number	Admin @ si @ Akb2017			Serial	3019
Permanent link to this record



Author	Antonio Lopez; Jiaolong Xu; Jose Luis Gomez; David Vazquez; German Ros
Title	From Virtual to Real World Visual Perception using Domain Adaptation -- The DPM as Example			Type	Book Chapter
Year	2017	Publication	Domain Adaptation in Computer Vision Applications	Abbreviated Journal
Volume		Issue	13	Pages	243-258
Keywords	Domain Adaptation
Abstract	Supervised learning tends to produce more accurate classifiers than unsupervised learning in general. This implies that training data is preferred with annotations. When addressing visual perception challenges, such as localizing certain object classes within an image, the learning of the involved classifiers turns out to be a practical bottleneck. The reason is that, at least, we have to frame object examples with bounding boxes in thousands of images. A priori, the more complex the model is regarding its number of parameters, the more annotated examples are required. This annotation task is performed by human oracles, which ends up in inaccuracies and errors in the annotations (aka ground truth) since the task is inherently very cumbersome and sometimes ambiguous. As an alternative we have pioneered the use of virtual worlds for collecting such annotations automatically and with high precision. However, since the models learned with virtual data must operate in the real world, we still need to perform domain adaptation (DA). In this chapter we revisit the DA of a deformable part-based model (DPM) as an exemplifying case of virtual- to-real-world DA. As a use case, we address the challenge of vehicle detection for driver assistance, using different publicly available virtual-world data. While doing so, we investigate questions such as: how does the domain gap behave due to virtual-vs-real data with respect to dominant object appearance per domain, as well as the role of photo-realism in the virtual world.
Address
Corporate Author				Thesis
Publisher	Springer	Place of Publication		Editor	Gabriela Csurka
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ADAS; 600.085; 601.223; 600.076; 600.118			Approved	no
Call Number	ADAS @ adas @ LXG2017			Serial	2872
Permanent link to this record



Author	Antonio Lopez; Gabriel Villalonga; Laura Sellart; German Ros; David Vazquez; Jiaolong Xu; Javier Marin; Azadeh S. Mozafari
Title	Training my car to see using virtual worlds			Type	Journal Article
Year	2017	Publication	Image and Vision Computing	Abbreviated Journal	IMAVIS
Volume	38	Issue		Pages	102-118
Keywords
Abstract	Computer vision technologies are at the core of different advanced driver assistance systems (ADAS) and will play a key role in oncoming autonomous vehicles too. One of the main challenges for such technologies is to perceive the driving environment, i.e. to detect and track relevant driving information in a reliable manner (e.g. pedestrians in the vehicle route, free space to drive through). Nowadays it is clear that machine learning techniques are essential for developing such a visual perception for driving. In particular, the standard working pipeline consists of collecting data (i.e. on-board images), manually annotating the data (e.g. drawing bounding boxes around pedestrians), learning a discriminative data representation taking advantage of such annotations (e.g. a deformable part-based model, a deep convolutional neural network), and then assessing the reliability of such representation with the acquired data. In the last two decades most of the research efforts focused on representation learning (first, designing descriptors and learning classifiers; later doing it end-to-end). Hence, collecting data and, especially, annotating it, is essential for learning good representations. While this has been the case from the very beginning, only after the disruptive appearance of deep convolutional neural networks that it became a serious issue due to their data hungry nature. In this context, the problem is that manual data annotation is a tiresome work prone to errors. Accordingly, in the late 00’s we initiated a research line consisting of training visual models using photo-realistic computer graphics, especially focusing on assisted and autonomous driving. In this paper, we summarize such a work and show how it has become a new tendency with increasing acceptance.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ADAS; 600.118			Approved	no
Call Number	Admin @ si @ LVS2017			Serial	2985
Permanent link to this record



Author	Antonio Lopez; Atsushi Imiya; Tomas Pajdla; Jose Manuel Alvarez
Title	Computer Vision in Vehicle Technology: Land, Sea & Air			Type	Book Whole
Year	2017	Publication		Abbreviated Journal
Volume		Issue		Pages	161-163
Keywords
Abstract	Summary This chapter examines different vision-based commercial solutions for real-live problems related to vehicles. It is worth mentioning the recent astonishing performance of deep convolutional neural networks (DCNNs) in difficult visual tasks such as image classification, object recognition/localization/detection, and semantic segmentation. In fact, different DCNN architectures are already being explored for low-level tasks such as optical flow and disparity computation, and higher level ones such as place recognition.
Address
Corporate Author				Thesis
Publisher	John Wiley & Sons, Ltd	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN	978-1-118-86807-2	Medium
Area		Expedition		Conference
Notes	ADAS; 600.118			Approved	no
Call Number	Admin @ si @ LIP2017a			Serial	2937
Permanent link to this record



Author	Anjan Dutta; Pau Riba; Josep Llados; Alicia Fornes
Title	Pyramidal Stochastic Graphlet Embedding for Document Pattern Classification			Type	Conference Article
Year	2017	Publication	14th International Conference on Document Analysis and Recognition	Abbreviated Journal
Volume		Issue		Pages	33-38
Keywords	graph embedding; hierarchical graph representation; graph clustering; stochastic graphlet embedding; graph classification
Abstract	Document pattern classification methods using graphs have received a lot of attention because of its robust representation paradigm and rich theoretical background. However, the way of preserving and the process for delineating documents with graphs introduce noise in the rendition of underlying data, which creates instability in the graph representation. To deal with such unreliability in representation, in this paper, we propose Pyramidal Stochastic Graphlet Embedding (PSGE). Given a graph representing a document pattern, our method first computes a graph pyramid by successively reducing the base graph. Once the graph pyramid is computed, we apply Stochastic Graphlet Embedding (SGE) for each level of the pyramid and combine their embedded representation to obtain a global delineation of the original graph. The consideration of pyramid of graphs rather than just a base graph extends the representational power of the graph embedding, which reduces the instability caused due to noise and distortion. When plugged with support vector machine, our proposed PSGE has outperformed the state-of-the-art results in recognition of handwritten words as well as graphical symbols
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICDAR
Notes	DAG; 600.097; 601.302; 600.121			Approved	no
Call Number	Admin @ si @ DRL2017			Serial	3054
Permanent link to this record



Author	Aniol Lidon; Marc Bolaños; Mariella Dimiccoli; Petia Radeva; Maite Garolera; Xavier Giro
Title	Semantic Summarization of Egocentric Photo-Stream Events			Type	Conference Article
Year	2017	Publication	2nd Workshop on Lifelogging Tools and Applications	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract
Address	San Francisco; USA; October 2017
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN	978-1-4503-5503-2	Medium
Area		Expedition		Conference	ACMW (LTA)
Notes	MILAB; no proj			Approved	no
Call Number	Admin @ si @ LBD2017			Serial	3024
Permanent link to this record



Author	Angel Valencia; Roger Idrovo; Angel Sappa; Douglas Plaza; Daniel Ochoa
Title	A 3D Vision Based Approach for Optimal Grasp of Vacuum Grippers			Type	Conference Article
Year	2017	Publication	IEEE International Workshop of Electronics, Control, Measurement, Signals and their application to Mechatronics	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	In general, robot grasping approaches are based on the usage of multi-finger grippers. However, when large size objects need to be manipulated vacuum grippers are preferred, instead of finger based grippers. This paper aims to estimate the best picking place for a two suction cups vacuum gripper, when planar objects with an unknown size and geometry are considered. The approach is based on the estimation of geometric properties of object’s shape from a partial cloud of points (a single 3D view), in such a way that combine with considerations of a theoretical model to generate an optimal contact point that minimizes the vacuum force needed to guarantee a grasp. Experimental results in real scenarios are presented to show the validity of the proposed approach.
Address	San Sebastian; Spain; May 2017
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ECMSM
Notes	ADAS; 600.086; 600.118			Approved	no
Call Number	Admin @ si @ VIS2017			Serial	2917
Permanent link to this record



Author	Andrei Polzounov; Artsiom Ablavatski; Sergio Escalera; Shijian Lu; Jianfei Cai
Title	WordFences: Text Localization and Recognition			Type	Conference Article
Year	2017	Publication	24th International Conference on Image Processing	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract
Address	Beijing; China; September 2017
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICIP
Notes	HUPBA; no menciona			Approved	no
Call Number	Admin @ si @ PAE2017			Serial	3007
Permanent link to this record



Author	Alicia Fornes; Veronica Romero; Arnau Baro; Juan Ignacio Toledo; Joan Andreu Sanchez; Enrique Vidal; Josep Llados
Title	ICDAR2017 Competition on Information Extraction in Historical Handwritten Records			Type	Conference Article
Year	2017	Publication	14th International Conference on Document Analysis and Recognition	Abbreviated Journal
Volume		Issue		Pages	1389-1394
Keywords
Abstract	The extraction of relevant information from historical handwritten document collections is one of the key steps in order to make these manuscripts available for access and searches. In this competition, the goal is to detect the named entities and assign each of them a semantic category, and therefore, to simulate the filling in of a knowledge database. This paper describes the dataset, the tasks, the evaluation metrics, the participants methods and the results.
Address	Kyoto; Japan; November 2017
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICDAR
Notes	DAG; 600.097; 601.225; 600.121			Approved	no
Call Number	Admin @ si @ FRB2017			Serial	3052
Permanent link to this record



Author	Alicia Fornes; Beata Megyesi; Joan Mas
Title	Transcription of Encoded Manuscripts with Image Processing Techniques			Type	Conference Article
Year	2017	Publication	Digital Humanities Conference	Abbreviated Journal
Volume		Issue		Pages	441-443
Keywords
Abstract
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	DH
Notes	DAG; 600.097; 600.121			Approved	no
Call Number	Admin @ si @ FMM2017			Serial	3061
Permanent link to this record



Author	Alexey Dosovitskiy; German Ros; Felipe Codevilla; Antonio Lopez; Vladlen Koltun
Title	CARLA: An Open Urban Driving Simulator			Type	Conference Article
Year	2017	Publication	1st Annual Conference on Robot Learning. Proceedings of Machine Learning	Abbreviated Journal
Volume	78	Issue		Pages	1-16
Keywords	Autonomous driving; sensorimotor control; simulation
Abstract	We introduce CARLA, an open-source simulator for autonomous driving research. CARLA has been developed from the ground up to support development, training, and validation of autonomous urban driving systems. In addition to open-source code and protocols, CARLA provides open digital assets (urban layouts, buildings, vehicles) that were created for this purpose and can be used freely. The simulation platform supports flexible specification of sensor suites and environmental conditions. We use CARLA to study the performance of three approaches to autonomous driving: a classic modular pipeline, an endto-end model trained via imitation learning, and an end-to-end model trained via reinforcement learning. The approaches are evaluated in controlled scenarios of increasing difficulty, and their performance is examined via metrics provided by CARLA, illustrating the platform’s utility for autonomous driving research.
Address	Mountain View; CA; USA; November 2017
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	CORL
Notes	ADAS; 600.085; 600.118			Approved	no
Call Number	Admin @ si @ DRC2017			Serial	2988
Permanent link to this record