Records | |||||
---|---|---|---|---|---|
Author | Arnau Baro; Pau Riba; Jorge Calvo-Zaragoza; Alicia Fornes | ||||
Title | Optical Music Recognition by Recurrent Neural Networks | Type | Conference Article | ||
Year | 2017 | Publication | 14th IAPR International Workshop on Graphics Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 25-26 | ||
Keywords | Optical Music Recognition; Recurrent Neural Network; Long Short-Term Memory | ||||
Abstract | Optical Music Recognition is the task of transcribing a music score into a machine-readable format. Many music scores are written on a single staff, so they can be treated as a sequence. This work therefore explores the use of Long Short-Term Memory (LSTM) Recurrent Neural Networks for reading the music score sequentially, where the LSTM helps to keep the context. For training, we used a synthetic dataset of more than 40,000 images, labeled at the primitive level. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; 600.097; 601.302; 600.121 | Approved | no | ||
Call Number | Admin @ si @ BRC2017 | Serial | 3056 | ||
Permanent link to this record | |||||
Author | Arash Akbarinia; Raquel Gil Rodriguez; C. Alejandro Parraga | ||||
Title | Colour Constancy: Biologically-inspired Contrast Variant Pooling Mechanism | Type | Conference Article | ||
Year | 2017 | Publication | 28th British Machine Vision Conference | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Pooling is a ubiquitous operation in image processing algorithms that allows higher-level processes to collect relevant low-level features from a region of interest. Currently, max-pooling is one of the most commonly used operators in the computational literature. However, it can lack robustness to outliers because it relies merely on the peak of a function. Pooling mechanisms are also present in the primate visual cortex, where neurons of higher cortical areas pool signals from lower ones. The receptive fields of these neurons have been shown to vary according to contrast, aggregating signals over a larger region in the presence of low-contrast stimuli. We hypothesise that this contrast-variant-pooling mechanism can address some of the shortcomings of max-pooling. We modelled this contrast variation through a histogram clipping in which the percentage of pooled signal is inversely proportional to the local contrast of an image. We tested our hypothesis by applying it to the phenomenon of colour constancy, where a number of popular algorithms utilise a max-pooling step (e.g. White-Patch, Grey-Edge and Double-Opponency). For each of these methods, we investigated the consequences of replacing their original max-pooling with the proposed contrast-variant-pooling. Our experiments on three colour constancy benchmark datasets suggest that previous results can be significantly improved by adopting a contrast-variant-pooling mechanism. | ||||
Address | London; September 2017 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | BMVC | ||
Notes | NEUROBIT; 600.068; 600.072 | Approved | no | ||
Call Number | Admin @ si @ AGP2017 | Serial | 2992 | ||
Permanent link to this record | |||||
Author | Arash Akbarinia; Karl R. Gegenfurtner | ||||
Title | Metameric Mismatching in Natural and Artificial Reflectances | Type | Journal Article | ||
Year | 2017 | Publication | Journal of Vision | Abbreviated Journal | JV |
Volume | 17 | Issue | 10 | Pages | 390-390 |
Keywords | Metamer; colour perception; spectral discrimination; photoreceptors | ||||
Abstract | The human visual system and most digital cameras sample the continuous spectral power distribution through three classes of receptors. This implies that two distinct spectral reflectances can result in identical tristimulus values under one illuminant and differ under another, a problem known as metamer mismatching. It is still debated how frequently this issue arises in the real world, using naturally occurring reflectance functions and common illuminants. We gathered more than ten thousand spectral reflectance samples from various sources, covering a wide range of environments (e.g., flowers, plants, Munsell chips), and evaluated their responses under a number of natural and artificial light sources. For each pair of reflectance functions, we estimated the perceived difference using the CIE-defined ΔE2000 distance metric in Lab color space. The degree of metamer mismatching depended on the lower threshold value l below which two samples would be considered to lead to equal sensor excitations (ΔE < l), and on the higher threshold value h above which they would be considered different. For example, for l=h=1, we found that 43,129 comparisons out of a total of 6×10^7 pairs would be considered metameric (1 in 10^4). For l=1 and h=5, this number reduced to 705 metameric pairs (2 in 10^6). Extreme metamers, for instance l=1 and h=10, were rare (22 pairs or 6 in 10^8), as were instances where the two members of a metameric pair would be assigned to different color categories. Not unexpectedly, we observed variations among different reflectance databases and illuminant spectra, with metamers occurring more frequently under artificial illuminants than natural ones. Overall, our numbers are not very different from those obtained earlier (Foster et al., JOSA A, 2006). However, our results also show that the degree of metamerism is typically not very strong and that category switches hardly ever occur. | ||||
Address | Florida, USA; May 2017 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | NEUROBIT; not mentioned | Approved | no | ||
Call Number | Admin @ si @ AkG2017 | Serial | 2899 | ||
Permanent link to this record | |||||
Author | Arash Akbarinia; C. Alejandro Parraga; Marta Exposito; Bogdan Raducanu; Xavier Otazu | ||||
Title | Can biological solutions help computers detect symmetry? | Type | Conference Article | ||
Year | 2017 | Publication | 40th European Conference on Visual Perception | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | |||||
Address | Berlin; Germany; August 2017 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ECVP | ||
Notes | NEUROBIT | Approved | no | ||
Call Number | Admin @ si @ APE2017 | Serial | 2995 | ||
Permanent link to this record | |||||
Author | Arash Akbarinia | ||||
Title | Computational Model of Visual Perception: From Colour to Form | Type | Book Whole | ||
Year | 2017 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | The original idea of this project was to study the role of colour in the challenging task of object recognition. We started by extending previous research on colour naming, showing that it is feasible to capture colour terms through parsimonious ellipsoids. Although the results of our model exceeded the state of the art in two benchmark datasets, we realised that the two phenomena of metameric lights and colour constancy must be addressed prior to any further colour processing. Our investigation of metameric pairs reached the conclusion that they are infrequent in real-world scenarios. In contrast, the illumination of a scene often changes dramatically. We addressed this issue by proposing a colour constancy model inspired by the dynamical centre-surround adaptation of neurons in the visual cortex. This was implemented through two overlapping asymmetric Gaussians whose variances and heights are adjusted according to the local contrast of pixels. We complemented this model with a generic contrast-variant pooling mechanism that inversely connects the percentage of pooled signal to the local contrast of a region. The results of our experiments on four benchmark datasets were indeed promising: the proposed model, although simple, outperformed even learning-based approaches in many cases. Encouraged by the success of our contrast-variant surround modulation, we extended this approach to detect boundaries of objects. We proposed an edge detection model based on the first derivative of the Gaussian kernel. We incorporated four types of surround: full, far, iso- and orthogonal-orientation. Furthermore, we accounted for the pooling mechanism at higher cortical areas and the shape feedback sent to lower areas. Our results in three benchmark datasets showed significant improvement over non-learning algorithms. To summarise, we demonstrated that biologically-inspired models offer promising solutions to computer vision problems such as colour naming, colour constancy and edge detection. We believe that the greatest contribution of this Ph.D. dissertation is modelling the concept of dynamic surround modulation, which shows the significance of contrast-variant surround integration. The models proposed here are grounded on only a portion of what we know about the human visual system. Therefore, it is only natural to complement them accordingly in future work. | ||||
Address | October 2017 | ||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | C. Alejandro Parraga | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-84-945373-4-9 | Medium | ||
Area | Expedition | Conference | |||
Notes | NEUROBIT | Approved | no | ||
Call Number | Admin @ si @ Akb2017 | Serial | 3019 | ||
Permanent link to this record | |||||
Author | Antonio Lopez; Jiaolong Xu; Jose Luis Gomez; David Vazquez; German Ros | ||||
Title | From Virtual to Real World Visual Perception using Domain Adaptation -- The DPM as Example | Type | Book Chapter | ||
Year | 2017 | Publication | Domain Adaptation in Computer Vision Applications | Abbreviated Journal | |
Volume | Issue | 13 | Pages | 243-258 | |
Keywords | Domain Adaptation | ||||
Abstract | Supervised learning tends to produce more accurate classifiers than unsupervised learning in general. This implies that annotated training data is preferred. When addressing visual perception challenges, such as localizing certain object classes within an image, the learning of the involved classifiers turns out to be a practical bottleneck. The reason is that, at the very least, we have to frame object examples with bounding boxes in thousands of images. A priori, the more complex the model is regarding its number of parameters, the more annotated examples are required. This annotation task is performed by human oracles, which results in inaccuracies and errors in the annotations (aka ground truth), since the task is inherently very cumbersome and sometimes ambiguous. As an alternative, we have pioneered the use of virtual worlds for collecting such annotations automatically and with high precision. However, since the models learned with virtual data must operate in the real world, we still need to perform domain adaptation (DA). In this chapter we revisit the DA of a deformable part-based model (DPM) as an exemplifying case of virtual-to-real-world DA. As a use case, we address the challenge of vehicle detection for driver assistance, using different publicly available virtual-world data. While doing so, we investigate questions such as how the domain gap behaves due to virtual-vs-real data with respect to dominant object appearance per domain, as well as the role of photo-realism in the virtual world. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer | Place of Publication | Editor | Gabriela Csurka | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ADAS; 600.085; 601.223; 600.076; 600.118 | Approved | no | ||
Call Number | ADAS @ adas @ LXG2017 | Serial | 2872 | ||
Permanent link to this record | |||||
Author | Antonio Lopez; Gabriel Villalonga; Laura Sellart; German Ros; David Vazquez; Jiaolong Xu; Javier Marin; Azadeh S. Mozafari | ||||
Title | Training my car to see using virtual worlds | Type | Journal Article | ||
Year | 2017 | Publication | Image and Vision Computing | Abbreviated Journal | IMAVIS |
Volume | 38 | Issue | Pages | 102-118 | |
Keywords | |||||
Abstract | Computer vision technologies are at the core of different advanced driver assistance systems (ADAS) and will play a key role in oncoming autonomous vehicles too. One of the main challenges for such technologies is to perceive the driving environment, i.e. to detect and track relevant driving information in a reliable manner (e.g. pedestrians in the vehicle route, free space to drive through). Nowadays it is clear that machine learning techniques are essential for developing such visual perception for driving. In particular, the standard working pipeline consists of collecting data (i.e. on-board images), manually annotating the data (e.g. drawing bounding boxes around pedestrians), learning a discriminative data representation taking advantage of such annotations (e.g. a deformable part-based model, a deep convolutional neural network), and then assessing the reliability of such representation with the acquired data. In the last two decades most of the research efforts focused on representation learning (first, designing descriptors and learning classifiers; later doing it end-to-end). Hence, collecting data and, especially, annotating it, is essential for learning good representations. While this has been the case from the very beginning, it was only after the disruptive appearance of deep convolutional neural networks that it became a serious issue, due to their data-hungry nature. In this context, the problem is that manual data annotation is tiresome, error-prone work. Accordingly, in the late 00’s we initiated a research line consisting of training visual models using photo-realistic computer graphics, especially focusing on assisted and autonomous driving. In this paper, we summarize this work and show how it has become a new trend with increasing acceptance. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ADAS; 600.118 | Approved | no | ||
Call Number | Admin @ si @ LVS2017 | Serial | 2985 | ||
Permanent link to this record | |||||
Author | Antonio Lopez; Atsushi Imiya; Tomas Pajdla; Jose Manuel Alvarez | ||||
Title | Computer Vision in Vehicle Technology: Land, Sea & Air | Type | Book Whole | ||
Year | 2017 | Publication | Abbreviated Journal | ||
Volume | Issue | Pages | 161-163 | ||
Keywords | |||||
Abstract | This chapter examines different vision-based commercial solutions for real-life problems related to vehicles. It is worth mentioning the recent astonishing performance of deep convolutional neural networks (DCNNs) in difficult visual tasks such as image classification, object recognition/localization/detection, and semantic segmentation. In fact, different DCNN architectures are already being explored for low-level tasks such as optical flow and disparity computation, and higher-level ones such as place recognition. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | John Wiley & Sons, Ltd | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-1-118-86807-2 | Medium | ||
Area | Expedition | Conference | |||
Notes | ADAS; 600.118 | Approved | no | ||
Call Number | Admin @ si @ LIP2017a | Serial | 2937 | ||
Permanent link to this record | |||||
Author | Anjan Dutta; Pau Riba; Josep Llados; Alicia Fornes | ||||
Title | Pyramidal Stochastic Graphlet Embedding for Document Pattern Classification | Type | Conference Article | ||
Year | 2017 | Publication | 14th International Conference on Document Analysis and Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 33-38 | ||
Keywords | graph embedding; hierarchical graph representation; graph clustering; stochastic graphlet embedding; graph classification | ||||
Abstract | Document pattern classification methods using graphs have received a lot of attention because of their robust representation paradigm and rich theoretical background. However, the way of preserving and the process for delineating documents with graphs introduce noise in the rendition of the underlying data, which creates instability in the graph representation. To deal with such unreliability in representation, in this paper we propose Pyramidal Stochastic Graphlet Embedding (PSGE). Given a graph representing a document pattern, our method first computes a graph pyramid by successively reducing the base graph. Once the graph pyramid is computed, we apply Stochastic Graphlet Embedding (SGE) to each level of the pyramid and combine their embedded representations to obtain a global delineation of the original graph. Considering a pyramid of graphs rather than just a base graph extends the representational power of the graph embedding, which reduces the instability caused by noise and distortion. When combined with a support vector machine, our proposed PSGE outperformed the state-of-the-art results in the recognition of handwritten words as well as graphical symbols. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; 600.097; 601.302; 600.121 | Approved | no | ||
Call Number | Admin @ si @ DRL2017 | Serial | 3054 | ||
Permanent link to this record | |||||
Author | Aniol Lidon; Marc Bolaños; Mariella Dimiccoli; Petia Radeva; Maite Garolera; Xavier Giro | ||||
Title | Semantic Summarization of Egocentric Photo-Stream Events | Type | Conference Article | ||
Year | 2017 | Publication | 2nd Workshop on Lifelogging Tools and Applications | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | |||||
Address | San Francisco; USA; October 2017 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-1-4503-5503-2 | Medium | ||
Area | Expedition | Conference | ACMW (LTA) | ||
Notes | MILAB; no proj | Approved | no | ||
Call Number | Admin @ si @ LBD2017 | Serial | 3024 | ||
Permanent link to this record | |||||
Author | Angel Valencia; Roger Idrovo; Angel Sappa; Douglas Plaza; Daniel Ochoa | ||||
Title | A 3D Vision Based Approach for Optimal Grasp of Vacuum Grippers | Type | Conference Article | ||
Year | 2017 | Publication | IEEE International Workshop of Electronics, Control, Measurement, Signals and their application to Mechatronics | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | In general, robot grasping approaches are based on the usage of multi-finger grippers. However, when large objects need to be manipulated, vacuum grippers are preferred over finger-based grippers. This paper aims to estimate the best picking place for a vacuum gripper with two suction cups when planar objects of unknown size and geometry are considered. The approach is based on estimating geometric properties of the object’s shape from a partial point cloud (a single 3D view), which are combined with considerations from a theoretical model to generate an optimal contact point that minimizes the vacuum force needed to guarantee a grasp. Experimental results in real scenarios are presented to show the validity of the proposed approach. | ||||
Address | San Sebastian; Spain; May 2017 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ECMSM | ||
Notes | ADAS; 600.086; 600.118 | Approved | no | ||
Call Number | Admin @ si @ VIS2017 | Serial | 2917 | ||
Permanent link to this record | |||||
Author | Andrei Polzounov; Artsiom Ablavatski; Sergio Escalera; Shijian Lu; Jianfei Cai | ||||
Title | WordFences: Text Localization and Recognition | Type | Conference Article | ||
Year | 2017 | Publication | 24th International Conference on Image Processing | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | |||||
Address | Beijing; China; September 2017 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICIP | ||
Notes | HUPBA; not mentioned | Approved | no | ||
Call Number | Admin @ si @ PAE2017 | Serial | 3007 | ||
Permanent link to this record | |||||
Author | Alicia Fornes; Veronica Romero; Arnau Baro; Juan Ignacio Toledo; Joan Andreu Sanchez; Enrique Vidal; Josep Llados | ||||
Title | ICDAR2017 Competition on Information Extraction in Historical Handwritten Records | Type | Conference Article | ||
Year | 2017 | Publication | 14th International Conference on Document Analysis and Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 1389-1394 | ||
Keywords | |||||
Abstract | The extraction of relevant information from historical handwritten document collections is one of the key steps in making these manuscripts available for access and searches. In this competition, the goal is to detect the named entities and assign each of them a semantic category, thereby simulating the filling in of a knowledge database. This paper describes the dataset, the tasks, the evaluation metrics, the participants' methods and the results. | ||||
Address | Kyoto; Japan; November 2017 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; 600.097; 601.225; 600.121 | Approved | no | ||
Call Number | Admin @ si @ FRB2017 | Serial | 3052 | ||
Permanent link to this record | |||||
Author | Alicia Fornes; Beata Megyesi; Joan Mas | ||||
Title | Transcription of Encoded Manuscripts with Image Processing Techniques | Type | Conference Article | ||
Year | 2017 | Publication | Digital Humanities Conference | Abbreviated Journal | |
Volume | Issue | Pages | 441-443 | ||
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | DH | ||
Notes | DAG; 600.097; 600.121 | Approved | no | ||
Call Number | Admin @ si @ FMM2017 | Serial | 3061 | ||
Permanent link to this record | |||||
Author | Alexey Dosovitskiy; German Ros; Felipe Codevilla; Antonio Lopez; Vladlen Koltun | ||||
Title | CARLA: An Open Urban Driving Simulator | Type | Conference Article | ||
Year | 2017 | Publication | 1st Annual Conference on Robot Learning. Proceedings of Machine Learning | Abbreviated Journal | |
Volume | 78 | Issue | Pages | 1-16 | |
Keywords | Autonomous driving; sensorimotor control; simulation | ||||
Abstract | We introduce CARLA, an open-source simulator for autonomous driving research. CARLA has been developed from the ground up to support development, training, and validation of autonomous urban driving systems. In addition to open-source code and protocols, CARLA provides open digital assets (urban layouts, buildings, vehicles) that were created for this purpose and can be used freely. The simulation platform supports flexible specification of sensor suites and environmental conditions. We use CARLA to study the performance of three approaches to autonomous driving: a classic modular pipeline, an end-to-end model trained via imitation learning, and an end-to-end model trained via reinforcement learning. The approaches are evaluated in controlled scenarios of increasing difficulty, and their performance is examined via metrics provided by CARLA, illustrating the platform’s utility for autonomous driving research. | ||||
Address | Mountain View; CA; USA; November 2017 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CORL | ||
Notes | ADAS; 600.085; 600.118 | Approved | no | ||
Call Number | Admin @ si @ DRC2017 | Serial | 2988 | ||
Permanent link to this record |