Author Vassileios Balntas; Edgar Riba; Daniel Ponsa; Krystian Mikolajczyk
  Title Learning local feature descriptors with triplets and shallow convolutional neural networks Type Conference Article
  Year 2016 Publication 27th British Machine Vision Conference Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract It has recently been demonstrated that local feature descriptors based on convolutional neural networks (CNN) can significantly improve the matching performance. Previous work on learning such descriptors has focused on exploiting pairs of positive and negative patches to learn discriminative CNN representations. In this work, we propose to utilize triplets of training samples, together with in-triplet mining of hard negatives. We show that our method achieves state-of-the-art results, without the computational overhead typically associated with mining of negatives and with lower complexity of the network architecture. We compare our approach to recently introduced convolutional local feature descriptors, and demonstrate the advantages of the proposed methods in terms of performance and speed. We also examine different loss functions associated with triplets.
 
  Address York; UK; September 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference BMVC  
  Notes ADAS; 600.086 Approved no  
  Call Number Admin @ si @ BRP2016 Serial (down) 2818  

 
Author Jose A. Garcia; David Masip; Valerio Sbragaglia; Jacopo Aguzzi
  Title Using ORB, BoW and SVM to identificate and track tagged Norway lobster Nephrops Norvegicus (L.) Type Conference Article
  Year 2016 Publication 3rd International Conference on Maritime Technology and Engineering Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Sustainable capture policies of many species strongly depend on the understanding of their social behaviour. Nevertheless, the analysis of emergent behaviour in marine species poses several challenges. Usually animals are captured and observed in tanks, and their behaviour is inferred from their dynamics and interactions. Therefore, researchers must deal with thousands of hours of video data. Without loss of generality, this paper proposes a computer vision approach to identify and track a specific species, the Norway lobster, Nephrops norvegicus. We propose an identification scheme where animals are marked using black and white tags with a geometric shape in the center (holed triangle, filled triangle, holed circle and filled circle). Using a massive labelled dataset, we extract local features based on the ORB descriptor. These features are a posteriori clustered, and we construct a Bag of Visual Words feature vector per animal. This approximation gives us invariance to rotation and translation. An SVM classifier achieves generalization results above 99%. In a second contribution, we will make the code and training data publicly available.
 
  Address Lisboa; Portugal; July 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference MARTECH  
  Notes OR;MV; Approved no  
  Call Number Admin @ si @ GMS2016b Serial (down) 2817  

 
Author Jose A. Garcia; David Masip; Valerio Sbragaglia; Jacopo Aguzzi
  Title Automated Identification and Tracking of Nephrops norvegicus (L.) Using Infrared and Monochromatic Blue Light Type Conference Article
  Year 2016 Publication 19th International Conference of the Catalan Association for Artificial Intelligence Abbreviated Journal  
  Volume Issue Pages  
  Keywords computer vision; video analysis; object recognition; tracking; behaviour; social; decapod; Nephrops norvegicus  
  Abstract Automated video and image analysis can be a very efficient tool to analyze animal behavior based on sociality, especially in environments that are hard for researchers to access. The understanding of this social behavior can play a key role in the sustainable design of capture policies of many species. This paper proposes the use of computer vision algorithms to identify and track a specific species, the Norway lobster, Nephrops norvegicus, a burrowing decapod with relevant commercial value which is captured by trawling. These animals can only be captured when they are engaged in seabed excursions, which are strongly related to their social behavior. This emergent behavior is modulated by the day-night cycle, but their social interactions remain unknown to the scientific community. The paper introduces an identification scheme made of four distinguishable black and white tags (geometric shapes). The project has recorded 15-day experiments in laboratory pools, under monochromatic blue light (472 nm) and darkness conditions (recorded using infrared light). Using this massive image set, we propose a comparison of state-of-the-art computer vision algorithms to distinguish and track the different animals' movements. We evaluate the robustness to the high noise presence in the infrared video signals and free out-of-plane rotations due to animal movement. The experiments show promising accuracies under a cross-validation protocol, being adaptable to the automation and analysis of large scale data. In a second contribution, we created an extensive dataset of shapes (46027 different shapes) from four daily experimental video recordings, which will be available to the community.
 
  Address Barcelona; Spain; October 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CCIA  
  Notes OR;MV; Approved no  
  Call Number Admin @ si @ GMS2016 Serial (down) 2816  

 
Author Ishaan Gulrajani; Kundan Kumar; Faruk Ahmed; Adrien Ali Taiga; Francesco Visin; David Vazquez; Aaron Courville
  Title PixelVAE: A Latent Variable Model for Natural Images Type Conference Article
  Year 2017 Publication 5th International Conference on Learning Representations Abbreviated Journal  
  Volume Issue Pages  
  Keywords Deep Learning; Unsupervised Learning  
  Abstract Natural image modeling is a landmark challenge of unsupervised learning. Variational Autoencoders (VAEs) learn a useful latent representation and generate samples that preserve global structure but tend to suffer from image blurriness. PixelCNNs model sharp contours and details very well, but lack an explicit latent representation and have difficulty modeling large-scale structure in a computationally efficient way. In this paper, we present PixelVAE, a VAE model with an autoregressive decoder based on PixelCNN. The resulting architecture achieves state-of-the-art log-likelihood on binarized MNIST. We extend PixelVAE to a hierarchy of multiple latent variables at different scales; this hierarchical model achieves competitive likelihood on 64x64 ImageNet and generates high-quality samples on LSUN bedrooms.  
  Address Toulon; France; April 2017  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICLR  
  Notes ADAS; 600.085; 600.076; 601.281; 600.118 Approved no  
  Call Number ADAS @ adas @ GKA2017 Serial (down) 2815  

 
Author Victor Ponce
  Title Evolutionary Bags of Space-Time Features for Human Analysis Type Book Whole
  Year 2016 Publication PhD Thesis Universitat de Barcelona, UOC and CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords Computer algorithms; Digital image processing; Digital video; Analysis of variance; Dynamic programming; Evolutionary computation; Gesture  
  Abstract Representation (or feature) learning has been an emerging concept in recent years, since it collects a set of techniques that are present in any theoretical or practical methodology referring to artificial intelligence. In computer vision, a very common representation has adopted the form of the well-known Bag of Visual Words. This representation appears implicitly in most approaches where images are described, and is also present in a huge number of areas and domains: image content retrieval, pedestrian detection, human-computer interaction, surveillance, e-health, and social computing, amongst others. The early stages of this dissertation provide an approach for learning visual representations inside evolutionary algorithms, which consists of evolving weighting schemes to improve the BoVW representations for the task of recognizing categories of videos and images. Thus, we demonstrate the applicability of the most common weighting schemes, which are often used in text mining but are less frequently found in computer vision tasks. Beyond learning these visual representations, we provide an approach based on fusion strategies for learning spatiotemporal representations, from multimodal data obtained by depth sensors. Besides, we especially aim at evolutionary and dynamic modelling, where the temporal factor is present in the nature of the data, such as video sequences of gestures and actions. Indeed, we explore the effects of probabilistic modelling for those approaches based on dynamic programming, so as to handle the temporal deformation and variance amongst video sequences of different categories. Finally, we integrate dynamic programming and generative models into an evolutionary computation framework, with the aim of learning Bags of SubGestures (BoSG) representations and hence improving the generalization capability of standard gesture recognition approaches.
The results obtained in the experimentation demonstrate, first, that evolutionary algorithms are useful for improving the representation of BoVW approaches in several datasets for recognizing categories in still images and video sequences. On the other hand, our experimentation reveals that both the use of dynamic programming and generative models to align video sequences, and the representations obtained from applying fusion strategies in multimodal data, improve performance when recognizing some gesture categories. Furthermore, the combination of evolutionary algorithms with models based on dynamic programming and generative approaches results, when aiming at the classification of video categories on large video datasets, in a considerable improvement over standard gesture and action recognition approaches. Finally, we demonstrate the applications of these representations in several domains for human analysis: classification of images where humans may be present, action and gesture recognition for general applications, and in particular for conversational settings within the field of restorative justice.
  Address June 2016  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Sergio Escalera;Xavier Baro;Hugo Jair Escalante  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HuPBA Approved no  
  Call Number Pon2016 Serial (down) 2814  

 
Author Carlos David Martinez Hinarejos; Josep Llados; Alicia Fornes; Francisco Casacuberta; Lluis de Las Heras; Joan Mas; Moises Pastor; Oriol Ramos Terrades; Joan Andreu Sanchez; Enrique Vidal; Fernando Vilariño
  Title Context, multimodality, and user collaboration in handwritten text processing: the CoMUN-HaT project Type Conference Article
  Year 2016 Publication 3rd IberSPEECH Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Processing of handwritten documents is a task that is of wide interest for many purposes, such as those related to preserving cultural heritage. Handwritten text recognition techniques have been successfully applied during the last decade to obtain transcriptions of handwritten documents, and keyword spotting techniques have been applied for searching specific terms in image collections of handwritten documents. However, results on transcription and indexing are far from perfect. In this framework, the use of new data sources arises as a new paradigm that will allow for a better transcription and indexing of handwritten documents. Three main different data sources could be considered: context of the document (style, writer, historical time, topics, ...), multimodal data (representations of the document in a different modality, such as the speech signal of the dictation of the text), and user feedback (corrections, amendments, ...). The CoMUN-HaT project aims at the integration of these different data sources into the transcription and indexing task for handwritten documents: the use of context derived from the analysis of the documents, how multimodality can aid the recognition process to obtain more accurate transcriptions (including transcription in a modern version of the language), and integration into a user-in-the-loop assisted text transcription framework. This will be reflected in the construction of a transcription and indexing platform that can be used by both professional and nonprofessional users, contributing to crowd-sourcing activities to preserve cultural heritage and to obtain an accessible version of the involved corpus.
 
  Address Lisboa; Portugal; November 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference IberSPEECH  
  Notes DAG; MV; 600.097;SIAI Approved no  
  Call Number Admin @ si @MLF2016 Serial (down) 2813  

 
Author Daniel Hernandez; Antonio Espinosa; David Vazquez; Antonio Lopez; Juan Carlos Moure
  Title GPU-accelerated real-time stixel computation Type Conference Article
  Year 2017 Publication IEEE Winter Conference on Applications of Computer Vision Abbreviated Journal  
  Volume Issue Pages 1054-1062  
  Keywords Autonomous Driving; GPU; Stixel  
  Abstract The Stixel World is a medium-level, compact representation of road scenes that abstracts millions of disparity pixels into hundreds or thousands of stixels. The goal of this work is to implement and evaluate a complete multi-stixel estimation pipeline on an embedded, energy-efficient, GPU-accelerated device. This work presents a full GPU-accelerated implementation of stixel estimation that produces reliable results at 26 frames per second (real-time) on the Tegra X1 for disparity images of 1024×440 pixels and stixel widths of 5 pixels, and achieves more than 400 frames per second on a high-end Titan X GPU card.
  Address Santa Rosa; CA; USA; March 2017  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WACV  
  Notes ADAS; 600.118 Approved no  
  Call Number ADAS @ adas @ HEV2017b Serial (down) 2812  

 
Author Angel Sappa; Cristhian A. Aguilera-Carrasco; Juan A. Carvajal Ayala; Miguel Oliveira; Dennis Romero; Boris X. Vintimilla; Ricardo Toledo
  Title Monocular visual odometry: A cross-spectral image fusion based approach Type Journal Article
  Year 2016 Publication Robotics and Autonomous Systems Abbreviated Journal RAS  
  Volume 85 Issue Pages 26-36  
  Keywords Monocular visual odometry; LWIR-RGB cross-spectral imaging; Image fusion  
  Abstract This manuscript evaluates the usage of fused cross-spectral images in a monocular visual odometry approach. Fused images are obtained through a Discrete Wavelet Transform (DWT) scheme, where the best setup is empirically obtained by means of a mutual information based evaluation metric. The objective is to have a flexible scheme where fusion parameters are adapted according to the characteristics of the given images. Visual odometry is computed from the fused monocular images using an off-the-shelf approach. Experimental results using data sets obtained with two different platforms are presented. Additionally, comparisons with a previous approach as well as with monocular-visible/infrared spectra are provided, showing the advantages of the proposed scheme.
  Address  
  Corporate Author Thesis  
  Publisher Elsevier B.V. Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS;600.086; 600.076 Approved no  
  Call Number Admin @ si @SAC2016 Serial (down) 2811  

 
Author Alejandro Gonzalez Alzate; David Vazquez; Antonio Lopez; Jaume Amores
  Title On-Board Object Detection: Multicue, Multimodal, and Multiview Random Forest of Local Experts Type Journal Article
  Year 2017 Publication IEEE Transactions on Cybernetics Abbreviated Journal Cyber  
  Volume 47 Issue 11 Pages 3980-3990  
  Keywords Multicue; multimodal; multiview; object detection  
  Abstract Despite recent significant advances, object detection continues to be an extremely challenging problem in real scenarios. In order to develop a detector that successfully operates under these conditions, it becomes critical to leverage multiple cues, multiple imaging modalities, and a strong multiview (MV) classifier that accounts for different object views and poses. In this paper, we provide an extensive evaluation that gives insight into how each of these aspects (multicue, multimodality, and strong MV classifier) affects accuracy both individually and when integrated together. In the multimodality component, we explore the fusion of RGB and depth maps obtained by high-definition light detection and ranging, a type of modality that is starting to receive increasing attention. As our analysis reveals, although all the aforementioned aspects significantly help in improving the accuracy, the fusion of visible spectrum and depth information boosts the accuracy by a much larger margin. The resulting detector not only ranks among the top performers in the challenging KITTI benchmark, but it is built upon very simple blocks that are easy to implement and computationally efficient.
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 2168-2267 ISBN Medium  
  Area Expedition Conference  
  Notes ADAS; 600.085; 600.082; 600.076; 600.118 Approved no  
  Call Number Admin @ si @ Serial (down) 2810  

 
Author Cristhian A. Aguilera-Carrasco; F. Aguilera; Angel Sappa; C. Aguilera; Ricardo Toledo
  Title Learning cross-spectral similarity measures with deep convolutional neural networks Type Conference Article
  Year 2016 Publication 29th IEEE Conference on Computer Vision and Pattern Recognition Workshops Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract The simultaneous use of images from different spectra can be helpful to improve the performance of many computer vision tasks. The core idea behind the usage of cross-spectral approaches is to take advantage of the strengths of each spectral band providing a richer representation of a scene, which cannot be obtained with just images from one spectral band. In this work we tackle the cross-spectral image similarity problem by using Convolutional Neural Networks (CNNs). We explore three different CNN architectures to compare the similarity of cross-spectral image patches. Specifically, we train each network with images from the visible and the near-infrared spectrum, and then test the result with two public cross-spectral datasets. Experimental results show that CNN approaches outperform the current state of the art on both cross-spectral datasets. Additionally, our experiments show that some CNN architectures are capable of generalizing between different cross-spectral domains.  
  Address Las Vegas; USA; June 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CVPRW  
  Notes ADAS; 600.086; 600.076 Approved no  
  Call Number Admin @ si @AAS2016 Serial (down) 2809  

 
Author Angel Sappa; P. Carvajal; Cristhian A. Aguilera-Carrasco; Miguel Oliveira; Dennis Romero; Boris X. Vintimilla
  Title Wavelet based visible and infrared image fusion: a comparative study Type Journal Article
  Year 2016 Publication Sensors Abbreviated Journal SENS  
  Volume 16 Issue 6 Pages 1-15  
  Keywords Image fusion; fusion evaluation metrics; visible and infrared imaging; discrete wavelet transform  
  Abstract This paper evaluates different wavelet-based cross-spectral image fusion strategies adopted to merge visible and infrared images. The objective is to find the best setup independently of the evaluation metric used to measure the performance. Quantitative performance results are obtained with state-of-the-art approaches together with adaptations proposed in the current work. The options evaluated in the current work result from the combination of different setups in the wavelet image decomposition stage together with different fusion strategies for the final merging stage that generates the resulting representation. Most of the approaches evaluate results according to the application for which they are intended. Sometimes a human observer is selected to judge the quality of the obtained results. In the current work, quantitative values are considered in order to find correlations between setups and performance of obtained results; these correlations can be used to define criteria for selecting the best fusion strategy for a given pair of cross-spectral images. The whole procedure is evaluated with a large set of correctly registered visible and infrared image pairs, including both Near InfraRed (NIR) and Long Wave InfraRed (LWIR).
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS; 600.086; 600.076 Approved no  
  Call Number Admin @ si @SCA2016 Serial (down) 2807  

 
Author Miguel Oliveira; Victor Santos; Angel Sappa; P. Dias; A. Moreira
  Title Incremental Scenario Representations for Autonomous Driving using Geometric Polygonal Primitives Type Journal Article
  Year 2016 Publication Robotics and Autonomous Systems Abbreviated Journal RAS  
  Volume 83 Issue Pages 312-325  
  Keywords Incremental scene reconstruction; Point clouds; Autonomous vehicles; Polygonal primitives  
  Abstract When an autonomous vehicle is traveling through some scenario it receives a continuous stream of sensor data. This sensor data arrives in an asynchronous fashion and often contains overlapping or redundant information. Thus, it is not trivial how a representation of the environment observed by the vehicle can be created and updated over time. This paper presents a novel methodology to compute an incremental 3D representation of a scenario from 3D range measurements. We propose to use macro scale polygonal primitives to model the scenario. This means that the representation of the scene is given as a list of large scale polygons that describe the geometric structure of the environment. Furthermore, we propose mechanisms designed to update the geometric polygonal primitives over time whenever fresh sensor data is collected. Results show that the approach is capable of producing accurate descriptions of the scene, and that it is computationally very efficient when compared to other reconstruction techniques.  
  Address  
  Corporate Author Thesis  
  Publisher Elsevier B.V. Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS; 600.086, 600.076 Approved no  
  Call Number Admin @ si @OSS2016a Serial (down) 2806  

 
Author Fernando Vilariño
  Title Dissemination, creation and education from archives: Case study of the collection of Digitized Visual Poems from Joan Brossa Foundation Type Conference Article
  Year 2016 Publication International Workshop on Poetry: Archives, Poetries and Receptions Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address Barcelona; Spain; October 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference POETRY  
  Notes MV; 600.097;SIAI Approved no  
  Call Number Admin @ si @Vil2016b Serial (down) 2805  

 
Author Fernando Vilariño; Dimosthenis Karatzas
  Title A Living Lab approach for Citizen Science in Libraries Type Conference Article
  Year 2016 Publication 1st International ECSA Conference Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address Berlin; Germany; May 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ECSA  
  Notes MV; DAG; 600.084; 600.097;SIAI Approved no  
  Call Number Admin @ si @ViK2016 Serial (down) 2804  

 
Author Fernando Vilariño
  Title Giving Value to digital collections in the Public Library Type Conference Article
  Year 2016 Publication Librarian 2020 Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address Brussels; Belgium; October 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference LIB  
  Notes MV; 600.097;SIAI Approved no  
  Call Number Admin @ si @Vil2016a Serial (down) 2802  