toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
   print
  Records Links
Author (up) Katerine Diaz; Francesc J. Ferri; Aura Hernandez-Sabate edit   pdf
url  doi
openurl 
  Title An overview of incremental feature extraction methods based on linear subspaces Type Journal Article
  Year 2018 Publication Knowledge-Based Systems Abbreviated Journal KBS  
  Volume 145 Issue Pages 219-235  
  Keywords  
  Abstract With the massive explosion of machine learning in our day-to-day life, incremental and adaptive learning has become a major topic, crucial to keep up-to-date and improve classification models and their corresponding feature extraction processes. This paper presents a categorized overview of incremental feature extraction based on linear subspace methods which aim at incorporating new information to the already acquired knowledge without accessing previous data. Specifically, this paper focuses on those linear dimensionality reduction methods with orthogonal matrix constraints based on global loss function, due to the extensive use of their batch approaches versus other linear alternatives. Thus, we cover the approaches derived from Principal Components Analysis, Linear Discriminative Analysis and Discriminative Common Vector methods. For each basic method, its incremental approaches are differentiated according to the subspace model and matrix decomposition involved in the updating process. Besides this categorization, several updating strategies are distinguished according to the amount of data used to update and to the fact of considering a static or dynamic number of classes. Moreover, the specific role of the size/dimension ratio in each method is considered. Finally, computational complexity, experimental setup and the accuracy rates according to published results are compiled and analyzed, and an empirical evaluation is done to compare the best approach of each kind.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 0950-7051 ISBN Medium  
  Area Expedition Conference  
  Notes ADAS; 600.118 Approved no  
  Call Number Admin @ si @ DFH2018 Serial 3090  
Permanent link to this record
 

 
Author (up) Katerine Diaz; Jesus Martinez del Rincon; Aura Hernandez-Sabate; Debora Gil edit   pdf
doi  openurl
  Title Continuous head pose estimation using manifold subspace embedding and multivariate regression Type Journal Article
  Year 2018 Publication IEEE Access Abbreviated Journal ACCESS  
  Volume 6 Issue Pages 18325 - 18334  
  Keywords Head Pose estimation; HOG features; Generalized Discriminative Common Vectors; B-splines; Multiple linear regression  
  Abstract In this paper, a continuous head pose estimation system is proposed to estimate yaw and pitch head angles from raw facial images. Our approach is based on manifold learningbased methods, due to their promising generalization properties shown for face modelling from images. The method combines histograms of oriented gradients, generalized discriminative common vectors and continuous local regression to achieve successful performance. Our proposal was tested on multiple standard face datasets, as well as in a realistic scenario. Results show a considerable performance improvement and a higher consistence of our model in comparison with other state-of-art methods, with angular errors varying between 9 and 17 degrees.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 2169-3536 ISBN Medium  
  Area Expedition Conference  
  Notes ADAS; 600.118 Approved no  
  Call Number Admin @ si @ DMH2018b Serial 3091  
Permanent link to this record
 

 
Author (up) Katerine Diaz; Jesus Martinez del Rincon; Aura Hernandez-Sabate; Marçal Rusiñol; Francesc J. Ferri edit   pdf
doi  openurl
  Title Fast Kernel Generalized Discriminative Common Vectors for Feature Extraction Type Journal Article
  Year 2018 Publication Journal of Mathematical Imaging and Vision Abbreviated Journal JMIV  
  Volume 60 Issue 4 Pages 512-524  
  Keywords  
  Abstract This paper presents a supervised subspace learning method called Kernel Generalized Discriminative Common Vectors (KGDCV), as a novel extension of the known Discriminative Common Vectors method with Kernels. Our method combines the advantages of kernel methods to model complex data and solve nonlinear
problems with moderate computational complexity, with the better generalization properties of generalized approaches for large dimensional data. These attractive combination makes KGDCV specially suited for feature extraction and classification in computer vision, image processing and pattern recognition applications. Two different approaches to this generalization are proposed, a first one based on the kernel trick (KT) and a second one based on the nonlinear projection trick (NPT) for even higher efficiency. Both methodologies
have been validated on four different image datasets containing faces, objects and handwritten digits, and compared against well known non-linear state-of-art methods. Results show better discriminant properties than other generalized approaches both linear or kernel. In addition, the KGDCV-NPT approach presents a considerable computational gain, without compromising the accuracy of the model.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; ADAS; 600.086; 600.130; 600.121; 600.118; 600.129 Approved no  
  Call Number Admin @ si @ DMH2018a Serial 3062  
Permanent link to this record
 

 
Author (up) Laura Lopez-Fuentes; Alessandro Farasin; Harald Skinnemoen; Paolo Garza edit   pdf
openurl 
  Title Deep Learning models for passability detection of flooded roads Type Conference Article
  Year 2018 Publication MediaEval 2018 Multimedia Benchmark Workshop Abbreviated Journal  
  Volume 2283 Issue Pages  
  Keywords  
  Abstract In this paper we study and compare several approaches to detect floods and evidence for passability of roads by conventional means in Twitter. We focus on tweets containing both visual information (a picture shared by the user) and metadata, a combination of text and related extra information intrinsic to the Twitter API. This work has been done in the context of the MediaEval 2018 Multimedia Satellite Task.  
  Address Sophia Antipolis; France; October 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference MediaEval  
  Notes LAMP; 600.084; 600.109; 600.120 Approved no  
  Call Number Admin @ si @ LFS2018 Serial 3224  
Permanent link to this record
 

 
Author (up) Laura Lopez-Fuentes; Joost Van de Weijer; Manuel Gonzalez-Hidalgo; Harald Skinnemoen; Andrew Bagdanov edit   pdf
url  openurl
  Title Review on computer vision techniques in emergency situations Type Journal Article
  Year 2018 Publication Multimedia Tools and Applications Abbreviated Journal MTAP  
  Volume 77 Issue 13 Pages 17069–17107  
  Keywords Emergency management; Computer vision; Decision makers; Situational awareness; Critical situation  
  Abstract In emergency situations, actions that save lives and limit the impact of hazards are crucial. In order to act, situational awareness is needed to decide what to do. Geolocalized photos and video of the situations as they evolve can be crucial in better understanding them and making decisions faster. Cameras are almost everywhere these days, either in terms of smartphones, installed CCTV cameras, UAVs or others. However, this poses challenges in big data and information overflow. Moreover, most of the time there are no disasters at any given location, so humans aiming to detect sudden situations may not be as alert as needed at any point in time. Consequently, computer vision tools can be an excellent decision support. The number of emergencies where computer vision tools has been considered or used is very wide, and there is a great overlap across related emergency research. Researchers tend to focus on state-of-the-art systems that cover the same emergency as they are studying, obviating important research in other fields. In order to unveil this overlap, the survey is divided along four main axes: the types of emergencies that have been studied in computer vision, the objective that the algorithms can address, the type of hardware needed and the algorithms used. Therefore, this review provides a broad overview of the progress of computer vision covering all sorts of emergencies.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; 600.068; 600.120 Approved no  
  Call Number Admin @ si @ LWG2018 Serial 3041  
Permanent link to this record
 

 
Author (up) Lei Kang; Juan Ignacio Toledo; Pau Riba; Mauricio Villegas; Alicia Fornes; Marçal Rusiñol edit   pdf
url  openurl
  Title Convolve, Attend and Spell: An Attention-based Sequence-to-Sequence Model for Handwritten Word Recognition Type Conference Article
  Year 2018 Publication 40th German Conference on Pattern Recognition Abbreviated Journal  
  Volume Issue Pages 459-472  
  Keywords  
  Abstract This paper proposes Convolve, Attend and Spell, an attention based sequence-to-sequence model for handwritten word recognition. The proposed architecture has three main parts: an encoder, consisting of a CNN and a bi-directional GRU, an attention mechanism devoted to focus on the pertinent features and a decoder formed by a one-directional GRU, able to spell the corresponding word, character by character. Compared with the recent state-of-the-art, our model achieves competitive results on the IAM dataset without needing any pre-processing step, predefined lexicon nor language model. Code and additional results are available in https://github.com/omni-us/research-seq2seq-HTR.  
  Address Stuttgart; Germany; October 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference GCPR  
  Notes DAG; 600.097; 603.057; 302.065; 601.302; 600.084; 600.121; 600.129 Approved no  
  Call Number Admin @ si @ KTR2018 Serial 3167  
Permanent link to this record
 

 
Author (up) Lluis Gomez; Andres Mafla; Marçal Rusiñol; Dimosthenis Karatzas edit   pdf
url  openurl
  Title Single Shot Scene Text Retrieval Type Conference Article
  Year 2018 Publication 15th European Conference on Computer Vision Abbreviated Journal  
  Volume 11218 Issue Pages 728-744  
  Keywords Image retrieval; Scene text; Word spotting; Convolutional Neural Networks; Region Proposals Networks; PHOC  
  Abstract Textual information found in scene images provides high level semantic information about the image and its context and it can be leveraged for better scene understanding. In this paper we address the problem of scene text retrieval: given a text query, the system must return all images containing the queried text. The novelty of the proposed model consists in the usage of a single shot CNN architecture that predicts at the same time bounding boxes and a compact text representation of the words in them. In this way, the text based image retrieval task can be casted as a simple nearest neighbor search of the query text representation over the outputs of the CNN over the entire image
database. Our experiments demonstrate that the proposed architecture
outperforms previous state-of-the-art while it offers a significant increase
in processing speed.
 
  Address Munich; September 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ECCV  
  Notes DAG; 600.084; 601.338; 600.121; 600.129 Approved no  
  Call Number Admin @ si @ GMR2018 Serial 3143  
Permanent link to this record
 

 
Author (up) Lluis Gomez; Marçal Rusiñol; Ali Furkan Biten; Dimosthenis Karatzas edit   pdf
openurl 
  Title Subtitulació automàtica d'imatges. Estat de l'art i limitacions en el context arxivístic Type Conference Article
  Year 2018 Publication Jornades Imatge i Recerca Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference JIR  
  Notes DAG; 600.084; 600.135; 601.338; 600.121; 600.129 Approved no  
  Call Number Admin @ si @ GRB2018 Serial 3173  
Permanent link to this record
 

 
Author (up) Lluis Gomez; Marçal Rusiñol; Dimosthenis Karatzas edit   pdf
url  doi
openurl 
  Title Cutting Sayre's Knot: Reading Scene Text without Segmentation. Application to Utility Meters Type Conference Article
  Year 2018 Publication 13th IAPR International Workshop on Document Analysis Systems Abbreviated Journal  
  Volume Issue Pages 97-102  
  Keywords Robust Reading; End-to-end Systems; CNN; Utility Meters  
  Abstract In this paper we present a segmentation-free system for reading text in natural scenes. A CNN architecture is trained in an end-to-end manner, and is able to directly output readings without any explicit text localization step. In order to validate our proposal, we focus on the specific case of reading utility meters. We present our results in a large dataset of images acquired by different users and devices, so text appears in any location, with different sizes, fonts and lengths, and the images present several distortions such as
dirt, illumination highlights or blur.
 
  Address Viena; Austria; April 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference DAS  
  Notes DAG; 600.084; 600.121; 600.129 Approved no  
  Call Number Admin @ si @ GRK2018 Serial 3102  
Permanent link to this record
 

 
Author (up) Lu Yu; Lichao Zhang; Joost Van de Weijer; Fahad Shahbaz Khan; Yongmei Cheng; C. Alejandro Parraga edit   pdf
doi  openurl
  Title Beyond Eleven Color Names for Image Understanding Type Journal Article
  Year 2018 Publication Machine Vision and Applications Abbreviated Journal MVAP  
  Volume 29 Issue 2 Pages 361-373  
  Keywords Color name; Discriminative descriptors; Image classification; Re-identification; Tracking  
  Abstract Color description is one of the fundamental problems of image understanding. One of the popular ways to represent colors is by means of color names. Most existing work on color names focuses on only the eleven basic color terms of the English language. This could be limiting the discriminative power of these representations, and representations based on more color names are expected to perform better. However, there exists no clear strategy to choose additional color names. We collect a dataset of 28 additional color names. To ensure that the resulting color representation has high discriminative power we propose a method to order the additional color names according to their complementary nature with the basic color names. This allows us to compute color name representations with high discriminative power of arbitrary length. In the experiments we show that these new color name descriptors outperform the existing color name descriptor on the task of visual tracking, person re-identification and image classification.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; NEUROBIT; 600.068; 600.109; 600.120 Approved no  
  Call Number Admin @ si @ YYW2018 Serial 3087  
Permanent link to this record
 

 
Author (up) Lu Yu; Yongmei Cheng; Joost Van de Weijer edit   pdf
doi  openurl
  Title Weakly Supervised Domain-Specific Color Naming Based on Attention Type Conference Article
  Year 2018 Publication 24th International Conference on Pattern Recognition Abbreviated Journal  
  Volume Issue Pages 3019 - 3024  
  Keywords  
  Abstract The majority of existing color naming methods focuses on the eleven basic color terms of the English language. However, in many applications, different sets of color names are used for the accurate description of objects. Labeling data to learn these domain-specific color names is an expensive and laborious task. Therefore, in this article we aim to learn color names from weakly labeled data. For this purpose, we add an attention branch to the color naming network. The attention branch is used to modulate the pixel-wise color naming predictions of the network. In experiments, we illustrate that the attention branch correctly identifies the relevant regions. Furthermore, we show that our method obtains state-of-the-art results for pixel-wise and image-wise classification on the EBAY dataset and is able to learn color names for various domains.  
  Address Beijing; August 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPR  
  Notes LAMP; 600.109; 602.200; 600.120 Approved no  
  Call Number Admin @ si @ YCW2018 Serial 3243  
Permanent link to this record
 

 
Author (up) Luis Herranz; Weiqing Min; Shuqiang Jiang edit  openurl
  Title Food recognition and recipe analysis: integrating visual content, context and external knowledge Type Miscellaneous
  Year 2018 Publication Arxiv Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract The central role of food in our individual and social life, combined with recent technological advances, has motivated a growing interest in applications that help to better monitor dietary habits as well as the exploration and retrieval of food-related information. We review how visual content, context and external knowledge can be integrated effectively into food-oriented applications, with special focus on recipe analysis and retrieval, food recommendation and restaurant context as emerging directions.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; 600.120 Approved no  
  Call Number Admin @ si @ HMJ2018 Serial 3250  
Permanent link to this record
 

 
Author (up) Maedeh Aghaei; Mariella Dimiccoli; C. Canton-Ferrer; Petia Radeva edit   pdf
url  doi
openurl 
  Title Towards social pattern characterization from egocentric photo-streams Type Journal Article
  Year 2018 Publication Computer Vision and Image Understanding Abbreviated Journal CVIU  
  Volume 171 Issue Pages 104-117  
  Keywords Social pattern characterization; Social signal extraction; Lifelogging; Convolutional and recurrent neural networks  
  Abstract Following the increasingly popular trend of social interaction analysis in egocentric vision, this article presents a comprehensive pipeline for automatic social pattern characterization of a wearable photo-camera user. The proposed framework relies merely on the visual analysis of egocentric photo-streams and consists of three major steps. The first step is to detect social interactions of the user where the impact of several social signals on the task is explored. The detected social events are inspected in the second step for categorization into different social meetings. These two steps act at event-level where each potential social event is modeled as a multi-dimensional time-series, whose dimensions correspond to a set of relevant features for each task; finally, LSTM is employed to classify the time-series. The last step of the framework is to characterize social patterns of the user. Our goal is to quantify the duration, the diversity and the frequency of the user social relations in various social situations. This goal is achieved by the discovery of recurrences of the same people across the whole set of social events related to the user. Experimental evaluation over EgoSocialStyle – the proposed dataset in this work, and EGO-GROUP demonstrates promising results on the task of social pattern characterization from egocentric photo-streams.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MILAB; no proj Approved no  
  Call Number Admin @ si @ ADC2018 Serial 3022  
Permanent link to this record
 

 
Author (up) Manuel Carbonell; Mauricio Villegas; Alicia Fornes; Josep Llados edit   pdf
openurl 
  Title Joint Recognition of Handwritten Text and Named Entities with a Neural End-to-end Model Type Conference Article
  Year 2018 Publication 13th IAPR International Workshop on Document Analysis Systems Abbreviated Journal  
  Volume Issue Pages 399-404  
  Keywords Named entity recognition; Handwritten Text Recognition; neural networks  
  Abstract When extracting information from handwritten documents, text transcription and named entity recognition are usually faced as separate subsequent tasks. This has the disadvantage that errors in the first module affect heavily the
performance of the second module. In this work we propose to do both tasks jointly, using a single neural network with a common architecture used for plain text recognition. Experimentally, the work has been tested on a collection of historical marriage records. Results of experiments are presented to show the effect on the performance for different
configurations: different ways of encoding the information, doing or not transfer learning and processing at text line or multi-line region level. The results are comparable to state of the art reported in the ICDAR 2017 Information Extraction competition, even though the proposed technique does not use any dictionaries, language modeling or post processing.
 
  Address Vienna; Austria; April 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference DAS  
  Notes DAG; 600.097; 603.057; 601.311; 600.121 Approved no  
  Call Number Admin @ si @ CVF2018 Serial 3170  
Permanent link to this record
 

 
Author (up) Marçal Rusiñol; J. Chazalon; Katerine Diaz edit   pdf
doi  openurl
  Title Augmented Songbook: an Augmented Reality Educational Application for Raising Music Awareness Type Journal Article
  Year 2018 Publication Multimedia Tools and Applications Abbreviated Journal MTAP  
  Volume 77 Issue 11 Pages 13773-13798  
  Keywords Augmented reality; Document image matching; Educational applications  
  Abstract This paper presents the development of an Augmented Reality mobile application which aims at sensibilizing young children to abstract concepts of music. Such concepts are, for instance, the musical notation or the idea of rhythm. Recent studies in Augmented Reality for education suggest that such technologies have multiple benefits for students, including younger ones. As mobile document image acquisition and processing gains maturity on mobile platforms, we explore how it is possible to build a markerless and real-time application to augment the physical documents with didactic animations and interactive virtual content. Given a standard image processing pipeline, we compare the performance of different local descriptors at two key stages of the process. Results suggest alternatives to the SIFT local descriptors, regarding result quality and computational efficiency, both for document model identification and perspective transform estimation. All experiments are performed on an original and public dataset we introduce here.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; ADAS; 600.084; 600.121; 600.118; 600.129 Approved no  
  Call Number Admin @ si @ RCD2018 Serial 2996  
Permanent link to this record
Select All    Deselect All
 |   | 
Details
   print

Save Citations:
Export Records: