toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
   print
  Records Links (up)
Author Manuel Carbonell; Mauricio Villegas; Alicia Fornes; Josep Llados edit   pdf
openurl 
  Title Joint Recognition of Handwritten Text and Named Entities with a Neural End-to-end Model Type Conference Article
  Year 2018 Publication 13th IAPR International Workshop on Document Analysis Systems Abbreviated Journal  
  Volume Issue Pages 399-404  
  Keywords Named entity recognition; Handwritten Text Recognition; neural networks  
  Abstract When extracting information from handwritten documents, text transcription and named entity recognition are usually faced as separate subsequent tasks. This has the disadvantage that errors in the first module affect heavily the
performance of the second module. In this work we propose to do both tasks jointly, using a single neural network with a common architecture used for plain text recognition. Experimentally, the work has been tested on a collection of historical marriage records. Results of experiments are presented to show the effect on the performance for different
configurations: different ways of encoding the information, doing or not transfer learning and processing at text line or multi-line region level. The results are comparable to state of the art reported in the ICDAR 2017 Information Extraction competition, even though the proposed technique does not use any dictionaries, language modeling or post processing.
 
  Address Vienna; Austria; April 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference DAS  
  Notes DAG; 600.097; 603.057; 601.311; 600.121 Approved no  
  Call Number Admin @ si @ CVF2018 Serial 3170  
Permanent link to this record
 

 
Author Y. Patel; Lluis Gomez; Raul Gomez; Marçal Rusiñol; Dimosthenis Karatzas; C.V. Jawahar edit  openurl
  Title TextTopicNet-Self-Supervised Learning of Visual Features Through Embedding Images on Semantic Text Spaces Type Miscellaneous
  Year 2018 Publication Arxiv Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract The immense success of deep learning based methods in computer vision heavily relies on large scale training datasets. These richly annotated datasets help the network learn discriminative visual features. Collecting and annotating such datasets requires a tremendous amount of human effort and annotations are limited to popular set of classes. As an alternative, learning visual features by designing auxiliary tasks which make use of freely available self-supervision has become increasingly popular in the computer vision community.
In this paper, we put forward an idea to take advantage of multi-modal context to provide self-supervision for the training of computer vision algorithms. We show that adequate visual features can be learned efficiently by training a CNN to predict the semantic textual context in which a particular image is more probable to appear as an illustration. More specifically we use popular text embedding techniques to provide the self-supervision for the training of deep CNN.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.084; 601.338; 600.121 Approved no  
  Call Number Admin @ si @ PGG2018 Serial 3177  
Permanent link to this record
 

 
Author Alejandro Cartas; Estefania Talavera; Petia Radeva; Mariella Dimiccoli edit  openurl
  Title On the Role of Event Boundaries in Egocentric Activity Recognition from Photostreams Type Miscellaneous
  Year 2018 Publication Arxiv Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Event boundaries play a crucial role as a pre-processing step for detection, localization, and recognition tasks of human activities in videos. Typically, although their intrinsic subjectiveness, temporal bounds are provided manually as input for training action recognition algorithms. However, their role for activity recognition in the domain of egocentric photostreams has been so far neglected. In this paper, we provide insights of how automatically computed boundaries can impact activity recognition results in the emerging domain of egocentric photostreams. Furthermore, we collected a new annotated dataset acquired by 15 people by a wearable photo-camera and we used it to show the generalization capabilities of several deep learning based architectures to unseen users.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MILAB; no proj Approved no  
  Call Number Admin @ si @ CTR2018 Serial 3184  
Permanent link to this record
 

 
Author Md.Mostafa Kamal Sarker; Hatem A. Rashwan; Hatem A. Rashwan; Estefania Talavera; Syeda Furruka Banu; Petia Radeva; Domenec Puig edit   pdf
openurl 
  Title MACNet: Multi-scale Atrous Convolution Networks for Food Places Classification in Egocentric Photo-streams Type Conference Article
  Year 2018 Publication European Conference on Computer Vision workshops Abbreviated Journal  
  Volume Issue Pages 423-433  
  Keywords  
  Abstract First-person (wearable) camera continually captures unscripted interactions of the camera user with objects, people, and scenes reflecting his personal and relational tendencies. One of the preferences of people is their interaction with food events. The regulation of food intake and its duration has a great importance to protect against diseases. Consequently, this work aims to develop a smart model that is able to determine the recurrences of a person on food places during a day. This model is based on a deep end-to-end model for automatic food places recognition by analyzing egocentric photo-streams. In this paper, we apply multi-scale Atrous convolution networks to extract the key features related to food places of the input images. The proposed model is evaluated on an in-house private dataset called “EgoFoodPlaces”. Experimental results shows promising results of food places classification recognition in egocentric photo-streams.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LCNS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ECCVW  
  Notes MILAB; no menciona Approved no  
  Call Number Admin @ si @ SRR2018b Serial 3185  
Permanent link to this record
 

 
Author Xavier Soria; Angel Sappa edit   pdf
openurl 
  Title Improving Edge Detection in RGB Images by Adding NIR Channel Type Conference Article
  Year 2018 Publication 14th IEEE International Conference on Signal Image Technology & Internet Based System Abbreviated Journal  
  Volume Issue Pages  
  Keywords Edge detection; Contour detection; VGG; CNN; RGB-NIR; Near infrared images  
  Abstract The edge detection is yet a critical problem in many computer vision and image processing tasks. The manuscript presents an Holistically-Nested Edge Detection based approach to study the inclusion of Near-Infrared in the Visible spectrum
images. To do so, a Single Sensor based dataset has been acquired in the range of 400nm to 1100nm wavelength spectral band. Prominent results have been obtained even when the ground truth (annotated edge-map) is based in the visible wavelength spectrum.
 
  Address Las Palmas de Gran Canaria; November 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference SITIS  
  Notes MSIAU; 600.122 Approved no  
  Call Number Admin @ si @ SoS2018 Serial 3192  
Permanent link to this record
 

 
Author Patricia Suarez; Angel Sappa; Boris X. Vintimilla edit   pdf
isbn  openurl
  Title Cross-spectral image dehaze through a dense stacked conditional GAN based approach Type Conference Article
  Year 2018 Publication 14th IEEE International Conference on Signal Image Technology & Internet Based System Abbreviated Journal  
  Volume Issue Pages  
  Keywords Infrared imaging; Dense; Stacked CGAN; Crossspectral; Convolutional networks  
  Abstract This paper proposes a novel approach to remove haze from RGB images using a near infrared images based on a dense stacked conditional Generative Adversarial Network (CGAN). The architecture of the deep network implemented
receives, besides the images with haze, its corresponding image in the near infrared spectrum, which serve to accelerate the learning process of the details of the characteristics of the images. The model uses a triplet layer that allows the independence learning of each channel of the visible spectrum image to remove the haze on each color channel separately. A multiple loss function scheme is proposed, which ensures balanced learning between the colors
and the structure of the images. Experimental results have shown that the proposed method effectively removes the haze from the images. Additionally, the proposed approach is compared with a state of the art approach showing better results.
 
  Address Las Palmas de Gran Canaria; November 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-1-5386-9385-8 Medium  
  Area Expedition Conference SITIS  
  Notes MSIAU; 600.086; 600.130; 600.122 Approved no  
  Call Number Admin @ si @ SSV2018a Serial 3193  
Permanent link to this record
 

 
Author Jorge Charco; Boris X. Vintimilla; Angel Sappa edit   pdf
openurl 
  Title Deep learning based camera pose estimation in multi-view environment Type Conference Article
  Year 2018 Publication 14th IEEE International Conference on Signal Image Technology & Internet Based System Abbreviated Journal  
  Volume Issue Pages  
  Keywords Deep learning; Camera pose estimation; Multiview environment; Siamese architecture  
  Abstract This paper proposes to use a deep learning network architecture for relative camera pose estimation on a multi-view environment. The proposed network is a variant architecture of AlexNet to use as regressor for prediction the relative translation and rotation as output. The proposed approach is trained from
scratch on a large data set that takes as input a pair of imagesfrom the same scene. This new architecture is compared with a previous approach using standard metrics, obtaining better results on the relative camera pose.
 
  Address Las Palmas de Gran Canaria; November 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference SITIS  
  Notes MSIAU; 600.086; 600.130; 600.122 Approved no  
  Call Number Admin @ si @ CVS2018 Serial 3194  
Permanent link to this record
 

 
Author Cristina Palmero; Javier Selva; Mohammad Ali Bagueri; Sergio Escalera edit   pdf
openurl 
  Title Recurrent CNN for 3D Gaze Estimation using Appearance and Shape Cues Type Conference Article
  Year 2018 Publication 29th British Machine Vision Conference Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Gaze behavior is an important non-verbal cue in social signal processing and humancomputer interaction. In this paper, we tackle the problem of person- and head poseindependent 3D gaze estimation from remote cameras, using a multi-modal recurrent convolutional neural network (CNN). We propose to combine face, eyes region, and face landmarks as individual streams in a CNN to estimate gaze in still images. Then, we exploit the dynamic nature of gaze by feeding the learned features of all the frames in a sequence to a many-to-one recurrent module that predicts the 3D gaze vector of the last frame. Our multi-modal static solution is evaluated on a wide range of head poses and gaze directions, achieving a significant improvement of 14.6% over the state of the art on
EYEDIAP dataset, further improved by 4% when the temporal modality is included.
 
  Address Newcastle; UK; September 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference BMVC  
  Notes HUPBA; no proj Approved no  
  Call Number Admin @ si @ PSB2018 Serial 3208  
Permanent link to this record
 

 
Author Gabriela Ramirez; Esau Villatoro; Bogdan Ionescu; Hugo Jair Escalante; Sergio Escalera; Martha Larson; Henning Muller; Isabelle Guyon edit  openurl
  Title Overview of the Multimedia Information Processing for Personality & Social Networks Analysis Contes Type Conference Article
  Year 2018 Publication Multimedia Information Processing for Personality and Social Networks Analysis (MIPPSNA 2018) Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address Beijing; China; August 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPRW  
  Notes HUPBA Approved no  
  Call Number Admin @ si @ RVI2018 Serial 3211  
Permanent link to this record
 

 
Author Ester Fornells; Manuel De Armas; Maria Teresa Anguera; Sergio Escalera; Marcos Antonio Catalán; Josep Moya edit  openurl
  Title Desarrollo del proyecto del Consell Comarcal del Baix Llobregat “Buen Trato a las personas mayores y aquellas en situación de fragilidad con sufrimiento emocional: Hacia un envejecimiento saludable” Type Journal
  Year 2018 Publication Informaciones Psiquiatricas Abbreviated Journal  
  Volume 232 Issue Pages 47-59  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 0210-7279 ISBN Medium  
  Area Expedition Conference  
  Notes HUPBA; no menciona Approved no  
  Call Number Admin @ si @ FAA2018 Serial 3214  
Permanent link to this record
 

 
Author Suman Ghosh edit  isbn
openurl 
  Title Word Spotting and Recognition in Images from Heterogeneous Sources A Type Book Whole
  Year 2018 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Text is the most common way of information sharing from ages. With recent development of personal images databases and handwritten historic manuscripts the demand for algorithms to make these databases accessible for browsing and indexing are in rise. Enabling search or understanding large collection of manuscripts or image databases needs fast and robust methods. Researchers have found different ways to represent cropped words for understanding and matching, which works well when words are already segmented. However there is no trivial way to extend these for non-segmented documents. In this thesis we explore different methods for text retrieval and recognition from unsegmented document and scene images. Two different ways of representation exist in literature, one uses a fixed length representation learned from cropped words and another a sequence of features of variable length. Throughout this thesis, we have studied both these representation for their suitability in segmentation free understanding of text. In the first part we are focused on segmentation free word spotting using a fixed length representation. We extended the use of the successful PHOC (Pyramidal Histogram of Character) representation to segmentation free retrieval. In the second part of the thesis, we explore sequence based features and finally, we propose a unified solution where the same framework can generate both kind of representations.  
  Address November 2018  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Ernest Valveny  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-948531-0-4 Medium  
  Area Expedition Conference  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ Gho2018 Serial 3217  
Permanent link to this record
 

 
Author Arnau Baro; Pau Riba; Alicia Fornes edit   pdf
openurl 
  Title A Starting Point for Handwritten Music Recognition Type Conference Article
  Year 2018 Publication 1st International Workshop on Reading Music Systems Abbreviated Journal  
  Volume Issue Pages 5-6  
  Keywords Optical Music Recognition; Long Short-Term Memory; Convolutional Neural Networks; MUSCIMA++; CVCMUSCIMA  
  Abstract In the last years, the interest in Optical Music Recognition (OMR) has reawakened, especially since the appearance of deep learning. However, there are very few works addressing handwritten scores. In this work we describe a full OMR pipeline for handwritten music scores by using Convolutional and Recurrent Neural Networks that could serve as a baseline for the research community.  
  Address Paris; France; September 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WORMS  
  Notes DAG; 600.097; 601.302; 601.330; 600.121 Approved no  
  Call Number Admin @ si @ BRF2018 Serial 3223  
Permanent link to this record
 

 
Author Laura Lopez-Fuentes; Alessandro Farasin; Harald Skinnemoen; Paolo Garza edit   pdf
openurl 
  Title Deep Learning models for passability detection of flooded roads Type Conference Article
  Year 2018 Publication MediaEval 2018 Multimedia Benchmark Workshop Abbreviated Journal  
  Volume 2283 Issue Pages  
  Keywords  
  Abstract In this paper we study and compare several approaches to detect floods and evidence for passability of roads by conventional means in Twitter. We focus on tweets containing both visual information (a picture shared by the user) and metadata, a combination of text and related extra information intrinsic to the Twitter API. This work has been done in the context of the MediaEval 2018 Multimedia Satellite Task.  
  Address Sophia Antipolis; France; October 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference MediaEval  
  Notes LAMP; 600.084; 600.109; 600.120 Approved no  
  Call Number Admin @ si @ LFS2018 Serial 3224  
Permanent link to this record
 

 
Author Md.Mostafa Kamal Sarker; Mohammed Jabreel; Hatem A. Rashwan; Syeda Furruka Banu; Antonio Moreno; Petia Radeva; Domenec Puig edit  openurl
  Title CuisineNet: Food Attributes Classification using Multi-scale Convolution Network. Type Miscellaneous
  Year 2018 Publication Arxiv Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Diversity of food and its attributes represents the culinary habits of peoples from different countries. Thus, this paper addresses the problem of identifying food culture of people around the world and its flavor by classifying two main food attributes, cuisine and flavor. A deep learning model based on multi-scale convotuional networks is proposed for extracting more accurate features from input images. The aggregation of multi-scale convolution layers with different kernel size is also used for weighting the features results from different scales. In addition, a joint loss function based on Negative Log Likelihood (NLL) is used to fit the model probability to multi labeled classes for multi-modal classification task. Furthermore, this work provides a new dataset for food attributes, so-called Yummly48K, extracted from the popular food website, Yummly. Our model is assessed on the constructed Yummly48K dataset. The experimental results show that our proposed method yields 65% and 62% average F1 score on validation and test set which outperforming the state-of-the-art models.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MILAB; no proj Approved no  
  Call Number Admin @ si @ KJR2018 Serial 3235  
Permanent link to this record
 

 
Author Marçal Rusiñol; Lluis Gomez edit   pdf
openurl 
  Title Avances en clasificación de imágenes en los últimos diez años. Perspectivas y limitaciones en el ámbito de archivos fotográficos históricos Type Journal
  Year 2018 Publication Revista anual de la Asociación de Archiveros de Castilla y León Abbreviated Journal  
  Volume 21 Issue Pages 161-174  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.121; 600.129 Approved no  
  Call Number Admin @ si @ RuG2018 Serial 3239  
Permanent link to this record
Select All    Deselect All
 |   | 
Details
   print

Save Citations:
Export Records: