Records
Author E. Royer; J. Chazalon; Marçal Rusiñol; F. Bouchara
Title Benchmarking Keypoint Filtering Approaches for Document Image Matching Type Conference Article
Year 2017 Publication 14th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Best Poster Award.
Reducing the amount of keypoints used to index an image is particularly interesting to control processing time and memory usage in real-time document image matching applications, like augmented documents or smartphone applications. This paper benchmarks two keypoint selection methods on a task consisting of reducing keypoint sets extracted from document images, while preserving detection and segmentation accuracy. We first study the different forms of keypoint filtering, and we introduce the use of the CORE selection method on keypoints extracted from document images. Then, we extend a previously published benchmark by including evaluations of the new method, by adding the SURF-BRISK detection/description scheme, and by reporting processing speeds. Evaluations are conducted on the publicly available dataset of ICDAR2015 SmartDOC challenge 1. Finally, we prove that reducing the original keypoint set is always feasible and can be beneficial not only to processing speed but also to accuracy.
Address Kyoto; Japan; November 2017
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICDAR
Notes DAG; 600.084; 600.121 Approved no
Call Number Admin @ si @ RCR2017 Serial 3000
Permanent link to this record
 

 
Author David Aldavert; Marçal Rusiñol; Ricardo Toledo
Title Automatic Static/Variable Content Separation in Administrative Document Images Type Conference Article
Year 2017 Publication 14th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume Issue Pages
Keywords
Abstract In this paper we present an automatic method for separating static and variable content from administrative document images. An alignment approach builds probabilistic templates, in an unsupervised way, from a set of examples of the same document kind. Such templates define the likelihood of every pixel being either static or variable content. In the extraction step, the same alignment technique is used to match an incoming image with the template and to locate the positions where variable fields appear. We validate our approach on the public NIST Structured Tax Forms Dataset.
Address Kyoto; Japan; November 2017
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICDAR
Notes DAG; 600.084; 600.121 Approved no
Call Number Admin @ si @ ART2017 Serial 3001
Permanent link to this record
 

 
Author Katerine Diaz; Konstantia Georgouli; Anastasios Koidis; Jesus Martinez del Rincon
Title Incremental model learning for spectroscopy-based food analysis Type Journal Article
Year 2017 Publication Chemometrics and Intelligent Laboratory Systems Abbreviated Journal CILS
Volume 167 Issue Pages 123-131
Keywords Incremental model learning; IGDCV technique; Subspace based learning; Identification; Vegetable oils; FT-IR spectroscopy
Abstract In this paper we propose the use of incremental learning for creating and improving multivariate analysis models in the field of chemometrics of spectral data. As main advantages, our proposed incremental subspace-based learning allows creating models faster, progressively improving previously created models and sharing them between laboratories and institutions without requiring transferring or disclosing individual spectra samples. In particular, our approach makes it possible to improve the generalization and adaptability of previously generated models with a few new spectral samples so that they are applicable to real-world situations. The potential of our approach is demonstrated using vegetable oil type identification based on spectroscopic data as a case study. Results show how incremental models maintain the accuracy of batch learning methodologies while reducing their computational cost and handicaps.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS; 600.118 Approved no
Call Number Admin @ si @ DGK2017 Serial 3002
Permanent link to this record
 

 
Author Katerine Diaz; Jesus Martinez del Rincon; Aura Hernandez-Sabate
Title Decremental generalized discriminative common vectors applied to images classification Type Journal Article
Year 2017 Publication Knowledge-Based Systems Abbreviated Journal KBS
Volume 131 Issue Pages 46-57
Keywords Decremental learning; Generalized Discriminative Common Vectors; Feature extraction; Linear subspace methods; Classification
Abstract In this paper, a novel decremental subspace-based learning method called Decremental Generalized Discriminative Common Vectors method (DGDCV) is presented. The method makes use of the concept of decremental learning, which we introduce in the field of supervised feature extraction and classification. By efficiently removing unnecessary data and/or classes from a knowledge base, our methodology is able to update the model without recalculating the full projection or accessing the previously processed training data, while retaining the previously acquired knowledge. The proposed method has been validated on 6 standard face recognition datasets, showing a considerable computational gain without compromising the accuracy of the model.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS; 600.118; 600.121 Approved no
Call Number Admin @ si @ DMH2017a Serial 3003
Permanent link to this record
 

 
Author Leonardo Galteri; Dena Bazazian; Lorenzo Seidenari; Marco Bertini; Andrew Bagdanov; Anguelos Nicolaou; Dimosthenis Karatzas; Alberto del Bimbo
Title Reading Text in the Wild from Compressed Images Type Conference Article
Year 2017 Publication 1st International workshop on Egocentric Perception, Interaction and Computing Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Reading text in the wild is gaining attention in the computer vision community. Images captured in the wild are almost always compressed to varying degrees, depending on application context, and this compression introduces artifacts that distort image content into the captured images. In this paper we investigate the impact these compression artifacts have on text localization and recognition in the wild. We also propose a deep Convolutional Neural Network (CNN) that can eliminate text-specific compression artifacts and which leads to an improvement in text recognition. Experimental results on the ICDAR-Challenge4 dataset demonstrate that compression artifacts have a significant impact on text localization and recognition and that our approach yields an improvement in both – especially at high compression rates.
Address Venice; Italy; October 2017
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICCV - EPIC
Notes DAG; 600.084; 600.121 Approved no
Call Number Admin @ si @ GBS2017 Serial 3006
Permanent link to this record
 

 
Author Andrei Polzounov; Artsiom Ablavatski; Sergio Escalera; Shijian Lu; Jianfei Cai
Title WordFences: Text Localization and Recognition Type Conference Article
Year 2017 Publication 24th International Conference on Image Processing Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address Beijing; China; September 2017
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICIP
Notes HUPBA; not mentioned Approved no
Call Number Admin @ si @ PAE2017 Serial 3007
Permanent link to this record
 

 
Author Sergio Escalera; Vassilis Athitsos; Isabelle Guyon
Title Challenges in Multi-modal Gesture Recognition Type Book Chapter
Year 2017 Publication Abbreviated Journal
Volume Issue Pages 1-60
Keywords Gesture recognition; Time series analysis; Multimodal data analysis; Computer vision; Pattern recognition; Wearable sensors; Infrared cameras; Kinect™
Abstract This paper surveys the state of the art on multimodal gesture recognition and introduces the JMLR special topic on gesture recognition 2011–2015. We began right at the start of the Kinect™ revolution when inexpensive infrared cameras providing image depth recordings became available. We published papers using this technology and other more conventional methods, including regular video cameras, to record data, thus providing a good overview of uses of machine learning and computer vision using multimodal data in this area of application. Notably, we organized a series of challenges and made available several datasets we recorded for that purpose, including tens of thousands of videos, which are available to conduct further research. We also overview recent state of the art works on gesture recognition based on a proposed taxonomy for gesture recognition, discussing challenges and future lines of research.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA; no proj Approved no
Call Number Admin @ si @ EAG2017 Serial 3008
Permanent link to this record
 

 
Author Jordi Esquirol; Cristina Palmero; Vanessa Bayo; Miquel Angel Cos; Sergio Escalera; David Sanchez; Maider Sanchez; Noelia Serrano; Mireia Relats
Title Automatic RGB-depth-pressure anthropometric analysis and individualised sleep solution prescription Type Journal
Year 2017 Publication Journal of Medical Engineering & Technology Abbreviated Journal JMET
Volume 41 Issue 6 Pages 486-497
Keywords
Abstract INTRODUCTION:
Sleep surfaces must adapt to individual somatotypic features to maintain a comfortable, convenient and healthy sleep, preventing diseases and injuries. Individually determining the most adequate rest surface can often be a complex and subjective question.
OBJECTIVES:
To design and validate an automatic multimodal somatotype determination model to automatically recommend an individually designed mattress-topper-pillow combination.
METHODS:
Design and validation of an automated prescription model for an individualised sleep system is performed through a single-image 2D-3D analysis and body pressure distribution, to objectively determine optimal individual sleep surfaces combining five different mattress densities, three different toppers and three cervical pillows.
RESULTS:
A final study (n = 151) and re-analysis (n = 117) defined and validated the model, showing high correlations between calculated and real data (>85% in height and body circumferences, 89.9% in weight, 80.4% in body mass index and more than 70% in morphotype categorisation).
CONCLUSIONS:
The somatotype determination model can accurately prescribe an individualised sleep solution. This can be useful for healthy people and for health centres that need to adapt sleep surfaces to people with special needs. Next steps will increase the model's accuracy and analyse whether this prescribed individualised sleep solution can improve sleep quantity and quality; additionally, future studies will adapt the model to mattresses with technological improvements and tailor-made production, and will define interfaces for people with special needs.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HUPBA; not mentioned Approved no
Call Number Admin @ si @ EPB2017 Serial 3010
Permanent link to this record
 

 
Author Sergio Escalera; Xavier Baro; Hugo Jair Escalante; Isabelle Guyon
Title ChaLearn Looking at People: A Review of Events and Resources Type Conference Article
Year 2017 Publication 30th International Joint Conference on Neural Networks Abbreviated Journal
Volume Issue Pages
Keywords
Abstract This paper reviews the history of ChaLearn Looking at People (LAP) events. We started in 2011 (with the release of the first Kinect device) to run challenges related to human action/activity and gesture recognition. Since then we have regularly organized events in a series of competitions covering all aspects of visual analysis of humans. So far we have organized more than 10 international challenges and events in this field. This paper reviews associated events, and introduces the ChaLearn LAP platform where public resources (including code, data and preprints of papers) related to the organized events are available. We also provide a discussion on perspectives of ChaLearn LAP activities.
Address Anchorage; Alaska; USA; May 2017
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference IJCNN
Notes HuPBA; 602.143 Approved no
Call Number Admin @ si @ EBE2017 Serial 3012
Permanent link to this record
 

 
Author Eirikur Agustsson; Radu Timofte; Sergio Escalera; Xavier Baro; Isabelle Guyon; Rasmus Rothe
Title Apparent and real age estimation in still images with deep residual regressors on APPA-REAL database Type Conference Article
Year 2017 Publication 12th IEEE International Conference on Automatic Face and Gesture Recognition Abbreviated Journal
Volume Issue Pages
Keywords
Abstract After decades of research, the real (biological) age estimation from a single face image reached maturity thanks to the availability of large public face databases and impressive accuracies achieved by recently proposed methods.
The estimation of “apparent age” is a related task concerning the age perceived by human observers. Significant advances have been also made in this new research direction with the recent Looking At People challenges. In this paper we make several contributions to age estimation research. (i) We introduce APPA-REAL, a large face image database with both real and apparent age annotations. (ii) We study the relationship between real and apparent age. (iii) We develop a residual age regression method to further improve the performance. (iv) We show that real age estimation can be successfully tackled as an apparent age estimation followed by an apparent to real age residual regression. (v) We graphically reveal the facial regions on which the CNN focuses in order to perform apparent and real age estimation tasks.
Address Washington; USA; May 2017
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference FG
Notes HUPBA; not mentioned Approved no
Call Number Admin @ si @ ATE2017 Serial 3013
Permanent link to this record
 

 
Author Mohammad Ali Bagheri; Qigang Gao; Sergio Escalera; Huamin Ren; Thomas B. Moeslund; Elham Etemad
Title Locality Regularized Group Sparse Coding for Action Recognition Type Journal Article
Year 2017 Publication Computer Vision and Image Understanding Abbreviated Journal CVIU
Volume 158 Issue Pages 106-114
Keywords Bag of words; Feature encoding; Locality constrained coding; Group sparse coding; Alternating direction method of multipliers; Action recognition
Abstract Bag of visual words (BoVW) models are widely utilized in image/video representation and recognition. The cornerstone of these models is the encoding stage, in which local features are decomposed over a codebook in order to obtain a representation of features. In this paper, we propose a new encoding algorithm by jointly encoding the set of local descriptors of each sample and considering the locality structure of descriptors. The proposed method takes advantage of locality coding, such as its stability and robustness to noise in descriptors, as well as the strengths of the group coding strategy by taking into account the potential relation among descriptors of a sample. To efficiently implement our proposed method, we consider the Alternating Direction Method of Multipliers (ADMM) framework, which results in quadratic complexity in the problem size. The method is employed for a challenging classification problem: action recognition by depth cameras. Experimental results demonstrate that our methodology outperforms the state of the art on the considered datasets.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA; no proj Approved no
Call Number Admin @ si @ BGE2017 Serial 3014
Permanent link to this record
 

 
Author Patricia Suarez; Angel Sappa; Boris X. Vintimilla
Title Colorizing Infrared Images through a Triplet Conditional DCGAN Architecture Type Conference Article
Year 2017 Publication 19th International Conference on Image Analysis and Processing Abbreviated Journal
Volume Issue Pages
Keywords CNN in Multispectral Imaging; Image Colorization
Abstract This paper focuses on near infrared (NIR) image colorization by using a Conditional Deep Convolutional Generative Adversarial Network (CDCGAN) architecture model. The proposed architecture is based on the usage of a conditional probabilistic generative model. Firstly, it learns to colorize the given input image, by using a triplet model architecture that tackles every channel in an independent way. In the proposed model, the final layer of the red channel considers the infrared image to enhance the details, resulting in a sharp RGB image. Then, in the second stage, a discriminative model is used to estimate the probability that the generated image came from the training dataset, rather than the image automatically generated. Experimental results with a large set of real images are provided showing the validity of the proposed approach. Additionally, the proposed approach is compared with a state of the art approach showing better results.
Address Catania; Italy; September 2017
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICIAP
Notes ADAS; MSIAU; 600.086; 600.122; 600.118 Approved no
Call Number Admin @ si @ SSV2017c Serial 3016
Permanent link to this record
 

 
Author Meysam Madadi
Title Human Segmentation, Pose Estimation and Applications Type Book Whole
Year 2017 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Automatically analyzing humans in photographs or videos has great potential applications in computer vision, including medical diagnosis, sports, entertainment, movie editing and surveillance, just to name a few. Body, face and hand are the most studied components of humans. The body has many variabilities in shape and clothing along with high degrees of freedom in pose. The face has many muscles causing many visible deformations, besides variable shape and hair style. The hand is a small object that moves fast and has high degrees of freedom. Adding human characteristics to all aforementioned variabilities makes human analysis quite a challenging task.
In this thesis, we developed human segmentation in different modalities. In a first scenario, we segmented human body and hand in depth images using example-based shape warping. We developed a shape descriptor based on shape context and class probabilities of shape regions to extract nearest neighbors. We then considered rigid affine alignment vs. nonrigid iterative shape warping. In a second scenario, we segmented the face in RGB images using convolutional neural networks (CNN). We modeled a conditional random field with recurrent neural networks. In our model, pair-wise kernels are not fixed but learned during training. We trained the network end-to-end using adversarial networks, which improved hair segmentation by a high margin.
We also worked on 3D hand pose estimation in depth images. In a generative approach, we fitted a finger model separately for each finger based on our example-based rigid hand segmentation. We minimized an energy function based on overlapping area, depth discrepancy and finger collisions. We also applied linear models in joint trajectory space to refine occluded joints based on visible joints error and invisible joints trajectory smoothness. In a CNN-based approach, we developed a tree-structure network to train specific features for each finger and fused them for global pose consistency. We also formulated physical and appearance constraints as loss functions.
Finally, we developed a number of applications consisting of human soft biometrics measurement and garment retexturing. We also generated some datasets in this thesis consisting of human segmentation, synthetic hand pose, garment retexturing and Italian gestures.
Address October 2017
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Sergio Escalera;Jordi Gonzalez
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-945373-3-2 Medium
Area Expedition Conference
Notes HUPBA Approved no
Call Number Admin @ si @ Mad2017 Serial 3017
Permanent link to this record
 

 
Author Onur Ferhat
Title Analysis of Head-Pose Invariant, Natural Light Gaze Estimation Methods Type Book Whole
Year 2017 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Eye tracker devices have traditionally been only used inside laboratories, requiring trained professionals and elaborate setup mechanisms. However, in recent years the scientific work on easier-to-use eye trackers that require no special hardware (other than the omnipresent front-facing cameras in computers, tablets, and mobiles) is aiming at making this technology commonplace. These types of trackers have several extra challenges that make the problem harder, such as low resolution images provided by a regular webcam, the changing ambient lighting conditions, personal appearance differences, changes in head pose, and so on. Recent research in the field has focused on all these challenges in order to provide better gaze estimation performances in a real world setup.

In this work, we aim at tackling the gaze tracking problem in a single camera setup. We first analyze all the previous work in the field, identifying the strengths and weaknesses of each tried idea. We start our work on the gaze tracker with an appearance-based gaze estimation method, which is the simplest idea that creates a direct mapping between a rectangular image patch extracted around the eye in a camera image, and the gaze point (or gaze direction). Here, we do an extensive analysis of the factors that affect the performance of this tracker in several experimental setups, in order to address these problems in future works. In the second part of our work, we propose a feature-based gaze estimation method, which encodes the eye region image into a compact representation. We argue that this type of representation is better suited to dealing with head pose and lighting condition changes, as it both reduces the dimensionality of the input (i.e. eye image) and breaks the direct connection between image pixel intensities and the gaze estimation. Lastly, we use a face alignment algorithm to have robust face pose estimation, using a 3D model customized to the subject using the tracker. We combine this with a convolutional neural network trained on a large dataset of images to build a face pose invariant gaze tracker.
Address September 2017
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Fernando Vilariño
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-945373-5-6 Medium
Area Expedition Conference
Notes MV Approved no
Call Number Admin @ si @ Fer2017 Serial 3018
Permanent link to this record
 

 
Author Arash Akbarinia
Title Computational Model of Visual Perception: From Colour to Form Type Book Whole
Year 2017 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract The original idea of this project was to study the role of colour in the challenging task of object recognition. We started by extending previous research on colour naming showing that it is feasible to capture colour terms through parsimonious ellipsoids. Although the results of our model exceeded state-of-the-art in two benchmark datasets, we realised that the two phenomena of metameric lights and colour constancy must be addressed prior to any further colour processing. Our investigation of metameric pairs reached the conclusion that they are infrequent in real world scenarios. Contrary to that, the illumination of a scene often changes dramatically. We addressed this issue by proposing a colour constancy model inspired by the dynamical centre-surround adaptation of neurons in the visual cortex. This was implemented through two overlapping asymmetric Gaussians whose variances and heights are adjusted according to the local contrast of pixels. We complemented this model with a generic contrast-variant pooling mechanism that inversely connects the percentage of pooled signal to the local contrast of a region. The results of our experiments on four benchmark datasets were indeed promising: the proposed model, although simple, outperformed even learning-based approaches in many cases. Encouraged by the success of our contrast-variant surround modulation, we extended this approach to detect boundaries of objects. We proposed an edge detection model based on the first derivative of the Gaussian kernel. We incorporated four types of surround: full, far, iso- and orthogonal-orientation. Furthermore, we accounted for the pooling mechanism at higher cortical areas and the shape feedback sent to lower areas. Our results in three benchmark datasets showed significant improvement over non-learning algorithms.
To summarise, we demonstrated that biologically-inspired models offer promising solutions to computer vision problems such as colour naming, colour constancy and edge detection. We believe that the greatest contribution of this Ph.D. dissertation is modelling the concept of dynamic surround modulation that shows the significance of contrast-variant surround integration. The models proposed here are grounded on only a portion of what we know about the human visual system. Therefore, it is only natural to complement them accordingly in future works.
Address October 2017
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor C. Alejandro Parraga
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-945373-4-9 Medium
Area Expedition Conference
Notes NEUROBIT Approved no
Call Number Admin @ si @ Akb2017 Serial 3019
Permanent link to this record