|   | 
Details
   web
Records
Author Juan Ignacio Toledo; Jordi Cucurull; Jordi Puiggali; Alicia Fornes; Josep Llados
Title (up) Document Analysis Techniques for Automatic Electoral Document Processing: A Survey Type Conference Article
Year 2015 Publication E-Voting and Identity, Proceedings of 5th international conference, VoteID 2015 Abbreviated Journal
Volume Issue Pages 139-141
Keywords Document image analysis; Computer vision; Paper ballots; Paper based elections; Optical scan; Tally
Abstract In this paper, we will discuss the most common challenges in electoral document processing and study the different solutions from the document analysis community that can be applied in each case. We will cover Optical Mark Recognition techniques to detect voter selections in the Australian Ballot, handwritten number recognition for preferential elections and handwriting recognition for write-in areas. We will also propose some particular adjustments that can be made to those general techniques in the specific context of electoral documents.
Address Bern; Switzerland; September 2015
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference VoteID
Notes DAG; 600.061; 602.006; 600.077 Approved no
Call Number Admin @ si @ TCP2015 Serial 2641
Permanent link to this record
 

 
Author Jiaolong Xu
Title (up) Domain Adaptation of Deformable Part-based Models Type Book Whole
Year 2015 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract On-board pedestrian detection is crucial for Advanced Driver Assistance Systems
(ADAS). An accurate classi cation is fundamental for vision-based pedestrian detection.
The underlying assumption for learning classi ers is that the training set and the deployment environment (testing) follow the same probability distribution regarding the features used by the classi ers. However, in practice, there are di erent reasons that can break this constancy assumption. Accordingly, reusing existing classi ers by adapting them from the previous training environment (source domain) to the new testing one (target domain) is an approach with increasing acceptance in the computer vision community. In this thesis we focus on the domain adaptation of deformable part-based models (DPMs) for pedestrian detection. As a prof of concept, we use a computer graphic based synthetic dataset, i.e. a virtual world, as the source domain, and adapt the virtual-world trained DPM detector to various real-world dataset.
We start by exploiting the maximum detection accuracy of the virtual-world
trained DPM. Even though, when operating in various real-world datasets, the virtualworld trained detector still su er from accuracy degradation due to the domain gap of virtual and real worlds. We then focus on domain adaptation of DPM. At the rst step, we consider single source and single target domain adaptation and propose two batch learning methods, namely A-SSVM and SA-SSVM. Later, we further consider leveraging multiple target (sub-)domains for progressive domain adaptation and propose a hierarchical adaptive structured SVM (HA-SSVM) for optimization. Finally, we extend HA-SSVM for the challenging online domain adaptation problem, aiming at making the detector to automatically adapt to the target domain online, without any human intervention. All of the proposed methods in this thesis do not require
revisiting source domain data. The evaluations are done on the Caltech pedestrian detection benchmark. Results show that SA-SSVM slightly outperforms A-SSVM and avoids accuracy drops as high as 15 points when comparing with a non-adapted detector. The hierarchical model learned by HA-SSVM further boosts the domain adaptation performance. Finally, the online domain adaptation method has demonstrated that it can achieve comparable accuracy to the batch learned models while not requiring manually label target domain examples. Domain adaptation for pedestrian detection is of paramount importance and a relatively unexplored area. We humbly hope the work in this thesis could provide foundations for future work in this area.
Address April 2015
Corporate Author Thesis Ph.D. thesis
Publisher Place of Publication Editor Antonio Lopez
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-943427-1-4 Medium
Area Expedition Conference
Notes ADAS; 600.076 Approved no
Call Number Admin @ si @ Xu2015 Serial 2631
Permanent link to this record
 

 
Author Suman Ghosh; Lluis Gomez; Dimosthenis Karatzas; Ernest Valveny
Title (up) Efficient indexing for Query By String text retrieval Type Conference Article
Year 2015 Publication 6th IAPR International Workshop on Camera Based Document Analysis and Recognition CBDAR2015 Abbreviated Journal
Volume Issue Pages 1236 - 1240
Keywords
Abstract This paper deals with Query By String word spotting in scene images. A hierarchical text segmentation algorithm based on text specific selective search is used to find text regions. These regions are indexed per character n-grams present in the text region. An attribute representation based on Pyramidal Histogram of Characters (PHOC) is used to compare text regions with the query text. For generation of the index a similar attribute space based Pyramidal Histogram of character n-grams is used. These attribute models are learned using linear SVMs over the Fisher Vector [1] representation of the images along with the PHOC labels of the corresponding strings.
Address Nancy; France; August 2015
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CBDAR
Notes DAG; 600.077 Approved no
Call Number Admin @ si @ GGK2015 Serial 2693
Permanent link to this record
 

 
Author Marçal Rusiñol; David Aldavert; Ricardo Toledo; Josep Llados
Title (up) Efficient segmentation-free keyword spotting in historical document collections Type Journal Article
Year 2015 Publication Pattern Recognition Abbreviated Journal PR
Volume 48 Issue 2 Pages 545–555
Keywords Historical documents; Keyword spotting; Segmentation-free; Dense SIFT features; Latent semantic analysis; Product quantization
Abstract In this paper we present an efficient segmentation-free word spotting method, applied in the context of historical document collections, that follows the query-by-example paradigm. We use a patch-based framework where local patches are described by a bag-of-visual-words model powered by SIFT descriptors. By projecting the patch descriptors to a topic space with the latent semantic analysis technique and compressing the descriptors with the product quantization method, we are able to efficiently index the document information both in terms of memory and time. The proposed method is evaluated using four different collections of historical documents achieving good performances on both handwritten and typewritten scenarios. The yielded performances outperform the recent state-of-the-art keyword spotting approaches.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; ADAS; 600.076; 600.077; 600.061; 601.223; 602.006; 600.055 Approved no
Call Number Admin @ si @ RAT2015a Serial 2544
Permanent link to this record
 

 
Author David Sanchez-Mendoza; David Masip; Agata Lapedriza
Title (up) Emotion recognition from mid-level features Type Journal Article
Year 2015 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 67 Issue Part 1 Pages 66–74
Keywords Facial expression; Emotion recognition; Action units; Computer vision
Abstract In this paper we present a study on the use of Action Units as mid-level features for automatically recognizing basic and subtle emotions. We propose a representation model based on mid-level facial muscular movement features. We encode these movements dynamically using the Facial Action Coding System, and propose to use these intermediate features based on Action Units (AUs) to classify emotions. AUs activations are detected fusing a set of spatiotemporal geometric and appearance features. The algorithm is validated in two applications: (i) the recognition of 7 basic emotions using the publicly available Cohn-Kanade database, and (ii) the inference of subtle emotional cues in the Newscast database. In this second scenario, we consider emotions that are perceived cumulatively in longer periods of time. In particular, we Automatically classify whether video shoots from public News TV channels refer to Good or Bad news. To deal with the different video lengths we propose a Histogram of Action Units and compute it using a sliding window strategy on the frame sequences. Our approach achieves accuracies close to human perception.
Address
Corporate Author Thesis
Publisher Elsevier B.V. Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 0167-8655 ISBN Medium
Area Expedition Conference
Notes OR;MV Approved no
Call Number Admin @ si @ SML2015 Serial 2746
Permanent link to this record
 

 
Author Wenjuan Gong; W.Zhang; Jordi Gonzalez; Y.Ren; Z.Li
Title (up) Enhanced Asymmetric Bilinear Model for Face Recognition Type Journal Article
Year 2015 Publication International Journal of Distributed Sensor Networks Abbreviated Journal IJDSN
Volume Issue Pages Article ID 218514
Keywords
Abstract Bilinear models have been successfully applied to separate two factors, for example, pose variances and different identities in face recognition problems. Asymmetric model is a type of bilinear model which models a system in the most concise way. But seldom there are works exploring the applications of asymmetric bilinear model on face recognition problem with illumination changes. In this work, we propose enhanced asymmetric model for illumination-robust face recognition. Instead of initializing the factor probabilities randomly, we initialize them with nearest neighbor method and optimize them for the test data. Above that, we update the factor model to be identified. We validate the proposed method on a designed data sample and extended Yale B dataset. The experiment results show that the enhanced asymmetric models give promising results and good recognition accuracies.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ISE; 600.063; 600.078 Approved no
Call Number Admin @ si @ GZG2015 Serial 2592
Permanent link to this record
 

 
Author Youssef El Rhabi; Simon Loic; Brun Luc
Title (up) Estimation de la pose d’une caméra à partir d’un flux vidéo en s’approchant du temps réel Type Conference Article
Year 2015 Publication 15ème édition d'ORASIS, journées francophones des jeunes chercheurs en vision par ordinateur ORASIS2015 Abbreviated Journal
Volume Issue Pages
Keywords Augmented Reality; SFM; SLAM; real time pose computation; 2D/3D registration
Abstract Finding a way to estimate quickly and robustly the pose of an image is essential in augmented reality. Here we will discuss the approach we chose in order to get closer to real time by using SIFT points [4]. We propose a method based on filtering both SIFT points and images on which to focus on. Hence we will focus on relevant data.
Address Amiens; France; June 2015
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ORASIS
Notes DAG; 600.077 Approved no
Call Number Admin @ si @ RLL2015 Serial 2626
Permanent link to this record
 

 
Author Juan Ramon Terven Salinas; Bogdan Raducanu; Maria Elena Meza-de-Luna; Joaquin Salas
Title (up) Evaluating Real-Time Mirroring of Head Gestures using Smart Glasses Type Conference Article
Year 2015 Publication 16th IEEE International Conference on Computer Vision Workshops Abbreviated Journal
Volume Issue Pages 452-460
Keywords
Abstract Mirroring occurs when one person tends to mimic the non-verbal communication of their counterparts. Even though mirroring is a complex phenomenon, in this study, we focus on the detection of head-nodding as a simple non-verbal communication cue due to its significance as a gesture displayed during social interactions. This paper introduces a computer vision-based method to detect mirroring through the analysis of head gestures using wearable cameras (smart glasses). In addition, we study how such a method can be used to explore perceived competence. The proposed method has been evaluated and the experiments demonstrate how static and wearable cameras seem to be equally effective to gather the information required for the analysis.
Address Santiago de Chile; December 2015
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICCVW
Notes LAMP; 600.068; 600.072; Approved no
Call Number Admin @ si @ TRM2015 Serial 2722
Permanent link to this record
 

 
Author Fadi Dornaika; Bogdan Raducanu; Alireza Bosaghzadeh
Title (up) Facial expression recognition based on multi observations with application to social robotics Type Book Chapter
Year 2015 Publication Emotional and Facial Expressions: Recognition, Developmental Differences and Social Importance Abbreviated Journal
Volume Issue Pages 153-166
Keywords
Abstract Human-robot interaction is a hot topic nowadays in the social robotics
community. One crucial aspect is represented by the affective communication
which comes encoded through the facial expressions. In this chapter, we propose a novel approach for facial expression recognition, which exploits an efficient and adaptive graph-based label propagation (semi-supervised mode) in a multi-observation framework. The facial features are extracted using an appearance-based 3D face tracker, viewand texture independent. Our method has been extensively tested on the CMU dataset, and has been conveniently compared with other methods for graph construction. With the proposed approach, we developed an application for an AIBO robot, in which it mirrors the recognized facial
expression.
Address
Corporate Author Thesis
Publisher Nova Science publishers Place of Publication Editor Bruce Flores
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP; Approved no
Call Number Admin @ si @ DRB2015 Serial 2720
Permanent link to this record
 

 
Author Josep M. Gonfaus; Marco Pedersoli; Jordi Gonzalez; Andrea Vedaldi; Xavier Roca
Title (up) Factorized appearances for object detection Type Journal Article
Year 2015 Publication Computer Vision and Image Understanding Abbreviated Journal CVIU
Volume 138 Issue Pages 92–101
Keywords Object recognition; Deformable part models; Learning and sharing parts; Discovering discriminative parts
Abstract Deformable object models capture variations in an object’s appearance that can be represented as image deformations. Other effects such as out-of-plane rotations, three-dimensional articulations, and self-occlusions are often captured by considering mixture of deformable models, one per object aspect. A more scalable approach is representing instead the variations at the level of the object parts, applying the concept of a mixture locally. Combining a few part variations can in fact cheaply generate a large number of global appearances.

A limited version of this idea was proposed by Yang and Ramanan [1], for human pose dectection. In this paper we apply it to the task of generic object category detection and extend it in several ways. First, we propose a model for the relationship between part appearances more general than the tree of Yang and Ramanan [1], which is more suitable for generic categories. Second, we treat part locations as well as their appearance as latent variables so that training does not need part annotations but only the object bounding boxes. Third, we modify the weakly-supervised learning of Felzenszwalb et al. and Girshick et al. [2], [3] to handle a significantly more complex latent structure.
Our model is evaluated on standard object detection benchmarks and is found to improve over existing approaches, yielding state-of-the-art results for several object categories.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ISE; 600.063; 600.078 Approved no
Call Number Admin @ si @ GPG2015 Serial 2705
Permanent link to this record
 

 
Author Marta Nuñez-Garcia; Sonja Simpraga; M.Angeles Jurado; Maite Garolera; Roser Pueyo; Laura Igual
Title (up) FADR: Functional-Anatomical Discriminative Regions for rest fMRI Characterization Type Conference Article
Year 2015 Publication Machine Learning in Medical Imaging, Proceedings of 6th International Workshop, MLMI 2015, Held in Conjunction with MICCAI 2015 Abbreviated Journal
Volume Issue Pages 61-68
Keywords
Abstract
Address Munich; Germany; October 2015
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference MLMI
Notes MILAB Approved no
Call Number Admin @ si @ NSJ2015 Serial 2674
Permanent link to this record
 

 
Author Adriana Romero; Nicolas Ballas; Samira Ebrahimi Kahou; Antoine Chassang; Carlo Gatta; Yoshua Bengio
Title (up) FitNets: Hints for Thin Deep Nets Type Conference Article
Year 2015 Publication 3rd International Conference on Learning Representations ICLR2015 Abbreviated Journal
Volume Issue Pages
Keywords Computer Science ; Learning; Computer Science ;Neural and Evolutionary Computing
Abstract While depth tends to improve network performances, it also makes gradient-based training more difficult since deeper networks tend to be more non-linear. The recently proposed knowledge distillation approach is aimed at obtaining small and fast-to-execute models, and it has shown that a student network could imitate the soft output of a larger teacher network or ensemble of networks. In this paper, we extend this idea to allow the training of a student that is deeper and thinner than the teacher, using not only the outputs but also the intermediate representations learned by the teacher as hints to improve the training process and final performance of the student. Because the student intermediate hidden layer will generally be smaller than the teacher's intermediate hidden layer, additional parameters are introduced to map the student hidden layer to the prediction of the teacher hidden layer. This allows one to train deeper students that can generalize better or run faster, a trade-off that is controlled by the chosen student capacity. For example, on CIFAR-10, a deep student network with almost 10.4 times less parameters outperforms a larger, state-of-the-art teacher network.
Address San Diego; CA; May 2015
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICLR
Notes MILAB Approved no
Call Number Admin @ si @ RBK2015 Serial 2593
Permanent link to this record
 

 
Author Hongxing Gao
Title (up) Focused Structural Document Image Retrieval in Digital Mailroom Applications Type Book Whole
Year 2015 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract In this work, we develop a generic framework that is able to handle the document retrieval problem in various scenarios such as searching for full page matches or retrieving the counterparts for specific document areas, focusing on their structural similarity or letting their visual resemblance to play a dominant role. Based on the spatial indexing technique, we propose to search for matches of local key-region pairs carrying both structural and visual information from the collection while a scheme allowing to adjust the relative contribution of structural and visual similarity is presented.
Based on the fact that the structure of documents is tightly linked with the distance among their elements, we firstly introduce an efficient detector named Distance Transform based Maximally Stable Extremal Regions (DTMSER). We illustrate that this detector is able to efficiently extract the structure of a document image as a dendrogram (hierarchical tree) of multi-scale key-regions that roughly correspond to letters, words and paragraphs. We demonstrate that, without benefiting from the structure information, the key-regions extracted by the DTMSER algorithm achieve better results comparing with state-of-the-art methods while much less amount of key-regions are employed.
We subsequently propose a pair-wise Bag of Words (BoW) framework to efficiently embed the explicit structure extracted by the DTMSER algorithm. We represent each document as a list of key-region pairs that correspond to the edges in the dendrogram where inclusion relationship is encoded. By employing those structural key-region pairs as the pooling elements for generating the histogram of features, the proposed method is able to encode the explicit inclusion relations into a BoW representation. The experimental results illustrate that the pair-wise BoW, powered by the embedded structural information, achieves remarkable improvement over the conventional BoW and spatial pyramidal BoW methods.
To handle various retrieval scenarios in one framework, we propose to directly query a series of key-region pairs, carrying both structure and visual information, from the collection. We introduce the spatial indexing techniques to the document retrieval community to speed up the structural relationship computation for key-region pairs. We firstly test the proposed framework in a full page retrieval scenario where structurally similar matches are expected. In this case, the pair-wise querying method achieves notable improvement over the BoW and spatial pyramidal BoW frameworks. Furthermore, we illustrate that the proposed method is also able to handle focused retrieval situations where the queries are defined as a specific interesting partial areas of the images. We examine our method on two types of focused queries: structure-focused and exact queries. The experimental results show that, the proposed generic framework obtains nearly perfect precision on both types of focused queries while it is the first framework able to tackle structure-focused queries, setting a new state of the art in the field.
Besides, we introduce a line verification method to check the spatial consistency among the matched key-region pairs. We propose a computationally efficient version of line verification through a two step implementation. We first compute tentative localizations of the query and subsequently employ them to divide the matched key-region pairs into several groups, then line verification is performed within each group while more precise bounding boxes are computed. We demonstrate that, comparing with the standard approach (based on RANSAC), the line verification proposed generally achieves much higher recall with slight loss on precision on specific queries.
Address January 2015
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Josep Llados;Dimosthenis Karatzas;Marçal Rusiñol
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-943427-0-7 Medium
Area Expedition Conference
Notes DAG; 600.077 Approved no
Call Number Admin @ si @ Gao2015 Serial 2577
Permanent link to this record
 

 
Author Firat Ismailoglu; Ida G. Sprinkhuizen-Kuyper; Evgueni Smirnov; Sergio Escalera; Ralf Peeters
Title (up) Fractional Programming Weighted Decoding for Error-Correcting Output Codes Type Conference Article
Year 2015 Publication Multiple Classifier Systems, Proceedings of 12th International Workshop , MCS 2015 Abbreviated Journal
Volume Issue Pages 38-50
Keywords
Abstract In order to increase the classification performance obtained using Error-Correcting Output Codes designs (ECOC), introducing weights in the decoding phase of the ECOC has attracted a lot of interest. In this work, we present a method for ECOC designs that focuses on increasing hypothesis margin on the data samples given a base classifier. While achieving this, we implicitly reward the base classifiers with high performance, whereas punish those with low performance. The resulting objective function is of the fractional programming type and we deal with this problem through the Dinkelbach’s Algorithm. The conducted tests over well known UCI datasets show that the presented method is superior to the unweighted decoding and that it outperforms the results of the state-of-the-art weighted decoding methods in most of the performed experiments.
Address Gunzburg; Germany; June 2015
Corporate Author Thesis
Publisher Springer International Publishing Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-3-319-20247-1 Medium
Area Expedition Conference MCS
Notes HuPBA;MILAB Approved no
Call Number Admin @ si @ ISS2015 Serial 2601
Permanent link to this record
 

 
Author Adria Ruiz; Joost Van de Weijer; Xavier Binefa
Title (up) From emotions to action units with hidden and semi-hidden-task learning Type Conference Article
Year 2015 Publication 16th IEEE International Conference on Computer Vision Abbreviated Journal
Volume Issue Pages 3703-3711
Keywords
Abstract Limited annotated training data is a challenging problem in Action Unit recognition. In this paper, we investigate how the use of large databases labelled according to the 6 universal facial expressions can increase the generalization ability of Action Unit classifiers. For this purpose, we propose a novel learning framework: Hidden-Task Learning. HTL aims to learn a set of Hidden-Tasks (Action Units)for which samples are not available but, in contrast, training data is easier to obtain from a set of related VisibleTasks (Facial Expressions). To that end, HTL is able to exploit prior knowledge about the relation between Hidden and Visible-Tasks. In our case, we base this prior knowledge on empirical psychological studies providing statistical correlations between Action Units and universal facial expressions. Additionally, we extend HTL to Semi-Hidden Task Learning (SHTL) assuming that Action Unit training samples are also provided. Performing exhaustive experiments over four different datasets, we show that HTL and SHTL improve the generalization ability of AU classifiers by training them with additional facial expression data. Additionally, we show that SHTL achieves competitive performance compared with state-of-the-art Transductive Learning approaches which face the problem of limited training data by using unlabelled test samples during training.
Address Santiago de Chile; Chile; December 2015
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICCV
Notes LAMP; 600.068; 600.079 Approved no
Call Number Admin @ si @ RWB2015 Serial 2671
Permanent link to this record