|
Records |
Links |
|
Author |
Juan Ignacio Toledo; Jordi Cucurull; Jordi Puiggali; Alicia Fornes; Josep Llados |
|
|
Title |
Document Analysis Techniques for Automatic Electoral Document Processing: A Survey |
Type |
Conference Article |
|
Year |
2015 |
Publication |
E-Voting and Identity, Proceedings of 5th international conference, VoteID 2015 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
139-141 |
|
|
Keywords |
Document image analysis; Computer vision; Paper ballots; Paper based elections; Optical scan; Tally |
|
|
Abstract |
In this paper, we will discuss the most common challenges in electoral document processing and study the different solutions from the document analysis community that can be applied in each case. We will cover Optical Mark Recognition techniques to detect voter selections in the Australian Ballot, handwritten number recognition for preferential elections and handwriting recognition for write-in areas. We will also propose some particular adjustments that can be made to those general techniques in the specific context of electoral documents. |
|
|
Address |
Bern; Switzerland; September 2015 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
VoteID |
|
|
Notes |
DAG; 600.061; 602.006; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ TCP2015 |
Serial |
2641 |
|
Permanent link to this record |
|
|
|
|
Author |
Jiaolong Xu |
|
|
Title |
Domain Adaptation of Deformable Part-based Models |
Type |
Book Whole |
|
Year |
2015 |
Publication |
PhD Thesis, Universitat Autonoma de Barcelona-CVC |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
On-board pedestrian detection is crucial for Advanced Driver Assistance Systems
(ADAS). An accurate classication is fundamental for vision-based pedestrian detection.
The underlying assumption for learning classiers is that the training set and the deployment environment (testing) follow the same probability distribution regarding the features used by the classiers. However, in practice, there are dierent reasons that can break this constancy assumption. Accordingly, reusing existing classiers by adapting them from the previous training environment (source domain) to the new testing one (target domain) is an approach with increasing acceptance in the computer vision community. In this thesis we focus on the domain adaptation of deformable part-based models (DPMs) for pedestrian detection. As a prof of concept, we use a computer graphic based synthetic dataset, i.e. a virtual world, as the source domain, and adapt the virtual-world trained DPM detector to various real-world dataset.
We start by exploiting the maximum detection accuracy of the virtual-world
trained DPM. Even though, when operating in various real-world datasets, the virtualworld trained detector still suer from accuracy degradation due to the domain gap of virtual and real worlds. We then focus on domain adaptation of DPM. At the rst step, we consider single source and single target domain adaptation and propose two batch learning methods, namely A-SSVM and SA-SSVM. Later, we further consider leveraging multiple target (sub-)domains for progressive domain adaptation and propose a hierarchical adaptive structured SVM (HA-SSVM) for optimization. Finally, we extend HA-SSVM for the challenging online domain adaptation problem, aiming at making the detector to automatically adapt to the target domain online, without any human intervention. All of the proposed methods in this thesis do not require
revisiting source domain data. The evaluations are done on the Caltech pedestrian detection benchmark. Results show that SA-SSVM slightly outperforms A-SSVM and avoids accuracy drops as high as 15 points when comparing with a non-adapted detector. The hierarchical model learned by HA-SSVM further boosts the domain adaptation performance. Finally, the online domain adaptation method has demonstrated that it can achieve comparable accuracy to the batch learned models while not requiring manually label target domain examples. Domain adaptation for pedestrian detection is of paramount importance and a relatively unexplored area. We humbly hope the work in this thesis could provide foundations for future work in this area. |
|
|
Address |
April 2015 |
|
|
Corporate Author |
|
Thesis |
Ph.D. thesis |
|
|
Publisher |
|
Place of Publication |
|
Editor |
Antonio Lopez |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-84-943427-1-4 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ADAS; 600.076 |
Approved |
no |
|
|
Call Number |
Admin @ si @ Xu2015 |
Serial |
2631 |
|
Permanent link to this record |
|
|
|
|
Author |
Suman Ghosh; Lluis Gomez; Dimosthenis Karatzas; Ernest Valveny |
|
|
Title |
Efficient indexing for Query By String text retrieval |
Type |
Conference Article |
|
Year |
2015 |
Publication |
6th IAPR International Workshop on Camera Based Document Analysis and Recognition CBDAR2015 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
1236 - 1240 |
|
|
Keywords |
|
|
|
Abstract |
This paper deals with Query By String word spotting in scene images. A hierarchical text segmentation algorithm based on text specific selective search is used to find text regions. These regions are indexed per character n-grams present in the text region. An attribute representation based on Pyramidal Histogram of Characters (PHOC) is used to compare text regions with the query text. For generation of the index a similar attribute space based Pyramidal Histogram of character n-grams is used. These attribute models are learned using linear SVMs over the Fisher Vector [1] representation of the images along with the PHOC labels of the corresponding strings. |
|
|
Address |
Nancy; France; August 2015 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CBDAR |
|
|
Notes |
DAG; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ GGK2015 |
Serial |
2693 |
|
Permanent link to this record |
|
|
|
|
Author |
Marçal Rusiñol; David Aldavert; Ricardo Toledo; Josep Llados |
|
|
Title |
Efficient segmentation-free keyword spotting in historical document collections |
Type |
Journal Article |
|
Year |
2015 |
Publication |
Pattern Recognition |
Abbreviated Journal |
PR |
|
|
Volume |
48 |
Issue |
2 |
Pages |
545–555 |
|
|
Keywords |
Historical documents; Keyword spotting; Segmentation-free; Dense SIFT features; Latent semantic analysis; Product quantization |
|
|
Abstract |
In this paper we present an efficient segmentation-free word spotting method, applied in the context of historical document collections, that follows the query-by-example paradigm. We use a patch-based framework where local patches are described by a bag-of-visual-words model powered by SIFT descriptors. By projecting the patch descriptors to a topic space with the latent semantic analysis technique and compressing the descriptors with the product quantization method, we are able to efficiently index the document information both in terms of memory and time. The proposed method is evaluated using four different collections of historical documents achieving good performances on both handwritten and typewritten scenarios. The yielded performances outperform the recent state-of-the-art keyword spotting approaches. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; ADAS; 600.076; 600.077; 600.061; 601.223; 602.006; 600.055 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RAT2015a |
Serial |
2544 |
|
Permanent link to this record |
|
|
|
|
Author |
David Sanchez-Mendoza; David Masip; Agata Lapedriza |
|
|
Title |
Emotion recognition from mid-level features |
Type |
Journal Article |
|
Year |
2015 |
Publication |
Pattern Recognition Letters |
Abbreviated Journal |
PRL |
|
|
Volume |
67 |
Issue |
Part 1 |
Pages |
66–74 |
|
|
Keywords |
Facial expression; Emotion recognition; Action units; Computer vision |
|
|
Abstract |
In this paper we present a study on the use of Action Units as mid-level features for automatically recognizing basic and subtle emotions. We propose a representation model based on mid-level facial muscular movement features. We encode these movements dynamically using the Facial Action Coding System, and propose to use these intermediate features based on Action Units (AUs) to classify emotions. AUs activations are detected fusing a set of spatiotemporal geometric and appearance features. The algorithm is validated in two applications: (i) the recognition of 7 basic emotions using the publicly available Cohn-Kanade database, and (ii) the inference of subtle emotional cues in the Newscast database. In this second scenario, we consider emotions that are perceived cumulatively in longer periods of time. In particular, we Automatically classify whether video shoots from public News TV channels refer to Good or Bad news. To deal with the different video lengths we propose a Histogram of Action Units and compute it using a sliding window strategy on the frame sequences. Our approach achieves accuracies close to human perception. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Elsevier B.V. |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0167-8655 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
OR;MV |
Approved |
no |
|
|
Call Number |
Admin @ si @ SML2015 |
Serial |
2746 |
|
Permanent link to this record |
|
|
|
|
Author |
Wenjuan Gong; W.Zhang; Jordi Gonzalez; Y.Ren; Z.Li |
|
|
Title |
Enhanced Asymmetric Bilinear Model for Face Recognition |
Type |
Journal Article |
|
Year |
2015 |
Publication |
International Journal of Distributed Sensor Networks |
Abbreviated Journal |
IJDSN |
|
|
Volume |
|
Issue |
|
Pages |
Article ID 218514 |
|
|
Keywords |
|
|
|
Abstract |
Bilinear models have been successfully applied to separate two factors, for example, pose variances and different identities in face recognition problems. Asymmetric model is a type of bilinear model which models a system in the most concise way. But seldom there are works exploring the applications of asymmetric bilinear model on face recognition problem with illumination changes. In this work, we propose enhanced asymmetric model for illumination-robust face recognition. Instead of initializing the factor probabilities randomly, we initialize them with nearest neighbor method and optimize them for the test data. Above that, we update the factor model to be identified. We validate the proposed method on a designed data sample and extended Yale B dataset. The experiment results show that the enhanced asymmetric models give promising results and good recognition accuracies. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ISE; 600.063; 600.078 |
Approved |
no |
|
|
Call Number |
Admin @ si @ GZG2015 |
Serial |
2592 |
|
Permanent link to this record |
|
|
|
|
Author |
Youssef El Rhabi; Simon Loic; Brun Luc |
|
|
Title |
Estimation de la pose d’une caméra à partir d’un flux vidéo en s’approchant du temps réel |
Type |
Conference Article |
|
Year |
2015 |
Publication |
15ème édition d'ORASIS, journées francophones des jeunes chercheurs en vision par ordinateur ORASIS2015 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
Augmented Reality; SFM; SLAM; real time pose computation; 2D/3D registration |
|
|
Abstract |
Finding a way to estimate quickly and robustly the pose of an image is essential in augmented reality. Here we will discuss the approach we chose in order to get closer to real time by using SIFT points [4]. We propose a method based on filtering both SIFT points and images on which to focus on. Hence we will focus on relevant data. |
|
|
Address |
Amiens; France; June 2015 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ORASIS |
|
|
Notes |
DAG; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RLL2015 |
Serial |
2626 |
|
Permanent link to this record |
|
|
|
|
Author |
Juan Ramon Terven Salinas; Bogdan Raducanu; Maria Elena Meza-de-Luna; Joaquin Salas |
|
|
Title |
Evaluating Real-Time Mirroring of Head Gestures using Smart Glasses |
Type |
Conference Article |
|
Year |
2015 |
Publication |
16th IEEE International Conference on Computer Vision Workshops |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
452-460 |
|
|
Keywords |
|
|
|
Abstract |
Mirroring occurs when one person tends to mimic the non-verbal communication of their counterparts. Even though mirroring is a complex phenomenon, in this study, we focus on the detection of head-nodding as a simple non-verbal communication cue due to its significance as a gesture displayed during social interactions. This paper introduces a computer vision-based method to detect mirroring through the analysis of head gestures using wearable cameras (smart glasses). In addition, we study how such a method can be used to explore perceived competence. The proposed method has been evaluated and the experiments demonstrate how static and wearable cameras seem to be equally effective to gather the information required for the analysis. |
|
|
Address |
Santiago de Chile; December 2015 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICCVW |
|
|
Notes |
LAMP; 600.068; 600.072; |
Approved |
no |
|
|
Call Number |
Admin @ si @ TRM2015 |
Serial |
2722 |
|
Permanent link to this record |
|
|
|
|
Author |
Fadi Dornaika; Bogdan Raducanu; Alireza Bosaghzadeh |
|
|
Title |
Facial expression recognition based on multi observations with application to social robotics |
Type |
Book Chapter |
|
Year |
2015 |
Publication |
Emotional and Facial Expressions: Recognition, Developmental Differences and Social Importance |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
153-166 |
|
|
Keywords |
|
|
|
Abstract |
Human-robot interaction is a hot topic nowadays in the social robotics
community. One crucial aspect is represented by the affective communication
which comes encoded through the facial expressions. In this chapter, we propose a novel approach for facial expression recognition, which exploits an efficient and adaptive graph-based label propagation (semi-supervised mode) in a multi-observation framework. The facial features are extracted using an appearance-based 3D face tracker, viewand texture independent. Our method has been extensively tested on the CMU dataset, and has been conveniently compared with other methods for graph construction. With the proposed approach, we developed an application for an AIBO robot, in which it mirrors the recognized facial
expression. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Nova Science publishers |
Place of Publication |
|
Editor |
Bruce Flores |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
LAMP; |
Approved |
no |
|
|
Call Number |
Admin @ si @ DRB2015 |
Serial |
2720 |
|
Permanent link to this record |
|
|
|
|
Author |
Josep M. Gonfaus; Marco Pedersoli; Jordi Gonzalez; Andrea Vedaldi; Xavier Roca |
|
|
Title |
Factorized appearances for object detection |
Type |
Journal Article |
|
Year |
2015 |
Publication |
Computer Vision and Image Understanding |
Abbreviated Journal |
CVIU |
|
|
Volume |
138 |
Issue |
|
Pages |
92–101 |
|
|
Keywords |
Object recognition; Deformable part models; Learning and sharing parts; Discovering discriminative parts |
|
|
Abstract |
Deformable object models capture variations in an object’s appearance that can be represented as image deformations. Other effects such as out-of-plane rotations, three-dimensional articulations, and self-occlusions are often captured by considering mixture of deformable models, one per object aspect. A more scalable approach is representing instead the variations at the level of the object parts, applying the concept of a mixture locally. Combining a few part variations can in fact cheaply generate a large number of global appearances.
A limited version of this idea was proposed by Yang and Ramanan [1], for human pose dectection. In this paper we apply it to the task of generic object category detection and extend it in several ways. First, we propose a model for the relationship between part appearances more general than the tree of Yang and Ramanan [1], which is more suitable for generic categories. Second, we treat part locations as well as their appearance as latent variables so that training does not need part annotations but only the object bounding boxes. Third, we modify the weakly-supervised learning of Felzenszwalb et al. and Girshick et al. [2], [3] to handle a significantly more complex latent structure.
Our model is evaluated on standard object detection benchmarks and is found to improve over existing approaches, yielding state-of-the-art results for several object categories. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ISE; 600.063; 600.078 |
Approved |
no |
|
|
Call Number |
Admin @ si @ GPG2015 |
Serial |
2705 |
|
Permanent link to this record |
|
|
|
|
Author |
Marta Nuñez-Garcia; Sonja Simpraga; M.Angeles Jurado; Maite Garolera; Roser Pueyo; Laura Igual |
|
|
Title |
FADR: Functional-Anatomical Discriminative Regions for rest fMRI Characterization |
Type |
Conference Article |
|
Year |
2015 |
Publication |
Machine Learning in Medical Imaging, Proceedings of 6th International Workshop, MLMI 2015, Held in Conjunction with MICCAI 2015 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
61-68 |
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Munich; Germany; October 2015 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
MLMI |
|
|
Notes |
MILAB |
Approved |
no |
|
|
Call Number |
Admin @ si @ NSJ2015 |
Serial |
2674 |
|
Permanent link to this record |
|
|
|
|
Author |
Adriana Romero; Nicolas Ballas; Samira Ebrahimi Kahou; Antoine Chassang; Carlo Gatta; Yoshua Bengio |
|
|
Title |
FitNets: Hints for Thin Deep Nets |
Type |
Conference Article |
|
Year |
2015 |
Publication |
3rd International Conference on Learning Representations ICLR2015 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
Computer Science ; Learning; Computer Science ;Neural and Evolutionary Computing |
|
|
Abstract |
While depth tends to improve network performances, it also makes gradient-based training more difficult since deeper networks tend to be more non-linear. The recently proposed knowledge distillation approach is aimed at obtaining small and fast-to-execute models, and it has shown that a student network could imitate the soft output of a larger teacher network or ensemble of networks. In this paper, we extend this idea to allow the training of a student that is deeper and thinner than the teacher, using not only the outputs but also the intermediate representations learned by the teacher as hints to improve the training process and final performance of the student. Because the student intermediate hidden layer will generally be smaller than the teacher's intermediate hidden layer, additional parameters are introduced to map the student hidden layer to the prediction of the teacher hidden layer. This allows one to train deeper students that can generalize better or run faster, a trade-off that is controlled by the chosen student capacity. For example, on CIFAR-10, a deep student network with almost 10.4 times less parameters outperforms a larger, state-of-the-art teacher network. |
|
|
Address |
San Diego; CA; May 2015 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICLR |
|
|
Notes |
MILAB |
Approved |
no |
|
|
Call Number |
Admin @ si @ RBK2015 |
Serial |
2593 |
|
Permanent link to this record |
|
|
|
|
Author |
Hongxing Gao |
|
|
Title |
Focused Structural Document Image Retrieval in Digital Mailroom Applications |
Type |
Book Whole |
|
Year |
2015 |
Publication |
PhD Thesis, Universitat Autonoma de Barcelona-CVC |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
In this work, we develop a generic framework that is able to handle the document retrieval problem in various scenarios such as searching for full page matches or retrieving the counterparts for specific document areas, focusing on their structural similarity or letting their visual resemblance to play a dominant role. Based on the spatial indexing technique, we propose to search for matches of local key-region pairs carrying both structural and visual information from the collection while a scheme allowing to adjust the relative contribution of structural and visual similarity is presented.
Based on the fact that the structure of documents is tightly linked with the distance among their elements, we firstly introduce an efficient detector named Distance Transform based Maximally Stable Extremal Regions (DTMSER). We illustrate that this detector is able to efficiently extract the structure of a document image as a dendrogram (hierarchical tree) of multi-scale key-regions that roughly correspond to letters, words and paragraphs. We demonstrate that, without benefiting from the structure information, the key-regions extracted by the DTMSER algorithm achieve better results comparing with state-of-the-art methods while much less amount of key-regions are employed.
We subsequently propose a pair-wise Bag of Words (BoW) framework to efficiently embed the explicit structure extracted by the DTMSER algorithm. We represent each document as a list of key-region pairs that correspond to the edges in the dendrogram where inclusion relationship is encoded. By employing those structural key-region pairs as the pooling elements for generating the histogram of features, the proposed method is able to encode the explicit inclusion relations into a BoW representation. The experimental results illustrate that the pair-wise BoW, powered by the embedded structural information, achieves remarkable improvement over the conventional BoW and spatial pyramidal BoW methods.
To handle various retrieval scenarios in one framework, we propose to directly query a series of key-region pairs, carrying both structure and visual information, from the collection. We introduce the spatial indexing techniques to the document retrieval community to speed up the structural relationship computation for key-region pairs. We firstly test the proposed framework in a full page retrieval scenario where structurally similar matches are expected. In this case, the pair-wise querying method achieves notable improvement over the BoW and spatial pyramidal BoW frameworks. Furthermore, we illustrate that the proposed method is also able to handle focused retrieval situations where the queries are defined as a specific interesting partial areas of the images. We examine our method on two types of focused queries: structure-focused and exact queries. The experimental results show that, the proposed generic framework obtains nearly perfect precision on both types of focused queries while it is the first framework able to tackle structure-focused queries, setting a new state of the art in the field.
Besides, we introduce a line verification method to check the spatial consistency among the matched key-region pairs. We propose a computationally efficient version of line verification through a two step implementation. We first compute tentative localizations of the query and subsequently employ them to divide the matched key-region pairs into several groups, then line verification is performed within each group while more precise bounding boxes are computed. We demonstrate that, comparing with the standard approach (based on RANSAC), the line verification proposed generally achieves much higher recall with slight loss on precision on specific queries. |
|
|
Address |
January 2015 |
|
|
Corporate Author |
|
Thesis |
Ph.D. thesis |
|
|
Publisher |
Ediciones Graficas Rey |
Place of Publication |
|
Editor |
Josep Llados;Dimosthenis Karatzas;Marçal Rusiñol |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-84-943427-0-7 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ Gao2015 |
Serial |
2577 |
|
Permanent link to this record |
|
|
|
|
Author |
Firat Ismailoglu; Ida G. Sprinkhuizen-Kuyper; Evgueni Smirnov; Sergio Escalera; Ralf Peeters |
|
|
Title |
Fractional Programming Weighted Decoding for Error-Correcting Output Codes |
Type |
Conference Article |
|
Year |
2015 |
Publication |
Multiple Classifier Systems, Proceedings of 12th International Workshop , MCS 2015 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
38-50 |
|
|
Keywords |
|
|
|
Abstract |
In order to increase the classification performance obtained using Error-Correcting Output Codes designs (ECOC), introducing weights in the decoding phase of the ECOC has attracted a lot of interest. In this work, we present a method for ECOC designs that focuses on increasing hypothesis margin on the data samples given a base classifier. While achieving this, we implicitly reward the base classifiers with high performance, whereas punish those with low performance. The resulting objective function is of the fractional programming type and we deal with this problem through the Dinkelbach’s Algorithm. The conducted tests over well known UCI datasets show that the presented method is superior to the unweighted decoding and that it outperforms the results of the state-of-the-art weighted decoding methods in most of the performed experiments. |
|
|
Address |
Gunzburg; Germany; June 2015 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer International Publishing |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-3-319-20247-1 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
MCS |
|
|
Notes |
HuPBA;MILAB |
Approved |
no |
|
|
Call Number |
Admin @ si @ ISS2015 |
Serial |
2601 |
|
Permanent link to this record |
|
|
|
|
Author |
Adria Ruiz; Joost Van de Weijer; Xavier Binefa |
|
|
Title |
From emotions to action units with hidden and semi-hidden-task learning |
Type |
Conference Article |
|
Year |
2015 |
Publication |
16th IEEE International Conference on Computer Vision |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
3703-3711 |
|
|
Keywords |
|
|
|
Abstract |
Limited annotated training data is a challenging problem in Action Unit recognition. In this paper, we investigate how the use of large databases labelled according to the 6 universal facial expressions can increase the generalization ability of Action Unit classifiers. For this purpose, we propose a novel learning framework: Hidden-Task Learning. HTL aims to learn a set of Hidden-Tasks (Action Units)for which samples are not available but, in contrast, training data is easier to obtain from a set of related VisibleTasks (Facial Expressions). To that end, HTL is able to exploit prior knowledge about the relation between Hidden and Visible-Tasks. In our case, we base this prior knowledge on empirical psychological studies providing statistical correlations between Action Units and universal facial expressions. Additionally, we extend HTL to Semi-Hidden Task Learning (SHTL) assuming that Action Unit training samples are also provided. Performing exhaustive experiments over four different datasets, we show that HTL and SHTL improve the generalization ability of AU classifiers by training them with additional facial expression data. Additionally, we show that SHTL achieves competitive performance compared with state-of-the-art Transductive Learning approaches which face the problem of limited training data by using unlabelled test samples during training. |
|
|
Address |
Santiago de Chile; Chile; December 2015 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICCV |
|
|
Notes |
LAMP; 600.068; 600.079 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RWB2015 |
Serial |
2671 |
|
Permanent link to this record |