Author David Fernandez; Josep Llados; Alicia Fornes; R.Manmatha
Title On Influence of Line Segmentation in Efficient Word Segmentation in Old Manuscripts Type Conference Article
Year 2012 Publication 13th International Conference on Frontiers in Handwriting Recognition Abbreviated Journal
Volume Issue Pages 763-768
Keywords document image processing;handwritten character recognition;history;image segmentation;Spanish document;historical document;line segmentation;old handwritten document;old manuscript;word segmentation;Bifurcation;Dynamic programming;Handwriting recognition;Image segmentation;Measurement;Noise;Skeleton;Segmentation;document analysis;document and text processing;handwriting analysis;heuristics;path-finding
Abstract The objective of this work is to show the importance of good line segmentation for obtaining better results in the segmentation of words in historical documents. We have used the approach developed by Manmatha and Rothfeder [1] to segment words in old handwritten documents. In their work the lines of the documents are extracted using projections. In this work, we have developed an approach to segment lines more efficiently. The new line segmentation algorithm copes with skewed, touching and noisy lines, so it significantly improves word segmentation. Experiments using Spanish documents from the Marriages Database of the Barcelona Cathedral show that this approach reduces the error rate by more than 20%.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-1-4673-2262-1 Medium
Area Expedition Conference ICFHR
Notes DAG Approved no
Call Number Admin @ si @ FLF2012 Serial 2200
 

 
Author Volkmar Frinken; Alicia Fornes; Josep Llados; Jean-Marc Ogier
Title Bidirectional Language Model for Handwriting Recognition Type Conference Article
Year 2012 Publication Structural, Syntactic, and Statistical Pattern Recognition, Joint IAPR International Workshop Abbreviated Journal
Volume 7626 Issue Pages 611-619
Keywords
Abstract In order to improve the results of automatically recognized handwritten text, information about the language is commonly included in the recognition process. A common approach is to represent a text line as a sequence. It is processed in one direction and the language information via n-grams is directly included in the decoding. This approach, however, only uses context on one side to estimate a word’s probability. Therefore, we propose a bidirectional recognition in this paper, using distinct forward and backward language models. By combining decoding hypotheses from both directions, we achieve a significant increase in recognition accuracy for the off-line writer independent handwriting recognition task. Both language models are of the same type and can be estimated on the same corpus. Hence, the increase in recognition accuracy comes without any additional need for training data or language modeling complexity.
Address Japan
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN 0302-9743 ISBN 978-3-642-34165-6 Medium
Area Expedition Conference SSPR&SPR
Notes DAG Approved no
Call Number Admin @ si @ FFL2012 Serial 2057
 

 
Author Onur Ferhat
Title Eye-Tracking with Webcam-Based Setups: Implementation of a Real-Time System and an Analysis of Factors Affecting Performance Type Report
Year 2012 Publication CVC Technical Report Abbreviated Journal
Volume 172 Issue Pages
Keywords Computer vision, eye-tracking, gaussian process, feature selection, optical flow
Abstract In recent years commercial eye-tracking hardware has become more common, with the introduction of new models from several brands that have better performance and easier setup procedures. A cause and at the same time a result of this phenomenon is the popularity of eye-tracking research directed at marketing, accessibility and usability, among others.
One problem with these hardware components is scalability, because both the price and the expertise needed to operate them make large-scale use practically impossible. In this work, we analyze the feasibility of a software eye-tracking system based on a single, ordinary webcam. Our aim is to discover the limits of such a system and to see whether it provides acceptable performance.
The significance of this setup is that it is the most common one found in consumer environments, in off-the-shelf electronic devices such as laptops, mobile phones and tablet computers. As no special equipment such as infrared lights, mirrors or zoom lenses is used, setting up and calibrating the system is easier than with other approaches that use these components.
Our work is based on the open source application Opengazer, which provides a good starting point for our contributions. We propose several improvements in order to push the system's performance further and make it feasible as a robust, real-time device. Then we carry out an elaborate experiment involving 18 human subjects and 4 different system setups. Finally, we give an analysis of the results and discuss the effects of setup changes, subject differences and modifications in the software.
Address Bellaterra
Corporate Author Computer Vision Center Thesis Master's thesis
Publisher Place of Publication Editor Fernando Vilariño
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MV Approved no
Call Number Admin @ si @ Fer2012; IAM @ iam @ Fer2012 Serial 2165
 

 
Author Alicia Fornes; Anjan Dutta; Albert Gordo; Josep Llados
Title CVC-MUSCIMA: A Ground-Truth of Handwritten Music Score Images for Writer Identification and Staff Removal Type Journal Article
Year 2012 Publication International Journal on Document Analysis and Recognition Abbreviated Journal IJDAR
Volume 15 Issue 3 Pages 243-251
Keywords Music scores; Handwritten documents; Writer identification; Staff removal; Performance evaluation; Graphics recognition; Ground truths
Abstract 0,405JCR
The analysis of music scores has been an active research field in recent decades. However, there are no publicly available databases of handwritten music scores for the research community. In this paper we present the CVC-MUSCIMA database and ground-truth of handwritten music score images. The dataset consists of 1,000 music sheets written by 50 different musicians. It has been especially designed for writer identification and staff removal tasks. In addition to the description of the dataset, ground-truth, partitioning and evaluation metrics, we also provide some baseline results to ease the comparison between different approaches.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1433-2833 ISBN Medium
Area Expedition Conference
Notes DAG Approved no
Call Number Admin @ si @ FDG2012 Serial 2129
 

 
Author Volkmar Frinken; Markus Baumgartner; Andreas Fischer; Horst Bunke
Title Semi-Supervised Learning for Cursive Handwriting Recognition using Keyword Spotting Type Conference Article
Year 2012 Publication 13th International Conference on Frontiers in Handwriting Recognition Abbreviated Journal
Volume Issue Pages 49-54
Keywords
Abstract State-of-the-art handwriting recognition systems are learning-based systems that require large sets of training data. The creation of training data, and consequently the creation of a well-performing recognition system, therefore requires a substantial amount of human work. This can be reduced with semi-supervised learning, which uses unlabeled text lines for training as well. Current approaches estimate the correct transcription of the unlabeled data via handwriting recognition, which is not only extremely demanding in terms of computational cost but also requires a good model of the target language. In this paper, we propose a different approach that makes use of keyword spotting, which is significantly faster and does not need any language model. In a set of experiments we demonstrate its superiority over existing approaches.
Address Bari, Italy
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 10.1109/ICFHR.2012.268 ISBN 978-1-4673-2262-1 Medium
Area Expedition Conference ICFHR
Notes DAG Approved no
Call Number Admin @ si @ FBF2012 Serial 2055
 

 
Author Sergio Escalera
Title Human Behavior Analysis From Depth Maps Type Conference Article
Year 2012 Publication 7th Conference on Articulated Motion and Deformable Objects Abbreviated Journal
Volume 7378 Issue Pages 282-292
Keywords
Abstract Pose Recovery (PR) and Human Behavior Analysis (HBA) have been a main focus of interest from the beginnings of Computer Vision and Machine Learning. PR and HBA were originally addressed by the analysis of still images and image sequences. More recent strategies consisted of Motion Capture technology (MOCAP), based on the synchronization of multiple cameras in controlled environments; and the analysis of depth maps from Time-of-Flight (ToF) technology, based on range image recording from distance sensor measurements. Recently, with the appearance of the multi-modal RGBD information provided by the low cost Kinect™ sensor (from RGB and Depth, respectively), classical methods for PR and HBA have been redefined, and new strategies have been proposed. In this paper, the recent contributions and future trends of multi-modal RGBD data analysis for PR and HBA are reviewed and discussed.
Address Mallorca
Corporate Author Thesis
Publisher Springer Heidelberg Place of Publication Editor F.J. Perales; R.B. Fisher; T.B. Moeslund
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 0302-9743 ISBN 978-3-642-31566-4 Medium
Area Expedition Conference AMDO
Notes MILAB; HuPBA Approved no
Call Number Admin @ si @ Esc2012 Serial 2040
 

 
Author Sergio Escalera; Josep Moya; Laura Igual; Veronica Violant; Maria Teresa Anguera
Title Análisis Comportamental Automatizado de TDAH: la Influencia de la Variable Motivación Type Conference Article
Year 2012 Publication IPSI – Cosmocaixa, Jornadas "Empremtes del present, efectes en la psicoanàlisi, la cultura i la societat" Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Poster
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference IPSI
Notes MILAB; HuPBA; OR Approved no
Call Number Admin @ si @ EMI2012b Serial 2065
 

 
Author Sergio Escalera; Josep Moya; Laura Igual; Veronica Violant; Maria Teresa Anguera
Title Automatic Human Behavior Analysis in ADHD Type Conference Article
Year 2012 Publication Eunethydis 2nd International ADHD Conference Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Poster
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference EUNETHYDIS
Notes MILAB;HuPBA Approved no
Call Number Admin @ si @ EMI2012a Serial 2058
 

 
Author Noha Elfiky
Title Compact, Adaptive and Discriminative Spatial Pyramids for Improved Object and Scene Classification Type Book Whole
Year 2012 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract The release of challenging datasets with a vast number of images requires the development of efficient image representations and algorithms that can handle these large-scale datasets efficiently. Nowadays the Bag-of-Words (BoW) is the most successful approach in the context of object and scene classification tasks. However, its main drawback is the absence of important spatial information. Spatial pyramids (SP) have been successfully applied to incorporate spatial information into BoW-based image representation. Given the remarkable performance of spatial pyramids, their growing number of applications to a broad range of vision problems, and their inclusion of geometry, one may ask what the limits of spatial pyramids are. Within the SP framework, the optimal way of obtaining an image spatial representation that copes with its foremost shortcomings, namely its high dimensionality and the rigidity of the resulting image representation, remains an active research domain. In summary, the main concern of this thesis is to search for the limits of spatial pyramids and to find solutions for them.
Address
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Jordi Gonzalez;Xavier Roca
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ISE Approved no
Call Number Admin @ si @ Elf2012 Serial 2202
 

 
Author Noha Elfiky; Fahad Shahbaz Khan; Joost Van de Weijer; Jordi Gonzalez
Title Discriminative Compact Pyramids for Object and Scene Recognition Type Journal Article
Year 2012 Publication Pattern Recognition Abbreviated Journal PR
Volume 45 Issue 4 Pages 1627-1636
Keywords
Abstract Spatial pyramids have been successfully applied to incorporating spatial information into bag-of-words based image representation. However, a major drawback is that it leads to high dimensional image representations. In this paper, we present a novel framework for obtaining compact pyramid representation. First, we investigate the usage of the divisive information theoretic feature clustering (DITC) algorithm in creating a compact pyramid representation. In many cases this method allows us to reduce the size of a high dimensional pyramid representation up to an order of magnitude with little or no loss in accuracy. Furthermore, comparison to clustering based on agglomerative information bottleneck (AIB) shows that our method obtains superior results at significantly lower computational costs. Moreover, we investigate the optimal combination of multiple features in the context of our compact pyramid representation. Finally, experiments show that the method can obtain state-of-the-art results on several challenging data sets.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 0031-3203 ISBN Medium
Area Expedition Conference
Notes ISE; CAT;CIC Approved no
Call Number Admin @ si @ EKW2012 Serial 1807
 

 
Author Noha Elfiky; Jordi Gonzalez; Xavier Roca
Title Compact and Adaptive Spatial Pyramids for Scene Recognition Type Journal Article
Year 2012 Publication Image and Vision Computing Abbreviated Journal IMAVIS
Volume 30 Issue 8 Pages 492–500
Keywords
Abstract Most successful approaches to scene recognition tend to efficiently combine global image features with spatial local appearance and shape cues. On the other hand, less attention has been devoted to studying spatial texture features within scenes. Our method is based on the insight that scenes can be seen as a composition of micro-texture patterns. This paper analyzes the role of texture along with its spatial layout for scene recognition. However, one main drawback of the resulting spatial representation is its huge dimensionality. Hence, we propose a technique that addresses this problem by presenting a compact Spatial Pyramid (SP) representation. The basis of our compact representation, namely the Compact Adaptive Spatial Pyramid (CASP), consists of a two-stage compression strategy. This strategy is based on the Agglomerative Information Bottleneck (AIB) theory for (i) compressing the least informative SP features, and (ii) automatically learning the most appropriate shape for each category. Our method exceeds the state-of-the-art results on several challenging scene recognition data sets.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ISE Approved no
Call Number Admin @ si @ EGR2012 Serial 2004
 

 
Author Ivo Everts; Jan van Gemert; Theo Gevers
Title Per-patch Descriptor Selection using Surface and Scene Properties Type Conference Article
Year 2012 Publication 12th European Conference on Computer Vision Abbreviated Journal
Volume 7577 Issue VI Pages 172-186
Keywords
Abstract Local image descriptors are generally designed for describing all possible image patches. Such patches may be subject to complex variations in appearance due to incidental object, scene and recording conditions. Because of this, a single-best descriptor for accurate image representation under all conditions does not exist. Therefore, we propose to automatically select from a pool of descriptors the one that is most suitable based on object surface and scene properties. These properties are measured on the fly from a single image patch through a set of attributes. Attributes are input to a classifier which selects the best descriptor. Our experiments on a large dataset of colored object patches show that the proposed selection method outperforms the best single descriptor and a-priori combinations of the descriptor pool.
Address Florence, Italy
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN 0302-9743 ISBN 978-3-642-33782-6 Medium
Area Expedition Conference ECCV
Notes ALTRES;ISE Approved no
Call Number Admin @ si @ EGG2012 Serial 2023
 

 
Author Sergio Escalera; Xavier Baro; Jordi Vitria; Petia Radeva; Bogdan Raducanu
Title Social Network Extraction and Analysis Based on Multimodal Dyadic Interaction Type Journal Article
Year 2012 Publication Sensors Abbreviated Journal SENS
Volume 12 Issue 2 Pages 1702-1719
Keywords
Abstract IF=1.77 (2010)
Social interactions are a very important component in people’s lives. Social network analysis has become a common technique used to model and quantify the properties of social interactions. In this paper, we propose an integrated framework to explore the characteristics of a social network extracted from multimodal dyadic interactions. For our study, we used a set of videos belonging to New York Times’ Blogging Heads opinion blog.
The Social Network is represented as an oriented graph, whose directed links are determined by the Influence Model. The links’ weights are a measure of the “influence” a person has over the other. The states of the Influence Model encode automatically extracted audio/visual features from our videos using state-of-the-art algorithms. Our results are reported in terms of accuracy of audio/visual data fusion for speaker segmentation and centrality measures used to characterize the extracted social network.
Address
Corporate Author Thesis
Publisher Molecular Diversity Preservation International Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MILAB; OR;HuPBA;MV Approved no
Call Number Admin @ si @ EBV2012 Serial 1885
 

 
Author Thanh Ha Do; Salvatore Tabbone; Oriol Ramos Terrades
Title Noise suppression over bi-level graphical documents using a sparse representation Type Conference Article
Year 2012 Publication Colloque International Francophone sur l'Écrit et le Document Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address Bordeaux
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CIFED
Notes DAG Approved no
Call Number Admin @ si @ DTR2012b Serial 2136
 

 
Author Thanh Ha Do; Salvatore Tabbone; Oriol Ramos Terrades
Title Text/graphic separation using a sparse representation with multi-learned dictionaries Type Conference Article
Year 2012 Publication 21st International Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages
Keywords Graphics Recognition; Layout Analysis; Document Understanding
Abstract In this paper, we propose a new approach to extract text regions from graphical documents. In our method, we first empirically construct two sequences of learned dictionaries for the text and graphical parts respectively. Then, we compute the sparse representations of non-overlapping document patches of different sizes in these learned dictionaries. Based on these representations, each patch can be classified into the text or graphic category by comparing its reconstruction errors. Same-sized patches in one category are then merged together to define the corresponding text or graphic layers, which are combined to create the final text/graphic layer. Finally, in a post-processing step, text regions are further filtered out by using some learned thresholds.
Address Tsukuba
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICPR
Notes DAG Approved no
Call Number Admin @ si @ DTR2012a Serial 2135