Publicacions CVC -- Query Results


Author	David Fernandez; Josep Llados; Alicia Fornes; R.Manmatha
Title	On Influence of Line Segmentation in Efficient Word Segmentation in Old Manuscripts			Type	Conference Article
Year	2012	Publication	13th International Conference on Frontiers in Handwriting Recognition	Abbreviated Journal
Volume		Issue		Pages	763-768
Keywords	document image processing;handwritten character recognition;history;image segmentation;Spanish document;historical document;line segmentation;old handwritten document;old manuscript;word segmentation;Bifurcation;Dynamic programming;Handwriting recognition;Image segmentation;Measurement;Noise;Skeleton;Segmentation;document analysis;document and text processing;handwriting analysis;heuristics;path-finding
Abstract	The objective of this work is to show the importance of good line segmentation for obtaining better results in the segmentation of words in historical documents. We have used the approach developed by Manmatha and Rothfeder [1] to segment words in old handwritten documents. In their work, the lines of the documents are extracted using projections. In this work, we have developed an approach to segment lines more efficiently. The new line segmentation algorithm copes with skewed, touching and noisy lines, so it significantly improves word segmentation. Experiments using Spanish documents from the Marriages Database of the Barcelona Cathedral show that this approach reduces the error rate by more than 20%.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN	978-1-4673-2262-1	Medium
Area		Expedition		Conference	ICFHR
Notes	DAG			Approved	no
Call Number	Admin @ si @ FLF2012			Serial	2200
Permanent link to this record



Author	Volkmar Frinken; Alicia Fornes; Josep Llados; Jean-Marc Ogier
Title	Bidirectional Language Model for Handwriting Recognition			Type	Conference Article
Year	2012	Publication	Structural, Syntactic, and Statistical Pattern Recognition, Joint IAPR International Workshop	Abbreviated Journal
Volume	7626	Issue		Pages	611-619
Keywords
Abstract	In order to improve the results of automatically recognized handwritten text, information about the language is commonly included in the recognition process. A common approach is to represent a text line as a sequence. It is processed in one direction and the language information via n-grams is directly included in the decoding. This approach, however, only uses context on one side to estimate a word’s probability. Therefore, we propose a bidirectional recognition in this paper, using distinct forward and backward language models. By combining decoding hypotheses from both directions, we achieve a significant increase in recognition accuracy for the off-line writer-independent handwriting recognition task. Both language models are of the same type and can be estimated on the same corpus. Hence, the increase in recognition accuracy comes without any additional need for training data or language modeling complexity.
Address	Japan
Corporate Author				Thesis
Publisher	Springer Berlin Heidelberg	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN	0302-9743	ISBN	978-3-642-34165-6	Medium
Area		Expedition		Conference	SSPR&SPR
Notes	DAG			Approved	no
Call Number	Admin @ si @ FFL2012			Serial	2057
Permanent link to this record



Author	Onur Ferhat
Title	Eye-Tracking with Webcam-Based Setups: Implementation of a Real-Time System and an Analysis of Factors Affecting Performance			Type	Report
Year	2012	Publication	CVC Technical Report	Abbreviated Journal
Volume	172	Issue		Pages
Keywords	Computer vision, eye-tracking, gaussian process, feature selection, optical flow
Abstract	In recent years commercial eye-tracking hardware has become more common, with the introduction of new models from several brands that have better performance and easier setup procedures. A cause and at the same time a result of this phenomenon is the popularity of eye-tracking research directed at marketing, accessibility and usability, among others. One problem with these hardware components is scalability, because both the price and the expertise needed to operate them make large-scale use practically impossible. In this work, we analyze the feasibility of a software eye-tracking system based on a single, ordinary webcam. Our aim is to discover the limits of such a system and to see whether it provides acceptable performance. The significance of this setup is that it is the most common setup found in consumer environments, in off-the-shelf electronic devices such as laptops, mobile phones and tablet computers. As no special equipment such as infrared lights, mirrors or zoom lenses is used, setting up and calibrating the system is easier compared to other approaches using these components. Our work is based on the open source application Opengazer, which provides a good starting point for our contributions. We propose several improvements in order to push the system's performance further and make it feasible as a robust, real-time device. Then we carry out an elaborate experiment involving 18 human subjects and 4 different system setups. Finally, we give an analysis of the results and discuss the effects of setup changes, subject differences and modifications in the software.
Address	Bellaterra
Corporate Author	Computer Vision Center			Thesis	Master's thesis
Publisher		Place of Publication		Editor	Fernando Vilariño
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	MV			Approved	no
Call Number	Admin @ si @ Fer2012; IAM @ iam @ Fer2012			Serial	2165
Permanent link to this record



Author	Alicia Fornes; Anjan Dutta; Albert Gordo; Josep Llados
Title	CVC-MUSCIMA: A Ground-Truth of Handwritten Music Score Images for Writer Identification and Staff Removal			Type	Journal Article
Year	2012	Publication	International Journal on Document Analysis and Recognition	Abbreviated Journal	IJDAR
Volume	15	Issue	3	Pages	243-251
Keywords	Music scores; Handwritten documents; Writer identification; Staff removal; Performance evaluation; Graphics recognition; Ground truths
Abstract	JCR: 0.405. The analysis of music scores has been an active research field in recent decades. However, there are no publicly available databases of handwritten music scores for the research community. In this paper we present the CVC-MUSCIMA database and ground-truth of handwritten music score images. The dataset consists of 1,000 music sheets written by 50 different musicians. It has been especially designed for writer identification and staff removal tasks. In addition to the description of the dataset, ground-truth, partitioning and evaluation metrics, we also provide some baseline results to ease the comparison between different approaches.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	1433-2833	ISBN		Medium
Area		Expedition		Conference
Notes	DAG			Approved	no
Call Number	Admin @ si @ FDG2012			Serial	2129
Permanent link to this record



Author	Volkmar Frinken; Markus Baumgartner; Andreas Fischer; Horst Bunke
Title	Semi-Supervised Learning for Cursive Handwriting Recognition using Keyword Spotting			Type	Conference Article
Year	2012	Publication	13th International Conference on Frontiers in Handwriting Recognition	Abbreviated Journal
Volume		Issue		Pages	49-54
Keywords
Abstract	State-of-the-art handwriting recognition systems are learning-based systems that require large sets of training data. The creation of training data, and consequently the creation of a well-performing recognition system, therefore requires a substantial amount of human work. This can be reduced with semi-supervised learning, which uses unlabeled text lines for training as well. Current approaches estimate the correct transcription of the unlabeled data via handwriting recognition, which is not only computationally very demanding but also requires a good model of the target language. In this paper, we propose a different approach that makes use of keyword spotting, which is significantly faster and does not need any language model. In a set of experiments we demonstrate its superiority over existing approaches.
Address	Bari, Italy
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	10.1109/ICFHR.2012.268	ISBN	978-1-4673-2262-1	Medium
Area		Expedition		Conference	ICFHR
Notes	DAG			Approved	no
Call Number	Admin @ si @ FBF2012			Serial	2055
Permanent link to this record



Author	Sergio Escalera
Title	Human Behavior Analysis From Depth Maps			Type	Conference Article
Year	2012	Publication	7th Conference on Articulated Motion and Deformable Objects	Abbreviated Journal
Volume	7378	Issue		Pages	282-292
Keywords
Abstract	Pose Recovery (PR) and Human Behavior Analysis (HBA) have been a main focus of interest from the beginnings of Computer Vision and Machine Learning. PR and HBA were originally addressed by the analysis of still images and image sequences. More recent strategies consisted of Motion Capture technology (MOCAP), based on the synchronization of multiple cameras in controlled environments; and the analysis of depth maps from Time-of-Flight (ToF) technology, based on range image recording from distance sensor measurements. Recently, with the appearance of the multi-modal RGBD information provided by the low cost Kinect™ sensor (from RGB and Depth, respectively), classical methods for PR and HBA have been redefined, and new strategies have been proposed. In this paper, the recent contributions and future trends of multi-modal RGBD data analysis for PR and HBA are reviewed and discussed.
Address	Mallorca
Corporate Author				Thesis
Publisher	Springer Heidelberg	Place of Publication		Editor	F.J. Perales; R.B. Fisher; T.B. Moeslund
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	0302-9743	ISBN	978-3-642-31566-4	Medium
Area		Expedition		Conference	AMDO
Notes	MILAB; HuPBA			Approved	no
Call Number	Admin @ si @ Esc2012			Serial	2040
Permanent link to this record



Author	Sergio Escalera; Josep Moya; Laura Igual; Veronica Violant; Maria Teresa Anguera
Title	Análisis Comportamental Automatizado de TDAH: la Influencia de la Variable Motivación			Type	Conference Article
Year	2012	Publication	IPSI – Cosmocaixa, Jornadas "Empremtes del present, efectes en la psicoanàlisi, la cultura i la societat"	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Poster
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	IPSI
Notes	MILAB; HuPBA; OR			Approved	no
Call Number	Admin @ si @ EMI2012b			Serial	2065
Permanent link to this record



Author	Sergio Escalera; Josep Moya; Laura Igual; Veronica Violant; Maria Teresa Anguera
Title	Automatic Human Behavior Analysis in ADHD			Type	Conference Article
Year	2012	Publication	Eunethydis 2nd International ADHD Conference	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Poster
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	EUNETHYDIS
Notes	MILAB;HuPBA			Approved	no
Call Number	Admin @ si @ EMI2012a			Serial	2058
Permanent link to this record



Author	Noha Elfiky
Title	Compact, Adaptive and Discriminative Spatial Pyramids for Improved Object and Scene Classification			Type	Book Whole
Year	2012	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	The release of challenging datasets with a vast number of images requires the development of efficient image representations and algorithms which are able to manipulate these large-scale datasets efficiently. Nowadays the Bag-of-Words (BoW) is the most successful approach in the context of object and scene classification tasks. However, its main drawback is the absence of important spatial information. Spatial pyramids (SP) have been successfully applied to incorporate spatial information into BoW-based image representation. Given the remarkable performance of spatial pyramids, their growing number of applications to a broad range of vision problems, and their inclusion of geometry, one may ask what the limits of spatial pyramids are. Within the SP framework, the optimal way of obtaining an image spatial representation that copes with its foremost shortcomings, namely its high dimensionality and the rigidity of the resulting image representation, remains an active research domain. In summary, the main concern of this thesis is to search for the limits of spatial pyramids and to find solutions for them.
Address
Corporate Author				Thesis	Ph.D. thesis
Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Jordi Gonzalez;Xavier Roca
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ISE			Approved	no
Call Number	Admin @ si @ Elf2012			Serial	2202
Permanent link to this record



Author	Noha Elfiky; Fahad Shahbaz Khan; Joost Van de Weijer; Jordi Gonzalez
Title	Discriminative Compact Pyramids for Object and Scene Recognition			Type	Journal Article
Year	2012	Publication	Pattern Recognition	Abbreviated Journal	PR
Volume	45	Issue	4	Pages	1627-1636
Keywords
Abstract	Spatial pyramids have been successfully applied to incorporating spatial information into bag-of-words based image representation. However, a major drawback is that they lead to high dimensional image representations. In this paper, we present a novel framework for obtaining compact pyramid representation. First, we investigate the usage of the divisive information theoretic feature clustering (DITC) algorithm in creating a compact pyramid representation. In many cases this method allows us to reduce the size of a high dimensional pyramid representation by up to an order of magnitude with little or no loss in accuracy. Furthermore, comparison to clustering based on agglomerative information bottleneck (AIB) shows that our method obtains superior results at significantly lower computational costs. Moreover, we investigate the optimal combination of multiple features in the context of our compact pyramid representation. Finally, experiments show that the method can obtain state-of-the-art results on several challenging data sets.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	0031-3203	ISBN		Medium
Area		Expedition		Conference
Notes	ISE; CAT;CIC			Approved	no
Call Number	Admin @ si @ EKW2012			Serial	1807
Permanent link to this record



Author	Noha Elfiky; Jordi Gonzalez; Xavier Roca
Title	Compact and Adaptive Spatial Pyramids for Scene Recognition			Type	Journal Article
Year	2012	Publication	Image and Vision Computing	Abbreviated Journal	IMAVIS
Volume	30	Issue	8	Pages	492-500
Keywords
Abstract	Most successful approaches to scene recognition tend to efficiently combine global image features with spatial local appearance and shape cues. On the other hand, less attention has been devoted to studying spatial texture features within scenes. Our method is based on the insight that scenes can be seen as a composition of micro-texture patterns. This paper analyzes the role of texture along with its spatial layout for scene recognition. However, one main drawback of the resulting spatial representation is its huge dimensionality. Hence, we propose a technique that addresses this problem by presenting a compact Spatial Pyramid (SP) representation. The basis of our compact representation, namely, Compact Adaptive Spatial Pyramid (CASP), consists of a two-stage compression strategy. This strategy is based on the Agglomerative Information Bottleneck (AIB) theory for (i) compressing the least informative SP features, and (ii) automatically learning the most appropriate shape for each category. Our method exceeds the state-of-the-art results on several challenging scene recognition data sets.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ISE			Approved	no
Call Number	Admin @ si @ EGR2012			Serial	2004
Permanent link to this record



Author	Ivo Everts; Jan van Gemert; Theo Gevers
Title	Per-patch Descriptor Selection using Surface and Scene Properties			Type	Conference Article
Year	2012	Publication	12th European Conference on Computer Vision	Abbreviated Journal
Volume	7577	Issue	VI	Pages	172-186
Keywords
Abstract	Local image descriptors are generally designed for describing all possible image patches. Such patches may be subject to complex variations in appearance due to incidental object, scene and recording conditions. Because of this, a single-best descriptor for accurate image representation under all conditions does not exist. Therefore, we propose to automatically select from a pool of descriptors the one that is most suitable based on object surface and scene properties. These properties are measured on the fly from a single image patch through a set of attributes. Attributes are input to a classifier which selects the best descriptor. Our experiments on a large dataset of colored object patches show that the proposed selection method outperforms the best single descriptor and a-priori combinations of the descriptor pool.
Address	Florence, Italy
Corporate Author				Thesis
Publisher	Springer Berlin Heidelberg	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN	0302-9743	ISBN	978-3-642-33782-6	Medium
Area		Expedition		Conference	ECCV
Notes	ALTRES;ISE			Approved	no
Call Number	Admin @ si @ EGG2012			Serial	2023
Permanent link to this record



Author	Sergio Escalera; Xavier Baro; Jordi Vitria; Petia Radeva; Bogdan Raducanu
Title	Social Network Extraction and Analysis Based on Multimodal Dyadic Interaction			Type	Journal Article
Year	2012	Publication	Sensors	Abbreviated Journal	SENS
Volume	12	Issue	2	Pages	1702-1719
Keywords
Abstract	IF=1.77 (2010). Social interactions are a very important component in people's lives. Social network analysis has become a common technique used to model and quantify the properties of social interactions. In this paper, we propose an integrated framework to explore the characteristics of a social network extracted from multimodal dyadic interactions. For our study, we used a set of videos belonging to the New York Times' Blogging Heads opinion blog. The Social Network is represented as an oriented graph, whose directed links are determined by the Influence Model. The links' weights are a measure of the “influence” a person has over the other. The states of the Influence Model encode automatically extracted audio/visual features from our videos using state-of-the-art algorithms. Our results are reported in terms of accuracy of audio/visual data fusion for speaker segmentation and centrality measures used to characterize the extracted social network.
Address
Corporate Author				Thesis
Publisher	Molecular Diversity Preservation International	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	MILAB; OR;HuPBA;MV			Approved	no
Call Number	Admin @ si @ EBV2012			Serial	1885
Permanent link to this record



Author	Thanh Ha Do; Salvatore Tabbone; Oriol Ramos Terrades
Title	Noise suppression over bi-level graphical documents using a sparse representation			Type	Conference Article
Year	2012	Publication	Colloque International Francophone sur l'Écrit et le Document	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract
Address	Bordeaux
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	CIFED
Notes	DAG			Approved	no
Call Number	Admin @ si @ DTR2012b			Serial	2136
Permanent link to this record



Author	Thanh Ha Do; Salvatore Tabbone; Oriol Ramos Terrades
Title	Text/graphic separation using a sparse representation with multi-learned dictionaries			Type	Conference Article
Year	2012	Publication	21st International Conference on Pattern Recognition	Abbreviated Journal
Volume		Issue		Pages
Keywords	Graphics Recognition; Layout Analysis; Document Understanding
Abstract	In this paper, we propose a new approach to extract text regions from graphical documents. In our method, we first empirically construct two sequences of learned dictionaries for the text and graphical parts respectively. Then, we compute the sparse representations of non-overlapping document patches of all different sizes in these learned dictionaries. Based on these representations, each patch can be classified into the text or graphic category by comparing its reconstruction errors. Same-sized patches in one category are then merged together to define the corresponding text or graphic layers, which are combined to create the final text/graphic layer. Finally, in a post-processing step, text regions are further filtered out by using some learned thresholds.
Address	Tsukuba
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICPR
Notes	DAG			Approved	no
Call Number	Admin @ si @ DTR2012a			Serial	2135
Permanent link to this record