Records | |||||
---|---|---|---|---|---|
Author | David Fernandez; Josep Llados; Alicia Fornes; R. Manmatha | ||||
Title | On Influence of Line Segmentation in Efficient Word Segmentation in Old Manuscripts | Type | Conference Article | ||
Year | 2012 | Publication | 13th International Conference on Frontiers in Handwriting Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 763-768 | ||
Keywords | document image processing;handwritten character recognition;history;image segmentation;Spanish document;historical document;line segmentation;old handwritten document;old manuscript;word segmentation;Bifurcation;Dynamic programming;Handwriting recognition;Image segmentation;Measurement;Noise;Skeleton;Segmentation;document analysis;document and text processing;handwriting analysis;heuristics;path-finding | ||||
Abstract | The objective of this work is to show the importance of good line segmentation for obtaining better results in the word segmentation of historical documents. We have used the approach developed by Manmatha and Rothfeder [1] to segment words in old handwritten documents. In their work, the lines of the documents are extracted using projections. In this work, we have developed an approach to segment lines more efficiently. The new line segmentation algorithm copes with skewed, touching and noisy lines, so it significantly improves word segmentation. Experiments using Spanish documents from the Marriages Database of the Barcelona Cathedral show that this approach reduces the error rate by more than 20%. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-1-4673-2262-1 | Medium | ||
Area | Expedition | Conference | ICFHR | ||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ FLF2012 | Serial | 2200 | ||
Permanent link to this record | |||||
Author | Volkmar Frinken; Alicia Fornes; Josep Llados; Jean-Marc Ogier | ||||
Title | Bidirectional Language Model for Handwriting Recognition | Type | Conference Article | ||
Year | 2012 | Publication | Structural, Syntactic, and Statistical Pattern Recognition, Joint IAPR International Workshop | Abbreviated Journal | |
Volume | 7626 | Issue | Pages | 611-619 | |
Keywords | |||||
Abstract | In order to improve the results of automatically recognized handwritten text, information about the language is commonly included in the recognition process. A common approach is to represent a text line as a sequence. It is processed in one direction and the language information via n-grams is directly included in the decoding. This approach, however, only uses context on one side to estimate a word’s probability. Therefore, we propose a bidirectional recognition in this paper, using distinct forward and backward language models. By combining decoding hypotheses from both directions, we achieve a significant increase in recognition accuracy for the off-line, writer-independent handwriting recognition task. Both language models are of the same type and can be estimated on the same corpus. Hence, the increase in recognition accuracy comes without any additional need for training data or language modeling complexity. | ||||
Address | Japan | ||||
Corporate Author | Thesis | ||||
Publisher | Springer Berlin Heidelberg | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | 0302-9743 | ISBN | 978-3-642-34165-6 | Medium | |
Area | Expedition | Conference | SSPR&SPR | ||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ FFL2012 | Serial | 2057 | ||
Permanent link to this record | |||||
Author | Onur Ferhat | ||||
Title | Eye-Tracking with Webcam-Based Setups: Implementation of a Real-Time System and an Analysis of Factors Affecting Performance | Type | Report | ||
Year | 2012 | Publication | CVC Technical Report | Abbreviated Journal | |
Volume | 172 | Issue | Pages | ||
Keywords | Computer vision, eye-tracking, gaussian process, feature selection, optical flow | ||||
Abstract | In recent years, commercial eye-tracking hardware has become more common, with the introduction of new models from several brands that offer better performance and easier setup procedures. Both a cause and a result of this phenomenon is the popularity of eye-tracking research directed at marketing, accessibility and usability, among others. One problem with these hardware components is scalability: both their price and the expertise needed to operate them make large-scale use practically impossible. In this work, we analyze the feasibility of a software eye-tracking system based on a single, ordinary webcam. Our aim is to discover the limits of such a system and to see whether it provides acceptable performance. The significance of this setup is that it is the most common one found in consumer environments and in off-the-shelf electronic devices such as laptops, mobile phones and tablet computers. Since no special equipment such as infrared lights, mirrors or zoom lenses is used, setting up and calibrating the system is easier than with approaches that rely on these components. Our work is based on the open source application Opengazer, which provides a good starting point for our contributions. We propose several improvements in order to push the system's performance further and make it feasible as a robust, real-time device. Then we carry out an elaborate experiment involving 18 human subjects and 4 different system setups. Finally, we give an analysis of the results and discuss the effects of setup changes, subject differences and modifications in the software. | ||||
Address | Bellaterra | ||||
Corporate Author | Computer Vision Center | Thesis | Master's thesis | ||
Publisher | Place of Publication | Editor | Fernando Vilariño | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MV | Approved | no | ||
Call Number | Admin @ si @ Fer2012; IAM @ iam @ Fer2012 | Serial | 2165 | ||
Permanent link to this record | |||||
Author | Alicia Fornes; Anjan Dutta; Albert Gordo; Josep Llados | ||||
Title | CVC-MUSCIMA: A Ground-Truth of Handwritten Music Score Images for Writer Identification and Staff Removal | Type | Journal Article | ||
Year | 2012 | Publication | International Journal on Document Analysis and Recognition | Abbreviated Journal | IJDAR |
Volume | 15 | Issue | 3 | Pages | 243-251 |
Keywords | Music scores; Handwritten documents; Writer identification; Staff removal; Performance evaluation; Graphics recognition; Ground truths | ||||
Abstract | 0,405 JCR. The analysis of music scores has been an active research field in recent decades. However, there are no publicly available databases of handwritten music scores for the research community. In this paper we present the CVC-MUSCIMA database and ground-truth of handwritten music score images. The dataset consists of 1,000 music sheets written by 50 different musicians. It has been especially designed for writer identification and staff removal tasks. In addition to the description of the dataset, ground-truth, partitioning and evaluation metrics, we also provide some baseline results to ease the comparison between different approaches. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1433-2833 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ FDG2012 | Serial | 2129 | ||
Permanent link to this record | |||||
Author | Volkmar Frinken; Markus Baumgartner; Andreas Fischer; Horst Bunke | ||||
Title | Semi-Supervised Learning for Cursive Handwriting Recognition using Keyword Spotting | Type | Conference Article | ||
Year | 2012 | Publication | 13th International Conference on Frontiers in Handwriting Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 49-54 | ||
Keywords | |||||
Abstract | State-of-the-art handwriting recognition systems are learning-based systems that require large sets of training data. The creation of training data, and consequently the creation of a well-performing recognition system, therefore requires a substantial amount of human work. This can be reduced with semi-supervised learning, which uses unlabeled text lines for training as well. Current approaches estimate the correct transcription of the unlabeled data via handwriting recognition, which is not only computationally very demanding but also requires a good model of the target language. In this paper, we propose a different approach that makes use of keyword spotting, which is significantly faster and does not need any language model. In a set of experiments we demonstrate its superiority over existing approaches. | ||||
Address | Bari, Italy | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 10.1109/ICFHR.2012.268 | ISBN | 978-1-4673-2262-1 | Medium | |
Area | Expedition | Conference | ICFHR | ||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ FBF2012 | Serial | 2055 | ||
Permanent link to this record | |||||
Author | Sergio Escalera | ||||
Title | Human Behavior Analysis From Depth Maps | Type | Conference Article | ||
Year | 2012 | Publication | 7th Conference on Articulated Motion and Deformable Objects | Abbreviated Journal | |
Volume | 7378 | Issue | Pages | 282-292 | |
Keywords | |||||
Abstract | Pose Recovery (PR) and Human Behavior Analysis (HBA) have been a main focus of interest from the beginnings of Computer Vision and Machine Learning. PR and HBA were originally addressed by the analysis of still images and image sequences. More recent strategies consisted of Motion Capture technology (MOCAP), based on the synchronization of multiple cameras in controlled environments; and the analysis of depth maps from Time-of-Flight (ToF) technology, based on range image recording from distance sensor measurements. Recently, with the appearance of the multi-modal RGBD information provided by the low cost Kinect™ sensor (from RGB and Depth, respectively), classical methods for PR and HBA have been redefined, and new strategies have been proposed. In this paper, the recent contributions and future trends of multi-modal RGBD data analysis for PR and HBA are reviewed and discussed. | ||||
Address | Mallorca | ||||
Corporate Author | Thesis | ||||
Publisher | Springer Heidelberg | Place of Publication | Editor | F.J. Perales; R.B. Fisher; T.B. Moeslund | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0302-9743 | ISBN | 978-3-642-31566-4 | Medium | |
Area | Expedition | Conference | AMDO | ||
Notes | MILAB; HuPBA | Approved | no | ||
Call Number | Admin @ si @ Esc2012 | Serial | 2040 | ||
Permanent link to this record | |||||
Author | Sergio Escalera; Josep Moya; Laura Igual; Veronica Violant; Maria Teresa Anguera | ||||
Title | Análisis Comportamental Automatizado de TDAH: la Influencia de la Variable Motivación | Type | Conference Article | ||
Year | 2012 | Publication | IPSI – Cosmocaixa, Jornadas "Empremtes del present, efectes en la psicoanàlisi, la cultura i la societat" | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Poster | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | IPSI | ||
Notes | MILAB; HuPBA; OR | Approved | no | ||
Call Number | Admin @ si @ EMI2012b | Serial | 2065 | ||
Permanent link to this record | |||||
Author | Sergio Escalera; Josep Moya; Laura Igual; Veronica Violant; Maria Teresa Anguera | ||||
Title | Automatic Human Behavior Analysis in ADHD | Type | Conference Article | ||
Year | 2012 | Publication | Eunethydis 2nd International ADHD Conference | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Poster | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | EUNETHYDIS | ||
Notes | MILAB;HuPBA | Approved | no | ||
Call Number | Admin @ si @ EMI2012a | Serial | 2058 | ||
Permanent link to this record | |||||
Author | Noha Elfiky | ||||
Title | Compact, Adaptive and Discriminative Spatial Pyramids for Improved Object and Scene Classification | Type | Book Whole | ||
Year | 2012 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | The release of challenging datasets with a vast number of images requires the development of efficient image representations and algorithms that can handle these large-scale datasets efficiently. Nowadays, Bag-of-Words (BoW) is the most successful approach for object and scene classification tasks. However, its main drawback is the absence of important spatial information. Spatial pyramids (SP) have been successfully applied to incorporate spatial information into BoW-based image representations. Given the remarkable performance of spatial pyramids, their growing number of applications to a broad range of vision problems, and their inclusion of geometry, one can ask what the limits of spatial pyramids are. Within the SP framework, the optimal way to obtain a spatial image representation that copes with its foremost shortcomings, namely its high dimensionality and the rigidity of the resulting image representation, remains an active research domain. In summary, the main concern of this thesis is to search for the limits of spatial pyramids and to find solutions for them. | ||||
Address | |||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | Jordi Gonzalez;Xavier Roca | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ISE | Approved | no | ||
Call Number | Admin @ si @ Elf2012 | Serial | 2202 | ||
Permanent link to this record | |||||
Author | Noha Elfiky; Fahad Shahbaz Khan; Joost Van de Weijer; Jordi Gonzalez | ||||
Title | Discriminative Compact Pyramids for Object and Scene Recognition | Type | Journal Article | ||
Year | 2012 | Publication | Pattern Recognition | Abbreviated Journal | PR |
Volume | 45 | Issue | 4 | Pages | 1627-1636 |
Keywords | |||||
Abstract | Spatial pyramids have been successfully applied to incorporate spatial information into bag-of-words based image representations. However, a major drawback is that they lead to high dimensional image representations. In this paper, we present a novel framework for obtaining a compact pyramid representation. First, we investigate the usage of the divisive information theoretic feature clustering (DITC) algorithm in creating a compact pyramid representation. In many cases this method allows us to reduce the size of a high dimensional pyramid representation by up to an order of magnitude with little or no loss in accuracy. Furthermore, comparison to clustering based on agglomerative information bottleneck (AIB) shows that our method obtains superior results at significantly lower computational costs. Moreover, we investigate the optimal combination of multiple features in the context of our compact pyramid representation. Finally, experiments show that the method can obtain state-of-the-art results on several challenging data sets. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0031-3203 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | ISE; CAT;CIC | Approved | no | ||
Call Number | Admin @ si @ EKW2012 | Serial | 1807 | ||
Permanent link to this record | |||||
Author | Noha Elfiky; Jordi Gonzalez; Xavier Roca | ||||
Title | Compact and Adaptive Spatial Pyramids for Scene Recognition | Type | Journal Article | ||
Year | 2012 | Publication | Image and Vision Computing | Abbreviated Journal | IMAVIS |
Volume | 30 | Issue | 8 | Pages | 492–500 |
Keywords | |||||
Abstract | Most successful approaches to scene recognition tend to efficiently combine global image features with spatial local appearance and shape cues. On the other hand, less attention has been devoted to studying spatial texture features within scenes. Our method is based on the insight that scenes can be seen as a composition of micro-texture patterns. This paper analyzes the role of texture along with its spatial layout for scene recognition. However, one main drawback of the resulting spatial representation is its huge dimensionality. Hence, we propose a technique that addresses this problem by presenting a compact Spatial Pyramid (SP) representation. The basis of our compact representation, namely the Compact Adaptive Spatial Pyramid (CASP), consists of a two-stage compression strategy. This strategy is based on the Agglomerative Information Bottleneck (AIB) theory for (i) compressing the least informative SP features, and (ii) automatically learning the most appropriate shape for each category. Our method exceeds the state-of-the-art results on several challenging scene recognition data sets. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ISE | Approved | no | ||
Call Number | Admin @ si @ EGR2012 | Serial | 2004 | ||
Permanent link to this record | |||||
Author | Ivo Everts; Jan van Gemert; Theo Gevers | ||||
Title | Per-patch Descriptor Selection using Surface and Scene Properties | Type | Conference Article | ||
Year | 2012 | Publication | 12th European Conference on Computer Vision | Abbreviated Journal | |
Volume | 7577 | Issue | VI | Pages | 172-186 |
Keywords | |||||
Abstract | Local image descriptors are generally designed for describing all possible image patches. Such patches may be subject to complex variations in appearance due to incidental object, scene and recording conditions. Because of this, a single best descriptor for accurate image representation under all conditions does not exist. Therefore, we propose to automatically select from a pool of descriptors the one that is most suitable based on object surface and scene properties. These properties are measured on the fly from a single image patch through a set of attributes. Attributes are input to a classifier which selects the best descriptor. Our experiments on a large dataset of colored object patches show that the proposed selection method outperforms the best single descriptor and a-priori combinations of the descriptor pool. | ||||
Address | Florence, Italy | ||||
Corporate Author | Thesis | ||||
Publisher | Springer Berlin Heidelberg | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | 0302-9743 | ISBN | 978-3-642-33782-6 | Medium | |
Area | Expedition | Conference | ECCV | ||
Notes | ALTRES;ISE | Approved | no | ||
Call Number | Admin @ si @ EGG2012 | Serial | 2023 | ||
Permanent link to this record | |||||
Author | Sergio Escalera; Xavier Baro; Jordi Vitria; Petia Radeva; Bogdan Raducanu | ||||
Title | Social Network Extraction and Analysis Based on Multimodal Dyadic Interaction | Type | Journal Article | ||
Year | 2012 | Publication | Sensors | Abbreviated Journal | SENS |
Volume | 12 | Issue | 2 | Pages | 1702-1719 |
Keywords | |||||
Abstract | IF=1.77 (2010). Social interactions are a very important component in people's lives. Social network analysis has become a common technique used to model and quantify the properties of social interactions. In this paper, we propose an integrated framework to explore the characteristics of a social network extracted from multimodal dyadic interactions. For our study, we used a set of videos belonging to New York Times' Blogging Heads opinion blog. The Social Network is represented as an oriented graph, whose directed links are determined by the Influence Model. The links' weights are a measure of the “influence” a person has over the other. The states of the Influence Model encode automatically extracted audio/visual features from our videos using state-of-the-art algorithms. Our results are reported in terms of accuracy of audio/visual data fusion for speaker segmentation and centrality measures used to characterize the extracted social network. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Molecular Diversity Preservation International | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB; OR;HuPBA;MV | Approved | no | ||
Call Number | Admin @ si @ EBV2012 | Serial | 1885 | ||
Permanent link to this record | |||||
Author | Thanh Ha Do; Salvatore Tabbone; Oriol Ramos Terrades | ||||
Title | Noise suppression over bi-level graphical documents using a sparse representation | Type | Conference Article | ||
Year | 2012 | Publication | Colloque International Francophone sur l'Écrit et le Document | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | |||||
Address | Bordeaux | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CIFED | ||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ DTR2012b | Serial | 2136 | ||
Permanent link to this record | |||||
Author | Thanh Ha Do; Salvatore Tabbone; Oriol Ramos Terrades | ||||
Title | Text/graphic separation using a sparse representation with multi-learned dictionaries | Type | Conference Article | ||
Year | 2012 | Publication | 21st International Conference on Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | Graphics Recognition; Layout Analysis; Document Understanding | ||||
Abstract | In this paper, we propose a new approach to extract text regions from graphical documents. In our method, we first empirically construct two sequences of learned dictionaries for the text and graphical parts respectively. Then, we compute the sparse representations of non-overlapping document patches of different sizes in these learned dictionaries. Based on these representations, each patch can be classified into the text or graphic category by comparing its reconstruction errors. Same-sized patches in one category are then merged together to define the corresponding text or graphic layers, which are combined to create the final text/graphic layer. Finally, in a post-processing step, text regions are further filtered out by using some learned thresholds. | ||||
Address | Tsukuba | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICPR | ||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ DTR2012a | Serial | 2135 | ||
Permanent link to this record |