|
Marçal Rusiñol and Josep Llados. 2012. The Role of the Users in Handwritten Word Spotting Applications: Query Fusion and Relevance Feedback. 13th International Conference on Frontiers in Handwriting Recognition.55–60.
Abstract: In this paper we present the importance of including the user in the loop in a handwritten word spotting framework. Several off-the-shelf query fusion and relevance feedback strategies have been tested in the handwritten word spotting context. The increase in terms of precision when the user is included in the loop is assessed using two datasets of historical handwritten documents and a baseline word spotting approach based on a bag-of-visual-words model.
|
|
|
Volkmar Frinken, Markus Baumgartner, Andreas Fischer and Horst Bunke. 2012. Semi-Supervised Learning for Cursive Handwriting Recognition using Keyword Spotting. 13th International Conference on Frontiers in Handwriting Recognition.49–54.
Abstract: State-of-the-art handwriting recognition systems are learning-based systems that require large sets of training data. The creation of training data, and consequently the creation of a well-performing recognition system, requires therefore a substantial amount of human work. This can be reduced with semi-supervised learning, which uses unlabeled text lines for training as well. Current approaches estimate the correct transcription of the unlabeled data via handwriting recognition which is not only extremely demanding as far as computational costs are concerned but also requires a good model of the target language. In this paper, we propose a different approach that makes use of keyword spotting, which is significantly faster and does not need any language model. In a set of experiments we demonstrate its superiority over existing approaches.
|
|
|
Emanuel Indermühle, Volkmar Frinken and Horst Bunke. 2012. Mode Detection in Online Handwritten Documents using BLSTM Neural Networks. 13th International Conference on Frontiers in Handwriting Recognition.302–307.
Abstract: Mode detection in online handwritten documents refers to the process of distinguishing different types of contents, such as text, formulas, diagrams, or tables, one from another. In this paper a new approach to mode detection is proposed that uses bidirectional long-short term memory (BLSTM) neural networks. The BLSTM neural network is a novel type of recursive neural network that has been successfully applied in speech and handwriting recognition. In this paper we show that it has the potential to significantly outperform traditional methods for mode detection, which are usually based on stroke classification. As a further advantage over previous approaches, the proposed system is trainable and does not rely on user-defined heuristics. Moreover, it can be easily adapted to new or additional types of modes by just providing the system with new training data.
|
|
|
Volkmar Frinken, Alicia Fornes, Josep Llados and Jean-Marc Ogier. 2012. Bidirectional Language Model for Handwriting Recognition. Structural, Syntactic, and Statistical Pattern Recognition, Joint IAPR International Workshop. Springer Berlin Heidelberg, 611–619. (LNCS.)
Abstract: In order to improve the results of automatically recognized handwritten text, information about the language is commonly included in the recognition process. A common approach is to represent a text line as a sequence. It is processed in one direction and the language information via n-grams is directly included in the decoding. This approach, however, only uses context on one side to estimate a word’s probability. Therefore, we propose a bidirectional recognition in this paper, using distinct forward and a backward language models. By combining decoding hypotheses from both directions, we achieve a significant increase in recognition accuracy for the off-line writer independent handwriting recognition task. Both language models are of the same type and can be estimated on the same corpus. Hence, the increase in recognition accuracy comes without any additional need for training data or language modeling complexity.
|
|
|
Ernest Valveny, Robert Benavente, Agata Lapedriza, Miquel Ferrer, Jaume Garcia and Gemma Sanchez. 2012. Adaptation of a computer programming course to the EXHE requirements: evaluation five years later.
|
|
|
Marçal Rusiñol and 7 others. 2012. CVC-UAB's participation in the Flowchart Recognition Task of CLEF-IP 2012. Conference and Labs of the Evaluation Forum.
|
|
|
Christophe Rigaud, Dimosthenis Karatzas, Joost Van de Weijer, Jean-Christophe Burie and Jean-Marc Ogier. 2013. An active contour model for speech balloon detection in comics. 12th International Conference on Document Analysis and Recognition.1240–1244.
Abstract: Comic books constitute an important cultural heritage asset in many countries. Digitization combined with subsequent comic book understanding would enable a variety of new applications, including content-based retrieval and content retargeting. Document understanding in this domain is challenging as comics are semi-structured documents, combining semantically important graphical and textual parts. Few studies have been done in this direction. In this work we detail a novel approach for closed and non-closed speech balloon localization in scanned comic book pages, an essential step towards a fully automatic comic book understanding. The approach is compared with existing methods for closed balloon localization found in the literature and results are presented.
|
|
|
Alicia Fornes, Xavier Otazu and Josep Llados. 2013. Show through cancellation and image enhancement by multiresolution contrast processing. 12th International Conference on Document Analysis and Recognition.200–204.
Abstract: Historical documents suffer from different types of degradation and noise such as background variation, uneven illumination or dark spots. In case of double-sided documents, another common problem is that the back side of the document usually interferes with the front side because of the transparency of the document or ink bleeding. This effect is called the show through phenomenon. Many methods are developed to solve these problems, and in the case of show-through, by scanning and matching both the front and back sides of the document. In contrast, our approach is designed to use only one side of the scanned document. We hypothesize that show-trough are low contrast components, while foreground components are high contrast ones. A Multiresolution Contrast (MC) decomposition is presented in order to estimate the contrast of features at different spatial scales. We cancel the show-through phenomenon by thresholding these low contrast components. This decomposition is also able to enhance the image removing shadowed areas by weighting spatial scales. Results show that the enhanced images improve the readability of the documents, allowing scholars both to recover unreadable words and to solve ambiguities.
|
|
|
David Aldavert, Marçal Rusiñol, Ricardo Toledo and Josep Llados. 2013. Integrating Visual and Textual Cues for Query-by-String Word Spotting. 12th International Conference on Document Analysis and Recognition.511–515.
Abstract: In this paper, we present a word spotting framework that follows the query-by-string paradigm where word images are represented both by textual and visual representations. The textual representation is formulated in terms of character $n$-grams while the visual one is based on the bag-of-visual-words scheme. These two representations are merged together and projected to a sub-vector space. This transform allows to, given a textual query, retrieve word instances that were only represented by the visual modality. Moreover, this statistical representation can be used together with state-of-the-art indexation structures in order to deal with large-scale scenarios. The proposed method is evaluated using a collection of historical documents outperforming state-of-the-art performances.
|
|
|
Anjan Dutta, Jaume Gibert, Josep Llados, Horst Bunke and Umapada Pal. 2012. Combination of Product Graph and Random Walk Kernel for Symbol Spotting in Graphical Documents. 21st International Conference on Pattern Recognition.1663–1666.
Abstract: This paper explores the utilization of product graph for spotting symbols on graphical documents. Product graph is intended to find the candidate subgraphs or components in the input graph containing the paths similar to the query graph. The acute angle between two edges and their length ratio are considered as the node labels. In a second step, each of the candidate subgraphs in the input graph is assigned with a distance measure computed by a random walk kernel. Actually it is the minimum of the distances of the component to all the components of the model graph. This distance measure is then used to eliminate dissimilar components. The remaining neighboring components are grouped and the grouped zone is considered as a retrieval zone of a symbol similar to the queried one. The entire method works online, i.e., it doesn't need any preprocessing step. The present paper reports the initial results of the method, which are very encouraging.
|
|