|
Volkmar Frinken, Francisco Zamora, Salvador España, Maria Jose Castro, Andreas Fischer, & Horst Bunke. (2012). Long-Short Term Memory Neural Networks Language Modeling for Handwriting Recognition. In 21st International Conference on Pattern Recognition (pp. 701–704).
Abstract: Unconstrained handwritten text recognition systems maximize the combination of two separate probability scores. The first one is the observation probability that indicates how well the returned word sequence matches the input image. The second score is the probability that reflects how likely a word sequence is according to a language model. Current state-of-the-art recognition systems use statistical language models in the form of bigram word probabilities. This paper proposes to model the target language by means of a recurrent neural network with long-short term memory cells. Because the network is recurrent, the considered context is not limited to a fixed size, especially as the memory cells are designed to deal with long-term dependencies. In a set of experiments conducted on the IAM off-line database, we show the superiority of the proposed language model over statistical n-gram models.
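The score combination described in this abstract can be illustrated with a minimal sketch. It assumes a log-linear combination with a weighting factor and a bigram baseline with a probability floor for unseen word pairs; the function names, the `lm_weight` parameter, and the `floor` value are illustrative assumptions, not the authors' implementation.

```python
import math


def bigram_log_prob(words, bigram_probs, floor=1e-6):
    """Log-probability of a word sequence under a bigram model.

    bigram_probs maps (previous_word, word) to a probability.
    Unseen pairs fall back to `floor` (an assumption of this sketch).
    """
    total = 0.0
    prev = "<s>"  # hypothetical sentence-start token
    for word in words:
        total += math.log(bigram_probs.get((prev, word), floor))
        prev = word
    return total


def best_hypothesis(hypotheses, bigram_probs, lm_weight=1.0):
    """Pick the hypothesis with the highest combined score.

    Each hypothesis is (words, observation_log_prob). The combined score
    is observation_log_prob + lm_weight * language_model_log_prob.
    """
    def combined(hypothesis):
        words, obs_log_prob = hypothesis
        return obs_log_prob + lm_weight * bigram_log_prob(words, bigram_probs)

    return max(hypotheses, key=combined)
```

With `lm_weight=0.0` the choice depends on the observation score alone, which makes the contribution of the language model easy to inspect on toy inputs.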
|
|
|
Marçal Rusiñol, Dimosthenis Karatzas, Andrew Bagdanov, & Josep Llados. (2012). Multipage Document Retrieval by Textual and Visual Representations. In 21st International Conference on Pattern Recognition (pp. 521–524).
Abstract: In this paper we present a multipage administrative document image retrieval system based on textual and visual representations of document pages. Individual pages are represented by textual or visual information using a bag-of-words framework. Different fusion strategies are evaluated which allow the system to perform multipage document retrieval on the basis of a single page retrieval system. Results are reported on a large dataset of document images sampled from a banking workflow.
|
|
|
Marçal Rusiñol, & Josep Llados. (2012). The Role of the Users in Handwritten Word Spotting Applications: Query Fusion and Relevance Feedback. In 13th International Conference on Frontiers in Handwriting Recognition (pp. 55–60).
Abstract: In this paper we present the importance of including the user in the loop in a handwritten word spotting framework. Several off-the-shelf query fusion and relevance feedback strategies have been tested in the handwritten word spotting context. The increase in terms of precision when the user is included in the loop is assessed using two datasets of historical handwritten documents and a baseline word spotting approach based on a bag-of-visual-words model.
|
|
|
Volkmar Frinken, Markus Baumgartner, Andreas Fischer, & Horst Bunke. (2012). Semi-Supervised Learning for Cursive Handwriting Recognition using Keyword Spotting. In 13th International Conference on Frontiers in Handwriting Recognition (pp. 49–54).
Abstract: State-of-the-art handwriting recognition systems are learning-based systems that require large sets of training data. The creation of training data, and consequently the creation of a well-performing recognition system, therefore requires a substantial amount of human work. This can be reduced with semi-supervised learning, which uses unlabeled text lines for training as well. Current approaches estimate the correct transcription of the unlabeled data via handwriting recognition, which is not only extremely demanding in terms of computational cost but also requires a good model of the target language. In this paper, we propose a different approach that makes use of keyword spotting, which is significantly faster and does not need any language model. In a set of experiments we demonstrate its superiority over existing approaches.
|
|
|
Emanuel Indermühle, Volkmar Frinken, & Horst Bunke. (2012). Mode Detection in Online Handwritten Documents using BLSTM Neural Networks. In 13th International Conference on Frontiers in Handwriting Recognition (pp. 302–307).
Abstract: Mode detection in online handwritten documents refers to the process of distinguishing different types of contents, such as text, formulas, diagrams, or tables, from one another. In this paper a new approach to mode detection is proposed that uses bidirectional long-short term memory (BLSTM) neural networks. The BLSTM neural network is a novel type of recurrent neural network that has been successfully applied in speech and handwriting recognition. In this paper we show that it has the potential to significantly outperform traditional methods for mode detection, which are usually based on stroke classification. As a further advantage over previous approaches, the proposed system is trainable and does not rely on user-defined heuristics. Moreover, it can be easily adapted to new or additional types of modes by just providing the system with new training data.
|
|
|
Volkmar Frinken, Alicia Fornes, Josep Llados, & Jean-Marc Ogier. (2012). Bidirectional Language Model for Handwriting Recognition. In Structural, Syntactic, and Statistical Pattern Recognition, Joint IAPR International Workshop (Vol. 7626, pp. 611–619). LNCS. Springer Berlin Heidelberg.
Abstract: In order to improve the results of automatically recognized handwritten text, information about the language is commonly included in the recognition process. A common approach is to represent a text line as a sequence. It is processed in one direction and the language information via n-grams is directly included in the decoding. This approach, however, only uses context on one side to estimate a word’s probability. Therefore, we propose a bidirectional recognition in this paper, using distinct forward and backward language models. By combining decoding hypotheses from both directions, we achieve a significant increase in recognition accuracy for the off-line writer-independent handwriting recognition task. Both language models are of the same type and can be estimated on the same corpus. Hence, the increase in recognition accuracy comes without any additional need for training data or language modeling complexity.
|
|
|
Laura Igual, Joan Carles Soliva, Roger Gimeno, Sergio Escalera, Oscar Vilarroya, & Petia Radeva. (2012). Automatic Internal Segmentation of Caudate Nucleus for Diagnosis of Attention Deficit Hyperactivity Disorder. In 9th International Conference on Image Analysis and Recognition (Vol. 7325, pp. 222–229). LNCS.
Abstract: Studies on volumetric brain Magnetic Resonance Imaging (MRI) showed neuroanatomical abnormalities in pediatric Attention-Deficit/Hyperactivity Disorder (ADHD). In particular, the diminished right caudate volume is one of the most replicated findings among ADHD samples in morphometric MRI studies. In this paper, we propose a fully automatic method for internal caudate nucleus segmentation based on machine learning. Moreover, the ratio between right caudate body volume and the bilateral caudate body volume is applied in an ADHD diagnostic test. We separately validate the automatic internal segmentation of the caudate into head and body structures and the diagnostic test using real data from ADHD and control subjects. As a result, we show accurate internal caudate segmentation and similar performance between the proposed automatic diagnostic test and the manual annotation.
Notes: Poster
|
|
|
Ekaterina Zaytseva, & Jordi Vitria. (2012). A search based approach to non maximum suppression in face detection. In 19th IEEE International Conference on Image Processing.
Abstract: Face detectors typically produce a large number of false positives, which leads to the need for a further non-maximum suppression stage to eliminate multiple and spurious responses. This stage is based on spatial heuristics: true positive responses are selected by implicitly considering several restrictions on the spatial distribution of detector responses in natural images. In this paper we analyze the limitations of this approach and propose an efficient search method to overcome them. Results show how the application of this new non-maximum suppression approach to a simple face detector boosts its performance to state-of-the-art results.
Notes: Poster; paper TA.P5.12
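For context, the heuristic non-maximum suppression stage that the abstract above takes as its starting point is often implemented as a greedy, overlap-based filter. The sketch below shows that conventional baseline only, not the search method proposed in the paper; the box format `(x1, y1, x2, y2)`, the function names, and the 0.5 overlap threshold are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(detections, iou_threshold=0.5):
    """Conventional greedy non-maximum suppression.

    detections is a list of (box, score). Boxes are visited in order of
    decreasing score; a box is kept only if it does not overlap any
    already-kept box by more than iou_threshold.
    """
    kept = []
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(box, kept_box) <= iou_threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept
```

The fixed overlap threshold is one example of the kind of spatial heuristic whose limitations the paper sets out to analyze.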
|
|
|
Angel Sappa, David Geronimo, Fadi Dornaika, Mohammad Rouhani, & Antonio Lopez. (2012). Moving object detection from mobile platforms using stereo data registration. In Marek R. Ogiela, & Lakhmi C. Jain (Eds.), Computational Intelligence paradigms in advanced pattern classification (Vol. 386, pp. 25–37). Springer Berlin Heidelberg.
Abstract: This chapter describes a robust approach for detecting moving objects from on-board stereo vision systems. It relies on a feature point quaternion-based registration, which avoids common problems that appear when computationally expensive iterative algorithms are used in dynamic environments. The proposed approach consists of three main stages. Initially, feature points are extracted and tracked through consecutive 2D frames. Then, a RANSAC-based approach is used for registering two point sets with known correspondences in 3D space. The computed 3D rigid displacement is used to map two consecutive 3D point clouds into the same coordinate system by means of the quaternion method. Finally, moving objects correspond to those areas with large 3D registration errors. Experimental results show the viability of the proposed approach to detect moving objects like vehicles or pedestrians in different urban scenarios.
Keywords: pedestrian detection
|
|
|
Pau Baiget, Carles Fernandez, Xavier Roca, & Jordi Gonzalez. (2012). Trajectory-Based Abnormality Categorization for Learning Route Patterns in Surveillance. In Detection and Identification of Rare Audiovisual Cues, Studies in Computational Intelligence (Vol. 384, pp. 87–95). Springer Berlin Heidelberg.
Abstract: The recognition of abnormal behaviors in video sequences has emerged as a hot topic in video understanding research. In particular, an important challenge lies in automatically detecting abnormality. However, there is no convention about the types of anomalies that training data should derive. In surveillance, these are typically detected when new observations differ substantially from observed, previously learned behavior models, which represent normality. This paper focuses on properly defining anomalies within trajectory analysis: we propose a hierarchical representation composed of Soft, Intermediate, and Hard Anomaly, which are identified from the extent and nature of deviation from learned models. To this end, a novel Gaussian Mixture Model representation of learned route patterns creates a probabilistic map of the image plane, which is applied to detect and classify anomalies in real time. Our method overcomes limitations of similar existing approaches, and performs correctly even when the tracking is affected by different sources of noise. The reliability of our approach is demonstrated experimentally.
|
|
|
Joost Van de Weijer, Robert Benavente, Maria Vanrell, Cordelia Schmid, Ramon Baldrich, Jacob Verbeek, et al. (2012). Color Naming. In Theo Gevers, Arjan Gijsenij, Joost Van de Weijer, & Jan-Mark Geusebroek (Eds.), Color in Computer Vision: Fundamentals and Applications (pp. 287–317). John Wiley & Sons, Ltd.
|
|
|
Xavier Perez Sala, Laura Igual, Sergio Escalera, & Cecilio Angulo. (2012). Uniform Sampling of Rotations for Discrete and Continuous Learning of 2D Shape Models. In Vision Robotics: Technologies for Machine Learning and Vision Applications (pp. 23–42). IGI-Global.
Abstract: Different methodologies of uniform sampling over the rotation group, SO(3), for building unbiased 2D shape models from 3D objects are introduced and reviewed in this chapter. State-of-the-art non-uniform sampling approaches are discussed, and uniform sampling methods using Euler angles and quaternions are introduced. Moreover, since the presented work is oriented toward model-building applications, it is not limited to general discrete methods for obtaining uniform 3D rotations but also treats the problem from a continuous point of view in the case of Procrustes Analysis.
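As an illustration of uniform sampling over SO(3) with quaternions, the sketch below draws unit quaternions with the well-known subgroup algorithm (Shoemake's method) and uses them to generate 2D views of a 3D point set. Whether the chapter uses this particular construction is an assumption; the orthographic projection (dropping the z coordinate) and all function names are likewise illustrative.

```python
import math
import random


def random_unit_quaternion(rng):
    """Draw a unit quaternion (x, y, z, w) uniformly from the 3-sphere."""
    u1, u2, u3 = rng.random(), rng.random(), rng.random()
    a = math.sqrt(1.0 - u1)
    b = math.sqrt(u1)
    return (a * math.sin(2 * math.pi * u2), a * math.cos(2 * math.pi * u2),
            b * math.sin(2 * math.pi * u3), b * math.cos(2 * math.pi * u3))


def quaternion_to_matrix(q):
    """Convert a unit quaternion (x, y, z, w) to a 3x3 rotation matrix."""
    x, y, z, w = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ]


def sample_2d_views(points_3d, n_views, seed=0):
    """Rotate a 3D point set by uniformly sampled rotations and project
    each result orthographically onto the xy-plane (z is dropped)."""
    rng = random.Random(seed)
    views = []
    for _ in range(n_views):
        rot = quaternion_to_matrix(random_unit_quaternion(rng))
        view = []
        for p in points_3d:
            rx = sum(rot[0][k] * p[k] for k in range(3))
            ry = sum(rot[1][k] * p[k] for k in range(3))
            view.append((rx, ry))
        views.append(view)
    return views
```

Sampling Euler angles independently and uniformly does not give a uniform distribution over rotations, which is why a construction of this kind is used when unbiased view sets are needed.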
|
|
|
Sergio Escalera, Josep Moya, Laura Igual, Veronica Violant, & Maria Teresa Anguera. (2012). Análisis Comportamental Automatizado de TDAH: la Influencia de la Variable Motivación. In IPSI – Cosmocaixa, Jornadas "Empremtes del present, efectes en la psicoanàlisi, la cultura i la societat".
|
|
|
Laura Igual, Joan Carles Soliva, Antonio Hernandez, Sergio Escalera, Oscar Vilarroya, & Petia Radeva. (2012). A Supervised Graph-cut Deformable Model for Brain MRI Segmentation. Deformation models: tracking, animation and applications. In Computational Vision and Biomechanics. LNCS. Springer Netherlands.
|
|
|
Angel Sappa, & George A. Triantafyllid. (2012). Computer Graphics and Imaging.
|
|