|
P. Wang, V. Eglin, C. Garcia, C. Largeron, Josep Llados, & Alicia Fornes. (2014). A Novel Learning-free Word Spotting Approach Based on Graph Representation. In 11th IAPR International Workshop on Document Analysis and Systems (pp. 207–211).
Abstract: Effective information retrieval on handwritten document images has always been a challenging task. In this paper, we propose a novel handwritten word spotting approach based on graph representation. The presented model comprises both topological and morphological signatures of handwriting. Skeleton-based graphs with the Shape Context labelled vertexes are established for connected components. Each word image is represented as a sequence of graphs. In order to be robust to the handwriting variations, an exhaustive merging process based on DTW alignment result is introduced in the similarity measure between word images. With respect to the computation complexity, an approximate graph edit distance approach using bipartite matching is employed for graph matching. The experiments on the George Washington dataset and the marriage records from the Barcelona Cathedral dataset demonstrate that the proposed approach outperforms the state-of-the-art structural methods.
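Illustrative note (not from the paper): the abstract relies on DTW to align two sequences of graphs. The sketch below shows plain dynamic time warping over generic sequences with a pluggable cost function. The function name `dtw_distance` and the toy numeric cost are assumptions made for illustration; in the paper the per-element cost would be an approximate graph edit distance, which is not reproduced here.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def dtw_distance(a: Sequence[T], b: Sequence[T],
                 cost: Callable[[T, T], float]) -> float:
    """Return the classic DTW alignment cost between sequences a and b."""
    n, m = len(a), len(b)
    inf = float("inf")
    # d[i][j] holds the best cost of aligning a[:i] with b[:j].
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = cost(a[i - 1], b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            d[i][j] = c + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


# Toy stand-in cost (assumption): absolute difference between numbers.
def abs_cost(x: float, y: float) -> float:
    return abs(x - y)


same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3], abs_cost)
shifted = dtw_distance([1, 2, 3], [2, 3, 4], abs_cost)
```

Because the cost function is a parameter, the same routine could in principle be reused with any element-level dissimilarity, including a graph distance.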
|
|
|
P. Wang, V. Eglin, C. Garcia, C. Largeron, Josep Llados, & Alicia Fornes. (2014). Représentation par graphe de mots manuscrits dans les images pour la recherche par similarité. In Colloque International Francophone sur l'Écrit et le Document (pp. 233–248).
Abstract: Effective information retrieval on handwritten document images has always been a challenging task. In this paper, we propose a novel handwritten word spotting approach based on graph representation. The presented model comprises both topological and morphological signatures of handwriting. Skeleton-based graphs with the Shape Context labeled vertexes are established for connected components. Each word image is represented as a sequence of graphs. In order to be robust to the handwriting variations, an exhaustive merging process based on DTW alignment result is introduced in the similarity measure between word images. With respect to the computation complexity, an approximate graph edit distance approach using bipartite matching is employed for graph matching. The experiments on the George Washington dataset and the marriage records from the Barcelona Cathedral dataset demonstrate that the proposed approach outperforms the state-of-the-art structural methods.
Keywords: word spotting; graph-based representation; shape context description; graph edit distance; DTW; block merging; query by example
|
|
|
Palaiahnakote Shivakumara, Anjan Dutta, Chew Lim Tan, & Umapada Pal. (2014). Multi-oriented scene text detection in video based on wavelet and angle projection boundary growing. MTAP - Multimedia Tools and Applications, 72(1), 515–539.
Abstract: In this paper, we address two complex issues: 1) Text frame classification and 2) Multi-oriented text detection in video text frame. We first divide a video frame into 16 blocks and propose a combination of wavelet and median-moments with k-means clustering at the block level to identify probable text blocks. For each probable text block, the method applies the same combination of feature with k-means clustering over a sliding window running through the blocks to identify potential text candidates. We introduce a new idea of symmetry on text candidates in each block based on the observation that pixel distribution in text exhibits a symmetric pattern. The method integrates all blocks containing text candidates in the frame and then all text candidates are mapped on to a Sobel edge map of the original frame to obtain text representatives. To tackle the multi-orientation problem, we present a new method called Angle Projection Boundary Growing (APBG) which is an iterative algorithm and works based on a nearest neighbor concept. APBG is then applied on the text representatives to fix the bounding box for multi-oriented text lines in the video frame. Directional information is used to eliminate false positives. Experimental results on a variety of datasets such as non-horizontal, horizontal, publicly available data (Hua’s data) and ICDAR-03 competition data (camera images) show that the proposed method outperforms existing methods proposed for video and the state of the art methods for scene text as well.
|
|
|
Patricia Marquez, Debora Gil, R. Mester, & Aura Hernandez-Sabate. (2014). Local Analysis of Confidence Measures for Optical Flow Quality Evaluation. In 9th International Conference on Computer Vision Theory and Applications (Vol. 3, pp. 450–457).
Abstract: Optical Flow (OF) techniques facing the complexity of real sequences have been developed in recent years. Even using the most appropriate technique for our specific problem, at some points the output flow might fail to achieve the minimum error required for the system. Confidence measures computed from either input data or OF output should discard those points where OF is not accurate enough for its further use. It follows that evaluating the capabilities of a confidence measure for bounding OF error is as important as the definition itself. In this paper we analyze different confidence measures and point out their advantages and limitations for their use in real world settings. We also explore their agreement with current tools for evaluating confidence measure performance.
Keywords: Optical Flow; Confidence Measure; Performance Evaluation.
|
|
|
Patricia Marquez, H. Kause, A. Fuster, Aura Hernandez-Sabate, L. Florack, Debora Gil, et al. (2014). Factors Affecting Optical Flow Performance in Tagging Magnetic Resonance Imaging. In 17th International Conference on Medical Image Computing and Computer Assisted Intervention (Vol. 8896, pp. 231–238). LNCS. Springer International Publishing.
Abstract: Changes in cardiac deformation patterns are correlated with cardiac pathologies. Deformation can be extracted from tagging Magnetic Resonance Imaging (tMRI) using Optical Flow (OF) techniques. For applications of OF in a clinical setting it is important to assess to what extent the performance of a particular OF method is stable across different clinical acquisition artifacts. This paper presents a statistical validation framework, based on ANOVA, to assess the motion and appearance factors that have the largest influence on OF accuracy drop. In order to validate this framework, we created a database of simulated tMRI data including the most common artifacts of MRI and test three different OF methods, including HARP.
Keywords: Optical flow; Performance Evaluation; Synthetic Database; ANOVA; Tagging Magnetic Resonance Imaging
|
|
|
Pau Riba, Jon Almazan, Alicia Fornes, David Fernandez, Ernest Valveny, & Josep Llados. (2014). e-Crowds: a mobile platform for browsing and searching in historical demography-related manuscripts. In 14th International Conference on Frontiers in Handwriting Recognition (pp. 228–233).
Abstract: This paper presents a prototype system running on portable devices for browsing and word searching through historical handwritten document collections. The platform adapts the paradigm of eBook reading, where the narrative is not necessarily sequential, but centered on the user actions. The novelty is to replace digitally born books by digitized historical manuscripts of marriage licenses, so document analysis tasks are required in the browser. With an active reading paradigm, the user can cast queries of people's names, so he/she can implicitly follow genealogical links. In addition, the system allows combined searches: the user can refine a search by adding more words to search. As a second contribution, the retrieval functionality involves as a core technology a word spotting module with a unified approach, which allows combined query searches, and also two input modalities: query-by-example, and query-by-string.
|
|
|
Pedro Martins, Paulo Carvalho, & Carlo Gatta. (2014). Context-aware features and robust image representations. JVCIR - Journal of Visual Communication and Image Representation, 25(2), 339–348.
Abstract: Local image features are often used to efficiently represent image content. The limited number of types of features that a local feature extractor responds to might be insufficient to provide a robust image representation. To overcome this limitation, we propose a context-aware feature extraction formulated under an information theoretic framework. The algorithm does not respond to a specific type of features; the idea is to retrieve complementary features which are relevant within the image context. We empirically validate the method by investigating the repeatability, the completeness, and the complementarity of context-aware features on standard benchmarks. In a comparison with strictly local features, we show that our context-aware features produce more robust image representations. Furthermore, we study the complementarity between strictly local features and context-aware ones to produce an even more robust representation.
|
|
|
Pierluigi Casale, Oriol Pujol, & Petia Radeva. (2014). Approximate polytope ensemble for one-class classification. PR - Pattern Recognition, 47(2), 854–864.
Abstract: In this work, a new one-class classification ensemble strategy called approximate polytope ensemble is presented. The main contribution of the paper is threefold. First, the geometrical concept of convex hull is used to define the boundary of the target class defining the problem. Expansions and contractions of this geometrical structure are introduced in order to avoid over-fitting. Second, the decision whether a point belongs to the convex hull model in high dimensional spaces is approximated by means of random projections and an ensemble decision process. Finally, a tiling strategy is proposed in order to model non-convex structures. Experimental results show that the proposed strategy is significantly better than state of the art one-class classification methods on over 200 datasets.
Keywords: One-class classification; Convex hull; High-dimensionality; Random projections; Ensemble learning
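Illustrative note (not from the paper): the abstract approximates the test "is this point inside the convex hull?" with random projections and an ensemble decision. The sketch below is a deliberately simplified version that assumes one-dimensional projections, where the projected hull is just an interval; the projection dimensionality, the function names `fit_projections` and `is_inside`, and the `margin` parameter (a crude stand-in for the expansions and contractions mentioned above) are all assumptions. A point inside the true hull always passes every projection, while a point outside may occasionally be accepted, so more projections tighten the approximation.

```python
import random
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Projection = Tuple[List[float], float, float]  # (direction, low, high)


def dot(u: Vector, v: Vector) -> float:
    return sum(ui * vi for ui, vi in zip(u, v))


def fit_projections(train: List[Vector], n_proj: int,
                    seed: int = 0) -> List[Projection]:
    """Project the target class onto random directions and store each
    projected range, i.e. the 1-D convex hull of the projected data."""
    rng = random.Random(seed)
    dim = len(train[0])
    model = []
    for _ in range(n_proj):
        w = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        values = [dot(w, x) for x in train]
        model.append((w, min(values), max(values)))
    return model


def is_inside(model: List[Projection], x: Vector,
              margin: float = 0.0) -> bool:
    """Accept x only if it falls within the (optionally expanded or
    contracted) range in every projection."""
    for w, low, high in model:
        v = dot(w, x)
        if v < low - margin or v > high + margin:
            return False
    return True


# Toy target class (assumption): the eight corners of the unit cube.
corners = [(a, b, c) for a in (0.0, 1.0) for b in (0.0, 1.0) for c in (0.0, 1.0)]
model = fit_projections(corners, n_proj=50, seed=0)
```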
|
|
|
Q. Xue, Laura Igual, A. Berenguel, M. Guerrieri, & L. Garrido. (2014). Active Contour Segmentation with Affine Coordinate-Based Parametrization. In 9th International Conference on Computer Vision Theory and Applications (Vol. 1, pp. 5–14).
Abstract: In this paper, we present a new framework for image segmentation based on parametrized active contours. The contour and the points of the image space are parametrized using a set of reduced control points that have to form a closed polygon in two dimensional problems and a closed surface in three dimensional problems. By moving the control points, the active contour evolves. We use mean value coordinates as the parametrization tool for the interface, which allows parametrizing any point of the space, inside or outside the closed polygon or surface. Region-based energies such as the one proposed by Chan and Vese can be easily implemented in both two and three dimensional segmentation problems. We show the usefulness of our approach with several experiments.
Keywords: Active Contours; Affine Coordinates; Mean Value Coordinates
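Illustrative note (not from the paper): the abstract uses mean value coordinates to express any point of the plane in terms of the polygon's control points. The sketch below computes the standard two-dimensional mean value coordinates of a point with respect to a closed polygon. The function name and the example polygon are assumptions, and the sketch does not handle points lying exactly on a vertex or an edge.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def mean_value_coordinates(polygon: Sequence[Point], x: Point) -> List[float]:
    """Return weights lam such that sum(lam) == 1 and
    sum(lam[i] * polygon[i]) reproduces x (up to rounding)."""
    n = len(polygon)
    # Vectors from x to every vertex, and their lengths.
    d = [(vx - x[0], vy - x[1]) for vx, vy in polygon]
    r = [math.hypot(dx, dy) for dx, dy in d]
    # t[i] = tan(alpha_i / 2), with alpha_i the signed angle at x
    # between vertex i and vertex i + 1.
    t = []
    for i in range(n):
        j = (i + 1) % n
        cross = d[i][0] * d[j][1] - d[i][1] * d[j][0]
        inner = d[i][0] * d[j][0] + d[i][1] * d[j][1]
        t.append(math.tan(math.atan2(cross, inner) / 2.0))
    # Unnormalised weight of vertex i uses the two angles adjacent to it;
    # t[i - 1] wraps around to the last angle when i == 0.
    w = [(t[i - 1] + t[i]) / r[i] for i in range(n)]
    total = sum(w)
    return [wi / total for wi in w]


# Example control polygon (assumption): the unit square.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
query = (0.25, 0.6)
lam = mean_value_coordinates(square, query)
# Rebuild the query point from the control points and its coordinates.
rebuilt = (sum(l * v[0] for l, v in zip(lam, square)),
           sum(l * v[1] for l, v in zip(lam, square)))
```

Moving a control point while keeping `lam` fixed moves the rebuilt point with it, which is the property that lets a contour be driven by a small set of control points.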
|
|
|
R. Clariso, David Masip, & A. Rius. (2014). Student projects empowering mobile learning in higher education. RUSC - Revista de Universidad y Sociedad del Conocimiento, 192–207.
|
|
|
Ricard Balague. (2014). Exploring the combination of color cues for intrinsic image decomposition (Vol. 178). Master's thesis.
Abstract: Intrinsic image decomposition is a challenging problem that consists in separating an image into its physical characteristics: reflectance and shading. This problem can be solved in different ways, but most methods have combined information from several visual cues. In this work we describe an extension of an existing method proposed by Serra et al. which considers two color descriptors and combines them by means of a Markov Random Field. We analyze in depth the weak points of the method and we explore more possibilities to use in both descriptors. The proposed extension depends on the combination of the cues considered to overcome some of the limitations of the original method. Our approach is tested on the MIT dataset and Beigpour et al. dataset, which contain images of real objects acquired under controlled conditions and synthetic images respectively, with their corresponding ground truth.
|
|
|
Salvatore Tabbone, & Oriol Ramos Terrades. (2014). An Overview of Symbol Recognition. In D. Doermann, & K. Tombre (Eds.), Handbook of Document Image Processing and Recognition (Vol. D, pp. 523–551). Springer London.
Abstract: According to the Cambridge Dictionaries Online, a symbol is a sign, shape, or object that is used to represent something else. Symbol recognition is a subfield of general pattern recognition problems that focuses on identifying, detecting, and recognizing symbols in technical drawings, maps, or miscellaneous documents such as logos and musical scores. This chapter aims at providing the reader an overview of the different existing ways of describing and recognizing symbols and how the field has evolved to attain a certain degree of maturity.
Keywords: Pattern recognition; Shape descriptors; Structural descriptors; Symbol recognition; Symbol spotting
|
|
|
Santiago Segui, Michal Drozdzal, Ekaterina Zaytseva, Fernando Azpiroz, Petia Radeva, & Jordi Vitria. (2014). Detection of wrinkle frames in endoluminal videos using betweenness centrality measures for images. TITB - IEEE Transactions on Information Technology in Biomedicine, 18(6), 1831–1838.
Abstract: Intestinal contractions are one of the most important events to diagnose motility pathologies of the small intestine. When visualized by wireless capsule endoscopy (WCE), the sequence of frames that represents a contraction is characterized by a clear wrinkle structure in the central frames that corresponds to the folding of the intestinal wall. In this paper we present a new method to robustly detect wrinkle frames in full WCE videos by using a new mid-level image descriptor that is based on a centrality measure proposed for graphs. We present an extended validation, carried out in a very large database, that shows that the proposed method achieves state of the art performance for this task.
Keywords: Wireless Capsule Endoscopy; Small Bowel Motility Dysfunction; Contraction Detection; Structured Prediction; Betweenness Centrality
|
|
|
Sebastian Ramos. (2014). Vision-based Detection of Road Hazards for Autonomous Driving. Master's thesis.
|
|
|
Sergio Escalera, Xavier Baro, Jordi Gonzalez, Miguel Angel Bautista, Meysam Madadi, Miguel Reyes, et al. (2014). ChaLearn Looking at People Challenge 2014: Dataset and Results. In ECCV Workshop on ChaLearn Looking at People (Vol. 8925, pp. 459–473). LNCS.
Abstract: This paper summarizes the ChaLearn Looking at People 2014 challenge data and the results obtained by the participants. The competition was split into three independent tracks: human pose recovery from RGB data, action and interaction recognition from RGB data sequences, and multi-modal gesture recognition from RGB-Depth sequences. For all the tracks, the goal was to perform user-independent recognition in sequences of continuous images using the overlapping Jaccard index as the evaluation measure. In this edition of the ChaLearn challenge, two large novel data sets were made publicly available and the Microsoft Codalab platform was used to manage the competition. Outstanding results were achieved in the three challenge tracks, with accuracy results of 0.20, 0.50, and 0.85 for pose recovery, action/interaction recognition, and multi-modal gesture recognition, respectively.
Keywords: Human Pose Recovery; Behavior Analysis; Action and interactions; Multi-modal gestures; recognition
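Illustrative note (not from the paper): the abstract names the overlapping Jaccard index as the evaluation measure. The sketch below computes it for two sets of frame indices, assuming a prediction and a ground-truth annotation are each given as the set of frames they cover. The function name, the frame ranges, and the convention of returning 0.0 when both sets are empty are assumptions; the challenge's exact per-track aggregation is not reproduced here.

```python
from typing import Set


def jaccard_index(predicted: Set[int], truth: Set[int]) -> float:
    """Intersection over union of two sets of frame indices."""
    union = predicted | truth
    if not union:
        # Assumed convention: nothing predicted and nothing annotated.
        return 0.0
    return len(predicted & truth) / len(union)


# Hypothetical example: a gesture predicted on frames 10..29,
# annotated on frames 20..39.
predicted_frames = set(range(10, 30))
truth_frames = set(range(20, 40))
score = jaccard_index(predicted_frames, truth_frames)
```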
|
|