|
Alicia Fornes, V.C.Kieu, M. Visani, N.Journet and Anjan Dutta. 2014. The ICDAR/GREC 2013 Music Scores Competition: Staff Removal. In B.Lamiroy and J.-M. Ogier, eds. Graphics Recognition. Current Trends and Challenges. Springer Berlin Heidelberg, 207–220. (LNCS.)
Abstract: The first competition on music scores that was organized at ICDAR and GREC in 2011 awoke the interest of researchers, who participated in both staff removal and writer identification tasks. In this second edition, we focus on the staff removal task and simulate a real case scenario concerning old and degraded music scores. For this purpose, we have generated a new set of semi-synthetic images using two degradation models that we previously introduced: local noise and 3D distortions. In this extended paper we provide an extended description of the dataset, degradation models, evaluation metrics, the participant’s methods and the obtained results that could not be presented at ICDAR and GREC proceedings due to page limitations.
Keywords: Competition; Graphics recognition; Music scores; Writer identification; Staff removal
|
|
|
Pau Riba, Josep Llados, Alicia Fornes and Anjan Dutta. 2015. Large-scale Graph Indexing using Binary Embeddings of Node Contexts. In C.-L.Liu, B.Luo, W.G.Kropatsch and J.Cheng, eds. 10th IAPR-TC15 Workshop on Graph-based Representations in Pattern Recognition. Springer International Publishing, 208–217. (LNCS.)
Abstract: Graph-based representations are experiencing a growing usage in visual recognition and retrieval due to their representational power in front of classical appearance-based representations in terms of feature vectors. Retrieving a query graph from a large dataset of graphs has the drawback of the high computational complexity required to compare the query and the target graphs. The most important property for a large-scale retrieval is the search time complexity to be sub-linear in the number of database examples. In this paper we propose a fast indexation formalism for graph retrieval. A binary embedding is defined as hashing keys for graph nodes. Given a database of labeled graphs, graph nodes are complemented with vectors of attributes representing their local context. Hence, each attribute counts the length of a walk of order k originated in a vertex with label l. Each attribute vector is converted to a binary code applying a binary-valued hash function. Therefore, graph retrieval is formulated in terms of finding target graphs in the database whose nodes have a small Hamming distance from the query nodes, easily computed with bitwise logical operators. As an application example, we validate the performance of the proposed methods in a handwritten word spotting scenario in images of historical documents.
Keywords: Graph matching; Graph indexing; Application in document analysis; Word spotting; Binary embedding
|
|
|
Klaus Broelemann, Anjan Dutta, Xiaoyi Jiang and Josep Llados. 2014. Hierarchical Plausibility-Graphs for Symbol Spotting in Graphical Documents. In Bart Lamiroy and Jean-Marc Ogier, eds. Graphics Recognition. Current Trends and Challenges. Springer Berlin Heidelberg, 25–37. (LNCS.)
Abstract: Graph representation of graphical documents often suffers from noise such as spurious nodes and edges, and their discontinuity. In general these errors occur during the low-level image processing viz. binarization, skeletonization, vectorization etc. Hierarchical graph representation is a nice and efficient way to solve this kind of problem by hierarchically merging node-node and node-edge depending on the distance. But the creation of hierarchical graph representing the graphical information often uses hard thresholds on the distance to create the hierarchical nodes (next state) of the lower nodes (or states) of a graph. As a result, the representation often loses useful information. This paper introduces plausibilities to the nodes of hierarchical graph as a function of distance and proposes a modified algorithm for matching subgraphs of the hierarchical graphs. The plausibility-annotated nodes help to improve the performance of the matching algorithm on two hierarchical structures. To show the potential of this approach, we conduct an experiment with the SESYD dataset.
|
|
|
Marçal Rusiñol, Dimosthenis Karatzas and Josep Llados. 2014. Spotting Graphical Symbols in Camera-Acquired Documents in Real Time. In Bart Lamiroy and Jean-Marc Ogier, eds. Graphics Recognition. Current Trends and Challenges. Springer Berlin Heidelberg, 3–10. (LNCS.)
Abstract: In this paper we present a system devoted to spot graphical symbols in camera-acquired document images. The system is based on the extraction and further matching of ORB compact local features computed over interest key-points. Then, the FLANN indexing framework based on approximate nearest neighbor search allows to efficiently match local descriptors between the captured scene and the graphical models. Finally, the RANSAC algorithm is used in order to compute the homography between the spotted symbol and its appearance in the document image. The proposed approach is efficient and is able to work in real time.
|
|
|
Marçal Rusiñol, V. Poulain d'Andecy, Dimosthenis Karatzas and Josep Llados. 2014. Classification of Administrative Document Images by Logo Identification. In Bart Lamiroy and Jean-Marc Ogier, eds. Graphics Recognition. Current Trends and Challenges. Springer Berlin Heidelberg, 49–58.
Abstract: This paper is focused on the categorization of administrative document images (such as invoices) based on the recognition of the supplier’s graphical logo. Two different methods are proposed, the first one uses a bag-of-visual-words model whereas the second one tries to locate logo images described by the blurred shape model descriptor within documents by a sliding-window technique. Preliminar results are reported with a dataset of real administrative documents.
Keywords: Administrative Document Classification; Logo Recognition; Logo Spotting
|
|
|
Suman Ghosh and Ernest Valveny. 2015. A Sliding Window Framework for Word Spotting Based on Word Attributes. Pattern Recognition and Image Analysis, Proceedings of 7th Iberian Conference , ibPRIA 2015. Springer International Publishing, 652–661. (LNCS.)
Abstract: In this paper we propose a segmentation-free approach to word spotting. Word images are first encoded into feature vectors using Fisher Vector. Then, these feature vectors are used together with pyramidal histogram of characters labels (PHOC) to learn SVM-based attribute models. Documents are represented by these PHOC based word attributes. To efficiently compute the word attributes over a sliding window, we propose to use an integral image representation of the document using a simplified version of the attribute model. Finally we re-rank the top word candidates using the more discriminative full version of the word attributes. We show state-of-the-art results for segmentation-free query-by-example word spotting in single-writer and multi-writer standard datasets.
Keywords: Word spotting; Sliding window; Word attributes
|
|
|
Miquel Ferrer, Ernest Valveny and F. Serratosa. 2009. Median graph: A new exact algorithm using a distance based on the maximum common subgraph. PRL, 30(5), 579–588.
Abstract: Median graphs have been presented as a useful tool for capturing the essential information of a set of graphs. Nevertheless, computation of optimal solutions is a very hard problem. In this work we present a new and more efficient optimal algorithm for the median graph computation. With the use of a particular cost function that permits the definition of the graph edit distance in terms of the maximum common subgraph, and a prediction function in the backtracking algorithm, we reduce the size of the search space, avoiding the evaluation of a great amount of states and still obtaining the exact median. We present a set of experiments comparing our new algorithm against the previous existing exact algorithm using synthetic data. In addition, we present the first application of the exact median graph computation to real data and we compare the results against an approximate algorithm based on genetic search. These experimental results show that our algorithm outperforms the previous existing exact algorithm and in addition show the potential applicability of the exact solutions to real problems.
|
|
|
Ernest Valveny and Enric Marti. 2003. A model for image generation and symbol recognition through the deformation of lineal shapes. PRL, 24(15), 2857–2867.
Abstract: We describe a general framework for the recognition of distorted images of lineal shapes, which relies on three items: a model to represent lineal shapes and their deformations, a model for the generation of distorted binary images and the combination of both models in a common probabilistic framework, where the generation of deformations is related to an internal energy, and the generation of binary images to an external energy. Then, recognition consists in the minimization of a global energy function, performed by using the EM algorithm. This general framework has been applied to the recognition of hand-drawn lineal symbols in graphic documents.
|
|
|
Albert Gordo, Florent Perronnin, Yunchao Gong and Svetlana Lazebnik. 2014. Asymmetric Distances for Binary Embeddings. TPAMI, 36(1), 33–47.
Abstract: In large-scale query-by-example retrieval, embedding image signatures in a binary space offers two benefits: data compression and search efficiency. While most embedding algorithms binarize both query and database signatures, it has been noted that this is not strictly a requirement. Indeed, asymmetric schemes which binarize the database signatures but not the query still enjoy the same two benefits but may provide superior accuracy. In this work, we propose two general asymmetric distances which are applicable to a wide variety of embedding techniques including Locality Sensitive Hashing (LSH), Locality Sensitive Binary Codes (LSBC), Spectral Hashing (SH), PCA Embedding (PCAE), PCA Embedding with random rotations (PCAE-RR), and PCA Embedding with iterative quantization (PCAE-ITQ). We experiment on four public benchmarks containing up to 1M images and show that the proposed asymmetric distances consistently lead to large improvements over the symmetric Hamming distance for all binary embedding techniques.
|
|
|
Oriol Ramos Terrades, Ernest Valveny and Salvatore Tabbone. 2009. Optimal Classifier Fusion in a Non-Bayesian Probabilistic Framework. TPAMI, 31(9), 1630–1644.
Abstract: The combination of the output of classifiers has been one of the strategies used to improve classification rates in general purpose classification systems. Some of the most common approaches can be explained using the Bayes' formula. In this paper, we tackle the problem of the combination of classifiers using a non-Bayesian probabilistic framework. This approach permits us to derive two linear combination rules that minimize misclassification rates under some constraints on the distribution of classifiers. In order to show the validity of this approach we have compared it with other popular combination rules from a theoretical viewpoint using a synthetic data set, and experimentally using two standard databases: the MNIST handwritten digit database and the GREC symbol database. Results on the synthetic data set show the validity of the theoretical approach. Indeed, results on real data show that the proposed methods outperform other common combination schemes.
|
|