|
Jaume Gibert, Ernest Valveny and Horst Bunke. 2011. Dimensionality Reduction for Graph of Words Embedding. In Xiaoyi Jiang, Miquel Ferrer and Andrea Torsello, eds. 8th IAPR-TC-15 International Workshop. Graph-Based Representations in Pattern Recognition.22–31. (LNCS.)
Abstract: The Graph of Words Embedding consists in mapping every graph of a given dataset to a feature vector by counting unary and binary relations between node attributes of the graph. While it shows good properties in classification problems, it suffers from high dimensionality and sparsity. These two issues are addressed in this article. Two well-known techniques for dimensionality reduction, kernel principal component analysis (kPCA) and independent component analysis (ICA), are applied to the embedded graphs. We discuss their performance compared to the classification of the original vectors on three different public databases of graphs.
|
|
|
Jaume Gibert, Ernest Valveny and Horst Bunke. 2011. Vocabulary Selection for Graph of Words Embedding. In Vitria, J., J.M.R. Sanches and M. Hernández, eds. 5th Iberian Conference on Pattern Recognition and Image Analysis. Berlin, Springer, 216–223. (LNCS.)
Abstract: The Graph of Words Embedding consists in mapping every graph in a given dataset to a feature vector by counting unary and binary relations between node attributes of the graph. It has been shown to perform well for graphs with discrete label alphabets. In this paper we extend the methodology to graphs with n-dimensional continuous attributes by selecting node representatives. We propose three different discretization procedures for the attribute space and experimentally evaluate the dependence on both the selector and the number of node representatives. In the context of graph classification, the experimental results reveal that on two out of three public databases the proposed extension achieves superior performance over a standard reference system.
|
|
|
Jaume Gibert, Ernest Valveny and Horst Bunke. 2013. Embedding of Graphs with Discrete Attributes Via Label Frequencies. IJPRAI, 27(3), 1360002–1360029.
Abstract: Graph-based representations of patterns are very flexible and powerful, but they are not easily processed due to the lack of learning algorithms in the domain of graphs. Embedding a graph into a vector space solves this problem since graphs are turned into feature vectors and thus all the statistical learning machinery becomes available for graph input patterns. In this work we present a new way of embedding discrete attributed graphs into vector spaces using node and edge label frequencies. The methodology is experimentally tested on graph classification problems, using patterns of different nature, and it is shown to be competitive to state-of-the-art classification algorithms for graphs, while being computationally much more efficient.
Keywords: Discrete attributed graphs; graph embedding; graph classification
|
|
|
Jaume Gibert, Ernest Valveny and Horst Bunke. 2012. Graph Embedding in Vector Spaces by Node Attribute Statistics. PR, 45(9), 3072–3083.
Abstract: Graph-based representations are of broad use and applicability in pattern recognition. They exhibit, however, a major drawback with regards to the processing tools that are available in their domain. Graphembedding into vectorspaces is a growing field among the structural pattern recognition community which aims at providing a feature vector representation for every graph, and thus enables classical statistical learning machinery to be used on graph-based input patterns. In this work, we propose a novel embedding methodology for graphs with continuous nodeattributes and unattributed edges. The approach presented in this paper is based on statistics of the node labels and the edges between them, based on their similarity to a set of representatives. We specifically deal with an important issue of this methodology, namely, the selection of a suitable set of representatives. In an experimental evaluation, we empirically show the advantages of this novel approach in the context of different classification problems using several databases of graphs.
Keywords: Structural pattern recognition; Graph embedding; Data clustering; Graph classification
|
|
|
Jaume Gibert, Ernest Valveny and Horst Bunke. 2012. Feature Selection on Node Statistics Based Embedding of Graphs. PRL, 33(15), 1980–1990.
Abstract: Representing a graph with a feature vector is a common way of making statistical machine learning algorithms applicable to the domain of graphs. Such a transition from graphs to vectors is known as graphembedding. A key issue in graphembedding is to select a proper set of features in order to make the vectorial representation of graphs as strong and discriminative as possible. In this article, we propose features that are constructed out of frequencies of node label representatives. We first build a large set of features and then select the most discriminative ones according to different ranking criteria and feature transformation algorithms. On different classification tasks, we experimentally show that only a small significant subset of these features is needed to achieve the same classification rates as competing to state-of-the-art methods.
Keywords: Structural pattern recognition; Graph embedding; Feature ranking; PCA; Graph classification
|
|
|
Jaume Gibert, Ernest Valveny, Horst Bunke and Alicia Fornes. 2012. On the Correlation of Graph Edit Distance and L1 Distance in the Attribute Statistics Embedding Space. Structural, Syntactic, and Statistical Pattern Recognition, Joint IAPR International Workshop. Springer-Berlag, Berlin, 135–143. (LNCS.)
Abstract: Graph embeddings in vector spaces aim at assigning a pattern vector to every graph so that the problems of graph classification and clustering can be solved by using data processing algorithms originally developed for statistical feature vectors. An important requirement graph features should fulfil is that they reproduce as much as possible the properties among objects in the graph domain. In particular, it is usually desired that distances between pairs of graphs in the graph domain closely resemble those between their corresponding vectorial representations. In this work, we analyse relations between the edit distance in the graph domain and the L1 distance of the attribute statistics based embedding, for which good classification performance has been reported on various datasets. We show that there is actually a high correlation between the two kinds of distances provided that the corresponding parameter values that account for balancing the weight between node and edge based features are properly selected.
|
|
|
Jaume Gibert, Ernest Valveny, Oriol Ramos Terrades and Horst Bunke. 2011. Multiple Classifiers for Graph of Words Embedding. In Carlo Sansone, Josef Kittler and Fabio Roli, eds. 10th International Conference on Multiple Classifier Systems.36–45. (LNCS.)
Abstract: During the last years, there has been an increasing interest in applying the multiple classifier framework to the domain of structural pattern recognition. Constructing base classifiers when the input patterns are graph based representations is not an easy problem. In this work, we make use of the graph embedding methodology in order to construct different feature vector representations for graphs. The graph of words embedding assigns a feature vector to every graph by counting unary and binary relations between node representatives and combining these pieces of information into a single vector. Selecting different node representatives leads to different vectorial representations and therefore to different base classifiers that can be combined. We experimentally show how this methodology significantly improves the classification of graphs with respect to single base classifiers.
|
|
|
Jaume Rodriguez, S. Yacoub, Gemma Sanchez and Josep Llados. 2006. Performance Evaluation, Comparison and Combination of Commercial Handwriting Recognition Engines.
|
|
|
Jean-Christophe Burie and 9 others. 2015. ICDAR2015 Competition on Smartphone Document Capture and OCR (SmartDoc). 13th International Conference on Document Analysis and Recognition ICDAR2015.1161–1165.
Abstract: Smartphones are enabling new ways of capture,
hence arises the need for seamless and reliable acquisition and
digitization of documents, in order to convert them to editable,
searchable and a more human-readable format. Current stateof-the-art
works lack databases and baseline benchmarks for
digitizing mobile captured documents. We have organized a
competition for mobile document capture and OCR in order to
address this issue. The competition is structured into two independent
challenges: smartphone document capture, and smartphone
OCR. This report describes the datasets for both challenges
along with their ground truth, details the performance evaluation
protocols which we used, and presents the final results of the
participating methods. In total, we received 13 submissions: 8
for challenge-I, and 5 for challenge-2.
|
|
|
Jean-Marc Ogier, Wenyin Liu and Josep Llados, eds. 2010. Graphics Recognition: Achievements, Challenges, and Evolution. Springer Link. (LNCS.)
|
|