|
Muhammad Muzzamil Luqman, Jean-Yves Ramel and Josep Llados. 2012. Improving Fuzzy Multilevel Graph Embedding through Feature Selection Technique. Structural, Syntactic, and Statistical Pattern Recognition, Joint IAPR International Workshop. Springer Berlin Heidelberg, 243–253. (LNCS.)
Abstract: Graphs are the most powerful, expressive and convenient data structures but there is a lack of efficient computational tools and algorithms for processing them. The embedding of graphs into numeric vector spaces permits them to access the state-of-the-art computationally efficient statistical models and tools. In this paper we take forward our work on explicit graph embedding and present an improvement to our earlier proposed method, named “fuzzy multilevel graph embedding – FMGE”, through a feature selection technique. FMGE achieves the embedding of attributed graphs into low dimensional vector spaces by performing a multilevel analysis of graphs and extracting a set of global, structural and elementary level features. Feature selection permits FMGE to select the subset of most discriminating features and to discard the confusing ones for the underlying graph dataset. Experimental results for graph classification on the IAM letter, GREC and fingerprint graph databases show an improvement in the performance of FMGE.
|
|
|
Muhammad Muzzamil Luqman, Thierry Brouard, Jean-Yves Ramel and Josep Llados. 2012. Recherche de sous-graphes par encapsulation floue des cliques d'ordre 2: Application à la localisation de contenu dans les images de documents graphiques. Colloque International Francophone sur l'Écrit et le Document. 149–162.
|
|
|
Antonio Clavelli and Dimosthenis Karatzas. 2009. Text Segmentation in Colour Posters from the Spanish Civil War Era. 10th International Conference on Document Analysis and Recognition. 181–185.
Abstract: The extraction of textual content from colour documents of a graphical nature is a complicated task. The text can be rendered in any colour, size and orientation while the existence of complex background graphics with repetitive patterns can make its localization and segmentation extremely difficult.
Here, we propose a new method for extracting textual content from such colour images that makes no assumption as to the size of the characters, their orientation or colour, while it is tolerant to characters that do not follow a straight baseline. We evaluate this method on a collection of documents with historical connotations: the Posters from the Spanish Civil War.
|
|
|
Miquel Ferrer, Dimosthenis Karatzas, Ernest Valveny and Horst Bunke. 2009. A Recursive Embedding Approach to Median Graph Computation. 7th IAPR – TC–15 Workshop on Graph–Based Representations in Pattern Recognition. Springer Berlin Heidelberg, 113–123. (LNCS.)
Abstract: The median graph has been shown to be a good choice to infer a representative of a set of graphs. It has been successfully applied to graph-based classification and clustering. Nevertheless, its computation is extremely complex. Several approaches have been presented up to now based on different strategies. In this paper we present a new approximate recursive algorithm for median graph computation based on graph embedding into vector spaces. Preliminary experiments on three databases show that this new approach is able to obtain better medians than the previous existing approaches.
|
|
|
Miquel Ferrer, Ernest Valveny and F. Serratosa. 2009. Median Graph Computation by means of a Genetic Approach Based on Minimum Common Supergraph and Maximum Common Subgraph. 4th Iberian Conference on Pattern Recognition and Image Analysis. Springer Berlin Heidelberg, 346–353. (LNCS.)
Abstract: Given a set of graphs, the median graph has been theoretically presented as a useful concept to infer a representative of the set. However, the computation of the median graph is a highly complex task and its practical application has been very limited up to now. In this work we present a new genetic algorithm for the median graph computation. A set of experiments on real data, where none of the existing algorithms for the median graph computation could be applied up to now due to their computational complexity, show that we obtain good approximations of the median graph. Finally, we use the median graph in a real nearest neighbour classification, showing that it leaves the box of the only-theoretical concepts and demonstrating, from a practical point of view, that it can be a useful tool to represent a set of graphs.
|
|
|
Albert Gordo and Ernest Valveny. 2009. A rotation invariant page layout descriptor for document classification and retrieval. 10th International Conference on Document Analysis and Recognition. 481–485.
Abstract: Document classification usually requires structural features such as the physical layout to obtain good accuracy rates on complex documents. This paper introduces a descriptor of the layout and a distance measure based on the cyclic dynamic time warping which can be computed in O(n²). This descriptor is translation invariant and can be easily modified to be scale and rotation invariant. Experiments with this descriptor and its rotation invariant modification are performed on the Girona archives database and compared against another common layout distance, the minimum weight edge cover (MWEC). The experiments show that these methods outperform the MWEC both in accuracy and speed, particularly on rotated documents.
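Illustrative note (not taken from the paper): the idea of a cyclic dynamic time warping distance can be sketched with a brute-force baseline that runs ordinary DTW against every cyclic shift of one sequence. This is an assumption-laden sketch, not the O(n²) method the abstract refers to; the function names `dtw` and `naive_cyclic_dtw` are hypothetical, the sequences are plain lists of numbers rather than layout descriptors, and the cost of trying every shift is cubic for equal-length inputs.

```python
def dtw(a, b):
    """Classic dynamic time warping cost between two non-empty number sequences.

    Uses absolute difference as the local cost and runs in O(len(a) * len(b)).
    """
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # advance in a only
                                 cost[i][j - 1],      # advance in b only
                                 cost[i - 1][j - 1])  # advance in both
    return cost[n][m]


def naive_cyclic_dtw(a, b):
    """Brute-force baseline: minimum DTW cost over every cyclic shift of `a`.

    Assumption: rotating only one sequence is enough for this illustration.
    """
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    return min(dtw(a[s:] + a[:s], b) for s in range(len(a)))
```

With `a = [1, 2, 3, 4]` and `b = [3, 4, 1, 2]`, plain `dtw(a, b)` is positive because the sequences start at different points of the cycle, while `naive_cyclic_dtw(a, b)` finds the shift that aligns them exactly.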
|
|
|
Albert Gordo and Ernest Valveny. 2009. The diagonal split: A pre-segmentation step for page layout analysis & classification. 4th Iberian Conference on Pattern Recognition and Image Analysis. Springer Berlin Heidelberg, 290–297. (LNCS.)
Abstract: Document classification is an important task in all the processes related to document storage and retrieval. In the case of complex documents, structural features are needed to achieve a correct classification. Unfortunately, physical layout analysis is error prone. In this paper we present a pre-segmentation step based on a divide & conquer strategy that can be used to improve the page segmentation results, independently of the segmentation algorithm used. This pre-segmentation step is evaluated in classification and retrieval using the selective CRLA algorithm for layout segmentation together with a clustering based on the Voronoi area diagram, and tested on two different databases, MARG and Girona Archives.
|
|
|
Marçal Rusiñol and Josep Llados. 2009. Logo Spotting by a Bag-of-words Approach for Document Categorization. 10th International Conference on Document Analysis and Recognition. 111–115.
Abstract: In this paper we present a method for document categorization which processes incoming document images such as invoices or receipts. The categorization of these document images is done in terms of the presence of a certain graphical logo detected without segmentation. The graphical logos are described by a set of local features and the categorization of the documents is performed by the use of a bag-of-words model. Spatial coherence rules are added to reinforce the correct category hypothesis, aiming also to spot the logo inside the document image. Experiments which demonstrate the effectiveness of this system on a large set of real data are presented.
|
|
|
Sergio Escalera, Alicia Fornes, Oriol Pujol, Alberto Escudero and Petia Radeva. 2009. Circular Blurred Shape Model for Symbol Spotting in Documents. 16th IEEE International Conference on Image Processing. 1985–1988.
Abstract: The symbol spotting problem requires feature extraction strategies able to generalize from training samples and to localize the target object while discarding most of the image. In the case of document analysis, symbol spotting techniques have to deal with a high variability of symbols' appearance. In this paper, we propose the Circular Blurred Shape Model descriptor. Feature extraction is performed capturing the spatial arrangement of significant object characteristics in a correlogram structure. Shape information from objects is shared among correlogram regions, being tolerant to irregular deformations. Descriptors are learnt using a cascade of classifiers with Adaboost as the base classifier. Finally, symbol spotting is performed by means of a windowing strategy using the learnt cascade over plan and old musical score documents. Spotting and multi-class categorization results show better performance compared with the state-of-the-art descriptors.
|
|
|
Sergio Escalera, Alicia Fornes, Oriol Pujol and Petia Radeva. 2009. Multi-class Binary Symbol Classification with Circular Blurred Shape Models. 15th International Conference on Image Analysis and Processing. Springer Berlin Heidelberg, 1005–1014. (LNCS.)
Abstract: Multi-class binary symbol classification requires the use of rich descriptors and robust classifiers. Shape representation is a difficult task because of several symbol distortions, such as occlusions, elastic deformations, gaps or noise. In this paper, we present the Circular Blurred Shape Model descriptor. This descriptor encodes the arrangement information of object parts in a correlogram structure. A prior blurring degree defines the level of distortion allowed to the symbol. Moreover, we learn the new feature space using a set of Adaboost classifiers, which are combined in the Error-Correcting Output Codes framework to deal with the multi-class categorization problem. The presented work has been validated over different multi-class data sets, and compared to the state-of-the-art descriptors, showing significant performance improvements.
|
|