D. Perez, L. Tarazon, N. Serrano, F.M. Castro, Oriol Ramos Terrades, & A. Juan. (2009). The GERMANA Database. In 10th International Conference on Document Analysis and Recognition (pp. 301–305).
Abstract: A new handwritten text database, GERMANA, is presented to facilitate empirical comparison of different approaches to text line extraction and off-line handwriting recognition. GERMANA is the result of digitising and annotating a 764-page Spanish manuscript from 1891, in which most pages only contain nearly calligraphed text written on ruled sheets of well-separated lines. To our knowledge, it is the first publicly available database for handwriting research, mostly written in Spanish and comparable in size to standard databases. Due to its sequential book structure, it is also well-suited for realistic assessment of interactive handwriting recognition systems. To provide baseline results for reference in future studies, empirical results are also reported, using standard techniques and tools for preprocessing, feature extraction, HMM-based image modelling, and language modelling.
|
L. Tarazon, D. Perez, N. Serrano, V. Alabau, Oriol Ramos Terrades, A. Sanchis, et al. (2009). Confidence Measures for Error Correction in Interactive Transcription of Handwritten Text. In 15th International Conference on Image Analysis and Processing (Vol. 5716, pp. 567–574). LNCS. Springer Berlin Heidelberg.
Abstract: An effective approach to transcribing old text documents is to follow an interactive-predictive paradigm in which the system is guided by the human supervisor and the supervisor is assisted by the system, so that the transcription task is completed as efficiently as possible. In this paper, we focus on a particular system prototype called GIDOC, which can be seen as a first attempt to provide user-friendly, integrated support for interactive-predictive page layout analysis, text line detection and handwritten text transcription. More specifically, we focus on the handwriting recognition part of GIDOC, for which we propose the use of confidence measures to guide the human supervisor in locating possible system errors and deciding how to proceed. Empirical results are reported on two datasets, showing that a word error rate no larger than 10% can be achieved by checking only the 32% of words that are recognised with the lowest confidence.
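The selection step this abstract describes, checking only the words recognised with the least confidence, can be illustrated with a minimal Python sketch. It is not GIDOC's implementation: the function name `words_to_review`, the (word, confidence) pair format, and the sample scores are assumptions made purely for illustration.

```python
import math

def words_to_review(recognised, fraction):
    """Return the positions of the lowest-confidence words, in reading order.

    `recognised` is a list of (word, confidence) pairs and `fraction` is the
    share of words the supervisor is willing to check (hypothetical interface).
    """
    budget = math.ceil(len(recognised) * fraction)
    # Rank word positions from least to most confident.
    ranked = sorted(range(len(recognised)), key=lambda i: recognised[i][1])
    # Keep the least confident ones and restore reading order.
    return sorted(ranked[:budget])

# Made-up recognition output for one text line.
line = [("en", 0.95), ("la", 0.40), ("ciudad", 0.88), ("de", 0.15)]
flagged = words_to_review(line, 0.5)
```

With these made-up scores, the supervisor would be pointed at the second and fourth words only.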
|
H. Chouaib, Oriol Ramos Terrades, Salvatore Tabbone, F. Cloppet, & N. Vincent. (2008). Feature Selection Combining Genetic Algorithm and Adaboost Classifiers. In 19th International Conference on Pattern Recognition (pp. 1–4).
|
T.O. Nguyen, Salvatore Tabbone, & Oriol Ramos Terrades. (2008). Symbol Descriptor Based on Shape Context and Vector Model of Information Retrieval. In Proceedings of the 8th IAPR International Workshop on Document Analysis Systems (pp. 191–197).
|
H. Chouaib, Salvatore Tabbone, Oriol Ramos Terrades, F. Cloppet, N. Vincent, & A.T. Thierry Paquet. (2008). Sélection de Caractéristiques à partir d'un algorithme génétique et d'une combinaison de classifieurs Adaboost. In Colloque International Francophone sur l'Ecrit et le Document (pp. 181–186).
|
T.O. Nguyen, Salvatore Tabbone, Oriol Ramos Terrades, & A.T. Thierry. (2008). Proposition d'un descripteur de formes et du modèle vectoriel pour la recherche de symboles. In Colloque International Francophone sur l'Ecrit et le Document (pp. 79–84).
|
Salvatore Tabbone, Oriol Ramos Terrades, & S. Barrat. (2008). Histogram of radon transform. A useful descriptor for shape retrieval. In 19th International Conference on Pattern Recognition (pp. 1–4).
|
M. Visani, V.C. Kieu, Alicia Fornes, & N. Journet. (2013). The ICDAR 2013 Music Scores Competition: Staff Removal. In 12th International Conference on Document Analysis and Recognition (pp. 1439–1443).
Abstract: The first competition on music scores, organized at ICDAR in 2011, awoke the interest of researchers, who participated in both the staff removal and the writer identification tasks. In this second edition, we focus on the staff removal task and simulate a real case scenario: old music scores. For this purpose, we have generated a new set of images using two kinds of degradations: local noise and 3D distortions. This paper describes the dataset, distortion methods, evaluation metrics, the participants' methods and the obtained results.
|
Sergio Escalera, Ana Puig, Oscar Amoros, & Maria Salamo. (2011). Intelligent GPGPU Classification in Volume Visualization: a framework based on Error-Correcting Output Codes. CGF - Computer Graphics Forum, 30(7), 2107–2115.
Abstract: IF JCR 1.455 2010 25/99
In volume visualization, the definition of the regions of interest is inherently an iterative trial-and-error process of finding the best parameters to classify and render the final image. Generally, the user requires a lot of expertise to analyze and edit these parameters through multi-dimensional transfer functions. In this paper, we present a framework of intelligent methods to label multiple regions of interest on demand. These methods can be split into a two-level GPU-based labelling algorithm that computes a set of labelled structures at rendering time using the Machine Learning Error-Correcting Output Codes (ECOC) framework. In a pre-processing step, ECOC trains a set of Adaboost binary classifiers from a reduced pre-labelled data set. Then, at the testing stage, each classifier is independently applied to the features of a set of unlabelled samples, and the outputs are combined to perform multi-class labelling. We also propose an alternative representation of these classifiers that allows the testing stage to be highly parallelized. To exploit that parallelism, we implemented the testing stage in GPU-OpenCL. The empirical results on different data sets for several volume structures show high computational performance and classification accuracy.
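The ECOC testing stage summarised in this abstract, where independent binary outputs are combined into one multi-class label, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 4-class, 7-classifier code matrix, the class names, and the Hamming-distance decoding are chosen for the example and are not taken from the paper, and the binary outputs are supplied directly rather than produced by trained Adaboost classifiers.

```python
# Hypothetical code matrix: one codeword per class, one column per binary
# classifier. Any two codewords differ in 4 positions, so a single wrong
# binary output can still be corrected.
CODE_MATRIX = {
    "class_1": (+1, +1, +1, +1, +1, +1, +1),
    "class_2": (-1, -1, -1, -1, +1, +1, +1),
    "class_3": (-1, -1, +1, +1, -1, -1, +1),
    "class_4": (-1, +1, -1, +1, -1, +1, -1),
}

def hamming_distance(a, b):
    """Count the positions at which two codewords disagree."""
    return sum(x != y for x, y in zip(a, b))

def ecoc_decode(binary_outputs, code_matrix=CODE_MATRIX):
    """Return the class whose codeword is closest to the binary outputs."""
    return min(
        code_matrix,
        key=lambda label: hamming_distance(code_matrix[label], binary_outputs),
    )

# Outputs of the 7 binary classifiers for one sample (assumed values):
# the codeword of class_3 with its first bit flipped by a classifier error.
noisy_outputs = (+1, -1, +1, +1, -1, -1, +1)
predicted = ecoc_decode(noisy_outputs)
```

Because each sample is decoded independently of the others, this step parallelizes naturally across samples, which is the property the abstract exploits on the GPU.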
|
Mario Rojas, David Masip, A. Todorov, & Jordi Vitria. (2011). Automatic Prediction of Facial Trait Judgments: Appearance vs. Structural Models. Plos - PloS one, 6(8), e23323.
Abstract: JCR Impact Factor 2010: 4.411
Evaluating other individuals with respect to personality characteristics plays a crucial role in human relations and is the focus of attention for research in diverse fields such as psychology and interactive computer systems. In psychology, face perception has been recognized as a key component of this evaluation system. Multiple studies suggest that observers use face information to infer personality characteristics. Interactive computer systems are trying to take advantage of these findings and apply them to make interaction more natural and to improve their performance. Here, we experimentally test whether the automatic prediction of facial trait judgments (e.g. dominance) can be made by using the full appearance information of the face and whether a reduced representation of its structure is sufficient. We evaluate two separate approaches: a holistic representation model using the facial appearance information and a structural model constructed from the relations among facial salient points. State-of-the-art machine learning methods are applied to a) derive a facial trait judgment model from training data and b) predict a facial trait value for any face. Furthermore, we address the issue of whether there are specific structural relations among facial points that predict perception of facial traits. Experimental results over a set of labeled data (9 different trait evaluations) and classification rules (4 rules) suggest that a) prediction of perception of facial traits is learnable by both holistic and structural approaches; b) the most reliable prediction of facial trait judgments is obtained by certain types of holistic descriptions of the face appearance; and c) for some traits such as attractiveness and extroversion, there are relationships between specific structural features and social perceptions.
|
Bogdan Raducanu, & Fadi Dornaika. (2012). A Supervised Non-linear Dimensionality Reduction Approach for Manifold Learning. PR - Pattern Recognition, 45(6), 2432–2444.
Abstract: IF=2.61 (2010)
In this paper we introduce a novel supervised manifold learning technique called Supervised Laplacian Eigenmaps (S-LE), which makes use of class label information to guide the procedure of non-linear dimensionality reduction by adopting the large margin concept. The graph Laplacian is split into two components: within-class graph and between-class graph to better characterize the discriminant property of the data. Our approach has two important characteristics: (i) it adaptively estimates the local neighborhood surrounding each sample based on data density and similarity and (ii) the objective function simultaneously maximizes the local margin between heterogeneous samples and pushes the homogeneous samples closer to each other.
Our approach has been tested on several challenging face databases and compared with other linear and non-linear techniques, demonstrating its superiority. Although we have concentrated in this paper on the face recognition problem, the proposed approach could also be applied to other categories of objects characterized by large variations in their appearance (such as hand or body pose, for instance).
|
Sergio Escalera, Xavier Baro, Jordi Vitria, Petia Radeva, & Bogdan Raducanu. (2012). Social Network Extraction and Analysis Based on Multimodal Dyadic Interaction. SENS - Sensors, 12(2), 1702–1719.
Abstract: IF=1.77 (2010)
Social interactions are a very important component in people's lives. Social network analysis has become a common technique used to model and quantify the properties of social interactions. In this paper, we propose an integrated framework to explore the characteristics of a social network extracted from multimodal dyadic interactions. For our study, we used a set of videos belonging to New York Times' Blogging Heads opinion blog.
The Social Network is represented as an oriented graph, whose directed links are determined by the Influence Model. The links' weights are a measure of the "influence" a person has over the other. The states of the Influence Model encode automatically extracted audio/visual features from our videos using state-of-the-art algorithms. Our results are reported in terms of accuracy of audio/visual data fusion for speaker segmentation and centrality measures used to characterize the extracted social network.
|
Miguel Angel Bautista, Sergio Escalera, Xavier Baro, Oriol Pujol, Jordi Vitria, & Petia Radeva. (2011). On the Design of Low Redundancy Error-Correcting Output Codes. In Ensembles in Machine Learning Applications (Vol. 373, pp. 21–38). Springer Berlin Heidelberg.
Abstract: The classification of a large number of object categories is a challenging trend in the Pattern Recognition field. In the literature, this is often addressed using an ensemble of classifiers. In this scope, the Error-Correcting Output Codes framework has proven to be a powerful tool for combining classifiers. However, most of the state-of-the-art ECOC approaches use a linear or exponential number of classifiers, making the discrimination of a large number of classes unfeasible. In this paper, we explore and propose a compact design of ECOC in terms of the number of classifiers. Evolutionary computation is used for tuning the parameters of the classifiers and looking for the best compact ECOC code configuration. The results over several public UCI data sets and different multi-class Computer Vision problems show that the proposed methodology obtains comparable (even better) results than the state-of-the-art ECOC methodologies with far fewer dichotomizers.
|
Miguel Angel Bautista, Oriol Pujol, Xavier Baro, & Sergio Escalera. (2011). Introducing the Separability Matrix for Error Correcting Output Codes Coding. In Carlo Sansone, Josef Kittler, & Fabio Roli (Eds.), 10th International Conference on Multiple Classifier Systems (Vol. 6713, pp. 227–236). LNCS. Springer-Verlag Berlin, Heidelberg.
Abstract: Error Correcting Output Codes (ECOC) have proven to be a powerful tool for treating multi-class problems. Nevertheless, predefined ECOC designs may not benefit from error-correcting principles for particular multi-class data. In this paper, we introduce the Separability matrix as a tool to study and enhance designs for ECOC coding. In addition, a novel problem-dependent coding design based on the Separability matrix is tested over a wide set of challenging multi-class problems, obtaining very satisfactory results.
|
Fadi Dornaika, Alireza Bosaghzadeh, & Bogdan Raducanu. (2012). LSDA Solution Schemes for Modelless 3D Head Pose Estimation. In IEEE Workshop on the Applications of Computer Vision (pp. 393–398).
|