|
Adriana Romero, & Carlo Gatta. (2013). Do We Really Need All These Neurons? In 6th Iberian Conference on Pattern Recognition and Image Analysis (Vol. 7887, pp. 460–467). LNCS. Springer Berlin Heidelberg.
Abstract: Restricted Boltzmann Machines (RBMs) are generative neural networks that have received much attention recently. In particular, choosing the appropriate number of hidden units is important, as a poor choice might hinder their representative power. According to the literature, RBMs require numerous hidden units to approximate any distribution properly. In this paper, we present an experiment to determine whether such a number of hidden units is required in a classification context. We then propose an incremental algorithm that trains RBMs by reusing the previously trained parameters, using a trade-off measure to determine the appropriate number of hidden units. Results on the MNIST and OCR letters databases show that a number of hidden units one order of magnitude smaller than the literature estimate suffices to achieve similar performance. Moreover, the proposed algorithm makes it possible to estimate the required number of hidden units without training many RBMs from scratch.
Keywords: Restricted Boltzmann Machine; hidden units; unsupervised learning; classification
|
|
|
Fadi Dornaika, Alireza Bosaghzadeh, & Bogdan Raducanu. (2013). Efficient Graph Construction for Label Propagation based Multi-observation Face Recognition. In Human Behavior Understanding 4th International Workshop (Vol. 8212, pp. 124–135). Springer International Publishing.
Abstract: Human-machine interaction is a hot topic nowadays in the communities of multimedia and computer vision. In this context, face recognition algorithms (used as a primary cue for a person’s identity assessment) work well under controlled conditions but degrade significantly when tested in real-world environments. Recently, graph-based label propagation for multi-observation face recognition was proposed. However, the associated graphs were constructed in an ad hoc manner (e.g., using the KNN graph) that cannot adapt optimally to the data. In this paper, we propose a novel approach for efficient and adaptive graph construction that can be used for multi-observation face recognition as well as for other recognition problems. Experimental results on the Honda video face database show a distinct advantage of the proposed method over the standard graph construction methods.
|
|
|
Bogdan Raducanu, & Fadi Dornaika. (2014). Embedding new observations via sparse-coding for non-linear manifold learning. PR - Pattern Recognition, 47(1), 480–492.
Abstract: Non-linear dimensionality reduction techniques are affected by two critical aspects: (i) the design of the adjacency graphs, and (ii) the embedding of new test data (the out-of-sample problem). For the first aspect, the proposed solutions have generally been heuristically driven. For the second aspect, the difficulty resides in finding an accurate mapping that transfers unseen data samples into an existing manifold. Past works addressing these two aspects were heavily parametric, in the sense that optimal performance is only achieved for a suitable parameter choice that must be known in advance. In this paper, we demonstrate that sparse representation theory not only serves for automatic graph construction, as shown in recent works, but also represents an accurate alternative for out-of-sample embedding. Taking Laplacian Eigenmaps as a case study, we applied our method to the face recognition problem. To evaluate the effectiveness of the proposed out-of-sample embedding, experiments are conducted using the K-nearest neighbor (KNN) and Kernel Support Vector Machine (KSVM) classifiers on six public face datasets. The experimental results show that the proposed model is able to achieve high categorization effectiveness as well as high consistency with non-linear embeddings/manifolds obtained in batch mode.
|
|
|
Albert Gordo, Florent Perronnin, & Ernest Valveny. (2013). Large-scale document image retrieval and classification with runlength histograms and binary embeddings. PR - Pattern Recognition, 46(7), 1898–1905.
Abstract: We present a new document image descriptor based on multi-scale runlength histograms. This descriptor does not rely on layout analysis and can be computed efficiently. We show how this descriptor can achieve state-of-the-art results on two very different public datasets in classification and retrieval tasks. Moreover, we show how we can compress and binarize these descriptors to make them suitable for large-scale applications. We can achieve state-of-the-art results in classification using binary descriptors of as few as 16 to 64 bits.
Keywords: visual document descriptor; compression; large-scale; retrieval; classification
|
|
|
Albert Gordo, Alicia Fornes, & Ernest Valveny. (2013). Writer identification in handwritten musical scores with bags of notes. PR - Pattern Recognition, 46(5), 1337–1345.
Abstract: Writer Identification is an important task for the automatic processing of documents. However, the identification of the writer in graphical documents is still challenging. In this work, we adapt the Bag of Visual Words framework to the task of writer identification in handwritten musical scores. A vanilla implementation of this method already performs comparably to the state-of-the-art. Furthermore, we analyze the effect of two improvements of the representation: a Bhattacharyya embedding, which improves the results at virtually no extra cost, and a Fisher Vector representation that very significantly improves the results at the cost of a more complex and costly representation. Experimental evaluation shows results more than 20 points above the state-of-the-art in a new, challenging dataset.
|
|
|
David Fernandez, Simone Marinai, Josep Llados, & Alicia Fornes. (2013). Contextual Word Spotting in Historical Manuscripts using Markov Logic Networks. In 2nd International Workshop on Historical Document Imaging and Processing (pp. 36–43).
Abstract: Natural languages can often be modelled by suitable grammars whose knowledge can improve word spotting results. The implicit contextual information is even more useful when dealing with information that is intrinsically described as a collection of records. In this paper, we present an approach to word spotting which uses the contextual information of records to improve the results. The method relies on Markov Logic Networks to probabilistically model the relational organization of handwritten records. The performance has been evaluated on the Barcelona Marriages Dataset, which contains structured handwritten records that summarize marriage information.
|
|
|
A. M. Here, B. C. Lopez, Debora Gil, J. J. Camarero, & Jordi Martinez-Vilalta. (2013). A new software to analyse wood anatomical features in conifer species. In International Symposium on Wood Structure in Plant Biology and Ecology.
|
|
|
Enric Marti, Ferran Poveda, Antoni Gurgui, Jaume Rocarias, & Debora Gil. (2013). Una propuesta de seguimiento, tutorías on line y evaluación en la metodología de Aprendizaje Basado en Proyectos.
|
|
|
Debora Gil, Agnes Borras, Sergio Vera, & Miguel Angel Gonzalez Ballester. (2013). A Validation Benchmark for Assessment of Medial Surface Quality for Medical Applications. In 9th International Conference on Computer Vision Systems (Vol. 7963, pp. 334–343). LNCS. Springer Berlin Heidelberg.
Abstract: Confident use of medial surfaces in medical decision support systems requires evaluating their quality for detecting pathological deformations and describing anatomical volumes. Validation in the medical imaging field is a challenging task, mainly due to the difficulty of obtaining a consensus ground truth. In this paper we propose a validation benchmark for assessing medial surfaces in the context of medical applications. Our benchmark includes a home-made database of synthetic medial surfaces and volumes, and specific scores for evaluating surface accuracy, stability against volume deformations, and capability for accurate reconstruction of anatomical volumes.
Keywords: Medial Surfaces; Shape Representation; Medical Applications; Performance Evaluation
|
|
|
Sergio Vera, Miguel Angel Gonzalez Ballester, & Debora Gil. (2013). Volumetric Anatomical Parameterization and Meshing for Inter-patient Liver Coordinate System Definition. In 16th International Conference on Medical Image Computing and Computer Assisted Intervention.
|
|
|
Carles Sanchez, Jorge Bernal, Debora Gil, & F. Javier Sanchez. (2013). On-line lumen centre detection in gastrointestinal and respiratory endoscopy. In M. Erdt, Marius George L., Cristina Oyarzun Laura, Raj Shekhar, Stefan Wesarg, Miguel Angel González Ballester, & Klaus Drechsler (Eds.), Second International Workshop Clinical Image-Based Procedures (Vol. 8361, pp. 31–38). LNCS. Springer International Publishing.
Abstract: We present in this paper a novel lumen centre detection method for gastrointestinal and respiratory endoscopic images. The proposed method is based on the appearance and geometry of the lumen, which we define as the darkest image region whose centre is a hub of image gradients. Experimental results validated on the first public annotated gastro-respiratory database prove the reliability of the method for a wide range of images (with precision over 95%).
Keywords: Lumen centre detection; Bronchoscopy; Colonoscopy
|
|
|
Volkmar Frinken, Andreas Fischer, Markus Baumgartner, & Horst Bunke. (2014). Keyword spotting for self-training of BLSTM NN based handwriting recognition systems. PR - Pattern Recognition, 47(3), 1073–1082.
Abstract: The automatic transcription of unconstrained continuous handwritten text requires well-trained recognition systems. The semi-supervised paradigm introduces the concept of using not only labeled data but also unlabeled data in the learning process. Unlabeled data can be gathered at little or no cost. Hence it has the potential to reduce the need for labeling training data, a tedious and costly process. Given a weak initial recognizer trained on labeled data, self-training can be used to recognize unlabeled data and add words that were recognized with high confidence to the training set for re-training. This process is not trivial and requires great care in selecting the elements that are to be added to the training set. In this paper, we propose to use a bidirectional long short-term memory neural network handwriting recognition system for keyword spotting in order to select new elements. A set of experiments shows the high potential of self-training for bootstrapping handwriting recognition systems, for both modern and historical handwriting, and demonstrates the benefits of using keyword spotting over previously published self-training schemes.
Keywords: Document retrieval; Keyword spotting; Handwriting recognition; Neural networks; Semi-supervised learning
|
|
|
Veronica Romero, Alicia Fornes, Nicolas Serrano, Joan Andreu Sanchez, A.H. Toselli, Volkmar Frinken, et al. (2013). The ESPOSALLES database: An ancient marriage license corpus for off-line handwriting recognition. PR - Pattern Recognition, 46(6), 1658–1669.
Abstract: Historical records of daily activities provide intriguing insights into the life of our ancestors, useful for demography studies and genealogical research. Automatic processing of historical documents, however, has mostly been focused on single works of literature and less on social records, which tend to have a distinct layout, structure, and vocabulary. Such information is usually collected by expert demographers who devote a lot of time to manually transcribing it. This paper presents a new database, compiled from a collection of marriage license books, to support research in automatic handwriting recognition for historical documents containing social records. Marriage license books are documents that were used for centuries by ecclesiastical institutions to register marriage licenses. Books from this collection are handwritten and span nearly half a millennium, until the beginning of the 20th century. In addition, a study is presented on the capability of state-of-the-art handwritten text recognition systems when applied to the presented database. Baseline results are reported for reference in future studies.
|
|
|
David Roche, Debora Gil, & Jesus Giraldo. (2013). Detecting loss of diversity for an efficient termination of EAs. In 15th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (pp. 561–566).
Abstract: Termination of Evolutionary Algorithms (EAs) at their steady state, so that useless iterations are not performed, is a main point for their efficient application to black-box problems. Many EAs evolve while there is still diversity in their population and, thus, they could be terminated by analyzing the behavior of some measures of EA population diversity. This paper presents a numeric approximation to steady states that can be used to detect the moment an EA population has lost its diversity, for EA termination. Our condition has been applied to 3 EA paradigms based on diversity and a selection of functions covering the properties most relevant for EA convergence. Experiments show that our condition works regardless of the search space dimension and function landscape.
Keywords: EA termination; EA population diversity; EA steady state
|
|
|
Hongxing Gao, Marçal Rusiñol, Dimosthenis Karatzas, Josep Llados, Tomokazu Sato, Masakazu Iwamura, et al. (2013). Key-region detection for document images - applications to administrative document retrieval. In 12th International Conference on Document Analysis and Recognition (pp. 230–234).
Abstract: In this paper we argue that a key-region detector designed to take into account the special characteristics of document images can result in the detection of fewer and more meaningful key-regions. We propose a fast key-region detector able to capture aspects of the structural information of the document, and demonstrate its efficiency by comparing against standard detectors in an administrative document retrieval scenario. We show that using the proposed detector results in a smaller number of detected key-regions and higher performance, without any drop in speed compared to standard state-of-the-art detectors.
|
|