Lluis Gomez, & Dimosthenis Karatzas. (2013). Multi-script Text Extraction from Natural Scenes. In 12th International Conference on Document Analysis and Recognition (pp. 467–471).
Abstract: Scene text extraction methodologies are usually based on classification of individual regions or patches, using a priori knowledge for a given script or language. Human perception of text, on the other hand, is based on perceptual organisation through which text emerges as a perceptually significant group of atomic objects. Therefore humans are able to detect text even in languages and scripts never seen before. In this paper, we argue that the text extraction problem could be posed as the detection of meaningful groups of regions. We present a method built around a perceptual organisation framework that exploits collaboration of proximity and similarity laws to create text-group hypotheses. Experiments demonstrate that our algorithm is competitive with state of the art approaches on a standard dataset covering text in variable orientations and two languages.
|
Adriana Romero, & Carlo Gatta. (2013). Do We Really Need All These Neurons? In 6th Iberian Conference on Pattern Recognition and Image Analysis (Vol. 7887, pp. 460–467). LNCS. Springer Berlin Heidelberg.
Abstract: Restricted Boltzmann Machines (RBMs) are generative neural networks that have received much attention recently. In particular, choosing the appropriate number of hidden units is important, as an inappropriate choice might hinder their representational power. According to the literature, RBMs require numerous hidden units to approximate any distribution properly. In this paper, we present an experiment to determine whether such a number of hidden units is required in a classification context. We then propose an incremental algorithm that trains RBMs by reusing the previously trained parameters, using a trade-off measure to determine the appropriate number of hidden units. Results on the MNIST and OCR letters databases show that a number of hidden units one order of magnitude smaller than the literature estimate suffices to achieve similar performance. Moreover, the proposed algorithm allows estimating the required number of hidden units without training many RBMs from scratch.
Keywords: Restricted Boltzmann Machine; hidden units; unsupervised learning; classification
|
Fadi Dornaika, Alireza Bosaghzadeh, & Bogdan Raducanu. (2013). Efficient Graph Construction for Label Propagation based Multi-observation Face Recognition. In Human Behavior Understanding 4th International Workshop (Vol. 8212, pp. 124–135). Springer International Publishing.
Abstract: Human-machine interaction is a hot topic nowadays in the communities of multimedia and computer vision. In this context, face recognition algorithms (used as a primary cue for a person’s identity assessment) work well under controlled conditions but degrade significantly when tested in real-world environments. Recently, graph-based label propagation for multi-observation face recognition was proposed. However, the associated graphs were constructed in an ad-hoc manner (e.g., using the KNN graph) that cannot adapt optimally to the data. In this paper, we propose a novel approach for efficient and adaptive graph construction that can be used for multi-observation face recognition as well as for other recognition problems. Experimental results on the Honda video face database show a distinct advantage of the proposed method over the standard graph construction methods.
|
David Fernandez, Simone Marinai, Josep Llados, & Alicia Fornes. (2013). Contextual Word Spotting in Historical Manuscripts using Markov Logic Networks. In 2nd International Workshop on Historical Document Imaging and Processing (pp. 36–43).
Abstract: Natural languages can often be modelled by suitable grammars whose knowledge can improve the word spotting results. The implicit contextual information is even more useful when dealing with information that is intrinsically described as a collection of records. In this paper, we present an approach to word spotting which uses the contextual information of records to improve the results. The method relies on Markov Logic Networks to probabilistically model the relational organization of handwritten records. The performance has been evaluated on the Barcelona Marriages Dataset, which contains structured handwritten records that summarize marriage information.
|
A. M. Here, B. C. Lopez, Debora Gil, J. J. Camarero, & Jordi Martinez-Vilalta. (2013). A new software to analyse wood anatomical features in conifer species. In International Symposium on Wood Structure in Plant Biology and Ecology.
|
Debora Gil, Agnes Borras, Sergio Vera, & Miguel Angel Gonzalez Ballester. (2013). A Validation Benchmark for Assessment of Medial Surface Quality for Medical Applications. In 9th International Conference on Computer Vision Systems (Vol. 7963, pp. 334–343). LNCS. Springer Berlin Heidelberg.
Abstract: Confident use of medial surfaces in medical decision support systems requires evaluating their quality for detecting pathological deformations and describing anatomical volumes. Validation in the medical imaging field is a challenging task, mainly due to the difficulty of obtaining a consensual ground truth. In this paper we propose a validation benchmark for assessing medial surfaces in the context of medical applications. Our benchmark includes a home-made database of synthetic medial surfaces and volumes, and specific scores for evaluating surface accuracy, its stability against volume deformations and its capabilities for accurate reconstruction of anatomical volumes.
Keywords: Medial Surfaces; Shape Representation; Medical Applications; Performance Evaluation
|
Sergio Vera, Miguel Angel Gonzalez Ballester, & Debora Gil. (2013). Volumetric Anatomical Parameterization and Meshing for Inter-patient Liver Coordinate System Definition. In 16th International Conference on Medical Image Computing and Computer Assisted Intervention.
|
Carles Sanchez, Jorge Bernal, Debora Gil, & F. Javier Sanchez. (2013). On-line lumen centre detection in gastrointestinal and respiratory endoscopy. In Klaus Drechsler, Miguel Angel González Ballester, Stefan Wesarg, Raj Shekhar, Cristina Oyarzun Laura, Marius George L., & M. Erdt (Eds.), Second International Workshop Clinical Image-Based Procedures (Vol. 8361, pp. 31–38). LNCS. Springer International Publishing.
Abstract: We present in this paper a novel lumen centre detection for gastrointestinal and respiratory endoscopic images. The proposed method is based on the appearance and geometry of the lumen, which we define as the darkest image region whose centre is a hub of image gradients. Experimental results validated on the first public annotated gastro-respiratory database prove the reliability of the method for a wide range of images (with precision over 95%).
Keywords: Lumen centre detection; Bronchoscopy; Colonoscopy
|
David Roche, Debora Gil, & Jesus Giraldo. (2013). Detecting loss of diversity for an efficient termination of EAs. In 15th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (pp. 561–566).
Abstract: Terminating Evolutionary Algorithms (EAs) at their steady state, so that useless iterations are not performed, is a main point for their efficient application to black-box problems. Many EAs evolve while there is still diversity in their population and, thus, they could be terminated by analyzing the behavior of some measures of EA population diversity. This paper presents a numeric approximation to steady states that can be used to detect the moment an EA population has lost its diversity, for EA termination. Our condition has been applied to 3 EA paradigms based on diversity and a selection of functions covering the properties most relevant for EA convergence. Experiments show that our condition works regardless of the search space dimension and function landscape.
Keywords: EA termination; EA population diversity; EA steady state
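The abstract above does not spell out the termination condition, so the following is only a minimal Python sketch of the general idea of stopping once a diversity measure has flattened out. The diversity measure (mean per-coordinate standard deviation), the names `diversity` and `reached_steady_state`, and the `window` and `tol` parameters are all assumptions made for illustration, not the authors' formulation.

```python
import statistics

def diversity(population):
    """Mean per-coordinate population standard deviation (an assumed measure)."""
    dims = list(zip(*population))
    return sum(statistics.pstdev(d) for d in dims) / len(dims)

def reached_steady_state(history, window=5, tol=1e-3):
    """True when diversity varied by less than `tol` over the last `window` values."""
    if len(history) < window:
        return False
    recent = history[-window:]
    return max(recent) - min(recent) < tol

# Toy stand-in for an EA: the population contracts by half each generation.
base = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
history = []
stop_gen = None
for gen in range(100):
    scale = 0.5 ** gen
    population = [(x * scale, y * scale) for x, y in base]
    history.append(diversity(population))
    if reached_steady_state(history):
        stop_gen = gen  # diversity has flattened; further iterations add little
        break
```

In a real EA the loop body would be one generation of selection and variation; only the diversity history and the stopping check carry over from this sketch.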
|
Hongxing Gao, Marçal Rusiñol, Dimosthenis Karatzas, Josep Llados, Tomokazu Sato, Masakazu Iwamura, et al. (2013). Key-region detection for document images -applications to administrative document retrieval. In 12th International Conference on Document Analysis and Recognition (pp. 230–234).
Abstract: In this paper we argue that a key-region detector designed to take into account the special characteristics of document images can result in the detection of fewer and more meaningful key-regions. We propose a fast key-region detector able to capture aspects of the structural information of the document, and demonstrate its efficiency by comparing against standard detectors in an administrative document retrieval scenario. We show that using the proposed detector results in a smaller number of detected key-regions and higher performance, without any drop in speed compared to standard state of the art detectors.
|
Andreas Fischer, Ching Y. Suen, Volkmar Frinken, Kaspar Riesen, & Horst Bunke. (2013). A Fast Matching Algorithm for Graph-Based Handwriting Recognition. In 9th IAPR – TC15 Workshop on Graph-based Representation in Pattern Recognition (Vol. 7877, pp. 194–203). LNCS. Springer Berlin Heidelberg.
Abstract: The recognition of unconstrained handwriting images is usually based on vectorial representation and statistical classification. Despite their high representational power, graphs are rarely used in this field due to a lack of efficient graph-based recognition methods. Recently, graph similarity features have been proposed to bridge the gap between structural representation and statistical classification by means of vector space embedding. This approach has shown a high performance in terms of accuracy but had shortcomings in terms of computational speed. The time complexity of the Hungarian algorithm that is used to approximate the edit distance between two handwriting graphs is demanding for a real-world scenario. In this paper, we propose a faster graph matching algorithm which is derived from the Hausdorff distance. On the historical Parzival database it is demonstrated that the proposed method achieves a speedup factor of 12.9 without significant loss in recognition accuracy.
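For orientation, the plain Hausdorff distance between two point sets can be sketched in Python as below. This is not the matching algorithm of the paper, which is only described as being derived from the Hausdorff distance; treating graph nodes as bare 2-D points and the function names are assumptions made for illustration. The sketch does show why such a measure is cheap: it needs only nearest-neighbour lookups (quadratic in the number of nodes), with no assignment problem to solve.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from any point in `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# Hypothetical node positions of two small handwriting graphs.
g1 = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
g2 = [(0.0, 0.0), (1.0, 0.0), (4.0, 5.0)]
d = hausdorff(g1, g2)
```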
|
Andreas Fischer, Volkmar Frinken, Horst Bunke, & Ching Y. Suen. (2013). Improving HMM-Based Keyword Spotting with Character Language Models. In 12th International Conference on Document Analysis and Recognition (pp. 506–510).
Abstract: Facing high error rates and slow recognition speed for full text transcription of unconstrained handwriting images, keyword spotting is a promising alternative to locate specific search terms within scanned document images. We have previously proposed a learning-based method for keyword spotting using character hidden Markov models that showed a high performance when compared with traditional template image matching. In the lexicon-free approach pursued, only the text appearance was taken into account for recognition. In this paper, we integrate character n-gram language models into the spotting system in order to provide an additional language context. On the modern IAM database as well as the historical George Washington database, we demonstrate that character language models significantly improve the spotting performance.
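As a rough illustration of what a character n-gram language model contributes, the Python sketch below scores strings with a character bigram model using add-one smoothing. The bigram order, the smoothing, the `^` and `$` boundary symbols, and the function names are assumptions made for the example; the abstract does not state how the models are built or how they are combined with the HMM scores.

```python
import math
from collections import Counter

def train_char_bigrams(words):
    """Count character bigrams, with '^' and '$' as assumed word boundary symbols."""
    pair_counts = Counter()
    context_counts = Counter()
    alphabet = set()
    for word in words:
        padded = "^" + word + "$"
        alphabet.update(padded)
        for a, b in zip(padded, padded[1:]):
            pair_counts[(a, b)] += 1
            context_counts[a] += 1
    return pair_counts, context_counts, len(alphabet)

def log_prob(word, model):
    """Log-probability of `word` under the bigram model with add-one smoothing."""
    pair_counts, context_counts, vocab_size = model
    padded = "^" + word + "$"
    return sum(
        math.log((pair_counts[(a, b)] + 1) / (context_counts[a] + vocab_size))
        for a, b in zip(padded, padded[1:])
    )

# Tiny hypothetical training vocabulary.
model = train_char_bigrams(["the", "then", "there", "that"])
plausible = log_prob("the", model)
implausible = log_prob("hte", model)
```

A spotting system could add such a score to the appearance-based score so that character sequences unlikely in the language are penalised.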
|
Volkmar Frinken, Andreas Fischer, & Carlos David Martinez Hinarejos. (2013). Handwriting Recognition in Historical Documents using Very Large Vocabularies. In 2nd International Workshop on Historical Document Imaging and Processing (pp. 67–72).
Abstract: Language models are used in automatic transcription systems to resolve ambiguities. This is done by limiting the vocabulary of words that can be recognized as well as estimating the n-gram probability of the words in the given text. In the context of historical documents, a non-unified spelling and the limited amount of written text pose a substantial problem for the selection of the recognizable vocabulary as well as the computation of the word probabilities. In this paper we propose, for the transcription of historical Spanish text, to keep the corpus for the n-gram limited to a sample of the target text, but to expand the vocabulary with words gathered from external resources. We analyze the performance of such a transcription system with different sizes of external vocabularies and demonstrate the applicability of using up to 300 thousand external words and the significant increase in recognition accuracy it brings.
|
Antonio Clavelli, Dimosthenis Karatzas, Josep Llados, Mario Ferraro, & Giuseppe Boccignone. (2013). Towards Modelling an Attention-Based Text Localization Process. In 6th Iberian Conference on Pattern Recognition and Image Analysis (Vol. 7887, pp. 296–303). LNCS. Springer Berlin Heidelberg.
Abstract: This note introduces a visual attention model of text localization in real-world scenes. The core of the model, built upon the proto-object concept, is discussed. It is shown how such a dynamic mid-level representation of the scene can be derived in the framework of an action-perception loop engaging salience, text information value computation, and eye guidance mechanisms. Preliminary results that compare model-generated scanpaths with those eye-tracked from human subjects are presented.
Keywords: text localization; visual attention; eye guidance
|
Nuria Cirera, Alicia Fornes, Volkmar Frinken, & Josep Llados. (2013). Hybrid grammar language model for handwritten historical documents recognition. In 6th Iberian Conference on Pattern Recognition and Image Analysis (Vol. 7887, pp. 117–124). LNCS. Springer Berlin Heidelberg.
Abstract: In this paper we present a hybrid language model for the recognition of handwritten historical documents with a structured syntactical layout. Using a hidden Markov model-based recognition framework, a word-based grammar with a closed dictionary is enhanced by a character sequence recognition method. This allows out-of-dictionary words to be recognized in controlled parts of the recognition, while keeping a closed vocabulary restriction for other parts. While the current status is work in progress, we can report an improvement in terms of character error rate.
|