|
Suman Ghosh and Ernest Valveny. 2015. A Sliding Window Framework for Word Spotting Based on Word Attributes. Pattern Recognition and Image Analysis, Proceedings of 7th Iberian Conference, IbPRIA 2015. Springer International Publishing, 652–661. (LNCS.)
Abstract: In this paper we propose a segmentation-free approach to word spotting. Word images are first encoded into feature vectors using Fisher Vector. Then, these feature vectors are used together with pyramidal histogram of characters labels (PHOC) to learn SVM-based attribute models. Documents are represented by these PHOC based word attributes. To efficiently compute the word attributes over a sliding window, we propose to use an integral image representation of the document using a simplified version of the attribute model. Finally we re-rank the top word candidates using the more discriminative full version of the word attributes. We show state-of-the-art results for segmentation-free query-by-example word spotting in single-writer and multi-writer standard datasets.
Keywords: Word spotting; Sliding window; Word attributes
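The integral image idea mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a hypothetical 2-D map of per-pixel scores for a single attribute; the names `integral_image`, `window_sum` and `best_window` are illustrative and not taken from the paper, which applies the idea to a simplified attribute model rather than to raw scores.

```python
def integral_image(scores):
    """Build a (rows+1) x (cols+1) summed-area table for a 2-D list of numbers."""
    rows, cols = len(scores), len(scores[0])
    table = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0.0
        for c in range(cols):
            row_sum += scores[r][c]
            # Sum of everything above and to the left, inclusive.
            table[r + 1][c + 1] = table[r][c + 1] + row_sum
    return table


def window_sum(table, top, left, height, width):
    """Sum of the scores inside a window, using four table lookups."""
    bottom, right = top + height, left + width
    return (table[bottom][right] - table[top][right]
            - table[bottom][left] + table[top][left])


def best_window(scores, height, width):
    """Slide a fixed-size window over the map and return (sum, top, left) of the best one."""
    table = integral_image(scores)
    rows, cols = len(scores), len(scores[0])
    best = None
    for top in range(rows - height + 1):
        for left in range(cols - width + 1):
            s = window_sum(table, top, left, height, width)
            if best is None or s > best[0]:
                best = (s, top, left)
    return best


# Toy score map (hypothetical values).
scores = [[0, 0, 0, 0],
          [0, 1, 2, 0],
          [0, 3, 4, 0]]
print(best_window(scores, 2, 2))
```

Once the table is built, each window costs a constant number of lookups regardless of its size, which is what makes an exhaustive sliding-window search affordable.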
|
|
|
Lluis Gomez and Dimosthenis Karatzas. 2016. A fine-grained approach to scene text script identification. 12th IAPR Workshop on Document Analysis Systems. 192–197.
Abstract: This paper focuses on the problem of script identification in unconstrained scenarios. Script identification is an important prerequisite to recognition, and an indispensable condition for automatic text understanding systems designed for multi-language environments. Although widely studied for document images and handwritten documents, it remains an almost unexplored territory for scene text images. We detail a novel method for script identification in natural images that combines convolutional features and the Naive-Bayes Nearest Neighbor classifier. The proposed framework efficiently exploits the discriminative power of small stroke-parts, in a fine-grained classification framework. In addition, we propose a new public benchmark dataset for the evaluation of joint text detection and script identification in natural scenes. Experiments done in this new dataset demonstrate that the proposed method yields state of the art results, while it generalizes well to different datasets and variable number of scripts. The evidence provided shows that multi-lingual scene text recognition in the wild is a viable proposition. Source code of the proposed method is made available online.
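As a rough illustration of the Naive-Bayes Nearest Neighbor decision rule named in the abstract above, the following sketch classifies a set of local descriptors by summing, per class, the squared distance from each query descriptor to its nearest descriptor of that class. The descriptors here are toy 2-D tuples and the class labels are hypothetical; the paper works with convolutional features of stroke-parts, which this sketch does not attempt to reproduce.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nbnn_classify(query_descriptors, class_descriptors):
    """Return (predicted_label, totals) under the NBNN rule.

    class_descriptors maps a label to the list of descriptors pooled
    from all training images of that class.
    """
    totals = {}
    for label, refs in class_descriptors.items():
        # Image-to-class distance: nearest neighbour per query descriptor, summed.
        totals[label] = sum(
            min(squared_distance(d, r) for r in refs)
            for d in query_descriptors
        )
    return min(totals, key=totals.get), totals


# Toy data (hypothetical labels and values).
class_descriptors = {
    "latin": [(0, 0), (1, 0)],
    "greek": [(5, 5), (6, 5)],
}
query = [(0.5, 0), (1, 1)]
label, totals = nbnn_classify(query, class_descriptors)
print(label, totals)
```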
|
|
|
Joan Mas, Alicia Fornes and Josep Llados. 2016. An Interactive Transcription System of Census Records using Word-Spotting based Information Transfer. 12th IAPR Workshop on Document Analysis Systems. 54–59.
Abstract: This paper presents a system to assist in the transcription of historical handwritten census records in a crowdsourcing platform. Census records have a tabular structured layout. They consist of a sequence of rows with information of homes ordered by street address. For each household snippet in the page, the list of family members is reported. The censuses are recorded in intervals of a few years and the information of individuals in each household is quite stable from a point in time to the next one. This redundancy is used to assist the transcriber, so the redundant information is transferred from the census already transcribed to the next one. Household records are aligned from one year to the next one using the knowledge of the ordering by street address. Given an already transcribed census, a query by string word spotting is applied. Thus, names from the census in time t are used as queries in the corresponding home record in time t+1. Since the search is constrained, the obtained precision-recall values are very high, with an important reduction in the transcription time. The proposed system has been tested in a real citizen-science experience where non-expert users transcribe the census data of their home town.
|
|
|
Juan Ignacio Toledo, Alicia Fornes, Jordi Cucurull and Josep Llados. 2016. Election Tally Sheets Processing System. 12th IAPR Workshop on Document Analysis Systems. 364–368.
Abstract: In paper-based elections, manual tallies at polling station level produce myriads of documents. These documents share a common form-like structure and a reduced vocabulary worldwide. On the other hand, each tally sheet is filled by a different writer, and in different countries different scripts are used. We present a complete document analysis system for electoral tally sheet processing combining state of the art techniques with a new handwriting recognition subprocess based on unsupervised feature discovery with Variational Autoencoders and sequence classification with BLSTM neural networks. The whole system is designed to be script independent and allows a fast and reliable results consolidation process with reduced operational cost.
|
|
|
Anders Hast and Alicia Fornes. 2016. A Segmentation-free Handwritten Word Spotting Approach by Relaxed Feature Matching. 12th IAPR Workshop on Document Analysis Systems. 150–155.
Abstract: The automatic recognition of historical handwritten documents is still considered a challenging task. For this reason, word spotting emerges as a good alternative for making the information contained in these documents available to the user. Word spotting is defined as the task of retrieving all instances of the query word in a document collection, becoming a useful tool for information retrieval. In this paper we propose a segmentation-free word spotting approach able to deal with large document collections. Our method is inspired by feature matching algorithms that have been applied to image matching and retrieval. Since handwritten words have different shapes, there is no exact transformation to be obtained. However, a sufficient degree of relaxation is achieved by using a Fourier based descriptor and an alternative approach to RANSAC called PUMA. The proposed approach is evaluated on historical marriage records, achieving promising results.
|
|
|
Dimosthenis Karatzas, V. Poulain d'Andecy and Marçal Rusiñol. 2016. Human-Document Interaction – a new frontier for document image analysis. 12th IAPR Workshop on Document Analysis Systems. 369–374.
Abstract: All indications show that paper documents will not cede in favour of their digital counterparts, but will instead be used increasingly in conjunction with digital information. An open challenge is how to seamlessly link the physical with the digital – how to continue taking advantage of the important affordances of paper, without missing out on digital functionality. This paper presents the authors’ experience with developing systems for Human-Document Interaction based on augmented document interfaces and examines new challenges and opportunities arising for the document image analysis field in this area. The system presented combines state of the art camera-based document image analysis techniques with a range of complementary technologies to offer fluid Human-Document Interaction. Both fixed and nomadic setups are discussed that have gone through user testing in real-life environments, and use cases are presented that span the spectrum from business to educational application
|
|
|
Q. Bao, Marçal Rusiñol, M. Coustaty, Muhammad Muzzamil Luqman, C.D. Tran and Jean-Marc Ogier. 2016. Delaunay triangulation-based features for Camera-based document image retrieval system. 12th IAPR Workshop on Document Analysis Systems. 1–6.
Abstract: In this paper, we propose a new feature vector, named DElaunay TRIangulation-based Features (DETRIF), for real-time camera-based document image retrieval. DETRIF is computed based on the geometrical constraints from each pair of adjacent triangles in the Delaunay triangulation, which is constructed from the centroids of connected components. Besides, we employ a hashing-based indexing system in order to evaluate the performance of DETRIF and to compare it with other systems such as LLAH and SRIF. The experimentation is carried out on two datasets comprising 400 heterogeneous-content complex linguistic map images (huge size, 9800 × 11768 pixel resolution) and 700 textual document images.
Keywords: Camera-based Document Image Retrieval; Delaunay Triangulation; Feature descriptors; Indexing
|
|
|
Jaume Gibert, Ernest Valveny and Horst Bunke. 2010. Graph of Words Embedding for Molecular Structure-Activity Relationship Analysis. 15th Iberoamerican Congress on Pattern Recognition. 30–37. (LNCS.)
Abstract: Structure-Activity relationship analysis aims at discovering chemical activity of molecular compounds based on their structure. In this article we make use of a particular graph representation of molecules and propose a new graph embedding procedure to solve the problem of structure-activity relationship analysis. The embedding is essentially an arrangement of a molecule in the form of a vector by considering frequencies of appearing atoms and frequencies of covalent bonds between them. Results on two benchmark databases show the effectiveness of the proposed technique in terms of recognition accuracy while avoiding high operational costs in the transformation.
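The embedding described in the abstract above, a vector of atom frequencies followed by frequencies of bonds between atom types, can be sketched as follows. This is a simplified illustration under stated assumptions: molecules are given as a list of atom symbols plus index pairs for bonds, bonds are treated as undirected and untyped, and the function name `embed_molecule` and the fixed `alphabet` argument are hypothetical rather than taken from the paper.

```python
from collections import Counter
from itertools import combinations_with_replacement


def embed_molecule(atoms, bonds, alphabet):
    """Map a molecular graph to a fixed-length vector.

    atoms    -- list of atom symbols, one per node (e.g. ["C", "C", "O"])
    bonds    -- list of (i, j) index pairs into `atoms`
    alphabet -- atom symbols that define the vector layout (assumed fixed
                across the whole dataset so that vectors are comparable)
    """
    symbols = sorted(alphabet)
    atom_counts = Counter(atoms)
    # Sort each pair so that C-O and O-C count as the same bond type.
    bond_counts = Counter(
        tuple(sorted((atoms[i], atoms[j]))) for i, j in bonds
    )
    pairs = list(combinations_with_replacement(symbols, 2))
    return ([atom_counts[s] for s in symbols]
            + [bond_counts[p] for p in pairs])


# Heavy-atom skeleton of ethanol: C-C-O (hydrogens omitted for brevity).
vector = embed_molecule(["C", "C", "O"], [(0, 1), (1, 2)], ["C", "O"])
print(vector)  # layout: [#C, #O, #C-C, #C-O, #O-O]
```

Because the vector length depends only on the alphabet, molecules of any size land in the same vector space and can be fed to a standard classifier.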
|
|
|
Marçal Rusiñol, V. Poulain d'Andecy, Dimosthenis Karatzas and Josep Llados. 2011. Classification of Administrative Document Images by Logo Identification. In proceedings of 9th IAPR Workshop on Graphic Recognition.
Abstract: This paper is focused on the categorization of administrative document images (such as invoices) based on the recognition of the supplier's graphical logo. Two different methods are proposed: the first one uses a bag-of-visual-words model, whereas the second one tries to locate logo images described by the blurred shape model descriptor within documents by a sliding-window technique. Preliminary results are reported with a dataset of real administrative documents.
|
|
|
Anjan Dutta, Josep Llados and Umapada Pal. 2011. Bag-of-GraphPaths Descriptors for Symbol Recognition and Spotting in Line Drawings. In proceedings of 9th IAPR Workshop on Graphic Recognition. Springer Berlin Heidelberg. (LNCS.)
Abstract: Graphical symbol recognition and spotting recently have become an important research activity. In this work we present a descriptor for symbols, especially for line drawings. The descriptor is based on the graph representation of graphical objects. We construct graphs from the vectorized information of the binarized images, where the critical points detected by the vectorization algorithm are considered as nodes and the lines joining them are considered as edges. Graph paths between two nodes in a graph are the finite sequences of nodes following the order from the starting to the final node. The occurrences of different graph paths in a given graph are an important feature, as they capture the geometrical and structural attributes of a graph. So the graph representing a symbol can efficiently be represented by the occurrences of its different paths. Their occurrences in a symbol can be obtained in terms of a histogram counting the number of some fixed prototype paths; we call this histogram the Bag-of-GraphPaths (BOGP). These BOGP histograms are used as a descriptor to measure the distance among the symbols in vector space. We use the descriptor for three applications: (1) classification of graphical symbols, (2) spotting of architectural symbols on floorplans, and (3) classification of historical handwritten words.
|
|