|
Nuria Cirera, Alicia Fornes and Josep Llados. 2015. Hidden Markov model topology optimization for handwriting recognition. 13th International Conference on Document Analysis and Recognition ICDAR2015.626–630.
Abstract: In this paper we present a method to optimize the topology of linear left-to-right hidden Markov models. These models are very popular for sequential signals modeling on tasks such as handwriting recognition. Many topology definition methods select the number of states for a character model based
on character length. This can be a drawback when characters are shorter than the minimum allowed by the model, since they can not be properly trained nor recognized. The proposed method optimizes the number of states per model by automatically including convenient skip-state transitions and therefore it avoids the aforementioned problem.We discuss and compare our method with other character length-based methods such the Fixed, Bakis and Quantile methods. Our proposal performs well on off-line handwriting recognition task.
|
|
|
J. Chazalon, Marçal Rusiñol and Jean-Marc Ogier. 2015. Improving Document Matching Performance by Local Descriptor Filtering. 6th IAPR International Workshop on Camera Based Document Analysis and Recognition CBDAR2015.1216–1220.
Abstract: In this paper we propose an effective method aimed at reducing the amount of local descriptors to be indexed in a document matching framework. In an off-line training stage, the matching between the model document and incoming images is computed retaining the local descriptors from the model that steadily produce good matches. We have evaluated this approach by using the ICDAR2015 SmartDOC dataset containing near 25 000 images from documents to be captured by a mobile device. We have tested the performance of this filtering step by using
ORB and SIFT local detectors and descriptors. The results show an important gain both in quality of the final matching as well as in time and space requirements.
|
|
|
Jean-Christophe Burie and 9 others. 2015. ICDAR2015 Competition on Smartphone Document Capture and OCR (SmartDoc). 13th International Conference on Document Analysis and Recognition ICDAR2015.1161–1165.
Abstract: Smartphones are enabling new ways of capture,
hence arises the need for seamless and reliable acquisition and
digitization of documents, in order to convert them to editable,
searchable and a more human-readable format. Current stateof-the-art
works lack databases and baseline benchmarks for
digitizing mobile captured documents. We have organized a
competition for mobile document capture and OCR in order to
address this issue. The competition is structured into two independent
challenges: smartphone document capture, and smartphone
OCR. This report describes the datasets for both challenges
along with their ground truth, details the performance evaluation
protocols which we used, and presents the final results of the
participating methods. In total, we received 13 submissions: 8
for challenge-I, and 5 for challenge-2.
|
|
|
Marçal Rusiñol, David Aldavert, Ricardo Toledo and Josep Llados. 2015. Towards Query-by-Speech Handwritten Keyword Spotting. 13th International Conference on Document Analysis and Recognition ICDAR2015.501–505.
Abstract: In this paper, we present a new querying paradigm for handwritten keyword spotting. We propose to represent handwritten word images both by visual and audio representations, enabling a query-by-speech keyword spotting system. The two representations are merged together and projected to a common sub-space in the training phase. This transform allows to, given a spoken query, retrieve word instances that were only represented by the visual modality. In addition, the same method can be used backwards at no additional cost to produce a handwritten text-tospeech system. We present our first results on this new querying mechanism using synthetic voices over the George Washington
dataset.
|
|
|
Hongxing Gao, Marçal Rusiñol, Dimosthenis Karatzas, Josep Llados, R.Jain and D.Doermann. 2015. Novel Line Verification for Multiple Instance Focused Retrieval in Document Collections. 13th International Conference on Document Analysis and Recognition ICDAR2015.481–485.
|
|
|
Marçal Rusiñol, J. Chazalon, Jean-Marc Ogier and Josep Llados. 2015. A Comparative Study of Local Detectors and Descriptors for Mobile Document Classification. 13th International Conference on Document Analysis and Recognition ICDAR2015.596–600.
Abstract: In this paper we conduct a comparative study of local key-point detectors and local descriptors for the specific task of mobile document classification. A classification architecture based on direct matching of local descriptors is used as baseline for the comparative study. A set of four different key-point
detectors and four different local descriptors are tested in all the possible combinations. The experiments are conducted in a database consisting of 30 model documents acquired on 6 different backgrounds, totaling more than 36.000 test images.
|
|
|
J. Chazalon, Marçal Rusiñol, Jean-Marc Ogier and Josep Llados. 2015. A Semi-Automatic Groundtruthing Tool for Mobile-Captured Document Segmentation. 13th International Conference on Document Analysis and Recognition ICDAR2015.621–625.
Abstract: This paper presents a novel way to generate groundtruth data for the evaluation of mobile document capture systems, focusing on the first stage of the image processing pipeline involved: document object detection and segmentation in lowquality preview frames. We introduce and describe a simple, robust and fast technique based on color markers which enables a semi-automated annotation of page corners. We also detail a technique for marker removal. Methods and tools presented in the paper were successfully used to annotate, in few hours, 24889
frames in 150 video files for the smartDOC competition at ICDAR 2015
|
|
|
A.Nicolaou, Andrew Bagdanov, Marcus Liwicki and Dimosthenis Karatzas. 2015. Sparse Radial Sampling LBP for Writer Identification. 13th International Conference on Document Analysis and Recognition ICDAR2015.716–720.
Abstract: In this paper we present the use of Sparse Radial Sampling Local Binary Patterns, a variant of Local Binary Patterns (LBP) for text-as-texture classification. By adapting and extending the standard LBP operator to the particularities of text we get a generic text-as-texture classification scheme and apply it to writer identification. In experiments on CVL and ICDAR 2013 datasets, the proposed feature-set demonstrates State-Of-the-Art (SOA) performance. Among the SOA, the proposed method is the only one that is based on dense extraction of a single local feature descriptor. This makes it fast and applicable at the earliest stages in a DIA pipeline without the need for segmentation, binarization, or extraction of multiple features.
|
|
|
Suman Ghosh, Lluis Gomez, Dimosthenis Karatzas and Ernest Valveny. 2015. Efficient indexing for Query By String text retrieval. 6th IAPR International Workshop on Camera Based Document Analysis and Recognition CBDAR2015.1236–1240.
Abstract: This paper deals with Query By String word spotting in scene images. A hierarchical text segmentation algorithm based on text specific selective search is used to find text regions. These regions are indexed per character n-grams present in the text region. An attribute representation based on Pyramidal Histogram of Characters (PHOC) is used to compare text regions with the query text. For generation of the index a similar attribute space based Pyramidal Histogram of character n-grams is used. These attribute models are learned using linear SVMs over the Fisher Vector [1] representation of the images along with the PHOC labels of the corresponding strings.
|
|
|
Suman Ghosh and Ernest Valveny. 2015. Query by String word spotting based on character bi-gram indexing. 13th International Conference on Document Analysis and Recognition ICDAR2015.881–885.
Abstract: In this paper we propose a segmentation-free query by string word spotting method. Both the documents and query strings are encoded using a recently proposed word representa- tion that projects images and strings into a common atribute space based on a pyramidal histogram of characters(PHOC). These attribute models are learned using linear SVMs over the Fisher Vector representation of the images along with the PHOC labels of the corresponding strings. In order to search through the whole page, document regions are indexed per character bi- gram using a similar attribute representation. On top of that, we propose an integral image representation of the document using a simplified version of the attribute model for efficient computation. Finally we introduce a re-ranking step in order to boost retrieval performance. We show state-of-the-art results for segmentation-free query by string word spotting in single-writer and multi-writer standard datasets
|
|