|   | 
Details
   web
Records
Author Dimosthenis Karatzas; Lluis Gomez; Marçal Rusiñol; Anguelos Nicolaou
Title The Robust Reading Competition Annotation and Evaluation Platform Type Conference Article
Year 2018 Publication (down) 13th IAPR International Workshop on Document Analysis Systems Abbreviated Journal
Volume Issue Pages 61-66
Keywords
Abstract The ICDAR Robust Reading Competition (RRC), initiated in 2003 and reestablished in 2011, has become the defacto evaluation standard for the international community. Concurrent with its second incarnation in 2011, a continuous
effort started to develop an online framework to facilitate the hosting and management of competitions. This short paper briefly outlines the Robust Reading Competition Annotation and Evaluation Platform, the backbone of the
Robust Reading Competition, comprising a collection of tools and processes that aim to simplify the management and annotation of data, and to provide online and offline performance evaluation and analysis services.
Address Viena; Austria; April 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference DAS
Notes DAG; 600.084; 600.121 Approved no
Call Number KGR2018 Serial 3103
Permanent link to this record
 

 
Author David Aldavert; Marçal Rusiñol
Title Manuscript text line detection and segmentation using second-order derivatives analysis Type Conference Article
Year 2018 Publication (down) 13th IAPR International Workshop on Document Analysis Systems Abbreviated Journal
Volume Issue Pages 293 - 298
Keywords text line detection; text line segmentation; text region detection; second-order derivatives
Abstract In this paper, we explore the use of second-order derivatives to detect text lines on handwritten document images. Taking advantage that the second derivative gives a minimum response when a dark linear element over a
bright background has the same orientation as the filter, we use this operator to create a map with the local orientation and strength of putative text lines in the document. Then, we detect line segments by selecting and merging the filter responses that have a similar orientation and scale. Finally, text lines are found by merging the segments that are within the same text region. The proposed segmentation algorithm, is learning-free while showing a performance similar to the state of the art methods in publicly available datasets.
Address Viena; Austria; April 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference DAS
Notes DAG; 600.084; 600.129; 302.065; 600.121 Approved no
Call Number Admin @ si @ AlR2018a Serial 3104
Permanent link to this record
 

 
Author David Aldavert; Marçal Rusiñol
Title Synthetically generated semantic codebook for Bag-of-Visual-Words based word spotting Type Conference Article
Year 2018 Publication (down) 13th IAPR International Workshop on Document Analysis Systems Abbreviated Journal
Volume Issue Pages 223 - 228
Keywords Word Spotting; Bag of Visual Words; Synthetic Codebook; Semantic Information
Abstract Word-spotting methods based on the Bag-ofVisual-Words framework have demonstrated a good retrieval performance even when used in a completely unsupervised manner. Although unsupervised approaches are suitable for
large document collections due to the cost of acquiring labeled data, these methods also present some drawbacks. For instance, having to train a suitable “codebook” for a certain dataset has a high computational cost. Therefore, in
this paper we present a database agnostic codebook which is trained from synthetic data. The aim of the proposed approach is to generate a codebook where the only information required is the type of script used in the document. The use of synthetic data also allows to easily incorporate semantic
information in the codebook generation. So, the proposed method is able to determine which set of codewords have a semantic representation of the descriptor feature space. Experimental results show that the resulting codebook attains a state-of-the-art performance while having a more compact representation.
Address Viena; Austria; April 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference DAS
Notes DAG; 600.084; 600.129; 600.121 Approved no
Call Number Admin @ si @ AlR2018b Serial 3105
Permanent link to this record
 

 
Author V. Poulain d'Andecy; Emmanuel Hartmann; Marçal Rusiñol
Title Field Extraction by hybrid incremental and a-priori structural templates Type Conference Article
Year 2018 Publication (down) 13th IAPR International Workshop on Document Analysis Systems Abbreviated Journal
Volume Issue Pages 251 - 256
Keywords Layout Analysis; information extraction; incremental learning
Abstract In this paper, we present an incremental framework for extracting information fields from administrative documents. First, we demonstrate some limits of the existing state-of-the-art methods such as the delay of the system efficiency. This is a concern in industrial context when we have only few samples of each document class. Based on this analysis, we propose a hybrid system combining incremental learning by means of itf-df statistics and a-priori generic
models. We report in the experimental section our results obtained with a dataset of real invoices.
Address Viena; Austria; April 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference DAS
Notes DAG; 600.084; 600.129; 600.121 Approved no
Call Number Admin @ si @ PHR2018 Serial 3106
Permanent link to this record
 

 
Author Manuel Carbonell; Mauricio Villegas; Alicia Fornes; Josep Llados
Title Joint Recognition of Handwritten Text and Named Entities with a Neural End-to-end Model Type Conference Article
Year 2018 Publication (down) 13th IAPR International Workshop on Document Analysis Systems Abbreviated Journal
Volume Issue Pages 399-404
Keywords Named entity recognition; Handwritten Text Recognition; neural networks
Abstract When extracting information from handwritten documents, text transcription and named entity recognition are usually faced as separate subsequent tasks. This has the disadvantage that errors in the first module affect heavily the
performance of the second module. In this work we propose to do both tasks jointly, using a single neural network with a common architecture used for plain text recognition. Experimentally, the work has been tested on a collection of historical marriage records. Results of experiments are presented to show the effect on the performance for different
configurations: different ways of encoding the information, doing or not transfer learning and processing at text line or multi-line region level. The results are comparable to state of the art reported in the ICDAR 2017 Information Extraction competition, even though the proposed technique does not use any dictionaries, language modeling or post processing.
Address Vienna; Austria; April 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference DAS
Notes DAG; 600.097; 603.057; 601.311; 600.121 Approved no
Call Number Admin @ si @ CVF2018 Serial 3170
Permanent link to this record