|   | 
Details
   web
Record
Author (up) Veronica Romero; Alicia Fornes; Enrique Vidal; Joan Andreu Sanchez
Title Information Extraction in Handwritten Marriage Licenses Books Using the MGGI Methodology Type Conference Article
Year 2017 Publication 8th Iberian Conference on Pattern Recognition and Image Analysis Abbreviated Journal
Volume 10255 Issue Pages 287-294
Keywords Handwritten Text Recognition; Information extraction; Language modeling; MGGI; Categories-based language model
Abstract Historical records of daily activities provide intriguing insights into the life of our ancestors, useful for demographic and genealogical research. For example, marriage license books have been used for centuries by ecclesiastical and secular institutions to register marriages. These books follow a simple structure of the text in the records with a evolutionary vocabulary, mainly composed of proper names that change along the time. This distinct vocabulary makes automatic transcription and semantic information extraction difficult tasks. In previous works we studied the use of category-based language models and how a Grammatical Inference technique known as MGGI could improve the accuracy of these tasks. In this work we analyze the main causes of the semantic errors observed in previous results and apply a better implementation of the MGGI technique to solve these problems. Using the resulting language model, transcription and information extraction experiments have been carried out, and the results support our proposed approach.
Address Faro; Portugal; June 2017
Corporate Author Thesis
Publisher Place of Publication Editor L.A. Alexandre; J.Salvador Sanchez; Joao M. F. Rodriguez
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN 978-3-319-58837-7 Medium
Area Expedition Conference IbPRIA
Notes DAG; 602.006; 600.097; 600.121 Approved no
Call Number Admin @ si @ RFV2017 Serial 2952
Permanent link to this record