TY  - CONF
AU  - Veronica Romero
AU  - Alicia Fornes
AU  - Enrique Vidal
AU  - Joan Andreu Sanchez
A2  - ICFHR
PY  - 2016//
TI  - Using the MGGI Methodology for Category-based Language Modeling in Handwritten Marriage Licenses Books
BT  - 15th International Conference on Frontiers in Handwriting Recognition
N2  - Handwritten marriage licenses books have been used for centuries by ecclesiastical and secular institutions to register marriages. The information contained in these historical documents is useful for demography studies and genealogical research, among others. Despite the generally simple structure of the text in these documents, automatic transcription and semantic information extraction are difficult due to the distinct and evolving vocabulary, which is composed mainly of proper names that change over time. In previous works we studied the use of category-based language models both to improve automatic transcription accuracy and to ease the extraction of semantic information. Here we analyze the main causes of the semantic errors observed in previous results and apply a Grammatical Inference technique known as MGGI to improve the semantic accuracy of the language model obtained. Using this language model, full handwritten text recognition experiments have been carried out, with results supporting the interest of the proposed approach.
L1  - http://refbase.cvc.uab.es/files/RFV2016.pdf
N1  - DAG; 600.097; 602.006
ID  - Veronica Romero2016
ER  -