TY  - CONF
AU  - Andreas Fischer
AU  - Volkmar Frinken
AU  - Alicia Fornes
AU  - Horst Bunke
A2  - HIP
PY  - 2011//
TI  - Transcription Alignment of Latin Manuscripts Using Hidden Markov Models
BT  - Proceedings of the 2011 Workshop on Historical Document Imaging and Processing
SP  - 29
EP  - 36
PB  - ACM
N2  - Transcriptions of historical documents are a valuable source for extracting labeled handwriting images that can be used for training recognition systems. In this paper, we introduce the Saint Gall database that includes images as well as the transcription of a Latin manuscript from the 9th century written in Carolingian script. Although the available transcription is of high quality for a human reader, the spelling of the words is not accurate when compared with the handwriting image. Hence, the transcription poses several challenges for alignment regarding, e.g., line breaks, abbreviations, and capitalization. We propose an alignment system based on character Hidden Markov Models that can cope with these challenges and efficiently aligns complete document pages. On the Saint Gall database, we demonstrate that a considerable alignment accuracy can be achieved, even with weakly trained character models.
UR  - http://dx.doi.org/10.1145/2037342.2037348
N1  - DAG ID - Andreas Fischer2011
ER  -