PT Journal
AU Josep Llados
   Marçal Rusiñol
   Alicia Fornes
   David Fernandez
   Anjan Dutta
TI On the Influence of Word Representations for Handwritten Word Spotting in Historical Documents
SO International Journal of Pattern Recognition and Artificial Intelligence
JI IJPRAI
PY 2012
BP 1263002-126027
VL 26
IS 5
DI 10.1142/S0218001412630025
DE Handwriting recognition; word spotting; historical documents; feature representation; shape descriptors
AB Word spotting is the process of retrieving all instances of a queried keyword from a digital library of document images. In this paper we evaluate the performance of different word descriptors to assess the advantages and disadvantages of statistical and structural models in a framework of query-by-example word spotting in historical documents. We compare four word representation models: sequence alignment using dynamic time warping (DTW) as a baseline reference, a bag-of-visual-words approach as the statistical model, a pseudo-structural model based on a Loci features representation, and a structural approach in which words are represented by graphs. The four approaches have been tested on two collections of historical data: the George Washington database and the marriage records from the Barcelona Cathedral. We experimentally demonstrate that statistical representations generally perform better. However, large descriptors are difficult to implement in a retrieval scenario where word spotting requires indexing millions of word images.
ER