TY  - JOUR
AU  - Rodriguez, Jose Antonio
AU  - Perronnin, Florent
PY  - 2009//
TI  - Handwritten word-spotting using hidden Markov models and universal vocabularies
T2  - PR
JO  - Pattern Recognition
SP  - 2103
EP  - 2116
VL  - 42
IS  - 9
PB  - Elsevier
KW  - Word-spotting
KW  - Hidden Markov model
KW  - Score normalization
KW  - Universal vocabulary
KW  - Handwriting recognition
N2  - Handwritten word-spotting is traditionally viewed as an image matching task between one or multiple query word-images and a set of candidate word-images in a database. This is a typical instance of the query-by-example paradigm. In this article, we introduce a statistical framework for the word-spotting problem which employs hidden Markov models (HMMs) to model keywords and a Gaussian mixture model (GMM) for score normalization. We explore the use of two types of HMMs for the word modeling part: continuous HMMs (C-HMMs) and semi-continuous HMMs (SC-HMMs), i.e. HMMs with a shared set of Gaussians. We show on a challenging multi-writer corpus that the proposed statistical framework is always superior to a traditional matching system which uses dynamic time warping (DTW) for word-image distance computation. A very important finding is that the SC-HMM is superior when labeled training data is scarce—as low as one sample per keyword—thanks to the prior information which can be incorporated in the shared set of Gaussians.
SN  - 0031-3203
UR  - http://dx.doi.org/10.1016/j.patcog.2009.02.005
N1  - exported from refbase (http://refbase.cvc.uab.es/show.php?record=1053), last updated on Thu, 19 Dec 2013 12:26:24 +0100
ID  - Jose Antonio Rodriguez2009
ER  -