PT Journal
AU Marçal Rusiñol
   Volkmar Frinken
   Dimosthenis Karatzas
   Andrew Bagdanov
   Josep Llados
TI Multimodal page classification in administrative document image streams
SO International Journal on Document Analysis and Recognition
JI IJDAR
PY 2014
BP 331
EP 341
VL 17
IS 4
DI 10.1007/s10032-014-0225-8
DE Digital mail room; Multimodal page classification; Visual and textual document description
AB In this paper, we present a page classification application in a banking workflow. The proposed architecture represents administrative document images by merging visual and textual descriptions. The visual description is based on a hierarchical representation of the pixel intensity distribution. The textual description uses latent semantic analysis to represent document content as a mixture of topics. Several off-the-shelf classifiers and different strategies for combining visual and textual cues have been evaluated. A final step uses an n-gram model of the page stream allowing a finer-grained classification of pages. The proposed method has been tested in a real large-scale environment and we report results on a dataset of 70,000 pages.
ER