%0 Journal Article
%T Large-scale document image retrieval and classification with runlength histograms and binary embeddings
%A Albert Gordo
%A Florent Perronnin
%A Ernest Valveny
%J Pattern Recognition
%D 2013
%V 46
%N 7
%I Elsevier
%@ 0031-3203
%F Albert Gordo2013
%O DAG; 600.042; 600.045; 605.203
%O exported from refbase (http://refbase.cvc.uab.es/show.php?record=2306), last updated on Thu, 21 Jan 2016 10:37:56 +0100
%X We present a new document image descriptor based on multi-scale runlength histograms. This descriptor does not rely on layout analysis and can be computed efficiently. We show how this descriptor can achieve state-of-the-art results on two very different public datasets in classification and retrieval tasks. Moreover, we show how we can compress and binarize these descriptors to make them suitable for large-scale applications. We can achieve state-of-the-art results in classification using binary descriptors of as few as 16 to 64 bits.
%K visual document descriptor
%K compression
%K large-scale
%K retrieval
%K classification
%U http://dx.doi.org/10.1016/j.patcog.2012.12.004
%U http://refbase.cvc.uab.es/files/GPV2013.pdf
%P 1898-1905