TY  - JOUR
AU  - Albert Gordo
AU  - Florent Perronnin
AU  - Ernest Valveny
PY  - 2013//
TI  - Large-scale document image retrieval and classification with runlength histograms and binary embeddings
T2  - PR
JO  - Pattern Recognition
SP  - 1898
EP  - 1905
VL  - 46
IS  - 7
PB  - Elsevier
KW  - visual document descriptor
KW  - compression
KW  - large-scale
KW  - retrieval
KW  - classification
N2  - We present a new document image descriptor based on multi-scale runlength histograms. This descriptor does not rely on layout analysis and can be computed efficiently. We show how this descriptor can achieve state-of-the-art results on two very different public datasets in classification and retrieval tasks. Moreover, we show how we can compress and binarize these descriptors to make them suitable for large-scale applications. We can achieve state-of-the-art results in classification using binary descriptors of as few as 16 to 64 bits.
SN  - 0031-3203
UR  - http://dx.doi.org/10.1016/j.patcog.2012.12.004
L1  - http://refbase.cvc.uab.es/files/GPV2013.pdf
N1  - DAG; 600.042; 600.045; 605.203
ID  - Albert Gordo2013
ER  -