TY  - CONF
AU  - Albert Gordo
AU  - Ernest Valveny
A2  - IbPRIA
PY  - 2009//
TI  - The diagonal split: A pre-segmentation step for page layout analysis & classification
T2  - LNCS
BT  - 4th Iberian Conference on Pattern Recognition and Image Analysis
SP  - 290–297
VL  - 5524
PB  - Springer Berlin Heidelberg
N2  - Document classification is an important task in all the processes related to document storage and retrieval. In the case of complex documents, structural features are needed to achieve a correct classification. Unfortunately, physical layout analysis is error-prone. In this paper we present a pre-segmentation step based on a divide-and-conquer strategy that can be used to improve the page segmentation results, independently of the segmentation algorithm used. This pre-segmentation step is evaluated in classification and retrieval using the selective CRLA algorithm for layout segmentation together with a clustering based on the Voronoi area diagram, and tested on two different databases, MARG and Girona Archives.
SN  - 0302-9743
SN  - 978-3-642-02171-8
UR  - http://dx.doi.org/10.1007/978-3-642-02172-5_38
N1  - DAG
ID  - Albert Gordo2009
ER  -