PT Journal AU Partha Pratim Roy; Umapada Pal; Josep Llados TI Text line extraction in graphical documents using background and foreground SO International Journal on Document Analysis and Recognition JI IJDAR PY 2012 BP 227 EP 241 VL 15 IS 3 DI 10.1007/s10032-011-0167-3 AB 0,405 JCR. In graphical documents (e.g., maps, engineering drawings), artistic documents, etc., text lines are annotated in multiple orientations or in a curvilinear way to illustrate different locations or symbols. For optical character recognition of such documents, the individual text lines need to be extracted. In this paper, we propose a novel method to segment such text lines, based on the foreground and background information of the text components. To utilize the background information effectively, a water reservoir concept is used. In the proposed scheme, individual components are first detected and grouped into character clusters in a hierarchical way using size and positional information. Next, the clusters are extended on their two extreme sides to determine potential candidate regions. Finally, with the help of these candidate regions, individual lines are extracted. Experimental results are presented on different datasets of graphical documents, camera-based warped documents, noisy images containing seals, etc. The results demonstrate that our approach is robust and invariant to the size and orientation of the text lines present in the document. ER