|
Ernest Valveny, Salvatore Tabbone and Oriol Ramos Terrades. 2008. Performance Characterization of Shape Descriptors for Symbol Representation. In W. Liu, J.L., J.M. Ogier, ed. Graphics Recognition: Recent Advances and New Opportunities.278–287. (LNCS.)
|
|
|
Stepan Simsa and 10 others. 2023. Overview of DocILE 2023: Document Information Localization and Extraction. International Conference of the Cross-Language Evaluation Forum for European Languages.276–293. (LNCS.)
Abstract: This paper provides an overview of the DocILE 2023 Competition, its tasks, participant submissions, the competition results and possible future research directions. This first edition of the competition focused on two Information Extraction tasks, Key Information Localization and Extraction (KILE) and Line Item Recognition (LIR). Both of these tasks require detection of pre-defined categories of information in business documents. The second task additionally requires correctly grouping the information into tuples, capturing the structure laid out in the document. The competition used the recently published DocILE dataset and benchmark that stays open to new submissions. The diversity of the participant solutions indicates the potential of the dataset as the submissions included pure Computer Vision, pure Natural Language Processing, as well as multi-modal solutions and utilized all of the parts of the dataset, including the annotated, synthetic and unlabeled subsets.
Keywords: Information Extraction; Computer Vision; Natural Language Processing; Optical Character Recognition; Document Understanding
|
|
|
Sanket Biswas, Pau Riba, Josep Llados and Umapada Pal. 2021. Beyond Document Object Detection: Instance-Level Segmentation of Complex Layouts. IJDAR, 24, 269–281.
Abstract: Information extraction is a fundamental task of many business intelligence services that entail massive document processing. Understanding a document page structure in terms of its layout provides contextual support which is helpful in the semantic interpretation of the document terms. In this paper, inspired by the progress of deep learning methodologies applied to the task of object recognition, we transfer these models to the specific case of document object detection, reformulating the traditional problem of document layout analysis. Moreover, we importantly contribute to prior arts by defining the task of instance segmentation on the document image domain. An instance segmentation paradigm is especially important in complex layouts whose contents should interact for the proper rendering of the page, i.e., the proper text wrapping around an image. Finally, we provide an extensive evaluation, both qualitative and quantitative, that demonstrates the superior performance of the proposed methodology over the current state of the art.
|
|
|
Christophe Rigaud, Dimosthenis Karatzas, Jean-Christophe Burie and Jean-Marc Ogier. 2014. Color descriptor for content-based drawing retrieval. 11th IAPR International Workshop on Document Analysis and Systems.267–271.
Abstract: Human detection in computer vision field is an active field of research. Extending this to human-like drawings such as the main characters in comic book stories is not trivial. Comics analysis is a very recent field of research at the intersection of graphics, texts, objects and people recognition. The detection of the main comic characters is an essential step towards a fully automatic comic book understanding. This paper presents a color-based approach for comics character retrieval using content-based drawing retrieval and color palette.
|
|
|
Thanh Ha Do, Salvatore Tabbone and Oriol Ramos Terrades. 2013. New Approach for Symbol Recognition Combining Shape Context of Interest Points with Sparse Representation. 12th International Conference on Document Analysis and Recognition.265–269.
Abstract: In this paper, we propose a new approach for symbol description. Our method is built based on the combination of shape context of interest points descriptor and sparse representation. More specifically, we first learn a dictionary describing shape context of interest point descriptors. Then, based on information retrieval techniques, we build a vector model for each symbol based on its sparse representation in a visual vocabulary whose visual words are columns in the learneddictionary. The retrieval task is performed by ranking symbols based on similarity between vector models. Evaluation of our method, using benchmark datasets, demonstrates the validity of our approach and shows that it outperforms related state-of-theart methods.
|
|
|
Mathieu Nicolas Delalandre, Jean-Yves Ramel, Ernest Valveny and Muhammad Muzzamil Luqman. 2010. A Performance Characterization Algorithm for Symbol Localization. Graphics Recognition. Achievements, Challenges, and Evolution. 8th International Workshop, GREC 2009. Selected Papers. Springer Berlin Heidelberg, 260–271. (LNCS.)
Abstract: In this paper we present an algorithm for performance characterization of symbol localization systems. This algorithm is aimed to be a more “reliable” and “open” solution to characterize the performance. To achieve that, it exploits only single points as the result of localization and offers the possibility to reconsider the localization results provided by a system. We use the information about context in groundtruth, and overall localization results, to detect the ambiguous localization results. A probability score is computed for each matching between a localization point and a groundtruth region, depending on the spatial distribution of the other regions in the groundtruth. Final characterization is given with detection rate/probability score plots, describing the sets of possible interpretations of the localization results, according to a given confidence rate. We present experimentation details along with the results for the symbol localization system of [1], exploiting a synthetic dataset of architectural floorplans and electrical diagrams (composed of 200 images and 3861 symbols).
|
|
|
Mohammed Al Rawi, Ernest Valveny and Dimosthenis Karatzas. 2019. Can One Deep Learning Model Learn Script-Independent Multilingual Word-Spotting? 15th International Conference on Document Analysis and Recognition.260–267.
Abstract: Word spotting has gained increased attention lately as it can be used to extract textual information from handwritten documents and scene-text images. Current word spotting approaches are designed to work on a single language and/or script. Building intelligent models that learn script-independent multilingual word-spotting is challenging due to the large variability of multilingual alphabets and symbols. We used ResNet-152 and the Pyramidal Histogram of Characters (PHOC) embedding to build a one-model script-independent multilingual word-spotting and we tested it on Latin, Arabic, and Bangla (Indian) languages. The one-model we propose performs on par with the multi-model language-specific word-spotting system, and thus, reduces the number of models needed for each script and/or language.
|
|
|
David Fernandez, Jon Almazan, Nuria Cirera, Alicia Fornes and Josep Llados. 2014. BH2M: the Barcelona Historical Handwritten Marriages database. 22nd International Conference on Pattern Recognition.256–261.
Abstract: This paper presents an image database of historical handwritten marriages records stored in the archives of Barcelona cathedral, and the corresponding meta-data addressed to evaluate the performance of document analysis algorithms. The contribution of this paper is twofold. First, it presents a complete ground truth which covers the whole pipeline of handwriting
recognition research, from layout analysis to recognition and understanding. Second, it is the first dataset in the emerging area of genealogical document analysis, where documents are manuscripts pseudo-structured with specific lexicons and the interest is beyond pure transcriptions but context dependent.
|
|
|
V. Poulain d'Andecy, Emmanuel Hartmann and Marçal Rusiñol. 2018. Field Extraction by hybrid incremental and a-priori structural templates. 13th IAPR International Workshop on Document Analysis Systems.251–256.
Abstract: In this paper, we present an incremental framework for extracting information fields from administrative documents. First, we demonstrate some limits of the existing state-of-the-art methods such as the delay of the system efficiency. This is a concern in industrial context when we have only few samples of each document class. Based on this analysis, we propose a hybrid system combining incremental learning by means of itf-df statistics and a-priori generic
models. We report in the experimental section our results obtained with a dataset of real invoices.
Keywords: Layout Analysis; information extraction; incremental learning
|
|
|
Salvatore Tabbone and Josep Llados. 2007. A Propos de la Reconnaissance de Documents Graphiques: Synthese et Perspectives. Traitement et Analyse de l’Information: Methodes et Applications.247–258.
|
|