|
N. Nayef and 14 others. 2017. ICDAR2017 Robust Reading Challenge on Multi-Lingual Scene Text Detection and Script Identification – RRC-MLT. 14th International Conference on Document Analysis and Recognition. 1454–1459.
Abstract: Text detection and recognition in a natural environment are key components of many applications, ranging from business card digitization to shop indexation in a street. This competition aims at assessing the ability of state-of-the-art methods to detect Multi-Lingual Text (MLT) in scene images, such as in contents gathered from the Internet media and in modern cities where multiple cultures live and communicate together. This competition is an extension of the Robust Reading Competition (RRC) which has been held since 2003 both in ICDAR and in an online context. The proposed competition is presented as a new challenge of the RRC. The dataset built for this challenge largely extends the previous RRC editions in many aspects: the multi-lingual text, the size of the dataset, the multi-oriented text, the wide variety of scenes. The dataset is comprised of 18,000 images which contain text belonging to 9 languages. The challenge is comprised of three tasks related to text detection and script classification. We have received a total of 16 participations from the research and industrial communities. This paper presents the dataset, the tasks and the findings of this RRC-MLT challenge.
|
|
|
Albert Berenguel, Oriol Ramos Terrades, Josep Llados and Cristina Cañero. 2017. e-Counterfeit: a mobile-server platform for document counterfeit detection. 14th IAPR International Conference on Document Analysis and Recognition.
Abstract: This paper presents a novel application to detect counterfeit identity documents forged by a scan-printing operation. Texture analysis approaches are proposed to extract validation features from the security background that is usually printed in documents such as IDs or banknotes. The main contribution of this work is the end-to-end mobile-server architecture, which provides a service for non-expert users and can therefore be used in several scenarios. The system also provides a crowdsourcing mode so that labeled images can be gathered, generating databases for incremental training of the algorithms.
|
|
|
Alicia Fornes and 6 others. 2017. ICDAR2017 Competition on Information Extraction in Historical Handwritten Records. 14th International Conference on Document Analysis and Recognition. 1389–1394.
Abstract: The extraction of relevant information from historical handwritten document collections is one of the key steps in making these manuscripts available for access and search. In this competition, the goal is to detect the named entities and assign each of them a semantic category, and thereby to simulate the filling in of a knowledge database. This paper describes the dataset, the tasks, the evaluation metrics, the participants' methods and the results.
|
|
|
Dimosthenis Karatzas, Lluis Gomez and Marçal Rusiñol. 2017. The Robust Reading Competition Annotation and Evaluation Platform. 1st International Workshop on Open Services and Tools for Document Analysis.
Abstract: The ICDAR Robust Reading Competition (RRC), initiated in 2003 and re-established in 2011, has become the de facto evaluation standard for the international community. Concurrent with its second incarnation in 2011, a continuous effort started to develop an online framework to facilitate the hosting and management of competitions. This short paper briefly outlines the Robust Reading Competition Annotation and Evaluation Platform, the backbone of the Robust Reading Competition, comprising a collection of tools and processes that aim to simplify the management and annotation of data, and to provide online and offline performance evaluation and analysis services.
|
|
|
Raul Gomez and 7 others. 2017. ICDAR2017 Robust Reading Challenge on COCO-Text. 14th International Conference on Document Analysis and Recognition.
|
|
|
Jean-Marc Ogier, Wenyin Liu and Josep Llados, eds. 2010. Graphics Recognition: Achievements, Challenges, and Evolution. Springer Link. (LNCS.)
|
|
|
Alicia Fornes, Josep Llados, Gemma Sanchez and Horst Bunke. 2009. Symbol-independent writer identification in old handwritten music scores. In proceedings of 8th IAPR International Workshop on Graphics Recognition. Springer Berlin Heidelberg, 186–197.
|
|
|
Salim Jouili, Salvatore Tabbone and Ernest Valveny. 2009. Evaluation of graph matching measures for documents retrieval. In proceedings of 8th IAPR International Workshop on Graphics Recognition. 13–21.
Abstract: In this paper we evaluate four graph distance measures. The analysis is performed for document retrieval tasks. To this end, different kinds of documents are used, including line drawings (symbols), ancient documents (ornamental letters), shapes and trademark logos. The experimental results show that the performance of each graph distance measure depends on the kind of data and the graph representation technique.
Keywords: Graph Matching; Graph retrieval; structural representation; Performance Evaluation
|
|
|
Adria Molina, Lluis Gomez, Oriol Ramos Terrades and Josep Llados. 2022. A Generic Image Retrieval Method for Date Estimation of Historical Document Collections. Document Analysis Systems. 15th IAPR International Workshop (DAS2022). 583–597.
Abstract: Date estimation of historical document images is a challenging problem, with several contributions in the literature that lack the ability to generalize from one dataset to others. This paper presents a robust date estimation system based on a retrieval approach that generalizes well across heterogeneous collections. We use a ranking loss function named smooth-nDCG to train a Convolutional Neural Network that learns an ordering of documents for each problem. One of the main uses of the presented approach is as a tool for historical contextual retrieval: scholars could perform comparative analysis of historical images from big datasets in terms of the period in which they were produced. We provide experimental evaluation on different types of documents from real datasets of manuscript and newspaper images.
Keywords: Date estimation; Document retrieval; Image retrieval; Ranking loss; Smooth-nDCG
|
|
|
Josep Brugues Pujolras, Lluis Gomez and Dimosthenis Karatzas. 2022. A Multilingual Approach to Scene Text Visual Question Answering. Document Analysis Systems. 15th IAPR International Workshop (DAS2022). 65–79.
Abstract: Scene Text Visual Question Answering (ST-VQA) has recently emerged as a hot research topic in Computer Vision. Current ST-VQA models have great potential for many types of applications but do not perform well on more than one language at a time, owing to the lack of multilingual data and the use of monolingual word embeddings for training. In this work, we explore the possibility of obtaining bilingual and multilingual VQA models. To that end, we use an already established VQA model that uses monolingual word embeddings as part of its pipeline and replace them with FastText and BPEmb multilingual word embeddings that have been aligned to English. Our experiments demonstrate that it is possible to obtain bilingual and multilingual VQA models with a minimal loss in performance in languages not used during training, as well as a multilingual model trained in multiple languages that matches the performance of the respective monolingual baselines.
Keywords: Scene text; Visual question answering; Multilingual word embeddings; Vision and language; Deep learning
|
|