|
Chee-Kheng Chng and 13 others. 2019. ICDAR2019 Robust Reading Challenge on Arbitrary-Shaped Text – RRC-ArT. 15th International Conference on Document Analysis and Recognition. 1571–1576.
Abstract: This paper reports the ICDAR2019 Robust Reading Challenge on Arbitrary-Shaped Text – RRC-ArT that consists of three major challenges: i) scene text detection, ii) scene text recognition, and iii) scene text spotting. A total of 78 submissions from 46 unique teams/individuals were received for this competition. The top performing score of each challenge is as follows: i) T1 – 82.65%, ii) T2.1 – 74.3%, iii) T2.2 – 85.32%, iv) T3.1 – 53.86%, and v) T3.2 – 54.91%. Apart from the results, this paper also details the ArT dataset, tasks description, evaluation metrics and participants' methods. The dataset, the evaluation kit as well as the results are publicly available at the challenge website.
|
|
|
Manuel Carbonell, Joan Mas, Mauricio Villegas, Alicia Fornes and Josep Llados. 2019. End-to-End Handwritten Text Detection and Transcription in Full Pages. 2nd International Workshop on Machine Learning. 29–34.
Abstract: When transcribing handwritten document images, inaccuracies in the text segmentation step often cause errors in the subsequent transcription step. For this reason, some recent methods propose to perform the recognition at paragraph level. But still, errors in the segmentation of paragraphs can affect the transcription performance. In this work, we propose an end-to-end framework to transcribe full pages. The joint text detection and transcription removes the layout analysis requirement at test time. The experimental results show that our approach can achieve comparable results to models that assume segmented paragraphs, and suggest that joining the two tasks brings an improvement over doing the two tasks separately.
Keywords: Handwritten Text Recognition; Layout Analysis; Text segmentation; Deep Neural Networks; Multi-task learning
|
|
|
Mickael Coustaty and Alicia Fornes. 2023. Document Analysis and Recognition – ICDAR 2023 Workshops. (LNCS.)
|
|
|
S. Chanda, Umapada Pal and Oriol Ramos Terrades. 2009. Word-Wise Thai and Roman Script Identification.
Abstract: In some Thai documents, a single text line of a printed document page may contain words of both Thai and Roman scripts. For the Optical Character Recognition (OCR) of such a document page it is better to identify, at first, Thai and Roman script portions and then to use individual OCR systems of the respective scripts on these identified portions. In this article, an SVM-based method is proposed for identification of word-wise printed Roman and Thai scripts from a single line of a document page. Here, at first, the document is segmented into lines and then lines are segmented into character groups (words). In the proposed scheme, we identify the script of a character group combining different character features obtained from structural shape, profile behavior, component overlapping information, topological properties, and water reservoir concept, etc. Based on the experiment on 10,000 data (words) we obtained 99.62% script identification accuracy from the proposed scheme.
|
|
|
T. Chauhan, E. Perales, Kaida Xiao, E. Hird, Dimosthenis Karatzas and Sophie Wuerger. 2014. The achromatic locus: Effect of navigation direction in color space. VSS, 14 (1)(25), 1–11.
5Y Impact Factor: 2.99 / 1st (Ophthalmology)
Abstract: An achromatic stimulus is defined as a patch of light that is devoid of any hue. This is usually achieved by asking observers to adjust the stimulus such that it looks neither red nor green and at the same time neither yellow nor blue. Despite the theoretical and practical importance of the achromatic locus, little is known about the variability in these settings. The main purpose of the current study was to evaluate whether achromatic settings were dependent on the task of the observers, namely the navigation direction in color space. Observers could either adjust the test patch along the two chromatic axes in the CIE u*v* diagram or, alternatively, navigate along the unique-hue lines. Our main result is that the navigation method affects the reliability of these achromatic settings. Observers are able to make more reliable achromatic settings when adjusting the test patch along the directions defined by the four unique hues as opposed to navigating along the main axes in the commonly used CIE u*v* chromaticity plane. This result holds across different ambient viewing conditions (Dark, Daylight, Cool White Fluorescent) and different test luminance levels (5, 20, and 50 cd/m²). The reduced variability in the achromatic settings is consistent with the idea that internal color representations are more aligned with the unique-hue lines than the u* and v* axes.
Keywords: achromatic; unique hues; color constancy; luminance; color space
|
|
|
Jialuo Chen, Pau Riba, Alicia Fornes, Juan Mas, Josep Llados and Joana Maria Pujadas-Mora. 2018. Word-Hunter: A Gamesourcing Experience to Validate the Transcription of Historical Manuscripts. 16th International Conference on Frontiers in Handwriting Recognition. 528–533.
Abstract: Nowadays, there are still many handwritten historical documents in archives waiting to be transcribed and indexed. Since manual transcription is tedious and time consuming, automatic transcription seems the path to follow. However, the performance of current handwriting recognition techniques is not perfect, so a manual validation is mandatory. Crowdsourcing is a good strategy for manual validation; however, it is a tedious task. In this paper we analyze experiences based on gamification in order to propose and design a gamesourcing framework that increases the interest of users. Then, we describe and analyze our experience when validating the automatic transcription using the gamesourcing application. Moreover, thanks to the combination of clustering and handwriting recognition techniques, we can speed up the validation while maintaining the performance.
Keywords: Crowdsourcing; Gamification; Handwritten documents; Performance evaluation
|
|
|
J. Chazalon, Marçal Rusiñol and Jean-Marc Ogier. 2015. Improving Document Matching Performance by Local Descriptor Filtering. 6th IAPR International Workshop on Camera Based Document Analysis and Recognition CBDAR2015. 1216–1220.
Abstract: In this paper we propose an effective method aimed at reducing the number of local descriptors to be indexed in a document matching framework. In an off-line training stage, the matching between the model document and incoming images is computed, retaining the local descriptors from the model that steadily produce good matches. We have evaluated this approach by using the ICDAR2015 SmartDOC dataset containing nearly 25 000 images from documents to be captured by a mobile device. We have tested the performance of this filtering step by using ORB and SIFT local detectors and descriptors. The results show an important gain both in quality of the final matching as well as in time and space requirements.
|
|
|
J. Chazalon, Marçal Rusiñol, Jean-Marc Ogier and Josep Llados. 2015. A Semi-Automatic Groundtruthing Tool for Mobile-Captured Document Segmentation. 13th International Conference on Document Analysis and Recognition ICDAR2015. 621–625.
Abstract: This paper presents a novel way to generate groundtruth data for the evaluation of mobile document capture systems, focusing on the first stage of the image processing pipeline involved: document object detection and segmentation in low-quality preview frames. We introduce and describe a simple, robust and fast technique based on color markers which enables a semi-automated annotation of page corners. We also detail a technique for marker removal. Methods and tools presented in the paper were successfully used to annotate, in a few hours, 24889 frames in 150 video files for the smartDOC competition at ICDAR 2015.
|
|
|
Francisco Cruz and Oriol Ramos Terrades. 2012. Document segmentation using relative location features. 21st International Conference on Pattern Recognition. 1562–1565.
Abstract: In this paper we evaluate the use of Relative Location Features (RLF) on a historical document segmentation task, and compare the quality of the results obtained on structured and unstructured documents with and without RLF. We show that using these features improves the final segmentation on documents with a strong structure, while their application on unstructured documents does not show significant improvement. Although this paper is not focused on segmenting unstructured documents, results obtained on a benchmark dataset equal or even surpass previous results of similar works.
|
|
|
Francisco Cruz and Oriol Ramos Terrades. 2014. EM-Based Layout Analysis Method for Structured Documents. 22nd International Conference on Pattern Recognition. 315–320.
Abstract: In this paper we present a method to perform layout analysis in structured documents. We propose an EM-based algorithm to fit a set of Gaussian mixtures to the different regions according to the logical distribution along the page. After convergence, we estimate the final shape of the regions according to the parameters computed for each component of the mixture. We evaluated our method in the task of record detection in a collection of historical structured documents and performed a comparison with other previous works in this task.
|
|