|
Anders Hast and Alicia Fornes. 2016. A Segmentation-free Handwritten Word Spotting Approach by Relaxed Feature Matching. 12th IAPR Workshop on Document Analysis Systems. 150–155.
Abstract: The automatic recognition of historical handwritten documents is still considered a challenging task. For this reason, word spotting emerges as a good alternative for making the information contained in these documents available to the user. Word spotting is defined as the task of retrieving all instances of the query word in a document collection, becoming a useful tool for information retrieval. In this paper we propose a segmentation-free word spotting approach able to deal with large document collections. Our method is inspired by feature matching algorithms that have been applied to image matching and retrieval. Since handwritten words have different shapes, there is no exact transformation to be obtained. However, a sufficient degree of relaxation is achieved by using a Fourier based descriptor and an alternative approach to RANSAC called PUMA. The proposed approach is evaluated on historical marriage records, achieving promising results.
|
|
|
Alloy Das, Sanket Biswas, Umapada Pal and Josep Llados. 2024. Diving into the Depths of Spotting Text in Multi-Domain Noisy Scenes. IEEE International Conference on Robotics and Automation in PACIFICO.
Abstract: When used in a real-world noisy environment, the capacity to generalize to multiple domains is essential for any autonomous scene text spotting system. However, existing state-of-the-art methods employ pretraining and fine-tuning strategies on natural scene datasets, which do not exploit the feature interaction across other complex domains. In this work, we explore and investigate the problem of domain-agnostic scene text spotting, i.e., training a model on multi-domain source data such that it can directly generalize to target domains rather than being specialized for a specific domain or scenario. In this regard, we present to the community a text spotting validation benchmark called Under-Water Text (UWT) for noisy underwater scenes to establish an important case study. Moreover, we also design an efficient super-resolution based end-to-end transformer baseline called DA-TextSpotter which achieves comparable or superior performance over existing text spotting architectures for both regular and arbitrary-shaped scene text spotting benchmarks in terms of both accuracy and model efficiency. The dataset, code and pre-trained models will be released upon acceptance.
|
|
|
Alloy Das, Sanket Biswas, Ayan Banerjee, Josep Llados, Umapada Pal and Saumik Bhattacharya. 2024. Harnessing the Power of Multi-Lingual Datasets for Pre-training: Towards Enhancing Text Spotting Performance. Winter Conference on Applications of Computer Vision. 718–728.
Abstract: The adaptation capability to a wide range of domains is crucial for scene text spotting models when deployed to real-world conditions. However, existing state-of-the-art (SOTA) approaches usually incorporate scene text detection and recognition simply by pretraining on natural scene text datasets, which do not directly exploit the intermediate feature representations between multiple domains. Here, we investigate the problem of domain-adaptive scene text spotting, i.e., training a model on multi-domain source data such that it can directly adapt to target domains rather than being specialized for a specific domain or scenario. Further, we investigate a transformer baseline called Swin-TESTR to focus on solving scene-text spotting for both regular and arbitrary-shaped scene text along with an exhaustive evaluation. The results clearly demonstrate the potential of intermediate representations to achieve significant performance on text spotting benchmarks across multiple domains (e.g. language, synth-to-real, and documents), both in terms of accuracy and efficiency.
|
|
|
Alicia Fornes, Xavier Otazu and Josep Llados. 2013. Show through cancellation and image enhancement by multiresolution contrast processing. 12th International Conference on Document Analysis and Recognition. 200–204.
Abstract: Historical documents suffer from different types of degradation and noise such as background variation, uneven illumination or dark spots. In the case of double-sided documents, another common problem is that the back side of the document usually interferes with the front side because of the transparency of the document or ink bleeding. This effect is called the show-through phenomenon. Many methods have been developed to solve these problems, and in the case of show-through, by scanning and matching both the front and back sides of the document. In contrast, our approach is designed to use only one side of the scanned document. We hypothesize that show-through components are low contrast ones, while foreground components are high contrast ones. A Multiresolution Contrast (MC) decomposition is presented in order to estimate the contrast of features at different spatial scales. We cancel the show-through phenomenon by thresholding these low contrast components. This decomposition is also able to enhance the image, removing shadowed areas by weighting spatial scales. Results show that the enhanced images improve the readability of the documents, allowing scholars both to recover unreadable words and to solve ambiguities.
|
|
|
Alicia Fornes, Volkmar Frinken, Andreas Fischer, Jon Almazan, G. Jackson and Horst Bunke. 2011. A Keyword Spotting Approach Using Blurred Shape Model-Based Descriptors. Proceedings of the 2011 Workshop on Historical Document Imaging and Processing. ACM, 83–90.
Abstract: The automatic processing of handwritten historical documents is considered a hard problem in pattern recognition. In addition to the challenges given by modern handwritten data, a lack of training data as well as effects caused by the degradation of documents can be observed. In this scenario, keyword spotting arises as a viable solution to make documents amenable for searching and browsing. For this task we propose the adaptation of shape descriptors used in symbol recognition. By treating each word image as a shape, it can be represented using the Blurred Shape Model and the Deformable Blurred Shape Model. Experiments on the George Washington database demonstrate that this approach is able to outperform the commonly used Dynamic Time Warping approach.
|
|
|
Alicia Fornes and 6 others. 2017. ICDAR2017 Competition on Information Extraction in Historical Handwritten Records. 14th International Conference on Document Analysis and Recognition. 1389–1394.
Abstract: The extraction of relevant information from historical handwritten document collections is one of the key steps in order to make these manuscripts available for access and searches. In this competition, the goal is to detect the named entities and assign each of them a semantic category, and therefore, to simulate the filling in of a knowledge database. This paper describes the dataset, the tasks, the evaluation metrics, the participants' methods and the results.
|
|
|
Alicia Fornes, V.C. Kieu, M. Visani, N. Journet and Anjan Dutta. 2014. The ICDAR/GREC 2013 Music Scores Competition: Staff Removal. In B. Lamiroy and J.-M. Ogier, eds. Graphics Recognition. Current Trends and Challenges. Springer Berlin Heidelberg, 207–220. (LNCS.)
Abstract: The first competition on music scores that was organized at ICDAR and GREC in 2011 awoke the interest of researchers, who participated in both staff removal and writer identification tasks. In this second edition, we focus on the staff removal task and simulate a real case scenario concerning old and degraded music scores. For this purpose, we have generated a new set of semi-synthetic images using two degradation models that we previously introduced: local noise and 3D distortions. In this extended paper we provide a fuller description of the dataset, degradation models, evaluation metrics, the participants' methods and the obtained results that could not be presented in the ICDAR and GREC proceedings due to page limitations.
Keywords: Competition; Graphics recognition; Music scores; Writer identification; Staff removal
|
|
|
Alicia Fornes, Sergio Escalera, Josep Llados, Gemma Sanchez, Petia Radeva and Oriol Pujol. 2007. Handwritten Symbol Recognition by a Boosted Blurred Shape Model with Error Correction. 3rd Iberian Conference on Pattern Recognition and Image Analysis (IbPRIA 2007), J. Marti et al. (Eds.) LNCS 4477:13–21.
|
|
|
Alicia Fornes, Sergio Escalera, Josep Llados, Gemma Sanchez and Joan Mas. 2008. Hand Drawn Symbol Recognition by Blurred Shape Model Descriptor and a Multiclass Classifier. In W. Liu, J.L., J.M. Ogier, ed. Graphics Recognition: Recent Advances and New Opportunities. 30–40. (LNCS.)
|
|
|
Alicia Fornes, Sergio Escalera, Josep Llados and Gemma Sanchez. 2007. Symbol Recognition by Multi-class Blurred Shape Models. Seventh IAPR International Workshop on Graphics Recognition. 11–13.
|
|