|
Partha Pratim Roy, Umapada Pal and Josep Llados. 2009. Seal detection and recognition: An approach for document indexing. 10th International Conference on Document Analysis and Recognition. 101–105.
Abstract: Reliable indexing of documents having seal instances can be achieved by recognizing seal information. This paper presents a novel approach for detecting and classifying such multi-oriented seals in these documents. First, Hough Transform based methods are applied to extract the seal regions in documents. Next, isolated text characters within these regions are detected. Rotation and size invariant features and a support vector machine based classifier have been used to recognize these detected text characters. Next, for each pair of characters, we encode their relative spatial organization using their distance and angular position with respect to the centre of the seal, and enter this code into a hash table. Given an input seal, we recognize the individual text characters and compute the code for each pair of characters based on their relative spatial organization. The code obtained from the input seal helps to retrieve model hypotheses from the hash table. The seal model that receives the maximum number of hypotheses is selected for the recognition of the input seal. The methodology is tested on indexing seals in a rotation and size invariant environment, and we obtained encouraging results.
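The pairwise encoding and voting steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the quantisation of the distance ratio and the angle, the bin counts, and the names `place`, `pair_code`, `build_table` and `vote` are all assumptions made for illustration, and the characters are assumed to be already recognised and not located exactly at the seal centre.

```python
import math
from collections import Counter, defaultdict
from itertools import permutations

def place(label, radius, degrees, centre=(0.0, 0.0)):
    """Hypothetical helper: a labelled character at polar position around a centre."""
    t = math.radians(degrees)
    return (label, centre[0] + radius * math.cos(t), centre[1] + radius * math.sin(t))

def pair_code(a, b, centre, dist_bins=8, angle_bins=12):
    """Quantised code for an ordered pair of characters (assumed encoding).

    The ratio of the two centre distances and the angle subtended at the
    centre do not change under rotation or uniform scaling of the seal.
    """
    (la, xa, ya), (lb, xb, yb) = a, b
    cx, cy = centre
    da = math.hypot(xa - cx, ya - cy)
    db = math.hypot(xb - cx, yb - cy)
    ratio = min(da, db) / max(da, db)  # lies in (0, 1]
    angle = (math.atan2(yb - cy, xb - cx) - math.atan2(ya - cy, xa - cx)) % (2 * math.pi)
    r_bin = min(int(ratio * dist_bins), dist_bins - 1)
    a_bin = int(angle / (2 * math.pi) * angle_bins) % angle_bins
    return (la, lb, r_bin, a_bin)

def build_table(models):
    """Enter the code of every ordered character pair of every model into a hash table."""
    table = defaultdict(set)
    for name, (chars, centre) in models.items():
        for a, b in permutations(chars, 2):
            table[pair_code(a, b, centre)].add(name)
    return table

def vote(table, chars, centre):
    """Each pair code of the input seal votes for the models stored under that code."""
    votes = Counter()
    for a, b in permutations(chars, 2):
        for name in table.get(pair_code(a, b, centre), ()):
            votes[name] += 1
    return votes

# Two toy seal models (illustrative positions, not real data).
models = {
    "seal_1": ([place("A", 10, 10), place("B", 6, 110), place("C", 9, 215)], (0.0, 0.0)),
    "seal_2": ([place("A", 10, 10), place("B", 6, 200), place("D", 9, 300)], (0.0, 0.0)),
}
table = build_table(models)

# Query: seal_1 rotated by 30 degrees, scaled by 2 and moved to a new centre.
q_centre = (50.0, 40.0)
query = [place("A", 20, 40, q_centre), place("B", 12, 140, q_centre), place("C", 18, 245, q_centre)]
votes = vote(table, query, q_centre)
best = votes.most_common(1)[0][0]
```

Hard quantisation is the simplest choice here; a pair whose ratio or angle falls close to a bin boundary could land in a neighbouring bin after noise, so a practical system would probably also probe adjacent bins.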
|
|
|
David Fernandez, R. Manmatha, Josep Llados and Alicia Fornes. 2014. Sequential Word Spotting in Historical Handwritten Documents. 11th IAPR International Workshop on Document Analysis and Systems. 101–105.
Abstract: In this work we present a handwritten word spotting approach that takes advantage of the a priori known order of appearance of the query words. Given an ordered sequence of query word instances, the proposed approach performs a sequence alignment with the words in the target collection. Although the alignment is quite sparse, i.e. the number of words in the database is higher than in the query set, the overall performance is appreciably higher than that of isolated word spotting. As an application dataset, we use a collection of handwritten marriage licenses, taking advantage of the ordered index pages of family names.
|
|
|
Sophie Wuerger, Kaida Xiao, Dimitris Mylonas, Q. Huang, Dimosthenis Karatzas and Galina Paramei. 2012. Blue green color categorization in Mandarin English speakers. JOSA A, 29(2), A102–A1207.
Abstract: Observers are faster to detect a target among a set of distracters if the targets and distracters come from different color categories. This cross-boundary advantage seems to be limited to the right visual field, which is consistent with the dominance of the left hemisphere for language processing [Gilbert et al., Proc. Natl. Acad. Sci. USA 103, 489 (2006)]. Here we study whether a similar visual field advantage is found in the color identification task in speakers of Mandarin, a language that uses a logographic system. Forty late Mandarin-English bilinguals performed a blue-green color categorization task, in a blocked design, in their first language (L1: Mandarin) or second language (L2: English). Eleven color singletons ranging from blue to green were presented for 160 ms, randomly in the left visual field (LVF) or right visual field (RVF). Color boundary and reaction times (RTs) at the color boundary were estimated in L1 and L2, for both visual fields. We found that the color boundary did not differ between the languages; RTs at the color boundary, however, were on average more than 100 ms shorter in the English compared to the Mandarin sessions, but only when the stimuli were presented in the RVF. The finding may be explained by the script nature of the two languages: Mandarin logographic characters are analyzed visuospatially in the right hemisphere, which conceivably facilitates identification of color presented to the LVF.
|
|
|
Alicia Fornes, Josep Llados, Joan Mas, Joana Maria Pujadas-Mora and Anna Cabre. 2014. A Bimodal Crowdsourcing Platform for Demographic Historical Manuscripts. Digital Access to Textual Cultural Heritage Conference. 103–108.
Abstract: In this paper we present a crowdsourcing web-based application for extracting information from demographic handwritten document images. The proposed application integrates two points of view: the semantic information for demographic research, and the ground-truthing for document analysis research. Concretely, the application has the contents view, where the information is recorded into forms, and the labeling view, with the word labels for evaluating document analysis techniques. The crowdsourcing architecture makes it possible to accelerate the information extraction (many users can work simultaneously), validate the information, and easily provide feedback to the users. We finally show how the proposed application can be extended to other kinds of demographic historical manuscripts.
|
|
|
Pau Riba, Alicia Fornes and Josep Llados. 2017. Towards the Alignment of Handwritten Music Scores. In Bart Lamiroy and R Dueire Lins, eds. International Workshop on Graphics Recognition. GREC 2015. Graphic Recognition. Current Trends and Challenges. 103–116. (LNCS.)
Abstract: It is very common to find different versions of the same music work in archives of Opera Theaters. These differences correspond to modifications and annotations from the musicians. From the musicologist point of view, these variations are very interesting and deserve study.
This paper explores the alignment of music scores as a tool for automatically detecting the passages that contain such differences. Given the difficulties in the recognition of handwritten music scores, our goal is to align the music scores and at the same time, avoid the recognition of music elements as much as possible. After removing the staff lines, braces and ties, the bar lines are detected. Then, the bar units are described as a whole using the Blurred Shape Model. The bar units alignment is performed by using Dynamic Time Warping. The analysis of the alignment path is used to detect the variations in the music scores. The method has been evaluated on a subset of the CVC-MUSCIMA dataset, showing encouraging results.
Keywords: Optical Music Recognition; Handwritten Music Scores; Dynamic Time Warping alignment
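The alignment step named in this abstract, Dynamic Time Warping over a sequence of bar descriptors, can be sketched as below. This is a generic textbook DTW, not the authors' code: the toy 2-D vectors stand in for Blurred Shape Model descriptors, and the function name `dtw` and the threshold used to flag differing bars are assumptions.

```python
import math

def dtw(seq_a, seq_b):
    """Classic DTW between two sequences of feature vectors.

    Returns the total alignment cost and the warping path as a list of
    (index_in_a, index_in_b) pairs.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(seq_a[i - 1], seq_b[j - 1])  # Euclidean local cost
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    # Backtrack from the end of both sequences to recover the path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),  # match both bars
            (cost[i - 1][j], i - 1, j),          # advance only in seq_a
            (cost[i][j - 1], i, j - 1),          # advance only in seq_b
        ]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path

# Toy bar descriptors: version_b contains one extra, different bar.
version_a = [(0, 0), (1, 1), (2, 2), (3, 3)]
version_b = [(0, 0), (1, 1), (5, 5), (2, 2), (3, 3)]

total, path = dtw(version_a, version_b)

# Assumed analysis of the path: flag aligned bar pairs whose local
# distance exceeds an illustrative threshold.
flagged = [(i, j) for i, j in path if math.dist(version_a[i], version_b[j]) > 1.0]
```

In this toy case the only flagged pair involves the inserted bar of `version_b`, which is the kind of variation the alignment path is meant to expose.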
|
|
|
Marçal Rusiñol and Josep Llados. 2008. A Region-Based Hashing Approach for Symbol Spotting in Technical Documents. In W. Lius, J.L., J.M. Ogier, ed. Graphics Recognition: Recent Advances and New Opportunities. 104–113. (LNCS.)
|
|
|
Josep Llados, Ernest Valveny, Gemma Sanchez and Enric Marti. 2002. Symbol recognition: current advances and perspectives. In Dorothea Blostein and Young-Bin Kwon, eds. Graphics Recognition Algorithms And Applications. Springer-Verlag, 104–128. (LNCS.)
Abstract: The recognition of symbols in graphic documents is an intensive research activity in the community of pattern recognition and document analysis. A key issue in the interpretation of maps, engineering drawings, diagrams, etc. is the recognition of domain dependent symbols according to a symbol database. In this work we first review the most outstanding symbol recognition methods from two different points of view: application domains and pattern recognition methods. In the second part of the paper, open and unaddressed problems involved in symbol recognition are described, analyzing their current state of the art and discussing future research challenges. Thus, issues such as symbol representation, matching, segmentation, learning, scalability of recognition methods and performance evaluation are addressed in this work. Finally, we discuss the perspectives of symbol recognition with respect to new paradigms such as user interfaces in handheld computers or document database and WWW indexing by graphical content.
|
|
|
R. Bertrand, P. Gomez-Krämer, Oriol Ramos Terrades, P. Franco and Jean-Marc Ogier. 2013. A System Based On Intrinsic Features for Fraudulent Document Detection. 12th International Conference on Document Analysis and Recognition. 106–110.
Abstract: Paper documents still represent a large amount of information supports used nowadays and may contain critical data. Even though official documents are secured with techniques such as printed patterns or artwork, paper documents suffer from a lack of security.
However, the high availability of cheap scanning and printing hardware allows non-experts to easily create fake documents. As the use of a watermarking system added during the document production step is hardly possible, solutions have to be proposed to distinguish a genuine document from a forged one.
In this paper, we present an automatic forgery detection method based on document’s intrinsic features at character level. This method is based on the one hand on outlier character detection in a discriminant feature space and on the other hand on the detection of strictly similar characters. Therefore, a feature set is computed for all characters. Then, based on a distance between characters of the same class.
Keywords: paper document; document analysis; fraudulent document; forgery; fake
|
|
|
Sergi Garcia Bordils, Dimosthenis Karatzas and Marçal Rusiñol. 2023. Accelerating Transformer-Based Scene Text Detection and Recognition via Token Pruning. 17th International Conference on Document Analysis and Recognition. 106–121. (LNCS.)
Abstract: Scene text detection and recognition is a crucial task in computer vision with numerous real-world applications. Transformer-based approaches are behind all current state-of-the-art models and have achieved excellent performance. However, the computational requirements of the transformer architecture make training these methods slow and resource heavy. In this paper, we introduce a new token pruning strategy that significantly decreases training and inference times without sacrificing performance, striking a balance between accuracy and speed. We have applied this pruning technique to our own end-to-end transformer-based scene text understanding architecture. Our method uses a separate detection branch to guide the pruning of uninformative image features, which significantly reduces the number of tokens at the input of the transformer. Experimental results show how our network is able to obtain competitive results on multiple public benchmarks while running at significantly higher speeds.
Keywords: Scene Text Detection; Scene Text Recognition; Transformer Acceleration
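The general idea of score-guided token pruning mentioned in this abstract can be illustrated with a small sketch. It only shows one plausible selection rule, keeping the highest-scoring fraction of tokens; the abstract does not specify the actual rule, so the function name `prune_tokens`, the `keep_ratio` parameter and the toy scores are assumptions.

```python
import math

def prune_tokens(tokens, text_scores, keep_ratio=0.25):
    """Keep the tokens with the highest per-token scores.

    tokens:      list of feature vectors, one per image patch.
    text_scores: one score per token, assumed to come from a separate
                 detection branch (higher means more likely to contain text).
    Returns the kept indices (in original order) and the kept tokens.
    """
    if len(tokens) != len(text_scores):
        raise ValueError("tokens and text_scores must have the same length")
    if not tokens:
        return [], []
    k = max(1, math.ceil(len(tokens) * keep_ratio))
    ranked = sorted(range(len(tokens)), key=lambda i: text_scores[i], reverse=True)[:k]
    kept = sorted(ranked)  # preserve the spatial order of the surviving tokens
    return kept, [tokens[i] for i in kept]

# Eight toy tokens with illustrative scores; only two are likely to contain text.
tokens = [[float(i), float(i) * 2.0] for i in range(8)]
scores = [0.05, 0.90, 0.10, 0.80, 0.02, 0.03, 0.40, 0.01]

kept_idx, kept_tokens = prune_tokens(tokens, scores, keep_ratio=0.25)
```

Because self-attention cost grows quadratically with the number of tokens, keeping a quarter of them in this toy setting would cut the attention cost to roughly one sixteenth, which is the kind of saving that motivates pruning before the transformer.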
|
|
|
Pau Riba, Josep Llados and Alicia Fornes. 2017. Error-tolerant coarse-to-fine matching model for hierarchical graphs. In Pasquale Foggia, Cheng-Lin Liu and Mario Vento, eds. 11th IAPR-TC-15 International Workshop on Graph-Based Representations in Pattern Recognition. Springer International Publishing, 107–117.
Abstract: Graph-based representations are effective tools to capture structural information from visual elements. However, retrieving a query graph from a large database of graphs implies a high computational complexity. Moreover, these representations are very sensitive to noise or small changes. In this work, a novel hierarchical graph representation is designed. Using graph clustering techniques adapted from graph-based social media analysis, we propose to generate a hierarchy able to deal with different levels of abstraction while keeping information about the topology. For the proposed representations, a coarse-to-fine matching method is defined. These approaches are validated using real scenarios such as classification of colour images and handwritten word spotting.
Keywords: Graph matching; Hierarchical graph; Graph-based representation; Coarse-to-fine matching
|
|