|
Records |
Links |
|
Author  |
Lei Kang; Pau Riba; Marçal Rusiñol; Alicia Fornes; Mauricio Villegas |


|
|
Title |
Pay Attention to What You Read: Non-recurrent Handwritten Text-Line Recognition |
Type |
Journal Article |
|
Year |
2022 |
Publication |
Pattern Recognition |
Abbreviated Journal |
PR |
|
|
Volume |
129 |
Issue |
|
Pages |
108766 |
|
|
Keywords |
|
|
|
Abstract |
The advent of recurrent neural networks for handwriting recognition marked an important milestone reaching impressive recognition accuracies despite the great variability that we observe across different writing styles. Sequential architectures are a perfect fit to model text lines, not only because of the inherent temporal aspect of text, but also to learn probability distributions over sequences of characters and words. However, using such recurrent paradigms comes at a cost at training stage, since their sequential pipelines prevent parallelization. In this work, we introduce a non-recurrent approach to recognize handwritten text by the use of transformer models. We propose a novel method that bypasses any recurrence. By using multi-head self-attention layers both at the visual and textual stages, we are able to tackle character recognition as well as to learn language-related dependencies of the character sequences to be decoded. Our model is unconstrained to any predefined vocabulary, being able to recognize out-of-vocabulary words, i.e. words that do not appear in the training vocabulary. We significantly advance over prior art and demonstrate that satisfactory recognition accuracies are yielded even in few-shot learning scenarios. |
|
|
Address |
Sept. 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.121; 600.162 |
Approved |
no |
|
|
Call Number |
Admin @ si @ KRR2022 |
Serial |
3556 |
|
Permanent link to this record |
|
|
|
|
Author  |
Lei Kang; Marçal Rusiñol; Alicia Fornes; Pau Riba; Mauricio Villegas |


|
|
Title |
Unsupervised Adaptation for Synthetic-to-Real Handwritten Word Recognition |
Type |
Conference Article |
|
Year |
2020 |
Publication |
IEEE Winter Conference on Applications of Computer Vision |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
Handwritten Text Recognition (HTR) is still a challenging problem because it must deal with two important difficulties: the variability among writing styles, and the scarcity of labelled data. To alleviate such problems, synthetic data generation and data augmentation are typically used to train HTR systems. However, training with such data produces encouraging but still inaccurate transcriptions in real words. In this paper, we propose an unsupervised writer adaptation approach that is able to automatically adjust a generic handwritten word recognizer, fully trained with synthetic fonts, towards a new incoming writer. We have experimentally validated our proposal using five different datasets, covering several challenges (i) the document source: modern and historic samples, which may involve paper degradation problems; (ii) different handwriting styles: single and multiple writer collections; and (iii) language, which involves different character combinations. Across these challenging collections, we show that our system is able to maintain its performance, thus, it provides a practical and generic approach to deal with new document collections without requiring any expensive and tedious manual annotation step. |
|
|
Address |
Aspen; Colorado; USA; March 2020 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
WACV |
|
|
Notes |
DAG; 600.129; 600.140; 601.302; 601.312; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ KRF2020 |
Serial |
3446 |
|
Permanent link to this record |
|
|
|
|
Author  |
Lei Kang; Juan Ignacio Toledo; Pau Riba; Mauricio Villegas; Alicia Fornes; Marçal Rusiñol |


|
|
Title |
Convolve, Attend and Spell: An Attention-based Sequence-to-Sequence Model for Handwritten Word Recognition |
Type |
Conference Article |
|
Year |
2018 |
Publication |
40th German Conference on Pattern Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
459-472 |
|
|
Keywords |
|
|
|
Abstract |
This paper proposes Convolve, Attend and Spell, an attention based sequence-to-sequence model for handwritten word recognition. The proposed architecture has three main parts: an encoder, consisting of a CNN and a bi-directional GRU, an attention mechanism devoted to focus on the pertinent features and a decoder formed by a one-directional GRU, able to spell the corresponding word, character by character. Compared with the recent state-of-the-art, our model achieves competitive results on the IAM dataset without needing any pre-processing step, predefined lexicon nor language model. Code and additional results are available in https://github.com/omni-us/research-seq2seq-HTR. |
|
|
Address |
Stuttgart; Germany; October 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
GCPR |
|
|
Notes |
DAG; 600.097; 603.057; 302.065; 601.302; 600.084; 600.121; 600.129 |
Approved |
no |
|
|
Call Number |
Admin @ si @ KTR2018 |
Serial |
3167 |
|
Permanent link to this record |
|
|
|
|
Author  |
Lei Kang |

|
|
Title |
Robust Handwritten Text Recognition in Scarce Labeling Scenarios: Disentanglement, Adaptation and Generation |
Type |
Book Whole |
|
Year |
2020 |
Publication |
PhD Thesis, Universitat Autonoma de Barcelona-CVC |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
Handwritten documents are not only preserved in historical archives but also widely used in administrative documents such as cheques and claims. With the rise of the deep learning era, many state-of-the-art approaches have achieved good performance on specific datasets for Handwritten Text Recognition (HTR). However, it is still challenging to solve real use cases because of the varied handwriting styles across different writers and the limited labeled data. Thus, both explorin a more robust handwriting recognition architectures and proposing methods to diminish the gap between the source and target data in an unsupervised way are
demanded.
In this thesis, firstly, we explore novel architectures for HTR, from Sequence-to-Sequence (Seq2Seq) method with attention mechanism to non-recurrent Transformer-based method. Secondly, we focus on diminishing the performance gap between source and target data in an unsupervised way. Finally, we propose a group of generative methods for handwritten text images, which could be utilized to increase the training set to obtain a more robust recognizer. In addition, by simply modifying the generative method and joining it with a recognizer, we end up with an effective disentanglement method to distill textual content from handwriting styles so as to achieve a generalized recognition performance.
We outperform state-of-the-art HTR performances in the experimental results among different scientific and industrial datasets, which prove the effectiveness of the proposed methods. To the best of our knowledge, the non-recurrent recognizer and the disentanglement method are the first contributions in the handwriting recognition field. Furthermore, we have outlined the potential research lines, which would be interesting to explore in the future. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
Ph.D. thesis |
|
|
Publisher |
Ediciones Graficas Rey |
Place of Publication |
|
Editor |
Alicia Fornes;Marçal Rusiñol;Mauricio Villegas |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-84-122714-0-9 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ Kan20 |
Serial |
3482 |
|
Permanent link to this record |
|
|
|
|
Author  |
Lasse Martensson; Ekta Vats; Anders Hast; Alicia Fornes |

|
|
Title |
In Search of the Scribe: Letter Spotting as a Tool for Identifying Scribes in Large Handwritten Text Corpora |
Type |
Journal |
|
Year |
2019 |
Publication |
Journal for Information Technology Studies as a Human Science |
Abbreviated Journal |
HUMAN IT |
|
|
Volume |
14 |
Issue |
2 |
Pages |
95-120 |
|
|
Keywords |
Scribal attribution/ writer identification; digital palaeography; word spotting; mediaeval charters; mediaeval manuscripts |
|
|
Abstract |
In this article, a form of the so-called word spotting-method is used on a large set of handwritten documents in order to identify those that contain script of similar execution. The point of departure for the investigation is the mediaeval Swedish manuscript Cod. Holm. D 3. The main scribe of this manuscript has yet not been identified in other documents. The current attempt aims at localising other documents that display a large degree of similarity in the characteristics of the script, these being possible candidates for being executed by the same hand. For this purpose, the method of word spotting has been employed, focusing on individual letters, and therefore the process is referred to as letter spotting in the article. In this process, a set of ‘g’:s, ‘h’:s and ‘k’:s have been selected as templates, and then a search has been made for close matches among the mediaeval Swedish charters. The search resulted in a number of charters that displayed great similarities with the manuscript D 3. The used letter spotting method thus proofed to be a very efficient sorting tool localising similar script samples. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.097; 600.140; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ MVH2019 |
Serial |
3234 |
|
Permanent link to this record |
|
|
|
|
Author  |
Lasse Martensson; Anders Hast; Alicia Fornes |


|
|
Title |
Word Spotting as a Tool for Scribal Attribution |
Type |
Conference Article |
|
Year |
2017 |
Publication |
2nd Conference of the association of Digital Humanities in the Nordic Countries |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
87-89 |
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Gothenburg; Suecia; March 2017 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-91-88348-83-8 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
DHN |
|
|
Notes |
DAG; 600.097; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ MHF2017 |
Serial |
2954 |
|
Permanent link to this record |
|
|
|
|
Author  |
L.Tarazon; D. Perez; N. Serrano; V. Alabau; Oriol Ramos Terrades; A. Sanchis; A. Juan |


|
|
Title |
Confidence Measures for Error Correction in Interactive Transcription of Handwritten Text |
Type |
Conference Article |
|
Year |
2009 |
Publication |
15th International Conference on Image Analysis and Processing |
Abbreviated Journal |
|
|
|
Volume |
5716 |
Issue |
|
Pages |
567-574 |
|
|
Keywords |
|
|
|
Abstract |
An effective approach to transcribe old text documents is to follow an interactive-predictive paradigm in which both, the system is guided by the human supervisor, and the supervisor is assisted by the system to complete the transcription task as efficiently as possible. In this paper, we focus on a particular system prototype called GIDOC, which can be seen as a first attempt to provide user-friendly, integrated support for interactive-predictive page layout analysis, text line detection and handwritten text transcription. More specifically, we focus on the handwriting recognition part of GIDOC, for which we propose the use of confidence measures to guide the human supervisor in locating possible system errors and deciding how to proceed. Empirical results are reported on two datasets showing that a word error rate not larger than a 10% can be achieved by only checking the 32% of words that are recognised with less confidence. |
|
|
Address |
Vietri sul Mare, Italy |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0302-9743 |
ISBN |
978-3-642-04145-7 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICIAP |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ TPS2009 |
Serial |
1871 |
|
Permanent link to this record |
|
|
|
|
Author  |
L. Rothacker; Marçal Rusiñol; Josep Llados; G.A. Fink |

|
|
Title |
A Two-stage Approach to Segmentation-Free Query-by-example Word Spotting |
Type |
Journal |
|
Year |
2014 |
Publication |
Manuscript Cultures |
Abbreviated Journal |
|
|
|
Volume |
7 |
Issue |
|
Pages |
47-58 |
|
|
Keywords |
|
|
|
Abstract |
With the ongoing progress in digitization, huge document collections and archives have become available to a broad audience. Scanned document images can be transmitted electronically and studied simultaneously throughout the world. While this is very beneficial, it is often impossible to perform automated searches on these document collections. Optical character recognition usually fails when it comes to handwritten or historic documents. In order to address the need for exploring document collections rapidly, researchers are working on word spotting. In query-by-example word spotting scenarios, the user selects an exemplary occurrence of the query word in a document image. The word spotting system then retrieves all regions in the collection that are visually similar to the given example of the query word. The best matching regions are presented to the user and no actual transcription is required.
An important property of a word spotting system is the computational speed with which queries can be executed. In our previous work, we presented a relatively slow but high-precision method. In the present work, we will extend this baseline system to an integrated two-stage approach. In a coarse-grained first stage, we will filter document images efficiently in order to identify regions that are likely to contain the query word. In the fine-grained second stage, these regions will be analyzed with our previously presented high-precision method. Finally, we will report recognition results and query times for the well-known George Washington
benchmark in our evaluation. We achieve state-of-the-art recognition results while the query times can be reduced to 50% in comparison with our baseline. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.061; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ |
Serial |
3190 |
|
Permanent link to this record |
|
|
|
|
Author  |
L. Rothacker; Marçal Rusiñol; G.A. Fink |


|
|
Title |
Bag-of-Features HMMs for segmentation-free word spotting in handwritten documents |
Type |
Conference Article |
|
Year |
2013 |
Publication |
12th International Conference on Document Analysis and Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
1305 - 1309 |
|
|
Keywords |
|
|
|
Abstract |
Recent HMM-based approaches to handwritten word spotting require large amounts of learning samples and mostly rely on a prior segmentation of the document. We propose to use Bag-of-Features HMMs in a patch-based segmentation-free framework that are estimated by a single sample. Bag-of-Features HMMs use statistics of local image feature representatives. Therefore they can be considered as a variant of discrete HMMs allowing to model the observation of a number of features at a point in time. The discrete nature enables us to estimate a query model with only a single example of the query provided by the user. This makes our method very flexible with respect to the availability of training data. Furthermore, we are able to outperform state-of-the-art results on the George Washington dataset. |
|
|
Address |
Washington; USA; August 2013 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1520-5363 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ RRF2013 |
Serial |
2344 |
|
Permanent link to this record |
|
|
|
|
Author  |
Kunal Biswas; Palaiahnakote Shivakumara; Umapada Pal; Tong Lu; Michel Blumenstein; Josep Llados |

|
|
Title |
Classification of aesthetic natural scene images using statistical and semantic features |
Type |
Journal Article |
|
Year |
2023 |
Publication |
Multimedia Tools and Applications |
Abbreviated Journal |
MTAP |
|
|
Volume |
82 |
Issue |
9 |
Pages |
13507-13532 |
|
|
Keywords |
|
|
|
Abstract |
Aesthetic image analysis is essential for improving the performance of multimedia image retrieval systems, especially from a repository of social media and multimedia content stored on mobile devices. This paper presents a novel method for classifying aesthetic natural scene images by studying the naturalness of image content using statistical features, and reading text in the images using semantic features. Unlike existing methods that focus only on image quality with human information, the proposed approach focuses on image features as well as text-based semantic features without human intervention to reduce the gap between subjectivity and objectivity in the classification. The aesthetic classes considered in this work are (i) Very Pleasant, (ii) Pleasant, (iii) Normal and (iv) Unpleasant. The naturalness is represented by features of focus, defocus, perceived brightness, perceived contrast, blurriness and noisiness, while semantics are represented by text recognition, description of the images and labels of images, profile pictures, and banner images. Furthermore, a deep learning model is proposed in a novel way to fuse statistical and semantic features for the classification of aesthetic natural scene images. Experiments on our own dataset and the standard datasets demonstrate that the proposed approach achieves 92.74%, 88.67% and 83.22% average classification rates on our own dataset, AVA dataset and CUHKPQ dataset, respectively. Furthermore, a comparative study of the proposed model with the existing methods shows that the proposed method is effective for the classification of aesthetic social media images. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ BSP2023 |
Serial |
3873 |
|
Permanent link to this record |