Records |
Links |
Author |
S.K. Jemni; Mohamed Ali Souibgui; Yousri Kessentini; Alicia Fornes |
Title |
Enhance to Read Better: A Multi-Task Adversarial Network for Handwritten Document Image Enhancement |
Type |
Journal Article |
Year |
2022 |
Publication |
Pattern Recognition |
Abbreviated Journal |
PR |
Volume |
123 |
Issue |
Pages |
108370 |
Keywords |
Abstract |
Handwritten document images can be highly affected by degradation for different reasons: paper ageing, daily-life scenarios (wrinkles, dust, etc.), bad scanning process and so on. These artifacts raise many readability issues for current Handwritten Text Recognition (HTR) algorithms and severely devalue their efficiency. In this paper, we propose an end-to-end architecture based on Generative Adversarial Networks (GANs) to recover the degraded documents into a clean and readable form. Unlike the most well-known document binarization methods, which try to improve the visual quality of the degraded document, the proposed architecture integrates a handwritten text recognizer that promotes the generated document image to be more readable. To the best of our knowledge, this is the first work to use the text information while binarizing handwritten documents. Extensive experiments conducted on degraded Arabic and Latin handwritten documents demonstrate the usefulness of integrating the recognizer within the GAN architecture, which improves both the visual quality and the readability of the degraded document images. Moreover, we outperform the state of the art in H-DIBCO challenges, after fine-tuning our pre-trained model with synthetically degraded Latin handwritten images, on this task. |
Address |
Corporate Author |
Thesis |
Publisher |
Place of Publication |
Editor |
Language |
Summary Language |
Original Title |
Series Editor |
Series Title |
Abbreviated Series Title |
Series Volume |
Series Issue |
Edition |
Medium |
Area |
Expedition |
Conference |
Notes |
DAG; 600.124; 600.121; 602.230 |
Approved |
no |
Call Number |
Admin @ si @ JSK2022 |
Serial |
3613 |
Author |
Lei Kang; Pau Riba; Marçal Rusiñol; Alicia Fornes; Mauricio Villegas |
Title |
Content and Style Aware Generation of Text-line Images for Handwriting Recognition |
Type |
Journal Article |
Year |
2021 |
Publication |
IEEE Transactions on Pattern Analysis and Machine Intelligence |
Abbreviated Journal |
Volume |
Issue |
Pages |
Keywords |
Abstract |
Handwritten Text Recognition has achieved an impressive performance in public benchmarks. However, due to the high inter- and intra-class variability between handwriting styles, such recognizers need to be trained using huge volumes of manually labeled training data. To alleviate this labor-consuming problem, synthetic data produced with TrueType fonts has been often used in the training loop to gain volume and augment the handwriting style variability. However, there is a significant style bias between synthetic and real data which hinders the improvement of recognition performance. To deal with such limitations, we propose a generative method for handwritten text-line images, which is conditioned on both visual appearance and textual content. Our method is able to produce long text-line samples with diverse handwriting styles. Once properly trained, our method can also be adapted to new target data by only accessing unlabeled text-line images to mimic handwritten styles and produce images with any textual content. Extensive experiments have been done on making use of the generated samples to boost Handwritten Text Recognition performance. Both qualitative and quantitative results demonstrate that the proposed approach outperforms the current state of the art. |
Address |
Corporate Author |
Thesis |
Publisher |
Place of Publication |
Editor |
Language |
Summary Language |
Original Title |
Series Editor |
Series Title |
Abbreviated Series Title |
Series Volume |
Series Issue |
Edition |
Medium |
Area |
Expedition |
Conference |
Notes |
DAG; 600.140; 600.121 |
Approved |
no |
Call Number |
Admin @ si @ KRR2021 |
Serial |
3612 |
Author |
Lei Kang; Pau Riba; Marçal Rusiñol; Alicia Fornes; Mauricio Villegas |
Title |
Pay Attention to What You Read: Non-recurrent Handwritten Text-Line Recognition |
Type |
Journal Article |
Year |
2022 |
Publication |
Pattern Recognition |
Abbreviated Journal |
PR |
Volume |
129 |
Issue |
Pages |
108766 |
Keywords |
Abstract |
The advent of recurrent neural networks for handwriting recognition marked an important milestone reaching impressive recognition accuracies despite the great variability that we observe across different writing styles. Sequential architectures are a perfect fit to model text lines, not only because of the inherent temporal aspect of text, but also to learn probability distributions over sequences of characters and words. However, using such recurrent paradigms comes at a cost at training stage, since their sequential pipelines prevent parallelization. In this work, we introduce a non-recurrent approach to recognize handwritten text by the use of transformer models. We propose a novel method that bypasses any recurrence. By using multi-head self-attention layers both at the visual and textual stages, we are able to tackle character recognition as well as to learn language-related dependencies of the character sequences to be decoded. Our model is unconstrained to any predefined vocabulary, being able to recognize out-of-vocabulary words, i.e. words that do not appear in the training vocabulary. We significantly advance over prior art and demonstrate that satisfactory recognition accuracies are yielded even in few-shot learning scenarios. |
Address |
Sept. 2022 |
Corporate Author |
Thesis |
Publisher |
Place of Publication |
Editor |
Language |
Summary Language |
Original Title |
Series Editor |
Series Title |
Abbreviated Series Title |
Series Volume |
Series Issue |
Edition |
Medium |
Area |
Expedition |
Conference |
Notes |
DAG; 600.121; 600.162 |
Approved |
no |
Call Number |
Admin @ si @ KRR2022 |
Serial |
3556 |
Author |
Lei Kang; Pau Riba; Mauricio Villegas; Alicia Fornes; Marçal Rusiñol |
Title |
Candidate Fusion: Integrating Language Modelling into a Sequence-to-Sequence Handwritten Word Recognition Architecture |
Type |
Journal Article |
Year |
2021 |
Publication |
Pattern Recognition |
Abbreviated Journal |
PR |
Volume |
112 |
Issue |
Pages |
107790 |
Keywords |
Abstract |
Sequence-to-sequence models have recently become very popular for tackling handwritten word recognition problems. However, how to effectively integrate an external language model into such a recognizer is still a challenging problem. The main challenge faced when training a language model is to deal with the language model corpus, which is usually different from the one used for training the handwritten word recognition system. Thus, the bias between both word corpora leads to incorrectness on the transcriptions, providing similar or even worse performances on the recognition task. In this work, we introduce Candidate Fusion, a novel way to integrate an external language model to a sequence-to-sequence architecture. Moreover, it provides suggestions from an external language knowledge, as a new input to the sequence-to-sequence recognizer. Hence, Candidate Fusion provides two improvements. On the one hand, the sequence-to-sequence recognizer has the flexibility not only to combine the information from itself and the language model, but also to choose the importance of the information provided by the language model. On the other hand, the external language model has the ability to adapt itself to the training corpus and even learn the most common errors produced from the recognizer. Finally, by conducting comprehensive experiments, the Candidate Fusion proves to outperform the state-of-the-art language models for handwritten word recognition tasks. |
Address |
Corporate Author |
Thesis |
Publisher |
Place of Publication |
Editor |
Language |
Summary Language |
Original Title |
Series Editor |
Series Title |
Abbreviated Series Title |
Series Volume |
Series Issue |
Edition |
Medium |
Area |
Expedition |
Conference |
Notes |
DAG; 600.140; 601.302; 601.312; 600.121 |
Approved |
no |
Call Number |
Admin @ si @ KRV2021 |
Serial |
3343 |
Author |
Thanh Nam Le; Muhammad Muzzamil Luqman; Anjan Dutta; Pierre Heroux; Christophe Rigaud; Clement Guerin; Pasquale Foggia; Jean Christophe Burie; Jean Marc Ogier; Josep Llados; Sebastien Adam |
Title |
Subgraph spotting in graph representations of comic book images |
Type |
Journal Article |
Year |
2018 |
Publication |
Pattern Recognition Letters |
Abbreviated Journal |
Volume |
112 |
Issue |
Pages |
118-124 |
Keywords |
Attributed graph; Region adjacency graph; Graph matching; Graph isomorphism; Subgraph isomorphism; Subgraph spotting; Graph indexing; Graph retrieval; Query by example; Dataset and comic book images |
Abstract |
Graph-based representations are the most powerful data structures for extracting, representing and preserving the structural information of underlying data. Subgraph spotting is an interesting research problem, especially for studying and investigating the structural information based content-based image retrieval (CBIR) and query by example (QBE) in image databases. In this paper we address the problem of lack of freely available ground-truthed datasets for subgraph spotting and present a new dataset for subgraph spotting in graph representations of comic book images (SSGCI) with its ground-truth and evaluation protocol. Experimental results of two state-of-the-art methods of subgraph spotting are presented on the new SSGCI dataset. |
Address |
Corporate Author |
Thesis |
Publisher |
Place of Publication |
Editor |
Language |
Summary Language |
Original Title |
Series Editor |
Series Title |
Abbreviated Series Title |
Series Volume |
Series Issue |
Edition |
Medium |
Area |
Expedition |
Conference |
Notes |
DAG; 600.097; 600.121 |
Approved |
no |
Call Number |
Admin @ si @ LLD2018 |
Serial |
3150 |