|
Records |
Links |
|
Author |
David Aldavert; Marçal Rusiñol |
|
|
Title |
Manuscript text line detection and segmentation using second-order derivatives analysis |
Type |
Conference Article |
|
Year |
2018 |
Publication |
13th IAPR International Workshop on Document Analysis Systems |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
293 - 298 |
|
|
Keywords |
text line detection; text line segmentation; text region detection; second-order derivatives |
|
|
Abstract |
In this paper, we explore the use of second-order derivatives to detect text lines on handwritten document images. Taking advantage that the second derivative gives a minimum response when a dark linear element over a
bright background has the same orientation as the filter, we use this operator to create a map with the local orientation and strength of putative text lines in the document. Then, we detect line segments by selecting and merging the filter responses that have a similar orientation and scale. Finally, text lines are found by merging the segments that are within the same text region. The proposed segmentation algorithm, is learning-free while showing a performance similar to the state of the art methods in publicly available datasets. |
|
|
Address |
Viena; Austria; April 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
DAS |
|
|
Notes |
DAG; 600.084; 600.129; 302.065; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ AlR2018a |
Serial |
3104 |
|
Permanent link to this record |
|
|
|
|
Author |
David Aldavert; Marçal Rusiñol |
|
|
Title |
Synthetically generated semantic codebook for Bag-of-Visual-Words based word spotting |
Type |
Conference Article |
|
Year |
2018 |
Publication |
13th IAPR International Workshop on Document Analysis Systems |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
223 - 228 |
|
|
Keywords |
Word Spotting; Bag of Visual Words; Synthetic Codebook; Semantic Information |
|
|
Abstract |
Word-spotting methods based on the Bag-ofVisual-Words framework have demonstrated a good retrieval performance even when used in a completely unsupervised manner. Although unsupervised approaches are suitable for
large document collections due to the cost of acquiring labeled data, these methods also present some drawbacks. For instance, having to train a suitable “codebook” for a certain dataset has a high computational cost. Therefore, in
this paper we present a database agnostic codebook which is trained from synthetic data. The aim of the proposed approach is to generate a codebook where the only information required is the type of script used in the document. The use of synthetic data also allows to easily incorporate semantic
information in the codebook generation. So, the proposed method is able to determine which set of codewords have a semantic representation of the descriptor feature space. Experimental results show that the resulting codebook attains a state-of-the-art performance while having a more compact representation. |
|
|
Address |
Viena; Austria; April 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
DAS |
|
|
Notes |
DAG; 600.084; 600.129; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ AlR2018b |
Serial |
3105 |
|
Permanent link to this record |
|
|
|
|
Author |
V. Poulain d'Andecy; Emmanuel Hartmann; Marçal Rusiñol |
|
|
Title |
Field Extraction by hybrid incremental and a-priori structural templates |
Type |
Conference Article |
|
Year |
2018 |
Publication |
13th IAPR International Workshop on Document Analysis Systems |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
251 - 256 |
|
|
Keywords |
Layout Analysis; information extraction; incremental learning |
|
|
Abstract |
In this paper, we present an incremental framework for extracting information fields from administrative documents. First, we demonstrate some limits of the existing state-of-the-art methods such as the delay of the system efficiency. This is a concern in industrial context when we have only few samples of each document class. Based on this analysis, we propose a hybrid system combining incremental learning by means of itf-df statistics and a-priori generic
models. We report in the experimental section our results obtained with a dataset of real invoices. |
|
|
Address |
Viena; Austria; April 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
DAS |
|
|
Notes |
DAG; 600.084; 600.129; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ PHR2018 |
Serial |
3106 |
|
Permanent link to this record |
|
|
|
|
Author |
Manuel Carbonell; Mauricio Villegas; Alicia Fornes; Josep Llados |
|
|
Title |
Joint Recognition of Handwritten Text and Named Entities with a Neural End-to-end Model |
Type |
Conference Article |
|
Year |
2018 |
Publication |
13th IAPR International Workshop on Document Analysis Systems |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
399-404 |
|
|
Keywords |
Named entity recognition; Handwritten Text Recognition; neural networks |
|
|
Abstract |
When extracting information from handwritten documents, text transcription and named entity recognition are usually faced as separate subsequent tasks. This has the disadvantage that errors in the first module affect heavily the
performance of the second module. In this work we propose to do both tasks jointly, using a single neural network with a common architecture used for plain text recognition. Experimentally, the work has been tested on a collection of historical marriage records. Results of experiments are presented to show the effect on the performance for different
configurations: different ways of encoding the information, doing or not transfer learning and processing at text line or multi-line region level. The results are comparable to state of the art reported in the ICDAR 2017 Information Extraction competition, even though the proposed technique does not use any dictionaries, language modeling or post processing. |
|
|
Address |
Vienna; Austria; April 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
DAS |
|
|
Notes |
DAG; 600.097; 603.057; 601.311; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ CVF2018 |
Serial |
3170 |
|
Permanent link to this record |
|
|
|
|
Author |
Adria Molina; Lluis Gomez; Oriol Ramos Terrades; Josep Llados |
|
|
Title |
A Generic Image Retrieval Method for Date Estimation of Historical Document Collections |
Type |
Conference Article |
|
Year |
2022 |
Publication |
Document Analysis Systems.15th IAPR International Workshop, (DAS2022) |
Abbreviated Journal |
|
|
|
Volume |
13237 |
Issue |
|
Pages |
583–597 |
|
|
Keywords |
Date estimation; Document retrieval; Image retrieval; Ranking loss; Smooth-nDCG |
|
|
Abstract |
Date estimation of historical document images is a challenging problem, with several contributions in the literature that lack of the ability to generalize from one dataset to others. This paper presents a robust date estimation system based in a retrieval approach that generalizes well in front of heterogeneous collections. We use a ranking loss function named smooth-nDCG to train a Convolutional Neural Network that learns an ordination of documents for each problem. One of the main usages of the presented approach is as a tool for historical contextual retrieval. It means that scholars could perform comparative analysis of historical images from big datasets in terms of the period where they were produced. We provide experimental evaluation on different types of documents from real datasets of manuscript and newspaper images. |
|
|
Address |
La Rochelle, France; May 22–25, 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
DAS |
|
|
Notes |
DAG; 600.140; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ MGR2022 |
Serial |
3694 |
|
Permanent link to this record |
|
|
|
|
Author |
Josep Brugues Pujolras; Lluis Gomez; Dimosthenis Karatzas |
|
|
Title |
A Multilingual Approach to Scene Text Visual Question Answering |
Type |
Conference Article |
|
Year |
2022 |
Publication |
Document Analysis Systems.15th IAPR International Workshop, (DAS2022) |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
65-79 |
|
|
Keywords |
Scene text; Visual question answering; Multilingual word embeddings; Vision and language; Deep learning |
|
|
Abstract |
Scene Text Visual Question Answering (ST-VQA) has recently emerged as a hot research topic in Computer Vision. Current ST-VQA models have a big potential for many types of applications but lack the ability to perform well on more than one language at a time due to the lack of multilingual data, as well as the use of monolingual word embeddings for training. In this work, we explore the possibility to obtain bilingual and multilingual VQA models. In that regard, we use an already established VQA model that uses monolingual word embeddings as part of its pipeline and substitute them by FastText and BPEmb multilingual word embeddings that have been aligned to English. Our experiments demonstrate that it is possible to obtain bilingual and multilingual VQA models with a minimal loss in performance in languages not used during training, as well as a multilingual model trained in multiple languages that match the performance of the respective monolingual baselines. |
|
|
Address |
La Rochelle, France; May 22–25, 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
DAS |
|
|
Notes |
DAG; 611.004; 600.155; 601.002 |
Approved |
no |
|
|
Call Number |
Admin @ si @ BGK2022b |
Serial |
3695 |
|
Permanent link to this record |
|
|
|
|
Author |
Sergi Garcia Bordils; George Tom; Sangeeth Reddy; Minesh Mathew; Marçal Rusiñol; C.V. Jawahar; Dimosthenis Karatzas |
|
|
Title |
Read While You Drive-Multilingual Text Tracking on the Road |
Type |
Conference Article |
|
Year |
2022 |
Publication |
15th IAPR International workshop on document analysis systems |
Abbreviated Journal |
|
|
|
Volume |
13237 |
Issue |
|
Pages |
756–770 |
|
|
Keywords |
|
|
|
Abstract |
Visual data obtained during driving scenarios usually contain large amounts of text that conveys semantic information necessary to analyse the urban environment and is integral to the traffic control plan. Yet, research on autonomous driving or driver assistance systems typically ignores this information. To advance research in this direction, we present RoadText-3K, a large driving video dataset with fully annotated text. RoadText-3K is three times bigger than its predecessor and contains data from varied geographical locations, unconstrained driving conditions and multiple languages and scripts. We offer a comprehensive analysis of tracking by detection and detection by tracking methods exploring the limits of state-of-the-art text detection. Finally, we propose a new end-to-end trainable tracking model that yields state-of-the-art results on this challenging dataset. Our experiments demonstrate the complexity and variability of RoadText-3K and establish a new, realistic benchmark for scene text tracking in the wild. |
|
|
Address |
La Rochelle; France; May 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-3-031-06554-5 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
DAS |
|
|
Notes |
DAG; 600.155; 611.022; 611.004 |
Approved |
no |
|
|
Call Number |
Admin @ si @ GTR2022 |
Serial |
3783 |
|
Permanent link to this record |
|
|
|
|
Author |
Alicia Fornes; Josep Llados; Joan Mas; Joana Maria Pujadas-Mora; Anna Cabre |
|
|
Title |
A Bimodal Crowdsourcing Platform for Demographic Historical Manuscripts |
Type |
Conference Article |
|
Year |
2014 |
Publication |
Digital Access to Textual Cultural Heritage Conference |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
103-108 |
|
|
Keywords |
|
|
|
Abstract |
In this paper we present a crowdsourcing web-based application for extracting information from demographic handwritten document images. The proposed application integrates two points of view: the semantic information for demographic research, and the ground-truthing for document analysis research. Concretely, the application has the contents view, where the information is recorded into forms, and the labeling view, with the word labels for evaluating document analysis techniques. The crowdsourcing architecture allows to accelerate the information extraction (many users can work simultaneously), validate the information, and easily provide feedback to the users. We finally show how the proposed application can be extended to other kind of demographic historical manuscripts. |
|
|
Address |
Madrid; May 2014 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-1-4503-2588-2 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
DATeCH |
|
|
Notes |
DAG; 600.061; 602.006; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ FLM2014 |
Serial |
2516 |
|
Permanent link to this record |
|
|
|
|
Author |
Arnau Baro; Jialuo Chen; Alicia Fornes; Beata Megyesi |
|
|
Title |
Towards a generic unsupervised method for transcription of encoded manuscripts |
Type |
Conference Article |
|
Year |
2019 |
Publication |
3rd International Conference on Digital Access to Textual Cultural Heritage |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
73-78 |
|
|
Keywords |
A. Baró, J. Chen, A. Fornés, B. Megyesi. |
|
|
Abstract |
Historical ciphers, a special type of manuscripts, contain encrypted information, important for the interpretation of our history. The first step towards decipherment is to transcribe the images, either manually or by automatic image processing techniques. Despite the improvements in handwritten text recognition (HTR) thanks to deep learning methodologies, the need of labelled data to train is an important limitation. Given that ciphers often use symbol sets across various alphabets and unique symbols without any transcription scheme available, these supervised HTR techniques are not suitable to transcribe ciphers. In this paper we propose an un-supervised method for transcribing encrypted manuscripts based on clustering and label propagation, which has been successfully applied to community detection in networks. We analyze the performance on ciphers with various symbol sets, and discuss the advantages and drawbacks compared to supervised HTR methods. |
|
|
Address |
Brussels; May 2019 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
DATeCH |
|
|
Notes |
DAG; 600.097; 600.140; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ BCF2019 |
Serial |
3276 |
|
Permanent link to this record |
|
|
|
|
Author |
Alicia Fornes; Beata Megyesi; Joan Mas |
|
|
Title |
Transcription of Encoded Manuscripts with Image Processing Techniques |
Type |
Conference Article |
|
Year |
2017 |
Publication |
Digital Humanities Conference |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
441-443 |
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
DH |
|
|
Notes |
DAG; 600.097; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ FMM2017 |
Serial |
3061 |
|
Permanent link to this record |