Author |
Arnau Baro; Pau Riba; Alicia Fornes |
|
|
Title |
Musigraph: Optical Music Recognition Through Object Detection and Graph Neural Network |
Type |
Conference Article |
|
Year |
2022 |
Publication |
Frontiers in Handwriting Recognition. International Conference on Frontiers in Handwriting Recognition (ICFHR2022) |
Abbreviated Journal |
|
|
|
Volume |
13639 |
Issue |
|
Pages |
171-184 |
|
|
Keywords |
Object detection; Optical music recognition; Graph neural network |
|
|
Abstract |
Over the last decades, the performance of optical music recognition has steadily improved. However, despite the 2-dimensional nature of music notation (e.g. notes have rhythm and pitch), most works treat musical scores as a one-dimensional sequence of symbols, which makes their recognition still a challenge. Thus, in this work we explore the use of graph neural networks for musical score recognition: first, because graphs are suited to n-dimensional representations, and second, because the combination of graphs with deep learning has shown strong performance in similar applications. Our methodology consists of three steps. First, we detect each isolated/atomic symbol (one that cannot be decomposed into further graphical primitives) and the primitives that form a musical symbol. Then, we build the graph, taking the notehead as root node and, as leaves, those primitives or symbols that modify the note’s rhythm (stem, beam, flag) or pitch (flat, sharp, natural). Finally, the graph is translated into a human-readable character sequence for final transcription and evaluation. Our method has been tested on more than five thousand measures, showing promising results. |
|
|
Address |
December 04 – 07, 2022; Hyderabad, India |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICFHR |
|
|
Notes |
DAG; 600.162; 600.140; 602.230 |
Approved |
no |
|
|
Call Number |
Admin @ si @ BRF2022b |
Serial |
3740 |
|
|
|
|
|
Author |
Giuseppe De Gregorio; Sanket Biswas; Mohamed Ali Souibgui; Asma Bensalah; Josep Llados; Alicia Fornes; Angelo Marcelli |
|
|
Title |
A Few Shot Multi-representation Approach for N-Gram Spotting in Historical Manuscripts |
Type |
Conference Article |
|
Year |
2022 |
Publication |
Frontiers in Handwriting Recognition. International Conference on Frontiers in Handwriting Recognition (ICFHR2022) |
Abbreviated Journal |
|
|
|
Volume |
13639 |
Issue |
|
Pages |
3-12 |
|
|
Keywords |
N-gram spotting; Few-shot learning; Multimodal understanding; Historical handwritten collections |
|
|
Abstract |
Despite recent advances in automatic text recognition, performance remains moderate on historical manuscripts. This is mainly because of the scarcity of labelled data available to train data-hungry Handwritten Text Recognition (HTR) models. Keyword Spotting (KWS) systems provide a valid alternative to HTR thanks to their lower error rate, but they are usually limited to a closed reference vocabulary. In this paper, we propose a few-shot learning paradigm for spotting sequences of a few characters (N-grams) that requires only a small amount of labelled training data. We show that recognizing important n-grams could reduce the system’s dependency on vocabulary: an out-of-vocabulary (OOV) word in an input handwritten line image can be treated as a sequence of n-grams that belong to the lexicon. An extensive experimental evaluation of the proposed multi-representation approach was carried out on a subset of Bentham’s historical manuscript collections, with promising results. |
|
|
Address |
December 04 – 07, 2022; Hyderabad, India |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICFHR |
|
|
Notes |
DAG; 600.121; 600.162; 602.230; 600.140 |
Approved |
no |
|
|
Call Number |
Admin @ si @ GBS2022 |
Serial |
3733 |
|
|
|
|
|
Author |
Utkarsh Porwal; Alicia Fornes; Faisal Shafait (eds) |
|
|
Title |
Frontiers in Handwriting Recognition. International Conference on Frontiers in Handwriting Recognition. 18th International Conference, ICFHR 2022 |
Type |
Book Whole |
|
Year |
2022 |
Publication |
Frontiers in Handwriting Recognition. |
Abbreviated Journal |
|
|
|
Volume |
13639 |
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
ICFHR 2022, Hyderabad, India, December 4–7, 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer |
Place of Publication |
|
Editor |
Utkarsh Porwal; Alicia Fornes; Faisal Shafait |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-3-031-21648-0 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICFHR |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ PFS2022 |
Serial |
3809 |
|
|
|
|
|
Author |
Asma Bensalah; Alicia Fornes; Cristina Carmona-Duarte; Josep Llados |
|
|
Title |
Easing Automatic Neurorehabilitation via Classification and Smoothness Analysis |
Type |
Conference Article |
|
Year |
2022 |
Publication |
Intertwining Graphonomics with Human Movements. 20th International Conference of the International Graphonomics Society, IGS 2022 |
Abbreviated Journal |
|
|
|
Volume |
13424 |
Issue |
|
Pages |
336-348 |
|
|
Keywords |
Neurorehabilitation; Upper-limb; Movement classification; Movement smoothness; Deep learning; Jerk |
|
|
Abstract |
Assessing the quality of movements of post-stroke patients during the rehabilitation phase is vital, given that there is no standard stroke rehabilitation plan for all patients. In fact, the plan depends largely on the patient’s functional independence and progress across rehabilitation sessions. To tackle this challenge and make neurorehabilitation more agile, we propose an automatic assessment pipeline that first recognises patients’ movements by means of a shallow deep learning architecture and then measures movement quality using jerk and related measures. A particularity of this work is that the dataset used is clinically relevant, since it represents movements inspired by Fugl-Meyer, a common upper-limb clinical assessment scale for stroke patients. We show that it is possible to detect the contrast between healthy subjects’ and patients’ movements in terms of smoothness, and to draw conclusions about the patients’ progress during the rehabilitation sessions that correspond to the clinicians’ findings for each case. |
|
|
Address |
June 7-9, 2022, Las Palmas de Gran Canaria, Spain |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
IGS |
|
|
Notes |
DAG; 600.121; 600.162; 602.230; 600.140 |
Approved |
no |
|
|
Call Number |
Admin @ si @ BFC2022 |
Serial |
3738 |
|
|
|
|
|
Author |
Josep Brugues Pujolras; Lluis Gomez; Dimosthenis Karatzas |
|
|
Title |
A Multilingual Approach to Scene Text Visual Question Answering |
Type |
Conference Article |
|
Year |
2022 |
Publication |
Document Analysis Systems. 15th IAPR International Workshop (DAS2022) |
Abbreviated Journal |
|
|
|
Volume |
13237 |
Issue |
|
Pages |
65-79 |
|
|
Keywords |
Scene text; Visual question answering; Multilingual word embeddings; Vision and language; Deep learning |
|
|
Abstract |
Scene Text Visual Question Answering (ST-VQA) has recently emerged as a hot research topic in Computer Vision. Current ST-VQA models have great potential for many types of applications but cannot perform well on more than one language at a time, due to the lack of multilingual data and the use of monolingual word embeddings for training. In this work, we explore the possibility of obtaining bilingual and multilingual VQA models. To that end, we take an established VQA model that uses monolingual word embeddings as part of its pipeline and replace them with FastText and BPEmb multilingual word embeddings that have been aligned to English. Our experiments demonstrate that it is possible to obtain bilingual and multilingual VQA models with a minimal loss in performance in languages not used during training, as well as a multilingual model trained in multiple languages that matches the performance of the respective monolingual baselines. |
|
|
Address |
La Rochelle, France; May 22–25, 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
DAS |
|
|
Notes |
DAG; 611.004; 600.155; 601.002 |
Approved |
no |
|
|
Call Number |
Admin @ si @ BGK2022b |
Serial |
3695 |
|
|
|
|
|
Author |
Adria Molina; Lluis Gomez; Oriol Ramos Terrades; Josep Llados |
|
|
Title |
A Generic Image Retrieval Method for Date Estimation of Historical Document Collections |
Type |
Conference Article |
|
Year |
2022 |
Publication |
Document Analysis Systems. 15th IAPR International Workshop (DAS2022) |
Abbreviated Journal |
|
|
|
Volume |
13237 |
Issue |
|
Pages |
583–597 |
|
|
Keywords |
Date estimation; Document retrieval; Image retrieval; Ranking loss; Smooth-nDCG |
|
|
Abstract |
Date estimation of historical document images is a challenging problem, and several contributions in the literature lack the ability to generalize from one dataset to others. This paper presents a robust date estimation system based on a retrieval approach that generalizes well across heterogeneous collections. We use a ranking loss function named smooth-nDCG to train a Convolutional Neural Network that learns an ordering of documents for each problem. One of the main uses of the presented approach is as a tool for historical contextual retrieval: scholars could perform comparative analysis of historical images from big datasets in terms of the period in which they were produced. We provide experimental evaluation on different types of documents from real datasets of manuscript and newspaper images. |
|
|
Address |
La Rochelle, France; May 22–25, 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
DAS |
|
|
Notes |
DAG; 600.140; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ MGR2022 |
Serial |
3694 |
|
|
|
|
|
Author |
Josep Llados; Daniel Lopresti; Seiichi Uchida (eds) |
|
|
Title |
16th International Conference, 2021, Proceedings, Part I |
Type |
Book Whole |
|
Year |
2021 |
Publication |
Document Analysis and Recognition – ICDAR 2021 |
Abbreviated Journal |
|
|
|
Volume |
12821 |
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
This four-volume set of LNCS 12821, LNCS 12822, LNCS 12823 and LNCS 12824, constitutes the refereed proceedings of the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, held in Lausanne, Switzerland in September 2021. The 182 full papers were carefully reviewed and selected from 340 submissions, and are presented with 13 competition reports.
The papers are organized into the following topical sections: historical document analysis, document analysis systems, handwriting recognition, scene text detection and recognition, document image processing, natural language processing (NLP) for document understanding, and graphics, diagram and math recognition. |
|
|
Address |
Lausanne, Switzerland, September 5-10, 2021 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Cham |
Place of Publication |
|
Editor |
Josep Llados; Daniel Lopresti; Seiichi Uchida |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-3-030-86548-1 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ |
Serial |
3725 |
|
|
|
|
|
Author |
Josep Llados; Daniel Lopresti; Seiichi Uchida (eds) |
|
|
Title |
16th International Conference, 2021, Proceedings, Part IV |
Type |
Book Whole |
|
Year |
2021 |
Publication |
Document Analysis and Recognition – ICDAR 2021 |
Abbreviated Journal |
|
|
|
Volume |
12824 |
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
This four-volume set of LNCS 12821, LNCS 12822, LNCS 12823 and LNCS 12824, constitutes the refereed proceedings of the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, held in Lausanne, Switzerland in September 2021. The 182 full papers were carefully reviewed and selected from 340 submissions, and are presented with 13 competition reports.
The papers are organized into the following topical sections: document analysis for literature search, document summarization and translation, multimedia document analysis, mobile text recognition, document analysis for social good, indexing and retrieval of documents, physical and logical layout analysis, recognition of tables and formulas, and natural language processing (NLP) for document understanding. |
|
|
Address |
Lausanne, Switzerland, September 5-10, 2021 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Cham |
Place of Publication |
|
Editor |
Josep Llados; Daniel Lopresti; Seiichi Uchida |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-3-030-86336-4 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ |
Serial |
3728 |
|
|
|
|
|
Author |
Sanket Biswas; Pau Riba; Josep Llados; Umapada Pal |
|
|
Title |
DocSynth: A Layout Guided Approach for Controllable Document Image Synthesis |
Type |
Conference Article |
|
Year |
2021 |
Publication |
16th International Conference on Document Analysis and Recognition |
Abbreviated Journal |
|
|
|
Volume |
12823 |
Issue |
|
Pages |
555–568 |
|
|
Keywords |
|
|
|
Abstract |
Despite significant progress on current state-of-the-art image generation models, synthesis of document images containing multiple and complex object layouts remains a challenging task. This paper presents a novel approach, called DocSynth, to automatically synthesize document images based on a given layout. Given a spatial layout (bounding boxes with object categories) provided as a reference by the user, our proposed DocSynth model learns to generate a set of realistic document images consistent with the defined layout. This framework has also been adapted in this work as a superior baseline model for creating synthetic document image datasets for augmenting real data during training for document layout analysis tasks. Different sets of learning objectives have also been used to improve the model performance. Quantitatively, we compare the generated results of our model with real data using standard evaluation metrics. The results highlight that our model can successfully generate realistic and diverse document images with multiple objects. We also present a comprehensive qualitative analysis summary of the different scopes of synthetic image generation tasks. Lastly, to our knowledge this is the first work of its kind. |
|
|
Address |
Lausanne; Switzerland; September 2021 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG; 600.121; 600.140; 110.312 |
Approved |
no |
|
|
Call Number |
Admin @ si @ BRL2021a |
Serial |
3573 |
|
|
|
|
|
Author |
Josep Llados; Daniel Lopresti; Seiichi Uchida (eds) |
|
|
Title |
16th International Conference, 2021, Proceedings, Part III |
Type |
Book Whole |
|
Year |
2021 |
Publication |
Document Analysis and Recognition – ICDAR 2021 |
Abbreviated Journal |
|
|
|
Volume |
12823 |
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
This four-volume set of LNCS 12821, LNCS 12822, LNCS 12823 and LNCS 12824, constitutes the refereed proceedings of the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, held in Lausanne, Switzerland in September 2021. The 182 full papers were carefully reviewed and selected from 340 submissions, and are presented with 13 competition reports.
The papers are organized into the following topical sections: document analysis for literature search, document summarization and translation, multimedia document analysis, mobile text recognition, document analysis for social good, indexing and retrieval of documents, physical and logical layout analysis, recognition of tables and formulas, and natural language processing (NLP) for document understanding. |
|
|
Address |
Lausanne, Switzerland, September 5-10, 2021 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Cham |
Place of Publication |
|
Editor |
Josep Llados; Daniel Lopresti; Seiichi Uchida |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-3-030-86333-3 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ |
Serial |
3727 |
|