|   | 
Details
   web
Records
Author Arnau Baro; Pau Riba; Alicia Fornes
Title Musigraph: Optical Music Recognition Through Object Detection and Graph Neural Network Type Conference Article
Year 2022 Publication Frontiers in Handwriting Recognition. International Conference on Frontiers in Handwriting Recognition (ICFHR2022) Abbreviated Journal
Volume (up) 13639 Issue Pages 171-184
Keywords Object detection; Optical music recognition; Graph neural network
Abstract During the last decades, the performance of optical music recognition has been increasingly improving. However, and despite the 2-dimensional nature of music notation (e.g. notes have rhythm and pitch), most works treat musical scores as a sequence of symbols in one dimension, which make their recognition still a challenge. Thus, in this work we explore the use of graph neural networks for musical score recognition. First, because graphs are suited for n-dimensional representations, and second, because the combination of graphs with deep learning has shown a great performance in similar applications. Our methodology consists of: First, we will detect each isolated/atomic symbols (those that can not be decomposed in more graphical primitives) and the primitives that form a musical symbol. Then, we will build the graph taking as root node the notehead and as leaves those primitives or symbols that modify the note’s rhythm (stem, beam, flag) or pitch (flat, sharp, natural). Finally, the graph is translated into a human-readable character sequence for a final transcription and evaluation. Our method has been tested on more than five thousand measures, showing promising results.
Address December 04 – 07, 2022; Hyderabad, India
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICFHR
Notes DAG; 600.162; 600.140; 602.230 Approved no
Call Number Admin @ si @ BRF2022b Serial 3740
Permanent link to this record
 

 
Author Utkarsh Porwal; Alicia Fornes; Faisal Shafait (eds)
Title Frontiers in Handwriting Recognition. International Conference on Frontiers in Handwriting Recognition. 18th International Conference, ICFHR 2022 Type Book Whole
Year 2022 Publication Frontiers in Handwriting Recognition. Abbreviated Journal
Volume (up) 13639 Issue Pages
Keywords
Abstract
Address ICFHR 2022, Hyderabad, India, December 4–7, 2022
Corporate Author Thesis
Publisher Springer Place of Publication Editor Utkarsh Porwal; Alicia Fornes; Faisal Shafait
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN 978-3-031-21648-0 Medium
Area Expedition Conference ICFHR
Notes DAG Approved no
Call Number Admin @ si @ PFS2022 Serial 3809
Permanent link to this record
 

 
Author Emanuele Vivoli; Ali Furkan Biten; Andres Mafla; Dimosthenis Karatzas; Lluis Gomez
Title MUST-VQA: MUltilingual Scene-text VQA Type Conference Article
Year 2022 Publication Proceedings European Conference on Computer Vision Workshops Abbreviated Journal
Volume (up) 13804 Issue Pages 345–358
Keywords Visual question answering; Scene text; Translation robustness; Multilingual models; Zero-shot transfer; Power of language models
Abstract In this paper, we present a framework for Multilingual Scene Text Visual Question Answering that deals with new languages in a zero-shot fashion. Specifically, we consider the task of Scene Text Visual Question Answering (STVQA) in which the question can be asked in different languages and it is not necessarily aligned to the scene text language. Thus, we first introduce a natural step towards a more generalized version of STVQA: MUST-VQA. Accounting for this, we discuss two evaluation scenarios in the constrained setting, namely IID and zero-shot and we demonstrate that the models can perform on a par on a zero-shot setting. We further provide extensive experimentation and show the effectiveness of adapting multilingual language models into STVQA tasks.
Address Tel-Aviv; Israel; October 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ECCVW
Notes DAG; 302.105; 600.155; 611.002 Approved no
Call Number Admin @ si @ VBM2022 Serial 3770
Permanent link to this record
 

 
Author Sergi Garcia Bordils; Andres Mafla; Ali Furkan Biten; Oren Nuriel; Aviad Aberdam; Shai Mazor; Ron Litman; Dimosthenis Karatzas
Title Out-of-Vocabulary Challenge Report Type Conference Article
Year 2022 Publication Proceedings European Conference on Computer Vision Workshops Abbreviated Journal
Volume (up) 13804 Issue Pages 359–375
Keywords
Abstract This paper presents final results of the Out-Of-Vocabulary 2022 (OOV) challenge. The OOV contest introduces an important aspect that is not commonly studied by Optical Character Recognition (OCR) models, namely, the recognition of unseen scene text instances at training time. The competition compiles a collection of public scene text datasets comprising of 326,385 images with 4,864,405 scene text instances, thus covering a wide range of data distributions. A new and independent validation and test set is formed with scene text instances that are out of vocabulary at training time. The competition was structured in two tasks, end-to-end and cropped scene text recognition respectively. A thorough analysis of results from baselines and different participants is presented. Interestingly, current state-of-the-art models show a significant performance gap under the newly studied setting. We conclude that the OOV dataset proposed in this challenge will be an essential area to be explored in order to develop scene text models that achieve more robust and generalized predictions.
Address Tel-Aviv; Israel; October 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ECCVW
Notes DAG; 600.155; 302.105; 611.002 Approved no
Call Number Admin @ si @ GMB2022 Serial 3771
Permanent link to this record
 

 
Author Andrea Gemelli; Sanket Biswas; Enrico Civitelli; Josep Llados; Simone Marinai
Title Doc2Graph: A Task Agnostic Document Understanding Framework Based on Graph Neural Networks Type Conference Article
Year 2022 Publication 17th European Conference on Computer Vision Workshops Abbreviated Journal
Volume (up) 13804 Issue Pages 329–344
Keywords
Abstract Geometric Deep Learning has recently attracted significant interest in a wide range of machine learning fields, including document analysis. The application of Graph Neural Networks (GNNs) has become crucial in various document-related tasks since they can unravel important structural patterns, fundamental in key information extraction processes. Previous works in the literature propose task-driven models and do not take into account the full power of graphs. We propose Doc2Graph, a task-agnostic document understanding framework based on a GNN model, to solve different tasks given different types of documents. We evaluated our approach on two challenging datasets for key information extraction in form understanding, invoice layout analysis and table detection.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN 978-3-031-25068-2 Medium
Area Expedition Conference ECCV-TiE
Notes DAG; 600.162; 600.140; 110.312 Approved no
Call Number Admin @ si @ GBC2022 Serial 3795
Permanent link to this record