Author Arnau Baro; Pau Riba; Alicia Fornes
  Title Musigraph: Optical Music Recognition Through Object Detection and Graph Neural Network Type Conference Article
  Year 2022 Publication Frontiers in Handwriting Recognition. International Conference on Frontiers in Handwriting Recognition (ICFHR2022) Abbreviated Journal  
  Volume 13639 Issue Pages 171-184  
  Keywords Object detection; Optical music recognition; Graph neural network  
  Abstract During the last decades, the performance of optical music recognition has steadily improved. However, despite the 2-dimensional nature of music notation (e.g. notes have rhythm and pitch), most works treat musical scores as a sequence of symbols in one dimension, which makes their recognition still a challenge. Thus, in this work we explore the use of graph neural networks for musical score recognition: first, because graphs are suited for n-dimensional representations, and second, because the combination of graphs with deep learning has shown great performance in similar applications. Our methodology is as follows. First, we detect each isolated/atomic symbol (those that cannot be decomposed into further graphical primitives) and the primitives that form a musical symbol. Then, we build the graph, taking the notehead as the root node and, as leaves, those primitives or symbols that modify the note’s rhythm (stem, beam, flag) or pitch (flat, sharp, natural). Finally, the graph is translated into a human-readable character sequence for a final transcription and evaluation. Our method has been tested on more than five thousand measures, showing promising results.  
  Address December 04 – 07, 2022; Hyderabad, India  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICFHR  
  Notes DAG; 600.162; 600.140; 602.230 Approved no  
  Call Number Admin @ si @ BRF2022b Serial 3740  
 

 
Author Utkarsh Porwal; Alicia Fornes; Faisal Shafait (eds)
  Title Frontiers in Handwriting Recognition. International Conference on Frontiers in Handwriting Recognition. 18th International Conference, ICFHR 2022 Type Book Whole
  Year 2022 Publication Frontiers in Handwriting Recognition. Abbreviated Journal  
  Volume 13639 Issue Pages  
  Keywords  
  Abstract  
  Address ICFHR 2022, Hyderabad, India, December 4–7, 2022  
  Corporate Author Thesis  
  Publisher Springer Place of Publication Editor Utkarsh Porwal; Alicia Fornes; Faisal Shafait  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-3-031-21648-0 Medium  
  Area Expedition Conference ICFHR  
  Notes DAG Approved no  
  Call Number Admin @ si @ PFS2022 Serial 3809  
 

 
Author Emanuele Vivoli; Ali Furkan Biten; Andres Mafla; Dimosthenis Karatzas; Lluis Gomez
  Title MUST-VQA: MUltilingual Scene-text VQA Type Conference Article
  Year 2022 Publication Proceedings European Conference on Computer Vision Workshops Abbreviated Journal  
  Volume 13804 Issue Pages 345–358  
  Keywords Visual question answering; Scene text; Translation robustness; Multilingual models; Zero-shot transfer; Power of language models  
  Abstract In this paper, we present a framework for Multilingual Scene Text Visual Question Answering that deals with new languages in a zero-shot fashion. Specifically, we consider the task of Scene Text Visual Question Answering (STVQA) in which the question can be asked in different languages and is not necessarily aligned to the scene text language. Thus, we first introduce a natural step towards a more generalized version of STVQA: MUST-VQA. Accounting for this, we discuss two evaluation scenarios in the constrained setting, namely IID and zero-shot, and we demonstrate that the models can perform on a par in a zero-shot setting. We further provide extensive experimentation and show the effectiveness of adapting multilingual language models to STVQA tasks.  
  Address Tel-Aviv; Israel; October 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ECCVW  
  Notes DAG; 302.105; 600.155; 611.002 Approved no  
  Call Number Admin @ si @ VBM2022 Serial 3770  
 

 
Author Sergi Garcia Bordils; Andres Mafla; Ali Furkan Biten; Oren Nuriel; Aviad Aberdam; Shai Mazor; Ron Litman; Dimosthenis Karatzas
  Title Out-of-Vocabulary Challenge Report Type Conference Article
  Year 2022 Publication Proceedings European Conference on Computer Vision Workshops Abbreviated Journal  
  Volume 13804 Issue Pages 359–375  
  Keywords  
  Abstract This paper presents the final results of the Out-Of-Vocabulary 2022 (OOV) challenge. The OOV contest introduces an important aspect that is not commonly studied by Optical Character Recognition (OCR) models, namely, the recognition of scene text instances unseen at training time. The competition compiles a collection of public scene text datasets comprising 326,385 images with 4,864,405 scene text instances, thus covering a wide range of data distributions. A new and independent validation and test set is formed with scene text instances that are out of vocabulary at training time. The competition was structured in two tasks, end-to-end and cropped scene text recognition respectively. A thorough analysis of results from baselines and different participants is presented. Interestingly, current state-of-the-art models show a significant performance gap under the newly studied setting. We conclude that the OOV dataset proposed in this challenge will be an essential area to be explored in order to develop scene text models that achieve more robust and generalized predictions.  
  Address Tel-Aviv; Israel; October 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ECCVW  
  Notes DAG; 600.155; 302.105; 611.002 Approved no  
  Call Number Admin @ si @ GMB2022 Serial 3771  
 

 
Author Andrea Gemelli; Sanket Biswas; Enrico Civitelli; Josep Llados; Simone Marinai
  Title Doc2Graph: A Task Agnostic Document Understanding Framework Based on Graph Neural Networks Type Conference Article
  Year 2022 Publication 17th European Conference on Computer Vision Workshops Abbreviated Journal  
  Volume 13804 Issue Pages 329–344  
  Keywords  
  Abstract Geometric Deep Learning has recently attracted significant interest in a wide range of machine learning fields, including document analysis. The application of Graph Neural Networks (GNNs) has become crucial in various document-related tasks since they can unravel important structural patterns, fundamental in key information extraction processes. Previous works in the literature propose task-driven models and do not take into account the full power of graphs. We propose Doc2Graph, a task-agnostic document understanding framework based on a GNN model, to solve different tasks given different types of documents. We evaluated our approach on two challenging datasets for key information extraction in form understanding, invoice layout analysis and table detection.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-3-031-25068-2 Medium  
  Area Expedition Conference ECCV-TiE  
  Notes DAG; 600.162; 600.140; 110.312 Approved no  
  Call Number Admin @ si @ GBC2022 Serial 3795  