Records |
Author |
Arnau Baro; Pau Riba; Alicia Fornes |
Title |
Musigraph: Optical Music Recognition Through Object Detection and Graph Neural Network |
Type |
Conference Article |
Year |
2022 |
Publication |
Frontiers in Handwriting Recognition. International Conference on Frontiers in Handwriting Recognition (ICFHR2022) |
Abbreviated Journal |
|
Volume |
13639 |
Issue |
|
Pages |
171-184 |
Keywords |
Object detection; Optical music recognition; Graph neural network |
Abstract |
Over the last decades, the performance of optical music recognition has steadily improved. However, despite the two-dimensional nature of music notation (e.g. notes have rhythm and pitch), most works treat musical scores as a one-dimensional sequence of symbols, which makes their recognition still a challenge. Thus, in this work we explore the use of graph neural networks for musical score recognition: first, because graphs are suited for n-dimensional representations, and second, because the combination of graphs with deep learning has shown great performance in similar applications. Our methodology is as follows. First, we detect each isolated/atomic symbol (those that cannot be decomposed into further graphical primitives) and the primitives that form a musical symbol. Then, we build the graph, taking the notehead as the root node and, as leaves, those primitives or symbols that modify the note’s rhythm (stem, beam, flag) or pitch (flat, sharp, natural). Finally, the graph is translated into a human-readable character sequence for the final transcription and evaluation. Our method has been tested on more than five thousand measures, showing promising results. |
Address |
December 04 – 07, 2022; Hyderabad, India |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICFHR |
Notes |
DAG; 600.162; 600.140; 602.230 |
Approved |
no |
Call Number |
Admin @ si @ BRF2022b |
Serial |
3740 |
Permanent link to this record |
|
|
|
Author |
Utkarsh Porwal; Alicia Fornes; Faisal Shafait (eds) |
Title |
Frontiers in Handwriting Recognition. 18th International Conference, ICFHR 2022 |
Type |
Book Whole |
Year |
2022 |
Publication |
Frontiers in Handwriting Recognition. |
Abbreviated Journal |
|
Volume |
13639 |
Issue |
|
Pages |
|
Keywords |
|
Abstract |
|
Address |
ICFHR 2022, Hyderabad, India, December 4–7, 2022 |
Corporate Author |
|
Thesis |
|
Publisher |
Springer |
Place of Publication |
|
Editor |
Utkarsh Porwal; Alicia Fornes; Faisal Shafait |
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
978-3-031-21648-0 |
Medium |
|
Area |
|
Expedition |
|
Conference |
ICFHR |
Notes |
DAG |
Approved |
no |
Call Number |
Admin @ si @ PFS2022 |
Serial |
3809 |
Permanent link to this record |
|
|
|
Author |
Emanuele Vivoli; Ali Furkan Biten; Andres Mafla; Dimosthenis Karatzas; Lluis Gomez |
Title |
MUST-VQA: MUltilingual Scene-text VQA |
Type |
Conference Article |
Year |
2022 |
Publication |
Proceedings European Conference on Computer Vision Workshops |
Abbreviated Journal |
|
Volume |
13804 |
Issue |
|
Pages |
345–358 |
Keywords |
Visual question answering; Scene text; Translation robustness; Multilingual models; Zero-shot transfer; Power of language models |
Abstract |
In this paper, we present a framework for Multilingual Scene Text Visual Question Answering that deals with new languages in a zero-shot fashion. Specifically, we consider the task of Scene Text Visual Question Answering (STVQA), in which the question can be asked in different languages and is not necessarily aligned with the scene text language. Thus, we first introduce a natural step towards a more generalized version of STVQA: MUST-VQA. Accounting for this, we discuss two evaluation scenarios in the constrained setting, namely IID and zero-shot, and we demonstrate that the models can perform on par in a zero-shot setting. We further provide extensive experimentation and show the effectiveness of adapting multilingual language models to STVQA tasks. |
Address |
Tel-Aviv; Israel; October 2022 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ECCVW |
Notes |
DAG; 302.105; 600.155; 611.002 |
Approved |
no |
Call Number |
Admin @ si @ VBM2022 |
Serial |
3770 |
Permanent link to this record |
|
|
|
Author |
Sergi Garcia Bordils; Andres Mafla; Ali Furkan Biten; Oren Nuriel; Aviad Aberdam; Shai Mazor; Ron Litman; Dimosthenis Karatzas |
Title |
Out-of-Vocabulary Challenge Report |
Type |
Conference Article |
Year |
2022 |
Publication |
Proceedings European Conference on Computer Vision Workshops |
Abbreviated Journal |
|
Volume |
13804 |
Issue |
|
Pages |
359–375 |
Keywords |
|
Abstract |
This paper presents the final results of the Out-Of-Vocabulary 2022 (OOV) challenge. The OOV contest introduces an important aspect that is not commonly studied by Optical Character Recognition (OCR) models, namely, the recognition of scene text instances unseen at training time. The competition compiles a collection of public scene text datasets comprising 326,385 images with 4,864,405 scene text instances, thus covering a wide range of data distributions. A new and independent validation and test set is formed with scene text instances that are out of vocabulary at training time. The competition was structured in two tasks: end-to-end and cropped scene text recognition, respectively. A thorough analysis of results from baselines and different participants is presented. Interestingly, current state-of-the-art models show a significant performance gap under the newly studied setting. We conclude that the OOV dataset proposed in this challenge will be an essential area to explore in order to develop scene text models that achieve more robust and generalized predictions. |
Address |
Tel-Aviv; Israel; October 2022 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ECCVW |
Notes |
DAG; 600.155; 302.105; 611.002 |
Approved |
no |
Call Number |
Admin @ si @ GMB2022 |
Serial |
3771 |
Permanent link to this record |
|
|
|
Author |
Andrea Gemelli; Sanket Biswas; Enrico Civitelli; Josep Llados; Simone Marinai |
Title |
Doc2Graph: A Task Agnostic Document Understanding Framework Based on Graph Neural Networks |
Type |
Conference Article |
Year |
2022 |
Publication |
17th European Conference on Computer Vision Workshops |
Abbreviated Journal |
|
Volume |
13804 |
Issue |
|
Pages |
329–344 |
Keywords |
|
Abstract |
Geometric Deep Learning has recently attracted significant interest in a wide range of machine learning fields, including document analysis. The application of Graph Neural Networks (GNNs) has become crucial in various document-related tasks, since they can unravel important structural patterns that are fundamental to key information extraction processes. Previous works in the literature propose task-driven models and do not take into account the full power of graphs. We propose Doc2Graph, a task-agnostic document understanding framework based on a GNN model, to solve different tasks given different types of documents. We evaluate our approach on two challenging datasets for key information extraction in form understanding, invoice layout analysis and table detection. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
978-3-031-25068-2 |
Medium |
|
Area |
|
Expedition |
|
Conference |
ECCV-TiE |
Notes |
DAG; 600.162; 600.140; 110.312 |
Approved |
no |
Call Number |
Admin @ si @ GBC2022 |
Serial |
3795 |
Permanent link to this record |