toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
  Records Links
Author (up) Ariel Amato; Angel Sappa; Alicia Fornes; Felipe Lumbreras; Josep Llados edit   pdf
doi  isbn
openurl 
  Title Divide and Conquer: Atomizing and Parallelizing A Task in A Mobile Crowdsourcing Platform Type Conference Article
  Year 2013 Publication 2nd International ACM Workshop on Crowdsourcing for Multimedia Abbreviated Journal  
  Volume Issue Pages 21-22  
  Keywords  
  Abstract In this paper we present some conclusions about the advantages of having an efficient task formulation when a crowdsourcing platform is used. In particular we show how the task atomization and distribution can help to obtain results in an efficient way. Our proposal is based on a recursive splitting of the original task into a set of smaller and simpler tasks. As a result both more accurate and faster solutions are obtained. Our evaluation is performed on a set of ancient documents that need to be digitized.  
  Address Barcelona; October 2013  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-1-4503-2396-3 Medium  
  Area Expedition Conference CrowdMM  
  Notes ADAS; ISE; DAG; 600.054; 600.055; 600.045; 600.061; 602.006 Approved no  
  Call Number Admin @ si @ SLA2013 Serial 2335  
Permanent link to this record
 

 
Author (up) Arka Ujjal Dey; Suman Ghosh; Ernest Valveny edit   pdf
openurl 
  Title Don't only Feel Read: Using Scene text to understand advertisements Type Conference Article
  Year 2018 Publication IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract We propose a framework for automated classification of Advertisement Images, using not just Visual features but also Textual cues extracted from embedded text. Our approach takes inspiration from the assumption that Ad images contain meaningful textual content, that can provide discriminative semantic interpretetion, and can thus aid in classifcation tasks. To this end, we develop a framework using off-the-shelf components, and demonstrate the effectiveness of Textual cues in semantic Classfication tasks.  
  Address Salt Lake City; Utah; USA; June 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CVPRW  
  Notes DAG; 600.121; 600.129 Approved no  
  Call Number Admin @ si @ DGV2018 Serial 3551  
Permanent link to this record
 

 
Author (up) Arka Ujjal Dey; Suman Ghosh; Ernest Valveny; Gaurav Harit edit   pdf
url  doi
openurl 
  Title Beyond Visual Semantics: Exploring the Role of Scene Text in Image Understanding Type Journal Article
  Year 2021 Publication Pattern Recognition Letters Abbreviated Journal PRL  
  Volume 149 Issue Pages 164-171  
  Keywords  
  Abstract Images with visual and scene text content are ubiquitous in everyday life. However, current image interpretation systems are mostly limited to using only the visual features, neglecting to leverage the scene text content. In this paper, we propose to jointly use scene text and visual channels for robust semantic interpretation of images. We do not only extract and encode visual and scene text cues, but also model their interplay to generate a contextual joint embedding with richer semantics. The contextual embedding thus generated is applied to retrieval and classification tasks on multimedia images, with scene text content, to demonstrate its effectiveness. In the retrieval framework, we augment our learned text-visual semantic representation with scene text cues, to mitigate vocabulary misses that may have occurred during the semantic embedding. To deal with irrelevant or erroneous recognition of scene text, we also apply query-based attention to our text channel. We show how the multi-channel approach, involving visual semantics and scene text, improves upon state of the art.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ DGV2021 Serial 3364  
Permanent link to this record
 

 
Author (up) Arnau Baro edit  isbn
openurl 
  Title Reading Music Systems: From Deep Optical Music Recognition to Contextual Methods Type Book Whole
  Year 2022 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract The transcription of sheet music into some machine-readable format can be carried out manually. However, the complexity of music notation inevitably leads to burdensome software for music score editing, which makes the whole process
very time-consuming and prone to errors. Consequently, automatic transcription
systems for musical documents represent interesting tools.
Document analysis is the subject that deals with the extraction and processing
of documents through image and pattern recognition. It is a branch of computer
vision. Taking music scores as source, the field devoted to address this task is
known as Optical Music Recognition (OMR). Typically, an OMR system takes an
image of a music score and automatically extracts its content into some symbolic
structure such as MEI or MusicXML.
In this dissertation, we have investigated different methods for recognizing a
single staff section (e.g. scores for violin, flute, etc.), much in the same way as most text recognition research focuses on recognizing words appearing in a given line image. These methods are based in two different methodologies. On the one hand, we present two methods based on Recurrent Neural Networks, in particular, the
Long Short-Term Memory Neural Network. On the other hand, a method based on Sequence to Sequence models is detailed.
Music context is needed to improve the OMR results, just like language models
and dictionaries help in handwriting recognition. For example, syntactical rules
and grammars could be easily defined to cope with the ambiguities in the rhythm.
In music theory, for example, the time signature defines the amount of beats per
bar unit. Thus, in the second part of this dissertation, different methodologies
have been investigated to improve the OMR recognition. We have explored three
different methods: (a) a graphic tree-structure representation, Dendrograms, that
joins, at each level, its primitives following a set of rules, (b) the incorporation of Language Models to model the probability of a sequence of tokens, and (c) graph neural networks to analyze the music scores to avoid meaningless relationships between music primitives.
Finally, to train all these methodologies, and given the method-specificity of
the datasets in the literature, we have created four different music datasets. Two of them are synthetic with a modern or old handwritten appearance, whereas the
other two are real handwritten scores, being one of them modern and the other
old.
 
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher IMPRIMA Place of Publication Editor Alicia Fornes  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-124793-8-6 Medium  
  Area Expedition Conference  
  Notes DAG; Approved no  
  Call Number Admin @ si @ Bar2022 Serial 3754  
Permanent link to this record
 

 
Author (up) Arnau Baro; Alicia Fornes; Carles Badal edit   pdf
openurl 
  Title Handwritten Historical Music Recognition by Sequence-to-Sequence with Attention Mechanism Type Conference Article
  Year 2020 Publication 17th International Conference on Frontiers in Handwriting Recognition Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Despite decades of research in Optical Music Recognition (OMR), the recognition of old handwritten music scores remains a challenge because of the variabilities in the handwriting styles, paper degradation, lack of standard notation, etc. Therefore, the research in OMR systems adapted to the particularities of old manuscripts is crucial to accelerate the conversion of music scores existing in archives into digital libraries, fostering the dissemination and preservation of our music heritage. In this paper we explore the adaptation of sequence-to-sequence models with attention mechanism (used in translation and handwritten text recognition) and the generation of specific synthetic data for recognizing old music scores. The experimental validation demonstrates that our approach is promising, especially when compared with long short-term memory neural networks.  
  Address Virtual ICFHR; September 2020  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICFHR  
  Notes DAG; 600.140; 600.121 Approved no  
  Call Number Admin @ si @ BFB2020 Serial 3448  
Permanent link to this record
 

 
Author (up) Arnau Baro; Carles Badal; Pau Torras; Alicia Fornes edit   pdf
url  openurl
  Title Handwritten Historical Music Recognition through Sequence-to-Sequence with Attention Mechanism Type Conference Article
  Year 2022 Publication 3rd International Workshop on Reading Music Systems (WoRMS2021) Abbreviated Journal  
  Volume Issue Pages 55-59  
  Keywords Optical Music Recognition; Digits; Image Classification  
  Abstract Despite decades of research in Optical Music Recognition (OMR), the recognition of old handwritten music scores remains a challenge because of the variabilities in the handwriting styles, paper degradation, lack of standard notation, etc. Therefore, the research in OMR systems adapted to the particularities of old manuscripts is crucial to accelerate the conversion of music scores existing in archives into digital libraries, fostering the dissemination and preservation of our music heritage. In this paper we explore the adaptation of sequence-to-sequence models with attention mechanism (used in translation and handwritten text recognition) and the generation of specific synthetic data for recognizing old music scores. The experimental validation demonstrates that our approach is promising, especially when compared with long short-term memory neural networks.  
  Address July 23, 2021, Alicante (Spain)  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WoRMS  
  Notes DAG; 600.121; 600.162; 602.230; 600.140 Approved no  
  Call Number Admin @ si @ BBT2022 Serial 3734  
Permanent link to this record
 

 
Author (up) Arnau Baro; Jialuo Chen; Alicia Fornes; Beata Megyesi edit   pdf
doi  openurl
  Title Towards a generic unsupervised method for transcription of encoded manuscripts Type Conference Article
  Year 2019 Publication 3rd International Conference on Digital Access to Textual Cultural Heritage Abbreviated Journal  
  Volume Issue Pages 73-78  
  Keywords A. Baró, J. Chen, A. Fornés, B. Megyesi.  
  Abstract Historical ciphers, a special type of manuscripts, contain encrypted information, important for the interpretation of our history. The first step towards decipherment is to transcribe the images, either manually or by automatic image processing techniques. Despite the improvements in handwritten text recognition (HTR) thanks to deep learning methodologies, the need of labelled data to train is an important limitation. Given that ciphers often use symbol sets across various alphabets and unique symbols without any transcription scheme available, these supervised HTR techniques are not suitable to transcribe ciphers. In this paper we propose an un-supervised method for transcribing encrypted manuscripts based on clustering and label propagation, which has been successfully applied to community detection in networks. We analyze the performance on ciphers with various symbol sets, and discuss the advantages and drawbacks compared to supervised HTR methods.  
  Address Brussels; May 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference DATeCH  
  Notes DAG; 600.097; 600.140; 600.121 Approved no  
  Call Number Admin @ si @ BCF2019 Serial 3276  
Permanent link to this record
 

 
Author (up) Arnau Baro; Pau Riba; Alicia Fornes edit   pdf
doi  openurl
  Title Towards the recognition of compound music notes in handwritten music scores Type Conference Article
  Year 2016 Publication 15th international conference on Frontiers in Handwriting Recognition Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract The recognition of handwritten music scores still remains an open problem. The existing approaches can only deal with very simple handwritten scores mainly because of the variability in the handwriting style and the variability in the composition of groups of music notes (i.e. compound music notes). In this work we focus on this second problem and propose a method based on perceptual grouping for the recognition of compound music notes. Our method has been tested using several handwritten music scores of the CVC-MUSCIMA database and compared with a commercial Optical Music Recognition (OMR) software. Given that our method is learning-free, the obtained results are promising.  
  Address Shenzhen; China; October 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 2167-6445 ISBN Medium  
  Area Expedition Conference ICFHR  
  Notes DAG; 600.097 Approved no  
  Call Number Admin @ si @ BRF2016 Serial 2903  
Permanent link to this record
 

 
Author (up) Arnau Baro; Pau Riba; Alicia Fornes edit   pdf
openurl 
  Title A Starting Point for Handwritten Music Recognition Type Conference Article
  Year 2018 Publication 1st International Workshop on Reading Music Systems Abbreviated Journal  
  Volume Issue Pages 5-6  
  Keywords Optical Music Recognition; Long Short-Term Memory; Convolutional Neural Networks; MUSCIMA++; CVCMUSCIMA  
  Abstract In the last years, the interest in Optical Music Recognition (OMR) has reawakened, especially since the appearance of deep learning. However, there are very few works addressing handwritten scores. In this work we describe a full OMR pipeline for handwritten music scores by using Convolutional and Recurrent Neural Networks that could serve as a baseline for the research community.  
  Address Paris; France; September 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WORMS  
  Notes DAG; 600.097; 601.302; 601.330; 600.121 Approved no  
  Call Number Admin @ si @ BRF2018 Serial 3223  
Permanent link to this record
 

 
Author (up) Arnau Baro; Pau Riba; Alicia Fornes edit  doi
openurl 
  Title Musigraph: Optical Music Recognition Through Object Detection and Graph Neural Network Type Conference Article
  Year 2022 Publication Frontiers in Handwriting Recognition. International Conference on Frontiers in Handwriting Recognition (ICFHR2022) Abbreviated Journal  
  Volume 13639 Issue Pages 171-184  
  Keywords Object detection; Optical music recognition; Graph neural network  
  Abstract During the last decades, the performance of optical music recognition has been increasingly improving. However, and despite the 2-dimensional nature of music notation (e.g. notes have rhythm and pitch), most works treat musical scores as a sequence of symbols in one dimension, which make their recognition still a challenge. Thus, in this work we explore the use of graph neural networks for musical score recognition. First, because graphs are suited for n-dimensional representations, and second, because the combination of graphs with deep learning has shown a great performance in similar applications. Our methodology consists of: First, we will detect each isolated/atomic symbols (those that can not be decomposed in more graphical primitives) and the primitives that form a musical symbol. Then, we will build the graph taking as root node the notehead and as leaves those primitives or symbols that modify the note’s rhythm (stem, beam, flag) or pitch (flat, sharp, natural). Finally, the graph is translated into a human-readable character sequence for a final transcription and evaluation. Our method has been tested on more than five thousand measures, showing promising results.  
  Address December 04 – 07, 2022; Hyderabad, India  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICFHR  
  Notes DAG; 600.162; 600.140; 602.230 Approved no  
  Call Number Admin @ si @ BRF2022b Serial 3740  
Permanent link to this record
Select All    Deselect All
 |   | 
Details

Save Citations:
Export Records: