toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
  Records Links
Author Pau Torras; Arnau Baro; Lei Kang; Alicia Fornes edit  openurl
  Title On the Integration of Language Models into Sequence to Sequence Architectures for Handwritten Music Recognition Type Conference Article
  Year 2021 Publication International Society for Music Information Retrieval Conference Abbreviated Journal  
  Volume Issue Pages 690-696  
  Keywords  
  Abstract Despite the latest advances in Deep Learning, the recognition of handwritten music scores is still a challenging endeavour. Even though the recent Sequence to Sequence(Seq2Seq) architectures have demonstrated its capacity to reliably recognise handwritten text, their performance is still far from satisfactory when applied to historical handwritten scores. Indeed, the ambiguous nature of handwriting, the non-standard musical notation employed by composers of the time and the decaying state of old paper make these scores remarkably difficult to read, sometimes even by trained humans. Thus, in this work we explore the incorporation of language models into a Seq2Seq-based architecture to try to improve transcriptions where the aforementioned unclear writing produces statistically unsound mistakes, which as far as we know, has never been attempted for this field of research on this architecture. After studying various Language Model integration techniques, the experimental evaluation on historical handwritten music scores shows a significant improvement over the state of the art, showing that this is a promising research direction for dealing with such difficult manuscripts.  
  Address (down) Virtual; November 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ISMIR  
  Notes DAG; 600.140; 600.121 Approved no  
  Call Number Admin @ si @ TBK2021 Serial 3616  
Permanent link to this record
 

 
Author Ruben Tito; Minesh Mathew; C.V. Jawahar; Ernest Valveny; Dimosthenis Karatzas edit   pdf
url  openurl
  Title ICDAR 2021 Competition on Document Visual Question Answering Type Conference Article
  Year 2021 Publication 16th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume Issue Pages 635-649  
  Keywords  
  Abstract In this report we present results of the ICDAR 2021 edition of the Document Visual Question Challenges. This edition complements the previous tasks on Single Document VQA and Document Collection VQA with a newly introduced on Infographics VQA. Infographics VQA is based on a new dataset of more than 5, 000 infographics images and 30, 000 question-answer pairs. The winner methods have scored 0.6120 ANLS in Infographics VQA task, 0.7743 ANLSL in Document Collection VQA task and 0.8705 ANLS in Single Document VQA. We present a summary of the datasets used for each task, description of each of the submitted methods and the results and analysis of their performance. A summary of the progress made on Single Document VQA since the first edition of the DocVQA 2020 challenge is also presented.  
  Address (down) VIRTUAL; Lausanne; Suissa; September 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ TMJ2021 Serial 3624  
Permanent link to this record
 

 
Author Jialuo Chen; M.A.Souibgui; Alicia Fornes; Beata Megyesi edit   pdf
openurl 
  Title A Web-based Interactive Transcription Tool for Encrypted Manuscripts Type Conference Article
  Year 2020 Publication 3rd International Conference on Historical Cryptology Abbreviated Journal  
  Volume Issue Pages 52-59  
  Keywords  
  Abstract Manual transcription of handwritten text is a time consuming task. In the case of encrypted manuscripts, the recognition is even more complex due to the huge variety of alphabets and symbol sets. To speed up and ease this process, we present a web-based tool aimed to (semi)-automatically transcribe the encrypted sources. The user uploads one or several images of the desired encrypted document(s) as input, and the system returns the transcription(s). This process is carried out in an interactive fashion with
the user to obtain more accurate results. For discovering and testing, the developed web tool is freely available.
 
  Address (down) Virtual; June 2020  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference HistoCrypt  
  Notes DAG; 600.140; 602.230; 600.121 Approved no  
  Call Number Admin @ si @ CSF2020 Serial 3447  
Permanent link to this record
 

 
Author Mohamed Ali Souibgui; Ali Furkan Biten; Sounak Dey; Alicia Fornes; Yousri Kessentini; Lluis Gomez; Dimosthenis Karatzas; Josep Llados edit   pdf
url  doi
openurl 
  Title One-shot Compositional Data Generation for Low Resource Handwritten Text Recognition Type Conference Article
  Year 2022 Publication Winter Conference on Applications of Computer Vision Abbreviated Journal  
  Volume Issue Pages  
  Keywords Document Analysis  
  Abstract Low resource Handwritten Text Recognition (HTR) is a hard problem due to the scarce annotated data and the very limited linguistic information (dictionaries and language models). This appears, for example, in the case of historical ciphered manuscripts, which are usually written with invented alphabets to hide the content. Thus, in this paper we address this problem through a data generation technique based on Bayesian Program Learning (BPL). Contrary to traditional generation approaches, which require a huge amount of annotated images, our method is able to generate human-like handwriting using only one sample of each symbol from the desired alphabet. After generating symbols, we create synthetic lines to train state-of-the-art HTR architectures in a segmentation free fashion. Quantitative and qualitative analyses were carried out and confirm the effectiveness of the proposed method, achieving competitive results compared to the usage of real annotated data.  
  Address (down) Virtual; January 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WACV  
  Notes DAG; 602.230; 600.140 Approved no  
  Call Number Admin @ si @ SBD2022 Serial 3615  
Permanent link to this record
 

 
Author Mohamed Ali Souibgui; Alicia Fornes; Y.Kessentini; C.Tudor edit   pdf
doi  openurl
  Title A Few-shot Learning Approach for Historical Encoded Manuscript Recognition Type Conference Article
  Year 2021 Publication 25th International Conference on Pattern Recognition Abbreviated Journal  
  Volume Issue Pages 5413-5420  
  Keywords  
  Abstract Encoded (or ciphered) manuscripts are a special type of historical documents that contain encrypted text. The automatic recognition of this kind of documents is challenging because: 1) the cipher alphabet changes from one document to another, 2) there is a lack of annotated corpus for training and 3) touching symbols make the symbol segmentation difficult and complex. To overcome these difficulties, we propose a novel method for handwritten ciphers recognition based on few-shot object detection. Our method first detects all symbols of a given alphabet in a line image, and then a decoding step maps the symbol similarity scores to the final sequence of transcribed symbols. By training on synthetic data, we show that the proposed architecture is able to recognize handwritten ciphers with unseen alphabets. In addition, if few labeled pages with the same alphabet are used for fine tuning, our method surpasses existing unsupervised and supervised HTR methods for ciphers recognition.  
  Address (down) Virtual; January 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPR  
  Notes DAG; 600.121; 600.140 Approved no  
  Call Number Admin @ si @ SFK2021 Serial 3449  
Permanent link to this record
 

 
Author Andres Mafla; Sounak Dey; Ali Furkan Biten; Lluis Gomez; Dimosthenis Karatzas edit   pdf
doi  openurl
  Title Multi-modal reasoning graph for scene-text based fine-grained image classification and retrieval Type Conference Article
  Year 2021 Publication IEEE Winter Conference on Applications of Computer Vision Abbreviated Journal  
  Volume Issue Pages 4022-4032  
  Keywords  
  Abstract  
  Address (down) Virtual; January 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WACV  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ MDB2021 Serial 3491  
Permanent link to this record
 

 
Author Andres Mafla; Rafael S. Rezende; Lluis Gomez; Diana Larlus; Dimosthenis Karatzas edit   pdf
doi  openurl
  Title StacMR: Scene-Text Aware Cross-Modal Retrieval Type Conference Article
  Year 2021 Publication IEEE Winter Conference on Applications of Computer Vision Abbreviated Journal  
  Volume Issue Pages 2219-2229  
  Keywords  
  Abstract  
  Address (down) Virtual; January 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WACV  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ MRG2021a Serial 3492  
Permanent link to this record
 

 
Author Minesh Mathew; Dimosthenis Karatzas; C.V. Jawahar edit   pdf
openurl 
  Title DocVQA: A Dataset for VQA on Document Images Type Conference Article
  Year 2021 Publication IEEE Winter Conference on Applications of Computer Vision Abbreviated Journal  
  Volume Issue Pages 2200-2209  
  Keywords  
  Abstract We present a new dataset for Visual Question Answering (VQA) on document images called DocVQA. The dataset consists of 50,000 questions defined on 12,000+ document images. Detailed analysis of the dataset in comparison with similar datasets for VQA and reading comprehension is presented. We report several baseline results by adopting existing VQA and reading comprehension models. Although the existing models perform reasonably well on certain types of questions, there is large performance gap compared to human performance (94.36% accuracy). The models need to improve specifically on questions where understanding structure of the document is crucial. The dataset, code and leaderboard are available at docvqa. org  
  Address (down) Virtual; January 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WACV  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ MKJ2021 Serial 3498  
Permanent link to this record
 

 
Author Asma Bensalah; Jialuo Chen; Alicia Fornes; Cristina Carmona_Duarte; Josep Llados; Miguel A. Ferrer edit   pdf
url  openurl
  Title Towards Stroke Patients' Upper-limb Automatic Motor Assessment Using Smartwatches. Type Conference Article
  Year 2020 Publication International Workshop on Artificial Intelligence for Healthcare Applications Abbreviated Journal  
  Volume 12661 Issue Pages 476-489  
  Keywords  
  Abstract Assessing the physical condition in rehabilitation scenarios is a challenging problem, since it involves Human Activity Recognition (HAR) and kinematic analysis methods. In addition, the difficulties increase in unconstrained rehabilitation scenarios, which are much closer to the real use cases. In particular, our aim is to design an upper-limb assessment pipeline for stroke patients using smartwatches. We focus on the HAR task, as it is the first part of the assessing pipeline. Our main target is to automatically detect and recognize four key movements inspired by the Fugl-Meyer assessment scale, which are performed in both constrained and unconstrained scenarios. In addition to the application protocol and dataset, we propose two detection and classification baseline methods. We believe that the proposed framework, dataset and baseline results will serve to foster this research field.  
  Address (down) Virtual; January 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPRW  
  Notes DAG; 600.121; 600.140; Approved no  
  Call Number Admin @ si @ BCF2020 Serial 3508  
Permanent link to this record
 

 
Author Manuel Carbonell; Pau Riba; Mauricio Villegas; Alicia Fornes; Josep Llados edit   pdf
openurl 
  Title Named Entity Recognition and Relation Extraction with Graph Neural Networks in Semi Structured Documents Type Conference Article
  Year 2020 Publication 25th International Conference on Pattern Recognition Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract The use of administrative documents to communicate and leave record of business information requires of methods
able to automatically extract and understand the content from
such documents in a robust and efficient way. In addition,
the semi-structured nature of these reports is specially suited
for the use of graph-based representations which are flexible
enough to adapt to the deformations from the different document
templates. Moreover, Graph Neural Networks provide the proper
methodology to learn relations among the data elements in
these documents. In this work we study the use of Graph
Neural Network architectures to tackle the problem of entity
recognition and relation extraction in semi-structured documents.
Our approach achieves state of the art results in the three
tasks involved in the process. Additionally, the experimentation
with two datasets of different nature demonstrates the good
generalization ability of our approach.
 
  Address (down) Virtual; January 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPR  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ CRV2020 Serial 3509  
Permanent link to this record
Select All    Deselect All
 |   | 
Details

Save Citations:
Export Records: