toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
  Records Links
Author Albert Suso; Pau Riba; Oriol Ramos Terrades; Josep Llados edit  url
openurl 
  Title A Self-supervised Inverse Graphics Approach for Sketch Parametrization Type Conference Article
  Year 2021 Publication 16th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume (up) 12916 Issue Pages 28-42  
  Keywords  
  Abstract The study of neural generative models of handwritten text and human sketches is a hot topic in the computer vision field. The landmark SketchRNN provided a breakthrough by sequentially generating sketches as a sequence of waypoints, and more recent articles have managed to generate fully vector sketches by coding the strokes as Bézier curves. However, the previous attempts with this approach need them all a ground truth consisting in the sequence of points that make up each stroke, which seriously limits the datasets the model is able to train in. In this work, we present a self-supervised end-to-end inverse graphics approach that learns to embed each image to its best fit of Bézier curves. The self-supervised nature of the training process allows us to train the model in a wider range of datasets, but also to perform better after-training predictions by applying an overfitting process on the input binary image. We report qualitative an quantitative evaluations on the MNIST and the Quick, Draw! datasets.  
  Address Lausanne; Suissa; September 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ SRR2021 Serial 3675  
Permanent link to this record
 

 
Author Sanket Biswas; Pau Riba; Josep Llados; Umapada Pal edit   pdf
url  doi
openurl 
  Title Graph-Based Deep Generative Modelling for Document Layout Generation Type Conference Article
  Year 2021 Publication 16th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume (up) 12917 Issue Pages 525-537  
  Keywords  
  Abstract One of the major prerequisites for any deep learning approach is the availability of large-scale training data. When dealing with scanned document images in real world scenarios, the principal information of its content is stored in the layout itself. In this work, we have proposed an automated deep generative model using Graph Neural Networks (GNNs) to generate synthetic data with highly variable and plausible document layouts that can be used to train document interpretation systems, in this case, specially in digital mailroom applications. It is also the first graph-based approach for document layout generation task experimented on administrative document images, in this case, invoices.  
  Address Lausanne; Suissa; September 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.121; 600.140; 110.312 Approved no  
  Call Number Admin @ si @ BRL2021 Serial 3676  
Permanent link to this record
 

 
Author Adria Molina; Lluis Gomez; Oriol Ramos Terrades; Josep Llados edit   pdf
doi  openurl
  Title A Generic Image Retrieval Method for Date Estimation of Historical Document Collections Type Conference Article
  Year 2022 Publication Document Analysis Systems.15th IAPR International Workshop, (DAS2022) Abbreviated Journal  
  Volume (up) 13237 Issue Pages 583–597  
  Keywords Date estimation; Document retrieval; Image retrieval; Ranking loss; Smooth-nDCG  
  Abstract Date estimation of historical document images is a challenging problem, with several contributions in the literature that lack of the ability to generalize from one dataset to others. This paper presents a robust date estimation system based in a retrieval approach that generalizes well in front of heterogeneous collections. We use a ranking loss function named smooth-nDCG to train a Convolutional Neural Network that learns an ordination of documents for each problem. One of the main usages of the presented approach is as a tool for historical contextual retrieval. It means that scholars could perform comparative analysis of historical images from big datasets in terms of the period where they were produced. We provide experimental evaluation on different types of documents from real datasets of manuscript and newspaper images.  
  Address La Rochelle, France; May 22–25, 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference DAS  
  Notes DAG; 600.140; 600.121 Approved no  
  Call Number Admin @ si @ MGR2022 Serial 3694  
Permanent link to this record
 

 
Author Sergi Garcia Bordils; George Tom; Sangeeth Reddy; Minesh Mathew; Marçal Rusiñol; C.V. Jawahar; Dimosthenis Karatzas edit   pdf
url  doi
isbn  openurl
  Title Read While You Drive-Multilingual Text Tracking on the Road Type Conference Article
  Year 2022 Publication 15th IAPR International workshop on document analysis systems Abbreviated Journal  
  Volume (up) 13237 Issue Pages 756–770  
  Keywords  
  Abstract Visual data obtained during driving scenarios usually contain large amounts of text that conveys semantic information necessary to analyse the urban environment and is integral to the traffic control plan. Yet, research on autonomous driving or driver assistance systems typically ignores this information. To advance research in this direction, we present RoadText-3K, a large driving video dataset with fully annotated text. RoadText-3K is three times bigger than its predecessor and contains data from varied geographical locations, unconstrained driving conditions and multiple languages and scripts. We offer a comprehensive analysis of tracking by detection and detection by tracking methods exploring the limits of state-of-the-art text detection. Finally, we propose a new end-to-end trainable tracking model that yields state-of-the-art results on this challenging dataset. Our experiments demonstrate the complexity and variability of RoadText-3K and establish a new, realistic benchmark for scene text tracking in the wild.  
  Address La Rochelle; France; May 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-3-031-06554-5 Medium  
  Area Expedition Conference DAS  
  Notes DAG; 600.155; 611.022; 611.004 Approved no  
  Call Number Admin @ si @ GTR2022 Serial 3783  
Permanent link to this record
 

 
Author Asma Bensalah; Alicia Fornes; Cristina Carmona_Duarte; Josep Llados edit   pdf
doi  openurl
  Title Easing Automatic Neurorehabilitation via Classification and Smoothness Analysis Type Conference Article
  Year 2022 Publication Intertwining Graphonomics with Human Movements. 20th International Conference of the International Graphonomics Society, IGS 2022 Abbreviated Journal  
  Volume (up) 13424 Issue Pages 336-348  
  Keywords Neurorehabilitation; Upper-lim; Movement classification; Movement smoothness; Deep learning; Jerk  
  Abstract Assessing the quality of movements for post-stroke patients during the rehabilitation phase is vital given that there is no standard stroke rehabilitation plan for all the patients. In fact, it depends basically on the patient’s functional independence and its progress along the rehabilitation sessions. To tackle this challenge and make neurorehabilitation more agile, we propose an automatic assessment pipeline that starts by recognising patients’ movements by means of a shallow deep learning architecture, then measuring the movement quality using jerk measure and related measures. A particularity of this work is that the dataset used is clinically relevant, since it represents movements inspired from Fugl-Meyer a well common upper-limb clinical stroke assessment scale for stroke patients. We show that it is possible to detect the contrast between healthy and patients movements in terms of smoothness, besides achieving conclusions about the patients’ progress during the rehabilitation sessions that correspond to the clinicians’ findings about each case.  
  Address June 7-9, 2022, Las Palmas de Gran Canaria, Spain  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference IGS  
  Notes DAG; 600.121; 600.162; 602.230; 600.140 Approved no  
  Call Number Admin @ si @ BFC2022 Serial 3738  
Permanent link to this record
 

 
Author Alicia Fornes; Asma Bensalah; Cristina Carmona_Duarte; Jialuo Chen; Miguel A. Ferrer; Andreas Fischer; Josep Llados; Cristina Martin; Eloy Opisso; Rejean Plamondon; Anna Scius-Bertrand; Josep Maria Tormos edit   pdf
url  doi
openurl 
  Title The RPM3D Project: 3D Kinematics for Remote Patient Monitoring Type Conference Article
  Year 2022 Publication Intertwining Graphonomics with Human Movements. 20th International Conference of the International Graphonomics Society, IGS 2022 Abbreviated Journal  
  Volume (up) 13424 Issue Pages 217-226  
  Keywords Healthcare applications; Kinematic; Theory of Rapid Human Movements; Human activity recognition; Stroke rehabilitation; 3D kinematics  
  Abstract This project explores the feasibility of remote patient monitoring based on the analysis of 3D movements captured with smartwatches. We base our analysis on the Kinematic Theory of Rapid Human Movement. We have validated our research in a real case scenario for stroke rehabilitation at the Guttmann Institute (https://www.guttmann.com/en/) (neurorehabilitation hospital), showing promising results. Our work could have a great impact in remote healthcare applications, improving the medical efficiency and reducing the healthcare costs. Future steps include more clinical validation, developing multi-modal analysis architectures (analysing data from sensors, images, audio, etc.), and exploring the application of our technology to monitor other neurodegenerative diseases.  
  Address June 7-9, 2022, Las Palmas de Gran Canaria, Spain  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference IGS  
  Notes DAG; 600.121; 600.162; 602.230; 600.140 Approved no  
  Call Number Admin @ si @ FBC2022 Serial 3739  
Permanent link to this record
 

 
Author Giuseppe De Gregorio; Sanket Biswas; Mohamed Ali Souibgui; Asma Bensalah; Josep Llados; Alicia Fornes; Angelo Marcelli edit   pdf
doi  openurl
  Title A Few Shot Multi-representation Approach for N-Gram Spotting in Historical Manuscripts Type Conference Article
  Year 2022 Publication Frontiers in Handwriting Recognition. International Conference on Frontiers in Handwriting Recognition (ICFHR2022) Abbreviated Journal  
  Volume (up) 13639 Issue Pages 3-12  
  Keywords N-gram spotting; Few-shot learning; Multimodal understanding; Historical handwritten collections  
  Abstract Despite recent advances in automatic text recognition, the performance remains moderate when it comes to historical manuscripts. This is mainly because of the scarcity of available labelled data to train the data-hungry Handwritten Text Recognition (HTR) models. The Keyword Spotting System (KWS) provides a valid alternative to HTR due to the reduction in error rate, but it is usually limited to a closed reference vocabulary. In this paper, we propose a few-shot learning paradigm for spotting sequences of a few characters (N-gram) that requires a small amount of labelled training data. We exhibit that recognition of important n-grams could reduce the system’s dependency on vocabulary. In this case, an out-of-vocabulary (OOV) word in an input handwritten line image could be a sequence of n-grams that belong to the lexicon. An extensive experimental evaluation of our proposed multi-representation approach was carried out on a subset of Bentham’s historical manuscript collections to obtain some really promising results in this direction.  
  Address December 04 – 07, 2022; Hyderabad, India  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICFHR  
  Notes DAG; 600.121; 600.162; 602.230; 600.140 Approved no  
  Call Number Admin @ si @ GBS2022 Serial 3733  
Permanent link to this record
 

 
Author Arnau Baro; Pau Riba; Alicia Fornes edit  doi
openurl 
  Title Musigraph: Optical Music Recognition Through Object Detection and Graph Neural Network Type Conference Article
  Year 2022 Publication Frontiers in Handwriting Recognition. International Conference on Frontiers in Handwriting Recognition (ICFHR2022) Abbreviated Journal  
  Volume (up) 13639 Issue Pages 171-184  
  Keywords Object detection; Optical music recognition; Graph neural network  
  Abstract During the last decades, the performance of optical music recognition has been increasingly improving. However, and despite the 2-dimensional nature of music notation (e.g. notes have rhythm and pitch), most works treat musical scores as a sequence of symbols in one dimension, which make their recognition still a challenge. Thus, in this work we explore the use of graph neural networks for musical score recognition. First, because graphs are suited for n-dimensional representations, and second, because the combination of graphs with deep learning has shown a great performance in similar applications. Our methodology consists of: First, we will detect each isolated/atomic symbols (those that can not be decomposed in more graphical primitives) and the primitives that form a musical symbol. Then, we will build the graph taking as root node the notehead and as leaves those primitives or symbols that modify the note’s rhythm (stem, beam, flag) or pitch (flat, sharp, natural). Finally, the graph is translated into a human-readable character sequence for a final transcription and evaluation. Our method has been tested on more than five thousand measures, showing promising results.  
  Address December 04 – 07, 2022; Hyderabad, India  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICFHR  
  Notes DAG; 600.162; 600.140; 602.230 Approved no  
  Call Number Admin @ si @ BRF2022b Serial 3740  
Permanent link to this record
 

 
Author Utkarsh Porwal; Alicia Fornes; Faisal Shafait (eds) edit  doi
isbn  openurl
  Title Frontiers in Handwriting Recognition. International Conference on Frontiers in Handwriting Recognition. 18th International Conference, ICFHR 2022 Type Book Whole
  Year 2022 Publication Frontiers in Handwriting Recognition. Abbreviated Journal  
  Volume (up) 13639 Issue Pages  
  Keywords  
  Abstract  
  Address ICFHR 2022, Hyderabad, India, December 4–7, 2022  
  Corporate Author Thesis  
  Publisher Springer Place of Publication Editor Utkarsh Porwal; Alicia Fornes; Faisal Shafait  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-3-031-21648-0 Medium  
  Area Expedition Conference ICFHR  
  Notes DAG Approved no  
  Call Number Admin @ si @ PFS2022 Serial 3809  
Permanent link to this record
 

 
Author Emanuele Vivoli; Ali Furkan Biten; Andres Mafla; Dimosthenis Karatzas; Lluis Gomez edit   pdf
url  doi
openurl 
  Title MUST-VQA: MUltilingual Scene-text VQA Type Conference Article
  Year 2022 Publication Proceedings European Conference on Computer Vision Workshops Abbreviated Journal  
  Volume (up) 13804 Issue Pages 345–358  
  Keywords Visual question answering; Scene text; Translation robustness; Multilingual models; Zero-shot transfer; Power of language models  
  Abstract In this paper, we present a framework for Multilingual Scene Text Visual Question Answering that deals with new languages in a zero-shot fashion. Specifically, we consider the task of Scene Text Visual Question Answering (STVQA) in which the question can be asked in different languages and it is not necessarily aligned to the scene text language. Thus, we first introduce a natural step towards a more generalized version of STVQA: MUST-VQA. Accounting for this, we discuss two evaluation scenarios in the constrained setting, namely IID and zero-shot and we demonstrate that the models can perform on a par on a zero-shot setting. We further provide extensive experimentation and show the effectiveness of adapting multilingual language models into STVQA tasks.  
  Address Tel-Aviv; Israel; October 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ECCVW  
  Notes DAG; 302.105; 600.155; 611.002 Approved no  
  Call Number Admin @ si @ VBM2022 Serial 3770  
Permanent link to this record
Select All    Deselect All
 |   | 
Details

Save Citations:
Export Records: