toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
  Records Links
Author Lei Kang; Pau Riba; Marçal Rusiñol; Alicia Fornes; Mauricio Villegas edit   file
url  doi
openurl 
  Title (down) Pay Attention to What You Read: Non-recurrent Handwritten Text-Line Recognition Type Journal Article
  Year 2022 Publication Pattern Recognition Abbreviated Journal PR  
  Volume 129 Issue Pages 108766  
  Keywords  
  Abstract The advent of recurrent neural networks for handwriting recognition marked an important milestone reaching impressive recognition accuracies despite the great variability that we observe across different writing styles. Sequential architectures are a perfect fit to model text lines, not only because of the inherent temporal aspect of text, but also to learn probability distributions over sequences of characters and words. However, using such recurrent paradigms comes at a cost at training stage, since their sequential pipelines prevent parallelization. In this work, we introduce a non-recurrent approach to recognize handwritten text by the use of transformer models. We propose a novel method that bypasses any recurrence. By using multi-head self-attention layers both at the visual and textual stages, we are able to tackle character recognition as well as to learn language-related dependencies of the character sequences to be decoded. Our model is unconstrained to any predefined vocabulary, being able to recognize out-of-vocabulary words, i.e. words that do not appear in the training vocabulary. We significantly advance over prior art and demonstrate that satisfactory recognition accuracies are yielded even in few-shot learning scenarios.  
  Address Sept. 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.121; 600.162 Approved no  
  Call Number Admin @ si @ KRR2022 Serial 3556  
Permanent link to this record
 

 
Author Francisco Alvaro; Francisco Cruz; Joan Andreu Sanchez; Oriol Ramos Terrades; Jose Miguel Bemedi edit   pdf
doi  isbn
openurl 
  Title (down) Page Segmentation of Structured Documents Using 2D Stochastic Context-Free Grammars Type Conference Article
  Year 2013 Publication 6th Iberian Conference on Pattern Recognition and Image Analysis Abbreviated Journal  
  Volume 7887 Issue Pages 133-140  
  Keywords  
  Abstract In this paper we define a bidimensional extension of Stochastic Context-Free Grammars for page segmentation of structured documents. Two sets of text classification features are used to perform an initial classification of each zone of the page. Then, the page segmentation is obtained as the most likely hypothesis according to a grammar. This approach is compared to Conditional Random Fields and results show significant improvements in several cases. Furthermore, grammars provide a detailed segmentation that allowed a semantic evaluation which also validates this model.  
  Address Madeira; Portugal; June 2013  
  Corporate Author Thesis  
  Publisher Springer Berlin Heidelberg Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN 0302-9743 ISBN 978-3-642-38627-5 Medium  
  Area Expedition Conference IbPRIA  
  Notes DAG; 605.203 Approved no  
  Call Number Admin @ si @ ACS2013 Serial 2328  
Permanent link to this record
 

 
Author Stepan Simsa; Michal Uricar; Milan Sulc; Yash Patel; Ahmed Hamdi; Matej Kocian; Matyas Skalicky; Jiri Matas; Antoine Doucet; Mickael Coustaty; Dimosthenis Karatzas edit  url
doi  openurl
  Title (down) Overview of DocILE 2023: Document Information Localization and Extraction Type Conference Article
  Year 2023 Publication International Conference of the Cross-Language Evaluation Forum for European Languages Abbreviated Journal  
  Volume 14163 Issue Pages 276–293  
  Keywords Information Extraction; Computer Vision; Natural Language Processing; Optical Character Recognition; Document Understanding  
  Abstract This paper provides an overview of the DocILE 2023 Competition, its tasks, participant submissions, the competition results and possible future research directions. This first edition of the competition focused on two Information Extraction tasks, Key Information Localization and Extraction (KILE) and Line Item Recognition (LIR). Both of these tasks require detection of pre-defined categories of information in business documents. The second task additionally requires correctly grouping the information into tuples, capturing the structure laid out in the document. The competition used the recently published DocILE dataset and benchmark that stays open to new submissions. The diversity of the participant solutions indicates the potential of the dataset as the submissions included pure Computer Vision, pure Natural Language Processing, as well as multi-modal solutions and utilized all of the parts of the dataset, including the annotated, synthetic and unlabeled subsets.  
  Address Thessaloniki; Greece; September 2023  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CLEF  
  Notes DAG Approved no  
  Call Number Admin @ si @ SUS2023a Serial 3924  
Permanent link to this record
 

 
Author Sergi Garcia Bordils; Andres Mafla; Ali Furkan Biten; Oren Nuriel; Aviad Aberdam; Shai Mazor; Ron Litman; Dimosthenis Karatzas edit   pdf
url  doi
openurl 
  Title (down) Out-of-Vocabulary Challenge Report Type Conference Article
  Year 2022 Publication Proceedings European Conference on Computer Vision Workshops Abbreviated Journal  
  Volume 13804 Issue Pages 359–375  
  Keywords  
  Abstract This paper presents final results of the Out-Of-Vocabulary 2022 (OOV) challenge. The OOV contest introduces an important aspect that is not commonly studied by Optical Character Recognition (OCR) models, namely, the recognition of unseen scene text instances at training time. The competition compiles a collection of public scene text datasets comprising of 326,385 images with 4,864,405 scene text instances, thus covering a wide range of data distributions. A new and independent validation and test set is formed with scene text instances that are out of vocabulary at training time. The competition was structured in two tasks, end-to-end and cropped scene text recognition respectively. A thorough analysis of results from baselines and different participants is presented. Interestingly, current state-of-the-art models show a significant performance gap under the newly studied setting. We conclude that the OOV dataset proposed in this challenge will be an essential area to be explored in order to develop scene text models that achieve more robust and generalized predictions.  
  Address Tel-Aviv; Israel; October 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ECCVW  
  Notes DAG; 600.155; 302.105; 611.002 Approved no  
  Call Number Admin @ si @ GMB2022 Serial 3771  
Permanent link to this record
 

 
Author Oriol Ramos Terrades; Salvatore Tabbone; Ernest Valveny edit  openurl
  Title (down) Optimal Linear Combination for Two-class Classifiers Type Conference Article
  Year 2007 Publication Proceedings of the International Conference on Advances in Pattern Recognition Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address Kolkata (India)  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICAPR  
  Notes DAG Approved no  
  Call Number DAG @ dag @ RTV2007a Serial 894  
Permanent link to this record
 

 
Author Oriol Ramos Terrades; Ernest Valveny; Salvatore Tabbone edit  doi
openurl 
  Title (down) Optimal Classifier Fusion in a Non-Bayesian Probabilistic Framework Type Journal Article
  Year 2009 Publication IEEE Transactions on Pattern Analysis and Machine Intelligence Abbreviated Journal TPAMI  
  Volume 31 Issue 9 Pages 1630–1644  
  Keywords  
  Abstract The combination of the output of classifiers has been one of the strategies used to improve classification rates in general purpose classification systems. Some of the most common approaches can be explained using the Bayes' formula. In this paper, we tackle the problem of the combination of classifiers using a non-Bayesian probabilistic framework. This approach permits us to derive two linear combination rules that minimize misclassification rates under some constraints on the distribution of classifiers. In order to show the validity of this approach we have compared it with other popular combination rules from a theoretical viewpoint using a synthetic data set, and experimentally using two standard databases: the MNIST handwritten digit database and the GREC symbol database. Results on the synthetic data set show the validity of the theoretical approach. Indeed, results on real data show that the proposed methods outperform other common combination schemes.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 0162-8828 ISBN Medium  
  Area Expedition Conference  
  Notes DAG Approved no  
  Call Number DAG @ dag @ RVT2009 Serial 1220  
Permanent link to this record
 

 
Author Arnau Baro; Pau Riba; Jorge Calvo-Zaragoza; Alicia Fornes edit   pdf
doi  openurl
  Title (down) Optical Music Recognition by Recurrent Neural Networks Type Conference Article
  Year 2017 Publication 14th IAPR International Workshop on Graphics Recognition Abbreviated Journal  
  Volume Issue Pages 25-26  
  Keywords Optical Music Recognition; Recurrent Neural Network; Long Short-Term Memory  
  Abstract Optical Music Recognition is the task of transcribing a music score into a machine readable format. Many music scores are written in a single staff, and therefore, they could be treated as a sequence. Therefore, this work explores the use of Long Short-Term Memory (LSTM) Recurrent Neural Networks for reading the music score sequentially, where the LSTM helps in keeping the context. For training, we have used a synthetic dataset of more than 40000 images, labeled at primitive level  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG; 600.097; 601.302; 600.121 Approved no  
  Call Number Admin @ si @ BRC2017 Serial 3056  
Permanent link to this record
 

 
Author Arnau Baro; Pau Riba; Jorge Calvo-Zaragoza; Alicia Fornes edit   pdf
doi  isbn
openurl 
  Title (down) Optical Music Recognition by Long Short-Term Memory Networks Type Book Chapter
  Year 2018 Publication Graphics Recognition. Current Trends and Evolutions Abbreviated Journal  
  Volume 11009 Issue Pages 81-95  
  Keywords Optical Music Recognition; Recurrent Neural Network; Long ShortTerm Memory  
  Abstract Optical Music Recognition refers to the task of transcribing the image of a music score into a machine-readable format. Many music scores are written in a single staff, and therefore, they could be treated as a sequence. Therefore, this work explores the use of Long Short-Term Memory (LSTM) Recurrent Neural Networks for reading the music score sequentially, where the LSTM helps in keeping the context. For training, we have used a synthetic dataset of more than 40000 images, labeled at primitive level. The experimental results are promising, showing the benefits of our approach.  
  Address  
  Corporate Author Thesis  
  Publisher Springer Place of Publication Editor A. Fornes, B. Lamiroy  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-3-030-02283-9 Medium  
  Area Expedition Conference GREC  
  Notes DAG; 600.097; 601.302; 601.330; 600.121 Approved no  
  Call Number Admin @ si @ BRC2018 Serial 3227  
Permanent link to this record
 

 
Author Lluis Pere de las Heras; Oriol Ramos Terrades; Josep Llados edit  url
openurl 
  Title (down) Ontology-Based Understanding of Architectural Drawings Type Book Chapter
  Year 2017 Publication International Workshop on Graphics Recognition. GREC 2015.Graphic Recognition. Current Trends and Challenges Abbreviated Journal  
  Volume 9657 Issue Pages 75-85  
  Keywords Graphics recognition; Floor plan analysi; Domain ontology  
  Abstract In this paper we present a knowledge base of architectural documents aiming at improving existing methods of floor plan classification and understanding. It consists of an ontological definition of the domain and the inclusion of real instances coming from both, automatically interpreted and manually labeled documents. The knowledge base has proven to be an effective tool to structure our knowledge and to easily maintain and upgrade it. Moreover, it is an appropriate means to automatically check the consistency of relational data and a convenient complement of hard-coded knowledge interpretation systems.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ HRL2017 Serial 3086  
Permanent link to this record
 

 
Author Mohamed Ali Souibgui; Ali Furkan Biten; Sounak Dey; Alicia Fornes; Yousri Kessentini; Lluis Gomez; Dimosthenis Karatzas; Josep Llados edit   pdf
url  doi
openurl 
  Title (down) One-shot Compositional Data Generation for Low Resource Handwritten Text Recognition Type Conference Article
  Year 2022 Publication Winter Conference on Applications of Computer Vision Abbreviated Journal  
  Volume Issue Pages  
  Keywords Document Analysis  
  Abstract Low resource Handwritten Text Recognition (HTR) is a hard problem due to the scarce annotated data and the very limited linguistic information (dictionaries and language models). This appears, for example, in the case of historical ciphered manuscripts, which are usually written with invented alphabets to hide the content. Thus, in this paper we address this problem through a data generation technique based on Bayesian Program Learning (BPL). Contrary to traditional generation approaches, which require a huge amount of annotated images, our method is able to generate human-like handwriting using only one sample of each symbol from the desired alphabet. After generating symbols, we create synthetic lines to train state-of-the-art HTR architectures in a segmentation free fashion. Quantitative and qualitative analyses were carried out and confirm the effectiveness of the proposed method, achieving competitive results compared to the usage of real annotated data.  
  Address Virtual; January 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WACV  
  Notes DAG; 602.230; 600.140 Approved no  
  Call Number Admin @ si @ SBD2022 Serial 3615  
Permanent link to this record
Select All    Deselect All
 |   | 
Details

Save Citations:
Export Records: