toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
  Records Links
Author Pau Riba; Adria Molina; Lluis Gomez; Oriol Ramos Terrades; Josep Llados edit   pdf
doi  openurl
  Title Learning to Rank Words: Optimizing Ranking Metrics for Word Spotting Type Conference Article
  Year (down) 2021 Publication 16th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume 12822 Issue Pages 381–395  
  Keywords  
  Abstract In this paper, we explore and evaluate the use of ranking-based objective functions for learning simultaneously a word string and a word image encoder. We consider retrieval frameworks in which the user expects a retrieval list ranked according to a defined relevance score. In the context of a word spotting problem, the relevance score has been set according to the string edit distance from the query string. We experimentally demonstrate the competitive performance of the proposed model on query-by-string word spotting for both, handwritten and real scene word images. We also provide the results for query-by-example word spotting, although it is not the main focus of this work.  
  Address Lausanne; Suissa; September 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG; 600.121; 600.140; 110.312 Approved no  
  Call Number Admin @ si @ RMG2021 Serial 3572  
Permanent link to this record
 

 
Author Sanket Biswas; Pau Riba; Josep Llados; Umapada Pal edit   pdf
doi  openurl
  Title DocSynth: A Layout Guided Approach for Controllable Document Image Synthesis Type Conference Article
  Year (down) 2021 Publication 16th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume 12823 Issue Pages 555–568  
  Keywords  
  Abstract Despite significant progress on current state-of-the-art image generation models, synthesis of document images containing multiple and complex object layouts is a challenging task. This paper presents a novel approach, called DocSynth, to automatically synthesize document images based on a given layout. In this work, given a spatial layout (bounding boxes with object categories) as a reference by the user, our proposed DocSynth model learns to generate a set of realistic document images consistent with the defined layout. Also, this framework has been adapted to this work as a superior baseline model for creating synthetic document image datasets for augmenting real data during training for document layout analysis tasks. Different sets of learning objectives have been also used to improve the model performance. Quantitatively, we also compare the generated results of our model with real data using standard evaluation metrics. The results highlight that our model can successfully generate realistic and diverse document images with multiple objects. We also present a comprehensive qualitative analysis summary of the different scopes of synthetic image generation tasks. Lastly, to our knowledge this is the first work of its kind.  
  Address Lausanne; Suissa; September 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.121; 600.140; 110.312 Approved no  
  Call Number Admin @ si @ BRL2021a Serial 3573  
Permanent link to this record
 

 
Author Sanket Biswas; Pau Riba; Josep Llados; Umapada Pal edit   pdf
url  doi
openurl 
  Title Beyond Document Object Detection: Instance-Level Segmentation of Complex Layouts Type Journal Article
  Year (down) 2021 Publication International Journal on Document Analysis and Recognition Abbreviated Journal IJDAR  
  Volume 24 Issue Pages 269–281  
  Keywords  
  Abstract Information extraction is a fundamental task of many business intelligence services that entail massive document processing. Understanding a document page structure in terms of its layout provides contextual support which is helpful in the semantic interpretation of the document terms. In this paper, inspired by the progress of deep learning methodologies applied to the task of object recognition, we transfer these models to the specific case of document object detection, reformulating the traditional problem of document layout analysis. Moreover, we importantly contribute to prior arts by defining the task of instance segmentation on the document image domain. An instance segmentation paradigm is especially important in complex layouts whose contents should interact for the proper rendering of the page, i.e., the proper text wrapping around an image. Finally, we provide an extensive evaluation, both qualitative and quantitative, that demonstrates the superior performance of the proposed methodology over the current state of the art.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.121; 600.140; 110.312 Approved no  
  Call Number Admin @ si @ BRL2021b Serial 3574  
Permanent link to this record
 

 
Author Debora Gil; Oriol Ramos Terrades; Raquel Perez edit  doi
openurl 
  Title Topological Radiomics (TOPiomics): Early Detection of Genetic Abnormalities in Cancer Treatment Evolution Type Book Chapter
  Year (down) 2021 Publication Extended Abstracts GEOMVAP 2019, Trends in Mathematics 15 Abbreviated Journal  
  Volume 15 Issue Pages 89–93  
  Keywords  
  Abstract Abnormalities in radiomic measures correlate to genomic alterations prone to alter the outcome of personalized anti-cancer treatments. TOPiomics is a new method for the early detection of variations in tumor imaging phenotype from a topological structure in multi-view radiomic spaces.  
  Address  
  Corporate Author Thesis  
  Publisher Springer Nature Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes IAM; DAG; 600.120; 600.145; 600.139 Approved no  
  Call Number Admin @ si @ GRP2021 Serial 3594  
Permanent link to this record
 

 
Author Josep Llados; Daniel Lopresti; Seiichi Uchida (eds) edit  doi
isbn  openurl
  Title 16th International Conference, 2021, Proceedings, Part I Type Book Whole
  Year (down) 2021 Publication Document Analysis and Recognition – ICDAR 2021 Abbreviated Journal  
  Volume 12821 Issue Pages  
  Keywords  
  Abstract This four-volume set of LNCS 12821, LNCS 12822, LNCS 12823 and LNCS 12824, constitutes the refereed proceedings of the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, held in Lausanne, Switzerland in September 2021. The 182 full papers were carefully reviewed and selected from 340 submissions, and are presented with 13 competition reports.

The papers are organized into the following topical sections: historical document analysis, document analysis systems, handwriting recognition, scene text detection and recognition, document image processing, natural language processing (NLP) for document understanding, and graphics, diagram and math recognition.
 
  Address Lausanne, Switzerland, September 5-10, 2021  
  Corporate Author Thesis  
  Publisher Springer Cham Place of Publication Editor Josep Llados; Daniel Lopresti; Seiichi Uchida  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-3-030-86548-1 Medium  
  Area Expedition Conference ICDAR  
  Notes DAG Approved no  
  Call Number Admin @ si @ Serial 3725  
Permanent link to this record
 

 
Author Josep Llados; Daniel Lopresti; Seiichi Uchida (eds) edit  doi
isbn  openurl
  Title 16th International Conference, 2021, Proceedings, Part II Type Book Whole
  Year (down) 2021 Publication Document Analysis and Recognition – ICDAR 2021 Abbreviated Journal  
  Volume 12822 Issue Pages  
  Keywords  
  Abstract This four-volume set of LNCS 12821, LNCS 12822, LNCS 12823 and LNCS 12824, constitutes the refereed proceedings of the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, held in Lausanne, Switzerland in September 2021. The 182 full papers were carefully reviewed and selected from 340 submissions, and are presented with 13 competition reports.

The papers are organized into the following topical sections: document analysis for literature search, document summarization and translation, multimedia document analysis, mobile text recognition, document analysis for social good, indexing and retrieval of documents, physical and logical layout analysis, recognition of tables and formulas, and natural language processing (NLP) for document understanding.
 
  Address Lausanne, Switzerland, September 5-10, 2021  
  Corporate Author Thesis  
  Publisher Springer Cham Place of Publication Editor Josep Llados; Daniel Lopresti; Seiichi Uchida  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-3-030-86330-2 Medium  
  Area Expedition Conference ICDAR  
  Notes DAG Approved no  
  Call Number Admin @ si @ Serial 3726  
Permanent link to this record
 

 
Author Pau Riba; Andreas Fischer; Josep Llados; Alicia Fornes edit   pdf
url  openurl
  Title Learning graph edit distance by graph neural networks Type Journal Article
  Year (down) 2021 Publication Pattern Recognition Abbreviated Journal PR  
  Volume 120 Issue Pages 108132  
  Keywords  
  Abstract The emergence of geometric deep learning as a novel framework to deal with graph-based representations has faded away traditional approaches in favor of completely new methodologies. In this paper, we propose a new framework able to combine the advances on deep metric learning with traditional approximations of the graph edit distance. Hence, we propose an efficient graph distance based on the novel field of geometric deep learning. Our method employs a message passing neural network to capture the graph structure, and thus, leveraging this information for its use on a distance computation. The performance of the proposed graph distance is validated on two different scenarios. On the one hand, in a graph retrieval of handwritten words i.e. keyword spotting, showing its superior performance when compared with (approximate) graph edit distance benchmarks. On the other hand, demonstrating competitive results for graph similarity learning when compared with the current state-of-the-art on a recent benchmark dataset.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.140; 600.121 Approved no  
  Call Number Admin @ si @ RFL2021 Serial 3611  
Permanent link to this record
 

 
Author Lei Kang; Pau Riba; Marcal Rusinol; Alicia Fornes; Mauricio Villegas edit  url
doi  openurl
  Title Content and Style Aware Generation of Text-line Images for Handwriting Recognition Type Journal Article
  Year (down) 2021 Publication IEEE Transactions on Pattern Analysis and Machine Intelligence Abbreviated Journal TPAMI  
  Volume Issue Pages  
  Keywords  
  Abstract Handwritten Text Recognition has achieved an impressive performance in public benchmarks. However, due to the high inter- and intra-class variability between handwriting styles, such recognizers need to be trained using huge volumes of manually labeled training data. To alleviate this labor-consuming problem, synthetic data produced with TrueType fonts has been often used in the training loop to gain volume and augment the handwriting style variability. However, there is a significant style bias between synthetic and real data which hinders the improvement of recognition performance. To deal with such limitations, we propose a generative method for handwritten text-line images, which is conditioned on both visual appearance and textual content. Our method is able to produce long text-line samples with diverse handwriting styles. Once properly trained, our method can also be adapted to new target data by only accessing unlabeled text-line images to mimic handwritten styles and produce images with any textual content. Extensive experiments have been done on making use of the generated samples to boost Handwritten Text Recognition performance. Both qualitative and quantitative results demonstrate that the proposed approach outperforms the current state of the art.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.140; 600.121 Approved no  
  Call Number Admin @ si @ KRR2021 Serial 3612  
Permanent link to this record
 

 
Author Pau Torras; Arnau Baro; Lei Kang; Alicia Fornes edit  openurl
  Title On the Integration of Language Models into Sequence to Sequence Architectures for Handwritten Music Recognition Type Conference Article
  Year (down) 2021 Publication International Society for Music Information Retrieval Conference Abbreviated Journal  
  Volume Issue Pages 690-696  
  Keywords  
  Abstract Despite the latest advances in Deep Learning, the recognition of handwritten music scores is still a challenging endeavour. Even though the recent Sequence to Sequence(Seq2Seq) architectures have demonstrated its capacity to reliably recognise handwritten text, their performance is still far from satisfactory when applied to historical handwritten scores. Indeed, the ambiguous nature of handwriting, the non-standard musical notation employed by composers of the time and the decaying state of old paper make these scores remarkably difficult to read, sometimes even by trained humans. Thus, in this work we explore the incorporation of language models into a Seq2Seq-based architecture to try to improve transcriptions where the aforementioned unclear writing produces statistically unsound mistakes, which as far as we know, has never been attempted for this field of research on this architecture. After studying various Language Model integration techniques, the experimental evaluation on historical handwritten music scores shows a significant improvement over the state of the art, showing that this is a promising research direction for dealing with such difficult manuscripts.  
  Address Virtual; November 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ISMIR  
  Notes DAG; 600.140; 600.121 Approved no  
  Call Number Admin @ si @ TBK2021 Serial 3616  
Permanent link to this record
 

 
Author Jialuo Chen; Mohamed Ali Souibgui; Alicia Fornes; Beata Megyesi edit   pdf
openurl 
  Title Unsupervised Alphabet Matching in Historical Encrypted Manuscript Images Type Conference Article
  Year (down) 2021 Publication 4th International Conference on Historical Cryptology Abbreviated Journal  
  Volume Issue Pages 34-37  
  Keywords  
  Abstract Historical ciphers contain a wide range ofsymbols from various symbol sets. Iden-tifying the cipher alphabet is a prerequi-site before decryption can take place andis a time-consuming process. In this workwe explore the use of image processing foridentifying the underlying alphabet in ci-pher images, and to compare alphabets be-tween ciphers. The experiments show thatciphers with similar alphabets can be suc-cessfully discovered through clustering.  
  Address Virtual; September 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference HistoCrypt  
  Notes DAG; 602.230; 600.140; 600.121 Approved no  
  Call Number Admin @ si @ CSF2021 Serial 3617  
Permanent link to this record
Select All    Deselect All
 |   | 
Details

Save Citations:
Export Records: