|   | 
Details
   web
Records
Author Josep Llados; Daniel Lopresti; Seiichi Uchida (eds)
Title 16th International Conference, 2021, Proceedings, Part IV Type Book Whole
Year 2021 Publication Document Analysis and Recognition – ICDAR 2021 Abbreviated Journal
Volume 12824 Issue Pages
Keywords
Abstract This four-volume set of LNCS 12821, LNCS 12822, LNCS 12823 and LNCS 12824, constitutes the refereed proceedings of the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, held in Lausanne, Switzerland in September 2021. The 182 full papers were carefully reviewed and selected from 340 submissions, and are presented with 13 competition reports.

The papers are organized into the following topical sections: document analysis for literature search, document summarization and translation, multimedia document analysis, mobile text recognition, document analysis for social good, indexing and retrieval of documents, physical and logical layout analysis, recognition of tables and formulas, and natural language processing (NLP) for document understanding.
Address Lausanne, Switzerland, September 5-10, 2021
Corporate Author Thesis
Publisher Springer Cham Place of Publication Editor Josep Llados; Daniel Lopresti; Seiichi Uchida
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN (up) ISBN 978-3-030-86336-4 Medium
Area Expedition Conference ICDAR
Notes DAG Approved no
Call Number Admin @ si @ Serial 3728
Permanent link to this record
 

 
Author Eduardo Aguilar; Bhalaji Nagarajan; Rupali Khatun; Marc Bolaños; Petia Radeva
Title Uncertainty Modeling and Deep Learning Applied to Food Image Analysis Type Conference Article
Year 2020 Publication 13th International Joint Conference on Biomedical Engineering Systems and Technologies Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Recently, computer vision approaches specially assisted by deep learning techniques have shown unexpected advancements that practically solve problems that never have been imagined to be automatized like face recognition or automated driving. However, food image recognition has received a little effort in the Computer Vision community. In this project, we review the field of food image analysis and focus on how to combine with two challenging research lines: deep learning and uncertainty modeling. After discussing our methodology to advance in this direction, we comment potential research, social and economic impact of the research on food image analysis.
Address Villetta; Malta; February 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN (up) ISBN Medium
Area Expedition Conference BIODEVICES
Notes MILAB Approved no
Call Number Admin @ si @ ANK2020 Serial 3526
Permanent link to this record
 

 
Author Mohamed Ali Souibgui; Alicia Fornes; Y.Kessentini; C.Tudor
Title A Few-shot Learning Approach for Historical Encoded Manuscript Recognition Type Conference Article
Year 2021 Publication 25th International Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages 5413-5420
Keywords
Abstract Encoded (or ciphered) manuscripts are a special type of historical documents that contain encrypted text. The automatic recognition of this kind of documents is challenging because: 1) the cipher alphabet changes from one document to another, 2) there is a lack of annotated corpus for training and 3) touching symbols make the symbol segmentation difficult and complex. To overcome these difficulties, we propose a novel method for handwritten ciphers recognition based on few-shot object detection. Our method first detects all symbols of a given alphabet in a line image, and then a decoding step maps the symbol similarity scores to the final sequence of transcribed symbols. By training on synthetic data, we show that the proposed architecture is able to recognize handwritten ciphers with unseen alphabets. In addition, if few labeled pages with the same alphabet are used for fine tuning, our method surpasses existing unsupervised and supervised HTR methods for ciphers recognition.
Address Virtual; January 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN (up) ISBN Medium
Area Expedition Conference ICPR
Notes DAG; 600.121; 600.140 Approved no
Call Number Admin @ si @ SFK2021 Serial 3449
Permanent link to this record
 

 
Author Mohamed Ali Souibgui; Y.Kessentini; Alicia Fornes
Title A conditional GAN based approach for distorted camera captured documents recovery Type Conference Article
Year 2020 Publication 4th Mediterranean Conference on Pattern Recognition and Artificial Intelligence Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address Virtual; December 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN (up) ISBN Medium
Area Expedition Conference MedPRAI
Notes DAG; 600.121 Approved no
Call Number Admin @ si @ SKF2020 Serial 3450
Permanent link to this record
 

 
Author Manuel Carbonell; Alicia Fornes; Mauricio Villegas; Josep Llados
Title A Neural Model for Text Localization, Transcription and Named Entity Recognition in Full Pages Type Journal Article
Year 2020 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 136 Issue Pages 219-227
Keywords
Abstract In the last years, the consolidation of deep neural network architectures for information extraction in document images has brought big improvements in the performance of each of the tasks involved in this process, consisting of text localization, transcription, and named entity recognition. However, this process is traditionally performed with separate methods for each task. In this work we propose an end-to-end model that combines a one stage object detection network with branches for the recognition of text and named entities respectively in a way that shared features can be learned simultaneously from the training error of each of the tasks. By doing so the model jointly performs handwritten text detection, transcription, and named entity recognition at page level with a single feed forward step. We exhaustively evaluate our approach on different datasets, discussing its advantages and limitations compared to sequential approaches. The results show that the model is capable of benefiting from shared features by simultaneously solving interdependent tasks.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN (up) ISBN Medium
Area Expedition Conference
Notes DAG; 600.140; 601.311; 600.121 Approved no
Call Number Admin @ si @ CFV2020 Serial 3451
Permanent link to this record
 

 
Author Angel Morera; Angel Sanchez; A. Belen Moreno; Angel Sappa; Jose F. Velez
Title SSD vs. YOLO for Detection of Outdoor Urban Advertising Panels under Multiple Variabilities Type Journal Article
Year 2020 Publication Sensors Abbreviated Journal SENS
Volume 20 Issue 16 Pages 4587
Keywords
Abstract This work compares Single Shot MultiBox Detector (SSD) and You Only Look Once (YOLO) deep neural networks for the outdoor advertisement panel detection problem by handling multiple and combined variabilities in the scenes. Publicity panel detection in images offers important advantages both in the real world as well as in the virtual one. For example, applications like Google Street View can be used for Internet publicity and when detecting these ads panels in images, it could be possible to replace the publicity appearing inside the panels by another from a funding company. In our experiments, both SSD and YOLO detectors have produced acceptable results under variable sizes of panels, illumination conditions, viewing perspectives, partial occlusion of panels, complex background and multiple panels in scenes. Due to the difficulty of finding annotated images for the considered problem, we created our own dataset for conducting the experiments. The major strength of the SSD model was the almost elimination of False Positive (FP) cases, situation that is preferable when the publicity contained inside the panel is analyzed after detecting them. On the other side, YOLO produced better panel localization results detecting a higher number of True Positive (TP) panels with a higher accuracy. Finally, a comparison of the two analyzed object detection models with different types of semantic segmentation networks and using the same evaluation metrics is also included.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN (up) ISBN Medium
Area Expedition Conference
Notes MSIAU; 600.130; 601.349; 600.122 Approved no
Call Number Admin @ si @ MSM2020 Serial 3452
Permanent link to this record
 

 
Author B. Gautam; Oriol Ramos Terrades; Joana Maria Pujadas-Mora; Miquel Valls-Figols
Title Knowledge graph based methods for record linkage Type Journal Article
Year 2020 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 136 Issue Pages 127-133
Keywords
Abstract Nowadays, it is common in Historical Demography the use of individual-level data as a consequence of a predominant life-course approach for the understanding of the demographic behaviour, family transition, mobility, etc. Advanced record linkage is key since it allows increasing the data complexity and its volume to be analyzed. However, current methods are constrained to link data from the same kind of sources. Knowledge graph are flexible semantic representations, which allow to encode data variability and semantic relations in a structured manner.

In this paper we propose the use of knowledge graph methods to tackle record linkage tasks. The proposed method, named WERL, takes advantage of the main knowledge graph properties and learns embedding vectors to encode census information. These embeddings are properly weighted to maximize the record linkage performance. We have evaluated this method on benchmark data sets and we have compared it to related methods with stimulating and satisfactory results.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN (up) ISBN Medium
Area Expedition Conference
Notes DAG; 600.140; 600.121 Approved no
Call Number Admin @ si @ GRP2020 Serial 3453
Permanent link to this record
 

 
Author Mohamed Ali Souibgui; Y.Kessentini
Title DE-GAN: A Conditional Generative Adversarial Network for Document Enhancement Type Journal Article
Year 2022 Publication IEEE Transactions on Pattern Analysis and Machine Intelligence Abbreviated Journal TPAMI
Volume 44 Issue 3 Pages 1180-1191
Keywords
Abstract Documents often exhibit various forms of degradation, which make it hard to be read and substantially deteriorate the performance of an OCR system. In this paper, we propose an effective end-to-end framework named Document Enhancement Generative Adversarial Networks (DE-GAN) that uses the conditional GANs (cGANs) to restore severely degraded document images. To the best of our knowledge, this practice has not been studied within the context of generative adversarial deep networks. We demonstrate that, in different tasks (document clean up, binarization, deblurring and watermark removal), DE-GAN can produce an enhanced version of the degraded document with a high quality. In addition, our approach provides consistent improvements compared to state-of-the-art methods over the widely used DIBCO 2013, DIBCO 2017 and H-DIBCO 2018 datasets, proving its ability to restore a degraded document image to its ideal condition. The obtained results on a wide variety of degradation reveal the flexibility of the proposed model to be exploited in other document enhancement problems.
Address 1 March 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN (up) ISBN Medium
Area Expedition Conference
Notes DAG; 602.230; 600.121; 600.140 Approved no
Call Number Admin @ si @ SoK2022 Serial 3454
Permanent link to this record
 

 
Author Sounak Dey; Anguelos Nicolaou; Josep Llados; Umapada Pal
Title Evaluation of the Effect of Improper Segmentation on Word Spotting Type Journal Article
Year 2019 Publication International Journal on Document Analysis and Recognition Abbreviated Journal IJDAR
Volume 22 Issue Pages 361-374
Keywords
Abstract Word spotting is an important recognition task in large-scale retrieval of document collections. In most of the cases, methods are developed and evaluated assuming perfect word segmentation. In this paper, we propose an experimental framework to quantify the goodness that word segmentation has on the performance achieved by word spotting methods in identical unbiased conditions. The framework consists of generating systematic distortions on segmentation and retrieving the original queries from the distorted dataset. We have tested our framework on several established and state-of-the-art methods using George Washington and Barcelona Marriage Datasets. The experiments done allow for an estimate of the end-to-end performance of word spotting methods.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN (up) ISBN Medium
Area Expedition Conference
Notes DAG; 600.097; 600.084; 600.121; 600.140; 600.129 Approved no
Call Number Admin @ si @ DNL2019 Serial 3455
Permanent link to this record
 

 
Author Albert Berenguel; Oriol Ramos Terrades; Josep Llados; Cristina Cañero
Title Recurrent Comparator with attention models to detect counterfeit documents Type Conference Article
Year 2019 Publication 15th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume Issue Pages
Keywords
Abstract This paper is focused on the detection of counterfeit documents via the recurrent comparison of the security textured background regions of two images. The main contributions are twofold: first we apply and adapt a recurrent comparator architecture with attention mechanism to the counterfeit detection task, which constructs a representation of the background regions by recurrently condition the next observation, learning the difference between genuine and counterfeit images through iterative glimpses. Second we propose a new counterfeit document dataset to ensure the generalization of the learned model towards the detection of the lack of resolution during the counterfeit manufacturing. The presented network, outperforms state-of-the-art classification approaches for counterfeit detection as demonstrated in the evaluation.
Address Sidney; Australia; September 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN (up) ISBN Medium
Area Expedition Conference ICDAR
Notes DAG; 600.140; 600.121; 601.269 Approved no
Call Number Admin @ si @ BRL2019 Serial 3456
Permanent link to this record
 

 
Author Fernando Vilariño
Title Library Living Lab, Numérisation 3D des chapiteaux du cloître de Saint-Cugat : des citoyens co- créant le nouveau patrimoine culturel numérique Type Conference Article
Year 2019 Publication Intersectorialité et approche Living Labs. Entretiens Jacques-Cartier Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address Montreal; Canada; December 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN (up) ISBN Medium
Area Expedition Conference
Notes MV; DAG; 600.140; 600.121;SIAI Approved no
Call Number Admin @ si @ Vil2019a Serial 3457
Permanent link to this record
 

 
Author Fernando Vilariño
Title Public Libraries Exploring how technology transforms the cultural experience of people Type Conference Article
Year 2019 Publication Workshop on Social Impact of AI. Open Living Lab Days Conference. Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address Thessaloniki; Grecia; September 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN (up) ISBN Medium
Area Expedition Conference
Notes MV; DAG; 600.140; 600.121;SIAI Approved no
Call Number Admin @ si @ Vil2019b Serial 3458
Permanent link to this record
 

 
Author Fernando Vilariño
Title Unveiling the Social Impact of AI Type Conference Article
Year 2020 Publication Workshop at Digital Living Lab Days Conference Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address September 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN (up) ISBN Medium
Area Expedition Conference
Notes MV; DAG; 600.121; 600.140;SIAI Approved no
Call Number Admin @ si @ Vil2020 Serial 3459
Permanent link to this record
 

 
Author Hassan Ahmed Sial; Ramon Baldrich; Maria Vanrell; Dimitris Samaras
Title Light Direction and Color Estimation from Single Image with Deep Regression Type Conference Article
Year 2020 Publication London Imaging Conference Abbreviated Journal
Volume Issue Pages
Keywords
Abstract We present a method to estimate the direction and color of the scene light source from a single image. Our method is based on two main ideas: (a) we use a new synthetic dataset with strong shadow effects with similar constraints to the SID dataset; (b) we define a deep architecture trained on the mentioned dataset to estimate the direction and color of the scene light source. Apart from showing good performance on synthetic images, we additionally propose a preliminary procedure to obtain light positions of the Multi-Illumination dataset, and, in this way, we also prove that our trained model achieves good performance when it is applied to real scenes.
Address Virtual; September 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN (up) ISBN Medium
Area Expedition Conference LIM
Notes CIC; 600.118; 600.140; Approved no
Call Number Admin @ si @ SBV2020 Serial 3460
Permanent link to this record
 

 
Author Sagnik Das; Hassan Ahmed Sial; Ke Ma; Ramon Baldrich; Maria Vanrell; Dimitris Samaras
Title Intrinsic Decomposition of Document Images In-the-Wild Type Conference Article
Year 2020 Publication 31st British Machine Vision Conference Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Automatic document content processing is affected by artifacts caused by the shape
of the paper, non-uniform and diverse color of lighting conditions. Fully-supervised
methods on real data are impossible due to the large amount of data needed. Hence, the
current state of the art deep learning models are trained on fully or partially synthetic images. However, document shadow or shading removal results still suffer because: (a) prior methods rely on uniformity of local color statistics, which limit their application on real-scenarios with complex document shapes and textures and; (b) synthetic or hybrid datasets with non-realistic, simulated lighting conditions are used to train the models. In this paper we tackle these problems with our two main contributions. First, a physically constrained learning-based method that directly estimates document reflectance based on intrinsic image formation which generalizes to challenging illumination conditions. Second, a new dataset that clearly improves previous synthetic ones, by adding a large range of realistic shading and diverse multi-illuminant conditions, uniquely customized to deal with documents in-the-wild. The proposed architecture works in two steps. First, a white balancing module neutralizes the color of the illumination on the input image. Based on the proposed multi-illuminant dataset we achieve a good white-balancing in really difficult conditions. Second, the shading separation module accurately disentangles the shading and paper material in a self-supervised manner where only the synthetic texture is used as a weak training signal (obviating the need for very costly ground truth with disentangled versions of shading and reflectance). The proposed approach leads to significant generalization of document reflectance estimation in real scenes with challenging illumination. We extensively evaluate on the real benchmark datasets available for intrinsic image decomposition and document shadow removal tasks. Our reflectance estimation scheme, when used as a pre-processing step of an OCR pipeline, shows a 21% improvement of character error rate (CER), thus, proving the practical applicability. The data and code will be available at: https://github.com/cvlab-stonybrook/DocIIW.
Address Virtual; September 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN (up) ISBN Medium
Area Expedition Conference BMVC
Notes CIC; 600.087; 600.140; 600.118 Approved no
Call Number Admin @ si @ DSM2020 Serial 3461
Permanent link to this record