Author Josep Llados; Daniel Lopresti; Seiichi Uchida (eds)
Title 16th International Conference, 2021, Proceedings, Part IV Type Book Whole
Year 2021 Publication Document Analysis and Recognition – ICDAR 2021 Abbreviated Journal
Volume 12824 Issue Pages
Keywords
Abstract This four-volume set, LNCS 12821, LNCS 12822, LNCS 12823 and LNCS 12824, constitutes the refereed proceedings of the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, held in Lausanne, Switzerland, in September 2021. The 182 full papers were carefully reviewed and selected from 340 submissions, and are presented with 13 competition reports.

The papers are organized into the following topical sections: document analysis for literature search, document summarization and translation, multimedia document analysis, mobile text recognition, document analysis for social good, indexing and retrieval of documents, physical and logical layout analysis, recognition of tables and formulas, and natural language processing (NLP) for document understanding.
Address Lausanne, Switzerland, September 5-10, 2021
Corporate Author Thesis
Publisher Springer Cham Place of Publication Editor Josep Llados; Daniel Lopresti; Seiichi Uchida
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN 978-3-030-86336-4 Medium
Area Expedition Conference ICDAR
Notes DAG Approved no
Call Number Admin @ si @ Serial 3728
 

 
Author Eduardo Aguilar; Bhalaji Nagarajan; Rupali Khatun; Marc Bolaños; Petia Radeva
Title Uncertainty Modeling and Deep Learning Applied to Food Image Analysis Type Conference Article
Year 2020 Publication 13th International Joint Conference on Biomedical Engineering Systems and Technologies Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Recently, computer vision approaches, especially those assisted by deep learning techniques, have shown unexpected advancements that practically solve problems never imagined to be automatable, such as face recognition or automated driving. However, food image recognition has received little attention in the computer vision community. In this project, we review the field of food image analysis and focus on how to combine it with two challenging research lines: deep learning and uncertainty modeling. After discussing our methodology to advance in this direction, we comment on the potential research, social and economic impact of research on food image analysis.
Address Valletta; Malta; February 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference BIODEVICES
Notes MILAB Approved no
Call Number Admin @ si @ ANK2020 Serial 3526
 

 
Author Mohamed Ali Souibgui; Alicia Fornes; Y.Kessentini; C.Tudor
Title A Few-shot Learning Approach for Historical Encoded Manuscript Recognition Type Conference Article
Year 2021 Publication 25th International Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages 5413-5420
Keywords
Abstract Encoded (or ciphered) manuscripts are a special type of historical document that contains encrypted text. The automatic recognition of this kind of document is challenging because: 1) the cipher alphabet changes from one document to another, 2) there is a lack of annotated corpora for training and 3) touching symbols make symbol segmentation difficult and complex. To overcome these difficulties, we propose a novel method for handwritten cipher recognition based on few-shot object detection. Our method first detects all symbols of a given alphabet in a line image, and then a decoding step maps the symbol similarity scores to the final sequence of transcribed symbols. By training on synthetic data, we show that the proposed architecture is able to recognize handwritten ciphers with unseen alphabets. In addition, if a few labeled pages with the same alphabet are used for fine-tuning, our method surpasses existing unsupervised and supervised HTR methods for cipher recognition.
Address Virtual; January 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICPR
Notes DAG; 600.121; 600.140 Approved no
Call Number Admin @ si @ SFK2021 Serial 3449
 

 
Author Mohamed Ali Souibgui; Y.Kessentini; Alicia Fornes
Title A conditional GAN based approach for distorted camera captured documents recovery Type Conference Article
Year 2020 Publication 4th Mediterranean Conference on Pattern Recognition and Artificial Intelligence Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address Virtual; December 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference MedPRAI
Notes DAG; 600.121 Approved no
Call Number Admin @ si @ SKF2020 Serial 3450
 

 
Author Manuel Carbonell; Alicia Fornes; Mauricio Villegas; Josep Llados
Title A Neural Model for Text Localization, Transcription and Named Entity Recognition in Full Pages Type Journal Article
Year 2020 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 136 Issue Pages 219-227
Keywords
Abstract In recent years, the consolidation of deep neural network architectures for information extraction in document images has brought big improvements in the performance of each of the tasks involved in this process, consisting of text localization, transcription, and named entity recognition. However, this process is traditionally performed with separate methods for each task. In this work we propose an end-to-end model that combines a one-stage object detection network with branches for the recognition of text and named entities, respectively, in a way that shared features can be learned simultaneously from the training error of each of the tasks. By doing so, the model jointly performs handwritten text detection, transcription, and named entity recognition at page level with a single feed-forward step. We exhaustively evaluate our approach on different datasets, discussing its advantages and limitations compared to sequential approaches. The results show that the model is capable of benefiting from shared features by simultaneously solving interdependent tasks.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.140; 601.311; 600.121 Approved no
Call Number Admin @ si @ CFV2020 Serial 3451
 

 
Author B. Gautam; Oriol Ramos Terrades; Joana Maria Pujadas-Mora; Miquel Valls-Figols
Title Knowledge graph based methods for record linkage Type Journal Article
Year 2020 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 136 Issue Pages 127-133
Keywords
Abstract Nowadays, the use of individual-level data is common in Historical Demography, as a consequence of a predominant life-course approach to understanding demographic behaviour, family transition, mobility, etc. Advanced record linkage is key since it allows increasing the complexity and volume of the data to be analyzed. However, current methods are constrained to linking data from the same kind of sources. Knowledge graphs are flexible semantic representations, which allow encoding data variability and semantic relations in a structured manner.

In this paper we propose the use of knowledge graph methods to tackle record linkage tasks. The proposed method, named WERL, takes advantage of the main knowledge graph properties and learns embedding vectors to encode census information. These embeddings are properly weighted to maximize the record linkage performance. We have evaluated this method on benchmark data sets and we have compared it to related methods with stimulating and satisfactory results.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.140; 600.121 Approved no
Call Number Admin @ si @ GRP2020 Serial 3453
 

 
Author Sounak Dey; Anguelos Nicolaou; Josep Llados; Umapada Pal
Title Evaluation of the Effect of Improper Segmentation on Word Spotting Type Journal Article
Year 2019 Publication International Journal on Document Analysis and Recognition Abbreviated Journal IJDAR
Volume 22 Issue Pages 361-374
Keywords
Abstract Word spotting is an important recognition task in large-scale retrieval of document collections. In most cases, methods are developed and evaluated assuming perfect word segmentation. In this paper, we propose an experimental framework to quantify the effect that word segmentation quality has on the performance achieved by word spotting methods in identical unbiased conditions. The framework consists of generating systematic distortions on segmentation and retrieving the original queries from the distorted dataset. We have tested our framework on several established and state-of-the-art methods using the George Washington and Barcelona Marriage datasets. The experiments allow for an estimate of the end-to-end performance of word spotting methods.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.097; 600.084; 600.121; 600.140; 600.129 Approved no
Call Number Admin @ si @ DNL2019 Serial 3455
 

 
Author Albert Berenguel; Oriol Ramos Terrades; Josep Llados; Cristina Cañero
Title Recurrent Comparator with attention models to detect counterfeit documents Type Conference Article
Year 2019 Publication 15th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume Issue Pages
Keywords
Abstract This paper is focused on the detection of counterfeit documents via the recurrent comparison of the security textured background regions of two images. The main contributions are twofold: first, we apply and adapt a recurrent comparator architecture with an attention mechanism to the counterfeit detection task, which constructs a representation of the background regions by recurrently conditioning the next observation, learning the difference between genuine and counterfeit images through iterative glimpses. Second, we propose a new counterfeit document dataset to ensure the generalization of the learned model towards the detection of the lack of resolution during the counterfeit manufacturing. The presented network outperforms state-of-the-art classification approaches for counterfeit detection, as demonstrated in the evaluation.
Address Sydney; Australia; September 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICDAR
Notes DAG; 600.140; 600.121; 601.269 Approved no
Call Number Admin @ si @ BRL2019 Serial 3456
 

 
Author Fernando Vilariño
Title Library Living Lab, Numérisation 3D des chapiteaux du cloître de Saint-Cugat : des citoyens co-créant le nouveau patrimoine culturel numérique Type Conference Article
Year 2019 Publication Intersectorialité et approche Living Labs. Entretiens Jacques-Cartier Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address Montreal; Canada; December 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MV; DAG; 600.140; 600.121; SIAI Approved no
Call Number Admin @ si @ Vil2019a Serial 3457
 

 
Author Fernando Vilariño
Title Public Libraries Exploring how technology transforms the cultural experience of people Type Conference Article
Year 2019 Publication Workshop on Social Impact of AI. Open Living Lab Days Conference. Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address Thessaloniki; Greece; September 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MV; DAG; 600.140; 600.121; SIAI Approved no
Call Number Admin @ si @ Vil2019b Serial 3458
 

 
Author Fernando Vilariño
Title Unveiling the Social Impact of AI Type Conference Article
Year 2020 Publication Workshop at Digital Living Lab Days Conference Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address September 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MV; DAG; 600.121; 600.140; SIAI Approved no
Call Number Admin @ si @ Vil2020 Serial 3459
 

 
Author Hassan Ahmed Sial; Ramon Baldrich; Maria Vanrell; Dimitris Samaras
Title Light Direction and Color Estimation from Single Image with Deep Regression Type Conference Article
Year 2020 Publication London Imaging Conference Abbreviated Journal
Volume Issue Pages
Keywords
Abstract We present a method to estimate the direction and color of the scene light source from a single image. Our method is based on two main ideas: (a) we use a new synthetic dataset with strong shadow effects with similar constraints to the SID dataset; (b) we define a deep architecture trained on the mentioned dataset to estimate the direction and color of the scene light source. Apart from showing good performance on synthetic images, we additionally propose a preliminary procedure to obtain light positions of the Multi-Illumination dataset, and, in this way, we also prove that our trained model achieves good performance when it is applied to real scenes.
Address Virtual; September 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference LIM
Notes CIC; 600.118; 600.140 Approved no
Call Number Admin @ si @ SBV2020 Serial 3460
 

 
Author Sagnik Das; Hassan Ahmed Sial; Ke Ma; Ramon Baldrich; Maria Vanrell; Dimitris Samaras
Title Intrinsic Decomposition of Document Images In-the-Wild Type Conference Article
Year 2020 Publication 31st British Machine Vision Conference Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Automatic document content processing is affected by artifacts caused by the shape of the paper and by the non-uniform and diverse color of lighting conditions. Fully-supervised methods on real data are impossible due to the large amount of data needed. Hence, the current state-of-the-art deep learning models are trained on fully or partially synthetic images. However, document shadow or shading removal results still suffer because: (a) prior methods rely on uniformity of local color statistics, which limits their application to real scenarios with complex document shapes and textures; and (b) synthetic or hybrid datasets with non-realistic, simulated lighting conditions are used to train the models. In this paper we tackle these problems with our two main contributions. First, a physically constrained learning-based method that directly estimates document reflectance based on intrinsic image formation, which generalizes to challenging illumination conditions. Second, a new dataset that clearly improves on previous synthetic ones by adding a large range of realistic shading and diverse multi-illuminant conditions, uniquely customized to deal with documents in-the-wild. The proposed architecture works in two steps. First, a white balancing module neutralizes the color of the illumination on the input image. Based on the proposed multi-illuminant dataset we achieve good white balancing in really difficult conditions. Second, the shading separation module accurately disentangles the shading and paper material in a self-supervised manner where only the synthetic texture is used as a weak training signal (obviating the need for very costly ground truth with disentangled versions of shading and reflectance). The proposed approach leads to significant generalization of document reflectance estimation in real scenes with challenging illumination. We extensively evaluate on the real benchmark datasets available for intrinsic image decomposition and document shadow removal tasks. Our reflectance estimation scheme, when used as a pre-processing step of an OCR pipeline, shows a 21% improvement of character error rate (CER), thus proving its practical applicability.
The data and code will be available at: https://github.com/cvlab-stonybrook/DocIIW.
Address Virtual; September 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference BMVC
Notes CIC; 600.087; 600.140; 600.118 Approved no
Call Number Admin @ si @ DSM2020 Serial 3461
 

 
Author Sounak Dey; Pau Riba; Anjan Dutta; Josep Llados; Yi-Zhe Song
Title Doodle to Search: Practical Zero-Shot Sketch-Based Image Retrieval Type Conference Article
Year 2019 Publication IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal
Volume Issue Pages 2179-2188
Keywords
Abstract In this paper, we investigate the problem of zero-shot sketch-based image retrieval (ZS-SBIR), where human sketches are used as queries to conduct retrieval of photos from unseen categories. We advance prior art by proposing a novel ZS-SBIR scenario that represents a firm step forward in its practical application. The new setting uniquely recognizes two important yet often neglected challenges of practical ZS-SBIR: (i) the large domain gap between amateur sketch and photo, and (ii) the necessity for moving towards large-scale retrieval. We first contribute to the community a novel ZS-SBIR dataset, QuickDraw-Extended, that consists of 330,000 sketches and 204,000 photos spanning 110 categories. Highly abstract amateur human sketches are purposefully sourced to maximize the domain gap, instead of the ones included in existing datasets, which can often be semi-photorealistic. We then formulate a ZS-SBIR framework to jointly model sketches and photos into a common embedding space. A novel strategy to mine the mutual information among domains is specifically engineered to alleviate the domain gap. External semantic knowledge is further embedded to aid semantic transfer. We show that, rather surprisingly, retrieval performance that significantly outperforms the state of the art on existing datasets can already be achieved using a reduced version of our model. We further demonstrate the superior performance of our full model by comparing with a number of alternatives on the newly proposed dataset. The new dataset, plus all training and testing code of our model, will be publicly released to facilitate future research.
Address Long Beach; CA; USA; June 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVPR
Notes DAG; 600.140; 600.121; 600.097 Approved no
Call Number Admin @ si @ DRD2019 Serial 3462
 

 
Author Fernando Vilariño
Title 3D Scanning of Capitals at Library Living Lab Type Book Whole
Year 2019 Publication “Living Lab Projects 2019”. ENoLL. Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MV; DAG; 600.140; 600.121; SIAI Approved no
Call Number Admin @ si @ Vil2019c Serial 3463