|   | 
Details
   web
Records
Author Josep Llados; Daniel Lopresti; Seiichi Uchida (eds)
Title 16th International Conference, 2021, Proceedings, Part III Type Book Whole
Year 2021 Publication Document Analysis and Recognition – ICDAR 2021 Abbreviated Journal
Volume 12823 Issue Pages
Keywords
Abstract This four-volume set of LNCS 12821, LNCS 12822, LNCS 12823 and LNCS 12824, constitutes the refereed proceedings of the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, held in Lausanne, Switzerland in September 2021. The 182 full papers were carefully reviewed and selected from 340 submissions, and are presented with 13 competition reports.

The papers are organized into the following topical sections: document analysis for literature search, document summarization and translation, multimedia document analysis, mobile text recognition, document analysis for social good, indexing and retrieval of documents, physical and logical layout analysis, recognition of tables and formulas, and natural language processing (NLP) for document understanding.
Address Lausanne, Switzerland, September 5-10, 2021
Corporate Author Thesis
Publisher Springer Cham Place of Publication Editor Josep Llados; Daniel Lopresti; Seiichi Uchida
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title (up) LNCS
Series Volume Series Issue Edition
ISSN ISBN 978-3-030-86333-3 Medium
Area Expedition Conference ICDAR
Notes DAG Approved no
Call Number Admin @ si @ Serial 3727
Permanent link to this record
 

 
Author Josep Llados; Daniel Lopresti; Seiichi Uchida (eds)
Title 16th International Conference, 2021, Proceedings, Part IV Type Book Whole
Year 2021 Publication Document Analysis and Recognition – ICDAR 2021 Abbreviated Journal
Volume 12824 Issue Pages
Keywords
Abstract This four-volume set of LNCS 12821, LNCS 12822, LNCS 12823 and LNCS 12824, constitutes the refereed proceedings of the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, held in Lausanne, Switzerland in September 2021. The 182 full papers were carefully reviewed and selected from 340 submissions, and are presented with 13 competition reports.

The papers are organized into the following topical sections: document analysis for literature search, document summarization and translation, multimedia document analysis, mobile text recognition, document analysis for social good, indexing and retrieval of documents, physical and logical layout analysis, recognition of tables and formulas, and natural language processing (NLP) for document understanding.
Address Lausanne, Switzerland, September 5-10, 2021
Corporate Author Thesis
Publisher Springer Cham Place of Publication Editor Josep Llados; Daniel Lopresti; Seiichi Uchida
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title (up) LNCS
Series Volume Series Issue Edition
ISSN ISBN 978-3-030-86336-4 Medium
Area Expedition Conference ICDAR
Notes DAG Approved no
Call Number Admin @ si @ Serial 3728
Permanent link to this record
 

 
Author Tomas Sixta; Julio C. S. Jacques Junior; Pau Buch Cardona; Eduard Vazquez; Sergio Escalera
Title FairFace Challenge at ECCV 2020: Analyzing Bias in Face Recognition Type Conference Article
Year 2020 Publication ECCV Workshops Abbreviated Journal
Volume 12540 Issue Pages 463-481
Keywords
Abstract This work summarizes the 2020 ChaLearn Looking at People Fair Face Recognition and Analysis Challenge and provides a description of the top-winning solutions and analysis of the results. The aim of the challenge was to evaluate accuracy and bias in gender and skin colour of submitted algorithms on the task of 1:1 face verification in the presence of other confounding attributes. Participants were evaluated using an in-the-wild dataset based on reannotated IJB-C, further enriched 12.5K new images and additional labels. The dataset is not balanced, which simulates a real world scenario where AI-based models supposed to present fair outcomes are trained and evaluated on imbalanced data. The challenge attracted 151 participants, who made more 1.8K submissions in total. The final phase of the challenge attracted 36 active teams out of which 10 exceeded 0.999 AUC-ROC while achieving very low scores in the proposed bias metrics. Common strategies by the participants were face pre-processing, homogenization of data distributions, the use of bias aware loss functions and ensemble models. The analysis of top-10 teams shows higher false positive rates (and lower false negative rates) for females with dark skin tone as well as the potential of eyeglasses and young age to increase the false positive rates too.
Address Virtual; August 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title (up) LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ECCVW
Notes HUPBA Approved no
Call Number Admin @ si @ SJB2020 Serial 3499
Permanent link to this record
 

 
Author Martin Menchon; Estefania Talavera; Jose M. Massa; Petia Radeva
Title Behavioural Pattern Discovery from Collections of Egocentric Photo-Streams Type Conference Article
Year 2020 Publication ECCV Workshops Abbreviated Journal
Volume 12538 Issue Pages 469-484
Keywords
Abstract The automatic discovery of behaviour is of high importance when aiming to assess and improve the quality of life of people. Egocentric images offer a rich and objective description of the daily life of the camera wearer. This work proposes a new method to identify a person’s patterns of behaviour from collected egocentric photo-streams. Our model characterizes time-frames based on the context (place, activities and environment objects) that define the images composition. Based on the similarity among the time-frames that describe the collected days for a user, we propose a new unsupervised greedy method to discover the behavioural pattern set based on a novel semantic clustering approach. Moreover, we present a new score metric to evaluate the performance of the proposed algorithm. We validate our method on 104 days and more than 100k images extracted from 7 users. Results show that behavioural patterns can be discovered to characterize the routine of individuals and consequently their lifestyle.
Address Virtual; August 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title (up) LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ECCVW
Notes MILAB; no proj Approved no
Call Number Admin @ si @ MTM2020 Serial 3528
Permanent link to this record
 

 
Author Adria Molina; Pau Riba; Lluis Gomez; Oriol Ramos Terrades; Josep Llados
Title Date Estimation in the Wild of Scanned Historical Photos: An Image Retrieval Approach Type Conference Article
Year 2021 Publication 16th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume 12822 Issue Pages 306-320
Keywords
Abstract This paper presents a novel method for date estimation of historical photographs from archival sources. The main contribution is to formulate the date estimation as a retrieval task, where given a query, the retrieved images are ranked in terms of the estimated date similarity. The closer are their embedded representations the closer are their dates. Contrary to the traditional models that design a neural network that learns a classifier or a regressor, we propose a learning objective based on the nDCG ranking metric. We have experimentally evaluated the performance of the method in two different tasks: date estimation and date-sensitive image retrieval, using the DEW public database, overcoming the baseline methods.
Address Lausanne; Suissa; September 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title (up) LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICDAR
Notes DAG; 600.121; 600.140; 110.312 Approved no
Call Number Admin @ si @ MRG2021b Serial 3571
Permanent link to this record
 

 
Author Sanket Biswas; Pau Riba; Josep Llados; Umapada Pal
Title DocSynth: A Layout Guided Approach for Controllable Document Image Synthesis Type Conference Article
Year 2021 Publication 16th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume 12823 Issue Pages 555–568
Keywords
Abstract Despite significant progress on current state-of-the-art image generation models, synthesis of document images containing multiple and complex object layouts is a challenging task. This paper presents a novel approach, called DocSynth, to automatically synthesize document images based on a given layout. In this work, given a spatial layout (bounding boxes with object categories) as a reference by the user, our proposed DocSynth model learns to generate a set of realistic document images consistent with the defined layout. Also, this framework has been adapted to this work as a superior baseline model for creating synthetic document image datasets for augmenting real data during training for document layout analysis tasks. Different sets of learning objectives have been also used to improve the model performance. Quantitatively, we also compare the generated results of our model with real data using standard evaluation metrics. The results highlight that our model can successfully generate realistic and diverse document images with multiple objects. We also present a comprehensive qualitative analysis summary of the different scopes of synthetic image generation tasks. Lastly, to our knowledge this is the first work of its kind.
Address Lausanne; Suissa; September 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title (up) LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.121; 600.140; 110.312 Approved no
Call Number Admin @ si @ BRL2021a Serial 3573
Permanent link to this record
 

 
Author Bartlomiej Twardowski; Pawel Zawistowski; Szymon Zaborowski
Title Metric Learning for Session-Based Recommendations Type Conference Article
Year 2021 Publication 43rd edition of the annual BCS-IRSG European Conference on Information Retrieval Abbreviated Journal
Volume 12656 Issue Pages 650-665
Keywords Session-based recommendations; Deep metric learning; Learning to rank
Abstract Session-based recommenders, used for making predictions out of users’ uninterrupted sequences of actions, are attractive for many applications. Here, for this task we propose using metric learning, where a common embedding space for sessions and items is created, and distance measures dissimilarity between the provided sequence of users’ events and the next action. We discuss and compare metric learning approaches to commonly used learning-to-rank methods, where some synergies exist. We propose a simple architecture for problem analysis and demonstrate that neither extensively big nor deep architectures are necessary in order to outperform existing methods. The experimental results against strong baselines on four datasets are provided with an ablation study.
Address Virtual; March 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title (up) LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ECIR
Notes LAMP; 600.120 Approved no
Call Number Admin @ si @ TZZ2021 Serial 3586
Permanent link to this record
 

 
Author Josep Llados; Daniel Lopresti; Seiichi Uchida (eds)
Title 16th International Conference, 2021, Proceedings, Part I Type Book Whole
Year 2021 Publication Document Analysis and Recognition – ICDAR 2021 Abbreviated Journal
Volume 12821 Issue Pages
Keywords
Abstract This four-volume set of LNCS 12821, LNCS 12822, LNCS 12823 and LNCS 12824, constitutes the refereed proceedings of the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, held in Lausanne, Switzerland in September 2021. The 182 full papers were carefully reviewed and selected from 340 submissions, and are presented with 13 competition reports.

The papers are organized into the following topical sections: historical document analysis, document analysis systems, handwriting recognition, scene text detection and recognition, document image processing, natural language processing (NLP) for document understanding, and graphics, diagram and math recognition.
Address Lausanne, Switzerland, September 5-10, 2021
Corporate Author Thesis
Publisher Springer Cham Place of Publication Editor Josep Llados; Daniel Lopresti; Seiichi Uchida
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title (up) LNCS
Series Volume Series Issue Edition
ISSN ISBN 978-3-030-86548-1 Medium
Area Expedition Conference ICDAR
Notes DAG Approved no
Call Number Admin @ si @ Serial 3725
Permanent link to this record
 

 
Author Josep Llados; Daniel Lopresti; Seiichi Uchida (eds)
Title 16th International Conference, 2021, Proceedings, Part II Type Book Whole
Year 2021 Publication Document Analysis and Recognition – ICDAR 2021 Abbreviated Journal
Volume 12822 Issue Pages
Keywords
Abstract This four-volume set of LNCS 12821, LNCS 12822, LNCS 12823 and LNCS 12824, constitutes the refereed proceedings of the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, held in Lausanne, Switzerland in September 2021. The 182 full papers were carefully reviewed and selected from 340 submissions, and are presented with 13 competition reports.

The papers are organized into the following topical sections: document analysis for literature search, document summarization and translation, multimedia document analysis, mobile text recognition, document analysis for social good, indexing and retrieval of documents, physical and logical layout analysis, recognition of tables and formulas, and natural language processing (NLP) for document understanding.
Address Lausanne, Switzerland, September 5-10, 2021
Corporate Author Thesis
Publisher Springer Cham Place of Publication Editor Josep Llados; Daniel Lopresti; Seiichi Uchida
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title (up) LNCS
Series Volume Series Issue Edition
ISSN ISBN 978-3-030-86330-2 Medium
Area Expedition Conference ICDAR
Notes DAG Approved no
Call Number Admin @ si @ Serial 3726
Permanent link to this record
 

 
Author Pau Torras; Mohamed Ali Souibgui; Jialuo Chen; Alicia Fornes
Title A Transcription Is All You Need: Learning to Align through Attention Type Conference Article
Year 2021 Publication 14th IAPR International Workshop on Graphics Recognition Abbreviated Journal
Volume 12916 Issue Pages 141–146
Keywords
Abstract Historical ciphered manuscripts are a type of document where graphical symbols are used to encrypt their content instead of regular text. Nowadays, expert transcriptions can be found in libraries alongside the corresponding manuscript images. However, those transcriptions are not aligned, so these are barely usable for training deep learning-based recognition methods. To solve this issue, we propose a method to align each symbol in the transcript of an image with its visual representation by using an attention-based Sequence to Sequence (Seq2Seq) model. The core idea is that, by learning to recognise symbols sequence within a cipher line image, the model also identifies their position implicitly through an attention mechanism. Thus, the resulting symbol segmentation can be later used for training algorithms. The experimental evaluation shows that this method is promising, especially taking into account the small size of the cipher dataset.
Address Virtual; September 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title (up) LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference GREC
Notes DAG; 602.230; 600.140; 600.121 Approved no
Call Number Admin @ si @ TSC2021 Serial 3619
Permanent link to this record
 

 
Author Ruben Tito; Dimosthenis Karatzas; Ernest Valveny
Title Document Collection Visual Question Answering Type Conference Article
Year 2021 Publication 16th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume 12822 Issue Pages 778-792
Keywords Document collection; Visual Question Answering
Abstract Current tasks and methods in Document Understanding aims to process documents as single elements. However, documents are usually organized in collections (historical records, purchase invoices), that provide context useful for their interpretation. To address this problem, we introduce Document Collection Visual Question Answering (DocCVQA) a new dataset and related task, where questions are posed over a whole collection of document images and the goal is not only to provide the answer to the given question, but also to retrieve the set of documents that contain the information needed to infer the answer. Along with the dataset we propose a new evaluation metric and baselines which provide further insights to the new dataset and task.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title (up) LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICDAR
Notes DAG; 600.121 Approved no
Call Number Admin @ si @ TKV2021 Serial 3622
Permanent link to this record
 

 
Author Joan Codina-Filba; Sergio Escalera; Joan Escudero; Coen Antens; Pau Buch-Cardona; Mireia Farrus
Title Mobile eHealth Platform for Home Monitoring of Bipolar Disorder Type Conference Article
Year 2021 Publication 27th ACM International Conference on Multimedia Modeling Abbreviated Journal
Volume 12573 Issue Pages 330-341
Keywords
Abstract People suffering Bipolar Disorder (BD) experiment changes in mood status having depressive or manic episodes with normal periods in the middle. BD is a chronic disease with a high level of non-adherence to medication that needs a continuous monitoring of patients to detect when they relapse in an episode, so that physicians can take care of them. Here we present MoodRecord, an easy-to-use, non-intrusive, multilingual, robust and scalable platform suitable for home monitoring patients with BD, that allows physicians and relatives to track the patient state and get alarms when abnormalities occur.

MoodRecord takes advantage of the capabilities of smartphones as a communication and recording device to do a continuous monitoring of patients. It automatically records user activity, and asks the user to answer some questions or to record himself in video, according to a predefined plan designed by physicians. The video is analysed, recognising the mood status from images and bipolar assessment scores are extracted from speech parameters. The data obtained from the different sources are merged periodically to observe if a relapse may start and if so, raise the corresponding alarm. The application got a positive evaluation in a pilot with users from three different countries. During the pilot, the predictions of the voice and image modules showed a coherent correlation with the diagnosis performed by clinicians.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title (up) LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference MMM
Notes HUPBA; no proj Approved no
Call Number Admin @ si @ CEE2021 Serial 3659
Permanent link to this record
 

 
Author Henry Velesaca; Patricia Suarez; Dario Carpio; Angel Sappa
Title Synthesized Image Datasets: Towards an Annotation-Free Instance Segmentation Strategy Type Conference Article
Year 2021 Publication 16th International Symposium on Visual Computing Abbreviated Journal
Volume 13017 Issue Pages 131–143
Keywords
Abstract This paper presents a complete pipeline to perform deep learning-based instance segmentation of different types of grains (e.g., corn, sunflower, soybeans, lentils, chickpeas, mote, and beans). The proposed approach consists of using synthesized image datasets for the training process, which are easily generated according to the category of the instance to be segmented. The synthesized imaging process allows generating a large set of well-annotated grain samples with high variability—as large and high as the user requires. Instance segmentation is performed through a popular deep learning based approach, the Mask R-CNN architecture, but any learning-based instance segmentation approach can be considered. Results obtained by the proposed pipeline show that the strategy of using synthesized image datasets for training instance segmentation helps to avoid the time-consuming image annotation stage, as well as to achieve higher intersection over union and average precision performances. Results obtained with different varieties of grains are shown, as well as comparisons with manually annotated images, showing both the simplicity of the process and the improvements in the performance.
Address Virtual; October 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title (up) LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ISVC
Notes MSIAU Approved no
Call Number Admin @ si @ VSC2021 Serial 3667
Permanent link to this record
 

 
Author Patricia Suarez; Dario Carpio; Angel Sappa
Title Non-homogeneous Haze Removal Through a Multiple Attention Module Architecture Type Conference Article
Year 2021 Publication 16th International Symposium on Visual Computing Abbreviated Journal
Volume 13018 Issue Pages 178–190
Keywords
Abstract This paper presents a novel attention based architecture to remove non-homogeneous haze. The proposed model is focused on obtaining the most representative characteristics of the image, at each learning cycle, by means of adaptive attention modules coupled with a residual learning convolutional network. The latter is based on the Res2Net model. The proposed architecture is trained with just a few set of images. Its performance is evaluated on a public benchmark—images from the non-homogeneous haze NTIRE 2021 challenge—and compared with state of the art approaches reaching the best result.
Address Virtual; October 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title (up) LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ISVC
Notes MSIAU Approved no
Call Number Admin @ si @ SCS2021 Serial 3668
Permanent link to this record
 

 
Author Albert Suso; Pau Riba; Oriol Ramos Terrades; Josep Llados
Title A Self-supervised Inverse Graphics Approach for Sketch Parametrization Type Conference Article
Year 2021 Publication 16th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume 12916 Issue Pages 28-42
Keywords
Abstract The study of neural generative models of handwritten text and human sketches is a hot topic in the computer vision field. The landmark SketchRNN provided a breakthrough by sequentially generating sketches as a sequence of waypoints, and more recent articles have managed to generate fully vector sketches by coding the strokes as Bézier curves. However, the previous attempts with this approach need them all a ground truth consisting in the sequence of points that make up each stroke, which seriously limits the datasets the model is able to train in. In this work, we present a self-supervised end-to-end inverse graphics approach that learns to embed each image to its best fit of Bézier curves. The self-supervised nature of the training process allows us to train the model in a wider range of datasets, but also to perform better after-training predictions by applying an overfitting process on the input binary image. We report qualitative an quantitative evaluations on the MNIST and the Quick, Draw! datasets.
Address Lausanne; Suissa; September 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title (up) LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICDAR
Notes DAG; 600.121 Approved no
Call Number Admin @ si @ SRR2021 Serial 3675
Permanent link to this record