|   | 
Details
   web
Records
Author Jun Wan; Guodong Guo; Sergio Escalera; Hugo Jair Escalante; Stan Z. Li
Title Multi-modal Face Presentation Attach Detection Type Book Whole
Year 2020 Publication Synthesis Lectures on Computer Vision Abbreviated Journal
Volume 13 Issue Pages
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue (up) Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA Approved no
Call Number Admin @ si @ WGE2020 Serial 3440
Permanent link to this record
 

 
Author Thomas B. Moeslund; Sergio Escalera; Gholamreza Anbarjafari; Kamal Nasrollahi; Jun Wan
Title Statistical Machine Learning for Human Behaviour Analysis Type Journal Article
Year 2020 Publication Entropy Abbreviated Journal ENTROPY
Volume 25 Issue 5 Pages 530
Keywords action recognition; emotion recognition; privacy-aware
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue (up) Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA; no proj Approved no
Call Number Admin @ si @ MEA2020 Serial 3441
Permanent link to this record
 

 
Author Razieh Rastgoo; Kourosh Kiani; Sergio Escalera
Title Video-based Isolated Hand Sign Language Recognition Using a Deep Cascaded Model Type Journal Article
Year 2020 Publication Multimedia Tools and Applications Abbreviated Journal MTAP
Volume 79 Issue Pages 22965–22987
Keywords
Abstract In this paper, we propose an efficient cascaded model for sign language recognition taking benefit from spatio-temporal hand-based information using deep learning approaches, especially Single Shot Detector (SSD), Convolutional Neural Network (CNN), and Long Short Term Memory (LSTM), from videos. Our simple yet efficient and accurate model includes two main parts: hand detection and sign recognition. Three types of spatial features, including hand features, Extra Spatial Hand Relation (ESHR) features, and Hand Pose (HP) features, have been fused in the model to feed to LSTM for temporal features extraction. We train SSD model for hand detection using some videos collected from five online sign dictionaries. Our model is evaluated on our proposed dataset (Rastgoo et al., Expert Syst Appl 150: 113336, 2020), including 10’000 sign videos for 100 Persian sign using 10 contributors in 10 different backgrounds, and isoGD dataset. Using the 5-fold cross-validation method, our model outperforms state-of-the-art alternatives in sign language recognition
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue (up) Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA; no menciona Approved no
Call Number Admin @ si @ RKE2020b Serial 3442
Permanent link to this record
 

 
Author Raquel Justo; Leila Ben Letaifa; Cristina Palmero; Eduardo Gonzalez-Fraile; Anna Torp Johansen; Alain Vazquez; Gennaro Cordasco; Stephan Schlogl; Begoña Fernandez-Ruanova; Micaela Silva; Sergio Escalera; Mikel de Velasco; Joffre Tenorio-Laranga; Anna Esposito; Maria Korsnes; M. Ines Torres
Title Analysis of the Interaction between Elderly People and a Simulated Virtual Coach, Journal of Ambient Intelligence and Humanized Computing Type Journal Article
Year 2020 Publication Journal of Ambient Intelligence and Humanized Computing Abbreviated Journal AIHC
Volume 11 Issue 12 Pages 6125-6140
Keywords
Abstract The EMPATHIC project develops and validates new interaction paradigms for personalized virtual coaches (VC) to promote healthy and independent aging. To this end, the work presented in this paper is aimed to analyze the interaction between the EMPATHIC-VC and the users. One of the goals of the project is to ensure an end-user driven design, involving senior users from the beginning and during each phase of the project. Thus, the paper focuses on some sessions where the seniors carried out interactions with a Wizard of Oz driven, simulated system. A coaching strategy based on the GROW model was used throughout these sessions so as to guide interactions and engage the elderly with the goals of the project. In this interaction framework, both the human and the system behavior were analyzed. The way the wizard implements the GROW coaching strategy is a key aspect of the system behavior during the interaction. The language used by the virtual agent as well as his or her physical aspect are also important cues that were analyzed. Regarding the user behavior, the vocal communication provides information about the speaker’s emotional status, that is closely related to human behavior and which can be extracted from the speech and language analysis. In the same way, the analysis of the facial expression, gazes and gestures can provide information on the non verbal human communication even when the user is not talking. In addition, in order to engage senior users, their preferences and likes had to be considered. To this end, the effect of the VC on the users was gathered by means of direct questionnaires. These analyses have shown a positive and calm behavior of users when interacting with the simulated virtual coach as well as some difficulties of the system to develop the proposed coaching strategy.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue (up) Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA; no proj Approved no
Call Number Admin @ si @ JLP2020 Serial 3443
Permanent link to this record
 

 
Author David Berga; Xavier Otazu
Title Modeling Bottom-Up and Top-Down Attention with a Neurodynamic Model of V1 Type Journal Article
Year 2020 Publication Neurocomputing Abbreviated Journal NEUCOM
Volume 417 Issue Pages 270-289
Keywords
Abstract Previous studies suggested that lateral interactions of V1 cells are responsible, among other visual effects, of bottom-up visual attention (alternatively named visual salience or saliency). Our objective is to mimic these connections with a neurodynamic network of firing-rate neurons in order to predict visual attention. Early visual subcortical processes (i.e. retinal and thalamic) are functionally simulated. An implementation of the cortical magnification function is included to define the retinotopical projections towards V1, processing neuronal activity for each distinct view during scene observation. Novel computational definitions of top-down inhibition (in terms of inhibition of return, oculomotor and selection mechanisms), are also proposed to predict attention in Free-Viewing and Visual Search tasks. Results show that our model outpeforms other biologically inspired models of saliency prediction while predicting visual saccade sequences with the same model. We also show how temporal and spatial characteristics of saccade amplitude and inhibition of return can improve prediction of saccades, as well as how distinct search strategies (in terms of feature-selective or category-specific inhibition) can predict attention at distinct image contexts.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue (up) Edition
ISSN ISBN Medium
Area Expedition Conference
Notes NEUROBIT Approved no
Call Number Admin @ si @ BeO2020c Serial 3444
Permanent link to this record
 

 
Author Josep Llados; Daniel Lopresti; Seiichi Uchida (eds)
Title 16th International Conference, 2021, Proceedings, Part III Type Book Whole
Year 2021 Publication Document Analysis and Recognition – ICDAR 2021 Abbreviated Journal
Volume 12823 Issue Pages
Keywords
Abstract This four-volume set of LNCS 12821, LNCS 12822, LNCS 12823 and LNCS 12824, constitutes the refereed proceedings of the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, held in Lausanne, Switzerland in September 2021. The 182 full papers were carefully reviewed and selected from 340 submissions, and are presented with 13 competition reports.

The papers are organized into the following topical sections: document analysis for literature search, document summarization and translation, multimedia document analysis, mobile text recognition, document analysis for social good, indexing and retrieval of documents, physical and logical layout analysis, recognition of tables and formulas, and natural language processing (NLP) for document understanding.
Address Lausanne, Switzerland, September 5-10, 2021
Corporate Author Thesis
Publisher Springer Cham Place of Publication Editor Josep Llados; Daniel Lopresti; Seiichi Uchida
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue (up) Edition
ISSN ISBN 978-3-030-86333-3 Medium
Area Expedition Conference ICDAR
Notes DAG Approved no
Call Number Admin @ si @ Serial 3727
Permanent link to this record
 

 
Author Josep Llados; Daniel Lopresti; Seiichi Uchida (eds)
Title 16th International Conference, 2021, Proceedings, Part IV Type Book Whole
Year 2021 Publication Document Analysis and Recognition – ICDAR 2021 Abbreviated Journal
Volume 12824 Issue Pages
Keywords
Abstract This four-volume set of LNCS 12821, LNCS 12822, LNCS 12823 and LNCS 12824, constitutes the refereed proceedings of the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, held in Lausanne, Switzerland in September 2021. The 182 full papers were carefully reviewed and selected from 340 submissions, and are presented with 13 competition reports.

The papers are organized into the following topical sections: document analysis for literature search, document summarization and translation, multimedia document analysis, mobile text recognition, document analysis for social good, indexing and retrieval of documents, physical and logical layout analysis, recognition of tables and formulas, and natural language processing (NLP) for document understanding.
Address Lausanne, Switzerland, September 5-10, 2021
Corporate Author Thesis
Publisher Springer Cham Place of Publication Editor Josep Llados; Daniel Lopresti; Seiichi Uchida
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue (up) Edition
ISSN ISBN 978-3-030-86336-4 Medium
Area Expedition Conference ICDAR
Notes DAG Approved no
Call Number Admin @ si @ Serial 3728
Permanent link to this record
 

 
Author Eduardo Aguilar; Bhalaji Nagarajan; Rupali Khatun; Marc Bolaños; Petia Radeva
Title Uncertainty Modeling and Deep Learning Applied to Food Image Analysis Type Conference Article
Year 2020 Publication 13th International Joint Conference on Biomedical Engineering Systems and Technologies Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Recently, computer vision approaches specially assisted by deep learning techniques have shown unexpected advancements that practically solve problems that never have been imagined to be automatized like face recognition or automated driving. However, food image recognition has received a little effort in the Computer Vision community. In this project, we review the field of food image analysis and focus on how to combine with two challenging research lines: deep learning and uncertainty modeling. After discussing our methodology to advance in this direction, we comment potential research, social and economic impact of the research on food image analysis.
Address Villetta; Malta; February 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue (up) Edition
ISSN ISBN Medium
Area Expedition Conference BIODEVICES
Notes MILAB Approved no
Call Number Admin @ si @ ANK2020 Serial 3526
Permanent link to this record
 

 
Author Mohamed Ali Souibgui; Alicia Fornes; Y.Kessentini; C.Tudor
Title A Few-shot Learning Approach for Historical Encoded Manuscript Recognition Type Conference Article
Year 2021 Publication 25th International Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages 5413-5420
Keywords
Abstract Encoded (or ciphered) manuscripts are a special type of historical documents that contain encrypted text. The automatic recognition of this kind of documents is challenging because: 1) the cipher alphabet changes from one document to another, 2) there is a lack of annotated corpus for training and 3) touching symbols make the symbol segmentation difficult and complex. To overcome these difficulties, we propose a novel method for handwritten ciphers recognition based on few-shot object detection. Our method first detects all symbols of a given alphabet in a line image, and then a decoding step maps the symbol similarity scores to the final sequence of transcribed symbols. By training on synthetic data, we show that the proposed architecture is able to recognize handwritten ciphers with unseen alphabets. In addition, if few labeled pages with the same alphabet are used for fine tuning, our method surpasses existing unsupervised and supervised HTR methods for ciphers recognition.
Address Virtual; January 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue (up) Edition
ISSN ISBN Medium
Area Expedition Conference ICPR
Notes DAG; 600.121; 600.140 Approved no
Call Number Admin @ si @ SFK2021 Serial 3449
Permanent link to this record
 

 
Author Mohamed Ali Souibgui; Y.Kessentini; Alicia Fornes
Title A conditional GAN based approach for distorted camera captured documents recovery Type Conference Article
Year 2020 Publication 4th Mediterranean Conference on Pattern Recognition and Artificial Intelligence Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address Virtual; December 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue (up) Edition
ISSN ISBN Medium
Area Expedition Conference MedPRAI
Notes DAG; 600.121 Approved no
Call Number Admin @ si @ SKF2020 Serial 3450
Permanent link to this record
 

 
Author Manuel Carbonell; Alicia Fornes; Mauricio Villegas; Josep Llados
Title A Neural Model for Text Localization, Transcription and Named Entity Recognition in Full Pages Type Journal Article
Year 2020 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 136 Issue Pages 219-227
Keywords
Abstract In the last years, the consolidation of deep neural network architectures for information extraction in document images has brought big improvements in the performance of each of the tasks involved in this process, consisting of text localization, transcription, and named entity recognition. However, this process is traditionally performed with separate methods for each task. In this work we propose an end-to-end model that combines a one stage object detection network with branches for the recognition of text and named entities respectively in a way that shared features can be learned simultaneously from the training error of each of the tasks. By doing so the model jointly performs handwritten text detection, transcription, and named entity recognition at page level with a single feed forward step. We exhaustively evaluate our approach on different datasets, discussing its advantages and limitations compared to sequential approaches. The results show that the model is capable of benefiting from shared features by simultaneously solving interdependent tasks.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue (up) Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.140; 601.311; 600.121 Approved no
Call Number Admin @ si @ CFV2020 Serial 3451
Permanent link to this record
 

 
Author Angel Morera; Angel Sanchez; A. Belen Moreno; Angel Sappa; Jose F. Velez
Title SSD vs. YOLO for Detection of Outdoor Urban Advertising Panels under Multiple Variabilities Type Journal Article
Year 2020 Publication Sensors Abbreviated Journal SENS
Volume 20 Issue 16 Pages 4587
Keywords
Abstract This work compares Single Shot MultiBox Detector (SSD) and You Only Look Once (YOLO) deep neural networks for the outdoor advertisement panel detection problem by handling multiple and combined variabilities in the scenes. Publicity panel detection in images offers important advantages both in the real world as well as in the virtual one. For example, applications like Google Street View can be used for Internet publicity and when detecting these ads panels in images, it could be possible to replace the publicity appearing inside the panels by another from a funding company. In our experiments, both SSD and YOLO detectors have produced acceptable results under variable sizes of panels, illumination conditions, viewing perspectives, partial occlusion of panels, complex background and multiple panels in scenes. Due to the difficulty of finding annotated images for the considered problem, we created our own dataset for conducting the experiments. The major strength of the SSD model was the almost elimination of False Positive (FP) cases, situation that is preferable when the publicity contained inside the panel is analyzed after detecting them. On the other side, YOLO produced better panel localization results detecting a higher number of True Positive (TP) panels with a higher accuracy. Finally, a comparison of the two analyzed object detection models with different types of semantic segmentation networks and using the same evaluation metrics is also included.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue (up) Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MSIAU; 600.130; 601.349; 600.122 Approved no
Call Number Admin @ si @ MSM2020 Serial 3452
Permanent link to this record
 

 
Author B. Gautam; Oriol Ramos Terrades; Joana Maria Pujadas-Mora; Miquel Valls-Figols
Title Knowledge graph based methods for record linkage Type Journal Article
Year 2020 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 136 Issue Pages 127-133
Keywords
Abstract Nowadays, it is common in Historical Demography the use of individual-level data as a consequence of a predominant life-course approach for the understanding of the demographic behaviour, family transition, mobility, etc. Advanced record linkage is key since it allows increasing the data complexity and its volume to be analyzed. However, current methods are constrained to link data from the same kind of sources. Knowledge graph are flexible semantic representations, which allow to encode data variability and semantic relations in a structured manner.

In this paper we propose the use of knowledge graph methods to tackle record linkage tasks. The proposed method, named WERL, takes advantage of the main knowledge graph properties and learns embedding vectors to encode census information. These embeddings are properly weighted to maximize the record linkage performance. We have evaluated this method on benchmark data sets and we have compared it to related methods with stimulating and satisfactory results.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue (up) Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.140; 600.121 Approved no
Call Number Admin @ si @ GRP2020 Serial 3453
Permanent link to this record
 

 
Author Mohamed Ali Souibgui; Y.Kessentini
Title DE-GAN: A Conditional Generative Adversarial Network for Document Enhancement Type Journal Article
Year 2022 Publication IEEE Transactions on Pattern Analysis and Machine Intelligence Abbreviated Journal TPAMI
Volume 44 Issue 3 Pages 1180-1191
Keywords
Abstract Documents often exhibit various forms of degradation, which make it hard to be read and substantially deteriorate the performance of an OCR system. In this paper, we propose an effective end-to-end framework named Document Enhancement Generative Adversarial Networks (DE-GAN) that uses the conditional GANs (cGANs) to restore severely degraded document images. To the best of our knowledge, this practice has not been studied within the context of generative adversarial deep networks. We demonstrate that, in different tasks (document clean up, binarization, deblurring and watermark removal), DE-GAN can produce an enhanced version of the degraded document with a high quality. In addition, our approach provides consistent improvements compared to state-of-the-art methods over the widely used DIBCO 2013, DIBCO 2017 and H-DIBCO 2018 datasets, proving its ability to restore a degraded document image to its ideal condition. The obtained results on a wide variety of degradation reveal the flexibility of the proposed model to be exploited in other document enhancement problems.
Address 1 March 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue (up) Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 602.230; 600.121; 600.140 Approved no
Call Number Admin @ si @ SoK2022 Serial 3454
Permanent link to this record
 

 
Author Sounak Dey; Anguelos Nicolaou; Josep Llados; Umapada Pal
Title Evaluation of the Effect of Improper Segmentation on Word Spotting Type Journal Article
Year 2019 Publication International Journal on Document Analysis and Recognition Abbreviated Journal IJDAR
Volume 22 Issue Pages 361-374
Keywords
Abstract Word spotting is an important recognition task in large-scale retrieval of document collections. In most of the cases, methods are developed and evaluated assuming perfect word segmentation. In this paper, we propose an experimental framework to quantify the goodness that word segmentation has on the performance achieved by word spotting methods in identical unbiased conditions. The framework consists of generating systematic distortions on segmentation and retrieving the original queries from the distorted dataset. We have tested our framework on several established and state-of-the-art methods using George Washington and Barcelona Marriage Datasets. The experiments done allow for an estimate of the end-to-end performance of word spotting methods.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue (up) Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.097; 600.084; 600.121; 600.140; 600.129 Approved no
Call Number Admin @ si @ DNL2019 Serial 3455
Permanent link to this record