|   | 
Details
   web
Records
Author Ahmed M. A. Salih; Ilaria Boscolo Galazzo; Zahra Zahra Raisi-Estabragh; Steffen E. Petersen; Polyxeni Gkontra; Karim Lekadir; Gloria Menegaz; Petia Radeva
Title A new scheme for the assessment of the robustness of Explainable Methods Applied to Brain Age estimation Type Conference Article
Year 2021 Publication 34th International Symposium on Computer-Based Medical Systems Abbreviated Journal
Volume Issue Pages (up) 492-497
Keywords
Abstract Deep learning methods show great promise in a range of settings including the biomedical field. Explainability of these models is important in these fields for building end-user trust and to facilitate their confident deployment. Although several Machine Learning Interpretability tools have been proposed so far, there is currently no recognized evaluation standard to transfer the explainability results into a quantitative score. Several measures have been proposed as proxies for quantitative assessment of explainability methods. However, the robustness of the list of significant features provided by the explainability methods has not been addressed. In this work, we propose a new proxy for assessing the robustness of the list of significant features provided by two explainability methods. Our validation is defined at functionality-grounded level based on the ranked correlation statistical index and demonstrates its successful application in the framework of brain aging estimation. We assessed our proxy to estimate brain age using neuroscience data. Our results indicate small variability and high robustness in the considered explainability methods using this new proxy.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CBMS
Notes MILAB; no proj Approved no
Call Number Admin @ si @ SBZ2021 Serial 3629
Permanent link to this record
 

 
Author Arturo Fuentes; F. Javier Sanchez; Thomas Voncina; Jorge Bernal
Title LAMV: Learning to Predict Where Spectators Look in Live Music Performances Type Conference Article
Year 2021 Publication 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications Abbreviated Journal
Volume 5 Issue Pages (up) 500-507
Keywords
Abstract The advent of artificial intelligence has supposed an evolution on how different daily work tasks are performed. The analysis of cultural content has seen a huge boost by the development of computer-assisted methods that allows easy and transparent data access. In our case, we deal with the automation of the production of live shows, like music concerts, aiming to develop a system that can indicate the producer which camera to show based on what each of them is showing. In this context, we consider that is essential to understand where spectators look and what they are interested in so the computational method can learn from this information. The work that we present here shows the results of a first preliminary study in which we compare areas of interest defined by human beings and those indicated by an automatic system. Our system is based on the extraction of motion textures from dynamic Spatio-Temporal Volumes (STV) and then analyzing the patterns by means of texture analysis techniques. We validate our approach over several video sequences that have been labeled by 16 different experts. Our method is able to match those relevant areas identified by the experts, achieving recall scores higher than 80% when a distance of 80 pixels between method and ground truth is considered. Current performance shows promise when detecting abnormal peaks and movement trends.
Address Virtual; February 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference VISIGRAPP
Notes MV; ISE; 600.119; Approved no
Call Number Admin @ si @ FSV2021 Serial 3570
Permanent link to this record
 

 
Author Fatemeh Noroozi; Ciprian Corneanu; Dorota Kamińska; Tomasz Sapiński; Sergio Escalera; Gholamreza Anbarjafari
Title Survey on Emotional Body Gesture Recognition Type Journal Article
Year 2021 Publication IEEE Transactions on Affective Computing Abbreviated Journal TAC
Volume 12 Issue 2 Pages (up) 505 - 523
Keywords
Abstract Automatic emotion recognition has become a trending research topic in the past decade. While works based on facial expressions or speech abound, recognizing affect from body gestures remains a less explored topic. We present a new comprehensive survey hoping to boost research in the field. We first introduce emotional body gestures as a component of what is commonly known as “body language” and comment general aspects as gender differences and culture dependence. We then define a complete framework for automatic emotional body gesture recognition. We introduce person detection and comment static and dynamic body pose estimation methods both in RGB and 3D. We then comment the recent literature related to representation learning and emotion recognition from images of emotionally expressive gestures. We also discuss multi-modal approaches that combine speech or face with body gestures for improved emotion recognition. While pre-processing methodologies (e.g. human detection and pose estimation) are nowadays mature technologies fully developed for robust large scale analysis, we show that for emotion recognition the quantity of labelled data is scarce, there is no agreement on clearly defined output spaces and the representations are shallow and largely based on naive geometrical representations.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HUPBA; no proj Approved no
Call Number Admin @ si @ NCK2021 Serial 3657
Permanent link to this record
 

 
Author Sanket Biswas; Pau Riba; Josep Llados; Umapada Pal
Title Graph-Based Deep Generative Modelling for Document Layout Generation Type Conference Article
Year 2021 Publication 16th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume 12917 Issue Pages (up) 525-537
Keywords
Abstract One of the major prerequisites for any deep learning approach is the availability of large-scale training data. When dealing with scanned document images in real world scenarios, the principal information of its content is stored in the layout itself. In this work, we have proposed an automated deep generative model using Graph Neural Networks (GNNs) to generate synthetic data with highly variable and plausible document layouts that can be used to train document interpretation systems, in this case, specially in digital mailroom applications. It is also the first graph-based approach for document layout generation task experimented on administrative document images, in this case, invoices.
Address Lausanne; Suissa; September 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.121; 600.140; 110.312 Approved no
Call Number Admin @ si @ BRL2021 Serial 3676
Permanent link to this record
 

 
Author Sanket Biswas; Pau Riba; Josep Llados; Umapada Pal
Title DocSynth: A Layout Guided Approach for Controllable Document Image Synthesis Type Conference Article
Year 2021 Publication 16th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume 12823 Issue Pages (up) 555–568
Keywords
Abstract Despite significant progress on current state-of-the-art image generation models, synthesis of document images containing multiple and complex object layouts is a challenging task. This paper presents a novel approach, called DocSynth, to automatically synthesize document images based on a given layout. In this work, given a spatial layout (bounding boxes with object categories) as a reference by the user, our proposed DocSynth model learns to generate a set of realistic document images consistent with the defined layout. Also, this framework has been adapted to this work as a superior baseline model for creating synthetic document image datasets for augmenting real data during training for document layout analysis tasks. Different sets of learning objectives have been also used to improve the model performance. Quantitatively, we also compare the generated results of our model with real data using standard evaluation metrics. The results highlight that our model can successfully generate realistic and diverse document images with multiple objects. We also present a comprehensive qualitative analysis summary of the different scopes of synthetic image generation tasks. Lastly, to our knowledge this is the first work of its kind.
Address Lausanne; Suissa; September 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.121; 600.140; 110.312 Approved no
Call Number Admin @ si @ BRL2021a Serial 3573
Permanent link to this record
 

 
Author Ricardo Dario Perez Principi; Cristina Palmero; Julio C. S. Jacques Junior; Sergio Escalera
Title On the Effect of Observed Subject Biases in Apparent Personality Analysis from Audio-visual Signals Type Journal Article
Year 2021 Publication IEEE Transactions on Affective Computing Abbreviated Journal TAC
Volume 12 Issue 3 Pages (up) 607-621
Keywords
Abstract Personality perception is implicitly biased due to many subjective factors, such as cultural, social, contextual, gender and appearance. Approaches developed for automatic personality perception are not expected to predict the real personality of the target, but the personality external observers attributed to it. Hence, they have to deal with human bias, inherently transferred to the training data. However, bias analysis in personality computing is an almost unexplored area. In this work, we study different possible sources of bias affecting personality perception, including emotions from facial expressions, attractiveness, age, gender, and ethnicity, as well as their influence on prediction ability for apparent personality estimation. To this end, we propose a multi-modal deep neural network that combines raw audio and visual information alongside predictions of attribute-specific models to regress apparent personality. We also analyse spatio-temporal aggregation schemes and the effect of different time intervals on first impressions. We base our study on the ChaLearn First Impressions dataset, consisting of one-person conversational videos. Our model shows state-of-the-art results regressing apparent personality based on the Big-Five model. Furthermore, given the interpretability nature of our network design, we provide an incremental analysis on the impact of each possible source of bias on final network predictions.
Address 1 July-Sept. 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA; no proj Approved no
Call Number Admin @ si @ PPJ2019 Serial 3312
Permanent link to this record
 

 
Author Ruben Tito; Minesh Mathew; C.V. Jawahar; Ernest Valveny; Dimosthenis Karatzas
Title ICDAR 2021 Competition on Document Visual Question Answering Type Conference Article
Year 2021 Publication 16th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume Issue Pages (up) 635-649
Keywords
Abstract In this report we present results of the ICDAR 2021 edition of the Document Visual Question Challenges. This edition complements the previous tasks on Single Document VQA and Document Collection VQA with a newly introduced on Infographics VQA. Infographics VQA is based on a new dataset of more than 5, 000 infographics images and 30, 000 question-answer pairs. The winner methods have scored 0.6120 ANLS in Infographics VQA task, 0.7743 ANLSL in Document Collection VQA task and 0.8705 ANLS in Single Document VQA. We present a summary of the datasets used for each task, description of each of the submitted methods and the results and analysis of their performance. A summary of the progress made on Single Document VQA since the first edition of the DocVQA 2020 challenge is also presented.
Address VIRTUAL; Lausanne; Suissa; September 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICDAR
Notes DAG; 600.121 Approved no
Call Number Admin @ si @ TMJ2021 Serial 3624
Permanent link to this record
 

 
Author Bartlomiej Twardowski; Pawel Zawistowski; Szymon Zaborowski
Title Metric Learning for Session-Based Recommendations Type Conference Article
Year 2021 Publication 43rd edition of the annual BCS-IRSG European Conference on Information Retrieval Abbreviated Journal
Volume 12656 Issue Pages (up) 650-665
Keywords Session-based recommendations; Deep metric learning; Learning to rank
Abstract Session-based recommenders, used for making predictions out of users’ uninterrupted sequences of actions, are attractive for many applications. Here, for this task we propose using metric learning, where a common embedding space for sessions and items is created, and distance measures dissimilarity between the provided sequence of users’ events and the next action. We discuss and compare metric learning approaches to commonly used learning-to-rank methods, where some synergies exist. We propose a simple architecture for problem analysis and demonstrate that neither extensively big nor deep architectures are necessary in order to outperform existing methods. The experimental results against strong baselines on four datasets are provided with an ablation study.
Address Virtual; March 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ECIR
Notes LAMP; 600.120 Approved no
Call Number Admin @ si @ TZZ2021 Serial 3586
Permanent link to this record
 

 
Author Pau Torras; Arnau Baro; Lei Kang; Alicia Fornes
Title On the Integration of Language Models into Sequence to Sequence Architectures for Handwritten Music Recognition Type Conference Article
Year 2021 Publication International Society for Music Information Retrieval Conference Abbreviated Journal
Volume Issue Pages (up) 690-696
Keywords
Abstract Despite the latest advances in Deep Learning, the recognition of handwritten music scores is still a challenging endeavour. Even though the recent Sequence to Sequence(Seq2Seq) architectures have demonstrated its capacity to reliably recognise handwritten text, their performance is still far from satisfactory when applied to historical handwritten scores. Indeed, the ambiguous nature of handwriting, the non-standard musical notation employed by composers of the time and the decaying state of old paper make these scores remarkably difficult to read, sometimes even by trained humans. Thus, in this work we explore the incorporation of language models into a Seq2Seq-based architecture to try to improve transcriptions where the aforementioned unclear writing produces statistically unsound mistakes, which as far as we know, has never been attempted for this field of research on this architecture. After studying various Language Model integration techniques, the experimental evaluation on historical handwritten music scores shows a significant improvement over the state of the art, showing that this is a promising research direction for dealing with such difficult manuscripts.
Address Virtual; November 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ISMIR
Notes DAG; 600.140; 600.121 Approved no
Call Number Admin @ si @ TBK2021 Serial 3616
Permanent link to this record
 

 
Author Ruben Tito; Dimosthenis Karatzas; Ernest Valveny
Title Document Collection Visual Question Answering Type Conference Article
Year 2021 Publication 16th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume 12822 Issue Pages (up) 778-792
Keywords Document collection; Visual Question Answering
Abstract Current tasks and methods in Document Understanding aims to process documents as single elements. However, documents are usually organized in collections (historical records, purchase invoices), that provide context useful for their interpretation. To address this problem, we introduce Document Collection Visual Question Answering (DocCVQA) a new dataset and related task, where questions are posed over a whole collection of document images and the goal is not only to provide the answer to the given question, but also to retrieve the set of documents that contain the information needed to infer the answer. Along with the dataset we propose a new evaluation metric and baselines which provide further insights to the new dataset and task.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICDAR
Notes DAG; 600.121 Approved no
Call Number Admin @ si @ TKV2021 Serial 3622
Permanent link to this record
 

 
Author AN Ruchai; VI Kober; KA Dorofeev; VN Karnaukhov; Mikhail Mozerov
Title Classification of breast abnormalities using a deep convolutional neural network and transfer learning Type Journal Article
Year 2021 Publication Journal of Communications Technology and Electronics Abbreviated Journal
Volume 66 Issue 6 Pages (up) 778–783
Keywords
Abstract A new algorithm for classification of breast pathologies in digital mammography using a convolutional neural network and transfer learning is proposed. The following pretrained neural networks were chosen: MobileNetV2, InceptionResNetV2, Xception, and ResNetV2. All mammographic images were pre-processed to improve classification reliability. Transfer training was carried out using additional data augmentation and fine-tuning. The performance of the proposed algorithm for classification of breast pathologies in terms of accuracy on real data is discussed and compared with that of state-of-the-art algorithms on the available MIAS database.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP; Approved no
Call Number Admin @ si @ RKD2022 Serial 3680
Permanent link to this record
 

 
Author Ajian Liu; Chenxu Zhao; Zitong Yu; Anyang Su; Xing Liu; Zijian Kong; Jun Wan; Sergio Escalera; Hugo Jair Escalante; Zhen Lei; Guodong Guo
Title 3D High-Fidelity Mask Face Presentation Attack Detection Challenge Type Conference Article
Year 2021 Publication IEEE/CVF International Conference on Computer Vision Workshops Abbreviated Journal
Volume Issue Pages (up) 814-823
Keywords
Abstract The threat of 3D mask to face recognition systems is increasing serious, and has been widely concerned by researchers. To facilitate the study of the algorithms, a large-scale High-Fidelity Mask dataset, namely CASIA-SURF HiFiMask (briefly HiFiMask) has been collected. Specifically, it consists of total amount of 54,600 videos which are recorded from 75 subjects with 225 realistic masks under 7 new kinds of sensors. Based on this dataset and Protocol 3 which evaluates both the discrimination and generalization ability of the algorithm under the open set scenarios, we organized a 3D High-Fidelity Mask Face Presentation Attack Detection Challenge to boost the research of 3D mask based attack detection. It attracted more than 200 teams for the development phase with a total of 18 teams qualifying for the final round. All the results were verified and re-ran by the organizing team, and the results were used for the final ranking. This paper presents an overview of the challenge, including the introduction of the dataset used, the definition of the protocol, the calculation of the evaluation criteria, and the summary and publication of the competition results. Finally, we focus on introducing and analyzing the top ranked algorithms, the conclusion summary, and the research ideas for mask attack detection provided by this competition.
Address Virtual; October 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICCVW
Notes HUPBA; no proj Approved no
Call Number Admin @ si @ LZY2021 Serial 3646
Permanent link to this record
 

 
Author Javier M. Olaso; Alain Vazquez; Leila Ben Letaifa; Mikel de Velasco; Aymen Mtibaa; Mohamed Amine Hmani; Dijana Petrovska-Delacretaz; Gerard Chollet; Cesar Montenegro; Asier Lopez-Zorrilla; Raquel Justo; Roberto Santana; Jofre Tenorio-Laranga; Eduardo Gonzalez-Fraile; Begoña Fernandez-Ruanova; Gennaro Cordasco; Anna Esposito; Kristin Beck Gjellesvik; Anna Torp Johansen; Maria Stylianou Kornes; Colin Pickard; Cornelius Glackin; Gary Cahalane; Pau Buch; Cristina Palmero; Sergio Escalera; Olga Gordeeva; Olivier Deroo; Anaïs Fernandez; Daria Kyslitska; Jose Antonio Lozano; Maria Ines Torres; Stephan Schlogl
Title The EMPATHIC Virtual Coach: a demo Type Conference Article
Year 2021 Publication 23rd ACM International Conference on Multimodal Interaction Abbreviated Journal
Volume Issue Pages (up) 848-851
Keywords
Abstract The main objective of the EMPATHIC project has been the design and development of a virtual coach to engage the healthy-senior user and to enhance well-being through awareness of personal status. The EMPATHIC approach addresses this objective through multimodal interactions supported by the GROW coaching model. The paper summarizes the main components of the EMPATHIC Virtual Coach (EMPATHIC-VC) and introduces a demonstration of the coaching sessions in selected scenarios.
Address Virtual; October 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICMI
Notes HUPBA; no proj Approved no
Call Number Admin @ si @ OVB2021 Serial 3644
Permanent link to this record
 

 
Author Shiqi Yang; Kai Wang; Luis Herranz; Joost Van de Weijer
Title On Implicit Attribute Localization for Generalized Zero-Shot Learning Type Journal Article
Year 2021 Publication IEEE Signal Processing Letters Abbreviated Journal
Volume 28 Issue Pages (up) 872 - 876
Keywords
Abstract Zero-shot learning (ZSL) aims to discriminate images from unseen classes by exploiting relations to seen classes via their attribute-based descriptions. Since attributes are often related to specific parts of objects, many recent works focus on discovering discriminative regions. However, these methods usually require additional complex part detection modules or attention mechanisms. In this paper, 1) we show that common ZSL backbones (without explicit attention nor part detection) can implicitly localize attributes, yet this property is not exploited. 2) Exploiting it, we then propose SELAR, a simple method that further encourages attribute localization, surprisingly achieving very competitive generalized ZSL (GZSL) performance when compared with more complex state-of-the-art methods. Our findings provide useful insight for designing future GZSL methods, and SELAR provides an easy to implement yet strong baseline.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP; 600.120 Approved no
Call Number YWH2021 Serial 3563
Permanent link to this record
 

 
Author O.F.Ahmad; Y.Mori; M.Misawa; S.Kudo; J.T.Anderson; Jorge Bernal
Title Establishing key research questions for the implementation of artificial intelligence in colonoscopy: a modified Delphi method Type Journal Article
Year 2021 Publication Endoscopy Abbreviated Journal END
Volume 53 Issue 9 Pages (up) 893-901
Keywords
Abstract BACKGROUND : Artificial intelligence (AI) research in colonoscopy is progressing rapidly but widespread clinical implementation is not yet a reality. We aimed to identify the top implementation research priorities. METHODS : An established modified Delphi approach for research priority setting was used. Fifteen international experts, including endoscopists and translational computer scientists/engineers, from nine countries participated in an online survey over 9 months. Questions related to AI implementation in colonoscopy were generated as a long-list in the first round, and then scored in two subsequent rounds to identify the top 10 research questions. RESULTS : The top 10 ranked questions were categorized into five themes. Theme 1: clinical trial design/end points (4 questions), related to optimum trial designs for polyp detection and characterization, determining the optimal end points for evaluation of AI, and demonstrating impact on interval cancer rates. Theme 2: technological developments (3 questions), including improving detection of more challenging and advanced lesions, reduction of false-positive rates, and minimizing latency. Theme 3: clinical adoption/integration (1 question), concerning the effective combination of detection and characterization into one workflow. Theme 4: data access/annotation (1 question), concerning more efficient or automated data annotation methods to reduce the burden on human experts. Theme 5: regulatory approval (1 question), related to making regulatory approval processes more efficient. CONCLUSIONS : This is the first reported international research priority setting exercise for AI in colonoscopy. The study findings should be used as a framework to guide future research with key stakeholders to accelerate the clinical implementation of AI in endoscopy.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ISE Approved no
Call Number Admin @ si @ AMM2021 Serial 3670
Permanent link to this record