Records | |||||
---|---|---|---|---|---|
Author | Anna Esposito; Italia Cirillo; Antonietta Esposito; Leopoldina Fortunati; Gian Luca Foresti; Sergio Escalera; Nikolaos Bourbakis | ||||
Title | Impairments in decoding facial and vocal emotional expressions in high functioning autistic adults and adolescents | Type | Conference Article | ||
Year | 2020 | Publication | Faces and Gestures in E-health and welfare workshop | Abbreviated Journal | |
Volume | Issue | Pages | 667-674 | ||
Keywords | |||||
Abstract | |||||
Address | Virtual; November 2020 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | FGW | ||
Notes | HUPBA | Approved | no | ||
Call Number | Admin @ si @ ECE2020 | Serial | 3516 | ||
Permanent link to this record | |||||
Author | Josep Famadas; Meysam Madadi; Cristina Palmero; Sergio Escalera | ||||
Title | Generative Video Face Reenactment by AUs and Gaze Regularization | Type | Conference Article | ||
Year | 2020 | Publication | 15th IEEE International Conference on Automatic Face and Gesture Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 444-451 | ||
Keywords | |||||
Abstract | In this work, we propose an encoder-decoder-like architecture to perform face reenactment in image sequences. Our goal is to transfer the training subject identity to a given test subject. We regularize face reenactment by facial action unit intensity and 3D gaze vector regression. This way, we enforce the network to transfer subtle facial expressions and eye dynamics, providing a more lifelike result. The proposed encoder-decoder receives as input the previous sequence frame stacked to the current frame image of facial landmarks. Thus, the generated frames benefit from appearance and geometry, while keeping temporal coherence for the generated sequence. At test stage, a new target subject with the facial performance of the source subject and the appearance of the training subject is reenacted. Principal component analysis is applied to project the test subject geometry to the closest training subject geometry before reenactment. Evaluation of our proposal shows faster convergence, and more accurate and realistic results in comparison to other architectures without action units and gaze regularization. | ||||
Address | Virtual; November 2020 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | FG | ||
Notes | HUPBA | Approved | no | ||
Call Number | Admin @ si @ FMP2020 | Serial | 3517 | ||
Permanent link to this record | |||||
Author | Carlos Martin-Isla; Maryam Asadi-Aghbolaghi; Polyxeni Gkontra; Victor M. Campello; Sergio Escalera; Karim Lekadir | ||||
Title | Stacked BCDU-net with semantic CMR synthesis: application to Myocardial Pathology Segmentation challenge | Type | Conference Article | ||
Year | 2020 | Publication | MYOPS challenge and workshop | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | |||||
Address | Virtual; October 2020 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | MICCAIW | ||
Notes | HUPBA | Approved | no | ||
Call Number | Admin @ si @ MAG2020 | Serial | 3518 | ||
Permanent link to this record | |||||
Author | Hugo Bertiche; Meysam Madadi; Sergio Escalera | ||||
Title | CLOTH3D: Clothed 3D Humans | Type | Conference Article | ||
Year | 2020 | Publication | 16th European Conference on Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | This work presents CLOTH3D, the first large-scale synthetic dataset of 3D clothed human sequences. CLOTH3D contains a large variability in garment type, topology, shape, size, tightness and fabric. Clothes are simulated on top of thousands of different pose sequences and body shapes, generating realistic cloth dynamics. We provide the dataset with a generative model for cloth generation. We propose a Conditional Variational Auto-Encoder (CVAE) based on graph convolutions (GCVAE) to learn garment latent spaces. This allows for realistic generation of 3D garments on top of the SMPL model for any pose and shape. | ||||
Address | Virtual; August 2020 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ECCV | ||
Notes | HUPBA | Approved | no | ||
Call Number | Admin @ si @ BME2020 | Serial | 3519 | ||
Permanent link to this record | |||||
Author | Reza Azad; Maryam Asadi-Aghbolaghi; Mahmood Fathy; Sergio Escalera | ||||
Title | Attention Deeplabv3+: Multi-level Context Attention Mechanism for Skin Lesion Segmentation | Type | Conference Article | ||
Year | 2020 | Publication | Bioimage computation workshop | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | |||||
Address | Virtual; August 2020 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ECCVW | ||
Notes | HUPBA | Approved | no | ||
Call Number | Admin @ si @ AAF2020 | Serial | 3520 | ||
Permanent link to this record | |||||
Author | Petia Radeva | ||||
Title | Uncertainty Modeling within an End-to-end Framework for Food Image Analysis | Type | Conference Article | ||
Year | 2020 | Publication | 1st DELTA | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | DELTA | ||
Notes | MILAB | Approved | no | ||
Call Number | Admin @ si @ Rad2020 | Serial | 3527 | ||
Permanent link to this record | |||||
Author | Martin Menchon; Estefania Talavera; Jose M. Massa; Petia Radeva | ||||
Title | Behavioural Pattern Discovery from Collections of Egocentric Photo-Streams | Type | Conference Article | ||
Year | 2020 | Publication | ECCV Workshops | Abbreviated Journal | |
Volume | 12538 | Issue | Pages | 469-484 | |
Keywords | |||||
Abstract | The automatic discovery of behaviour is of high importance when aiming to assess and improve the quality of life of people. Egocentric images offer a rich and objective description of the daily life of the camera wearer. This work proposes a new method to identify a person’s patterns of behaviour from collected egocentric photo-streams. Our model characterizes time-frames based on the context (place, activities and environment objects) that defines the image composition. Based on the similarity among the time-frames that describe the collected days for a user, we propose a new unsupervised greedy method to discover the behavioural pattern set based on a novel semantic clustering approach. Moreover, we present a new score metric to evaluate the performance of the proposed algorithm. We validate our method on 104 days and more than 100k images extracted from 7 users. Results show that behavioural patterns can be discovered to characterize the routine of individuals and consequently their lifestyle. | ||||
Address | Virtual; August 2020 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ECCVW | ||
Notes | MILAB; no proj | Approved | no | ||
Call Number | Admin @ si @ MTM2020 | Serial | 3528 | ||
Permanent link to this record | |||||
Author | Mariona Caros; Maite Garolera; Petia Radeva; Xavier Giro | ||||
Title | Automatic Reminiscence Therapy for Dementia | Type | Conference Article | ||
Year | 2020 | Publication | 10th ACM International Conference on Multimedia Retrieval | Abbreviated Journal | |
Volume | Issue | Pages | 383-387 | ||
Keywords | |||||
Abstract | With people living longer than ever, the number of dementia cases, such as Alzheimer's disease, increases steadily. Dementia affects more than 46 million people worldwide, and it is estimated that in 2050 more than 100 million will be affected. While there are no effective treatments for these terminal diseases, therapies such as reminiscence, which stimulate memories from the past, are recommended. Currently, reminiscence therapy takes place in care homes and is guided by a therapist or a carer. In this work, we present an AI-based solution to automate reminiscence therapy, which consists of a dialogue system that uses photos as input to generate questions. We run a usability case study with patients diagnosed with mild cognitive impairment, which shows that they found the system very entertaining and challenging. Overall, this paper presents how reminiscence therapy can be automated using machine learning and deployed to smartphones and laptops, making the therapy more accessible to every person affected by dementia. | ||||
Address | Virtual; October 2020 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICMR | ||
Notes | Approved | no | |||
Call Number | Admin @ si @ CGR2020 | Serial | 3529 | ||
Permanent link to this record | |||||
Author | Idoia Ruiz; Joan Serrat | ||||
Title | Rank-based ordinal classification | Type | Conference Article | ||
Year | 2020 | Publication | 25th International Conference on Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 8069-8076 | ||
Keywords | |||||
Abstract | Unlike the regular classification task, ordinal classification has an order among the classes. As a consequence, not all classification errors matter the same: predicting a class close to the groundtruth one is better than predicting a class farther away. To account for this, most previous works employ loss functions based on the absolute difference between the predicted and groundtruth class labels. We argue that there are many cases in ordinal classification where label values are arbitrary (for instance 1, ..., C, where C is the number of classes) and thus such loss functions may not be the best choice. We instead propose a network architecture that produces not a single class prediction but an ordered vector, or ranking, of all the possible classes from most to least likely. This is made possible by a loss function that compares groundtruth and predicted rankings of these class labels, not the labels themselves. Another advantage of this new formulation is that we can enforce consistency in the predictions, namely, predicted rankings come from some unimodal vector of scores with mode at the groundtruth class. We compare with the state-of-the-art ordinal classification methods, showing that ours attains equal or better performance, as measured by common ordinal classification metrics, on three benchmark datasets. Furthermore, it is also suitable for a new task on image aesthetics assessment, i.e. most voted score prediction. Finally, we also apply it to building damage assessment from satellite images, providing an analysis of its performance depending on the degree of imbalance of the dataset. | ||||
Address | Virtual; January 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICPR | ||
Notes | ADAS; 600.118; 600.124 | Approved | no | ||
Call Number | Admin @ si @ RuS2020 | Serial | 3549 | ||
Permanent link to this record | |||||
Author | Klara Janousckova; Jiri Matas; Lluis Gomez; Dimosthenis Karatzas | ||||
Title | Text Recognition – Real World Data and Where to Find Them | Type | Conference Article | ||
Year | 2020 | Publication | 25th International Conference on Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 4489-4496 | ||
Keywords | |||||
Abstract | We present a method for exploiting weakly annotated images to improve text extraction pipelines. The approach uses an arbitrary end-to-end text recognition system to obtain text region proposals and their possibly erroneous transcriptions. The method includes matching of imprecise transcriptions to weak annotations and an edit distance guided neighbourhood search. It produces nearly error-free, localised instances of scene text, which we treat as “pseudo ground truth” (PGT). The method is applied to two weakly-annotated datasets. Training with the extracted PGT consistently improves the accuracy of a state of the art recognition model, by 3.7% on average, across different benchmark datasets (image domains) and 24.5% on one of the weakly annotated datasets. Acknowledgements: The authors were supported by Czech Technical University student grant SGS20/171/0HK3/3TJ13, the MEYS VVV project CZ.02.1.01/0.010.0J16 019/0000765 Research Center for Informatics, the Spanish Research project TIN2017-89779-P and the CERCA Programme / Generalitat de Catalunya. | ||||
Address | Virtual; January 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICPR | ||
Notes | DAG; 600.121; 600.129 | Approved | no | ||
Call Number | Admin @ si @ JMG2020 | Serial | 3557 | ||
Permanent link to this record | |||||
Author | Minesh Mathew; Ruben Tito; Dimosthenis Karatzas; R.Manmatha; C.V. Jawahar | ||||
Title | Document Visual Question Answering Challenge 2020 | Type | Conference Article | ||
Year | 2020 | Publication | 33rd IEEE Conference on Computer Vision and Pattern Recognition – Short paper | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | This paper presents the results of the Document Visual Question Answering Challenge, organized as part of the “Text and Documents in the Deep Learning Era” workshop at CVPR 2020. The challenge introduces a new problem – Visual Question Answering on document images. The challenge comprised two tasks. The first task concerns asking questions on a single document image. The second task, on the other hand, is set as a retrieval task where the question is posed over a collection of images. For task 1, a new dataset is introduced comprising 50,000 questions-answer(s) pairs defined over 12,767 document images. For task 2, another dataset has been created comprising 20 questions over 14,362 document images which share the same document template. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVPR | ||
Notes | DAG; 600.121 | Approved | no | ||
Call Number | Admin @ si @ MTK2020 | Serial | 3558 | ||
Permanent link to this record | |||||
Author | Sergio Escalera; Ralf Herbrich | ||||
Title | The NeurIPS’18 Competition: From Machine Learning to Intelligent Conversations | Type | Book Whole | ||
Year | 2020 | Publication | The Springer Series on Challenges in Machine Learning | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | This volume presents the results of the Neural Information Processing Systems Competition track at the 2018 NeurIPS conference. The competition follows the same format as the 2017 competition track for NIPS. Out of 21 submitted proposals, eight competition proposals were selected, spanning the area of Robotics, Health, Computer Vision, Natural Language Processing, Systems and Physics. Competitions have become an integral part of advancing state-of-the-art in artificial intelligence (AI). They exhibit one important difference to benchmarks: Competitions test a system end-to-end rather than evaluating only a single component; they assess the practicability of an algorithmic solution in addition to assessing feasibility. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | Sergio Escalera; Ralf Herbrich | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 2520-1328 | ISBN | 978-3-030-29134-1 | Medium | |
Area | Expedition | Conference | |||
Notes | HuPBA; does not mention | Approved | no | ||
Call Number | Admin @ si @ HeE2020 | Serial | 3328 | ||
Permanent link to this record | |||||
Author | Yaxing Wang | ||||
Title | Transferring and Learning Representations for Image Generation and Translation | Type | Book Whole | ||
Year | 2020 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Image generation is arguably one of the most attractive, compelling, and challenging tasks in computer vision. Among the methods which perform image generation, generative adversarial networks (GANs) play a key role. The most common image generation models based on GANs can be divided into two main approaches. The first one, called simply image generation, takes random noise as an input and synthesizes an image which follows the same distribution as the images in the training set. The second class, which is called image-to-image translation, aims to map an image from a source domain to one that is indistinguishable from those in the target domain. Image-to-image translation methods can further be divided into paired and unpaired image-to-image translation based on whether they require paired data or not. In this thesis, we aim to address some challenges of both image generation and image-to-image generation. GANs highly rely upon having access to vast quantities of data, and fail to generate realistic images from random noise when applied to domains with few images. To address this problem, we aim to transfer knowledge from a model trained on a large dataset (source domain) to the one learned on limited data (target domain). We find that both GANs and conditional GANs can benefit from models trained on large datasets. Our experiments show that transferring the discriminator is more important than the generator. Using both the generator and discriminator results in the best performance. We found, however, that this method suffers from overfitting, since we update all parameters to adapt to the target data. We propose a novel architecture, which is tailored to address knowledge transfer to very small target domains. Our approach effectively explores which part of the latent space is more related to the target domain. Additionally, the proposed method is able to transfer knowledge from multiple pretrained GANs. Although image-to-image translation has achieved outstanding performance, it still faces several problems. First, for translation between complex domains (such as translations between different modalities) image-to-image translation methods require paired data. We show that when only some of the pairwise translations have been seen (i.e. during training), we can infer the remaining unseen translations (where training pairs are not available). We propose a new approach where we align multiple encoders and decoders in such a way that the desired translation can be obtained by simply cascading the source encoder and the target decoder, even when they have not interacted during the training stage (i.e. unseen). Second, we address the issue of bias in image-to-image translation. Biased datasets unavoidably contain undesired changes, which are due to the fact that the target dataset has a particular underlying visual distribution. We use carefully designed semantic constraints to reduce the effects of the bias. The semantic constraint aims to enforce the preservation of desired image properties. Finally, current approaches fail to generate diverse outputs or perform scalable image transfer in a single model. To alleviate this problem, we propose a scalable and diverse image-to-image translation. We employ random noise to control the diversity. The scalability is determined by conditioning the domain label. Keywords: computer vision, deep learning, imitation learning, adversarial generative networks, image generation, image-to-image translation. | ||||
Address | January 2020 | ||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | Joost Van de Weijer;Abel Gonzalez;Luis Herranz | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-84-121011-5-7 | Medium | ||
Area | Expedition | Conference | |||
Notes | LAMP; 600.141; 600.120 | Approved | no | ||
Call Number | Admin @ si @ Wan2020 | Serial | 3397 | ||
Permanent link to this record | |||||
Author | Jun Wan; Guodong Guo; Sergio Escalera; Hugo Jair Escalante; Stan Z. Li | ||||
Title | Multi-modal Face Presentation Attack Detection | Type | Book Whole | ||
Year | 2020 | Publication | Synthesis Lectures on Computer Vision | Abbreviated Journal | |
Volume | 13 | Issue | Pages | ||
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HuPBA | Approved | no | ||
Call Number | Admin @ si @ WGE2020 | Serial | 3440 | ||
Permanent link to this record | |||||
Author | Pau Riba | ||||
Title | Distilling Structure from Imagery: Graph-based Models for the Interpretation of Document Images | Type | Book Whole | ||
Year | 2020 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | From its early stages, the community of Pattern Recognition and Computer Vision has considered the importance of leveraging the structural information when understanding images. Usually, graphs have been proposed as a suitable model to represent this kind of information due to their flexibility and representational power, able to codify both the components, objects, or entities and their pairwise relationships. Even though graphs have been successfully applied to a huge variety of tasks, as a result of their symbolic and relational nature, graphs have always suffered from some limitations compared to statistical approaches. Indeed, some trivial mathematical operations do not have an equivalent in the graph domain. For instance, at the core of many pattern recognition applications, there is a need to compare two objects. This operation, which is trivial when considering feature vectors defined in \(\mathbb{R}^n\), is not properly defined for graphs. In this thesis, we have investigated the importance of the structural information from two perspectives, the traditional graph-based methods and the new advances in Geometric Deep Learning. On the one hand, we explore the problem of defining a graph representation and how to deal with it in a large-scale and noisy scenario. On the other hand, Graph Neural Networks are proposed to first redefine Graph Edit Distance methodologies as a metric learning problem, and second, to apply them in a real use case scenario for the detection of repetitive patterns which define tables in invoice documents. As an experimental framework, we have validated the different methodological contributions in the domain of Document Image Analysis and Recognition. | ||||
Address | |||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | Josep Llados;Alicia Fornes | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-84-121011-6-4 | Medium | ||
Area | Expedition | Conference | |||
Notes | DAG; 600.121 | Approved | no | ||
Call Number | Admin @ si @ Rib20 | Serial | 3478 | ||
Permanent link to this record |