toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
   print
  Records Links
Author (up) Marc Masana; Bartlomiej Twardowski; Joost Van de Weijer edit   pdf
openurl 
  Title On Class Orderings for Incremental Learning Type Conference Article
  Year 2020 Publication ICML Workshop on Continual Learning Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract The influence of class orderings in the evaluation of incremental learning has received very little attention. In this paper, we investigate the impact of class orderings for incrementally learned classifiers. We propose a method to compute various orderings for a dataset. The orderings are derived by simulated annealing optimization from the confusion matrix and reflect different incremental learning scenarios, including maximally and minimally confusing tasks. We evaluate a wide range of state-of-the-art incremental learning methods on the proposed orderings. Results show that orderings can have a significant impact on performance and the ranking of the methods.  
  Address Virtual; July 2020  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICMLW  
  Notes LAMP; 600.120 Approved no  
  Call Number Admin @ si @ MTW2020 Serial 3505  
Permanent link to this record
 

 
Author (up) Margarita Torre; Beatriz Remeseiro; Petia Radeva; Fernando Martinez edit  url
doi  openurl
  Title DeepNEM: Deep Network Energy-Minimization for Agricultural Field Segmentation Type Journal Article
  Year 2020 Publication IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing Abbreviated Journal JSTAEOR  
  Volume 13 Issue Pages 726-737  
  Keywords  
  Abstract One of the main characteristics of agricultural fields is that the appearance of different crops and their growth status, in an aerial image, is varied, and has a wide range of radiometric values and high level of variability. The extraction of these fields and their monitoring are activities that require a high level of human intervention. In this article, we propose a novel automatic algorithm, named deep network energy-minimization (DeepNEM), to extract agricultural fields in aerial images. The model-guided process selects the most relevant image clues extracted by a deep network, completes them and finally generates regions that represent the agricultural fields under a minimization scheme. DeepNEM has been tested over a broad range of fields in terms of size, shape, and content. Different measures were used to compare the DeepNEM with other methods, and to prove that it represents an improved approach to achieve a high-quality segmentation of agricultural fields. Furthermore, this article also presents a new public dataset composed of 1200 images with their parcels boundaries annotations.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MILAB Approved no  
  Call Number Admin @ si @ TRR2020 Serial 3410  
Permanent link to this record
 

 
Author (up) Mariona Caros; Maite Garolera; Petia Radeva; Xavier Giro edit   pdf
url  openurl
  Title Automatic Reminiscence Therapy for Dementia Type Conference Article
  Year 2020 Publication 10th ACM International Conference on Multimedia Retrieval Abbreviated Journal  
  Volume Issue Pages 383-387  
  Keywords  
  Abstract With people living longer than ever, the number of cases with dementia such as Alzheimer's disease increases steadily. It affects more than 46 million people worldwide, and it is estimated that in 2050 more than 100 million will be affected. While there are not effective treatments for these terminal diseases, therapies such as reminiscence, that stimulate memories from the past are recommended. Currently, reminiscence therapy takes place in care homes and is guided by a therapist or a carer. In this work, we present an AI-based solution to automatize the reminiscence therapy, which consists in a dialogue system that uses photos as input to generate questions. We run a usability case study with patients diagnosed of mild cognitive impairment that shows they found the system very entertaining and challenging. Overall, this paper presents how reminiscence therapy can be automatized by using machine learning, and deployed to smartphones and laptops, making the therapy more accessible to every person affected by dementia.  
  Address Virtual; October 2020  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICRM  
  Notes Approved no  
  Call Number Admin @ si @ CGR2020 Serial 3529  
Permanent link to this record
 

 
Author (up) Martin Menchon; Estefania Talavera; Jose M. Massa; Petia Radeva edit   pdf
url  openurl
  Title Behavioural Pattern Discovery from Collections of Egocentric Photo-Streams Type Conference Article
  Year 2020 Publication ECCV Workshops Abbreviated Journal  
  Volume 12538 Issue Pages 469-484  
  Keywords  
  Abstract The automatic discovery of behaviour is of high importance when aiming to assess and improve the quality of life of people. Egocentric images offer a rich and objective description of the daily life of the camera wearer. This work proposes a new method to identify a person’s patterns of behaviour from collected egocentric photo-streams. Our model characterizes time-frames based on the context (place, activities and environment objects) that define the images composition. Based on the similarity among the time-frames that describe the collected days for a user, we propose a new unsupervised greedy method to discover the behavioural pattern set based on a novel semantic clustering approach. Moreover, we present a new score metric to evaluate the performance of the proposed algorithm. We validate our method on 104 days and more than 100k images extracted from 7 users. Results show that behavioural patterns can be discovered to characterize the routine of individuals and consequently their lifestyle.  
  Address Virtual; August 2020  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ECCVW  
  Notes MILAB; no proj Approved no  
  Call Number Admin @ si @ MTM2020 Serial 3528  
Permanent link to this record
 

 
Author (up) Meysam Madadi; Hugo Bertiche; Sergio Escalera edit   pdf
url  openurl
  Title SMPLR: Deep learning based SMPL reverse for 3D human pose and shape recovery Type Journal Article
  Year 2020 Publication Pattern Recognition Abbreviated Journal PR  
  Volume 106 Issue Pages 107472  
  Keywords Deep learning; 3D Human pose; Body shape; SMPL; Denoising autoencoder; Volumetric stack hourglass  
  Abstract In this paper we propose to embed SMPL within a deep-based model to accurately estimate 3D pose and shape from a still RGB image. We use CNN-based 3D joint predictions as an intermediate representation to regress SMPL pose and shape parameters. Later, 3D joints are reconstructed again in the SMPL output. This module can be seen as an autoencoder where the encoder is a deep neural network and the decoder is SMPL model. We refer to this as SMPL reverse (SMPLR). By implementing SMPLR as an encoder-decoder we avoid the need of complex constraints on pose and shape. Furthermore, given that in-the-wild datasets usually lack accurate 3D annotations, it is desirable to lift 2D joints to 3D without pairing 3D annotations with RGB images. Therefore, we also propose a denoising autoencoder (DAE) module between CNN and SMPLR, able to lift 2D joints to 3D and partially recover from structured error. We evaluate our method on SURREAL and Human3.6M datasets, showing improvement over SMPL-based state-of-the-art alternatives by about 4 and 12 mm, respectively.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HuPBA; no proj Approved no  
  Call Number Admin @ si @ MBE2020 Serial 3439  
Permanent link to this record
 

 
Author (up) Mikel Menta; Adriana Romero; Joost Van de Weijer edit   pdf
openurl 
  Title Learning to adapt class-specific features across domains for semantic segmentation Type Miscellaneous
  Year 2020 Publication Arxiv Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract arXiv:2001.08311
Recent advances in unsupervised domain adaptation have shown the effectiveness of adversarial training to adapt features across domains, endowing neural networks with the capability of being tested on a target domain without requiring any training annotations in this domain. The great majority of existing domain adaptation models rely on image translation networks, which often contain a huge amount of domain-specific parameters. Additionally, the feature adaptation step often happens globally, at a coarse level, hindering its applicability to tasks such as semantic segmentation, where details are of crucial importance to provide sharp results. In this thesis, we present a novel architecture, which learns to adapt features across domains by taking into account per class information. To that aim, we design a conditional pixel-wise discriminator network, whose output is conditioned on the segmentation masks. Moreover, following recent advances in image translation, we adopt the recently introduced StarGAN architecture as image translation backbone, since it is able to perform translations across multiple domains by means of a single generator network. Preliminary results on a segmentation task designed to assess the effectiveness of the proposed approach highlight the potential of the model, improving upon strong baselines and alternative designs.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; 600.120 Approved no  
  Call Number Admin @ si @ MRW2020 Serial 3545  
Permanent link to this record
 

 
Author (up) Minesh Mathew; Ruben Tito; Dimosthenis Karatzas; R.Manmatha; C.V. Jawahar edit   pdf
url  openurl
  Title Document Visual Question Answering Challenge 2020 Type Conference Article
  Year 2020 Publication 33rd IEEE Conference on Computer Vision and Pattern Recognition – Short paper Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract This paper presents results of Document Visual Question Answering Challenge organized as part of “Text and Documents in the Deep Learning Era” workshop, in CVPR 2020. The challenge introduces a new problem – Visual Question Answering on document images. The challenge comprised two tasks. The first task concerns with asking questions on a single document image. On the other hand, the second task is set as a retrieval task where the question is posed over a collection of images. For the task 1 a new dataset is introduced comprising 50,000 questions-answer(s) pairs defined over 12,767 document images. For task 2 another dataset has been created comprising 20 questions over 14,362 document images which share the same document template.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CVPR  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ MTK2020 Serial 3558  
Permanent link to this record
 

 
Author (up) Mohamed Ali Souibgui; Y.Kessentini; Alicia Fornes edit   pdf
openurl 
  Title A conditional GAN based approach for distorted camera captured documents recovery Type Conference Article
  Year 2020 Publication 4th Mediterranean Conference on Pattern Recognition and Artificial Intelligence Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address Virtual; December 2020  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference MedPRAI  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ SKF2020 Serial 3450  
Permanent link to this record
 

 
Author (up) Oriol Ramos Terrades; Albert Berenguel; Debora Gil edit   pdf
url  openurl
  Title A flexible outlier detector based on a topology given by graph communities Type Miscellaneous
  Year 2020 Publication Arxiv Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Outlier, or anomaly, detection is essential for optimal performance of machine learning methods and statistical predictive models. It is not just a technical step in a data cleaning process but a key topic in many fields such as fraudulent document detection, in medical applications and assisted diagnosis systems or detecting security threats. In contrast to population-based methods, neighborhood based local approaches are simple flexible methods that have the potential to perform well in small sample size unbalanced problems. However, a main concern of local approaches is the impact that the computation of each sample neighborhood has on the method performance. Most approaches use a distance in the feature space to define a single neighborhood that requires careful selection of several parameters. This work presents a local approach based on a local measure of the heterogeneity of sample labels in the feature space considered as a topological manifold. Topology is computed using the communities of a weighted graph codifying mutual nearest neighbors in the feature space. This way, we provide with a set of multiple neighborhoods able to describe the structure of complex spaces without parameter fine tuning. The extensive experiments on real-world data sets show that our approach overall outperforms, both, local and global strategies in multi and single view settings.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes IAM; DAG; 600.139; 600.145; 600.140; 600.121 Approved no  
  Call Number Admin @ si @ RBG2020 Serial 3475  
Permanent link to this record
 

 
Author (up) Pau Riba edit  isbn
openurl 
  Title Distilling Structure from Imagery: Graph-based Models for the Interpretation of Document Images Type Book Whole
  Year 2020 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract From its early stages, the community of Pattern Recognition and Computer Vision has considered the importance of leveraging the structural information when understanding images. Usually, graphs have been proposed as a suitable model to represent this kind of information due to their flexibility and representational power able to codify both, the components, objects, or entities and their pairwise relationship. Even though graphs have been successfully applied to a huge variety of tasks, as a result of their symbolic and relational nature, graphs have always suffered from some limitations compared to statistical approaches. Indeed, some trivial mathematical operations do not have an equivalence in the graph domain. For instance, in the core of many pattern recognition applications, there is a need to compare two objects. This operation, which is trivial when considering feature vectors defined in \(\mathbb{R}^n\), is not properly defined for graphs.


In this thesis, we have investigated the importance of the structural information from two perspectives, the traditional graph-based methods and the new advances on Geometric Deep Learning. On the one hand, we explore the problem of defining a graph representation and how to deal with it on a large scale and noisy scenario. On the other hand, Graph Neural Networks are proposed to first redefine a Graph Edit Distance methodologies as a metric learning problem, and second, to apply them in a real use case scenario for the detection of repetitive patterns which define tables in invoice documents. As experimental framework, we have validated the different methodological contributions in the domain of Document Image Analysis and Recognition.
 
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Josep Llados;Alicia Fornes  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-121011-6-4 Medium  
  Area Expedition Conference  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ Rib20 Serial 3478  
Permanent link to this record
 

 
Author (up) Pau Riba; Andreas Fischer; Josep Llados; Alicia Fornes edit   pdf
url  openurl
  Title Learning Graph Edit Distance by Graph NeuralNetworks Type Miscellaneous
  Year 2020 Publication Arxiv Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract The emergence of geometric deep learning as a novel framework to deal with graph-based representations has faded away traditional approaches in favor of completely new methodologies. In this paper, we propose a new framework able to combine the advances on deep metric learning with traditional approximations of the graph edit distance. Hence, we propose an efficient graph distance based on the novel field of geometric deep learning. Our method employs a message passing neural network to capture the graph structure, and thus, leveraging this information for its use on a distance computation. The performance of the proposed graph distance is validated on two different scenarios. On the one hand, in a graph retrieval of handwritten words~\ie~keyword spotting, showing its superior performance when compared with (approximate) graph edit distance benchmarks. On the other hand, demonstrating competitive results for graph similarity learning when compared with the current state-of-the-art on a recent benchmark dataset.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.121; 600.140; 601.302 Approved no  
  Call Number Admin @ si @ RFL2020 Serial 3555  
Permanent link to this record
 

 
Author (up) Pau Riba; Josep Llados; Alicia Fornes edit  url
openurl 
  Title Hierarchical graphs for coarse-to-fine error tolerant matching Type Journal Article
  Year 2020 Publication Pattern Recognition Letters Abbreviated Journal PRL  
  Volume 134 Issue Pages 116-124  
  Keywords Hierarchical graph representation; Coarse-to-fine graph matching; Graph-based retrieval  
  Abstract During the last years, graph-based representations are experiencing a growing usage in visual recognition and retrieval due to their ability to capture both structural and appearance-based information. Thus, they provide a greater representational power than classical statistical frameworks. However, graph-based representations leads to high computational complexities usually dealt by graph embeddings or approximated matching techniques. Despite their representational power, they are very sensitive to noise and small variations of the input image. With the aim to cope with the time complexity and the variability present in the generated graphs, in this paper we propose to construct a novel hierarchical graph representation. Graph clustering techniques adapted from social media analysis have been used in order to contract a graph at different abstraction levels while keeping information about the topology. Abstract nodes attributes summarise information about the contracted graph partition. For the proposed representations, a coarse-to-fine matching technique is defined. Hence, small graphs are used as a filtering before more accurate matching methods are applied. This approach has been validated in real scenarios such as classification of colour images or retrieval of handwritten words (i.e. word spotting).  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.097; 601.302; 603.057; 600.140; 600.121 Approved no  
  Call Number Admin @ si @ RLF2020 Serial 3349  
Permanent link to this record
 

 
Author (up) Pau Rodriguez; Diego Velazquez; Guillem Cucurull; Josep M. Gonfaus; Xavier Roca; Jordi Gonzalez edit   pdf
doi  openurl
  Title Pay attention to the activations: a modular attention mechanism for fine-grained image recognition Type Journal Article
  Year 2020 Publication IEEE Transactions on Multimedia Abbreviated Journal TMM  
  Volume 22 Issue 2 Pages 502-514  
  Keywords  
  Abstract Fine-grained image recognition is central to many multimedia tasks such as search, retrieval, and captioning. Unfortunately, these tasks are still challenging since the appearance of samples of the same class can be more different than those from different classes. This issue is mainly due to changes in deformation, pose, and the presence of clutter. In the literature, attention has been one of the most successful strategies to handle the aforementioned problems. Attention has been typically implemented in neural networks by selecting the most informative regions of the image that improve classification. In contrast, in this paper, attention is not applied at the image level but to the convolutional feature activations. In essence, with our approach, the neural model learns to attend to lower-level feature activations without requiring part annotations and uses those activations to update and rectify the output likelihood distribution. The proposed mechanism is modular, architecture-independent, and efficient in terms of both parameters and computation required. Experiments demonstrate that well-known networks such as wide residual networks and ResNeXt, when augmented with our approach, systematically improve their classification accuracy and become more robust to changes in deformation and pose and to the presence of clutter. As a result, our proposal reaches state-of-the-art classification accuracies in CIFAR-10, the Adience gender recognition task, Stanford Dogs, and UEC-Food100 while obtaining competitive performance in ImageNet, CIFAR-100, CUB200 Birds, and Stanford Cars. In addition, we analyze the different components of our model, showing that the proposed attention modules succeed in finding the most discriminative regions of the image. Finally, as a proof of concept, we demonstrate that with only local predictions, an augmented neural network can successfully classify an image before reaching any fully connected layer, thus reducing the computational amount up to 10%.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ISE; 600.119; 600.098 Approved no  
  Call Number Admin @ si @ RVC2020a Serial 3417  
Permanent link to this record
 

 
Author (up) Pau Rodriguez; Diego Velazquez; Guillem Cucurull; Josep M. Gonfaus; Xavier Roca; Seiichi Ozawa; Jordi Gonzalez edit  url
doi  openurl
  Title Personality Trait Analysis in Social Networks Based on Weakly Supervised Learning of Shared Images Type Journal Article
  Year 2020 Publication Applied Sciences Abbreviated Journal APPLSCI  
  Volume 10 Issue 22 Pages 8170  
  Keywords sentiment analysis, personality trait analysis; weakly-supervised learning; visual classification; OCEAN model; social networks  
  Abstract Social networks have attracted the attention of psychologists, as the behavior of users can be used to assess personality traits, and to detect sentiments and critical mental situations such as depression or suicidal tendencies. Recently, the increasing amount of image uploads to social networks has shifted the focus from text to image-based personality assessment. However, obtaining the ground-truth requires giving personality questionnaires to the users, making the process very costly and slow, and hindering research on large populations. In this paper, we demonstrate that it is possible to predict which images are most associated with each personality trait of the OCEAN personality model, without requiring ground-truth personality labels. Namely, we present a weakly supervised framework which shows that the personality scores obtained using specific images textually associated with particular personality traits are highly correlated with scores obtained using standard text-based personality questionnaires. We trained an OCEAN trait model based on Convolutional Neural Networks (CNNs), learned from 120K pictures posted with specific textual hashtags, to infer whether the personality scores from the images uploaded by users are consistent with those scores obtained from text. In order to validate our claims, we performed a personality test on a heterogeneous group of 280 human subjects, showing that our model successfully predicts which kind of image will match a person with a given level of a trait. Looking at the results, we obtained evidence that personality is not only correlated with text, but with image content too. Interestingly, different visual patterns emerged from those images most liked by persons with a particular personality trait: for instance, pictures most associated with high conscientiousness usually contained healthy food, while low conscientiousness pictures contained injuries, guns, and alcohol. These findings could pave the way to complement text-based personality questionnaires with image-based questions.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ISE; 600.119 Approved no  
  Call Number Admin @ si @ RVC2020b Serial 3553  
Permanent link to this record
 

 
Author (up) Petia Radeva edit  openurl
  Title Uncertainty Modeling within an End-to-end Framework for Food Image Analysis Type Conference Article
  Year 2020 Publication 1st DELTA Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference DELTA  
  Notes MILAB Approved no  
  Call Number Admin @ si @ Rad2020 Serial 3527  
Permanent link to this record
Select All    Deselect All
 |   | 
Details
   print

Save Citations:
Export Records: