Records
Author Andreas Møgelmose; Chris Bahnsen; Thomas B. Moeslund; Albert Clapes; Sergio Escalera
Title Tri-modal Person Re-identification with RGB, Depth and Thermal Features Type Conference Article
Year 2013 Publication 9th IEEE Workshop on Perception beyond the visible Spectrum, Computer Vision and Pattern Recognition Abbreviated Journal
Volume Issue Pages 301-307
Keywords
Abstract Person re-identification is about recognizing people who have passed by a sensor earlier. Previous work is mainly based on RGB data, but in this work we present, for the first time, a system that combines RGB, depth, and thermal data for re-identification purposes. First, we obtain particular features from each of the three modalities: from RGB data, we model color information from different regions of the body; from depth data, we compute different soft body biometrics; and from thermal data, we extract local structural information. Then, the three information types are combined in a joint classifier. The tri-modal system is evaluated on a new RGB-D-T dataset, showing successful results in re-identification scenarios.
Address Portland; Oregon; June 2013
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-0-7695-4990-3 Medium
Area Expedition Conference CVPRW
Notes HUPBA;MILAB Approved no
Call Number (down) Admin @ si @ MBM2013 Serial 2253
Permanent link to this record
 

 
Author Meysam Madadi; Hugo Bertiche; Sergio Escalera
Title Deep unsupervised 3D human body reconstruction from a sparse set of landmarks Type Journal Article
Year 2021 Publication International Journal of Computer Vision Abbreviated Journal IJCV
Volume 129 Issue Pages 2499–2512
Keywords
Abstract In this paper we propose the first deep unsupervised approach in human body reconstruction to estimate the body surface from a sparse set of landmarks, so-called DeepMurf. We apply a denoising autoencoder to estimate missing landmarks. Then we apply an attention model to estimate body joints from landmarks. Finally, a cascading network is applied to regress the parameters of a statistical generative model that reconstructs the body. Our set of proposed loss functions allows us to train the network in an unsupervised way. Results on four public datasets show that our approach accurately reconstructs the human body from real-world mocap data.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HUPBA; no proj Approved no
Call Number (down) Admin @ si @ MBE2021 Serial 3654
 

 
Author Meysam Madadi; Hugo Bertiche; Sergio Escalera
Title SMPLR: Deep learning based SMPL reverse for 3D human pose and shape recovery Type Journal Article
Year 2020 Publication Pattern Recognition Abbreviated Journal PR
Volume 106 Issue Pages 107472
Keywords Deep learning; 3D Human pose; Body shape; SMPL; Denoising autoencoder; Volumetric stack hourglass
Abstract In this paper we propose to embed SMPL within a deep-based model to accurately estimate 3D pose and shape from a still RGB image. We use CNN-based 3D joint predictions as an intermediate representation to regress SMPL pose and shape parameters. Later, 3D joints are reconstructed again in the SMPL output. This module can be seen as an autoencoder where the encoder is a deep neural network and the decoder is the SMPL model. We refer to this as SMPL reverse (SMPLR). By implementing SMPLR as an encoder-decoder we avoid the need for complex constraints on pose and shape. Furthermore, given that in-the-wild datasets usually lack accurate 3D annotations, it is desirable to lift 2D joints to 3D without pairing 3D annotations with RGB images. Therefore, we also propose a denoising autoencoder (DAE) module between the CNN and SMPLR, able to lift 2D joints to 3D and partially recover from structured error. We evaluate our method on the SURREAL and Human3.6M datasets, showing improvement over SMPL-based state-of-the-art alternatives by about 4 and 12 mm, respectively.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA; no proj Approved no
Call Number (down) Admin @ si @ MBE2020 Serial 3439
 

 
Author Armin Mehri; Parichehr Behjati; Dario Carpio; Angel Sappa
Title SRFormer: Efficient Yet Powerful Transformer Network for Single Image Super Resolution Type Journal Article
Year 2023 Publication IEEE Access Abbreviated Journal ACCESS
Volume 11 Issue Pages
Keywords
Abstract Recent breakthroughs in single image super resolution have investigated the potential of deep Convolutional Neural Networks (CNNs) to improve performance. However, CNN-based models suffer from their limited fields and their inability to adapt to the input content. Recently, Transformer-based models were presented, which demonstrated major performance gains in Natural Language Processing and Vision tasks while mitigating the drawbacks of CNNs. Nevertheless, Transformer computational complexity can increase quadratically for high-resolution images, and the fact that it ignores the original structure of the image by converting it to a 1D structure can make it problematic to capture local context information and to adapt it for real-time applications. In this paper, we present SRFormer, an efficient yet powerful Transformer-based architecture, with several key design choices in the building of Transformer blocks and Transformer layers that allow us to consider the original structure of the image (i.e., 2D structure) while capturing both local and global dependencies without raising computational demands or memory consumption. We also present a Gated Multi-Layer Perceptron (MLP) Feature Fusion module to aggregate the features of different stages of Transformer blocks by focusing on inter-spatial relationships while adding minor computational costs to the network. We have conducted extensive experiments on several super-resolution benchmark datasets to evaluate our approach. SRFormer demonstrates superior performance compared to state-of-the-art methods from both Transformer and Convolutional networks, with an improvement margin of 0.1∼0.53 dB. Furthermore, while SRFormer has almost the same model size, it outperforms SwinIR by 0.47% and its inference time is half that of SwinIR. The code will be available on GitHub.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MSIAU Approved no
Call Number (down) Admin @ si @ MBC2023 Serial 3887
 

 
Author Meysam Madadi; Hugo Bertiche; Wafa Bouzouita; Isabelle Guyon; Sergio Escalera
Title Learning Cloth Dynamics: 3D+Texture Garment Reconstruction Benchmark Type Conference Article
Year 2021 Publication Proceedings of Machine Learning Research Abbreviated Journal
Volume 133 Issue Pages 57-76
Keywords
Abstract Human avatars are important targets in many computer applications. Accurately tracking, capturing, reconstructing and animating the human body, face and garments in 3D are critical for human-computer interaction, gaming, special effects and virtual reality. In the past, this has required extensive manual animation. Despite the advances in human body and face reconstruction, modeling, learning and analyzing human dynamics still need further attention. In this paper we plan to push the research in this direction, e.g. understanding human dynamics in 2D and 3D, with special attention to garments. We provide a large-scale dataset (more than 2M frames) of animated garments with variable topology and type, called CLOTH3D++. The dataset contains RGBA video sequences paired with their corresponding 3D data. We pay special care to garment dynamics and realistic rendering of RGB data, including lighting, fabric type and texture. With this dataset, we hold a competition at NeurIPS 2020. We design three tracks so participants can compete to develop the best method to perform 3D garment reconstruction in a sequence from (1) 3D-to-3D garments, (2) RGB-to-3D garments, and (3) RGB-to-3D garments plus texture. We also provide a baseline method, based on graph convolutional networks, for each track. Baseline results show that there is a lot of room for improvement. However, due to the challenging nature of the problem, no participant could outperform the baselines.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HUPBA; no proj Approved no
Call Number (down) Admin @ si @ MBB2021 Serial 3655
 

 
Author Jose Marone; Simone Balocco; Marc Bolaños; Jose Massa; Petia Radeva
Title Learning the Lumen Border using a Convolutional Neural Networks classifier Type Conference Article
Year 2016 Publication 19th International Conference on Medical Image Computing and Computer Assisted Intervention Workshop Abbreviated Journal
Volume Issue Pages
Keywords
Abstract IntraVascular UltraSound (IVUS) is a technique allowing the diagnosis of coronary plaque. An accurate (semi-)automatic assessment of the luminal contours could speed up the diagnosis. In most approaches, the information on the vessel shape is obtained by combining a supervised learning step with a local refinement algorithm. In this paper, we explore, for the first time, the use of a Convolutional Neural Network (CNN) architecture that on the one hand is able to extract the optimal image features and at the same time can serve as a supervised classifier to detect the lumen border in IVUS images. The main limitation of CNNs lies in the fact that this technique requires a large amount of training data due to the huge number of parameters that it has. To solve this issue, we introduce a patch classification approach to generate an extended training set from a few annotated images. An accuracy of 93% and an F-score of 71% were obtained with this technique, even when it was applied to challenging frames containing calcified plaques, stents and catheter shadows.
Address Athens; Greece; October 2016
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference MICCAIW
Notes MILAB; Approved no
Call Number (down) Admin @ si @ MBB2016 Serial 2822
 

 
Author Judit Martinez; F. Thomas
Title Efficient Computation of Local Geometric Moments Type Journal Article
Year 2002 Publication IEEE Transactions on Image Processing (IF: 2.553) Abbreviated Journal
Volume 11 Issue 9 Pages 1102-1111
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes Approved no
Call Number (down) Admin @ si @ MaT2002 Serial 271
 

 
Author Armin Mehri; Parichehr Behjati Ardakani; Angel Sappa
Title MPRNet: Multi-Path Residual Network for Lightweight Image Super Resolution Type Conference Article
Year 2021 Publication IEEE Winter Conference on Applications of Computer Vision Abbreviated Journal
Volume Issue Pages 2703-2712
Keywords
Abstract Lightweight super resolution networks are extremely important for real-world applications. In recent years several SR deep learning approaches with outstanding achievement have been introduced by sacrificing memory and computational cost. To overcome this problem, a novel lightweight super resolution network is proposed, which improves the SOTA performance in lightweight SR and performs roughly similarly to computationally expensive networks. The Multi-Path Residual Network is designed with a set of Residual Concatenation Blocks stacked with Adaptive Residual Blocks: (i) to adaptively extract informative features and learn more expressive spatial context information; (ii) to better leverage multi-level representations before the up-sampling stage; and (iii) to allow an efficient information and gradient flow within the network. The proposed architecture also contains a new attention mechanism, the Two-Fold Attention Module, to maximize the representation ability of the model. Extensive experiments show the superiority of our model against other SOTA SR approaches.
Address Virtual; January 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference WACV
Notes MSIAU; 600.130; 600.122 Approved no
Call Number (down) Admin @ si @ MAS2021b Serial 3582
 

 
Author Armin Mehri; Parichehr Behjati Ardakani; Angel Sappa
Title LiNet: A Lightweight Network for Image Super Resolution Type Conference Article
Year 2021 Publication 25th International Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages 7196-7202
Keywords
Abstract This paper proposes a new lightweight network, LiNet, that enhances technical efficiency in lightweight super resolution and operates approximately like very large and costly networks in terms of number of network parameters and operations. The proposed architecture allows the network to learn more abstract properties by avoiding low-level information via multiple links. LiNet introduces a Compact Dense Module, which contains a set of inner and outer blocks, to efficiently extract meaningful information, to better leverage multi-level representations before the upsampling stage, and to allow an efficient information and gradient flow within the network. Experiments on benchmark datasets show that the proposed LiNet achieves favorable performance against lightweight state-of-the-art methods.
Address Virtual; January 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MSIAU; 600.130; 600.122 Approved no
Call Number (down) Admin @ si @ MAS2021a Serial 3583
 

 
Author Martha Mackay; Fernando Alonso; Pere Salamero; Xavier Baro; Jordi Gonzalez; Sergio Escalera
Title Care and caring: future proofing the new demographics Type Conference Article
Year 2015 Publication 6th International Carers Conference Abbreviated Journal
Volume Issue Pages
Keywords
Abstract With an ageing population, the issue of care provision is becoming increasingly important. The simple aspiration of the majority of older people is to live safely and well at home. Housing will be part of health & care integration in the following years and decades. A higher proportion of people will have to rely on informal care through family, friends, neighbors and others who provide care to an older person in need of assistance (around 80% of care across the EU). They do not usually have a formal status and are usually unpaid. We need to ensure that all disabled or chronically ill people can get the help they need without overburdening their families.
The physical and emotional stress of carers is one of the dangers that this dependency can bring. To prevent carer burnout it is necessary to provide new solutions that are affordable and user friendly for families and caregivers.
Address Gothenburg; Sweden; September 2015
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CARERS
Notes HuPBA; ISE; 600.078;MV Approved no
Call Number (down) Admin @ si @ MAS2015b Serial 2678
 

 
Author T. Mouats; N. Aouf; Angel Sappa; Cristhian A. Aguilera-Carrasco; Ricardo Toledo
Title Multi-Spectral Stereo Odometry Type Journal Article
Year 2015 Publication IEEE Transactions on Intelligent Transportation Systems Abbreviated Journal TITS
Volume 16 Issue 3 Pages 1210-1224
Keywords Egomotion estimation; feature matching; multispectral odometry (MO); optical flow; stereo odometry; thermal imagery
Abstract In this paper, we investigate the problem of visual odometry for ground vehicles based on the simultaneous utilization of multispectral cameras. It encompasses a stereo rig composed of an optical (visible) and a thermal sensor. The novelty resides in the localization of the cameras as a stereo setup rather than two monocular cameras of different spectrums. To the best of our knowledge, this is the first time such a task is attempted. Log-Gabor wavelets at different orientations and scales are used to extract interest points from both images. These are then described using a combination of frequency and spatial information within the local neighborhood. Matches between the pairs of multimodal images are computed using the cosine similarity function based on the descriptors. A pyramidal Lucas–Kanade tracker is also introduced to tackle temporal feature matching within challenging sequences of the data sets. The vehicle egomotion is computed from the triangulated 3-D points corresponding to the matched features. A windowed version of bundle adjustment incorporating Gauss–Newton optimization is utilized for motion estimation. An outlier removal scheme is also included within the framework to deal with outliers. Multispectral data sets were generated and used as a test bed. They correspond to real outdoor scenarios captured using our multimodal setup. Finally, detailed results validating the proposed strategy are illustrated.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1524-9050 ISBN Medium
Area Expedition Conference
Notes ADAS; 600.055; 600.076 Approved no
Call Number (down) Admin @ si @ MAS2015a Serial 2533
 

 
Author David Masip
Title Face Classification Using Discriminative Features and Classifier Combination Type Book Whole
Year 2005 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address CVC (UAB)
Corporate Author Thesis Ph.D. thesis
Publisher Place of Publication Editor Jordi Vitria
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 84-933652-3-8 Medium
Area Expedition Conference
Notes OR;MV Approved no
Call Number (down) Admin @ si @ Mas2005b Serial 602
 

 
Author David Masip
Title Dimensionality reduction techniques applied to nearest neighbor classification Type Report
Year 2003 Publication CVC Technical Report #72 Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address CVC (UAB)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes OR;MV Approved no
Call Number (down) Admin @ si @ Mas2003 Serial 519
 

 
Author Marc Masana
Title Lifelong Learning of Neural Networks: Detecting Novelty and Adapting to New Domains without Forgetting Type Book Whole
Year 2020 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Computer vision has gone through considerable changes in the last decade as neural networks have come into common use. As available computational capabilities have grown, neural networks have achieved breakthroughs in many computer vision tasks, and have even surpassed human performance in others. With accuracy being so high, focus has shifted to other issues and challenges. One research direction that saw a notable increase in interest is on lifelong learning systems. Such systems should be capable of efficiently performing tasks, identifying and learning new ones, and should moreover be able to deploy smaller versions of themselves which are experts on specific tasks. In this thesis, we contribute to research on lifelong learning and address the compression and adaptation of networks to small target domains, the incremental learning of networks faced with a variety of tasks, and finally the detection of out-of-distribution samples at inference time.

We explore how knowledge can be transferred from large pretrained models to more task-specific networks capable of running on smaller devices by extracting the most relevant information. Using a pretrained model provides more robust representations and a more stable initialization when learning a smaller task, which leads to higher performance and is known as domain adaptation. However, those models are too large for certain applications that need to be deployed on devices with limited memory and computational capacity. In this thesis we show that, after performing domain adaptation, some learned activations barely contribute to the predictions of the model. Therefore, we propose to apply network compression based on low-rank matrix decomposition using the activation statistics. This results in a significant reduction of the model size and the computational cost.

Like human intelligence, machine intelligence aims to have the ability to learn and remember knowledge. However, when a trained neural network is presented with learning a new task, it ends up forgetting previous ones. This is known as catastrophic forgetting and its avoidance is studied in continual learning. The work presented in this thesis extensively surveys continual learning techniques and presents an approach to avoid catastrophic forgetting in sequential task learning scenarios. Our technique is based on using ternary masks in order to update a network to new tasks, reusing the knowledge of previous ones while not forgetting anything about them. In contrast to earlier work, our masks are applied to the activations of each layer instead of the weights. This considerably reduces the number of parameters to be added for each new task. Furthermore, the analysis of a wide range of work on incremental learning without access to the task-ID provides insight on current state-of-the-art approaches that focus on avoiding catastrophic forgetting by using regularization, rehearsal of previous tasks from a small memory, or compensating for the task-recency bias.

Neural networks trained with a cross-entropy loss force the outputs of the model to tend toward a one-hot encoded vector. This leads to models being overly confident when presented with images or classes that were not present in the training distribution. The capacity of a system to be aware of the boundaries of the learned tasks and identify anomalies or classes which have not been learned yet is key to lifelong learning and autonomous systems. In this thesis, we present a metric learning approach to out-of-distribution detection that learns the task at hand on an embedding space.
Address
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Joost Van de Weijer;Andrew Bagdanov
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-121011-9-5 Medium
Area Expedition Conference
Notes LAMP; 600.120 Approved no
Call Number (down) Admin @ si @ Mas20 Serial 3481
 

 
Author Patricia Marquez
Title A Confidence Framework for the Assessment of Optical Flow Performance Type Book Whole
Year 2015 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Optical Flow (OF) is the input of a wide range of decision support systems such as car driver assistance, UAV guiding or medical diagnosis. In these real situations, the absence of ground truth forces us to assess OF quality using quantities computed from either the sequences or the computed optical flow itself. These quantities are generally known as Confidence Measures (CM). Even if we have a proper confidence measure, we still need a way to evaluate its ability to discard pixels with an OF prone to have a large error. Current approaches only provide a descriptive evaluation of CM performance, but such approaches are not capable of fairly comparing different confidence measures and optical flow algorithms. Thus, it is of prime importance to define a framework and a general road map for the evaluation of optical flow performance.

This thesis provides a framework able to decide which “optical flow – confidence measure” (OF-CM) pairs are best suited for optical flow error bounding given a confidence level determined by a decision support system. To design this framework we cover the following points:

Descriptive scores. As a first step, we summarize and analyze the sources of inaccuracies in the output of optical flow algorithms. Second, we present several descriptive plots that visually assess CM capabilities for OF error bounding. In addition to the descriptive plots, given a plot representing OF-CM capabilities to bound the error, we provide a numeric score that categorizes the plot according to its decreasing profile, that is, a score assessing CM performance.
Statistical framework. We provide a comparison framework that assesses the best suited OF-CM pair for error bounding using a two-stage cascade process. First, we assess the predictive value of the confidence measures by means of a descriptive plot. Then, for a sample of descriptive plots computed over training frames, we obtain a generic curve that will be used for sequences with no ground truth. As a second step, we evaluate the obtained general curve and its capability to really reflect the predictive value of a confidence measure using the variability across training frames by means of ANOVA.

The presented framework has shown its potential in application to clinical decision support systems. In particular, we have analyzed the impact of different image artifacts such as noise and decay on the output of optical flow in a cardiac diagnosis system, and we have improved the navigation inside the bronchial tree in bronchoscopy.
Address July 2015
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Debora Gil;Aura Hernandez
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-943427-2-1 Medium
Area Expedition Conference
Notes IAM; 600.075 Approved no
Call Number (down) Admin @ si @ Mar2015 Serial 2687