Author Sergio Escalera; Vassilis Athitsos; Isabelle Guyon
  Title Challenges in multimodal gesture recognition Type Journal Article
  Year 2016 Publication Journal of Machine Learning Research Abbreviated Journal JMLR  
  Volume 17 Issue Pages 1-54  
  Keywords Gesture Recognition; Time Series Analysis; Multimodal Data Analysis; Computer Vision; Pattern Recognition; Wearable sensors; Infrared Cameras; Kinect™
  Abstract This paper surveys the state of the art on multimodal gesture recognition and introduces the JMLR special topic on gesture recognition 2011-2015. We began right at the start of the Kinect™ revolution when inexpensive infrared cameras providing image depth recordings became available. We published papers using this technology and other more conventional methods, including regular video cameras, to record data, thus providing a good overview of uses of machine learning and computer vision using multimodal data in this area of application. Notably, we organized a series of challenges and made available several datasets we recorded for that purpose, including tens of thousands of videos, which are available to conduct further research. We also overview recent state-of-the-art works on gesture recognition based on a proposed taxonomy for gesture recognition, discussing challenges and future lines of research.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor Zhuowen Tu  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HuPBA;MILAB; Approved no
  Call Number Admin @ si @ EAG2016 Serial 2764  
Permanent link to this record
 

 
Author Pejman Rasti; Salma Samiei; Mary Agoyi; Sergio Escalera; Gholamreza Anbarjafari
  Title Robust non-blind color video watermarking using QR decomposition and entropy analysis Type Journal Article
  Year 2016 Publication Journal of Visual Communication and Image Representation Abbreviated Journal JVCIR  
  Volume 38 Issue Pages 838-847  
  Keywords Video watermarking; QR decomposition; Discrete Wavelet Transformation; Chirp Z-transform; Singular value decomposition; Orthogonal–triangular decomposition  
  Abstract Issues such as content identification, document and image security, audience measurement, ownership and copyright among others can be settled by the use of digital watermarking. Many recent video watermarking methods show drops in visual quality of the sequences. The present work addresses the aforementioned issue by introducing a robust and imperceptible non-blind color video frame watermarking algorithm. The method divides frames into moving and non-moving parts. The non-moving part of each color channel is processed separately using a block-based watermarking scheme. Blocks with an entropy lower than the average entropy of all blocks are subject to a further process for embedding the watermark image. Finally a watermarked frame is generated by adding moving parts to it. Several signal processing attacks are applied to each watermarked frame in order to perform experiments and are compared with some recent algorithms. Experimental results show that the proposed scheme is imperceptible and robust against common signal processing attacks.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HuPBA;MILAB; Approved no
  Call Number Admin @ si @RSA2016 Serial 2766  
Permanent link to this record
 

 
Author Cristina Palmero; Albert Clapes; Chris Bahnsen; Andreas Møgelmose; Thomas B. Moeslund; Sergio Escalera
  Title Multi-modal RGB-Depth-Thermal Human Body Segmentation Type Journal Article
  Year 2016 Publication International Journal of Computer Vision Abbreviated Journal IJCV  
  Volume 118 Issue 2 Pages 217-239  
  Keywords Human body segmentation; RGB; Depth; Thermal
  Abstract This work addresses the problem of human body segmentation from multi-modal visual cues as a first stage of automatic human behavior analysis. We propose a novel RGB–depth–thermal dataset along with a multi-modal segmentation baseline. The several modalities are registered using a calibration device and a registration algorithm. Our baseline extracts regions of interest using background subtraction, defines a partitioning of the foreground regions into cells, computes a set of image features on those cells using different state-of-the-art feature extractions, and models the distribution of the descriptors per cell using probabilistic models. A supervised learning algorithm then fuses the output likelihoods over cells in a stacked feature vector representation. The baseline, using Gaussian mixture models for the probabilistic modeling and Random Forest for the stacked learning, is superior to other state-of-the-art methods, obtaining an overlap above 75 % on the novel dataset when compared to the manually annotated ground-truth of human segmentations.  
  Address  
  Corporate Author Thesis  
  Publisher Springer US Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HuPBA;MILAB; Approved no
  Call Number Admin @ si @ PCB2016 Serial 2767  
Permanent link to this record
 

 
Author Gerard Canal; Sergio Escalera; Cecilio Angulo
  Title A Real-time Human-Robot Interaction system based on gestures for assistive scenarios Type Journal Article
  Year 2016 Publication Computer Vision and Image Understanding Abbreviated Journal CVIU  
  Volume 149 Issue Pages 65-77  
  Keywords Gesture recognition; Human Robot Interaction; Dynamic Time Warping; Pointing location estimation  
  Abstract Natural and intuitive human interaction with robotic systems is key to developing robots that assist people easily and effectively. In this paper, a Human Robot Interaction (HRI) system able to recognize gestures usually employed in human non-verbal communication is introduced, and an in-depth study of its usability is performed. The system deals with dynamic gestures such as waving or nodding, which are recognized using a Dynamic Time Warping approach based on gesture-specific features computed from depth maps. A static gesture consisting of pointing at an object is also recognized. The pointed location is then estimated in order to detect candidate objects the user may refer to. When the pointed object is unclear to the robot, a disambiguation procedure by means of either a verbal or gestural dialogue is performed. This skill would lead to the robot picking up an object on behalf of the user, who might have difficulty doing so unaided. The overall system, which is composed of NAO and Wifibot robots, a Kinect™ v2 sensor and two laptops, is first evaluated in a structured lab setup. Then, a broad set of user tests is completed, which allows assessing performance in terms of recognition rates, ease of use and response times.
  Address  
  Corporate Author Thesis  
  Publisher Elsevier B.V. Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HuPBA;MILAB; Approved no
  Call Number Admin @ si @ CEA2016 Serial 2768  
Permanent link to this record
 

 
Author Mohammad Ali Bagheri; Qigang Gao; Sergio Escalera
  Title Action Recognition by Pairwise Proximity Function Support Vector Machines with Dynamic Time Warping Kernels Type Conference Article
  Year 2016 Publication 29th Canadian Conference on Artificial Intelligence Abbreviated Journal  
  Volume 9673 Issue Pages 3-14  
  Keywords  
  Abstract In the context of human action recognition using skeleton data, the 3D trajectories of joint points may be considered as multi-dimensional time series. The traditional recognition technique in the literature is based on time series dis(similarity) measures (such as Dynamic Time Warping). For these general dis(similarity) measures, k-nearest neighbor algorithms are a natural choice. However, k-NN classifiers are known to be sensitive to noise and outliers. In this paper, a new class of Support Vector Machine that is applicable to trajectory classification, such as action recognition, is developed by incorporating an efficient time-series distance measure into the kernel function. More specifically, the derivative of the Dynamic Time Warping (DTW) distance measure is employed as the SVM kernel. In addition, the pairwise proximity learning strategy is utilized in order to make use of non-positive semi-definite (PSD) kernels in the SVM formulation. The recognition results of the proposed technique on two action recognition datasets demonstrate the outperformance of our methodology compared to the state-of-the-art methods. Remarkably, we obtained 89 % accuracy on the well-known MSRAction3D dataset using only 3D trajectories of body joints obtained by Kinect.
  Address Victoria; Canada; May 2016  
  Corporate Author Thesis  
  Publisher Springer International Publishing Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference AI  
  Notes HuPBA;MILAB; Approved no
  Call Number Admin @ si @ BGE2016b Serial 2770  
Permanent link to this record
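
The abstract of the record above relies on Dynamic Time Warping (DTW) as a time-series distance. As a rough illustration only, the following is a minimal sketch of the textbook DTW recurrence on 1-D sequences with an absolute-difference local cost. It is not the derivative DTW variant or the SVM kernel construction used in the paper, and the function name `dtw_distance` is a hypothetical choice made for this example.

```python
def dtw_distance(a, b):
    """Textbook DTW between two 1-D numeric sequences (illustrative sketch).

    Uses an absolute-difference local cost and the standard three-way
    recurrence; runs in O(len(a) * len(b)) time and memory.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the best cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] also matched to an earlier b element
                cost[i][j - 1],      # b[j-1] also matched to an earlier a element
                cost[i - 1][j - 1],  # one-to-one step
            )
    return cost[n][m]
```

Because the warping path may repeat elements, sequences of different lengths can be compared directly, which is the property the abstract exploits for trajectories of varying duration.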
 

 
Author Jun Wan; Yibing Zhao; Shuai Zhou; Isabelle Guyon; Sergio Escalera
  Title ChaLearn Looking at People RGB-D Isolated and Continuous Datasets for Gesture Recognition Type Conference Article
  Year 2016 Publication 29th IEEE Conference on Computer Vision and Pattern Recognition Workshops Abbreviated Journal
  Volume Issue Pages  
  Keywords  
  Abstract In this paper, we present two large video multi-modal datasets for RGB and RGB-D gesture recognition: the ChaLearn LAP RGB-D Isolated Gesture Dataset (IsoGD) and the Continuous Gesture Dataset (ConGD). Both datasets are derived from the ChaLearn Gesture Dataset (CGD) that has a total of more than 50000 gestures for the “one-shot-learning” competition. To increase the potential of the old dataset, we designed new well curated datasets composed of 249 gesture labels, and including 47933 gestures with manually labeled begin and end frames in sequences. Using these datasets we will open two competitions on the CodaLab platform so that researchers can test and compare their methods for “user independent” gesture recognition. The first challenge is designed for gesture spotting and recognition in continuous sequences of gestures while the second one is designed for gesture classification from segmented data. The baseline method based on the bag of visual words model is also presented.
 
  Address Las Vegas; USA; July 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CVPRW  
  Notes HuPBA;MILAB; Approved no
  Call Number Admin @ si @ WZZ2016 Serial 2771  
Permanent link to this record
 

 
Author Mohammad Ali Bagheri; Qigang Gao; Sergio Escalera
  Title Support Vector Machines with Time Series Distance Kernels for Action Classification Type Conference Article
  Year 2016 Publication IEEE Winter Conference on Applications of Computer Vision Abbreviated Journal  
  Volume Issue Pages 1-7  
  Keywords  
  Abstract Despite the outperformance of Support Vector Machine (SVM) on many practical classification problems, the algorithm is not directly applicable to multi-dimensional trajectories having different lengths. In this paper, a new class of SVM that is applicable to trajectory classification, such as action recognition, is developed by incorporating two efficient time-series distance measures into the kernel function. Dynamic Time Warping and Longest Common Subsequence distance measures along with their derivatives are employed as the SVM kernel. In addition, the pairwise proximity learning strategy is utilized in order to make use of non-positive semi-definite kernels in the SVM formulation. The proposed method is employed for a challenging classification problem: action recognition by depth cameras using only skeleton data; and evaluated on three benchmark action datasets. Experimental results demonstrate the outperformance of our methodology compared to the state-of-the-art on the considered datasets.
 
  Address Lake Placid; NY (USA); March 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WACV  
  Notes HuPBA;MILAB; Approved no
  Call Number Admin @ si @ BGE2016a Serial 2773  
Permanent link to this record
 

 
Author Baiyu Chen; Sergio Escalera; Isabelle Guyon; Victor Ponce; N. Shah; Marc Oliu
  Title Overcoming Calibration Problems in Pattern Labeling with Pairwise Ratings: Application to Personality Traits Type Conference Article
  Year 2016 Publication 14th European Conference on Computer Vision Workshops Abbreviated Journal  
  Volume Issue Pages  
  Keywords Calibration of labels; Label bias; Ordinal labeling; Variance Models; Bradley-Terry-Luce model; Continuous labels; Regression; Personality traits; Crowd-sourced labels  
  Abstract We address the problem of calibration of workers whose task is to label patterns with continuous variables, which arises for instance in labeling images of videos of humans with continuous traits. Worker bias is particularly difficult to evaluate and correct when many workers contribute just a few labels, a situation arising typically when labeling is crowd-sourced. In the scenario of labeling short videos of people facing a camera with personality traits, we evaluate the feasibility of the pairwise ranking method to alleviate bias problems. Workers are exposed to pairs of videos at a time and must order by preference. The variable levels are reconstructed by fitting a Bradley-Terry-Luce model with maximum likelihood. This method may, at first sight, seem prohibitively expensive because for N videos, p = N(N-1)/2 pairs must be potentially processed by workers rather than N videos. However, by performing extensive simulations, we determine an empirical law for the scaling of the number of pairs needed as a function of the number of videos in order to achieve a given accuracy of score reconstruction and show that the pairwise method is affordable. We apply the method to the labeling of a large scale dataset of 10,000 videos used in the ChaLearn Apparent Personality Trait challenge.
  Address Amsterdam; The Netherlands; October 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ECCVW  
  Notes HuPBA;MILAB; Approved no
  Call Number Admin @ si @ CEG2016 Serial 2829  
Permanent link to this record
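
To make the cost argument in the abstract of the record above concrete, the sketch below computes the exhaustive pair count p = N(N-1)/2 and the standard Bradley-Terry-Luce preference probability for two latent scores. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, and the exponential (logistic) parameterisation of the BTL model is the common textbook form, which the abstract does not spell out.

```python
import math


def num_pairs(n_items):
    """Number of unordered pairs among n_items: N(N-1)/2."""
    return n_items * (n_items - 1) // 2


def btl_win_probability(score_i, score_j):
    """P(item i is preferred over item j) under a Bradley-Terry-Luce model.

    Assumes the exponential parameterisation
    exp(s_i) / (exp(s_i) + exp(s_j)), written here as a logistic
    function of the score difference.
    """
    return 1.0 / (1.0 + math.exp(score_j - score_i))
```

For the 10,000 videos mentioned in the abstract, `num_pairs(10000)` gives 49,995,000 candidate pairs, which is why the empirical scaling law for the number of pairs actually needed matters.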
 

 
Author Fatemeh Noroozi; Marina Marjanovic; Angelina Njegus; Sergio Escalera; Gholamreza Anbarjafari
  Title Fusion of Classifier Predictions for Audio-Visual Emotion Recognition Type Conference Article
  Year 2016 Publication 23rd International Conference on Pattern Recognition Workshops Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract This paper presents a novel multimodal emotion recognition system based on the analysis of audio and visual cues. MFCC-based features are extracted from the audio channel and facial landmark geometric relations are computed from visual data. Both sets of features are learnt separately using state-of-the-art classifiers. In addition, we summarise each emotion video into a reduced set of key-frames, which are learnt in order to visually discriminate emotions by means of a Convolutional Neural Network. Finally, confidence outputs of all classifiers from all modalities are used to define a new feature space to be learnt for final emotion prediction, in a late fusion/stacking fashion. The conducted experiments on the eNTERFACE’05 database show significant performance improvements of our proposed system in comparison to state-of-the-art approaches.
 
  Address Cancun; Mexico; December 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPRW  
  Notes HuPBA;MILAB; Approved no
  Call Number Admin @ si @ NMN2016 Serial 2839  
Permanent link to this record
 

 
Author Iiris Lusi; Sergio Escalera; Gholamreza Anbarjafari
  Title SASE: RGB-Depth Database for Human Head Pose Estimation Type Conference Article
  Year 2016 Publication 14th European Conference on Computer Vision Workshops Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Slides  
  Address Amsterdam; The Netherlands; October 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ECCVW  
  Notes HuPBA;MILAB; Approved no
  Call Number Admin @ si @ LEA2016a Serial 2840  
Permanent link to this record
 

 
Author Marc Oliu; Ciprian Corneanu; Kamal Nasrollahi; Olegs Nikisins; Sergio Escalera; Yunlian Sun; Haiqing Li; Zhenan Sun; Thomas B. Moeslund; Modris Greitans
  Title Improved RGB-D-T based Face Recognition Type Journal Article
  Year 2016 Publication IET Biometrics Abbreviated Journal BIO  
  Volume 5 Issue 4 Pages 297-303
  Keywords  
  Abstract Reliable facial recognition systems are of crucial importance in various applications from entertainment to security. Thanks to the deep-learning concepts introduced in the field, a significant improvement in the performance of the unimodal facial recognition systems has been observed in the recent years. At the same time a multimodal facial recognition is a promising approach. This study combines the latest successes in both directions by applying deep learning convolutional neural networks (CNN) to the multimodal RGB, depth, and thermal (RGB-D-T) based facial recognition problem outperforming previously published results. Furthermore, a late fusion of the CNN-based recognition block with various hand-crafted features (local binary patterns, histograms of oriented gradients, Haar-like rectangular features, histograms of Gabor ordinal measures) is introduced, demonstrating even better recognition performance on a benchmark RGB-D-T database. The obtained results in this study show that the classical engineered features and CNN-based features can complement each other for recognition purposes.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HuPBA;MILAB; Approved no
  Call Number Admin @ si @ OCN2016 Serial 2854  
Permanent link to this record
 

 
Author Oriol Pujol
  Title Model-based three dimensional interpolation of IVUS images Type Report
  Year 1999 Publication CVC Technical Report #27 Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address CVC (UAB)  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HuPBA;MILAB Approved no
  Call Number BCNPCL @ bcnpcl @ Puj1999 Serial 49  
Permanent link to this record
 

 
Author Oriol Pujol
  Title A semi-Supervised Statistical Framework and Generative Snakes for IVUS Analysis Type Book Whole
  Year 2004 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address CVC (UAB), Bellaterra  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Place of Publication Editor Petia Radeva  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HuPBA;MILAB Approved no
  Call Number BCNPCL @ bcnpcl @ Puj2004 Serial 512  
Permanent link to this record
 

 
Author Antonio Hernandez
  Title Pose and Face Recovery via Spatio-temporal GrabCut Human Segmentation Type Report
  Year 2010 Publication CVC Technical Report Abbreviated Journal  
  Volume 153 Issue Pages  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis Master's thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HUPBA;MILAB Approved no
  Call Number Admin @ si @ Her2010 Serial 1347  
Permanent link to this record
 

 
Author Eloi Puertas; Sergio Escalera; Oriol Pujol
  Title Classifying Objects at Different Sizes with Multi-Scale Stacked Sequential Learning Type Conference Article
  Year 2010 Publication 13th International Conference of the Catalan Association for Artificial Intelligence Abbreviated Journal  
  Volume 220 Issue Pages 193–200  
  Keywords  
  Abstract Sequential learning is the discipline of machine learning that deals with dependent data. In this paper, we use the Multi-scale Stacked Sequential Learning approach (MSSL) to solve the task of pixel-wise classification based on contextual information. The main contribution of this work is a shifting technique applied during the testing phase that makes it possible, thanks to template images, to classify objects at different sizes. The results show that the proposed method robustly classifies such objects, capturing their spatial relationships.
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor R. Alquezar, A. Moreno, J. Aguilar  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-1-60750-642-3 Medium  
  Area Expedition Conference CCIA  
  Notes HUPBA;MILAB Approved no
  Call Number BCNPCL @ bcnpcl @ PEP2010 Serial 1448  
Permanent link to this record