Probability-based Dynamic Time Warping and Bag-of-Visual-and-Depth-Words for Human Gesture Recognition in RGB-D
Antonio Hernandez, Miguel Angel Bautista, Xavier Perez Sala, Victor Ponce, Sergio Escalera, Xavier Baro, Oriol Pujol, Cecilio Angulo
Pattern Recognition Letters, 2014, Vol. 50, No. 1, pp. 112-121

We present a methodology to address the problem of human gesture segmentation and recognition in video and depth image sequences. A Bag-of-Visual-and-Depth-Words (BoVDW) model is introduced as an extension of the Bag-of-Visual-Words (BoVW) model. State-of-the-art RGB and depth features, including a newly proposed depth descriptor, are analysed and combined in a late-fusion scheme. The method is integrated into a Human Gesture Recognition pipeline, together with a novel probability-based Dynamic Time Warping (PDTW) algorithm, which is used to perform prior segmentation of idle gestures. The proposed DTW variant uses samples of the same gesture category to build a probabilistic model of that gesture class, driven by a Gaussian Mixture Model. Results of the whole Human Gesture Recognition pipeline on a public data set show better performance than both the standard BoVW model and the standard DTW approach.
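
The idea of a DTW variant whose alignment cost comes from a per-class probabilistic model can be illustrated with a minimal sketch. This is not the authors' PDTW formulation: it assumes that each step of a gesture model is already described by a diagonal-covariance Gaussian mixture, and it uses the negative log-likelihood of a frame as the local cost. The names `gmm_density` and `probabilistic_dtw`, the mixture parameters, and the choice of cost are all hypothetical and chosen only for illustration.

```python
import math


def gmm_density(x, components):
    """Density of feature vector x under a diagonal-covariance GMM.

    components: list of (weight, mean, var) tuples, where mean and var
    are sequences with the same length as x. (Hypothetical layout.)
    """
    total = 0.0
    for weight, mean, var in components:
        log_pdf = 0.0
        for xi, mi, vi in zip(x, mean, var):
            log_pdf += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
        total += weight * math.exp(log_pdf)
    return total


def probabilistic_dtw(model, sequence, eps=1e-300):
    """Accumulated DTW cost between a gesture model and a frame sequence.

    model[i] is the GMM describing the i-th step of the gesture class.
    The local cost is -log p(frame | model[i]), an assumption made for
    this sketch; eps guards against log(0).
    """
    n, m = len(model), len(sequence)
    inf = float("inf")
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            density = gmm_density(sequence[j - 1], model[i - 1])
            cost = -math.log(max(density, eps))
            # Standard DTW recursion over the three admissible predecessors.
            acc[i][j] = cost + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]
```

With a model whose first step is centred near 0 and whose second step is centred near 5, a sequence that follows that order accumulates a lower cost than one that reverses it, which is the property a segmentation threshold on the accumulated cost would rely on.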

Keywords: RGB-D, Bag-of-Words, Dynamic Time Warping, Human Gesture Recognition

DOI: http://dx.doi.org/10.1016/j.patrec.2013.09.009