|
Records |
Links |
|
Author |
Razieh Rastgoo; Kourosh Kiani; Sergio Escalera |
|
|
Title |
Hand sign language recognition using multi-view hand skeleton |
Type |
Journal Article |
|
Year |
2020 |
Publication |
Expert Systems With Applications |
Abbreviated Journal |
ESWA |
|
|
Volume |
150 |
Issue |
|
Pages |
113336 |
|
|
Keywords |
Multi-view hand skeleton; Hand sign language recognition; 3DCNN; Hand pose estimation; RGB video; Hand action recognition |
|
|
Abstract |
Hand sign language recognition from video is a challenging research area in computer vision, which performance is affected by hand occlusion, fast hand movement, illumination changes, or background complexity, just to mention a few. In recent years, deep learning approaches have achieved state-of-the-art results in the field, though previous challenges are not completely solved. In this work, we propose a novel deep learning-based pipeline architecture for efficient automatic hand sign language recognition using Single Shot Detector (SSD), 2D Convolutional Neural Network (2DCNN), 3D Convolutional Neural Network (3DCNN), and Long Short-Term Memory (LSTM) from RGB input videos. We use a CNN-based model which estimates the 3D hand keypoints from 2D input frames. After that, we connect these estimated keypoints to build the hand skeleton by using midpoint algorithm. In order to obtain a more discriminative representation of hands, we project 3D hand skeleton into three views surface images. We further employ the heatmap image of detected keypoints as input for refinement in a stacked fashion. We apply 3DCNNs on the stacked features of hand, including pixel level, multi-view hand skeleton, and heatmap features, to extract discriminant local spatio-temporal features from these stacked inputs. The outputs of the 3DCNNs are fused and fed to a LSTM to model long-term dynamics of hand sign gestures. Analyzing 2DCNN vs. 3DCNN using different number of stacked inputs into the network, we demonstrate that 3DCNN better capture spatio-temporal dynamics of hands. To the best of our knowledge, this is the first time that this multi-modal and multi-view set of hand skeleton features are applied for hand sign language recognition. Furthermore, we present a new large-scale hand sign language dataset, namely RKS-PERSIANSIGN, including 10′000 RGB videos of 100 Persian sign words. Evaluation results of the proposed model on three datasets, NYU, First-Person, and RKS-PERSIANSIGN, indicate that our model outperforms state-of-the-art models in hand sign language recognition, hand pose estimation, and hand action recognition. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
HuPBA; no proj |
Approved |
no |
|
|
Call Number |
Admin @ si @ RKE2020a |
Serial |
3411 |
|
Permanent link to this record |
|
|
|
|
Author |
Antonio Hernandez; Nadezhda Zlateva; Alexander Marinov; Miguel Reyes; Petia Radeva; Dimo Dimov; Sergio Escalera |
|
|
Title |
Human Limb Segmentation in Depth Maps based on Spatio-Temporal Graph Cuts Optimization |
Type |
Journal Article |
|
Year |
2012 |
Publication |
Journal of Ambient Intelligence and Smart Environments |
Abbreviated Journal |
JAISE |
|
|
Volume |
4 |
Issue |
6 |
Pages |
535-546 |
|
|
Keywords |
Multi-modal vision processing; Random Forest; Graph-cuts; multi-label segmentation; human body segmentation |
|
|
Abstract |
We present a framework for object segmentation using depth maps based on Random Forest and Graph-cuts theory, and apply it to the segmentation of human limbs. First, from a set of random depth features, Random Forest is used to infer a set of label probabilities for each data sample. This vector of probabilities is used as unary term in α−β swap Graph-cuts algorithm. Moreover, depth values of spatio-temporal neighboring data points are used as boundary potentials. Results on a new multi-label human depth data set show high performance in terms of segmentation overlapping of the novel methodology compared to classical approaches. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1876-1364 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
MILAB;HuPBA |
Approved |
no |
|
|
Call Number |
Admin @ si @ HZM2012a |
Serial |
2006 |
|
Permanent link to this record |
|
|
|
|
Author |
Albert Clapes; Miguel Reyes; Sergio Escalera |
|
|
Title |
Multi-modal User Identification and Object Recognition Surveillance System |
Type |
Journal Article |
|
Year |
2013 |
Publication |
Pattern Recognition Letters |
Abbreviated Journal |
PRL |
|
|
Volume |
34 |
Issue |
7 |
Pages |
799-808 |
|
|
Keywords |
Multi-modal RGB-Depth data analysis; User identification; Object recognition; Intelligent surveillance; Visual features; Statistical learning |
|
|
Abstract |
We propose an automatic surveillance system for user identification and object recognition based on multi-modal RGB-Depth data analysis. We model a RGBD environment learning a pixel-based background Gaussian distribution. Then, user and object candidate regions are detected and recognized using robust statistical approaches. The system robustly recognizes users and updates the system in an online way, identifying and detecting new actors in the scene. Moreover, segmented objects are described, matched, recognized, and updated online using view-point 3D descriptions, being robust to partial occlusions and local 3D viewpoint rotations. Finally, the system saves the historic of user–object assignments, being specially useful for surveillance scenarios. The system has been evaluated on a novel data set containing different indoor/outdoor scenarios, objects, and users, showing accurate recognition and better performance than standard state-of-the-art approaches. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Elsevier |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
HUPBA; 600.046; 605.203;MILAB |
Approved |
no |
|
|
Call Number |
Admin @ si @ CRE2013 |
Serial |
2248 |
|
Permanent link to this record |
|
|
|
|
Author |
Miguel Reyes; Albert Clapes; Jose Ramirez; Juan R Revilla; Sergio Escalera |
|
|
Title |
Automatic Digital Biometry Analysis based on Depth Maps |
Type |
Journal Article |
|
Year |
2013 |
Publication |
Computers in Industry |
Abbreviated Journal |
COMPUTIND |
|
|
Volume |
64 |
Issue |
9 |
Pages |
1316-1325 |
|
|
Keywords |
Multi-modal data fusion; Depth maps; Posture analysis; Anthropometric data; Musculo-skeletal disorders; Gesture analysis |
|
|
Abstract |
World Health Organization estimates that 80% of the world population is affected by back-related disorders during his life. Current practices to analyze musculo-skeletal disorders (MSDs) are expensive, subjective, and invasive. In this work, we propose a tool for static body posture analysis and dynamic range of movement estimation of the skeleton joints based on 3D anthropometric information from multi-modal data. Given a set of keypoints, RGB and depth data are aligned, depth surface is reconstructed, keypoints are matched, and accurate measurements about posture and spinal curvature are computed. Given a set of joints, range of movement measurements is also obtained. Moreover, gesture recognition based on joint movements is performed to look for the correctness in the development of physical exercises. The system shows high precision and reliable measurements, being useful for posture reeducation purposes to prevent MSDs, as well as tracking the posture evolution of patients in rehabilitation treatments. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Elsevier |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
HuPBA;MILAB |
Approved |
no |
|
|
Call Number |
Admin @ si @ RCR2013 |
Serial |
2252 |
|
Permanent link to this record |
|
|
|
|
Author |
Miguel Angel Bautista; Sergio Escalera; Xavier Baro; Petia Radeva; Jordi Vitria; Oriol Pujol |
|
|
Title |
Minimal Design of Error-Correcting Output Codes |
Type |
Journal Article |
|
Year |
2011 |
Publication |
Pattern Recognition Letters |
Abbreviated Journal |
PRL |
|
|
Volume |
33 |
Issue |
6 |
Pages |
693-702 |
|
|
Keywords |
Multi-class classification; Error-correcting output codes; Ensemble of classifiers |
|
|
Abstract |
IF JCR CCIA 1.303 2009 54/103
The classification of large number of object categories is a challenging trend in the pattern recognition field. In literature, this is often addressed using an ensemble of classifiers. In this scope, the Error-correcting output codes framework has demonstrated to be a powerful tool for combining classifiers. However, most state-of-the-art ECOC approaches use a linear or exponential number of classifiers, making the discrimination of a large number of classes unfeasible. In this paper, we explore and propose a minimal design of ECOC in terms of the number of classifiers. Evolutionary computation is used for tuning the parameters of the classifiers and looking for the best minimal ECOC code configuration. The results over several public UCI datasets and different multi-class computer vision problems show that the proposed methodology obtains comparable (even better) results than state-of-the-art ECOC methodologies with far less number of dichotomizers. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Elsevier |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0167-8655 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
MILAB; OR;HuPBA;MV |
Approved |
no |
|
|
Call Number |
Admin @ si @ BEB2011a |
Serial |
1800 |
|
Permanent link to this record |