Oscar Lopes, Miguel Reyes, Sergio Escalera, & Jordi Gonzalez. (2014). Spherical Blurred Shape Model for 3-D Object and Pose Recognition: Quantitative Analysis and HCI Applications in Smart Environments. TSMCB - IEEE Transactions on Systems, Man and Cybernetics (Part B), 44(12), 2379–2390.
Abstract: The use of depth maps is of increasing interest after the advent of cheap multisensor devices based on structured light, such as Kinect. In this context, there is a strong need for powerful 3-D shape descriptors able to generate rich object representations. Although several 3-D descriptors have already been proposed in the literature, research on discriminative and computationally efficient descriptors is still an open issue. In this paper, we propose a novel point cloud descriptor called spherical blurred shape model (SBSM) that successfully encodes the structure density and local variabilities of an object based on shape voxel distances and a neighborhood propagation strategy. The proposed SBSM is proven to be rotation and scale invariant, robust to noise and occlusions, highly discriminative for multiple categories of complex objects like the human hand, and computationally efficient since the SBSM complexity is linear in the number of object voxels. Experimental evaluation on public depth multiclass object data, 3-D facial expressions data, and a novel hand poses data set shows significant performance improvements in relation to state-of-the-art approaches. Moreover, the effectiveness of the proposal is also proved for object spotting in 3-D scenes and for real-time automatic hand pose recognition in human computer interaction scenarios.
|
Xavier Perez Sala, Sergio Escalera, Cecilio Angulo, & Jordi Gonzalez. (2014). A survey on model based approaches for 2D and 3D visual human pose recovery. SENS - Sensors, 14(3), 4189–4210.
Abstract: Human Pose Recovery has been studied in the field of Computer Vision for the last 40 years. Several approaches have been reported, and significant improvements have been obtained in both data representation and model design. However, the problem of Human Pose Recovery in uncontrolled environments is far from being solved. In this paper, we define a general taxonomy to group model-based approaches for Human Pose Recovery, which is composed of five main modules: appearance, viewpoint, spatial relations, temporal consistence, and behavior. Subsequently, a methodological comparison is performed following the proposed taxonomy, evaluating current state-of-the-art approaches in the aforementioned five group categories. As a result of this comparison, we discuss the main advantages and drawbacks of the reviewed literature.
Keywords: human pose recovery; human body modelling; behavior analysis; computer vision
|
Frederic Sampedro, Anna Domenech, & Sergio Escalera. (2014). Obtaining quantitative global tumoral state indicators based on whole-body PET/CT scans: A breast cancer case study. NMC - Nuclear Medicine Communications, 35(4), 362–371.
Abstract: Objectives: In this work we address the need for the computation of quantitative global tumoral state indicators from oncological whole-body PET/computed tomography scans. The combination of such indicators with other oncological information such as tumor markers or biopsy results would prove useful in oncological decision-making scenarios.
Materials and methods: From an ordering of 100 breast cancer patients on the basis of oncological state through visual analysis by a consensus of nuclear medicine specialists, a set of numerical indicators computed from image analysis of the PET/computed tomography scan is presented, which attempts to summarize a patient’s oncological state in a quantitative manner taking into consideration the total tumor volume, aggressiveness, and spread.
Results: Results obtained by comparative analysis of the proposed indicators with respect to the experts’ evaluation show up to 87% Pearson’s correlation coefficient when providing expert-guided PET metabolic tumor volume segmentation and 64% correlation when using completely automatic image analysis techniques.
Conclusion: Global quantitative tumor information obtained by whole-body PET/CT image analysis can prove useful in clinical nuclear medicine settings and oncological decision-making scenarios. The completely automatic computation of such indicators would improve their impact, as time efficiency and specialist independence would be achieved.
|
Francesco Ciompi, Oriol Pujol, & Petia Radeva. (2014). ECOC-DRF: Discriminative random fields based on error correcting output codes. PR - Pattern Recognition, 47(6), 2193–2204.
Abstract: We present ECOC-DRF, a framework where potential functions for Discriminative Random Fields are formulated as an ensemble of classifiers. We introduce the label trick, a technique to express transitions in the pairwise potential as meta-classes. This allows any possible transition between labels to be learned independently, without assuming any pre-defined model. The Error Correcting Output Codes matrix is used as the ensemble framework for the combination of margin classifiers. We apply ECOC-DRF to a large set of classification problems, covering synthetic, natural and medical images for binary and multi-class cases, outperforming the state of the art in almost all the experiments.
Keywords: Discriminative random fields; Error-correcting output codes; Multi-class classification; Graphical models
|
Victor Ponce, Sergio Escalera, Marc Perez, Oriol Janes, & Xavier Baro. (2015). Non-Verbal Communication Analysis in Victim-Offender Mediations. PRL - Pattern Recognition Letters, 67(1), 19–27.
Abstract: We present a non-invasive ambient intelligence framework for the semi-automatic analysis of non-verbal communication applied to the restorative justice field. We propose the use of computer vision and social signal processing technologies in real scenarios of Victim–Offender Mediations, applying feature extraction techniques to multi-modal audio-RGB-depth data. We compute a set of behavioral indicators that define communicative cues from the fields of psychology and observational methodology. We test our methodology on data captured in real Victim–Offender Mediation sessions in Catalonia. We define the ground truth based on expert opinions when annotating the observed social responses. Using different state-of-the-art binary classification approaches, our system achieves recognition accuracies of 86% when predicting satisfaction, and 79% when predicting both agreement and receptivity. Applying a regression strategy, we obtain a mean deviation for the predictions between 0.5 and 0.7 in the range [1–5] for the computed social signals.
Keywords: Victim–Offender Mediation; Multi-modal human behavior analysis; Face and gesture recognition; Social signal processing; Computer vision; Machine learning