Fatemeh Noroozi, Marina Marjanovic, Angelina Njegus, Sergio Escalera, & Gholamreza Anbarjafari. (2016). Fusion of Classifier Predictions for Audio-Visual Emotion Recognition. In 23rd International Conference on Pattern Recognition Workshops.
Abstract: In this paper is presented a novel multimodal emotion recognition system which is based on the analysis of audio and visual cues. MFCC-based features are extracted from the audio channel and facial landmark geometric relations are
computed from visual data. Both sets of features are learnt separately using state-of-the-art classifiers. In addition, we summarise each emotion video into a reduced set of key-frames, which are learnt in order to visually discriminate emotions by means of a Convolutional Neural Network. Finally, confidence
outputs of all classifiers from all modalities are used to define a new feature space to be learnt for final emotion prediction, in a late fusion/stacking fashion. The conducted experiments on eNTERFACE’05 database show significant performance improvements of our proposed system in comparison to state-of-the-art approaches.
|
Iiris Lusi, Sergio Escalera, & Gholamreza Anbarjafari. (2016). SASE: RGB-Depth Database for Human Head Pose Estimation. In 14th European Conference on Computer Vision Workshops.
|
Xavier Perez Sala, Fernando De la Torre, Laura Igual, Sergio Escalera, & Cecilio Angulo. (2017). Subspace Procrustes Analysis. IJCV - International Journal of Computer Vision, 121(3), 327–343.
Abstract: Procrustes Analysis (PA) has been a popular technique to align and build 2-D statistical models of shapes. Given a set of 2-D shapes PA is applied to remove rigid transformations. Then, a non-rigid 2-D model is computed by modeling (e.g., PCA) the residual. Although PA has been widely used, it has several limitations for modeling 2-D shapes: occluded landmarks and missing data can result in local minima solutions, and there is no guarantee that the 2-D shapes provide a uniform sampling of the 3-D space of rotations for the object. To address previous issues, this paper proposes Subspace PA (SPA). Given several
instances of a 3-D object, SPA computes the mean and a 2-D subspace that can simultaneously model all rigid and non-rigid deformations of the 3-D object. We propose a discrete (DSPA) and continuous (CSPA) formulation for SPA, assuming that 3-D samples of an object are provided. DSPA extends the traditional PA, and produces unbiased 2-D models by uniformly sampling different views of the 3-D object. CSPA provides a continuous approach to uniformly sample the space of 3-D rotations, being more efficient in space and time. Experiments using SPA to learn 2-D models of bodies from motion capture data illustrate the benefits of our approach.
|
Frederic Sampedro, Anna Domenech, Sergio Escalera, & Ignasi Carrio. (2017). Computing quantitative indicators of structural renal damage in pediatric DMSA scans. REMNIM - Revista Española de Medicina Nuclear e Imagen Molecular, 36(2), 72–77.
Abstract: OBJECTIVES:
The proposal and implementation of a computational framework for the quantification of structural renal damage from 99mTc-dimercaptosuccinic acid (DMSA) scans. The aim of this work is to propose, implement, and validate a computational framework for the quantification of structural renal damage from DMSA scans and in an observer-independent manner.
MATERIALS AND METHODS:
From a set of 16 pediatric DMSA-positive scans and 16 matched controls and using both expert-guided and automatic approaches, a set of image-derived quantitative indicators was computed based on the relative size, intensity and histogram distribution of the lesion. A correlation analysis was conducted in order to investigate the association of these indicators with other clinical data of interest in this scenario, including C-reactive protein (CRP), white cell count, vesicoureteral reflux, fever, relative perfusion, and the presence of renal sequelae in a 6-month follow-up DMSA scan.
RESULTS:
A fully automatic lesion detection and segmentation system was able to successfully classify DMSA-positive from negative scans (AUC=0.92, sensitivity=81% and specificity=94%). The image-computed relative size of the lesion correlated with the presence of fever and CRP levels (p<0.05), and a measurement derived from the distribution histogram of the lesion obtained significant performance results in the detection of permanent renal damage (AUC=0.86, sensitivity=100% and specificity=75%).
CONCLUSIONS:
The proposal and implementation of a computational framework for the quantification of structural renal damage from DMSA scans showed a promising potential to complement visual diagnosis and non-imaging indicators.
|
Mikkel Thogersen, Sergio Escalera, Jordi Gonzalez, & Thomas B. Moeslund. (2016). Segmentation of RGB-D Indoor scenes by Stacking Random Forests and Conditional Random Fields. PRL - Pattern Recognition Letters, 80, 208–215.
Abstract: This paper proposes a technique for RGB-D scene segmentation using Multi-class
Multi-scale Stacked Sequential Learning (MMSSL) paradigm. Following recent trends in state-of-the-art, a base classifier uses an initial SLIC segmentation to obtain superpixels which provide a diminution of data while retaining object boundaries. A series of color and depth features are extracted from the superpixels, and are used in a Conditional Random Field (CRF) to predict superpixel labels. Furthermore, a Random Forest (RF) classifier using random offset features is also used as an input to the CRF, acting as an initial prediction. As a stacked classifier, another Random Forest is used acting on a spatial multi-scale decomposition of the CRF confidence map to correct the erroneous labels assigned by the previous classifier. The model is tested on the popular NYU-v2 dataset.
The approach shows that simple multi-modal features with the power of the MMSSL
paradigm can achieve better performance than state of the art results on the same dataset.
|
Jose Garcia-Rodriguez, Isabelle Guyon, Sergio Escalera, Alexandra Psarrou, Andrew Lewis, & Miguel Cazorla. (2017). Editorial: Special Issue on Computational Intelligence for Vision and Robotics. Neural Computing and Applications - Neural Computing and Applications, 28(5), 853–854.
|
Pejman Rasti, Tonis Uiboupin, Sergio Escalera, & Gholamreza Anbarjafari. (2016). Convolutional Neural Network Super Resolution for Face Recognition in Surveillance Monitoring. In 9th Conference on Articulated Motion and Deformable Objects.
|
Dennis H. Lundtoft, Kamal Nasrollahi, Thomas B. Moeslund, & Sergio Escalera. (2016). Spatiotemporal Facial Super-Pixels for Pain Detection. In 9th Conference on Articulated Motion and Deformable Objects.
Abstract: Best student paper award.
Pain detection using facial images is of critical importance in many Health applications. Since pain is a spatiotemporal process, recent works on this topic employ facial spatiotemporal features to detect pain. These systems extract such features from the entire area of the face. In this paper, we show that by employing super-pixels we can divide the face into three regions, in a way that only one of these regions (about one third of the face) contributes to the pain estimation and the other two regions can be discarded. The experimental results on the UNBCMcMaster database show that the proposed system using this single region outperforms state-of-the-art systems in detecting no-pain scenarios, while it reaches comparable results in detecting weak and severe pain scenarios.
Keywords: Facial images; Super-pixels; Spatiotemporal filters; Pain detection
|
Mark Philip Philipsen, Anders Jorgensen, Thomas B. Moeslund, & Sergio Escalera. (2016). RGB-D Segmentation of Poultry Entrails. In 9th Conference on Articulated Motion and Deformable Objects.
Abstract: Best commercial paper award.
|
Sergio Escalera, Mercedes Torres-Torres, Brais Martinez, Xavier Baro, Hugo Jair Escalante, Isabelle Guyon, et al. (2016). ChaLearn Looking at People and Faces of the World: Face AnalysisWorkshop and Challenge 2016. In 29th IEEE Conference on Computer Vision and Pattern Recognition Workshops.
Abstract: We present the 2016 ChaLearn Looking at People and Faces of the World Challenge and Workshop, which ran three competitions on the common theme of face analysis from still images. The first one, Looking at People, addressed age estimation, while the second and third competitions, Faces of the World, addressed accessory classification and smile and gender classification, respectively. We present two crowd-sourcing methodologies used to collect manual annotations. A custom-build application was used to collect and label data about the apparent age of people (as opposed to the real age). For the Faces of the World data, the citizen-science Zooniverse platform was used. This paper summarizes the three challenges and the data used, as well as the results achieved by the participants of the competitions. Details of the ChaLearn LAP FotW competitions can be found at http://gesture.chalearn.org.
|
Antonio Esteban Lansaque, Carles Sanchez, Agnes Borras, Marta Diez-Ferrer, Antoni Rosell, & Debora Gil. (2016). Stable Airway Center Tracking for Bronchoscopic Navigation. In 28th Conference of the international Society for Medical Innovation and Technology.
Abstract: Bronchoscopists use X‐ray fluoroscopy to guide bronchoscopes to the lesion to be biopsied without any kind of incisions. Reducing exposure to X‐ray is important for both patients and doctors but alternatives like electromagnetic navigation require specific equipment and increase the cost of the clinical procedure. We propose a guiding system based on the extraction of airway centers from intra‐operative videos. Such anatomical landmarks could be
matched to the airway centerline extracted from a pre‐planned CT to indicate the best path to the lesion. We present an extraction of lumen centers
from intra‐operative videos based on tracking of maximal stable regions of energy maps.
|
Sergio Escalera, Jordi Gonzalez, Xavier Baro, & Jamie Shotton. (2016). Guest Editor Introduction to the Special Issue on Multimodal Human Pose Recovery and Behavior Analysis. TPAMI - IEEE Transactions on Pattern Analysis and Machine Intelligence, 28, 1489–1491.
Abstract: The sixteen papers in this special section focus on human pose recovery and behavior analysis (HuPBA). This is one of the most challenging topics in computer vision, pattern analysis, and machine learning. It is of critical importance for application areas that include gaming, computer interaction, human robot interaction, security, commerce, assistive technologies and rehabilitation, sports, sign language recognition, and driver assistance technology, to mention just a few. In essence, HuPBA requires dealing with the articulated nature of the human body, changes in appearance due to clothing, and the inherent problems of clutter scenes, such as background artifacts, occlusions, and illumination changes. These papers represent the most recent research in this field, including new methods considering still images, image sequences, depth data, stereo vision, 3D vision, audio, and IMUs, among others.
|
Sergio Escalera, Jordi Gonzalez, Xavier Baro, Fernando Alonso, & Martha Mackay. (2016). Care Respite: a remote monitoring eHealth system for improving ambient assisted living. In Human Motion Analysis for Healthcare Applications.
Abstract: Advances in technology that capture human motion have been quite remarkable during the last five years. New sensors have been developed, such as the Microsoft Kinect, Asus Xtion Pro live, PrimeSense Carmine and Leap Motion. Their main advantages are their non-intrusive nature, low cost and widely available support for developers offered by large corporations or Open Communities. Although they were originally developed for computer games, they have inspired numerous healthcare related ideas and projects in areas such as Medical Disorder Diagnosis, Assisted Living, Rehabilitation and Surgery.
In Assisted Living, human motion analysis allows continuous monitoring of elderly and vulnerable people and their activities to potentially detect life-threatening events such as falls. Human motion analysis in rehabilitation provides the opportunity for motivating patients through gamification, evaluating prescribed programmes of exercises and assessing patients’ progress. In operating theatres, surgeons may use a gesture-based interface to access medical information or control a tele-surgery system. Human motion analysis may also be used to diagnose a range of mental and physical diseases and conditions.
This event will discuss recent advances in human motion sensing and provide an application to healthcare for networking and exploring potential synergies and collaborations.
|
Jose Ramirez Moreno, Juan R Revilla, Miguel Reyes, & Sergio Escalera. (2016). Validación del Software ADIBAS asociado al sensor Kinect de Microsoft para la evaluación de la posición corporal. In 4th Congreso WCPT-SAR.
|
Marc Oliu, Ciprian Corneanu, Kamal Nasrollahi, Olegs Nikisins, Sergio Escalera, Yunlian Sun, et al. (2016). Improved RGB-D-T based Face Recognition. BIO - IET Biometrics, 5(4), 297–303.
Abstract: Reliable facial recognition systems are of crucial importance in various applications from entertainment to security. Thanks to the deep-learning concepts introduced in the field, a significant improvement in the performance of the unimodal facial recognition systems has been observed in the recent years. At the same time a multimodal facial recognition is a promising approach. This study combines the latest successes in both directions by applying deep learning convolutional neural networks (CNN) to the multimodal RGB, depth, and thermal (RGB-D-T) based facial recognition problem outperforming previously published results. Furthermore, a late fusion of the CNN-based recognition block with various hand-crafted features (local binary patterns, histograms of oriented gradients, Haar-like rectangular features, histograms of Gabor ordinal measures) is introduced, demonstrating even better recognition performance on a benchmark RGB-D-T database. The obtained results in this study show that the classical engineered features and CNN-based features can complement each other for recognition purposes.
|