|
German Barquero, Johnny Nuñez, Sergio Escalera, Zhen Xu, Wei-Wei Tu, & Isabelle Guyon. (2022). Didn’t see that coming: a survey on non-verbal social human behavior forecasting. In Understanding Social Behavior in Dyadic and Small Group Interactions (Vol. 173, pp. 139–178).
Abstract: Non-verbal social human behavior forecasting has increasingly attracted the interest of the research community in recent years. Its direct applications to human-robot interaction and socially-aware human motion generation make it a very attractive field. In this survey, we define the behavior forecasting problem for multiple interactive agents in a generic way that aims at unifying the fields of social signals prediction and human motion forecasting, traditionally separated. We hold that both problem formulations refer to the same conceptual problem, and identify many shared fundamental challenges: future stochasticity, context awareness, history exploitation, etc. We also propose a taxonomy that comprises
methods published in the last 5 years in a very informative way and describes the current main concerns of the community with regard to this problem. In order to promote further research on this field, we also provide a summarized and friendly overview of audiovisual datasets featuring non-acted social interactions. Finally, we describe the most common metrics used in this task and their particular issues.
|
|
|
Sergio Escalera, R. M. Martinez, Jordi Vitria, Petia Radeva, & Maria Teresa Anguera. (2009). Dominance Detection in Face-to-face Conversations. In 2nd IEEE Workshop on CVPR for Human communicative Behavior analysis (97–102).
Abstract: Dominance is referred to the level of influence a person has in a conversation. Dominance is an important research area in social psychology, but the problem of its automatic estimation is a very recent topic in the contexts of social and wearable computing. In this paper, we focus on dominance detection from visual cues. We estimate the correlation among observers by categorizing the dominant people in a set of face-to-face conversations. Different dominance indicators from gestural communication are defined, manually annotated, and compared to the observers opinion. Moreover, the considered indicators are automatically extracted from video sequences and learnt by using binary classifiers. Results from the three analysis shows a high correlation and allows the categorization of dominant people in public discussion video sequences.
|
|
|
Sergio Escalera, R. M. Martinez, Jordi Vitria, Petia Radeva, & Maria Teresa Anguera. (2010). Deteccion automatica de la dominancia en conversaciones diadicas. EP - Escritos de Psicologia, 3(2), 41–45.
Abstract: Dominance is referred to the level of influence that a person has in a conversation. Dominance is an important research area in social psychology, but the problem of its automatic estimation is a very recent topic in the contexts of social and wearable computing. In this paper, we focus on the dominance detection of visual cues. We estimate the correlation among observers by categorizing the dominant people in a set of face-to-face conversations. Different dominance indicators from gestural communication are defined, manually annotated, and compared to the observers' opinion. Moreover, these indicators are automatically extracted from video sequences and learnt by using binary classifiers. Results from the three analyses showed a high correlation and allows the categorization of dominant people in public discussion video sequences.
Keywords: Dominance detection; Non-verbal communication; Visual features
|
|
|
Oriol Pujol. (1999). Model-based three dimensional interpolation of IVUS images.
|
|
|
Oriol Pujol. (2004). A semi-Supervised Statistical Framework and Generative Snakes for IVUS Analysis (Petia Radeva, Ed.). Ph.D. thesis, , .
|
|
|
Antonio Hernandez. (2010). Pose and Face Recovery via Spatio-temporal GrabCut Human Segmentation (Vol. 153). Master's thesis, , .
|
|
|
Eloi Puertas, Sergio Escalera, & Oriol Pujol. (2010). Classifying Objects at Different Sizes with Multi-Scale Stacked Sequential Learning. In J. Aguilar A. M. R. Alquezar (Ed.), 13th International Conference of the Catalan Association for Artificial Intelligence (Vol. 220, 193–200).
Abstract: Sequential learning is that discipline of machine learning that deals with dependent data. In this paper, we use the Multi-scale Stacked Sequential Learning approach (MSSL) to solve the task of pixel-wise classification based on contextual information. The main contribution of this work is a shifting technique applied during the testing phase that makes possible, thanks to template images, to classify objects at different sizes. The results show that the proposed method robustly classifies such objects capturing their spatial relationships.
|
|
|
Xavier Perez Sala, Cecilio Angulo, & Sergio Escalera. (2011). Biologically Inspired Turn Control in Robot Navigation. In 14th Congrès Català en Intel·ligencia Artificial (pp. 187–196).
Abstract: An exportable and robust system for turn control using only camera images is proposed for path execution in robot navigation. Robot motion information is extracted in the form of optical flow from SURF robust descriptors of consecutive frames in the image sequence. This information is used to compute the instantaneous rotation angle. Finally, control loop is closed correcting robot displacements when it is requested for a turn command. The proposed system has been successfully tested on the four-legged Sony Aibo robot.
|
|
|
Eloi Puertas, Sergio Escalera, & Oriol Pujol. (2011). Multi-Class Multi-Scale Stacked Sequential Learning. In Carlo Sansone, Josef Kittler, & Fabio Roli (Eds.), 10th International Conference on Multiple Classifier Systems (Vol. 6713, pp. 197–206). Springer.
|
|
|
Xavier Perez Sala, Cecilio Angulo, & Sergio Escalera. (2011). Biologically Inspired Path Execution Using SURF Flow in Robot Navigation. In 11th International Work Conference on Artificial Neural Networks (Vol. II, pp. 581–588). Springer Berlin Heidelberg.
Abstract: An exportable and robust system using only camera images is proposed for path execution in robot navigation. Motion information is extracted in the form of optical flow from SURF robust descriptors of consecutive frames, so the method is called SURF flow. This information is used to correct robot displacement when a straight forward path command is sent to the robot, but it is not really executed due to several robot and environmental concerns. The proposed system has been successfully tested on the legged robot Aibo.
|
|
|
Sergio Escalera. (2013). Multi-Modal Human Behaviour Analysis from Visual Data Sources. ERCIM - ERCIM News journal, 21–22.
Abstract: The Human Pose Recovery and Behaviour Analysis group (HuPBA), University of Barcelona, is developing a line of research on multi-modal analysis of humans in visual data. The novel technology is being applied in several scenarios with high social impact, including sign language recognition, assisted technology and supported diagnosis for the elderly and people with mental/physical disabilities, fitness conditioning, and Human Computer Interaction.
|
|
|
Alex Pardo, Albert Clapes, Sergio Escalera, & Oriol Pujol. (2013). Actions in Context: System for people with Dementia. In 2nd International Workshop on Citizen Sensor Networks (Citisen2013) at the European Conference on Complex Systems (pp. 3–14). Springer International Publishing.
Abstract: In the next forty years, the number of people living with dementia is expected to triple. In the last stages, people affected by this disease become dependent. This hinders the autonomy of the patient and has a huge social impact in time, money and effort. Given this scenario, we propose an ubiquitous system capable of recognizing daily specific actions. The system fuses and synchronizes data obtained from two complementary modalities – ambient and egocentric. The ambient approach consists in a fixed RGB-Depth camera for user and object recognition and user-object interaction, whereas the egocentric point of view is given by a personal area network (PAN) formed by a few wearable sensors and a smartphone, used for gesture recognition. The system processes multi-modal data in real-time, performing paralleled task recognition and modality synchronization, showing high performance recognizing subjects, objects, and interactions, showing its reliability to be applied in real case scenarios.
Keywords: Multi-modal data Fusion; Computer vision; Wearable sensors; Gesture recognition; Dementia
|
|
|
Miguel Reyes, Gabriel Dominguez, & Sergio Escalera. (2011). Feature Weighting in Dynamic Time Warping for Gesture Recognition in Depth Data. In 1st IEEE Workshop on Consumer Depth Cameras for Computer Vision (pp. 1182–1188).
Abstract: We present a gesture recognition approach for depth video data based on a novel Feature Weighting approach within the Dynamic Time Warping framework. Depth features from human joints are compared through video sequences using Dynamic Time Warping, and weights are assigned to features based on inter-intra class gesture variability. Feature Weighting in Dynamic Time Warping is then applied for recognizing begin-end of gestures in data sequences. The obtained results recognizing several gestures in depth data show high performance compared with classical Dynamic Time Warping approach.
|
|
|
Alvaro Cepero, Albert Clapes, & Sergio Escalera. (2013). Quantitative analysis of non-verbal communication for competence analysis. In 16th Catalan Conference on Artificial Intelligence (Vol. 256, pp. 105–114).
|
|
|
Albert Clapes, Miguel Reyes, & Sergio Escalera. (2012). User Identification and Object Recognition in Clutter Scenes Based on RGB-Depth Analysis. In 7th Conference on Articulated Motion and Deformable Objects (Vol. 7378, pp. 1–11). LNCS. Springer Berlin Heidelberg.
Abstract: We propose an automatic system for user identification and object recognition based on multi-modal RGB-Depth data analysis. We model a RGBD environment learning a pixel-based background Gaussian distribution. Then, user and object candidate regions are detected and recognized online using robust statistical approaches over RGBD descriptions. Finally, the system saves the historic of user-object assignments, being specially useful for surveillance scenarios. The system has been evaluated on a novel data set containing different indoor/outdoor scenarios, objects, and users, showing accurate recognition and better performance than standard state-of-the-art approaches.
|
|