toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
  Records Links
Author Fatemeh Noroozi; Ciprian Corneanu; Dorota Kamińska; Tomasz Sapiński; Sergio Escalera; Gholamreza Anbarjafari edit   pdf
url  openurl
  Title Survey on Emotional Body Gesture Recognition Type Journal Article
  Year (down) 2021 Publication IEEE Transactions on Affective Computing Abbreviated Journal TAC  
  Volume 12 Issue 2 Pages 505 - 523  
  Keywords  
  Abstract Automatic emotion recognition has become a trending research topic in the past decade. While works based on facial expressions or speech abound, recognizing affect from body gestures remains a less explored topic. We present a new comprehensive survey hoping to boost research in the field. We first introduce emotional body gestures as a component of what is commonly known as “body language” and comment general aspects as gender differences and culture dependence. We then define a complete framework for automatic emotional body gesture recognition. We introduce person detection and comment static and dynamic body pose estimation methods both in RGB and 3D. We then comment the recent literature related to representation learning and emotion recognition from images of emotionally expressive gestures. We also discuss multi-modal approaches that combine speech or face with body gestures for improved emotion recognition. While pre-processing methodologies (e.g. human detection and pose estimation) are nowadays mature technologies fully developed for robust large scale analysis, we show that for emotion recognition the quantity of labelled data is scarce, there is no agreement on clearly defined output spaces and the representations are shallow and largely based on naive geometrical representations.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HUPBA; no proj Approved no  
  Call Number Admin @ si @ NCK2021 Serial 3657  
Permanent link to this record
 

 
Author Kaustubh Kulkarni; Ciprian Corneanu; Ikechukwu Ofodile; Sergio Escalera; Xavier Baro; Sylwia Hyniewska; Juri Allik; Gholamreza Anbarjafari edit   pdf
url  openurl
  Title Automatic Recognition of Facial Displays of Unfelt Emotions Type Journal Article
  Year (down) 2021 Publication IEEE Transactions on Affective Computing Abbreviated Journal TAC  
  Volume 12 Issue 2 Pages 377 - 390  
  Keywords  
  Abstract Humans modify their facial expressions in order to communicate their internal states and sometimes to mislead observers regarding their true emotional states. Evidence in experimental psychology shows that discriminative facial responses are short and subtle. This suggests that such behavior would be easier to distinguish when captured in high resolution at an increased frame rate. We are proposing SASE-FE, the first dataset of facial expressions that are either congruent or incongruent with underlying emotion states. We show that overall the problem of recognizing whether facial movements are expressions of authentic emotions or not can be successfully addressed by learning spatio-temporal representations of the data. For this purpose, we propose a method that aggregates features along fiducial trajectories in a deeply learnt space. Performance of the proposed model shows that on average, it is easier to distinguish among genuine facial expressions of emotion than among unfelt facial expressions of emotion and that certain emotion pairs such as contempt and disgust are more difficult to distinguish than the rest. Furthermore, the proposed methodology improves state of the art results on CK+ and OULU-CASIA datasets for video emotion recognition, and achieves competitive results when classifying facial action units on BP4D datase.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HUPBA; no proj Approved no  
  Call Number Admin @ si @ KCO2021 Serial 3658  
Permanent link to this record
 

 
Author Wenlong Deng; Yongli Mou; Takahiro Kashiwa; Sergio Escalera; Kohei Nagai; Kotaro Nakayama; Yutaka Matsuo; Helmut Prendinger edit  url
openurl 
  Title Vision based Pixel-level Bridge Structural Damage Detection Using a Link ASPP Network Type Journal Article
  Year (down) 2020 Publication Automation in Construction Abbreviated Journal AC  
  Volume 110 Issue Pages 102973  
  Keywords Semantic image segmentation; Deep learning  
  Abstract Structural Health Monitoring (SHM) has greatly benefited from computer vision. Recently, deep learning approaches are widely used to accurately estimate the state of deterioration of infrastructure. In this work, we focus on the problem of bridge surface structural damage detection, such as delamination and rebar exposure. It is well known that the quality of a deep learning model is highly dependent on the quality of the training dataset. Bridge damage detection, our application domain, has the following main challenges: (i) labeling the damages requires knowledgeable civil engineering professionals, which makes it difficult to collect a large annotated dataset; (ii) the damage area could be very small, whereas the background area is large, which creates an unbalanced training environment; (iii) due to the difficulty to exactly determine the extension of the damage, there is often a variation among different labelers who perform pixel-wise labeling. In this paper, we propose a novel model for bridge structural damage detection to address the first two challenges. This paper follows the idea of an atrous spatial pyramid pooling (ASPP) module that is designed as a novel network for bridge damage detection. Further, we introduce the weight balanced Intersection over Union (IoU) loss function to achieve accurate segmentation on a highly unbalanced small dataset. The experimental results show that (i) the IoU loss function improves the overall performance of damage detection, as compared to cross entropy loss or focal loss, and (ii) the proposed model has a better ability to detect a minority class than other light segmentation networks.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HuPBA; no proj Approved no  
  Call Number Admin @ si @ DMK2020 Serial 3314  
Permanent link to this record
 

 
Author Razieh Rastgoo; Kourosh Kiani; Sergio Escalera edit  url
openurl 
  Title Hand sign language recognition using multi-view hand skeleton Type Journal Article
  Year (down) 2020 Publication Expert Systems With Applications Abbreviated Journal ESWA  
  Volume 150 Issue Pages 113336  
  Keywords Multi-view hand skeleton; Hand sign language recognition; 3DCNN; Hand pose estimation; RGB video; Hand action recognition  
  Abstract Hand sign language recognition from video is a challenging research area in computer vision, which performance is affected by hand occlusion, fast hand movement, illumination changes, or background complexity, just to mention a few. In recent years, deep learning approaches have achieved state-of-the-art results in the field, though previous challenges are not completely solved. In this work, we propose a novel deep learning-based pipeline architecture for efficient automatic hand sign language recognition using Single Shot Detector (SSD), 2D Convolutional Neural Network (2DCNN), 3D Convolutional Neural Network (3DCNN), and Long Short-Term Memory (LSTM) from RGB input videos. We use a CNN-based model which estimates the 3D hand keypoints from 2D input frames. After that, we connect these estimated keypoints to build the hand skeleton by using midpoint algorithm. In order to obtain a more discriminative representation of hands, we project 3D hand skeleton into three views surface images. We further employ the heatmap image of detected keypoints as input for refinement in a stacked fashion. We apply 3DCNNs on the stacked features of hand, including pixel level, multi-view hand skeleton, and heatmap features, to extract discriminant local spatio-temporal features from these stacked inputs. The outputs of the 3DCNNs are fused and fed to a LSTM to model long-term dynamics of hand sign gestures. Analyzing 2DCNN vs. 3DCNN using different number of stacked inputs into the network, we demonstrate that 3DCNN better capture spatio-temporal dynamics of hands. To the best of our knowledge, this is the first time that this multi-modal and multi-view set of hand skeleton features are applied for hand sign language recognition. Furthermore, we present a new large-scale hand sign language dataset, namely RKS-PERSIANSIGN, including 10′000 RGB videos of 100 Persian sign words. Evaluation results of the proposed model on three datasets, NYU, First-Person, and RKS-PERSIANSIGN, indicate that our model outperforms state-of-the-art models in hand sign language recognition, hand pose estimation, and hand action recognition.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HuPBA; no proj Approved no  
  Call Number Admin @ si @ RKE2020a Serial 3411  
Permanent link to this record
 

 
Author Shifeng Zhang; Ajian Liu; Jun Wan; Yanyan Liang; Guogong Guo; Sergio Escalera; Hugo Jair Escalante; Stan Z. Li edit  url
doi  openurl
  Title CASIA-SURF: A Dataset and Benchmark for Large-scale Multi-modal Face Anti-spoofing Type Journal
  Year (down) 2020 Publication IEEE Transactions on Biometrics, Behavior, and Identity Science Abbreviated Journal TTBIS  
  Volume 2 Issue 2 Pages 182 - 193  
  Keywords  
  Abstract Face anti-spoofing is essential to prevent face recognition systems from a security breach. Much of the progresses have been made by the availability of face anti-spoofing benchmark datasets in recent years. However, existing face anti-spoofing benchmarks have limited number of subjects (≤170) and modalities (≤2), which hinder the further development of the academic community. To facilitate face anti-spoofing research, we introduce a large-scale multi-modal dataset, namely CASIA-SURF, which is the largest publicly available dataset for face anti-spoofing in terms of both subjects and modalities. Specifically, it consists of 1,000 subjects with 21,000 videos and each sample has 3 modalities ( i.e. , RGB, Depth and IR). We also provide comprehensive evaluation metrics, diverse evaluation protocols, training/validation/testing subsets and a measurement tool, developing a new benchmark for face anti-spoofing. Moreover, we present a novel multi-modal multi-scale fusion method as a strong baseline, which performs feature re-weighting to select the more informative channel features while suppressing the less useful ones for each modality across different scales. Extensive experiments have been conducted on the proposed dataset to verify its significance and generalization capability. The dataset is available at https://sites.google.com/qq.com/face-anti-spoofing/welcome/challengecvpr2019?authuser=0  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HuPBA; no proj Approved no  
  Call Number Admin @ si @ ZLW2020 Serial 3412  
Permanent link to this record
Select All    Deselect All
 |   | 
Details

Save Citations:
Export Records: