|
Records |
Links |
|
Author |
Andre Litvin; Kamal Nasrollahi; Sergio Escalera; Cagri Ozcinar; Thomas B. Moeslund; Gholamreza Anbarjafari |

|
|
Title |
A Novel Deep Network Architecture for Reconstructing RGB Facial Images from Thermal for Face Recognition |
Type  |
Journal Article |
|
Year |
2019 |
Publication |
Multimedia Tools and Applications |
Abbreviated Journal |
MTAP |
|
|
Volume |
78 |
Issue |
18 |
Pages |
25259–25271 |
|
|
Keywords |
Fully convolutional networks; FusionNet; Thermal imaging; Face recognition |
|
|
Abstract |
This work proposes a fully convolutional network architecture for RGB face image generation from a given input thermal face image to be applied in face recognition scenarios. The proposed method is based on the FusionNet architecture and increases robustness against overfitting using dropout after bridge connections, randomised leaky ReLUs (RReLUs), and orthogonal regularization. Furthermore, we propose to use a decoding block with resize convolution instead of transposed convolution to improve final RGB face image generation. To validate our proposed network architecture, we train a face classifier and compare its face recognition rate on the reconstructed RGB images from the proposed architecture, to those when reconstructing images with the original FusionNet, as well as when using the original RGB images. As a result, we are introducing a new architecture which leads to a more accurate network. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
HuPBA; no menciona;MILAB |
Approved |
no |
|
|
Call Number |
Admin @ si @ LNE2019 |
Serial |
3318 |
|
Permanent link to this record |
|
|
|
|
Author |
Ikechukwu Ofodile; Ahmed Helmi; Albert Clapes; Egils Avots; Kerttu Maria Peensoo; Sandhra Mirella Valdma; Andreas Valdmann; Heli Valtna Lukner; Sergey Omelkov; Sergio Escalera; Cagri Ozcinar; Gholamreza Anbarjafari |


|
|
Title |
Action recognition using single-pixel time-of-flight detection |
Type  |
Journal Article |
|
Year |
2019 |
Publication |
Entropy |
Abbreviated Journal |
ENTROPY |
|
|
Volume |
21 |
Issue |
4 |
Pages |
414 |
|
|
Keywords |
single pixel single photon image acquisition; time-of-flight; action recognition |
|
|
Abstract |
Action recognition is a challenging task that plays an important role in many robotic systems, which highly depend on visual input feeds. However, due to privacy concerns, it is important to find a method which can recognise actions without using visual feed. In this paper, we propose a concept for detecting actions while preserving the test subject’s privacy. Our proposed method relies only on recording the temporal evolution of light pulses scattered back from the scene.
Such data trace to record one action contains a sequence of one-dimensional arrays of voltage values acquired by a single-pixel detector at 1 GHz repetition rate. Information about both the distance to the object and its shape are embedded in the traces. We apply machine learning in the form of recurrent neural networks for data analysis and demonstrate successful action recognition. The experimental results show that our proposed method could achieve on average 96.47% accuracy on the actions walking forward, walking backwards, sitting down, standing up and waving hand, using recurrent
neural network. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
HuPBA; no proj;MILAB |
Approved |
no |
|
|
Call Number |
Admin @ si @ OHC2019 |
Serial |
3319 |
|
Permanent link to this record |
|
|
|
|
Author |
Hugo Jair Escalante; Heysem Kaya; Albert Ali Salah; Sergio Escalera; Yagmur Gucluturk; Umut Guçlu; Xavier Baro; Isabelle Guyon; Julio C. S. Jacques Junior; Meysam Madadi; Stephane Ayache; Evelyne Viegas; Furkan Gurpinar; Achmadnoer Sukma Wicaksana; Cynthia Liem; Marcel A. J. Van Gerven; Rob Van Lier |


|
|
Title |
Modeling, Recognizing, and Explaining Apparent Personality from Videos |
Type  |
Journal Article |
|
Year |
2022 |
Publication |
IEEE Transactions on Affective Computing |
Abbreviated Journal |
TAC |
|
|
Volume |
13 |
Issue |
2 |
Pages |
894-911 |
|
|
Keywords |
|
|
|
Abstract |
Explainability and interpretability are two critical aspects of decision support systems. Despite their importance, it is only recently that researchers are starting to explore these aspects. This paper provides an introduction to explainability and interpretability in the context of apparent personality recognition. To the best of our knowledge, this is the first effort in this direction. We describe a challenge we organized on explainability in first impressions analysis from video. We analyze in detail the newly introduced data set, evaluation protocol, proposed solutions and summarize the results of the challenge. We investigate the issue of bias in detail. Finally, derived from our study, we outline research opportunities that we foresee will be relevant in this area in the near future. |
|
|
Address |
1 April-June 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
HuPBA; no menciona;MV;OR;MILAB |
Approved |
no |
|
|
Call Number |
Admin @ si @ EKS2022 |
Serial |
3406 |
|
Permanent link to this record |
|
|
|
|
Author |
Razieh Rastgoo; Kourosh Kiani; Sergio Escalera |

|
|
Title |
Hand sign language recognition using multi-view hand skeleton |
Type  |
Journal Article |
|
Year |
2020 |
Publication |
Expert Systems With Applications |
Abbreviated Journal |
ESWA |
|
|
Volume |
150 |
Issue |
|
Pages |
113336 |
|
|
Keywords |
Multi-view hand skeleton; Hand sign language recognition; 3DCNN; Hand pose estimation; RGB video; Hand action recognition |
|
|
Abstract |
Hand sign language recognition from video is a challenging research area in computer vision, which performance is affected by hand occlusion, fast hand movement, illumination changes, or background complexity, just to mention a few. In recent years, deep learning approaches have achieved state-of-the-art results in the field, though previous challenges are not completely solved. In this work, we propose a novel deep learning-based pipeline architecture for efficient automatic hand sign language recognition using Single Shot Detector (SSD), 2D Convolutional Neural Network (2DCNN), 3D Convolutional Neural Network (3DCNN), and Long Short-Term Memory (LSTM) from RGB input videos. We use a CNN-based model which estimates the 3D hand keypoints from 2D input frames. After that, we connect these estimated keypoints to build the hand skeleton by using midpoint algorithm. In order to obtain a more discriminative representation of hands, we project 3D hand skeleton into three views surface images. We further employ the heatmap image of detected keypoints as input for refinement in a stacked fashion. We apply 3DCNNs on the stacked features of hand, including pixel level, multi-view hand skeleton, and heatmap features, to extract discriminant local spatio-temporal features from these stacked inputs. The outputs of the 3DCNNs are fused and fed to a LSTM to model long-term dynamics of hand sign gestures. Analyzing 2DCNN vs. 3DCNN using different number of stacked inputs into the network, we demonstrate that 3DCNN better capture spatio-temporal dynamics of hands. To the best of our knowledge, this is the first time that this multi-modal and multi-view set of hand skeleton features are applied for hand sign language recognition. Furthermore, we present a new large-scale hand sign language dataset, namely RKS-PERSIANSIGN, including 10′000 RGB videos of 100 Persian sign words. Evaluation results of the proposed model on three datasets, NYU, First-Person, and RKS-PERSIANSIGN, indicate that our model outperforms state-of-the-art models in hand sign language recognition, hand pose estimation, and hand action recognition. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
HuPBA; no proj;MILAB |
Approved |
no |
|
|
Call Number |
Admin @ si @ RKE2020a |
Serial |
3411 |
|
Permanent link to this record |
|
|
|
|
Author |
Yunan Li; Jun Wan; Qiguang Miao; Sergio Escalera; Huijuan Fang; Huizhou Chen; Xiangda Qi; Guodong Guo |

|
|
Title |
CR-Net: A Deep Classification-Regression Network for Multimodal Apparent Personality Analysis |
Type  |
Journal Article |
|
Year |
2020 |
Publication |
International Journal of Computer Vision |
Abbreviated Journal |
IJCV |
|
|
Volume |
128 |
Issue |
|
Pages |
2763–2780 |
|
|
Keywords |
|
|
|
Abstract |
First impressions strongly influence social interactions, having a high impact in the personal and professional life. In this paper, we present a deep Classification-Regression Network (CR-Net) for analyzing the Big Five personality problem and further assisting on job interview recommendation in a first impressions setup. The setup is based on the ChaLearn First Impressions dataset, including multimodal data with video, audio, and text converted from the corresponding audio data, where each person is talking in front of a camera. In order to give a comprehensive prediction, we analyze the videos from both the entire scene (including the person’s motions and background) and the face of the person. Our CR-Net first performs personality trait classification and applies a regression later, which can obtain accurate predictions for both personality traits and interview recommendation. Furthermore, we present a new loss function called Bell Loss to address inaccurate predictions caused by the regression-to-the-mean problem. Extensive experiments on the First Impressions dataset show the effectiveness of our proposed network, outperforming the state-of-the-art. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
HuPBA; no menciona;MILAB |
Approved |
no |
|
|
Call Number |
Admin @ si @ LWM2020 |
Serial |
3413 |
|
Permanent link to this record |