|
Juan Jose Rubio, Takahiro Kashiwa, Teera Laiteerapong, Wenlong Deng, Kohei Nagai, Sergio Escalera, et al. (2019). Multi-class structural damage segmentation using fully convolutional networks. COMPUTIND - Computers in Industry, 112, 103121.
Abstract: Structural Health Monitoring (SHM) has benefited from computer vision and more recently, Deep Learning approaches, to accurately estimate the state of deterioration of infrastructure. In our work, we test Fully Convolutional Networks (FCNs) with a dataset of deck areas of bridges for damage segmentation. We create a dataset for delamination and rebar exposure that has been collected from inspection records of bridges in Niigata Prefecture, Japan. The dataset consists of 734 images with three labels per image, which makes it the largest dataset of images of bridge deck damage. This data allows us to estimate the performance of our method based on regions of agreement, which emulates the uncertainty of in-field inspections. We demonstrate the practicality of FCNs to perform automated semantic segmentation of surface damages. Our model achieves a mean accuracy of 89.7% for delamination and 78.4% for rebar exposure, and a weighted F1 score of 81.9%.
Keywords: Bridge damage detection; Deep learning; Semantic segmentation
|
|
|
Egils Avots, Meysam Madadi, Sergio Escalera, Jordi Gonzalez, Xavier Baro, Paul Pallin, et al. (2019). From 2D to 3D geodesic-based garment matching. MTAP - Multimedia Tools and Applications, 78(18), 25829–25853.
Abstract: A new approach for 2D to 3D garment retexturing is proposed based on Gaussian mixture models and thin plate splines (TPS). An automatically segmented garment of an individual is matched to a new source garment and rendered, resulting in augmented images in which the target garment has been retextured using the texture of the source garment. We divide the problem into garment boundary matching based on Gaussian mixture models and then interpolate inner points using surface topology extracted through geodesic paths, which leads to a more realistic result than standard approaches. We evaluated and compared our system quantitatively by root mean square error (RMS) and qualitatively using the mean opinion score (MOS), showing the benefits of the proposed methodology on our gathered dataset.
Keywords: Shape matching; Geodesic distance; Texture mapping; RGBD image processing; Gaussian mixture model
|
|
|
Andre Litvin, Kamal Nasrollahi, Sergio Escalera, Cagri Ozcinar, Thomas B. Moeslund, & Gholamreza Anbarjafari. (2019). A Novel Deep Network Architecture for Reconstructing RGB Facial Images from Thermal for Face Recognition. MTAP - Multimedia Tools and Applications, 78(18), 25259–25271.
Abstract: This work proposes a fully convolutional network architecture for RGB face image generation from a given input thermal face image to be applied in face recognition scenarios. The proposed method is based on the FusionNet architecture and increases robustness against overfitting using dropout after bridge connections, randomised leaky ReLUs (RReLUs), and orthogonal regularization. Furthermore, we propose to use a decoding block with resize convolution instead of transposed convolution to improve final RGB face image generation. To validate our proposed network architecture, we train a face classifier and compare its face recognition rate on the reconstructed RGB images from the proposed architecture, to those when reconstructing images with the original FusionNet, as well as when using the original RGB images. As a result, we are introducing a new architecture which leads to a more accurate network.
Keywords: Fully convolutional networks; FusionNet; Thermal imaging; Face recognition
|
|
|
Ikechukwu Ofodile, Ahmed Helmi, Albert Clapes, Egils Avots, Kerttu Maria Peensoo, Sandhra Mirella Valdma, et al. (2019). Action recognition using single-pixel time-of-flight detection. ENTROPY - Entropy, 21(4), 414.
Abstract: Action recognition is a challenging task that plays an important role in many robotic systems, which highly depend on visual input feeds. However, due to privacy concerns, it is important to find a method which can recognise actions without using visual feed. In this paper, we propose a concept for detecting actions while preserving the test subject’s privacy. Our proposed method relies only on recording the temporal evolution of light pulses scattered back from the scene.
Such data trace to record one action contains a sequence of one-dimensional arrays of voltage values acquired by a single-pixel detector at 1 GHz repetition rate. Information about both the distance to the object and its shape are embedded in the traces. We apply machine learning in the form of recurrent neural networks for data analysis and demonstrate successful action recognition. The experimental results show that our proposed method could achieve on average 96.47% accuracy on the actions walking forward, walking backwards, sitting down, standing up and waving hand, using recurrent
neural network.
Keywords: single pixel single photon image acquisition; time-of-flight; action recognition
|
|
|
Hugo Jair Escalante, Heysem Kaya, Albert Ali Salah, Sergio Escalera, Yagmur Gucluturk, Umut Guçlu, et al. (2022). Modeling, Recognizing, and Explaining Apparent Personality from Videos. TAC - IEEE Transactions on Affective Computing, 13(2), 894–911.
Abstract: Explainability and interpretability are two critical aspects of decision support systems. Despite their importance, it is only recently that researchers are starting to explore these aspects. This paper provides an introduction to explainability and interpretability in the context of apparent personality recognition. To the best of our knowledge, this is the first effort in this direction. We describe a challenge we organized on explainability in first impressions analysis from video. We analyze in detail the newly introduced data set, evaluation protocol, proposed solutions and summarize the results of the challenge. We investigate the issue of bias in detail. Finally, derived from our study, we outline research opportunities that we foresee will be relevant in this area in the near future.
|
|