toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
  Records Links
Author Wenlong Deng; Yongli Mou; Takahiro Kashiwa; Sergio Escalera; Kohei Nagai; Kotaro Nakayama; Yutaka Matsuo; Helmut Prendinger edit  url
openurl 
  Title Vision based Pixel-level Bridge Structural Damage Detection Using a Link ASPP Network Type Journal Article
  Year (down) 2020 Publication Automation in Construction Abbreviated Journal AC  
  Volume 110 Issue Pages 102973  
  Keywords Semantic image segmentation; Deep learning  
  Abstract Structural Health Monitoring (SHM) has greatly benefited from computer vision. Recently, deep learning approaches are widely used to accurately estimate the state of deterioration of infrastructure. In this work, we focus on the problem of bridge surface structural damage detection, such as delamination and rebar exposure. It is well known that the quality of a deep learning model is highly dependent on the quality of the training dataset. Bridge damage detection, our application domain, has the following main challenges: (i) labeling the damages requires knowledgeable civil engineering professionals, which makes it difficult to collect a large annotated dataset; (ii) the damage area could be very small, whereas the background area is large, which creates an unbalanced training environment; (iii) due to the difficulty to exactly determine the extension of the damage, there is often a variation among different labelers who perform pixel-wise labeling. In this paper, we propose a novel model for bridge structural damage detection to address the first two challenges. This paper follows the idea of an atrous spatial pyramid pooling (ASPP) module that is designed as a novel network for bridge damage detection. Further, we introduce the weight balanced Intersection over Union (IoU) loss function to achieve accurate segmentation on a highly unbalanced small dataset. The experimental results show that (i) the IoU loss function improves the overall performance of damage detection, as compared to cross entropy loss or focal loss, and (ii) the proposed model has a better ability to detect a minority class than other light segmentation networks.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HuPBA; no proj Approved no  
  Call Number Admin @ si @ DMK2020 Serial 3314  
Permanent link to this record
 

 
Author Yunan Li; Jun Wan; Qiguang Miao; Sergio Escalera; Huijuan Fang; Huizhou Chen; Xiangda Qi; Guodong Guo edit  url
openurl 
  Title CR-Net: A Deep Classification-Regression Network for Multimodal Apparent Personality Analysis Type Journal Article
  Year (down) 2020 Publication International Journal of Computer Vision Abbreviated Journal IJCV  
  Volume 128 Issue Pages 2763–2780  
  Keywords  
  Abstract First impressions strongly influence social interactions, having a high impact in the personal and professional life. In this paper, we present a deep Classification-Regression Network (CR-Net) for analyzing the Big Five personality problem and further assisting on job interview recommendation in a first impressions setup. The setup is based on the ChaLearn First Impressions dataset, including multimodal data with video, audio, and text converted from the corresponding audio data, where each person is talking in front of a camera. In order to give a comprehensive prediction, we analyze the videos from both the entire scene (including the person’s motions and background) and the face of the person. Our CR-Net first performs personality trait classification and applies a regression later, which can obtain accurate predictions for both personality traits and interview recommendation. Furthermore, we present a new loss function called Bell Loss to address inaccurate predictions caused by the regression-to-the-mean problem. Extensive experiments on the First Impressions dataset show the effectiveness of our proposed network, outperforming the state-of-the-art.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HuPBA; no menciona Approved no  
  Call Number Admin @ si @ LWM2020 Serial 3413  
Permanent link to this record
 

 
Author Zhengying Liu; Zhen Xu; Sergio Escalera; Isabelle Guyon; Julio C. S. Jacques Junior; Meysam Madadi; Adrien Pavao; Sebastien Treguer; Wei-Wei Tu edit   pdf
url  openurl
  Title Towards automated computer vision: analysis of the AutoCV challenges 2019 Type Journal Article
  Year (down) 2020 Publication Pattern Recognition Letters Abbreviated Journal PRL  
  Volume 135 Issue Pages 196-203  
  Keywords Computer vision; AutoML; Deep learning  
  Abstract We present the results of recent challenges in Automated Computer Vision (AutoCV, renamed here for clarity AutoCV1 and AutoCV2, 2019), which are part of a series of challenge on Automated Deep Learning (AutoDL). These two competitions aim at searching for fully automated solutions for classification tasks in computer vision, with an emphasis on any-time performance. The first competition was limited to image classification while the second one included both images and videos. Our design imposed to the participants to submit their code on a challenge platform for blind testing on five datasets, both for training and testing, without any human intervention whatsoever. Winning solutions adopted deep learning techniques based on already published architectures, such as AutoAugment, MobileNet and ResNet, to reach state-of-the-art performance in the time budget of the challenge (only 20 minutes of GPU time). The novel contributions include strategies to deliver good preliminary results at any time during the learning process, such that a method can be stopped early and still deliver good performance. This feature is key for the adoption of such techniques by data analysts desiring to obtain rapidly preliminary results on large datasets and to speed up the development process. The soundness of our design was verified in several aspects: (1) Little overfitting of the on-line leaderboard providing feedback on 5 development datasets was observed, compared to the final blind testing on the 5 (separate) final test datasets, suggesting that winning solutions might generalize to other computer vision classification tasks; (2) Error bars on the winners’ performance allow us to say with confident that they performed significantly better than the baseline solutions we provided; (3) The ranking of participants according to the any-time metric we designed, namely the Area under the Learning Curve, was different from that of the fixed-time metric, i.e. AUC at the end of the fixed time budget. We released all winning solutions under open-source licenses. At the end of the AutoDL challenge series, all data of the challenge will be made publicly available, thus providing a collection of uniformly formatted datasets, which can serve to conduct further research, particularly on meta-learning.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HuPBA; no proj Approved no  
  Call Number Admin @ si @ LXE2020 Serial 3427  
Permanent link to this record
 

 
Author Andre Litvin; Kamal Nasrollahi; Sergio Escalera; Cagri Ozcinar; Thomas B. Moeslund; Gholamreza Anbarjafari edit  url
openurl 
  Title A Novel Deep Network Architecture for Reconstructing RGB Facial Images from Thermal for Face Recognition Type Journal Article
  Year (down) 2019 Publication Multimedia Tools and Applications Abbreviated Journal MTAP  
  Volume 78 Issue 18 Pages 25259–25271  
  Keywords Fully convolutional networks; FusionNet; Thermal imaging; Face recognition  
  Abstract This work proposes a fully convolutional network architecture for RGB face image generation from a given input thermal face image to be applied in face recognition scenarios. The proposed method is based on the FusionNet architecture and increases robustness against overfitting using dropout after bridge connections, randomised leaky ReLUs (RReLUs), and orthogonal regularization. Furthermore, we propose to use a decoding block with resize convolution instead of transposed convolution to improve final RGB face image generation. To validate our proposed network architecture, we train a face classifier and compare its face recognition rate on the reconstructed RGB images from the proposed architecture, to those when reconstructing images with the original FusionNet, as well as when using the original RGB images. As a result, we are introducing a new architecture which leads to a more accurate network.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HuPBA; no menciona Approved no  
  Call Number Admin @ si @ LNE2019 Serial 3318  
Permanent link to this record
 

 
Author Egils Avots; Meysam Madadi; Sergio Escalera; Jordi Gonzalez; Xavier Baro; Paul Pallin; Gholamreza Anbarjafari edit   pdf
url  doi
openurl 
  Title From 2D to 3D geodesic-based garment matching Type Journal Article
  Year (down) 2019 Publication Multimedia Tools and Applications Abbreviated Journal MTAP  
  Volume 78 Issue 18 Pages 25829–25853  
  Keywords Shape matching; Geodesic distance; Texture mapping; RGBD image processing; Gaussian mixture model  
  Abstract A new approach for 2D to 3D garment retexturing is proposed based on Gaussian mixture models and thin plate splines (TPS). An automatically segmented garment of an individual is matched to a new source garment and rendered, resulting in augmented images in which the target garment has been retextured using the texture of the source garment. We divide the problem into garment boundary matching based on Gaussian mixture models and then interpolate inner points using surface topology extracted through geodesic paths, which leads to a more realistic result than standard approaches. We evaluated and compared our system quantitatively by root mean square error (RMS) and qualitatively using the mean opinion score (MOS), showing the benefits of the proposed methodology on our gathered dataset.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HuPBA; ISE; 600.098; 600.119; 602.133 Approved no  
  Call Number Admin @ si @ AME2019 Serial 3317  
Permanent link to this record
Select All    Deselect All
 |   | 
Details

Save Citations:
Export Records: