|
Y. Mori, M. Misawa, Jorge Bernal, M. Bretthauer, S. Kudo, A. Rastogi, et al. (2022). Artificial Intelligence for Disease Diagnosis-the Gold Standard Challenge. Gastrointestinal Endoscopy, 96(2), 370–372.
|
|
|
Wenjuan Gong, Yue Zhang, Wei Wang, Peng Cheng, & Jordi Gonzalez. (2023). Meta-MMFNet: Meta-learning-based Multi-model Fusion Network for Micro-expression Recognition. TMCCA - ACM Transactions on Multimedia Computing, Communications, and Applications, 20(2), 1–20.
Abstract: Despite its wide applications in criminal investigations and in clinical communication with patients suffering from autism, automatic micro-expression recognition remains a challenging problem because of the lack of training data and class imbalance. In this study, we propose a meta-learning-based multi-model fusion network (Meta-MMFNet) to address these problems. The proposed method is based on the metric-based meta-learning pipeline, which is specifically designed for few-shot learning and is suitable for model-level fusion. The frame difference and optical flow features are fused, deep features are extracted from the fused feature, and finally, within the meta-learning-based framework, a weighted-sum model fusion method is applied for micro-expression classification. Meta-MMFNet achieved better results than state-of-the-art methods on four datasets. The code is available at https://github.com/wenjgong/meta-fusion-based-method.
|
|
|
Egils Avots, Meysam Madadi, Sergio Escalera, Jordi Gonzalez, Xavier Baro, Paul Pallin, et al. (2019). From 2D to 3D geodesic-based garment matching. MTAP - Multimedia Tools and Applications, 78(18), 25829–25853.
Abstract: A new approach for 2D to 3D garment retexturing is proposed based on Gaussian mixture models and thin plate splines (TPS). An automatically segmented garment of an individual is matched to a new source garment and rendered, resulting in augmented images in which the target garment has been retextured using the texture of the source garment. We divide the problem into garment boundary matching based on Gaussian mixture models and then interpolate inner points using surface topology extracted through geodesic paths, which leads to a more realistic result than standard approaches. We evaluated and compared our system quantitatively by root mean square error (RMS) and qualitatively using the mean opinion score (MOS), showing the benefits of the proposed methodology on our gathered dataset.
Keywords: Shape matching; Geodesic distance; Texture mapping; RGBD image processing; Gaussian mixture model
|
|
|
Wenwen Fu, Zhihong An, Wendong Huang, Haoran Sun, Wenjuan Gong, & Jordi Gonzalez. (2023). A Spatio-Temporal Spotting Network with Sliding Windows for Micro-Expression Detection. ELEC - Electronics, 12(18), 3947.
Abstract: Micro-expressions reveal underlying emotions and are widely applied in political psychology, lie detection, law enforcement and medical care. Micro-expression spotting aims to detect the temporal locations of facial expressions in video sequences and is a crucial task in micro-expression recognition. In this study, the problem of micro-expression spotting is formulated as per-frame micro-expression classification. We propose an effective spotting model with sliding windows called the spatio-temporal spotting network. The method uses a sliding-window detection mechanism and combines the spatial features from the local key frames with the global temporal features to perform micro-expression spotting. The experiments are conducted on the CAS(ME)² database and the SAMM Long Videos database, and the results demonstrate that the proposed method outperforms the state-of-the-art method by 30.58% on CAS(ME)² and 23.98% on SAMM Long Videos according to overall F-scores.
Keywords: micro-expression spotting; sliding window; key frame extraction
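As a rough illustration of how per-frame classification and a sliding window can be turned into spotted intervals, the sketch below averages hypothetical per-frame scores over a window and merges the frames covered by high-scoring windows. It is an assumption-laden toy, not the spatio-temporal spotting network itself: the function name, the mean-over-window rule, and the default `window` and `threshold` values are all invented for illustration.

```python
def spot_intervals(frame_scores, window=3, threshold=0.5):
    """Return inclusive (start, end) frame intervals flagged as micro-expressions.

    frame_scores: per-frame scores in [0, 1], e.g. from a per-frame
                  classifier (hypothetical input).
    window: number of consecutive frames averaged at each window position.
    threshold: minimum mean window score for a window to count as a hit.
    """
    n = len(frame_scores)
    if window < 1 or window > n:
        raise ValueError("window must be between 1 and the number of frames")

    # Mark every frame that is covered by at least one high-scoring window.
    active = [False] * n
    for start in range(n - window + 1):
        mean = sum(frame_scores[start:start + window]) / window
        if mean >= threshold:
            for i in range(start, start + window):
                active[i] = True

    # Merge consecutive marked frames into intervals.
    intervals = []
    begin = None
    for i, flag in enumerate(active):
        if flag and begin is None:
            begin = i
        elif not flag and begin is not None:
            intervals.append((begin, i - 1))
            begin = None
    if begin is not None:
        intervals.append((begin, n - 1))
    return intervals
```

Averaging over a window suppresses isolated single-frame spikes, which is one plausible reason to prefer windowed decisions over raw per-frame thresholds.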
|
|
|
Marco Pedersoli, Jordi Gonzalez, Andrew Bagdanov, & Xavier Roca. (2011). Efficient Discriminative Multiresolution Cascade for Real-Time Human Detection Applications. PRL - Pattern Recognition Letters, 32(13), 1581–1587.
Abstract: Human detection is fundamental in many machine vision applications, such as video surveillance, driving assistance, action recognition and scene understanding. However, most of these applications require real-time performance, which current detection methods do not yet achieve.
This paper presents a new method for human detection based on a multiresolution cascade of Histograms of Oriented Gradients (HOG) that can greatly reduce the computational cost of the detection search without affecting accuracy. The method consists of a cascade of sliding window detectors. Each detector is a linear Support Vector Machine (SVM) composed of HOG features at different resolutions, from coarse at the first level to fine at the last one.
In contrast to previous methods, our approach uses a non-uniform stride of the sliding window that is defined by the feature resolution and allows the detection to be incrementally refined when going from coarse to fine resolution. In this way, the speed-up of the cascade is due not only to the smaller number of features computed at the first levels of the cascade, but also to the reduced number of windows that need to be evaluated at the coarse resolution. Experimental results show that our method reaches a detection rate comparable with state-of-the-art detectors based on HOG features, while the detection search is up to 23 times faster.
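The coarse-to-fine search with a resolution-dependent stride can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's detector: the scoring functions stand in for the per-level HOG and linear SVM classifiers, and the names `cascade_search`, `score_fns`, `thresholds`, and `coarse_stride`, as well as the stride-halving rule, are hypothetical.

```python
def cascade_search(score_fns, thresholds, width, height, coarse_stride):
    """Coarse-to-fine window search with a stride that shrinks per level.

    score_fns: one callable (x, y) -> float per cascade level, coarse first
               (placeholders for the per-resolution classifiers).
    thresholds: minimum score a window needs to survive each level.
    Returns (detections, evaluated), where `evaluated` counts scored windows.
    """
    stride = coarse_stride
    # Level 0: scan the whole image, but only on the coarsest grid.
    candidates = {
        (x, y)
        for y in range(0, height, stride)
        for x in range(0, width, stride)
    }
    evaluated = 0
    last = len(score_fns) - 1
    for level, (score, threshold) in enumerate(zip(score_fns, thresholds)):
        evaluated += len(candidates)
        survivors = {p for p in candidates if score(*p) >= threshold}
        if level == last:
            return sorted(survivors), evaluated
        # Refine: halve the stride and look only around the survivors.
        stride = max(1, stride // 2)
        candidates = {
            (x + dx, y + dy)
            for (x, y) in survivors
            for dx in (-stride, 0, stride)
            for dy in (-stride, 0, stride)
            if 0 <= x + dx < width and 0 <= y + dy < height
        }
    return [], evaluated
```

In a toy 16 by 16 search with a coarse stride of 4 and three levels, only the 16 coarse grid positions are scored everywhere, and later levels score a few windows around the survivors, far fewer than the 256 windows an exhaustive stride-1 scan would visit.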
|
|