|
Wenjuan Gong, Zhang Yue, Wei Wang, Cheng Peng, & Jordi Gonzalez. (2022). Meta-MMFNet: Meta-Learning Based Multi-Model Fusion Network for Micro-Expression Recognition. ACMTMC - ACM Transactions on Multimedia Computing, Communications, and Applications.
Abstract: Despite its wide applications in criminal investigations and clinical communications with patients suffering from autism, automatic micro-expression recognition remains a challenging problem because of the lack of training data and class imbalance. In this study, we propose a meta-learning based multi-model fusion network (Meta-MMFNet) to address these problems. The proposed method is based on the metric-based meta-learning pipeline, which is specifically designed for few-shot learning and is suitable for model-level fusion. The frame difference and optical flow features are fused, deep features are extracted from the fused feature, and finally, within the meta-learning-based framework, a weighted-sum model fusion method is applied for micro-expression classification. Meta-MMFNet achieved better results than state-of-the-art methods on four datasets. The code is available at https://github.com/wenjgong/meta-fusion-based-method.
Keywords: Feature Fusion; Model Fusion; Meta-Learning; Micro-Expression Recognition
|
|
|
Parichehr Behjati Ardakani, Pau Rodriguez, Carles Fernandez, Armin Mehri, Xavier Roca, Seiichi Ozawa, et al. (2022). Frequency-based Enhancement Network for Efficient Super-Resolution. ACCESS - IEEE Access, 10, 57383–57397.
Abstract: Recently, deep convolutional neural networks (CNNs) have provided outstanding performance in single image super-resolution (SISR). Despite their remarkable performance, the lack of high-frequency information in the recovered images remains a core problem. Moreover, as the networks increase in depth and width, deep CNN-based SR methods are faced with the challenge of computational complexity in practice. A promising and under-explored solution is to adapt the amount of compute based on the different frequency bands of the input. To this end, we present a novel Frequency-based Enhancement Block (FEB) which explicitly enhances the information of high frequencies while forwarding low-frequencies to the output. In particular, this block efficiently decomposes features into low- and high-frequency and assigns more computation to high-frequency ones. Thus, it can help the network generate more discriminative representations by explicitly recovering finer details. Our FEB design is simple and generic and can be used as a direct replacement of commonly used SR blocks with no need to change network architectures. We experimentally show that when replacing SR blocks with FEB we consistently improve the reconstruction error, while reducing the number of parameters in the model. Moreover, we propose a lightweight SR model — Frequency-based Enhancement Network (FENet) — based on FEB that matches the performance of larger models. Extensive experiments demonstrate that our proposal performs favorably against the state-of-the-art SR algorithms in terms of visual quality, memory footprint, and inference time. The code is available at https://github.com/pbehjatii/FENet
Keywords: Deep learning; Frequency-based methods; Lightweight architectures; Single image super-resolution
|
|
|
Diego Velazquez, Josep M. Gonfaus, Pau Rodriguez, Xavier Roca, Seiichi Ozawa, & Jordi Gonzalez. (2021). Logo Detection With No Priors. ACCESS - IEEE Access, 9, 106998–107011.
Abstract: In recent years, widely referenced object detection methods such as R-CNN have implemented this task as a combination of region proposal generation and supervised classification of the proposed bounding boxes. Although this pipeline has achieved state-of-the-art results on multiple datasets, it has inherent limitations that make object detection a very complex and computationally inefficient task. Instead of following this standard strategy, in this paper we enhance Detection Transformers (DETR), which tackle object detection as a set-prediction problem directly in an end-to-end, fully differentiable pipeline without requiring priors. In particular, we incorporate Feature Pyramids (FP) into the DETR architecture and demonstrate the effectiveness of the resulting DETR-FP approach in improving logo detection results thanks to the improved detection of small logos. Thus, without requiring any domain-specific prior to be fed to the model, DETR-FP obtains competitive results on the OpenLogo and MS-COCO datasets, offering a relative improvement of up to 30% compared to a Faster R-CNN baseline, which strongly depends on hand-designed priors.
|
|
|
Maria Vanrell, Jordi Vitria, & Xavier Roca. (1997). A multidimensional scaling approach to explore the behavior of a texture perception algorithm. Machine Vision and Applications, 9, 262–271.
|
|
|
A. Pujol, & Juan J. Villanueva. (2002). A supervised modification of the Hausdorff distance for visual shape classification. International Journal of Pattern Recognition and Artificial Intelligence, 349–359.
|
|