Records |
Author |
Chengyi Zou; Shuai Wan; Marta Mrak; Marc Gorriz Blanch; Luis Herranz; Tiannan Ji |
Title |
Towards Lightweight Neural Network-based Chroma Intra Prediction for Video Coding |
Type |
Conference Article |
Year |
2022 |
Publication |
29th IEEE International Conference on Image Processing |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
|
Keywords |
Video coding; Quantization (signal); Computational modeling; Neural networks; Predictive models; Video compression; Syntactics |
Abstract |
In video compression the luma channel can be useful for predicting chroma channels (Cb, Cr), as has been demonstrated with the Cross-Component Linear Model (CCLM) used in Versatile Video Coding (VVC) standard. More recently, it has been shown that neural networks can even better capture the relationship among different channels. In this paper, a new attention-based neural network is proposed for cross-component intra prediction. With the goal to simplify neural network design, the new framework consists of four branches: boundary branch and luma branch for extracting features from reference samples, attention branch for fusing the first two branches, and prediction branch for computing the predicted chroma samples. The proposed scheme is integrated into VVC test model together with one additional binary block-level syntax flag which indicates whether a given block makes use of the proposed method. Experimental results demonstrate 0.31%/2.36%/2.00% BD-rate reductions on Y/Cb/Cr components, respectively, on top of the VVC Test Model (VTM) 7.0 which uses CCLM. |
Address |
Bordeaux; France; October 2022 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICIP |
Notes |
MACO |
Approved |
no |
Call Number |
Admin @ si @ ZWM2022 |
Serial |
3790 |
Permanent link to this record |
|
|
|
Author |
Saiping Zhang; Luis Herranz; Marta Mrak; Marc Gorriz Blanch; Shuai Wan; Fuzheng Yang |
Title |
DCNGAN: A Deformable Convolution-Based GAN with QP Adaptation for Perceptual Quality Enhancement of Compressed Video |
Type |
Conference Article |
Year |
2022 |
Publication |
47th International Conference on Acoustics, Speech, and Signal Processing |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
|
Keywords |
|
Abstract |
In this paper, we propose a deformable convolution-based generative adversarial network (DCNGAN) for perceptual quality enhancement of compressed videos. DCNGAN is also adaptive to the quantization parameters (QPs). Compared with optical flows, deformable convolutions are more effective and efficient to align frames. Deformable convolutions can operate on multiple frames, thus leveraging more temporal information, which is beneficial for enhancing the perceptual quality of compressed videos. Instead of aligning frames in a pairwise manner, the deformable convolution can process multiple frames simultaneously, which leads to lower computational complexity. Experimental results demonstrate that the proposed DCNGAN outperforms other state-of-the-art compressed video quality enhancement algorithms. |
Address |
Virtual; May 2022 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICASSP |
Notes |
MACO; 600.161; 601.379 |
Approved |
no |
Call Number |
Admin @ si @ ZHM2022a |
Serial |
3765 |
Permanent link to this record |
|
|
|
Author |
Danna Xue; Fei Yang; Pei Wang; Luis Herranz; Jinqiu Sun; Yu Zhu; Yanning Zhang |
Title |
SlimSeg: Slimmable Semantic Segmentation with Boundary Supervision |
Type |
Conference Article |
Year |
2022 |
Publication |
30th ACM International Conference on Multimedia |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
6539-6548 |
Keywords |
|
Abstract |
Accurate semantic segmentation models typically require significant computational resources, inhibiting their use in practical applications. Recent works rely on well-crafted lightweight models to achieve fast inference. However, these models cannot flexibly adapt to varying accuracy and efficiency requirements. In this paper, we propose a simple but effective slimmable semantic segmentation (SlimSeg) method, which can be executed at different capacities during inference depending on the desired accuracy-efficiency tradeoff. More specifically, we employ parametrized channel slimming by stepwise downward knowledge distillation during training. Motivated by the observation that the differences between segmentation results of each submodel are mainly near the semantic borders, we introduce an additional boundary guided semantic segmentation loss to further improve the performance of each submodel. We show that our proposed SlimSeg with various mainstream networks can produce flexible models that provide dynamic adjustment of computational cost and better performance than independent models. Extensive experiments on semantic segmentation benchmarks, Cityscapes and CamVid, demonstrate the generalization ability of our framework. |
Address |
Lisboa; Portugal; October 2022 |
Corporate Author |
|
Thesis |
|
Publisher |
Association for Computing Machinery |
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
978-1-4503-9203-7 |
Medium |
|
Area |
|
Expedition |
|
Conference |
MM |
Notes |
MACO; 600.161; 601.400 |
Approved |
no |
Call Number |
Admin @ si @ XYW2022 |
Serial |
3758 |
Permanent link to this record |
|
|
|
Author |
Zhaocheng Liu; Luis Herranz; Fei Yang; Saiping Zhang; Shuai Wan; Marta Mrak; Marc Gorriz |
Title |
Slimmable Video Codec |
Type |
Conference Article |
Year |
2022 |
Publication |
CVPR 2022 Workshop and Challenge on Learned Image Compression (CLIC 2022, 5th Edition) |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
1742-1746 |
Keywords |
|
Abstract |
Neural video compression has emerged as a novel paradigm combining trainable multilayer neural networks and machine learning, achieving competitive rate-distortion (RD) performances, but still remaining impractical due to heavy neural architectures, with large memory and computational demands. In addition, models are usually optimized for a single RD tradeoff. Recent slimmable image codecs can dynamically adjust their model capacity to gracefully reduce the memory and computation requirements, without harming RD performance. In this paper we propose a slimmable video codec (SlimVC), by integrating a slimmable temporal entropy model in a slimmable autoencoder. Despite a significantly more complex architecture, we show that slimming remains a powerful mechanism to control rate, memory footprint, computational cost and latency, all being important requirements for practical video compression. |
Address |
Virtual; 19 June 2022 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
CVPRW |
Notes |
MACO; 601.379; 601.161 |
Approved |
no |
Call Number |
Admin @ si @ LHY2022 |
Serial |
3687 |
Permanent link to this record |
|
|
|
Author |
Saiping Zhang; Luis Herranz; Marta Mrak; Marc Gorriz Blanch; Shuai Wan; Fuzheng Yang |
Title |
PeQuENet: Perceptual Quality Enhancement of Compressed Video with Adaptation- and Attention-based Network |
Type |
Miscellaneous |
Year |
2022 |
Publication |
arXiv |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
|
Keywords |
|
Abstract |
In this paper we propose a generative adversarial network (GAN) framework to enhance the perceptual quality of compressed videos. Our framework includes attention and adaptation to different quantization parameters (QPs) in a single model. The attention module exploits global receptive fields that can capture and align long-range correlations between consecutive frames, which can be beneficial for enhancing perceptual quality of videos. The frame to be enhanced is fed into the deep network together with its neighboring frames, and in the first stage features at different depths are extracted. Then extracted features are fed into attention blocks to explore global temporal correlations, followed by a series of upsampling and convolution layers. Finally, the resulting features are processed by the QP-conditional adaptation module which leverages the corresponding QP information. In this way, a single model can be used to enhance adaptively to various QPs without requiring multiple models specific for every QP value, while having similar performance. Experimental results demonstrate the superior performance of the proposed PeQuENet compared with the state-of-the-art compressed video quality enhancement algorithms. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
MACO; no proj |
Approved |
no |
Call Number |
Admin @ si @ ZHM2022b |
Serial |
3819 |
Permanent link to this record |
|
|
|
Author |
Eduardo Aguilar; Bhalaji Nagarajan; Beatriz Remeseiro; Petia Radeva |
Title |
Bayesian deep learning for semantic segmentation of food images |
Type |
Journal Article |
Year |
2022 |
Publication |
Computers and Electrical Engineering |
Abbreviated Journal |
CEE |
Volume |
103 |
Issue |
|
Pages |
108380 |
Keywords |
Deep learning; Uncertainty quantification; Bayesian inference; Image segmentation; Food analysis |
Abstract |
Deep learning has provided promising results in various applications; however, algorithms tend to be overconfident in their predictions, even though they may be entirely wrong. Particularly for critical applications, the model should provide answers only when it is very sure of them. This article presents a Bayesian version of two different state-of-the-art semantic segmentation methods to perform multi-class segmentation of foods and estimate the uncertainty about the given predictions. The proposed methods were evaluated on three public pixel-annotated food datasets. As a result, we can conclude that Bayesian methods improve the performance achieved by the baseline architectures and, in addition, provide information to improve decision-making. Furthermore, based on the extracted uncertainty map, we proposed three measures to rank the images according to the degree of noisy annotations they contained. Note that the top 135 images ranked by one of these measures include more than half of the worst-labeled food images. |
Address |
October 2022 |
Corporate Author |
|
Thesis |
|
Publisher |
Science Direct |
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
MILAB |
Approved |
no |
Call Number |
Admin @ si @ ANR2022 |
Serial |
3763 |
Permanent link to this record |
|
|
|
Author |
Ahmed M. A. Salih; Ilaria Boscolo Galazzo; Federica Cruciani; Lorenza Brusini; Petia Radeva |
Title |
Investigating Explainable Artificial Intelligence for MRI-based Classification of Dementia: a New Stability Criterion for Explainable Methods |
Type |
Conference Article |
Year |
2022 |
Publication |
29th IEEE International Conference on Image Processing |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
|
Keywords |
Image processing; Stability criteria; Machine learning; Robustness; Alzheimer's disease; Monitoring |
Abstract |
Individuals diagnosed with Mild Cognitive Impairment (MCI) have shown an increased risk of developing Alzheimer’s Disease (AD). As such, early identification of dementia represents a key prognostic element, though hampered by complex disease patterns. Increasing efforts have focused on Machine Learning (ML) to build accurate classification models relying on a multitude of clinical/imaging variables. However, ML itself does not provide sensible explanations related to the model mechanism and feature contribution. Explainable Artificial Intelligence (XAI) represents the enabling technology in this framework, allowing to understand ML outcomes and derive human-understandable explanations. In this study, we aimed at exploring ML combined with MRI-based features and XAI to solve this classification problem and interpret the outcome. In particular, we propose a new method to assess the robustness of feature rankings provided by XAI methods, especially when multicollinearity exists. Our findings indicate that our method was able to disentangle the list of the informative features underlying dementia, with important implications for aiding personalized monitoring plans. |
Address |
Bordeaux; France; October 2022 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICIP |
Notes |
MILAB |
Approved |
no |
Call Number |
Admin @ si @ SBC2022 |
Serial |
3789 |
Permanent link to this record |
|
|
|
Author |
Javier Rodenas; Bhalaji Nagarajan; Marc Bolaños; Petia Radeva |
Title |
Learning Multi-Subset of Classes for Fine-Grained Food Recognition |
Type |
Conference Article |
Year |
2022 |
Publication |
7th International Workshop on Multimedia Assisted Dietary Management |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
17–26 |
Keywords |
|
Abstract |
Food image recognition is a complex computer vision task, because of the large number of fine-grained food classes. Fine-grained recognition tasks focus on learning subtle discriminative details to distinguish similar classes. In this paper, we introduce a new method to improve the classification of classes that are more difficult to discriminate based on Multi-Subsets learning. Using a pre-trained network, we organize classes in multiple subsets using a clustering technique. Later, we embed these subsets in a multi-head model structure. This structure has three distinguishable parts. First, we use several shared blocks to learn the generalized representation of the data. Second, we use multiple specialized blocks focusing on specific subsets that are difficult to distinguish. Lastly, we use a fully connected layer to weight the different subsets in an end-to-end manner by combining the neuron outputs. We validated our proposed method using two recent state-of-the-art vision transformers on three public food recognition datasets. Our method was successful in learning the confused classes better and we outperformed the state-of-the-art on the three datasets. |
Address |
Lisboa; Portugal; October 2022 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
MADiMa |
Notes |
MILAB |
Approved |
no |
Call Number |
Admin @ si @ RNB2022 |
Serial |
3797 |
Permanent link to this record |
|
|
|
Author |
Nil Ballus; Bhalaji Nagarajan; Petia Radeva |
Title |
Opt-SSL: An Enhanced Self-Supervised Framework for Food Recognition |
Type |
Conference Article |
Year |
2022 |
Publication |
10th Iberian Conference on Pattern Recognition and Image Analysis |
Abbreviated Journal |
|
Volume |
13256 |
Issue |
|
Pages |
|
Keywords |
Self-supervised; Contrastive learning; Food recognition |
Abstract |
Self-supervised Learning has been showing upbeat performance in several computer vision tasks. The popular contrastive methods make use of a Siamese architecture with different loss functions. In this work, we go deeper into two very recent state of the art frameworks, namely, SimSiam and Barlow Twins. Inspired by them, we propose a new self-supervised learning method we call Opt-SSL that combines both image and feature contrasting. We validate the proposed method on the food recognition task, showing that our proposed framework enables the self-learning networks to learn better visual representations. |
Address |
Aveiro; Portugal; May 2022 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
IbPRIA |
Notes |
MILAB; not mentioned |
Approved |
no |
Call Number |
Admin @ si @ BNR2022 |
Serial |
3782 |
Permanent link to this record |
|
|
|
Author |
Vishwesh Pillai; Pranav Mehar; Manisha Das; Deep Gupta; Petia Radeva |
Title |
Integrated Hierarchical and Flat Classifiers for Food Image Classification using Epistemic Uncertainty |
Type |
Conference Article |
Year |
2022 |
Publication |
IEEE International Conference on Signal Processing and Communications |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
|
Keywords |
|
Abstract |
The problem of food image recognition is an essential one in today’s context because health conditions such as diabetes, obesity, and heart disease require constant monitoring of a person’s diet. To automate this process, several models are available to recognize food images. Due to a considerable number of unique food dishes and various cuisines, a traditional flat classifier ceases to perform well. To address this issue, prediction schemes consisting of both flat and hierarchical classifiers, with the analysis of epistemic uncertainty are used to switch between the classifiers. However, the accuracy of the predictions made using epistemic uncertainty data remains considerably low. Therefore, this paper presents a prediction scheme using three different threshold criteria that helps to increase the accuracy of epistemic uncertainty predictions. The performance of the proposed method is demonstrated using several experiments performed on the MAFood-121 dataset. The experimental results validate the proposal performance and show that the proposed threshold criteria help to increase the overall accuracy of the predictions by correctly classifying the uncertainty distribution of the samples. |
Address |
Bangalore; India; July 2022 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
SPCOM |
Notes |
MILAB; not mentioned |
Approved |
no |
Call Number |
Admin @ si @ PMD2022 |
Serial |
3796 |
Permanent link to this record |
|
|
|
Author |
Bhalaji Nagarajan; Ricardo Marques; Marcos Mejia; Petia Radeva |
Title |
Class-conditional Importance Weighting for Deep Learning with Noisy Labels |
Type |
Conference Article |
Year |
2022 |
Publication |
17th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications |
Abbreviated Journal |
|
Volume |
5 |
Issue |
|
Pages |
679-686 |
Keywords |
Noisy Labeling; Loss Correction; Class-conditional Importance Weighting; Learning with Noisy Labels |
Abstract |
Large-scale accurate labels are very important to the Deep Neural Networks to train them and assure high performance. However, it is very expensive to create a clean dataset since usually it relies on human interaction. To this purpose, the labelling process is made cheap with a trade-off of having noisy labels. Learning with Noisy Labels is an active area of research being at the same time very challenging. The recent advances in Self-supervised learning and robust loss functions have helped in advancing noisy label research. In this paper, we propose a loss correction method that relies on dynamic weights computed based on the model training. We extend the existing Contrast to Divide algorithm coupled with DivideMix using a new class-conditional weighted scheme. We validate the method using the standard noise experiments and achieved encouraging results. |
Address |
Virtual; February 2022 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
VISAPP |
Notes |
MILAB; not mentioned |
Approved |
no |
Call Number |
Admin @ si @ NMM2022 |
Serial |
3798 |
Permanent link to this record |
|
|
|
Author |
Guillem Martinez; Maya Aghaei; Martin Dijkstra; Bhalaji Nagarajan; Femke Jaarsma; Jaap van de Loosdrecht; Petia Radeva; Klaas Dijkstra |
Title |
Hyper-Spectral Imaging for Overlapping Plastic Flakes Segmentation |
Type |
Conference Article |
Year |
2022 |
Publication |
47th International Conference on Acoustics, Speech, and Signal Processing |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
|
Keywords |
Hyper-spectral imaging; plastic sorting; multi-label segmentation; bitfield encoding |
Abstract |

Address |
Singapore; May 2022 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICASSP |
Notes |
MILAB; no proj |
Approved |
no |
Call Number |
Admin @ si @ MAD2022 |
Serial |
3767 |
Permanent link to this record |
|
|
|
Author |
Spencer Low; Oliver Nina; Angel Sappa; Erik Blasch; Nathan Inkawhich |
Title |
Multi-Modal Aerial View Object Classification Challenge Results – PBVS 2022 |
Type |
Conference Article |
Year |
2022 |
Publication |
IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
350-358 |
Keywords |
|
Abstract |
This paper details the results and main findings of the second iteration of the Multi-modal Aerial View Object Classification (MAVOC) challenge. The primary goal of both MAVOC challenges is to inspire research into methods for building recognition models that utilize both synthetic aperture radar (SAR) and electro-optical (EO) imagery. Teams are encouraged to develop multi-modal approaches that incorporate complementary information from both domains. While the 2021 challenge showed a proof of concept that both modalities could be used together, the 2022 challenge focuses on the detailed multi-modal methods. The 2022 challenge uses the same UNIfied Coincident Optical and Radar for recognitioN (UNICORN) dataset and competition format that was used in 2021. Specifically, the challenge focuses on two tasks, (1) SAR classification and (2) SAR + EO classification. The bulk of this document is dedicated to discussing the top performing methods and describing their performance on our blind test set. Notably, all of the top ten teams outperform a Resnet-18 baseline. For SAR classification, the top team showed a 129% improvement over baseline and an 8% average improvement from the 2021 winner. The top team for SAR + EO classification shows a 165% improvement with a 32% average improvement over 2021. |
Address |
New Orleans; USA; June 2022 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
CVPRW |
Notes |
MSIAU |
Approved |
no |
Call Number |
Admin @ si @ LNS2022 |
Serial |
3768 |
Permanent link to this record |
|
|
|
Author |
Rafael E. Rivadeneira; Angel Sappa; Boris X. Vintimilla; Riad I. Hammoud |
Title |
A Novel Domain Transfer-Based Approach for Unsupervised Thermal Image Super-Resolution |
Type |
Journal Article |
Year |
2022 |
Publication |
Sensors |
Abbreviated Journal |
SENS |
Volume |
22 |
Issue |
6 |
Pages |
2254 |
Keywords |
Thermal image super-resolution; unsupervised super-resolution; thermal images; attention module; semiregistered thermal images |
Abstract |
This paper presents a transfer domain strategy to tackle the limitations of low-resolution thermal sensors and generate higher-resolution images of reasonable quality. The proposed technique employs a CycleGAN architecture and uses a ResNet as an encoder in the generator along with an attention module and a novel loss function. The network is trained on a multi-resolution thermal image dataset acquired with three different thermal sensors. Results report better performance benchmarking results on the 2nd CVPR-PBVS-2021 thermal image super-resolution challenge than state-of-the-art methods. The code of this work is available online. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
MSIAU |
Approved |
no |
Call Number |
Admin @ si @ RSV2022b |
Serial |
3688 |
Permanent link to this record |
|
|
|
Author |
Mohamed Ramzy Ibrahim; Robert Benavente; Felipe Lumbreras; Daniel Ponsa |
Title |
3DRRDB: Super Resolution of Multiple Remote Sensing Images using 3D Residual in Residual Dense Blocks |
Type |
Conference Article |
Year |
2022 |
Publication |
CVPR 2022 IEEE Workshop on Perception Beyond the Visible Spectrum (PBVS, 18th Edition) |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
|
Keywords |
Training; Solid modeling; Three-dimensional displays; PSNR; Convolution; Superresolution; Pattern recognition |
Abstract |
The rapid advancement of Deep Convolutional Neural Networks helped in solving many remote sensing problems, especially the problems of super-resolution. However, most state-of-the-art methods focus more on Single Image Super-Resolution neglecting Multi-Image Super-Resolution. In this work, a new proposed 3D Residual in Residual Dense Blocks model (3DRRDB) focuses on remote sensing Multi-Image Super-Resolution for two different single spectral bands. The proposed 3DRRDB model explores the idea of 3D convolution layers in deeply connected Dense Blocks and the effect of local and global residual connections with residual scaling in Multi-Image Super-Resolution. The model tested on the Proba-V challenge dataset shows a significant improvement above the current state-of-the-art models scoring a Corrected Peak Signal to Noise Ratio (cPSNR) of 48.79 dB and 50.83 dB for Near Infrared (NIR) and RED Bands respectively. Moreover, the proposed 3DRRDB model scores a Corrected Structural Similarity Index Measure (cSSIM) of 0.9865 and 0.9909 for NIR and RED bands respectively. |
Address |
New Orleans; USA; 19 June 2022 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
CVPRW |
Notes |
MSIAU; 600.130 |
Approved |
no |
Call Number |
Admin @ si @ IBL2022 |
Serial |
3693 |
Permanent link to this record |