%0 Conference Proceedings
%T SWViT-RRDB: Shifted Window Vision Transformer Integrating Residual in Residual Dense Block for Remote Sensing Super-Resolution
%A Mohamed Ramzy Ibrahim
%A Robert Benavente
%A Daniel Ponsa
%A Felipe Lumbreras
%B 19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications
%D 2024
%F Mohamed Ramzy Ibrahim2024
%O MSIAU
%O exported from refbase (http://refbase.cvc.uab.es/show.php?record=4004), last updated on Tue, 12 Mar 2024 12:28:12 +0100
%X Remote sensing applications, impacted by acquisition season and sensor variety, require high-resolution images. Transformer-based models improve satellite image super-resolution but are less effective than convolutional neural networks (CNNs) at extracting local details, crucial for image clarity. This paper introduces SWViT-RRDB, a new deep learning model for satellite imagery super-resolution. The SWViT-RRDB, combining transformer with convolution and attention blocks, overcomes the limitations of existing models by better representing small objects in satellite images. In this model, a pipeline of residual fusion group (RFG) blocks is used to combine the multi-headed self-attention (MSA) with residual in residual dense block (RRDB). This combines global and local image data for better super-resolution. Additionally, an overlapping cross-attention block (OCAB) is used to enhance fusion and allow interaction between neighboring pixels to maintain long-range pixel dependencies across the image. The SWViT-RRDB model and its larger variants outperform state-of-the-art (SoTA) models on two different satellite datasets in terms of PSNR and SSIM.
%U https://www.insticc.org/node/TechnicalProgram/visigrapp/2024/presentationDetails/123993