Records | |||||
---|---|---|---|---|---|
Author | Pau Riba; Sounak Dey; Ali Furkan Biten; Josep Llados | ||||
Title | Localizing Infinity-shaped fishes: Sketch-guided object localization in the wild | Type | Miscellaneous | ||
Year | 2021 | Publication | Arxiv | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | This work investigates the problem of sketch-guided object localization (SGOL), where human sketches are used as queries to conduct the object localization in natural images. In this cross-modal setting, we first contribute a tough-to-beat baseline that, without any specific SGOL training, is able to outperform the previous works on a fixed set of classes. The baseline is useful to analyze the performance of SGOL approaches based on available simple yet powerful methods. We advance prior art by proposing a sketch-conditioned DETR (DEtection TRansformer) architecture which avoids a hard classification and alleviates the domain gap between sketches and images to localize object instances. Although the main goal of SGOL is focused on object detection, we explored its natural extension to sketch-guided instance segmentation. This novel task allows moving towards identifying the objects at pixel level, which is of key importance in several applications. We experimentally demonstrate that our model and its variants significantly advance over previous state-of-the-art results. All training and testing code of our model will be released to facilitate future research: https://github.com/priba/sgol_wild. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG; 600.121 | Approved | no | ||
Call Number | Admin @ si @ RDB2021 | Serial | 3674 | ||
Permanent link to this record | |||||
Author | Albert Suso; Pau Riba; Oriol Ramos Terrades; Josep Llados | ||||
Title | A Self-supervised Inverse Graphics Approach for Sketch Parametrization | Type | Conference Article | ||
Year | 2021 | Publication | 16th International Conference on Document Analysis and Recognition | Abbreviated Journal | |
Volume | 12916 | Issue | Pages | 28-42 | |
Keywords | |||||
Abstract | The study of neural generative models of handwritten text and human sketches is a hot topic in the computer vision field. The landmark SketchRNN provided a breakthrough by sequentially generating sketches as a sequence of waypoints, and more recent articles have managed to generate fully vector sketches by coding the strokes as Bézier curves. However, the previous attempts with this approach all need a ground truth consisting of the sequence of points that make up each stroke, which seriously limits the datasets the model is able to train on. In this work, we present a self-supervised end-to-end inverse graphics approach that learns to embed each image to its best fit of Bézier curves. The self-supervised nature of the training process allows us to train the model on a wider range of datasets, but also to perform better after-training predictions by applying an overfitting process on the input binary image. We report qualitative and quantitative evaluations on the MNIST and the Quick, Draw! datasets. | ||||
Address | Lausanne; Switzerland; September 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; 600.121 | Approved | no | ||
Call Number | Admin @ si @ SRR2021 | Serial | 3675 | ||
Permanent link to this record | |||||
Author | Sanket Biswas; Pau Riba; Josep Llados; Umapada Pal | ||||
Title | Graph-Based Deep Generative Modelling for Document Layout Generation | Type | Conference Article | ||
Year | 2021 | Publication | 16th International Conference on Document Analysis and Recognition | Abbreviated Journal | |
Volume | 12917 | Issue | Pages | 525-537 | |
Keywords | |||||
Abstract | One of the major prerequisites for any deep learning approach is the availability of large-scale training data. When dealing with scanned document images in real-world scenarios, the principal information of their content is stored in the layout itself. In this work, we propose an automated deep generative model using Graph Neural Networks (GNNs) to generate synthetic data with highly variable and plausible document layouts that can be used to train document interpretation systems, in this case, especially in digital mailroom applications. It is also the first graph-based approach for the document layout generation task evaluated on administrative document images, in this case, invoices. | ||||
Address | Lausanne; Switzerland; September 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG; 600.121; 600.140; 110.312 | Approved | no | ||
Call Number | Admin @ si @ BRL2021 | Serial | 3676 | ||
Permanent link to this record | |||||
Author | Josep Llados | ||||
Title | The 5G of Document Intelligence | Type | Conference Article | ||
Year | 2021 | Publication | 3rd Workshop on Future of Document Analysis and Recognition | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | |||||
Address | Lausanne; Switzerland; September 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | FDAR | ||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ | Serial | 3677 | ||
Permanent link to this record | |||||
Author | Mohamed Ali Souibgui; Sanket Biswas; Sana Khamekhem Jemni; Yousri Kessentini; Alicia Fornes; Josep Llados; Umapada Pal | ||||
Title | DocEnTr: An End-to-End Document Image Enhancement Transformer | Type | Conference Article | ||
Year | 2022 | Publication | 26th International Conference on Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 1699-1705 | ||
Keywords | Degradation; Head; Optical character recognition; Self-supervised learning; Benchmark testing; Transformers; Magnetic heads | ||||
Abstract | Document images can be affected by many degradation scenarios, which cause recognition and processing difficulties. In this age of digitization, it is important to denoise them for proper usage. To address this challenge, we present a new encoder-decoder architecture based on vision transformers to enhance both machine-printed and handwritten document images, in an end-to-end fashion. The encoder operates directly on the pixel patches with their positional information without the use of any convolutional layers, while the decoder reconstructs a clean image from the encoded patches. The experiments conducted show the superiority of the proposed model compared to the state-of-the-art methods on several DIBCO benchmarks. Code and models will be publicly available at: https://github.com/dali92002/DocEnTR | ||||
Address | August 21-25, 2022, Montréal, Québec | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICPR | ||
Notes | DAG; 600.121; 600.162; 602.230; 600.140 | Approved | no | ||
Call Number | Admin @ si @ SBJ2022 | Serial | 3730 | ||
Permanent link to this record | |||||
Author | Fei Yang; Yaxing Wang; Luis Herranz; Yongmei Cheng; Mikhail Mozerov | ||||
Title | A Novel Framework for Image-to-image Translation and Image Compression | Type | Journal Article | ||
Year | 2022 | Publication | Neurocomputing | Abbreviated Journal | NEUCOM |
Volume | 508 | Issue | Pages | 58-70 | |
Keywords | |||||
Abstract | Data-driven paradigms using machine learning are becoming ubiquitous in image processing and communications. In particular, image-to-image (I2I) translation is a generic and widely used approach to image processing problems, such as image synthesis, style transfer, and image restoration. At the same time, neural image compression has emerged as a data-driven alternative to traditional coding approaches in visual communications. In this paper, we study the combination of these two paradigms into a joint I2I compression and translation framework, focusing on multi-domain image synthesis. We first propose distributed I2I translation by integrating quantization and entropy coding into an I2I translation framework (i.e. I2Icodec). In practice, the image compression functionality (i.e. autoencoding) is also desirable, which requires deploying a regular image codec alongside I2Icodec. Thus, we further propose a unified framework that allows both translation and autoencoding capabilities in a single codec. Adaptive residual blocks conditioned on the translation/compression mode provide flexible adaptation to the desired functionality. The experiments show promising results in both I2I translation and image compression using a single model. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | LAMP | Approved | no | ||
Call Number | Admin @ si @ YWH2022 | Serial | 3679 | ||
Permanent link to this record | |||||
Author | AN Ruchai; VI Kober; KA Dorofeev; VN Karnaukhov; Mikhail Mozerov | ||||
Title | Classification of breast abnormalities using a deep convolutional neural network and transfer learning | Type | Journal Article | ||
Year | 2021 | Publication | Journal of Communications Technology and Electronics | Abbreviated Journal | |
Volume | 66 | Issue | 6 | Pages | 778–783 |
Keywords | |||||
Abstract | A new algorithm for classification of breast pathologies in digital mammography using a convolutional neural network and transfer learning is proposed. The following pretrained neural networks were chosen: MobileNetV2, InceptionResNetV2, Xception, and ResNetV2. All mammographic images were pre-processed to improve classification reliability. Transfer training was carried out using additional data augmentation and fine-tuning. The performance of the proposed algorithm for classification of breast pathologies in terms of accuracy on real data is discussed and compared with that of state-of-the-art algorithms on the available MIAS database. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | LAMP | Approved | no | ||
Call Number | Admin @ si @ RKD2022 | Serial | 3680 | ||
Permanent link to this record | |||||
Author | Shun Yao; Fei Yang; Yongmei Cheng; Mikhail Mozerov | ||||
Title | 3D Shapes Local Geometry Codes Learning with SDF | Type | Conference Article | ||
Year | 2021 | Publication | International Conference on Computer Vision Workshops | Abbreviated Journal | |
Volume | Issue | Pages | 2110-2117 | ||
Keywords | |||||
Abstract | A signed distance function (SDF) as the 3D shape description is one of the most effective approaches to represent 3D geometry for rendering and reconstruction. Our work is inspired by the state-of-the-art method DeepSDF [17], which learns and analyzes the 3D shape as the iso-surface of its shell and has shown promising results, especially in the 3D shape reconstruction and compression domain. In this paper, we consider the degeneration problem of reconstruction coming from the capacity decrease of the DeepSDF model, which approximates the SDF with a neural network and a single latent code. We propose Local Geometry Code Learning (LGCL), a model that improves the original DeepSDF results by learning from a local shape geometry of the full 3D shape. We add an extra graph neural network to split the single transmittable latent code into a set of local latent codes distributed on the 3D shape. These latent codes are used to approximate the SDF in their local regions, which alleviates the complexity of the approximation compared to the original DeepSDF. Furthermore, we introduce a new geometric loss function to facilitate the training of these local latent codes. Note that other local shape adjusting methods use the 3D voxel representation, which in turn is a problem that is highly difficult or even impossible to solve. In contrast, our architecture is based on graph processing implicitly and performs the learning regression process directly in the latent code space, thus making the proposed architecture more flexible and simpler to realize. Our experiments on 3D shape reconstruction demonstrate that our LGCL method can keep more details with a significantly smaller size of the SDF decoder and considerably outperforms the original DeepSDF method under the most important quantitative metrics. | ||||
Address | VIRTUAL; October 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICCVW | ||
Notes | LAMP | Approved | no | ||
Call Number | Admin @ si @ YYC2021 | Serial | 3681 | ||
Permanent link to this record | |||||
Author | Alex Gomez-Villa; Adrian Martin; Javier Vazquez; Marcelo Bertalmio; Jesus Malo | ||||
Title | On the synthesis of visual illusions using deep generative models | Type | Journal Article | ||
Year | 2022 | Publication | Journal of Vision | Abbreviated Journal | JOV |
Volume | 22(8) | Issue | 2 | Pages | 1-18 |
Keywords | |||||
Abstract | Visual illusions expand our understanding of the visual system by imposing constraints on the models in two different ways: i) visual illusions for humans should induce equivalent illusions in the model, and ii) illusions synthesized from the model should be compelling for human viewers too. These constraints are alternative strategies to find good vision models. Following the first research strategy, recent studies have shown that artificial neural network architectures also have human-like illusory percepts when stimulated with classical hand-crafted stimuli designed to fool humans. In this work we focus on the second (less explored) strategy: we propose a framework to synthesize new visual illusions using the optimization abilities of current automatic differentiation techniques. The proposed framework can be used with classical vision models as well as with more recent artificial neural network architectures. This framework, validated by psychophysical experiments, can be used to study the difference between a vision model and the actual human perception and to optimize the vision model to decrease this difference. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | LAMP; 600.161; 611.007 | Approved | no | ||
Call Number | Admin @ si @ GMV2022 | Serial | 3682 | ||
Permanent link to this record | |||||
Author | Yasuko Sugito; Javier Vazquez; Trevor Canham; Marcelo Bertalmio | ||||
Title | Image quality evaluation in professional HDR/WCG production questions the need for HDR metrics | Type | Journal Article | ||
Year | 2022 | Publication | IEEE Transactions on Image Processing | Abbreviated Journal | TIP |
Volume | 31 | Issue | Pages | 5163-5177 | |
Keywords | Measurement; Image color analysis; Image coding; Production; Dynamic range; Brightness; Extraterrestrial measurements | ||||
Abstract | In the quality evaluation of high dynamic range and wide color gamut (HDR/WCG) images, a number of works have concluded that native HDR metrics, such as HDR visual difference predictor (HDR-VDP), HDR video quality metric (HDR-VQM), or convolutional neural network (CNN)-based visibility metrics for HDR content, provide the best results. These metrics consider only the luminance component, but several color difference metrics have been specifically developed for, and validated with, HDR/WCG images. In this paper, we perform subjective evaluation experiments in a professional HDR/WCG production setting, under a real use case scenario. The results are quite relevant in that they show, firstly, that the performance of HDR metrics is worse than that of a classic, simple standard dynamic range (SDR) metric applied directly to the HDR content; and secondly, that the chrominance metrics specifically developed for HDR/WCG imaging have poor correlation with observer scores and are also outperformed by an SDR metric. Based on these findings, we show how a very simple framework for creating color HDR metrics, that uses only luminance SDR metrics, transfer functions, and classic color spaces, is able to consistently outperform, by a considerable margin, state-of-the-art HDR metrics on a varied set of HDR content, for both perceptual quantization (PQ) and Hybrid Log-Gamma (HLG) encoding, luminance and chroma distortions, and on different color spaces of common use. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | 600.161; 611.007 | Approved | no | ||
Call Number | Admin @ si @ SVG2022 | Serial | 3683 | ||
Permanent link to this record | |||||
Author | Idoia Ruiz; Joan Serrat | ||||
Title | Hierarchical Novelty Detection for Traffic Sign Recognition | Type | Journal Article | ||
Year | 2022 | Publication | Sensors | Abbreviated Journal | SENS |
Volume | 22 | Issue | 12 | Pages | 4389 |
Keywords | Novelty detection; hierarchical classification; deep learning; traffic sign recognition; autonomous driving; computer vision | ||||
Abstract | Recent works have made significant progress in novelty detection, i.e., the problem of detecting samples of novel classes, never seen during training, while classifying those that belong to known classes. However, the only information this task provides about novel samples is that they are unknown. In this work, we leverage hierarchical taxonomies of classes to provide informative outputs for samples of novel classes. We predict their closest class in the taxonomy, i.e., their parent class. We address this problem, known as hierarchical novelty detection, by proposing a novel loss, namely Hierarchical Cosine Loss, that is designed to learn class prototypes along with an embedding of discriminative features consistent with the taxonomy. We apply it to traffic sign recognition, where we predict the parent class semantics for new types of traffic signs. Our model beats state-of-the-art approaches on two large-scale traffic sign benchmarks, Mapillary Traffic Sign Dataset (MTSD) and Tsinghua-Tencent 100K (TT100K), and performs similarly on natural image benchmarks (AWA2, CUB). For TT100K and MTSD, our approach is able to detect novel samples at the correct nodes of the hierarchy with 81% and 36% accuracy, respectively, at 80% known class accuracy. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ADAS; 600.154 | Approved | no | ||
Call Number | Admin @ si @ RuS2022 | Serial | 3684 | ||
Permanent link to this record | |||||
Author | Xavier Otazu; Xim Cerda-Company | ||||
Title | The contribution of luminance and chromatic channels to color assimilation | Type | Journal Article | ||
Year | 2022 | Publication | Journal of Vision | Abbreviated Journal | JOV |
Volume | 22(6) | Issue | 10 | Pages | 1-15 |
Keywords | |||||
Abstract | Color induction is the phenomenon where the physical and the perceived colors of an object differ owing to the color distribution and the spatial configuration of the surrounding objects. Previous works studying this phenomenon in the lsY MacLeod–Boynton color space show that color assimilation is present only when the magnocellular pathway (i.e., the Y axis) is activated (i.e., when there are luminance differences). Concretely, the authors showed that the effect is mainly induced by the koniocellular pathway (s axis), but not by the parvocellular pathway (l axis), suggesting that when the magnocellular pathway is activated it inhibits the koniocellular pathway. In the present work, we study whether parvo-, konio-, and magnocellular pathways may influence each other through the color induction effect. Our results show that color assimilation does not depend on a chromatic–chromatic interaction, and that chromatic assimilation is driven by the interaction between luminance and chromatic channels (mainly the magno- and the koniocellular pathways). Our results also show that chromatic induction is greatly decreased when all three visual pathways are simultaneously activated, and that chromatic pathways could influence each other through the magnocellular (luminance) pathway. In addition, we observe that chromatic channels can influence the luminance channel, hence inducing a small brightness induction. All these results show that color induction is a highly complex process where interactions between the several visual pathways are yet unknown and should be studied in greater detail. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | Neurobit; 600.128; 600.120; 600.158 | Approved | no | ||
Call Number | Admin @ si @ OtC2022 | Serial | 3685 | ||
Permanent link to this record | |||||
Author | Kai Wang; Xialei Liu; Andrew Bagdanov; Luis Herranz; Shangling Jui; Joost Van de Weijer | ||||
Title | Incremental Meta-Learning via Episodic Replay Distillation for Few-Shot Image Recognition | Type | Conference Article | ||
Year | 2022 | Publication | CVPR 2022 Workshop on Continual Learning (CLVision, 3rd Edition) | Abbreviated Journal | |
Volume | Issue | Pages | 3728-3738 | ||
Keywords | Training; Computer vision; Image recognition; Upper bound; Conferences; Pattern recognition; Task analysis | ||||
Abstract | In this paper we consider the problem of incremental meta-learning in which classes are presented incrementally in discrete tasks. We propose Episodic Replay Distillation (ERD), which mixes classes from the current task with exemplars from previous tasks when sampling episodes for meta-learning. To allow the training to benefit from as large a variety of classes as possible, which leads to more generalizable feature representations, we propose the cross-task meta loss. Furthermore, we propose episodic replay distillation that also exploits exemplars for improved knowledge distillation. Experiments on four datasets demonstrate that ERD surpasses the state-of-the-art. In particular, on the more challenging one-shot, long task sequence scenarios, we reduce the gap between Incremental Meta-Learning and the joint-training upper bound from 3.5% / 10.1% / 13.4% / 11.7% with the current state-of-the-art to 2.6% / 2.9% / 5.0% / 0.2% with our method on Tiered-ImageNet / Mini-ImageNet / CIFAR100 / CUB, respectively. | ||||
Address | New Orleans, USA; 20 June 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVPRW | ||
Notes | LAMP; 600.147 | Approved | no | ||
Call Number | Admin @ si @ WLB2022 | Serial | 3686 | ||
Permanent link to this record | |||||
Author | Zhaocheng Liu; Luis Herranz; Fei Yang; Saiping Zhang; Shuai Wan; Marta Mrak; Marc Gorriz | ||||
Title | Slimmable Video Codec | Type | Conference Article | ||
Year | 2022 | Publication | CVPR 2022 Workshop and Challenge on Learned Image Compression (CLIC 2022, 5th Edition) | Abbreviated Journal | |
Volume | Issue | Pages | 1742-1746 | ||
Keywords | |||||
Abstract | Neural video compression has emerged as a novel paradigm combining trainable multilayer neural networks and machine learning, achieving competitive rate-distortion (RD) performances, but still remaining impractical due to heavy neural architectures, with large memory and computational demands. In addition, models are usually optimized for a single RD tradeoff. Recent slimmable image codecs can dynamically adjust their model capacity to gracefully reduce the memory and computation requirements, without harming RD performance. In this paper we propose a slimmable video codec (SlimVC), by integrating a slimmable temporal entropy model in a slimmable autoencoder. Despite a significantly more complex architecture, we show that slimming remains a powerful mechanism to control rate, memory footprint, computational cost and latency, all being important requirements for practical video compression. | ||||
Address | Virtual; 19 June 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVPRW | ||
Notes | MACO; 601.379; 601.161 | Approved | no | ||
Call Number | Admin @ si @ LHY2022 | Serial | 3687 | ||
Permanent link to this record | |||||
Author | Rafael E. Rivadeneira; Angel Sappa; Boris X. Vintimilla; Riad I. Hammoud | ||||
Title | A Novel Domain Transfer-Based Approach for Unsupervised Thermal Image Super-Resolution | Type | Journal Article | ||
Year | 2022 | Publication | Sensors | Abbreviated Journal | SENS |
Volume | 22 | Issue | 6 | Pages | 2254 |
Keywords | Thermal image super-resolution; unsupervised super-resolution; thermal images; attention module; semiregistered thermal images | ||||
Abstract | This paper presents a transfer domain strategy to tackle the limitations of low-resolution thermal sensors and generate higher-resolution images of reasonable quality. The proposed technique employs a CycleGAN architecture and uses a ResNet as an encoder in the generator along with an attention module and a novel loss function. The network is trained on a multi-resolution thermal image dataset acquired with three different thermal sensors. The results show better benchmark performance on the 2nd CVPR-PBVS-2021 thermal image super-resolution challenge than state-of-the-art methods. The code of this work is available online. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MSIAU | Approved | no | ||
Call Number | Admin @ si @ RSV2022b | Serial | 3688 | ||
Permanent link to this record |