|
Records |
Links |
|
Author |
Y. Mori; M. Misawa; Jorge Bernal; M. Bretthauer; S. Kudo; A. Rastogi; Gloria Fernandez Esparrach |
|
|
Title |
Artificial Intelligence for Disease Diagnosis-the Gold Standard Challenge |
Type |
Journal Article |
|
Year |
2022 |
Publication |
Gastrointestinal Endoscopy |
Abbreviated Journal |
|
|
|
Volume |
96 |
Issue |
2 |
Pages |
370-372 |
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ISE |
Approved |
no |
|
|
Call Number |
Admin @ si @ MMB2022 |
Serial |
3701 |
|
|
|
|
|
Author |
Bojana Gajic; Ariel Amato; Ramon Baldrich; Joost Van de Weijer; Carlo Gatta |
|
|
Title |
Area Under the ROC Curve Maximization for Metric Learning |
Type |
Conference Article |
|
Year |
2022 |
Publication |
CVPR 2022 Workshop on Efficient Deep Learning for Computer Vision (ECV 2022, 5th Edition) |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
Training; Computer vision; Conferences; Area measurement; Benchmark testing; Pattern recognition |
|
|
Abstract |
Most popular metric learning losses have no direct relation with the evaluation metrics that are subsequently applied to evaluate their performance. We hypothesize that training a metric learning model by maximizing the area under the ROC curve (which is a typical performance measure of recognition systems) can induce an implicit ranking suitable for retrieval problems. This hypothesis is supported by previous work that proved that a curve dominates in ROC space if and only if it dominates in Precision-Recall space. To test this hypothesis, we design and maximize an approximated, derivable relaxation of the area under the ROC curve. The proposed AUC loss achieves state-of-the-art results on two large scale retrieval benchmark datasets (Stanford Online Products and DeepFashion In-Shop). Moreover, the AUC loss achieves comparable performance to more complex, domain specific, state-of-the-art methods for vehicle re-identification. |
|
|
Address |
New Orleans, USA; 20 June 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CVPRW |
|
|
Notes |
CIC; LAMP; |
Approved |
no |
|
|
Call Number |
Admin @ si @ GAB2022 |
Serial |
3700 |
|
|
|
|
|
Author |
Guillermo Torres; Sonia Baeza; Carles Sanchez; Ignasi Guasch; Antoni Rosell; Debora Gil |
|
|
Title |
An Intelligent Radiomic Approach for Lung Cancer Screening |
Type |
Journal Article |
|
Year |
2022 |
Publication |
Applied Sciences |
Abbreviated Journal |
APPLSCI |
|
|
Volume |
12 |
Issue |
3 |
Pages |
1568 |
|
|
Keywords |
Lung cancer; Early diagnosis; Screening; Neural networks; Image embedding; Architecture optimization |
|
|
Abstract |
The efficiency of lung cancer screening for reducing mortality is hindered by the high rate of false positives. Artificial intelligence applied to radiomics could help to discard benign cases early from the analysis of CT scans. The available amount of data and the fact that benign cases are a minority constitute a main challenge for the successful use of state-of-the-art methods (like deep learning), which can be biased, over-fitted, and lacking in clinical reproducibility. We present a hybrid approach combining the potential of radiomic features to characterize nodules in CT scans and the generalization of feed-forward networks. In order to obtain maximal reproducibility with minimal training data, we propose an embedding of nodules based on the statistical significance of radiomic features for malignancy detection. This representation space of lesions is the input to a feed-forward network, whose architecture and hyperparameters are optimized using custom-defined metrics of the diagnostic power of the whole system. Results of the best model on an independent set of patients achieve 100% sensitivity and 83% specificity (AUC = 0.94) for malignancy detection. |
|
|
Address |
Jan 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
IAM; 600.139; 600.145 |
Approved |
no |
|
|
Call Number |
Admin @ si @ TBS2022 |
Serial |
3699 |
|
|
|
|
|
Author |
Razieh Rastgoo; Kourosh Kiani; Sergio Escalera; Vassilis Athitsos; Mohammad Sabokrou |
|
|
Title |
All You Need In Sign Language Production |
Type |
Miscellaneous |
|
Year |
2022 |
Publication |
Arxiv |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
Sign Language Production; Sign Language Recognition; Sign Language Translation; Deep Learning; Survey; Deaf |
|
|
Abstract |
Sign Language is the dominant form of communication used in the deaf and hearing-impaired community. To enable easy, mutual communication between the hearing-impaired and the hearing communities, building a robust system capable of translating spoken language into sign language and vice versa is fundamental. To this end, sign language recognition and production are two necessary parts of such a two-way system. Sign language recognition and production need to cope with some critical challenges. In this survey, we review recent advances in Sign Language Production (SLP) and related areas using deep learning. To give a more realistic perspective on sign language, we present an introduction to Deaf culture, Deaf centers, the psychological perspective of sign language, and the main differences between spoken language and sign language. Furthermore, we present the fundamental components of a bi-directional sign language translation system, discussing the main challenges in this area. Also, the backbone architectures and methods in SLP are briefly introduced and the proposed taxonomy on SLP is presented. Finally, a general framework for SLP and performance evaluation is presented, together with a discussion of recent developments, advantages, and limitations in SLP and comments on possible lines for future research. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
HuPBA; does not mention |
Approved |
no |
|
|
Call Number |
Admin @ si @ RKE2022c |
Serial |
3698 |
|
|
|
|
|
Author |
Miquel Angel Piera; Jose Luis Muñoz; Debora Gil; Gonzalo Martin; Jordi Manzano |
|
|
Title |
A Socio-Technical Simulation Model for the Design of the Future Single Pilot Cockpit: An Opportunity to Improve Pilot Performance |
Type |
Journal Article |
|
Year |
2022 |
Publication |
IEEE Access |
Abbreviated Journal |
ACCESS |
|
|
Volume |
10 |
Issue |
|
Pages |
22330-22343 |
|
|
Keywords |
Human factors; Performance evaluation; Simulation; Sociotechnical systems; System performance |
|
|
Abstract |
The future deployment of single pilot operations must be supported by new cockpit computer services. Such services require an adaptive context-aware integration of technical functionalities with the concurrent tasks that a pilot must deal with. Advanced artificial intelligence supporting services and improved communication capabilities are the key enabling technologies that will render future cockpits more integrated with the present digitalized air traffic management system. However, an issue in the integration of such technologies is the lack of socio-technical analysis in the design of these teaming mechanisms. A key factor in determining how and when a service support should be provided is the dynamic evolution of pilot workload. This paper investigates how the socio-technical model-based systems engineering approach paves the way for the design of a digital assistant framework by formalizing this workload. The model was validated in an Airbus A-320 cockpit simulator, and the results confirmed the degraded pilot behavioral model and the performance impact according to different contextual flight deck information. This study contributes to practical knowledge for designing human-machine task-sharing systems. |
|
|
Address |
Feb 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
IAM; |
Approved |
no |
|
|
Call Number |
Admin @ si @ PMG2022 |
Serial |
3697 |
|
|
|
|
|
Author |
David Berga; Xavier Otazu |
|
|
Title |
A Neurodynamic Model of Saliency Prediction in V1 |
Type |
Journal Article |
|
Year |
2022 |
Publication |
Neural Computation |
Abbreviated Journal |
NEURALCOMPUT |
|
|
Volume |
34 |
Issue |
2 |
Pages |
378-414 |
|
|
Keywords |
|
|
|
Abstract |
Lateral connections in the primary visual cortex (V1) have long been hypothesized to be responsible for several visual processing mechanisms such as brightness induction, chromatic induction, visual discomfort, and bottom-up visual attention (also named saliency). Many computational models have been developed to independently predict these and other visual processes, but no computational model has been able to reproduce all of them simultaneously. In this work, we show that a biologically plausible computational model of lateral interactions of V1 is able to simultaneously predict saliency and all the aforementioned visual processes. Our model's architecture (NSWAM) is based on Penacchio's neurodynamic model of lateral connections of V1. It is defined as a network of firing rate neurons, sensitive to visual features such as brightness, color, orientation, and scale. We tested NSWAM saliency predictions using images from several eye tracking data sets. We show that the accuracy of predictions obtained by our architecture, using shuffled metrics, is similar to other state-of-the-art computational methods, particularly with synthetic images (CAT2000-Pattern and SID4VAM) that mainly contain low-level features. Moreover, we outperform other biologically inspired saliency models that are specifically designed to exclusively reproduce saliency. We show that our biologically plausible model of lateral connections can simultaneously explain different visual processes present in V1 (without applying any type of training or optimization and keeping the same parameterization for all the visual processes). This can be useful for the definition of a unified architecture of the primary visual cortex. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
NEUROBIT; 600.128; 600.120 |
Approved |
no |
|
|
Call Number |
Admin @ si @ BeO2022 |
Serial |
3696 |
|
|
|
|
|
Author |
Josep Brugues Pujolras; Lluis Gomez; Dimosthenis Karatzas |
|
|
Title |
A Multilingual Approach to Scene Text Visual Question Answering |
Type |
Conference Article |
|
Year |
2022 |
Publication |
Document Analysis Systems. 15th IAPR International Workshop (DAS 2022) |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
65-79 |
|
|
Keywords |
Scene text; Visual question answering; Multilingual word embeddings; Vision and language; Deep learning |
|
|
Abstract |
Scene Text Visual Question Answering (ST-VQA) has recently emerged as a hot research topic in Computer Vision. Current ST-VQA models have great potential for many types of applications but lack the ability to perform well on more than one language at a time, due to the lack of multilingual data as well as the use of monolingual word embeddings for training. In this work, we explore the possibility of obtaining bilingual and multilingual VQA models. In that regard, we use an already established VQA model that uses monolingual word embeddings as part of its pipeline and replace them with FastText and BPEmb multilingual word embeddings that have been aligned to English. Our experiments demonstrate that it is possible to obtain bilingual and multilingual VQA models with a minimal loss in performance in languages not used during training, as well as a multilingual model trained in multiple languages that matches the performance of the respective monolingual baselines. |
|
|
Address |
La Rochelle, France; May 22–25, 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
DAS |
|
|
Notes |
DAG; 611.004; 600.155; 601.002 |
Approved |
no |
|
|
Call Number |
Admin @ si @ BGK2022b |
Serial |
3695 |
|
|
|
|
|
Author |
Adria Molina; Lluis Gomez; Oriol Ramos Terrades; Josep Llados |
|
|
Title |
A Generic Image Retrieval Method for Date Estimation of Historical Document Collections |
Type |
Conference Article |
|
Year |
2022 |
Publication |
Document Analysis Systems. 15th IAPR International Workshop (DAS 2022) |
Abbreviated Journal |
|
|
|
Volume |
13237 |
Issue |
|
Pages |
583–597 |
|
|
Keywords |
Date estimation; Document retrieval; Image retrieval; Ranking loss; Smooth-nDCG |
|
|
Abstract |
Date estimation of historical document images is a challenging problem, with several contributions in the literature that lack the ability to generalize from one dataset to others. This paper presents a robust date estimation system based on a retrieval approach that generalizes well across heterogeneous collections. We use a ranking loss function named smooth-nDCG to train a Convolutional Neural Network that learns an ordering of documents for each problem. One of the main uses of the presented approach is as a tool for historical contextual retrieval: scholars could perform comparative analysis of historical images from big datasets in terms of the period in which they were produced. We provide experimental evaluation on different types of documents from real datasets of manuscript and newspaper images. |
|
|
Address |
La Rochelle, France; May 22–25, 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
DAS |
|
|
Notes |
DAG; 600.140; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ MGR2022 |
Serial |
3694 |
|
|
|
|
|
Author |
Mohamed Ramzy Ibrahim; Robert Benavente; Felipe Lumbreras; Daniel Ponsa |
|
|
Title |
3DRRDB: Super Resolution of Multiple Remote Sensing Images using 3D Residual in Residual Dense Blocks |
Type |
Conference Article |
|
Year |
2022 |
Publication |
CVPR 2022 Workshop on IEEE Perception Beyond the Visible Spectrum workshop series (PBVS, 18th Edition) |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
Training; Solid modeling; Three-dimensional displays; PSNR; Convolution; Superresolution; Pattern recognition |
|
|
Abstract |
The rapid advancement of Deep Convolutional Neural Networks helped in solving many remote sensing problems, especially the problems of super-resolution. However, most state-of-the-art methods focus more on Single Image Super-Resolution neglecting Multi-Image Super-Resolution. In this work, a new proposed 3D Residual in Residual Dense Blocks model (3DRRDB) focuses on remote sensing Multi-Image Super-Resolution for two different single spectral bands. The proposed 3DRRDB model explores the idea of 3D convolution layers in deeply connected Dense Blocks and the effect of local and global residual connections with residual scaling in Multi-Image Super-Resolution. The model tested on the Proba-V challenge dataset shows a significant improvement above the current state-of-the-art models scoring a Corrected Peak Signal to Noise Ratio (cPSNR) of 48.79 dB and 50.83 dB for Near Infrared (NIR) and RED Bands respectively. Moreover, the proposed 3DRRDB model scores a Corrected Structural Similarity Index Measure (cSSIM) of 0.9865 and 0.9909 for NIR and RED bands respectively. |
|
|
Address |
New Orleans, USA; 19 June 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CVPRW |
|
|
Notes |
MSIAU; 600.130 |
Approved |
no |
|
|
Call Number |
Admin @ si @ IBL2022 |
Serial |
3693 |
|
|
|
|
|
Author |
Wenjuan Gong; Zhang Yue; Wei Wang; Cheng Peng; Jordi Gonzalez |
|
|
Title |
Meta-MMFNet: Meta-Learning Based Multi-Model Fusion Network for Micro-Expression Recognition |
Type |
Journal Article |
|
Year |
2022 |
Publication |
ACM Transactions on Multimedia Computing, Communications, and Applications |
Abbreviated Journal |
ACMTMC |
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
Feature Fusion; Model Fusion; Meta-Learning; Micro-Expression Recognition |
|
|
Abstract |
Despite its wide applications in criminal investigations and clinical communications with patients suffering from autism, automatic micro-expression recognition remains a challenging problem because of the lack of training data and imbalanced classes problems. In this study, we proposed a meta-learning based multi-model fusion network (Meta-MMFNet) to solve the existing problems. The proposed method is based on the metric-based meta-learning pipeline, which is specifically designed for few-shot learning and is suitable for model-level fusion. The frame difference and optical flow features were fused, deep features were extracted from the fused feature, and finally in the meta-learning-based framework, weighted sum model fusion method was applied for micro-expression classification. Meta-MMFNet achieved better results than state-of-the-art methods on four datasets. The code is available at https://github.com/wenjgong/meta-fusion-based-method. |
|
|
Address |
May 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ISE; 600.157 |
Approved |
no |
|
|
Call Number |
Admin @ si @ GYW2022 |
Serial |
3692 |
|
|
|
|
|
Author |
Rafael E. Rivadeneira; Angel Sappa; Boris X. Vintimilla |
|
|
Title |
Multi-Image Super-Resolution for Thermal Images |
Type |
Conference Article |
|
Year |
2022 |
Publication |
17th International Conference on Computer Vision Theory and Applications (VISAPP 2022) |
Abbreviated Journal |
|
|
|
Volume |
4 |
Issue |
|
Pages |
635-642 |
|
|
Keywords |
Thermal Images; Multi-view; Multi-frame; Super-Resolution; Deep Learning; Attention Block |
|
|
Abstract |
This paper proposes a novel CNN architecture for the multi-thermal image super-resolution problem. In the proposed scheme, the multi-images are synthetically generated by downsampling and slightly shifting the given image; noise is also added to each of these synthesized images. The proposed architecture uses two attention-block paths to extract high-frequency details, taking advantage of the large amount of information extracted from multiple images of the same scene. Experimental results are provided, showing that the proposed scheme outperforms the state-of-the-art approaches. |
|
|
Address |
Online; Feb 6-8, 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
VISAPP |
|
|
Notes |
MSIAU; 601.349 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RSV2022a |
Serial |
3690 |
|
|
|
|
|
Author |
Jorge Charco; Angel Sappa; Boris X. Vintimilla |
|
|
Title |
Human Pose Estimation through a Novel Multi-view Scheme |
Type |
Conference Article |
|
Year |
2022 |
Publication |
17th International Conference on Computer Vision Theory and Applications (VISAPP 2022) |
Abbreviated Journal |
|
|
|
Volume |
5 |
Issue |
|
Pages |
855-862 |
|
|
Keywords |
Multi-view Scheme; Human Pose Estimation; Relative Camera Pose; Monocular Approach |
|
|
Abstract |
This paper presents a multi-view scheme to tackle the challenging problem of self-occlusion in human pose estimation. The proposed approach first obtains the human body joints from a set of images captured from different views at the same time. Then, it enhances the obtained joints by using a multi-view scheme. Basically, the joints from a given view are used to enhance poorly estimated joints from another view, especially to tackle self-occlusion cases. A network architecture initially proposed for the monocular case is adapted for use in the proposed multi-view scheme. Experimental results and comparisons with state-of-the-art approaches on the Human3.6m dataset are presented, showing improvements in the accuracy of body joint estimation. |
|
|
Address |
Online; Feb 6-8, 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
2184-4321 |
ISBN |
978-989-758-555-5 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
VISAPP |
|
|
Notes |
MSIAU; 600.160 |
Approved |
no |
|
|
Call Number |
Admin @ si @ CSV2022 |
Serial |
3689 |
|
|
|
|
|
Author |
Rafael E. Rivadeneira; Angel Sappa; Boris X. Vintimilla; Riad I. Hammoud |
|
|
Title |
A Novel Domain Transfer-Based Approach for Unsupervised Thermal Image Super-Resolution |
Type |
Journal Article |
|
Year |
2022 |
Publication |
Sensors |
Abbreviated Journal |
SENS |
|
|
Volume |
22 |
Issue |
6 |
Pages |
2254 |
|
|
Keywords |
Thermal image super-resolution; unsupervised super-resolution; thermal images; attention module; semiregistered thermal images |
|
|
Abstract |
This paper presents a transfer domain strategy to tackle the limitations of low-resolution thermal sensors and generate higher-resolution images of reasonable quality. The proposed technique employs a CycleGAN architecture and uses a ResNet as an encoder in the generator along with an attention module and a novel loss function. The network is trained on a multi-resolution thermal image dataset acquired with three different thermal sensors. Results show better benchmark performance on the 2nd CVPR-PBVS-2021 thermal image super-resolution challenge than state-of-the-art methods. The code of this work is available online. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
MSIAU; |
Approved |
no |
|
|
Call Number |
Admin @ si @ RSV2022b |
Serial |
3688 |
|
|
|
|
|
Author |
Zhaocheng Liu; Luis Herranz; Fei Yang; Saiping Zhang; Shuai Wan; Marta Mrak; Marc Gorriz |
|
|
Title |
Slimmable Video Codec |
Type |
Conference Article |
|
Year |
2022 |
Publication |
CVPR 2022 Workshop and Challenge on Learned Image Compression (CLIC 2022, 5th Edition) |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
1742-1746 |
|
|
Keywords |
|
|
|
Abstract |
Neural video compression has emerged as a novel paradigm combining trainable multilayer neural networks and machine learning, achieving competitive rate-distortion (RD) performance, but still remaining impractical due to heavy neural architectures with large memory and computational demands. In addition, models are usually optimized for a single RD tradeoff. Recent slimmable image codecs can dynamically adjust their model capacity to gracefully reduce the memory and computation requirements, without harming RD performance. In this paper we propose a slimmable video codec (SlimVC), by integrating a slimmable temporal entropy model in a slimmable autoencoder. Despite a significantly more complex architecture, we show that slimming remains a powerful mechanism to control rate, memory footprint, computational cost and latency, all being important requirements for practical video compression. |
|
|
Address |
Virtual; 19 June 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CVPRW |
|
|
Notes |
MACO; 601.379; 601.161 |
Approved |
no |
|
|
Call Number |
Admin @ si @ LHY2022 |
Serial |
3687 |
|
|
|
|
|
Author |
Kai Wang; Xialei Liu; Andrew Bagdanov; Luis Herranz; Shangling Jui; Joost Van de Weijer |
|
|
Title |
Incremental Meta-Learning via Episodic Replay Distillation for Few-Shot Image Recognition |
Type |
Conference Article |
|
Year |
2022 |
Publication |
CVPR 2022 Workshop on Continual Learning (CLVision, 3rd Edition) |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
3728-3738 |
|
|
Keywords |
Training; Computer vision; Image recognition; Upper bound; Conferences; Pattern recognition; Task analysis |
|
|
Abstract |
In this paper we consider the problem of incremental meta-learning in which classes are presented incrementally in discrete tasks. We propose Episodic Replay Distillation (ERD), which mixes classes from the current task with exemplars from previous tasks when sampling episodes for meta-learning. To allow the training to benefit from as large a variety of classes as possible, which leads to more generalizable feature representations, we propose the cross-task meta loss. Furthermore, we propose episodic replay distillation that also exploits exemplars for improved knowledge distillation. Experiments on four datasets demonstrate that ERD surpasses the state-of-the-art. In particular, on the more challenging one-shot, long task sequence scenarios, we reduce the gap between Incremental Meta-Learning and the joint-training upper bound from 3.5% / 10.1% / 13.4% / 11.7% with the current state-of-the-art to 2.6% / 2.9% / 5.0% / 0.2% with our method on Tiered-ImageNet / Mini-ImageNet / CIFAR100 / CUB, respectively. |
|
|
Address |
New Orleans, USA; 20 June 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CVPRW |
|
|
Notes |
LAMP; 600.147 |
Approved |
no |
|
|
Call Number |
Admin @ si @ WLB2022 |
Serial |
3686 |
|