toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
   print
  Records Links
Author Saad Minhas; Zeba Khanam; Shoaib Ehsan; Klaus McDonald Maier; Aura Hernandez-Sabate edit  doi
openurl 
  Title Weather Classification by Utilizing Synthetic Data Type Journal Article
  Year 2022 Publication Sensors Abbreviated Journal SENS  
  Volume 22 Issue 9 Pages 3193  
  Keywords (down) Weather classification; synthetic data; dataset; autonomous car; computer vision; advanced driver assistance systems; deep learning; intelligent transportation systems  
  Abstract Weather prediction from real-world images can be termed a complex task when targeting classification using neural networks. Moreover, the number of images throughout the available datasets can contain a huge amount of variance when comparing locations with the weather those images are representing. In this article, the capabilities of a custom built driver simulator are explored specifically to simulate a wide range of weather conditions. Moreover, the performance of a new synthetic dataset generated by the above simulator is also assessed. The results indicate that the use of synthetic datasets in conjunction with real-world datasets can increase the training efficiency of the CNNs by as much as 74%. The article paves a way forward to tackle the persistent problem of bias in vision-based datasets.  
  Address 21 April 2022  
  Corporate Author Thesis  
  Publisher MDPI Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes IAM; 600.139; 600.159; 600.166; 600.145; Approved no  
  Call Number Admin @ si @ MKE2022 Serial 3761  
Permanent link to this record
 

 
Author Emanuele Vivoli; Ali Furkan Biten; Andres Mafla; Dimosthenis Karatzas; Lluis Gomez edit   pdf
url  doi
openurl 
  Title MUST-VQA: MUltilingual Scene-text VQA Type Conference Article
  Year 2022 Publication Proceedings European Conference on Computer Vision Workshops Abbreviated Journal  
  Volume 13804 Issue Pages 345–358  
  Keywords (down) Visual question answering; Scene text; Translation robustness; Multilingual models; Zero-shot transfer; Power of language models  
  Abstract In this paper, we present a framework for Multilingual Scene Text Visual Question Answering that deals with new languages in a zero-shot fashion. Specifically, we consider the task of Scene Text Visual Question Answering (STVQA) in which the question can be asked in different languages and it is not necessarily aligned to the scene text language. Thus, we first introduce a natural step towards a more generalized version of STVQA: MUST-VQA. Accounting for this, we discuss two evaluation scenarios in the constrained setting, namely IID and zero-shot and we demonstrate that the models can perform on a par on a zero-shot setting. We further provide extensive experimentation and show the effectiveness of adapting multilingual language models into STVQA tasks.  
  Address Tel-Aviv; Israel; October 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ECCVW  
  Notes DAG; 302.105; 600.155; 611.002 Approved no  
  Call Number Admin @ si @ VBM2022 Serial 3770  
Permanent link to this record
 

 
Author Joakim Bruslund Haurum; Meysam Madadi; Sergio Escalera; Thomas B. Moeslund edit   pdf
url  doi
openurl 
  Title Multi-Task Classification of Sewer Pipe Defects and Properties Using a Cross-Task Graph Neural Network Decoder Type Conference Article
  Year 2022 Publication Winter Conference on Applications of Computer Vision Abbreviated Journal  
  Volume Issue Pages 2806-2817  
  Keywords (down) Vision Systems; Applications Multi-Task Classification  
  Abstract The sewerage infrastructure is one of the most important and expensive infrastructures in modern society. In order to efficiently manage the sewerage infrastructure, automated sewer inspection has to be utilized. However, while sewer
defect classification has been investigated for decades, little attention has been given to classifying sewer pipe properties such as water level, pipe material, and pipe shape, which are needed to evaluate the level of sewer pipe deterioration.
In this work we classify sewer pipe defects and properties concurrently and present a novel decoder-focused multi-task classification architecture Cross-Task Graph Neural Network (CT-GNN), which refines the disjointed per-task predictions using cross-task information. The CT-GNN architecture extends the traditional disjointed task-heads decoder, by utilizing a cross-task graph and unique class node embeddings. The cross-task graph can either be determined a priori based on the conditional probability between the task classes or determined dynamically using self-attention.
CT-GNN can be added to any backbone and trained end-toend at a small increase in the parameter count. We achieve state-of-the-art performance on all four classification tasks in the Sewer-ML dataset, improving defect classification and
water level classification by 5.3 and 8.0 percentage points, respectively. We also outperform the single task methods as well as other multi-task classification approaches while introducing 50 times fewer parameters than previous modelfocused approaches.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WACV  
  Notes HUPBA; no proj Approved no  
  Call Number Admin @ si @ BME2022 Serial 3638  
Permanent link to this record
 

 
Author Chengyi Zou; Shuai Wan; Marta Mrak; Marc Gorriz Blanch; Luis Herranz; Tiannan Ji edit  url
doi  openurl
  Title Towards Lightweight Neural Network-based Chroma Intra Prediction for Video Coding Type Conference Article
  Year 2022 Publication 29th IEEE International Conference on Image Processing Abbreviated Journal  
  Volume Issue Pages  
  Keywords (down) Video coding; Quantization (signal); Computational modeling; Neural networks; Predictive models; Video compression; Syntactics  
  Abstract In video compression the luma channel can be useful for predicting chroma channels (Cb, Cr), as has been demonstrated with the Cross-Component Linear Model (CCLM) used in Versatile Video Coding (VVC) standard. More recently, it has been shown that neural networks can even better capture the relationship among different channels. In this paper, a new attention-based neural network is proposed for cross-component intra prediction. With the goal to simplify neural network design, the new framework consists of four branches: boundary branch and luma branch for extracting features from reference samples, attention branch for fusing the first two branches, and prediction branch for computing the predicted chroma samples. The proposed scheme is integrated into VVC test model together with one additional binary block-level syntax flag which indicates whether a given block makes use of the proposed method. Experimental results demonstrate 0.31%/2.36%/2.00% BD-rate reductions on Y/Cb/Cr components, respectively, on top of the VVC Test Model (VTM) 7.0 which uses CCLM.  
  Address Bordeaux; France; October 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICIP  
  Notes MACO Approved no  
  Call Number Admin @ si @ ZWM2022 Serial 3790  
Permanent link to this record
 

 
Author Henry Velesaca; Patricia Suarez; Angel Sappa; Dario Carpio; Rafael E. Rivadeneira; Angel Sanchez edit   pdf
url  openurl
  Title Review on Common Techniques for Urban Environment Video Analytics Type Conference Article
  Year 2022 Publication Anais do III Workshop Brasileiro de Cidades Inteligentes Abbreviated Journal  
  Volume Issue Pages 107-118  
  Keywords (down) Video Analytics; Review; Urban Environments; Smart Cities  
  Abstract This work compiles the different computer vision-based approaches
from the state-of-the-art intended for video analytics in urban environments.
The manuscript groups the different approaches according to the typical modules present in video analysis, including image preprocessing, object detection,
classification, and tracking. This proposed pipeline serves as a basic guide to
representing these most representative approaches in this topic of video analysis
that will be addressed in this work. Furthermore, the manuscript is not intended
to be an exhaustive review of the most advanced approaches, but only a list of
common techniques proposed to address recurring problems in this field.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WBCI  
  Notes MSIAU; 601.349 Approved no  
  Call Number Admin @ si @ VSS2022 Serial 3773  
Permanent link to this record
 

 
Author Giacomo Magnifico; Beata Megyesi; Mohamed Ali Souibgui; Jialuo Chen; Alicia Fornes edit   pdf
url  openurl
  Title Lost in Transcription of Graphic Signs in Ciphers Type Conference Article
  Year 2022 Publication International Conference on Historical Cryptology (HistoCrypt 2022) Abbreviated Journal  
  Volume Issue Pages 153-158  
  Keywords (down) transcription of ciphers; hand-written text recognition of symbols; graphic signs  
  Abstract Hand-written Text Recognition techniques with the aim to automatically identify and transcribe hand-written text have been applied to historical sources including ciphers. In this paper, we compare the performance of two machine learning architectures, an unsupervised method based on clustering and a deep learning method with few-shot learning. Both models are tested on seen and unseen data from historical ciphers with different symbol sets consisting of various types of graphic signs. We compare the models and highlight their differences in performance, with their advantages and shortcomings.  
  Address Amsterdam, Netherlands, June 20-22, 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference HystoCrypt  
  Notes DAG; 600.121; 600.162; 602.230; 600.140 Approved no  
  Call Number Admin @ si @ MBS2022 Serial 3731  
Permanent link to this record
 

 
Author Mohamed Ramzy Ibrahim; Robert Benavente; Felipe Lumbreras; Daniel Ponsa edit   pdf
doi  openurl
  Title 3DRRDB: Super Resolution of Multiple Remote Sensing Images using 3D Residual in Residual Dense Blocks Type Conference Article
  Year 2022 Publication CVPR 2022 Workshop on IEEE Perception Beyond the Visible Spectrum workshop series (PBVS, 18th Edition) Abbreviated Journal  
  Volume Issue Pages  
  Keywords (down) Training; Solid modeling; Three-dimensional displays; PSNR; Convolution; Superresolution; Pattern recognition  
  Abstract The rapid advancement of Deep Convolutional Neural Networks helped in solving many remote sensing problems, especially the problems of super-resolution. However, most state-of-the-art methods focus more on Single Image Super-Resolution neglecting Multi-Image Super-Resolution. In this work, a new proposed 3D Residual in Residual Dense Blocks model (3DRRDB) focuses on remote sensing Multi-Image Super-Resolution for two different single spectral bands. The proposed 3DRRDB model explores the idea of 3D convolution layers in deeply connected Dense Blocks and the effect of local and global residual connections with residual scaling in Multi-Image Super-Resolution. The model tested on the Proba-V challenge dataset shows a significant improvement above the current state-of-the-art models scoring a Corrected Peak Signal to Noise Ratio (cPSNR) of 48.79 dB and 50.83 dB for Near Infrared (NIR) and RED Bands respectively. Moreover, the proposed 3DRRDB model scores a Corrected Structural Similarity Index Measure (cSSIM) of 0.9865 and 0.9909 for NIR and RED bands respectively.  
  Address New Orleans, USA; 19 June 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CVPRW  
  Notes MSIAU; 600.130 Approved no  
  Call Number Admin @ si @ IBL2022 Serial 3693  
Permanent link to this record
 

 
Author Kai Wang; Xialei Liu; Andrew Bagdanov; Luis Herranz; Shangling Jui; Joost Van de Weijer edit   pdf
doi  openurl
  Title Incremental Meta-Learning via Episodic Replay Distillation for Few-Shot Image Recognition Type Conference Article
  Year 2022 Publication CVPR 2022 Workshop on Continual Learning (CLVision, 3rd Edition) Abbreviated Journal  
  Volume Issue Pages 3728-3738  
  Keywords (down) Training; Computer vision; Image recognition; Upper bound; Conferences; Pattern recognition; Task analysis  
  Abstract In this paper we consider the problem of incremental meta-learning in which classes are presented incrementally in discrete tasks. We propose Episodic Replay Distillation (ERD), that mixes classes from the current task with exemplars from previous tasks when sampling episodes for meta-learning. To allow the training to benefit from a large as possible variety of classes, which leads to more gener-
alizable feature representations, we propose the cross-task meta loss. Furthermore, we propose episodic replay distillation that also exploits exemplars for improved knowledge distillation. Experiments on four datasets demonstrate that ERD surpasses the state-of-the-art. In particular, on the more challenging one-shot, long task sequence scenarios, we reduce the gap between Incremental Meta-Learning and
the joint-training upper bound from 3.5% / 10.1% / 13.4% / 11.7% with the current state-of-the-art to 2.6% / 2.9% / 5.0% / 0.2% with our method on Tiered-ImageNet / Mini-ImageNet / CIFAR100 / CUB, respectively.
 
  Address New Orleans, USA; 20 June 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CVPRW  
  Notes LAMP; 600.147 Approved no  
  Call Number Admin @ si @ WLB2022 Serial 3686  
Permanent link to this record
 

 
Author Bojana Gajic; Ariel Amato; Ramon Baldrich; Joost Van de Weijer; Carlo Gatta edit   pdf
doi  openurl
  Title Area Under the ROC Curve Maximization for Metric Learning Type Conference Article
  Year 2022 Publication CVPR 2022 Workshop on Efficien Deep Learning for Computer Vision (ECV 2022, 5th Edition) Abbreviated Journal  
  Volume Issue Pages  
  Keywords (down) Training; Computer vision; Conferences; Area measurement; Benchmark testing; Pattern recognition  
  Abstract Most popular metric learning losses have no direct relation with the evaluation metrics that are subsequently applied to evaluate their performance. We hypothesize that training a metric learning model by maximizing the area under the ROC curve (which is a typical performance measure of recognition systems) can induce an implicit ranking suitable for retrieval problems. This hypothesis is supported by previous work that proved that a curve dominates in ROC space if and only if it dominates in Precision-Recall space. To test this hypothesis, we design and maximize an approximated, derivable relaxation of the area under the ROC curve. The proposed AUC loss achieves state-of-the-art results on two large scale retrieval benchmark datasets (Stanford Online Products and DeepFashion In-Shop). Moreover, the AUC loss achieves comparable performance to more complex, domain specific, state-of-the-art methods for vehicle re-identification.  
  Address New Orleans, USA; 20 June 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CVPRW  
  Notes CIC; LAMP; Approved no  
  Call Number Admin @ si @ GAB2022 Serial 3700  
Permanent link to this record
 

 
Author Aneesh Rangnekar; Zachary Mulhollan; Anthony Vodacek; Matthew Hoffman; Angel Sappa; Erik Blasch; Jun Yu; Liwen Zhang; Shenshen Du; Hao Chang; Keda Lu; Zhong Zhang; Fang Gao; Ye Yu; Feng Shuang; Lei Wang; Qiang Ling; Pranjay Shyam; Kuk-Jin Yoon; Kyung-Soo Kim edit   pdf
url  doi
openurl 
  Title Semi-Supervised Hyperspectral Object Detection Challenge Results – PBVS 2022 Type Conference Article
  Year 2022 Publication IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) Abbreviated Journal  
  Volume Issue Pages 390-398  
  Keywords (down) Training; Computer visio; Conferences; Training data; Object detection; Semisupervised learning; Transformers  
  Abstract This paper summarizes the top contributions to the first semi-supervised hyperspectral object detection (SSHOD) challenge, which was organized as a part of the Perception Beyond the Visible Spectrum (PBVS) 2022 workshop at the Computer Vision and Pattern Recognition (CVPR) conference. The SSHODC challenge is a first-of-its-kind hyperspectral dataset with temporally contiguous frames collected from a university rooftop observing a 4-way vehicle intersection over a period of three days. The dataset contains a total of 2890 frames, captured at an average resolution of 1600 × 192 pixels, with 51 hyperspectral bands from 400nm to 900nm. SSHOD challenge uses 989 images as the training set, 605 images as validation set and 1296 images as the evaluation (test) set. Each set was acquired on a different day to maximize the variance in weather conditions. Labels are provided for 10% of the annotated data, hence formulating a semi-supervised learning task for the participants which is evaluated in terms of average precision over the entire set of classes, as well as individual moving object classes: namely vehicle, bus and bike. The challenge received participation registration from 38 individuals, with 8 participating in the validation phase and 3 participating in the test phase. This paper describes the dataset acquisition, with challenge formulation, proposed methods and qualitative and quantitative results.  
  Address New Orleans; USA; June 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CVPRW  
  Notes MSIAU; no menciona Approved no  
  Call Number Admin @ si @ RMV2022 Serial 3774  
Permanent link to this record
 

 
Author Rafael E. Rivadeneira; Angel Sappa; Boris X. Vintimilla edit   pdf
url  openurl
  Title Multi-Image Super-Resolution for Thermal Images Type Conference Article
  Year 2022 Publication 17th International Conference on Computer Vision Theory and Applications (VISAPP 2022) Abbreviated Journal  
  Volume 4 Issue Pages 635-642  
  Keywords (down) Thermal Images; Multi-view; Multi-frame; Super-Resolution; Deep Learning; Attention Block  
  Abstract This paper proposes a novel CNN architecture for the multi-thermal image super-resolution problem. In the proposed scheme, the multi-images are synthetically generated by downsampling and slightly shifting the given image; noise is also added to each of these synthesized images. The proposed architecture uses two
attention blocks paths to extract high-frequency details taking advantage of the large information extracted from multiple images of the same scene. Experimental results are provided, showing the proposed scheme has overcome the state-of-the-art approaches.
 
  Address Online; Feb 6-8, 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference VISAPP  
  Notes MSIAU; 601.349 Approved no  
  Call Number Admin @ si @ RSV2022a Serial 3690  
Permanent link to this record
 

 
Author Rafael E. Rivadeneira; Angel Sappa; Boris X. Vintimilla; Riad I. Hammoud edit   pdf
doi  openurl
  Title A Novel Domain Transfer-Based Approach for Unsupervised Thermal Image Super-Resolution Type Journal Article
  Year 2022 Publication Sensors Abbreviated Journal SENS  
  Volume 22 Issue 6 Pages 2254  
  Keywords (down) Thermal image super-resolution; unsupervised super-resolution; thermal images; attention module; semiregistered thermal images  
  Abstract This paper presents a transfer domain strategy to tackle the limitations of low-resolution thermal sensors and generate higher-resolution images of reasonable quality. The proposed technique employs a CycleGAN architecture and uses a ResNet as an encoder in the generator along with an attention module and a novel loss function. The network is trained on a multi-resolution thermal image dataset acquired with three different thermal sensors. Results report better performance benchmarking results on the 2nd CVPR-PBVS-2021 thermal image super-resolution challenge than state-of-the-art methods. The code of this work is available online.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MSIAU; Approved no  
  Call Number Admin @ si @ RSV2022b Serial 3688  
Permanent link to this record
 

 
Author Razieh Rastgoo; Kourosh Kiani; Sergio Escalera; Vassilis Athitsos; Mohammad Sabokrou edit   pdf
doi  openurl
  Title All You Need In Sign Language Production Type Miscellaneous
  Year 2022 Publication Arxiv Abbreviated Journal  
  Volume Issue Pages  
  Keywords (down) Sign Language Production; Sign Language Recog- nition; Sign Language Translation; Deep Learning; Survey; Deaf  
  Abstract Sign Language is the dominant form of communication language used in the deaf and hearing-impaired community. To make an easy and mutual communication between the hearing-impaired and the hearing communities, building a robust system capable of translating the spoken language into sign language and vice versa is fundamental.
To this end, sign language recognition and production are two necessary parts for making such a two-way system. Signlanguage recognition and production need to cope with some critical challenges. In this survey, we review recent advances in
Sign Language Production (SLP) and related areas using deep learning. To have more realistic perspectives to sign language, we present an introduction to the Deaf culture, Deaf centers, psychological perspective of sign language, the main differences between spoken language and sign language. Furthermore, we present the fundamental components of a bi-directional sign language translation system, discussing the main challenges in this area. Also, the backbone architectures and methods in SLP are briefly introduced and the proposed taxonomy on SLP is presented. Finally, a general framework for SLP and performance evaluation, and also a discussion on the recent developments, advantages, and limitations in SLP, commenting on possible lines for future research are presented.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HuPBA; no menciona Approved no  
  Call Number Admin @ si @ RKE2022c Serial 3698  
Permanent link to this record
 

 
Author Joakim Bruslund Haurum; Meysam Madadi; Sergio Escalera; Thomas B. Moeslund edit  doi
openurl 
  Title Multi-scale hybrid vision transformer and Sinkhorn tokenizer for sewer defect classification Type Journal Article
  Year 2022 Publication Automation in Construction Abbreviated Journal AC  
  Volume 144 Issue Pages 104614  
  Keywords (down) Sewer Defect Classification; Vision Transformers; Sinkhorn-Knopp; Convolutional Neural Networks; Closed-Circuit Television; Sewer Inspection  
  Abstract A crucial part of image classification consists of capturing non-local spatial semantics of image content. This paper describes the multi-scale hybrid vision transformer (MSHViT), an extension of the classical convolutional neural network (CNN) backbone, for multi-label sewer defect classification. To better model spatial semantics in the images, features are aggregated at different scales non-locally through the use of a lightweight vision transformer, and a smaller set of tokens was produced through a novel Sinkhorn clustering-based tokenizer using distinct cluster centers. The proposed MSHViT and Sinkhorn tokenizer were evaluated on the Sewer-ML multi-label sewer defect classification dataset, showing consistent performance improvements of up to 2.53 percentage points.  
  Address Dec 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HuPBA Approved no  
  Call Number Admin @ si @ BME2022c Serial 3780  
Permanent link to this record
 

 
Author Nil Ballus; Bhalaji Nagarajan; Petia Radeva edit  url
doi  openurl
  Title Opt-SSL: An Enhanced Self-Supervised Framework for Food Recognition Type Conference Article
  Year 2022 Publication 10th Iberian Conference on Pattern Recognition and Image Analysis Abbreviated Journal  
  Volume 13256 Issue Pages  
  Keywords (down) Self-supervised; Contrastive learning; Food recognition  
  Abstract Self-supervised Learning has been showing upbeat performance in several computer vision tasks. The popular contrastive methods make use of a Siamese architecture with different loss functions. In this work, we go deeper into two very recent state of the art frameworks, namely, SimSiam and Barlow Twins. Inspired by them, we propose a new self-supervised learning method we call Opt-SSL that combines both image and feature contrasting. We validate the proposed method on the food recognition task, showing that our proposed framework enables the self-learning networks to learn better visual representations.  
  Address Aveiro; Portugal; May 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference IbPRIA  
  Notes MILAB; no menciona Approved no  
  Call Number Admin @ si @ BNR2022 Serial 3782  
Permanent link to this record
Select All    Deselect All
 |   | 
Details
   print

Save Citations:
Export Records: