toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
   print
  Records Links
Author Ahmed M. A. Salih; Ilaria Boscolo Galazzo; Federica Cruciani; Lorenza Brusini; Petia Radeva edit  url
doi  openurl
  Title Investigating Explainable Artificial Intelligence for MRI-based Classification of Dementia: a New Stability Criterion for Explainable Methods Type Conference Article
  Year 2022 Publication 29th IEEE International Conference on Image Processing Abbreviated Journal  
  Volume (up) Issue Pages  
  Keywords Image processing; Stability criteria; Machine learning; Robustness; Alzheimer's disease; Monitoring  
  Abstract Individuals diagnosed with Mild Cognitive Impairment (MCI) have shown an increased risk of developing Alzheimer’s Disease (AD). As such, early identification of dementia represents a key prognostic element, though hampered by complex disease patterns. Increasing efforts have focused on Machine Learning (ML) to build accurate classification models relying on a multitude of clinical/imaging variables. However, ML itself does not provide sensible explanations related to the model mechanism and feature contribution. Explainable Artificial Intelligence (XAI) represents the enabling technology in this framework, allowing to understand ML outcomes and derive human-understandable explanations. In this study, we aimed at exploring ML combined with MRI-based features and XAI to solve this classification problem and interpret the outcome. In particular, we propose a new method to assess the robustness of feature rankings provided by XAI methods, especially when multicollinearity exists. Our findings indicate that our method was able to disentangle the list of the informative features underlying dementia, with important implications for aiding personalized monitoring plans.  
  Address Bordeaux; France; October 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICIP  
  Notes MILAB Approved no  
  Call Number Admin @ si @ SBC2022 Serial 3789  
Permanent link to this record
 

 
Author Chengyi Zou; Shuai Wan; Marta Mrak; Marc Gorriz Blanch; Luis Herranz; Tiannan Ji edit  url
doi  openurl
  Title Towards Lightweight Neural Network-based Chroma Intra Prediction for Video Coding Type Conference Article
  Year 2022 Publication 29th IEEE International Conference on Image Processing Abbreviated Journal  
  Volume (up) Issue Pages  
  Keywords Video coding; Quantization (signal); Computational modeling; Neural networks; Predictive models; Video compression; Syntactics  
  Abstract In video compression the luma channel can be useful for predicting chroma channels (Cb, Cr), as has been demonstrated with the Cross-Component Linear Model (CCLM) used in Versatile Video Coding (VVC) standard. More recently, it has been shown that neural networks can even better capture the relationship among different channels. In this paper, a new attention-based neural network is proposed for cross-component intra prediction. With the goal to simplify neural network design, the new framework consists of four branches: boundary branch and luma branch for extracting features from reference samples, attention branch for fusing the first two branches, and prediction branch for computing the predicted chroma samples. The proposed scheme is integrated into VVC test model together with one additional binary block-level syntax flag which indicates whether a given block makes use of the proposed method. Experimental results demonstrate 0.31%/2.36%/2.00% BD-rate reductions on Y/Cb/Cr components, respectively, on top of the VVC Test Model (VTM) 7.0 which uses CCLM.  
  Address Bordeaux; France; October 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICIP  
  Notes MACO Approved no  
  Call Number Admin @ si @ ZWM2022 Serial 3790  
Permanent link to this record
 

 
Author Yaxing Wang; Joost Van de Weijer; Lu Yu; Shangling Jui edit  openurl
  Title Distilling GANs with Style-Mixed Triplets for X2I Translation with Limited Data Type Conference Article
  Year 2022 Publication 10th International Conference on Learning Representations Abbreviated Journal  
  Volume (up) Issue Pages  
  Keywords  
  Abstract Conditional image synthesis is an integral part of many X2I translation systems, including image-to-image, text-to-image and audio-to-image translation systems. Training these large systems generally requires huge amounts of training data.
Therefore, we investigate knowledge distillation to transfer knowledge from a high-quality unconditioned generative model (e.g., StyleGAN) to a conditioned synthetic image generation modules in a variety of systems. To initialize the conditional and reference branch (from a unconditional GAN) we exploit the style mixing characteristics of high-quality GANs to generate an infinite supply of style-mixed triplets to perform the knowledge distillation. Extensive experimental results in a number of image generation tasks (i.e., image-to-image, semantic segmentation-to-image, text-to-image and audio-to-image) demonstrate qualitatively and quantitatively that our method successfully transfers knowledge to the synthetic image generation modules, resulting in more realistic images than previous methods as confirmed by a significant drop in the FID.
 
  Address Virtual  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICLR  
  Notes LAMP; 600.147 Approved no  
  Call Number Admin @ si @ WWY2022 Serial 3791  
Permanent link to this record
 

 
Author Kai Wang; Fei Yang; Joost Van de Weijer edit   pdf
openurl 
  Title Attention Distillation: self-supervised vision transformer students need more guidance Type Conference Article
  Year 2022 Publication 33rd British Machine Vision Conference Abbreviated Journal  
  Volume (up) Issue Pages  
  Keywords  
  Abstract Self-supervised learning has been widely applied to train high-quality vision transformers. Unleashing their excellent performance on memory and compute constraint devices is therefore an important research topic. However, how to distill knowledge from one self-supervised ViT to another has not yet been explored. Moreover, the existing self-supervised knowledge distillation (SSKD) methods focus on ConvNet based architectures are suboptimal for ViT knowledge distillation. In this paper, we study knowledge distillation of self-supervised vision transformers (ViT-SSKD). We show that directly distilling information from the crucial attention mechanism from teacher to student can significantly narrow the performance gap between both. In experiments on ImageNet-Subset and ImageNet-1K, we show that our method AttnDistill outperforms existing self-supervised knowledge distillation (SSKD) methods and achieves state-of-the-art k-NN accuracy compared with self-supervised learning (SSL) methods learning from scratch (with the ViT-S model). We are also the first to apply the tiny ViT-T model on self-supervised learning. Moreover, AttnDistill is independent of self-supervised learning algorithms, it can be adapted to ViT based SSL methods to improve the performance in future research.  
  Address London; UK; November 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference BMVC  
  Notes LAMP; 600.147 Approved no  
  Call Number Admin @ si @ WYW2022 Serial 3793  
Permanent link to this record
 

 
Author Kai Wang; Chenshen Wu; Andrew Bagdanov; Xialei Liu; Shiqi Yang; Shangling Jui; Joost Van de Weijer edit  openurl
  Title Positive Pair Distillation Considered Harmful: Continual Meta Metric Learning for Lifelong Object Re-Identification Type Conference Article
  Year 2022 Publication 33rd British Machine Vision Conference Abbreviated Journal  
  Volume (up) Issue Pages  
  Keywords  
  Abstract Lifelong object re-identification incrementally learns from a stream of re-identification tasks. The objective is to learn a representation that can be applied to all tasks and that generalizes to previously unseen re-identification tasks. The main challenge is that at inference time the representation must generalize to previously unseen identities. To address this problem, we apply continual meta metric learning to lifelong object re-identification. To prevent forgetting of previous tasks, we use knowledge distillation and explore the roles of positive and negative pairs. Based on our observation that the distillation and metric losses are antagonistic, we propose to remove positive pairs from distillation to robustify model updates. Our method, called Distillation without Positive Pairs (DwoPP), is evaluated on extensive intra-domain experiments on person and vehicle re-identification datasets, as well as inter-domain experiments on the LReID benchmark. Our experiments demonstrate that DwoPP significantly outperforms the state-of-the-art.  
  Address London; UK; November 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference BMVC  
  Notes LAMP; 600.147 Approved no  
  Call Number Admin @ si @ WWB2022 Serial 3794  
Permanent link to this record
 

 
Author Vishwesh Pillai; Pranav Mehar; Manisha Das; Deep Gupta; Petia Radeva edit  url
doi  openurl
  Title Integrated Hierarchical and Flat Classifiers for Food Image Classification using Epistemic Uncertainty Type Conference Article
  Year 2022 Publication IEEE International Conference on Signal Processing and Communications Abbreviated Journal  
  Volume (up) Issue Pages  
  Keywords  
  Abstract The problem of food image recognition is an essential one in today’s context because health conditions such as diabetes, obesity, and heart disease require constant monitoring of a person’s diet. To automate this process, several models are available to recognize food images. Due to a considerable number of unique food dishes and various cuisines, a traditional flat classifier ceases to perform well. To address this issue, prediction schemes consisting of both flat and hierarchical classifiers, with the analysis of epistemic uncertainty are used to switch between the classifiers. However, the accuracy of the predictions made using epistemic uncertainty data remains considerably low. Therefore, this paper presents a prediction scheme using three different threshold criteria that helps to increase the accuracy of epistemic uncertainty predictions. The performance of the proposed method is demonstrated using several experiments performed on the MAFood-121 dataset. The experimental results validate the proposal performance and show that the proposed threshold criteria help to increase the overall accuracy of the predictions by correctly classifying the uncertainty distribution of the samples.  
  Address Bangalore; India; July 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference SPCOM  
  Notes MILAB; no menciona Approved no  
  Call Number Admin @ si @ PMD2022 Serial 3796  
Permanent link to this record
 

 
Author Javier Rodenas; Bhalaji Nagarajan; Marc Bolaños; Petia Radeva edit  url
openurl 
  Title Learning Multi-Subset of Classes for Fine-Grained Food Recognition Type Conference Article
  Year 2022 Publication 7th International Workshop on Multimedia Assisted Dietary Management Abbreviated Journal  
  Volume (up) Issue Pages 17–26  
  Keywords  
  Abstract Food image recognition is a complex computer vision task, because of the large number of fine-grained food classes. Fine-grained recognition tasks focus on learning subtle discriminative details to distinguish similar classes. In this paper, we introduce a new method to improve the classification of classes that are more difficult to discriminate based on Multi-Subsets learning. Using a pre-trained network, we organize classes in multiple subsets using a clustering technique. Later, we embed these subsets in a multi-head model structure. This structure has three distinguishable parts. First, we use several shared blocks to learn the generalized representation of the data. Second, we use multiple specialized blocks focusing on specific subsets that are difficult to distinguish. Lastly, we use a fully connected layer to weight the different subsets in an end-to-end manner by combining the neuron outputs. We validated our proposed method using two recent state-of-the-art vision transformers on three public food recognition datasets. Our method was successful in learning the confused classes better and we outperformed the state-of-the-art on the three datasets.  
  Address Lisboa; Portugal; October 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference MADiMa  
  Notes MILAB Approved no  
  Call Number Admin @ si @ RNB2022 Serial 3797  
Permanent link to this record
 

 
Author Silvio Giancola; Anthony Cioppa; Adrien Deliege; Floriane Magera; Vladimir Somers; Le Kang; Xin Zhou; Olivier Barnich; Christophe De Vleeschouwer; Alexandre Alahi; Bernard Ghanem; Marc Van Droogenbroeck; Abdulrahman Darwish; Adrien Maglo; Albert Clapes; Andreas Luyts; Andrei Boiarov; Artur Xarles; Astrid Orcesi; Avijit Shah; Baoyu Fan; Bharath Comandur; Chen Chen; Chen Zhang; Chen Zhao; Chengzhi Lin; Cheuk-Yiu Chan; Chun Chuen Hui; Dengjie Li; Fan Yang; Fan Liang; Fang Da; Feng Yan; Fufu Yu; Guanshuo Wang; H. Anthony Chan; He Zhu; Hongwei Kan; Jiaming Chu; Jianming Hu; Jianyang Gu; Jin Chen; Joao V. B. Soares; Jonas Theiner; Jorge De Corte; Jose Henrique Brito; Jun Zhang; Junjie Li; Junwei Liang; Leqi Shen; Lin Ma; Lingchi Chen; Miguel Santos Marques; Mike Azatov; Nikita Kasatkin; Ning Wang; Qiong Jia; Quoc Cuong Pham; Ralph Ewerth; Ran Song; Rengang Li; Rikke Gade; Ruben Debien; Runze Zhang; Sangrok Lee; Sergio Escalera; Shan Jiang; Shigeyuki Odashima; Shimin Chen; Shoichi Masui; Shouhong Ding; Sin-wai Chan; Siyu Chen; Tallal El-Shabrawy; Tao He; Thomas B. Moeslund; Wan-Chi Siu; Wei Zhang; Wei Li; Xiangwei Wang; Xiao Tan; Xiaochuan Li; Xiaolin Wei; Xiaoqing Ye; Xing Liu; Xinying Wang; Yandong Guo; Yaqian Zhao; Yi Yu; Yingying Li; Yue He; Yujie Zhong; Zhenhua Guo; Zhiheng Li edit  url
doi  openurl
  Title SoccerNet 2022 Challenges Results Type Conference Article
  Year 2022 Publication 5th International ACM Workshop on Multimedia Content Analysis in Sports Abbreviated Journal  
  Volume (up) Issue Pages 75-86  
  Keywords  
  Abstract The SoccerNet 2022 challenges were the second annual video understanding challenges organized by the SoccerNet team. In 2022, the challenges were composed of 6 vision-based tasks: (1) action spotting, focusing on retrieving action timestamps in long untrimmed videos, (2) replay grounding, focusing on retrieving the live moment of an action shown in a replay, (3) pitch localization, focusing on detecting line and goal part elements, (4) camera calibration, dedicated to retrieving the intrinsic and extrinsic camera parameters, (5) player re-identification, focusing on retrieving the same players across multiple views, and (6) multiple object tracking, focusing on tracking players and the ball through unedited video streams. Compared to last year's challenges, tasks (1-2) had their evaluation metrics redefined to consider tighter temporal accuracies, and tasks (3-6) were novel, including their underlying data and annotations. More information on the tasks, challenges and leaderboards are available on this https URL. Baselines and development kits are available on this https URL.  
  Address Lisboa; Portugal; October 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ACMW  
  Notes HUPBA; no menciona Approved no  
  Call Number Admin @ si @ GCD2022 Serial 3801  
Permanent link to this record
 

 
Author Patricia Suarez; Dario Carpio; Angel Sappa; Henry Velesaca edit   pdf
url  doi
openurl 
  Title Transformer based Image Dehazing Type Conference Article
  Year 2022 Publication 16th IEEE International Conference on Signal Image Technology & Internet Based System Abbreviated Journal  
  Volume (up) Issue Pages  
  Keywords atmospheric light; brightness component; computational cost; dehazing quality; haze-free image  
  Abstract This paper presents a novel approach to remove non homogeneous haze from real images. The proposed method consists mainly of image feature extraction, haze removal, and image reconstruction. To accomplish this challenging task, we propose an architecture based on transformers, which have been recently introduced and have shown great potential in different computer vision tasks. Our model is based on the SwinIR an image restoration architecture based on a transformer, but by modifying the deep feature extraction module, the depth level of the model, and by applying a combined loss function that improves styling and adapts the model for the non-homogeneous haze removal present in images. The obtained results prove to be superior to those obtained by state-of-the-art models.  
  Address Dijon; France; October 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference SITIS  
  Notes MSIAU; no proj Approved no  
  Call Number Admin @ si @ SCS2022 Serial 3803  
Permanent link to this record
 

 
Author Angel Sappa; Patricia Suarez; Henry Velesaca; Dario Carpio edit   pdf
url  openurl
  Title Domain Adaptation in Image Dehazing: Exploring the Usage of Images from Virtual Scenarios Type Conference Article
  Year 2022 Publication 16th International Conference on Computer Graphics, Visualization, Computer Vision and Image Processing Abbreviated Journal  
  Volume (up) Issue Pages 85-92  
  Keywords Domain adaptation; Synthetic hazed dataset; Dehazing  
  Abstract This work presents a novel domain adaptation strategy for deep learning-based approaches to solve the image dehazing
problem. Firstly, a large set of synthetic images is generated by using a realistic 3D graphic simulator; these synthetic
images contain different densities of haze, which are used for training the model that is later adapted to any real scenario.
The adaptation process requires just a few images to fine-tune the model parameters. The proposed strategy allows
overcoming the limitation of training a given model with few images. In other words, the proposed strategy implements
the adaptation of a haze removal model trained with synthetic images to real scenarios. It should be noticed that it is quite
difficult, if not impossible, to have large sets of pairs of real-world images (with and without haze) to train in a supervised
way dehazing algorithms. Experimental results are provided showing the validity of the proposed domain adaptation
strategy.
 
  Address Lisboa; Portugal; July 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CGVCVIP  
  Notes MSIAU; no proj Approved no  
  Call Number Admin @ si @ SSV2022 Serial 3804  
Permanent link to this record
 

 
Author Michael Teutsch; Angel Sappa; Riad I. Hammoud edit  doi
isbn  openurl
  Title Cross-Spectral Image Processing Type Book Chapter
  Year 2022 Publication Computer Vision in the Infrared Spectrum. Synthesis Lectures on Computer Vision Abbreviated Journal  
  Volume (up) Issue Pages 23-34  
  Keywords  
  Abstract Although this book is on IR computer vision and its main focus lies on IR image and video processing and analysis, a special attention is dedicated to cross-spectral image processing due to the increasing number of publications and applications in this domain. In these cross-spectral frameworks, IR information is used together with information from other spectral bands to tackle some specific problems by developing more robust solutions. Tasks considered for cross-spectral processing are for instance dehazing, segmentation, vegetation index estimation, or face recognition. This increasing number of applications is motivated by cross- and multi-spectral camera setups available already on the market like for example smartphones, remote sensing multispectral cameras, or multi-spectral cameras for automotive systems or drones. In this chapter, different cross-spectral image processing techniques will be reviewed together with possible applications. Initially, image registration approaches for the cross-spectral case are reviewed: the registration stage is the first image processing task, which is needed to align images acquired by different sensors within the same reference coordinate system. Then, recent cross-spectral image colorization approaches, which are intended to colorize infrared images for different applications are presented. Finally, the cross-spectral image enhancement problem is tackled by including guided super resolution techniques, image dehazing approaches, cross-spectral filtering and edge detection. Figure 3.1 illustrates cross-spectral image processing stages as well as their possible connections. Table 3.1 presents some of the available public cross-spectral datasets generally used as reference data to evaluate cross-spectral image registration, colorization, enhancement, or exploitation results.  
  Address  
  Corporate Author Thesis  
  Publisher Springer Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title SLCV  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-3-031-00698-2 Medium  
  Area Expedition Conference  
  Notes MSIAU; MACO Approved no  
  Call Number Admin @ si @ TSH2022b Serial 3805  
Permanent link to this record
 

 
Author Michael Teutsch; Angel Sappa; Riad I. Hammoud edit  doi
isbn  openurl
  Title Detection, Classification, and Tracking Type Book Chapter
  Year 2022 Publication Computer Vision in the Infrared Spectrum. Synthesis Lectures on Computer Vision Abbreviated Journal  
  Volume (up) Issue Pages 35-58  
  Keywords  
  Abstract Automatic image and video exploitation or content analysis is a technique to extract higher-level information from a scene such as objects, behavior, (inter-)actions, environment, or even weather conditions. The relevant information is assumed to be contained in the two-dimensional signal provided in an image (width and height in pixels) or the three-dimensional signal provided in a video (width, height, and time). But also intermediate-level information such as object classes [196], locations [197], or motion [198] can help applications to fulfill certain tasks such as intelligent compression [199], video summarization [200], or video retrieval [201]. Usually, videos with their temporal dimension are a richer source of data compared to single images [202] and thus certain video content can be extracted from videos only such as object motion or object behavior. Often, machine learning or nowadays deep learning techniques are utilized to model prior knowledge about object or scene appearance using labeled training samples [203, 204]. After a learning phase, these models are then applied in real world applications, which is called inference.  
  Address  
  Corporate Author Thesis  
  Publisher Springer Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title SLCV  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-3-031-00698-2 Medium  
  Area Expedition Conference  
  Notes MSIAU; MACO Approved no  
  Call Number Admin @ si @ TSH2022c Serial 3806  
Permanent link to this record
 

 
Author Michael Teutsch; Angel Sappa; Riad I. Hammoud edit  doi
openurl 
  Title Image and Video Enhancement Type Book Chapter
  Year 2022 Publication Computer Vision in the Infrared Spectrum. Synthesis Lectures on Computer Vision Abbreviated Journal  
  Volume (up) Issue Pages 9-21  
  Keywords  
  Abstract Image and video enhancement aims at improving the signal quality relative to imaging artifacts such as noise and blur or atmospheric perturbations such as turbulence and haze. It is usually performed in order to assist humans in analyzing image and video content or simply to present humans visually appealing images and videos. However, image and video enhancement can also be used as a preprocessing technique to ease the task and thus improve the performance of subsequent automatic image content analysis algorithms: preceding dehazing can improve object detection as shown by [23] or explicit turbulence modeling can improve moving object detection as discussed by [24]. But it remains an open question whether image and video enhancement should rather be performed explicitly as a preprocessing step or implicitly for example by feeding affected images directly to a neural network for image content analysis like object detection [25]. Especially for real-time video processing at low latency it can be better to handle image perturbation implicitly in order to minimize the processing time of an algorithm. This can be achieved by making algorithms for image content analysis robust or even invariant to perturbations such as noise or blur. Additionally, mistakes of an individual preprocessing module can obviously affect the quality of the entire processing pipeline.  
  Address  
  Corporate Author Thesis  
  Publisher Springer Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title SLCV  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MSIAU; MACO Approved no  
  Call Number Admin @ si @ TSH2022a Serial 3807  
Permanent link to this record
 

 
Author Alex Falcon; Swathikiran Sudhakaran; Giuseppe Serra; Sergio Escalera; Oswald Lanz edit   pdf
doi  openurl
  Title Relevance-based Margin for Contrastively-trained Video Retrieval Models Type Conference Article
  Year 2022 Publication ICMR '22: Proceedings of the 2022 International Conference on Multimedia Retrieval Abbreviated Journal  
  Volume (up) Issue Pages 146-157  
  Keywords  
  Abstract Video retrieval using natural language queries has attracted increasing interest due to its relevance in real-world applications, from intelligent access in private media galleries to web-scale video search. Learning the cross-similarity of video and text in a joint embedding space is the dominant approach. To do so, a contrastive loss is usually employed because it organizes the embedding space by putting similar items close and dissimilar items far. This framework leads to competitive recall rates, as they solely focus on the rank of the groundtruth items. Yet, assessing the quality of the ranking list is of utmost importance when considering intelligent retrieval systems, since multiple items may share similar semantics, hence a high relevance. Moreover, the aforementioned framework uses a fixed margin to separate similar and dissimilar items, treating all non-groundtruth items as equally irrelevant. In this paper we propose to use a variable margin: we argue that varying the margin used during training based on how much relevant an item is to a given query, i.e. a relevance-based margin, easily improves the quality of the ranking lists measured through nDCG and mAP. We demonstrate the advantages of our technique using different models on EPIC-Kitchens-100 and YouCook2. We show that even if we carefully tuned the fixed margin, our technique (which does not have the margin as a hyper-parameter) would still achieve better performance. Finally, extensive ablation studies and qualitative analysis support the robustness of our approach. Code will be released at \urlhttps://github.com/aranciokov/RelevanceMargin-ICMR22.  
  Address Newwark, NJ, USA, 27 June 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICMR  
  Notes HuPBA; no menciona Approved no  
  Call Number Admin @ si @ FSS2022 Serial 3808  
Permanent link to this record
 

 
Author Shiqi Yang; Yaxing Wang; Kai Wang; Shangling Jui; Joost Van de Weijer edit   pdf
openurl 
  Title Local Prediction Aggregation: A Frustratingly Easy Source-free Domain Adaptation Method Type Miscellaneous
  Year 2022 Publication Arxiv Abbreviated Journal  
  Volume (up) Issue Pages  
  Keywords  
  Abstract We propose a simple but effective source-free domain adaptation (SFDA) method. Treating SFDA as an unsupervised clustering problem and following the intuition that local neighbors in feature space should have more similar predictions than other features, we propose to optimize an objective of prediction consistency. This objective encourages local neighborhood features in feature space to have similar predictions while features farther away in feature space have dissimilar predictions, leading to efficient feature clustering and cluster assignment simultaneously. For efficient training, we seek to optimize an upper-bound of the objective resulting in two simple terms. Furthermore, we relate popular existing methods in domain adaptation, source-free domain adaptation and contrastive learning via the perspective of discriminability and diversity. The experimental results prove the superiority of our method, and our method can be adopted as a simple but strong baseline for future research in SFDA. Our method can be also adapted to source-free open-set and partial-set DA which further shows the generalization ability of our method. Code is available in this https URL.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; 600.147 Approved no  
  Call Number Admin @ si @ YWW2022b Serial 3815  
Permanent link to this record
Select All    Deselect All
 |   | 
Details
   print

Save Citations:
Export Records: