Author Kaustubh Kulkarni; Ciprian Corneanu; Ikechukwu Ofodile; Sergio Escalera; Xavier Baro; Sylwia Hyniewska; Juri Allik; Gholamreza Anbarjafari
  Title Automatic Recognition of Facial Displays of Unfelt Emotions Type Journal Article
  Year 2021 Publication IEEE Transactions on Affective Computing Abbreviated Journal TAC  
  Volume 12 Issue 2 Pages 377 - 390  
  Keywords  
  Abstract Humans modify their facial expressions in order to communicate their internal states and sometimes to mislead observers regarding their true emotional states. Evidence in experimental psychology shows that discriminative facial responses are short and subtle. This suggests that such behavior would be easier to distinguish when captured in high resolution at an increased frame rate. We are proposing SASE-FE, the first dataset of facial expressions that are either congruent or incongruent with underlying emotion states. We show that overall the problem of recognizing whether facial movements are expressions of authentic emotions or not can be successfully addressed by learning spatio-temporal representations of the data. For this purpose, we propose a method that aggregates features along fiducial trajectories in a deeply learnt space. Performance of the proposed model shows that on average, it is easier to distinguish among genuine facial expressions of emotion than among unfelt facial expressions of emotion and that certain emotion pairs such as contempt and disgust are more difficult to distinguish than the rest. Furthermore, the proposed methodology improves state of the art results on CK+ and OULU-CASIA datasets for video emotion recognition, and achieves competitive results when classifying facial action units on BP4D dataset.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume (up) Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HUPBA; no proj Approved no  
  Call Number Admin @ si @ KCO2021 Serial 3658  
Permanent link to this record
 

 
Author Joan Codina-Filba; Sergio Escalera; Joan Escudero; Coen Antens; Pau Buch-Cardona; Mireia Farrus
  Title Mobile eHealth Platform for Home Monitoring of Bipolar Disorder Type Conference Article
  Year 2021 Publication 27th ACM International Conference on Multimedia Modeling Abbreviated Journal  
  Volume 12573 Issue Pages 330-341  
  Keywords  
  Abstract People suffering from Bipolar Disorder (BD) experience changes in mood status, having depressive or manic episodes with normal periods in between. BD is a chronic disease with a high level of non-adherence to medication that requires continuous monitoring of patients to detect when they relapse into an episode, so that physicians can take care of them. Here we present MoodRecord, an easy-to-use, non-intrusive, multilingual, robust and scalable platform suitable for home monitoring of patients with BD, which allows physicians and relatives to track the patient's state and receive alarms when abnormalities occur.

MoodRecord takes advantage of the capabilities of smartphones as communication and recording devices to continuously monitor patients. It automatically records user activity, and asks the user to answer some questions or to record themselves on video, according to a predefined plan designed by physicians. The video is analysed: the mood status is recognised from images, and bipolar assessment scores are extracted from speech parameters. The data obtained from the different sources are merged periodically to check whether a relapse may be starting and, if so, raise the corresponding alarm. The application received a positive evaluation in a pilot with users from three different countries. During the pilot, the predictions of the voice and image modules showed a coherent correlation with the diagnosis performed by clinicians.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume (up) Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference MMM  
  Notes HUPBA; no proj Approved no  
  Call Number Admin @ si @ CEE2021 Serial 3659  
Permanent link to this record
 

 
Author Razieh Rastgoo; Kourosh Kiani; Sergio Escalera
  Title Real-time Isolated Hand Sign Language Recognition Using Deep Networks and SVD Type Journal
  Year 2022 Publication Journal of Ambient Intelligence and Humanized Computing Abbreviated Journal  
  Volume 13 Issue Pages 591–611  
  Keywords  
  Abstract One of the challenges in computer vision models, especially sign language, is real-time recognition. In this work, we present a simple, low-complexity, and efficient model, comprising a single shot detector, a 2D convolutional neural network, singular value decomposition (SVD), and long short-term memory, for real-time isolated hand sign language recognition (IHSLR) from RGB video. We employ the SVD method as an efficient, compact, and discriminative feature extractor from the estimated 3D hand keypoint coordinates. Unlike previous works that employ the estimated 3D hand keypoint coordinates as raw features, we propose a novel way to apply the SVD to the estimated 3D hand keypoint coordinates to get more discriminative features. The SVD method is also applied to the geometric relations between the consecutive segments of each finger in each hand and also the angles between these segments. We perform a detailed analysis of recognition time and accuracy. One of our contributions is that this is the first time that the SVD method is applied to the hand pose parameters. Results on four datasets, RKS-PERSIANSIGN (99.5±0.04), First-Person (91±0.06), ASVID (93±0.05), and isoGD (86.1±0.04), confirm the efficiency of our method in both accuracy (mean±std) and recognition time. Furthermore, our model outperforms or gets competitive results with the state-of-the-art alternatives in IHSLR and hand action recognition.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume (up) Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HUPBA; no proj Approved no  
  Call Number Admin @ si @ RKE2022a Serial 3660  
Permanent link to this record
 

 
Author Ajian Liu; Zichang Tan; Jun Wan; Sergio Escalera; Guodong Guo; Stan Z. Li
  Title CASIA-SURF CeFA: A Benchmark for Multi-modal Cross-Ethnicity Face Anti-Spoofing Type Conference Article
  Year 2021 Publication IEEE Winter Conference on Applications of Computer Vision Abbreviated Journal  
  Volume Issue Pages 1178-1186  
  Keywords  
  Abstract The issue of ethnic bias has proven to affect the performance of face recognition in previous works, while it remains unexplored in face anti-spoofing. Therefore, in order to study the ethnic bias for face anti-spoofing, we introduce the largest CASIA-SURF Cross-ethnicity Face Anti-spoofing (CeFA) dataset, covering 3 ethnicities, 3 modalities, 1,607 subjects, and 2D plus 3D attack types. Five protocols are introduced to measure the effect under varied evaluation conditions, such as cross-ethnicity, unknown spoofs or both of them. To our knowledge, CASIA-SURF CeFA is the first dataset including explicit ethnic labels among currently released datasets. Then, we propose a novel multi-modal fusion method as a strong baseline to alleviate the ethnic bias, which employs a partially shared fusion strategy to learn complementary information from multiple modalities. Extensive experiments have been conducted on the proposed dataset to verify its significance and generalization capability for other existing datasets, i.e., CASIA-SURF, OULU-NPU and SiW datasets. The dataset is available at https://sites.google.com/qq.com/face-anti-spoofing/welcome/challengecvpr2020?authuser=0.  
  Address Virtual; January 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume (up) Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WACV  
  Notes HUPBA; no proj Approved no  
  Call Number Admin @ si @ LTW2021 Serial 3661  
Permanent link to this record
 

 
Author Ali Furkan Biten; Lluis Gomez; Dimosthenis Karatzas
  Title Let there be a clock on the beach: Reducing Object Hallucination in Image Captioning Type Conference Article
  Year 2022 Publication Winter Conference on Applications of Computer Vision Abbreviated Journal  
  Volume Issue Pages 1381-1390  
  Keywords Measurement; Training; Visualization; Analytical models; Computer vision; Computational modeling; Training data  
  Abstract Explaining an image with missing or non-existent objects is known as object bias (hallucination) in image captioning. This behaviour is quite common in state-of-the-art captioning models and is undesirable to humans. To decrease object hallucination in captioning, we propose three simple yet efficient training augmentation methods for sentences, which require no new training data or increase in the model size. By extensive analysis, we show that the proposed methods can significantly diminish our models’ object bias on hallucination metrics. Moreover, we experimentally demonstrate that our methods decrease the dependency on the visual features. All of our code, configuration files and model weights are available online.
 
  Address Virtual; Waikoloa; Hawaii; USA; January 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume (up) Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WACV  
  Notes DAG; 600.155; 302.105 Approved no  
  Call Number Admin @ si @ BGK2022 Serial 3662  
Permanent link to this record
 

 
Author Ali Furkan Biten; Andres Mafla; Lluis Gomez; Dimosthenis Karatzas
  Title Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching Type Conference Article
  Year 2022 Publication Winter Conference on Applications of Computer Vision Abbreviated Journal  
  Volume Issue Pages 1391-1400  
  Keywords Measurement; Training; Integrated circuits; Annotations; Semantics; Training data; Semisupervised learning  
  Abstract The task of image-text matching aims to map representations from different modalities into a common joint visual-textual embedding. However, the most widely used datasets for this task, MSCOCO and Flickr30K, are actually image captioning datasets that offer a very limited set of relationships between images and sentences in their ground-truth annotations. This limited ground truth information forces us to use evaluation metrics based on binary relevance: given a sentence query we consider only one image as relevant. However, many other relevant images or captions may be present in the dataset. In this work, we propose two metrics that evaluate the degree of semantic relevance of retrieved items, independently of their annotated binary relevance. Additionally, we incorporate a novel strategy that uses an image captioning metric, CIDEr, to define a Semantic Adaptive Margin (SAM) to be optimized in a standard triplet loss. By incorporating our formulation to existing models, a large improvement is obtained in scenarios where available training data is limited. We also demonstrate that the performance on the annotated image-caption pairs is maintained while improving on other non-annotated relevant items when employing the full training set. The code for our new metric can be found at github.com/furkanbiten/ncsmetric and the model implementation at github.com/andrespmd/semanticadaptive_margin.  
  Address Virtual; Waikoloa; Hawaii; USA; January 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume (up) Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WACV  
  Notes DAG; 600.155; 302.105 Approved no  
  Call Number Admin @ si @ BMG2022 Serial 3663  
Permanent link to this record
 

 
Author Diego Velazquez; Josep M. Gonfaus; Pau Rodriguez; Xavier Roca; Seiichi Ozawa; Jordi Gonzalez
  Title Logo Detection With No Priors Type Journal Article
  Year 2021 Publication IEEE Access Abbreviated Journal ACCESS  
  Volume 9 Issue Pages 106998-107011  
  Keywords  
  Abstract In recent years, top referred methods on object detection like R-CNN have implemented this task as a combination of proposal region generation and supervised classification on the proposed bounding boxes. Although this pipeline has achieved state-of-the-art results in multiple datasets, it has inherent limitations that make object detection a very complex and inefficient task in computational terms. Instead of considering this standard strategy, in this paper we enhance Detection Transformers (DETR), which tackle object detection as a set-prediction problem directly in an end-to-end fully differentiable pipeline without requiring priors. In particular, we incorporate Feature Pyramids (FP) into the DETR architecture and demonstrate the effectiveness of the resulting DETR-FP approach in improving logo detection results thanks to the improved detection of small logos. Thus, without requiring any domain-specific prior to be fed to the model, DETR-FP obtains competitive results on the OpenLogo and MS-COCO datasets, offering a relative improvement of up to 30% when compared to a Faster R-CNN baseline, which strongly depends on hand-designed priors.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume (up) Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ISE Approved no  
  Call Number Admin @ si @ VGR2021 Serial 3664  
Permanent link to this record
 

 
Author Diana Ramirez Cifuentes; Ana Freire; Ricardo Baeza Yates; Nadia Sanz Lamora; Aida Alvarez; Alexandre Gonzalez; Meritxell Lozano; Roger Llobet; Diego Velazquez; Josep M. Gonfaus; Jordi Gonzalez
  Title Characterization of Anorexia Nervosa on Social Media: Textual, Visual, Relational, Behavioral, and Demographical Analysis Type Journal Article
  Year 2021 Publication Journal of Medical Internet Research Abbreviated Journal JMIR  
  Volume 23 Issue 7 Pages e25925  
  Keywords  
  Abstract Background: Eating disorders are psychological conditions characterized by unhealthy eating habits. Anorexia nervosa (AN) is defined as the belief of being overweight despite being dangerously underweight. The psychological signs involve emotional and behavioral issues. There is evidence that signs and symptoms can manifest on social media, wherein both harmful and beneficial content is shared daily.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume (up) Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ISE Approved no  
  Call Number Admin @ si @ RFB2021 Serial 3665  
Permanent link to this record
 

 
Author Michael Teutsch; Angel Sappa; Riad I. Hammoud
  Title Computer Vision in the Infrared Spectrum: Challenges and Approaches Type Book Whole
  Year 2021 Publication Synthesis Lectures on Computer Vision Abbreviated Journal  
  Volume 10 Issue 2 Pages 1-138  
  Keywords  
  Abstract Human visual perception is limited to the visual-optical spectrum. Machine vision is not. Cameras sensitive to the different infrared spectra can enhance the abilities of autonomous systems and visually perceive the environment in a holistic way. Relevant scene content can be made visible especially in situations, where sensors of other modalities face issues like a visual-optical camera that needs a source of illumination. As a consequence, not only human mistakes can be avoided by increasing the level of automation, but also machine-induced errors can be reduced that, for example, could make a self-driving car crash into a pedestrian under difficult illumination conditions. Furthermore, multi-spectral sensor systems with infrared imagery as one modality are a rich source of information and can provably increase the robustness of many autonomous systems. Applications that can benefit from utilizing infrared imagery range from robotics to automotive and from biometrics to surveillance. In this book, we provide a brief yet concise introduction to the current state-of-the-art of computer vision and machine learning in the infrared spectrum. Based on various popular computer vision tasks such as image enhancement, object detection, or object tracking, we first motivate each task starting from established literature in the visual-optical spectrum. Then, we discuss the differences between processing images and videos in the visual-optical spectrum and the various infrared spectra. An overview of the current literature is provided together with an outlook for each task. Furthermore, available and annotated public datasets and common evaluation methods and metrics are presented. In a separate chapter, popular applications that can greatly benefit from the use of infrared imagery as a data source are presented and discussed. Among them are automatic target recognition, video surveillance, or biometrics including face recognition. 
Finally, we conclude with recommendations for well-fitting sensor setups and data processing algorithms for certain computer vision tasks. We address this book to prospective researchers and engineers new to the field but also to anyone who wants an introduction to the challenges and the approaches of computer vision using infrared images or videos. Readers will be able to start their work directly after reading the book, supported by a highly comprehensive backlog of recent and relevant literature as well as related infrared datasets including existing evaluation frameworks. Together with consistently decreasing costs for infrared cameras, new fields of application appear and make computer vision in the infrared spectrum a great opportunity to address today's scientific and engineering challenges.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume (up) Series Issue Edition  
  ISSN ISBN 978-1636392431 Medium  
  Area Expedition Conference  
  Notes MSIAU Approved no  
  Call Number Admin @ si @ TSH2021 Serial 3666  
Permanent link to this record
 

 
Author Henry Velesaca; Patricia Suarez; Dario Carpio; Angel Sappa
  Title Synthesized Image Datasets: Towards an Annotation-Free Instance Segmentation Strategy Type Conference Article
  Year 2021 Publication 16th International Symposium on Visual Computing Abbreviated Journal  
  Volume 13017 Issue Pages 131–143  
  Keywords  
  Abstract This paper presents a complete pipeline to perform deep learning-based instance segmentation of different types of grains (e.g., corn, sunflower, soybeans, lentils, chickpeas, mote, and beans). The proposed approach consists of using synthesized image datasets for the training process, which are easily generated according to the category of the instance to be segmented. The synthesized imaging process allows generating a large set of well-annotated grain samples with high variability—as large and high as the user requires. Instance segmentation is performed through a popular deep learning based approach, the Mask R-CNN architecture, but any learning-based instance segmentation approach can be considered. Results obtained by the proposed pipeline show that the strategy of using synthesized image datasets for training instance segmentation helps to avoid the time-consuming image annotation stage, as well as to achieve higher intersection over union and average precision performances. Results obtained with different varieties of grains are shown, as well as comparisons with manually annotated images, showing both the simplicity of the process and the improvements in the performance.  
  Address Virtual; October 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume (up) Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ISVC  
  Notes MSIAU Approved no  
  Call Number Admin @ si @ VSC2021 Serial 3667  
Permanent link to this record
 

 
Author Patricia Suarez; Dario Carpio; Angel Sappa
  Title Non-homogeneous Haze Removal Through a Multiple Attention Module Architecture Type Conference Article
  Year 2021 Publication 16th International Symposium on Visual Computing Abbreviated Journal  
  Volume 13018 Issue Pages 178–190  
  Keywords  
  Abstract This paper presents a novel attention-based architecture to remove non-homogeneous haze. The proposed model is focused on obtaining the most representative characteristics of the image, at each learning cycle, by means of adaptive attention modules coupled with a residual learning convolutional network. The latter is based on the Res2Net model. The proposed architecture is trained with just a small set of images. Its performance is evaluated on a public benchmark (images from the non-homogeneous haze NTIRE 2021 challenge) and compared with state-of-the-art approaches, achieving the best result.  
  Address Virtual; October 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume (up) Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ISVC  
  Notes MSIAU Approved no  
  Call Number Admin @ si @ SCS2021 Serial 3668  
Permanent link to this record
 

 
Author F. Negin; Pau Rodriguez; M. Koperski; A. Kerboua; Jordi Gonzalez; J. Bourgeois; E. Chapoulie; P. Robert; F. Bremond
  Title PRAXIS: Towards automatic cognitive assessment using gesture recognition Type Journal Article
  Year 2018 Publication Expert Systems with Applications Abbreviated Journal ESWA  
  Volume 106 Issue Pages 21-35  
  Keywords  
  Abstract The Praxis test is a gesture-based diagnostic test which has been accepted as diagnostically indicative of cortical pathologies such as Alzheimer’s disease. Despite being simple, this test is oftentimes skipped by clinicians. In this paper, we propose a novel framework to investigate the potential of static and dynamic upper-body gestures based on the Praxis test and their potential in a medical framework to automate the test procedures for computer-assisted cognitive assessment of older adults.

In order to carry out gesture recognition as well as correctness assessment of the performances, we have collected a novel challenging RGB-D gesture video dataset recorded by Kinect v2, which contains 29 specific gestures suggested by clinicians and recorded from both experts and patients performing the gesture set. Moreover, we propose a framework to learn the dynamics of upper-body gestures, considering the videos as sequences of short-term clips of gestures. Our approach first uses body part detection to extract image patches surrounding the hands and then, by means of a fine-tuned convolutional neural network (CNN) model, it learns deep hand features which are then linked to a long short-term memory to capture the temporal dependencies between video frames.
We report the results of four developed methods using different modalities. The experiments show the effectiveness of our deep learning based approach in gesture recognition and performance assessment tasks. The clinicians' satisfaction with the assessment reports indicates the impact of the framework on the diagnosis.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume (up) Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ISE Approved no  
  Call Number Admin @ si @ NRK2018 Serial 3669  
Permanent link to this record
 

 
Author O.F. Ahmad; Y. Mori; M. Misawa; S. Kudo; J.T. Anderson; Jorge Bernal
  Title Establishing key research questions for the implementation of artificial intelligence in colonoscopy: a modified Delphi method Type Journal Article
  Year 2021 Publication Endoscopy Abbreviated Journal END  
  Volume 53 Issue 9 Pages 893-901  
  Keywords  
  Abstract BACKGROUND: Artificial intelligence (AI) research in colonoscopy is progressing rapidly but widespread clinical implementation is not yet a reality. We aimed to identify the top implementation research priorities. METHODS: An established modified Delphi approach for research priority setting was used. Fifteen international experts, including endoscopists and translational computer scientists/engineers, from nine countries participated in an online survey over 9 months. Questions related to AI implementation in colonoscopy were generated as a long-list in the first round, and then scored in two subsequent rounds to identify the top 10 research questions. RESULTS: The top 10 ranked questions were categorized into five themes. Theme 1: clinical trial design/end points (4 questions), related to optimum trial designs for polyp detection and characterization, determining the optimal end points for evaluation of AI, and demonstrating impact on interval cancer rates. Theme 2: technological developments (3 questions), including improving detection of more challenging and advanced lesions, reduction of false-positive rates, and minimizing latency. Theme 3: clinical adoption/integration (1 question), concerning the effective combination of detection and characterization into one workflow. Theme 4: data access/annotation (1 question), concerning more efficient or automated data annotation methods to reduce the burden on human experts. Theme 5: regulatory approval (1 question), related to making regulatory approval processes more efficient. CONCLUSIONS: This is the first reported international research priority setting exercise for AI in colonoscopy. The study findings should be used as a framework to guide future research with key stakeholders to accelerate the clinical implementation of AI in endoscopy.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume (up) Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ISE Approved no  
  Call Number Admin @ si @ AMM2021 Serial 3670  
Permanent link to this record
 

 
Author Yasuko Sugito; Trevor Canham; Javier Vazquez; Marcelo Bertalmio
  Title A Study of Objective Quality Metrics for HLG-Based HDR/WCG Image Coding Type Journal
  Year 2021 Publication SMPTE Motion Imaging Journal Abbreviated Journal SMPTE  
  Volume 130 Issue 4 Pages 53 - 65  
  Keywords  
  Abstract In this work, we study the suitability of high dynamic range, wide color gamut (HDR/WCG) objective quality metrics to assess the perceived deterioration of compressed images encoded using the hybrid log-gamma (HLG) method, which is the standard for HDR television. Several image quality metrics have been developed to deal specifically with HDR content, although in previous work we showed that the best results (i.e., better matches to the opinion of human expert observers) are obtained by an HDR metric that consists simply in applying a given standard dynamic range metric, called visual information fidelity (VIF), directly to HLG-encoded images. However, all these HDR metrics ignore the chroma components for their calculations, that is, they consider only the luminance channel. For this reason, in the current work, we conduct subjective evaluation experiments in a professional setting using compressed HDR/WCG images encoded with HLG and analyze the ability of the best HDR metric to detect perceivable distortions in the chroma components, as well as the suitability of popular color metrics (including ΔITPR, which supports parameters for HLG) to correlate with the opinion scores. Our first contribution is to show that there is a need to consider the chroma components in HDR metrics, as there are color distortions that subjects perceive but that the best HDR metric fails to detect. Our second contribution is the surprising result that VIF, which utilizes only the luminance channel, correlates much better with the subjective evaluation scores than the metrics investigated that do consider the color components.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume (up) Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes CIC Approved no  
  Call Number SCV2021 Serial 3671  
Permanent link to this record
 

 
Author Javad Zolfaghari Bengar; Joost Van de Weijer; Bartlomiej Twardowski; Bogdan Raducanu
  Title Reducing Label Effort: Self-Supervised Meets Active Learning Type Conference Article
  Year 2021 Publication International Conference on Computer Vision Workshops Abbreviated Journal  
  Volume Issue Pages 1631-1639  
  Keywords  
  Abstract Active learning is a paradigm aimed at reducing the annotation effort by training the model on actively selected informative and/or representative samples. Another paradigm to reduce the annotation effort is self-training, which learns from a large amount of unlabeled data in an unsupervised way and fine-tunes on a few labeled samples. Recent developments in self-training have achieved very impressive results rivaling supervised learning on some datasets. The current work focuses on whether the two paradigms can benefit from each other. We studied object recognition datasets including CIFAR10, CIFAR100 and Tiny ImageNet with several labeling budgets for the evaluations. Our experiments reveal that self-training is remarkably more efficient than active learning at reducing the labeling effort, that for a low labeling budget, active learning offers no benefit to self-training, and finally that the combination of active learning and self-training is fruitful when the labeling budget is high. The performance gap between active learning trained either with self-training or from scratch diminishes as we approach the point where almost half of the dataset is labeled.  
  Address October 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume (up) Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICCVW  
  Notes LAMP; OR Approved no  
  Call Number Admin @ si @ ZVT2021 Serial 3672  
Permanent link to this record