|   | 
Details
   web
Records
Author (down) Zhijie Fang; David Vazquez; Antonio Lopez
Title On-Board Detection of Pedestrian Intentions Type Journal Article
Year 2017 Publication Sensors Abbreviated Journal SENS
Volume 17 Issue 10 Pages 2193
Keywords pedestrian intention; ADAS; self-driving
Abstract Avoiding vehicle-to-pedestrian crashes is a critical requirement for nowadays advanced driver assistant systems (ADAS) and future self-driving vehicles. Accordingly, detecting pedestrians from raw sensor data has a history of more than 15 years of research, with vision playing a central role.
During the last years, deep learning has boosted the accuracy of image-based pedestrian detectors.
However, detection is just the first step towards answering the core question, namely is the vehicle going to crash with a pedestrian provided preventive actions are not taken? Therefore, knowing as soon as possible if a detected pedestrian has the intention of crossing the road ahead of the vehicle is
essential for performing safe and comfortable maneuvers that prevent a crash. However, compared to pedestrian detection, there is relatively little literature on detecting pedestrian intentions. This paper aims to contribute along this line by presenting a new vision-based approach which analyzes the
pose of a pedestrian along several frames to determine if he or she is going to enter the road or not. We present experiments showing 750 ms of anticipation for pedestrians crossing the road, which at a typical urban driving speed of 50 km/h can provide 15 additional meters (compared to a pure pedestrian detector) for vehicle automatic reactions or to warn the driver. Moreover, in contrast with state-of-the-art methods, our approach is monocular, neither requiring stereo nor optical flow information.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS; 600.085; 600.076; 601.223; 600.116; 600.118 Approved no
Call Number Admin @ si @ FVL2017 Serial 2983
Permanent link to this record
 

 
Author (down) Zhijie Fang; Antonio Lopez
Title Intention Recognition of Pedestrians and Cyclists by 2D Pose Estimation Type Journal Article
Year 2019 Publication IEEE Transactions on Intelligent Transportation Systems Abbreviated Journal TITS
Volume 21 Issue 11 Pages 4773 - 4783
Keywords
Abstract Anticipating the intentions of vulnerable road users (VRUs) such as pedestrians and cyclists is critical for performing safe and comfortable driving maneuvers. This is the case for human driving and, thus, should be taken into account by systems providing any level of driving assistance, from advanced driver assistant systems (ADAS) to fully autonomous vehicles (AVs). In this paper, we show how the latest advances on monocular vision-based human pose estimation, i.e. those relying on deep Convolutional Neural Networks (CNNs), enable to recognize the intentions of such VRUs. In the case of cyclists, we assume that they follow traffic rules to indicate future maneuvers with arm signals. In the case of pedestrians, no indications can be assumed. Instead, we hypothesize that the walking pattern of a pedestrian allows to determine if he/she has the intention of crossing the road in the path of the ego-vehicle, so that the ego-vehicle must maneuver accordingly (e.g. slowing down or stopping). In this paper, we show how the same methodology can be used for recognizing pedestrians and cyclists' intentions. For pedestrians, we perform experiments on the JAAD dataset. For cyclists, we did not found an analogous dataset, thus, we created our own one by acquiring and annotating videos which we share with the research community. Overall, the proposed pipeline provides new state-of-the-art results on the intention recognition of VRUs.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS; 600.118 Approved no
Call Number Admin @ si @ FaL2019 Serial 3305
Permanent link to this record
 

 
Author (down) Zhengying Liu; Zhen Xu; Sergio Escalera; Isabelle Guyon; Julio C. S. Jacques Junior; Meysam Madadi; Adrien Pavao; Sebastien Treguer; Wei-Wei Tu
Title Towards automated computer vision: analysis of the AutoCV challenges 2019 Type Journal Article
Year 2020 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 135 Issue Pages 196-203
Keywords Computer vision; AutoML; Deep learning
Abstract We present the results of recent challenges in Automated Computer Vision (AutoCV, renamed here for clarity AutoCV1 and AutoCV2, 2019), which are part of a series of challenge on Automated Deep Learning (AutoDL). These two competitions aim at searching for fully automated solutions for classification tasks in computer vision, with an emphasis on any-time performance. The first competition was limited to image classification while the second one included both images and videos. Our design imposed to the participants to submit their code on a challenge platform for blind testing on five datasets, both for training and testing, without any human intervention whatsoever. Winning solutions adopted deep learning techniques based on already published architectures, such as AutoAugment, MobileNet and ResNet, to reach state-of-the-art performance in the time budget of the challenge (only 20 minutes of GPU time). The novel contributions include strategies to deliver good preliminary results at any time during the learning process, such that a method can be stopped early and still deliver good performance. This feature is key for the adoption of such techniques by data analysts desiring to obtain rapidly preliminary results on large datasets and to speed up the development process. The soundness of our design was verified in several aspects: (1) Little overfitting of the on-line leaderboard providing feedback on 5 development datasets was observed, compared to the final blind testing on the 5 (separate) final test datasets, suggesting that winning solutions might generalize to other computer vision classification tasks; (2) Error bars on the winners’ performance allow us to say with confident that they performed significantly better than the baseline solutions we provided; (3) The ranking of participants according to the any-time metric we designed, namely the Area under the Learning Curve, was different from that of the fixed-time metric, i.e. AUC at the end of the fixed time budget. We released all winning solutions under open-source licenses. At the end of the AutoDL challenge series, all data of the challenge will be made publicly available, thus providing a collection of uniformly formatted datasets, which can serve to conduct further research, particularly on meta-learning.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA; no proj Approved no
Call Number Admin @ si @ LXE2020 Serial 3427
Permanent link to this record
 

 
Author (down) Zhengying Liu; Adrien Pavao; Zhen Xu; Sergio Escalera; Fabio Ferreira; Isabelle Guyon; Sirui Hong; Frank Hutter; Rongrong Ji; Julio C. S. Jacques Junior; Ge Li; Marius Lindauer; Zhipeng Luo; Meysam Madadi; Thomas Nierhoff; Kangning Niu; Chunguang Pan; Danny Stoll; Sebastien Treguer; Jin Wang; Peng Wang; Chenglin Wu; Youcheng Xiong; Arber Zela; Yang Zhang
Title Winning Solutions and Post-Challenge Analyses of the ChaLearn AutoDL Challenge 2019 Type Journal Article
Year 2021 Publication IEEE Transactions on Pattern Analysis and Machine Intelligence Abbreviated Journal TPAMI
Volume 43 Issue 9 Pages 3108 - 3125
Keywords
Abstract This paper reports the results and post-challenge analyses of ChaLearn's AutoDL challenge series, which helped sorting out a profusion of AutoML solutions for Deep Learning (DL) that had been introduced in a variety of settings, but lacked fair comparisons. All input data modalities (time series, images, videos, text, tabular) were formatted as tensors and all tasks were multi-label classification problems. Code submissions were executed on hidden tasks, with limited time and computational resources, pushing solutions that get results quickly. In this setting, DL methods dominated, though popular Neural Architecture Search (NAS) was impractical. Solutions relied on fine-tuned pre-trained networks, with architectures matching data modality. Post-challenge tests did not reveal improvements beyond the imposed time limit. While no component is particularly original or novel, a high level modular organization emerged featuring a “meta-learner”, “data ingestor”, “model selector”, “model/learner”, and “evaluator”. This modularity enabled ablation studies, which revealed the importance of (off-platform) meta-learning, ensembling, and efficient data management. Experiments on heterogeneous module combinations further confirm the (local) optimality of the winning solutions. Our challenge legacy includes an ever-lasting benchmark (http://autodl.chalearn.org), the open-sourced code of the winners, and a free “AutoDL self-service.”
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HUPBA; no proj Approved no
Call Number Admin @ si @ LPX2021 Serial 3587
Permanent link to this record
 

 
Author (down) Zhen Xu; Sergio Escalera; Adrien Pavao; Magali Richard; Wei-Wei Tu; Quanming Yao; Huan Zhao; Isabelle Guyon
Title Codabench: Flexible, easy-to-use, and reproducible meta-benchmark platform Type Journal Article
Year 2022 Publication Patterns Abbreviated Journal PATTERNS
Volume 3 Issue 7 Pages 100543
Keywords Machine learning; data science; benchmark platform; reproducibility; competitions
Abstract Obtaining a standardized benchmark of computational methods is a major issue in data-science communities. Dedicated frameworks enabling fair benchmarking in a unified environment are yet to be developed. Here, we introduce Codabench, a meta-benchmark platform that is open sourced and community driven for benchmarking algorithms or software agents versus datasets or tasks. A public instance of Codabench is open to everyone free of charge and allows benchmark organizers to fairly compare submissions under the same setting (software, hardware, data, algorithms), with custom protocols and data formats. Codabench has unique features facilitating easy organization of flexible and reproducible benchmarks, such as the possibility of reusing templates of benchmarks and supplying compute resources on demand. Codabench has been used internally and externally on various applications, receiving more than 130 users and 2,500 submissions. As illustrative use cases, we introduce four diverse benchmarks covering graph machine learning, cancer heterogeneity, clinical diagnosis, and reinforcement learning.
Address June 24, 2022
Corporate Author Thesis
Publisher Science Direct Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA Approved no
Call Number Admin @ si @ XEP2022 Serial 3764
Permanent link to this record
 

 
Author (down) Zahra Raisi-Estabragh; Carlos Martin-Isla; Louise Nissen; Liliana Szabo; Victor M. Campello; Sergio Escalera; Simon Winther; Morten Bottcher; Karim Lekadir; and Steffen E. Petersen
Title Radiomics analysis enhances the diagnostic performance of CMR stress perfusion: a proof-of-concept study using the Dan-NICAD dataset Type Journal Article
Year 2023 Publication Frontiers in Cardiovascular Medicine Abbreviated Journal FCM
Volume Issue Pages
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HUPBA Approved no
Call Number Admin @ si @ RMN2023 Serial 3937
Permanent link to this record
 

 
Author (down) Yunchao Gong; Svetlana Lazebnik; Albert Gordo; Florent Perronnin
Title Iterative quantization: A procrustean approach to learning binary codes for Large-Scale Image Retrieval Type Journal Article
Year 2012 Publication IEEE Transactions on Pattern Analysis and Machine Intelligence Abbreviated Journal TPAMI
Volume 35 Issue 12 Pages 2916-2929
Keywords
Abstract This paper addresses the problem of learning similarity-preserving binary codes for efficient similarity search in large-scale image collections. We formulate this problem in terms of finding a rotation of zero-centered data so as to minimize the quantization error of mapping this data to the vertices of a zero-centered binary hypercube, and propose a simple and efficient alternating minimization algorithm to accomplish this task. This algorithm, dubbed iterative quantization (ITQ), has connections to multi-class spectral clustering and to the orthogonal Procrustes problem, and it can be used both with unsupervised data embeddings such as PCA and supervised embeddings such as canonical correlation analysis (CCA). The resulting binary codes significantly outperform several other state-of-the-art methods. We also show that further performance improvements can result from transforming the data with a nonlinear kernel mapping prior to PCA or CCA. Finally, we demonstrate an application of ITQ to learning binary attributes or “classemes” on the ImageNet dataset.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 0162-8828 ISBN 978-1-4577-0394-2 Medium
Area Expedition Conference
Notes DAG Approved no
Call Number Admin @ si @ GLG 2012b Serial 2008
Permanent link to this record
 

 
Author (down) Yunan Li; Jun Wan; Qiguang Miao; Sergio Escalera; Huijuan Fang; Huizhou Chen; Xiangda Qi; Guodong Guo
Title CR-Net: A Deep Classification-Regression Network for Multimodal Apparent Personality Analysis Type Journal Article
Year 2020 Publication International Journal of Computer Vision Abbreviated Journal IJCV
Volume 128 Issue Pages 2763–2780
Keywords
Abstract First impressions strongly influence social interactions, having a high impact in the personal and professional life. In this paper, we present a deep Classification-Regression Network (CR-Net) for analyzing the Big Five personality problem and further assisting on job interview recommendation in a first impressions setup. The setup is based on the ChaLearn First Impressions dataset, including multimodal data with video, audio, and text converted from the corresponding audio data, where each person is talking in front of a camera. In order to give a comprehensive prediction, we analyze the videos from both the entire scene (including the person’s motions and background) and the face of the person. Our CR-Net first performs personality trait classification and applies a regression later, which can obtain accurate predictions for both personality traits and interview recommendation. Furthermore, we present a new loss function called Bell Loss to address inaccurate predictions caused by the regression-to-the-mean problem. Extensive experiments on the First Impressions dataset show the effectiveness of our proposed network, outperforming the state-of-the-art.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA; no menciona Approved no
Call Number Admin @ si @ LWM2020 Serial 3413
Permanent link to this record
 

 
Author (down) Yuhua Luo; Francisco Jose Perales; Juan J. Villanueva
Title An automatic Rotoscopy System for Human Motion Based on a Biomedical Graphical Model. Type Journal Article
Year 1992 Publication Computer & Graphics Abbreviated Journal
Volume 16 Issue 4 Pages 355-362
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes Approved no
Call Number ISE @ ise @ LPV1992 Serial 249
Permanent link to this record
 

 
Author (down) Yi Xiao; Felipe Codevilla; Akhil Gurram; Onay Urfalioglu; Antonio Lopez
Title Multimodal end-to-end autonomous driving Type Journal Article
Year 2020 Publication IEEE Transactions on Intelligent Transportation Systems Abbreviated Journal TITS
Volume Issue Pages 1-11
Keywords
Abstract A crucial component of an autonomous vehicle (AV) is the artificial intelligence (AI) is able to drive towards a desired destination. Today, there are different paradigms addressing the development of AI drivers. On the one hand, we find modular pipelines, which divide the driving task into sub-tasks such as perception and maneuver planning and control. On the other hand, we find end-to-end driving approaches that try to learn a direct mapping from input raw sensor data to vehicle control signals. The later are relatively less studied, but are gaining popularity since they are less demanding in terms of sensor data annotation. This paper focuses on end-to-end autonomous driving. So far, most proposals relying on this paradigm assume RGB images as input sensor data. However, AVs will not be equipped only with cameras, but also with active sensors providing accurate depth information (e.g., LiDARs). Accordingly, this paper analyses whether combining RGB and depth modalities, i.e. using RGBD data, produces better end-to-end AI drivers than relying on a single modality. We consider multimodality based on early, mid and late fusion schemes, both in multisensory and single-sensor (monocular depth estimation) settings. Using the CARLA simulator and conditional imitation learning (CIL), we show how, indeed, early fusion multimodality outperforms single-modality.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS Approved no
Call Number Admin @ si @ XCG2020 Serial 3490
Permanent link to this record
 

 
Author (down) Yecong Wan; Yuanshuo Cheng; Miingwen Shao; Jordi Gonzalez
Title Image rain removal and illumination enhancement done in one go Type Journal Article
Year 2022 Publication Knowledge-Based Systems Abbreviated Journal KBS
Volume 252 Issue Pages 109244
Keywords
Abstract Rain removal plays an important role in the restoration of degraded images. Recently, CNN-based methods have achieved remarkable success. However, these approaches neglect that the appearance of real-world rain is often accompanied by low light conditions, which will further degrade the image quality, thereby hindering the restoration mission. Therefore, it is very indispensable to jointly remove the rain and enhance illumination for real-world rain image restoration. To this end, we proposed a novel spatially-adaptive network, dubbed SANet, which can remove the rain and enhance illumination in one go with the guidance of degradation mask. Meanwhile, to fully utilize negative samples, a contrastive loss is proposed to preserve more natural textures and consistent illumination. In addition, we present a new synthetic dataset, named DarkRain, to boost the development of rain image restoration algorithms in practical scenarios. DarkRain not only contains different degrees of rain, but also considers different lighting conditions, and more realistically simulates real-world rainfall scenarios. SANet is extensively evaluated on the proposed dataset and attains new state-of-the-art performance against other combining methods. Moreover, after a simple transformation, our SANet surpasses existing the state-of-the-art algorithms in both rain removal and low-light image enhancement.
Address Sept 2022
Corporate Author Thesis
Publisher Elsevier Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ISE; 600.157; 600.168 Approved no
Call Number Admin @ si @ WCS2022 Serial 3744
Permanent link to this record
 

 
Author (down) Yaxing Wang; Luis Herranz; Joost Van de Weijer
Title Mix and match networks: multi-domain alignment for unpaired image-to-image translation Type Journal Article
Year 2020 Publication International Journal of Computer Vision Abbreviated Journal IJCV
Volume 128 Issue Pages 2849–2872
Keywords
Abstract This paper addresses the problem of inferring unseen cross-modal image-to-image translations between multiple modalities. We assume that only some of the pairwise translations have been seen (i.e. trained) and infer the remaining unseen translations (where training pairs are not available). We propose mix and match networks, an approach where multiple encoders and decoders are aligned in such a way that the desired translation can be obtained by simply cascading the source encoder and the target decoder, even when they have not interacted during the training stage (i.e. unseen). The main challenge lies in the alignment of the latent representations at the bottlenecks of encoder-decoder pairs. We propose an architecture with several tools to encourage alignment, including autoencoders and robust side information and latent consistency losses. We show the benefits of our approach in terms of effectiveness and scalability compared with other pairwise image-to-image translation approaches. We also propose zero-pair cross-modal image translation, a challenging setting where the objective is inferring semantic segmentation from depth (and vice-versa) without explicit segmentation-depth pairs, and only from two (disjoint) segmentation-RGB and depth-RGB training sets. We observe that a certain part of the shared information between unseen modalities might not be reachable, so we further propose a variant that leverages pseudo-pairs which allows us to exploit this shared information between the unseen modalities
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP; 600.109; 600.106; 600.141; 600.120 Approved no
Call Number Admin @ si @ WHW2020 Serial 3424
Permanent link to this record
 

 
Author (down) Yaxing Wang; Abel Gonzalez-Garcia; Luis Herranz; Joost Van de Weijer
Title Controlling biases and diversity in diverse image-to-image translation Type Journal Article
Year 2021 Publication Computer Vision and Image Understanding Abbreviated Journal CVIU
Volume 202 Issue Pages 103082
Keywords
Abstract JCR 2019 Q2, IF=3.121
The task of unpaired image-to-image translation is highly challenging due to the lack of explicit cross-domain pairs of instances. We consider here diverse image translation (DIT), an even more challenging setting in which an image can have multiple plausible translations. This is normally achieved by explicitly disentangling content and style in the latent representation and sampling different styles codes while maintaining the image content. Despite the success of current DIT models, they are prone to suffer from bias. In this paper, we study the problem of bias in image-to-image translation. Biased datasets may add undesired changes (e.g. change gender or race in face images) to the output translations as a consequence of the particular underlying visual distribution in the target domain. In order to alleviate the effects of this problem we propose the use of semantic constraints that enforce the preservation of desired image properties. Our proposed model is a step towards unbiased diverse image-to-image translation (UDIT), and results in less unwanted changes in the translated images while still performing the wanted transformation. Experiments on several heavily biased datasets show the effectiveness of the proposed techniques in different domains such as faces, objects, and scenes.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP; 600.141; 600.109; 600.147 Approved no
Call Number Admin @ si @ WGH2021 Serial 3464
Permanent link to this record
 

 
Author (down) Yaxing Wang; Abel Gonzalez-Garcia; Chenshen Wu; Luis Herranz; Fahad Shahbaz Khan; Shangling Jui; Jian Yang; Joost Van de Weijer
Title MineGAN++: Mining Generative Models for Efficient Knowledge Transfer to Limited Data Domains Type Journal Article
Year 2024 Publication International Journal of Computer Vision Abbreviated Journal IJCV
Volume 132 Issue Pages 490–514
Keywords
Abstract Given the often enormous effort required to train GANs, both computationally as well as in dataset collection, the re-use of pretrained GANs largely increases the potential impact of generative models. Therefore, we propose a novel knowledge transfer method for generative models based on mining the knowledge that is most beneficial to a specific target domain, either from a single or multiple pretrained GANs. This is done using a miner network that identifies which part of the generative distribution of each pretrained GAN outputs samples closest to the target domain. Mining effectively steers GAN sampling towards suitable regions of the latent space, which facilitates the posterior finetuning and avoids pathologies of other methods, such as mode collapse and lack of flexibility. Furthermore, to prevent overfitting on small target domains, we introduce sparse subnetwork selection, that restricts the set of trainable neurons to those that are relevant for the target dataset. We perform comprehensive experiments on several challenging datasets using various GAN architectures (BigGAN, Progressive GAN, and StyleGAN) and show that the proposed method, called MineGAN, effectively transfers knowledge to domains with few target images, outperforming existing methods. In addition, MineGAN can successfully transfer knowledge from multiple pretrained GANs. MineGAN.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP; MACO Approved no
Call Number Admin @ si @ WGW2024 Serial 3888
Permanent link to this record
 

 
Author (down) Yasuko Sugito; Javier Vazquez; Trevor Canham; Marcelo Bertalmio
Title Image quality evaluation in professional HDR/WCG production questions the need for HDR metrics Type Journal Article
Year 2022 Publication IEEE Transactions on Image Processing Abbreviated Journal TIP
Volume 31 Issue Pages 5163 - 5177
Keywords Measurement; Image color analysis; Image coding; Production; Dynamic range; Brightness; Extraterrestrial measurements
Abstract In the quality evaluation of high dynamic range and wide color gamut (HDR/WCG) images, a number of works have concluded that native HDR metrics, such as HDR visual difference predictor (HDR-VDP), HDR video quality metric (HDR-VQM), or convolutional neural network (CNN)-based visibility metrics for HDR content, provide the best results. These metrics consider only the luminance component, but several color difference metrics have been specifically developed for, and validated with, HDR/WCG images. In this paper, we perform subjective evaluation experiments in a professional HDR/WCG production setting, under a real use case scenario. The results are quite relevant in that they show, firstly, that the performance of HDR metrics is worse than that of a classic, simple standard dynamic range (SDR) metric applied directly to the HDR content; and secondly, that the chrominance metrics specifically developed for HDR/WCG imaging have poor correlation with observer scores and are also outperformed by an SDR metric. Based on these findings, we show how a very simple framework for creating color HDR metrics, that uses only luminance SDR metrics, transfer functions, and classic color spaces, is able to consistently outperform, by a considerable margin, state-of-the-art HDR metrics on a varied set of HDR content, for both perceptual quantization (PQ) and Hybrid Log-Gamma (HLG) encoding, luminance and chroma distortions, and on different color spaces of common use.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes 600.161; 611.007 Approved no
Call Number Admin @ si @ SVG2022 Serial 3683
Permanent link to this record