Bojana Gajic, Eduard Vazquez, & Ramon Baldrich. (2017). Evaluation of Deep Image Descriptors for Texture Retrieval. In Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2017) (pp. 251–257).
Abstract: The increasing complexity learnt in the layers of a Convolutional Neural Network has proven to be of great help for the task of classification. The topic has received great attention in recently published literature.
Nonetheless, just a handful of works study low-level representations, commonly associated with lower layers. In this paper, we explore recent findings which conclude, counterintuitively, that the last layer of the VGG convolutional network is the best to describe a low-level property such as texture. To shed some light on this issue, we propose a psychophysical experiment to evaluate the adequacy of different layers of the VGG network for texture retrieval. Results obtained suggest that, whereas the last convolutional layer is a good choice for a specific classification task, it might not be the best choice as a texture descriptor, showing very poor performance on texture retrieval. Intermediate layers perform best, combining basic filters, as in the primary visual cortex, with a degree of higher-level information to describe more complex textures.
Keywords: Texture Representation; Texture Retrieval; Convolutional Neural Networks; Psychophysical Evaluation
|
Marcos V Conde, Javier Vazquez, Michael S Brown, & Radu Timofte. (2024). NILUT: Conditional Neural Implicit 3D Lookup Tables for Image Enhancement. In 38th AAAI Conference on Artificial Intelligence.
Abstract: 3D lookup tables (3D LUTs) are a key component for image enhancement. Modern image signal processors (ISPs) have dedicated support for these as part of the camera rendering pipeline. Cameras typically provide multiple options for picture styles, where each style is usually obtained by applying a unique handcrafted 3D LUT. Current approaches for learning and applying 3D LUTs are notably fast, yet not so memory-efficient, as storing multiple 3D LUTs is required. For this reason and other implementation limitations, their use on mobile devices is less popular. In this work, we propose a Neural Implicit LUT (NILUT), an implicitly defined continuous 3D color transformation parameterized by a neural network. We show that NILUTs are capable of accurately emulating real 3D LUTs. Moreover, a NILUT can be extended to incorporate multiple styles into a single network with the ability to blend styles implicitly. Our novel approach is memory-efficient, controllable and can complement previous methods, including learned ISPs.
|
Marcos V Conde, Florin Vasluianu, Javier Vazquez, & Radu Timofte. (2023). Perceptual image enhancement for smartphone real-time applications. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (pp. 1848–1858).
Abstract: Recent advances in camera designs and imaging pipelines allow us to capture high-quality images using smartphones. However, due to the small size and lens limitations of smartphone cameras, we commonly find artifacts or degradation in the processed images. The most common unpleasant effects are noise artifacts, diffraction artifacts, blur, and HDR overexposure. Deep learning methods for image restoration can successfully remove these artifacts. However, most approaches are not suitable for real-time applications on mobile devices due to their heavy computation and memory requirements. In this paper, we propose LPIENet, a lightweight network for perceptual image enhancement, with a focus on deploying it on smartphones. Our experiments show that, with far fewer parameters and operations, our model can deal with the mentioned artifacts and achieve competitive performance compared with state-of-the-art methods on standard benchmarks. Moreover, to prove the efficiency and reliability of our approach, we deployed the model directly on commercial smartphones and evaluated its performance. Our model can process 2K resolution images in under 1 second on mid-level commercial smartphones.
|
Danna Xue, Luis Herranz, Javier Vazquez, & Yanning Zhang. (2023). Burst Perception-Distortion Tradeoff: Analysis and Evaluation. In IEEE International Conference on Acoustics, Speech and Signal Processing.
Abstract: Burst image restoration attempts to effectively utilize the complementary cues appearing in sequential images to produce a high-quality image. Most current methods use all the available images to obtain the reconstructed image. However, using more images for burst restoration is not always the best option regarding reconstruction quality and efficiency, as the images acquired by handheld imaging devices suffer from degradation and misalignment caused by the camera noise and shake. In this paper, we extend the perception-distortion tradeoff theory by introducing multiple-frame information. We propose the area of the unattainable region as a new metric for perception-distortion tradeoff evaluation and comparison. Based on this metric, we analyse the performance of burst restoration from the perspective of the perception-distortion tradeoff under both aligned bursts and misaligned bursts situations. Our analysis reveals the importance of inter-frame alignment for burst restoration and shows that the optimal burst length for the restoration model depends both on the degree of degradation and misalignment.
|
Yawei Li, Yulun Zhang, Radu Timofte, Luc Van Gool, Zhijun Tu, Kunpeng Du, et al. (2023). NTIRE 2023 challenge on image denoising: Methods and results. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (pp. 1904–1920).
Abstract: This paper reviews the NTIRE 2023 challenge on image denoising (σ = 50) with a focus on the proposed solutions and results. The aim is to obtain a network design capable of producing high-quality results with the best performance measured by PSNR for image denoising. Independent additive white Gaussian noise (AWGN) is assumed and the noise level is 50. The challenge had 225 registered participants, and 16 teams made valid submissions. They gauge the state of the art for image denoising.
|
Justine Giroux, Mohammad Reza Karimi Dastjerdi, Yannick Hold-Geoffroy, Javier Vazquez, & Jean-François Lalonde. (2024). Towards a Perceptual Evaluation Framework for Lighting Estimation. arXiv preprint.
Abstract: Progress in lighting estimation is tracked by computing existing image quality assessment (IQA) metrics on images from standard datasets. While this may appear to be a reasonable approach, we demonstrate that doing so does not correlate to human preference when the estimated lighting is used to relight a virtual scene into a real photograph. To study this, we design a controlled psychophysical experiment where human observers must choose their preference amongst rendered scenes lit using a set of lighting estimation algorithms selected from the recent literature, and use it to analyse how these algorithms perform according to human perception. Then, we demonstrate that none of the most popular IQA metrics from the literature, taken individually, correctly represent human perception. Finally, we show that by learning a combination of existing IQA metrics, we can more accurately represent human preference. This provides a new perceptual framework to help evaluate future lighting estimation algorithms.
|
Xavier Otazu, M. Ribo, M. Peracaula, J.M. Paredes, & J. Nuñez. (2002). Detection of superimposed periodic signals using wavelets. Monthly Notices of the Royal Astronomical Society, 333, 2: 365–372 (IF: 4.671).
|
Xavier Otazu, M. Ribo, J.M. Paredes, M. Peracaula, & J. Nuñez. (2004). Multiresolution approach for period determination on unevenly sampled data. Monthly Notices of the Royal Astronomical Society, 351:215–219 (IF: 5.238).
|
Maria Vanrell, Ramon Baldrich, Anna Salvatella, Robert Benavente, & Francesc Tous. (2004). Induction operators for a computational colour-texture representation. Computer Vision and Image Understanding, 94(1–3):92–114, ISSN: 1077–3142 (IF: 0.651).
|
Robert Benavente, Maria Vanrell, & Ramon Baldrich. (2004). Estimation of Fuzzy Sets for Computational Colour Categorization. Color Research and Application, 29(5):342–353 (IF: 0.739).
|
M. Gonzalez-Audicana, Xavier Otazu, O. Fors, & A. Seco. (2005). Comparison between Mallat's and the 'à trous' discrete wavelet transform based algorithms for the fusion of multispectral and panchromatic images. International Journal of Remote Sensing, 26(3):595–614 (IF: 0.925).
|
Xavier Otazu, & Maria Vanrell. (2005). Perceptual representation of textured images. Journal of Imaging Science and Technology, 49(3):262–271 (IF: 0.522).
|
Maria Vanrell, & Jordi Vitria. (1997). Optimal 3x3 decomposable disks for morphological transformations. Image and Vision Computing, 15(2): 845–854.
|
Xavier Otazu, M. Gonzalez-Audicana, O. Fors, & J. Nuñez. (2005). Introduction of Sensor Spectral Response Into Image Fusion Methods. Application to Wavelet-Based Methods. IEEE Transactions on Geoscience and Remote Sensing, 43(10): 2376–2385 (IF: 1.627).
|
A. Richichi, O. Fors, M.T. Merino, Xavier Otazu, J. Nuñez, A. Prades, et al. (2006). The Calar Alto lunar occultation program: update and new results. Astronomy and Astrophysics (Section 'Stellar structure and evolution'), 445:1081–1088.
|
Robert Benavente, Maria Vanrell, & Ramon Baldrich. (2006). A data set for fuzzy colour naming. Color Research & Application, 31(1):48–56.
|
J. Nuñez, Xavier Otazu, & M.T. Merino. (2005). A Multiresolution-Based Method for the Determination of the Relative Resolution between Images. First Application to Remote Sensing and Medical Images. International Journal of Imaging Systems and Technology, 15(5): 225–235 (IF: 0.439).
|
Xavier Otazu, & Maria Vanrell. (2006). Several lightness induction effects with a computational multiresolution wavelet framework. 29th European Conference on Visual Perception (ECVP'06), Perception Supplement, 32: 56.
|