|
Fei Yang, Luis Herranz, Joost Van de Weijer, Jose Antonio Iglesias, Antonio Lopez, & Mikhail Mozerov. (2020). Variable Rate Deep Image Compression with Modulated Autoencoder. SPL - IEEE Signal Processing Letters, 27, 331–335.
Abstract: Variable rate is a requirement for flexible and adaptable image and video compression. However, deep image compression methods (DIC) are optimized for a single fixed rate-distortion (R-D) tradeoff. While this can be addressed by training multiple models for different tradeoffs, the memory requirements increase proportionally to the number of models. Scaling the bottleneck representation of a shared autoencoder can provide variable rate compression with a single shared autoencoder. However, the R-D performance using this simple mechanism degrades in low bitrates, and also shrinks the effective range of bitrates. To address these limitations, we formulate the problem of variable R-D optimization for DIC, and propose modulated autoencoders (MAEs), where the representations of a shared autoencoder are adapted to the specific R-D tradeoff via a modulation network. Jointly training this modulated autoencoder and the modulation network provides an effective way to navigate the R-D operational curve. Our experiments show that the proposed method can achieve almost the same R-D performance of independent models with significantly fewer parameters.
|
|
|
Mikhail Mozerov, & Joost Van de Weijer. (2017). Improved Recursive Geodesic Distance Computation for Edge Preserving Filter. TIP - IEEE Transactions on Image Processing, 26(8), 3696–3706.
Abstract: All known recursive filters based on the geodesic distance affinity are realized by two 1D recursions applied in two orthogonal directions of the image plane. The 2D extension of the filter is not valid and has theoretically drawbacks, which lead to known artifacts. In this paper, a maximum influence propagation method is proposed to approximate the 2D extension for the
geodesic distance-based recursive filter. The method allows to partially overcome the drawbacks of the 1D recursion approach. We show that our improved recursion better approximates the true geodesic distance filter, and the application of this improved filter for image denoising outperforms the existing recursive implementation of the geodesic distance. As an application,
we consider a geodesic distance-based filter for image denoising.
Experimental evaluation of our denoising method demonstrates comparable and for several test images better results, than stateof-the-art approaches, while our algorithm is considerably fasterwith computational complexity O(8P).
Keywords: Geodesic distance filter; color image filtering; image enhancement
|
|
|
Xinhang Song, Shuqiang Jiang, & Luis Herranz. (2017). Multi-Scale Multi-Feature Context Modeling for Scene Recognition in the Semantic Manifold. TIP - IEEE Transactions on Image Processing, 26(6), 2721–2735.
Abstract: Before the big data era, scene recognition was often approached with two-step inference using localized intermediate representations (objects, topics, and so on). One of such approaches is the semantic manifold (SM), in which patches and images are modeled as points in a semantic probability simplex. Patch models are learned resorting to weak supervision via image labels, which leads to the problem of scene categories co-occurring in this semantic space. Fortunately, each category has its own co-occurrence patterns that are consistent across the images in that category. Thus, discovering and modeling these patterns are critical to improve the recognition performance in this representation. Since the emergence of large data sets, such as ImageNet and Places, these approaches have been relegated in favor of the much more powerful convolutional neural networks (CNNs), which can automatically learn multi-layered representations from the data. In this paper, we address many limitations of the original SM approach and related works. We propose discriminative patch representations using neural networks and further propose a hybrid architecture in which the semantic manifold is built on top of multiscale CNNs. Both representations can be computed significantly faster than the Gaussian mixture models of the original SM. To combine multiple scales, spatial relations, and multiple features, we formulate rich context models using Markov random fields. To solve the optimization problem, we analyze global and local approaches, where a top-down hierarchical algorithm has the best performance. Experimental results show that exploiting different types of contextual relations jointly consistently improves the recognition accuracy.
|
|
|
Mikhail Mozerov, Fei Yang, & Joost Van de Weijer. (2019). Sparse Data Interpolation Using the Geodesic Distance Affinity Space. SPL - IEEE Signal Processing Letters, 26(6), 943–947.
Abstract: In this letter, we adapt the geodesic distance-based recursive filter to the sparse data interpolation problem. The proposed technique is general and can be easily applied to any kind of sparse data. We demonstrate its superiority over other interpolation techniques in three experiments for qualitative and quantitative evaluation. In addition, we compare our method with the popular interpolation algorithm presented in the paper on EpicFlow optical flow, which is intuitively motivated by a similar geodesic distance principle. The comparison shows that our algorithm is more accurate and considerably faster than the EpicFlow interpolation technique.
|
|
|
Pedro Martins, Paulo Carvalho, & Carlo Gatta. (2014). Context-aware features and robust image representations. JVCIR - Journal of Visual Communication and Image Representation, 25(2), 339–348.
Abstract: Local image features are often used to efficiently represent image content. The limited number of types of features that a local feature extractor responds to might be insufficient to provide a robust image representation. To overcome this limitation, we propose a context-aware feature extraction formulated under an information theoretic framework. The algorithm does not respond to a specific type of features; the idea is to retrieve complementary features which are relevant within the image context. We empirically validate the method by investigating the repeatability, the completeness, and the complementarity of context-aware features on standard benchmarks. In a comparison with strictly local features, we show that our context-aware features produce more robust image representations. Furthermore, we study the complementarity between strictly local features and context-aware ones to produce an even more robust representation.
|
|