Fei Yang, Luis Herranz, Joost Van de Weijer, Jose Antonio Iglesias, Antonio Lopez, & Mikhail Mozerov. (2020). Variable Rate Deep Image Compression with Modulated Autoencoder. SPL - IEEE Signal Processing Letters, 27, 331–335.
Abstract: Variable rate is a requirement for flexible and adaptable image and video compression. However, deep image compression (DIC) methods are optimized for a single fixed rate-distortion (R-D) tradeoff. While this can be addressed by training multiple models for different tradeoffs, the memory requirements increase proportionally to the number of models. Scaling the bottleneck representation of a shared autoencoder can provide variable rate compression with a single model. However, the R-D performance of this simple mechanism degrades at low bitrates, and it also shrinks the effective range of bitrates. To address these limitations, we formulate the problem of variable R-D optimization for DIC, and propose modulated autoencoders (MAEs), where the representations of a shared autoencoder are adapted to the specific R-D tradeoff via a modulation network. Jointly training this modulated autoencoder and the modulation network provides an effective way to navigate the R-D operational curve. Our experiments show that the proposed method can achieve almost the same R-D performance as independent models with significantly fewer parameters.
Ferran Diego, Joan Serrat, & Antonio Lopez. (2013). Joint spatio-temporal alignment of sequences. TMM - IEEE Transactions on Multimedia, 15(6), 1377–1387.
Abstract: Video alignment is important in different areas of computer vision such as wide baseline matching, action recognition, change detection, video copy detection and frame dropping prevention. Current video alignment methods usually deal with the relatively simple case of fixed or rigidly attached cameras or simultaneous acquisition. Therefore, in this paper we propose a joint video alignment method for bringing two video sequences into spatio-temporal alignment. Specifically, the novelty of the paper is a formulation that folds the spatial and temporal alignment into a single alignment framework. This simultaneously satisfies a frame-correspondence and a frame-alignment similarity, exploiting the knowledge among neighboring frames by means of a standard pairwise Markov random field (MRF). This new formulation is able to handle the alignment of sequences recorded at different times by independently moving cameras that follow a similar trajectory, and also generalizes the particular cases of a fixed geometric transformation and/or a linear temporal mapping. We conduct experiments on different scenarios, such as sequences recorded simultaneously or by moving cameras, to validate the robustness of the proposed approach. The proposed method provides the highest video alignment accuracy compared to the state-of-the-art methods on sequences recorded from vehicles driving along the same track at different times.
Keywords: video alignment
Jose Manuel Alvarez, Theo Gevers, Ferran Diego, & Antonio Lopez. (2013). Road Geometry Classification by Adaptive Shape Models. TITS - IEEE Transactions on Intelligent Transportation Systems, 14(1), 459–468.
Abstract: Vision-based road detection is important for different applications in transportation, such as autonomous driving, vehicle collision warning, and pedestrian crossing detection. Common approaches to road detection are based on low-level road appearance (e.g., color or texture) and neglect the scene geometry and context. Hence, using only low-level features makes these algorithms highly dependent on structured roads, road homogeneity, and lighting conditions. Therefore, the aim of this paper is to classify road geometries for road detection through the analysis of scene composition and temporal coherence. Road geometry classification is proposed by building corresponding models from training images containing prototypical road geometries. We propose adaptive shape models where spatial pyramids are steered by the inherent spatial structure of road images. To reduce the influence of lighting variations, invariant features are used. Large-scale experiments show that the proposed road geometry classifier yields a high recognition rate of 73.57% ± 13.1, clearly outperforming other state-of-the-art methods. Including road shape information improves road detection results over existing appearance-based methods. Finally, it is shown that invariant features and temporal information provide robustness against disturbing imaging conditions.
Keywords: road detection
Joan Serrat, Felipe Lumbreras, & Antonio Lopez. (2013). Cost estimation of custom hoses from STL files and CAD drawings. COMPUTIND - Computers in Industry, 64(3), 299–309.
Abstract: We present a method for the cost estimation of custom hoses from CAD models. The models can come in two formats, both easy to generate: an STL file or the image of a CAD drawing showing several orthogonal projections. The challenges in either case are, first, to obtain from them a high-level 3D description of the shape, and second, to learn a regression function for the prediction of the manufacturing time, based on geometric features of the reconstructed shape. The chosen description is the 3D line along the medial axis of the tube and the diameter of the circular sections along it. In order to extract it from STL files, we have adapted RANSAC, a robust parametric fitting algorithm. As for CAD drawing images, we propose a new technique for 3D reconstruction from data entered on any number of orthogonal projections. The regression function is a Gaussian process, which does not constrain the function to adopt any specific form and is governed by just two parameters. We assess the accuracy of the manufacturing time estimation by k-fold cross validation on 171 STL file models for which the time is provided by an expert. The results show the feasibility of the method, whereby the relative error for 80% of the testing samples is below 15%.
Keywords: On-line quotation; STL format; Regression; Gaussian process
Miguel Oliveira, Victor Santos, Angel Sappa, P. Dias, & A. Moreira. (2016). Incremental Scenario Representations for Autonomous Driving using Geometric Polygonal Primitives. RAS - Robotics and Autonomous Systems, 83, 312–325.
Abstract: When an autonomous vehicle is traveling through some scenario, it receives a continuous stream of sensor data. This sensor data arrives in an asynchronous fashion and often contains overlapping or redundant information. Thus, it is not trivial to create a representation of the environment observed by the vehicle and to update it over time. This paper presents a novel methodology to compute an incremental 3D representation of a scenario from 3D range measurements. We propose to use macro-scale polygonal primitives to model the scenario. This means that the representation of the scene is given as a list of large-scale polygons that describe the geometric structure of the environment. Furthermore, we propose mechanisms designed to update the geometric polygonal primitives over time whenever fresh sensor data is collected. Results show that the approach is capable of producing accurate descriptions of the scene, and that it is computationally very efficient when compared to other reconstruction techniques.
Keywords: Incremental scene reconstruction; Point clouds; Autonomous vehicles; Polygonal primitives