|
Guim Perarnau, Joost Van de Weijer, Bogdan Raducanu and Jose Manuel Alvarez. 2016. Invertible conditional gans for image editing. 30th Annual Conference on Neural Information Processing Systems Worshops.
Abstract: Generative Adversarial Networks (GANs) have recently demonstrated to successfully approximate complex data distributions. A relevant extension of this model is conditional GANs (cGANs), where the introduction of external information allows to determine specific representations of the generated images. In this work, we evaluate encoders to inverse the mapping of a cGAN, i.e., mapping a real image into a latent space and a conditional representation. This allows, for example, to reconstruct and modify real images of faces conditioning on arbitrary attributes.
Additionally, we evaluate the design of cGANs. The combination of an encoder
with a cGAN, which we call Invertible cGAN (IcGAN), enables to re-generate real
images with deterministic complex modifications.
|
|
|
Jose Manuel Alvarez and Antonio Lopez. 2008. Novel Index for Objective Evaluation of Road Detection Algorithms. Intelligent Transportation Systems. 11th International IEEE Conference on,.815–820.
|
|
|
Marçal Rusiñol, David Aldavert, Ricardo Toledo and Josep Llados. 2011. Browsing Heterogeneous Document Collections by a Segmentation-Free Word Spotting Method. 11th International Conference on Document Analysis and Recognition.63–67.
Abstract: In this paper, we present a segmentation-free word spotting method that is able to deal with heterogeneous document image collections. We propose a patch-based framework where patches are represented by a bag-of-visual-words model powered by SIFT descriptors. A later refinement of the feature vectors is performed by applying the latent semantic indexing technique. The proposed method performs well on both handwritten and typewritten historical document images. We have also tested our method on documents written in non-Latin scripts.
|
|
|
Victor Vaquero, German Ros, Francesc Moreno-Noguer, Antonio Lopez and Alberto Sanfeliu. 2017. Joint coarse-and-fine reasoning for deep optical flow. 24th International Conference on Image Processing.2558–2562.
Abstract: We propose a novel representation for dense pixel-wise estimation tasks using CNNs that boosts accuracy and reduces training time, by explicitly exploiting joint coarse-and-fine reasoning. The coarse reasoning is performed over a discrete classification space to obtain a general rough solution, while the fine details of the solution are obtained over a continuous regression space. In our approach both components are jointly estimated, which proved to be beneficial for improving estimation accuracy. Additionally, we propose a new network architecture, which combines coarse and fine components by treating the fine estimation as a refinement built on top of the coarse solution, and therefore adding details to the general prediction. We apply our approach to the challenging problem of optical flow estimation and empirically validate it against state-of-the-art CNN-based solutions trained from scratch and tested on large optical flow datasets.
|
|
|
David Aldavert, Ricardo Toledo, Arnau Ramisa and Ramon Lopez de Mantaras. 2009. Visual Registration Method For A Low Cost Robot: Computer Vision Systems. 7th International Conference on Computer Vision Systems. Springer Berlin Heidelberg, 204–214. (LNCS.)
Abstract: An autonomous mobile robot must face the correspondence or data association problem in order to carry out tasks like place recognition or unknown environment mapping. In order to put into correspondence two maps, most methods estimate the transformation relating the maps from matches established between low level feature extracted from sensor data. However, finding explicit matches between features is a challenging and computationally expensive task. In this paper, we propose a new method to align obstacle maps without searching explicit matches between features. The maps are obtained from a stereo pair. Then, we use a vocabulary tree approach to identify putative corresponding maps followed by the Newton minimization algorithm to find the transformation that relates both maps. The proposed method is evaluated in a typical office environment showing good performance.
|
|
|
Joan Serrat, Jordi Vitria and J. Pladellorens. 1991. Morphological Segmentation of Heart Scintigraphic image Sequences. Computer Assisted Radiology..
|
|
|
David Geronimo, Angel Sappa, Antonio Lopez and Daniel Ponsa. 2007. Adaptive Image Sampling and Windows Classification for On-board Pedestrian Detection. Proceedings of the 5th International Conference on Computer Vision Systems.
Abstract: On–board pedestrian detection is in the frontier of the state–of–the–art since it implies processing outdoor scenarios from a mobile platform and searching for aspect–changing objects in cluttered urban environments. Most promising approaches include the development of classifiers based on feature selection and machine learning. However, they use a large number of features which compromises real–time. Thus, methods for running the classifiers in only a few image windows must be provided. In this paper we contribute in both aspects, proposing a camera
pose estimation method for adaptive sparse image sampling, as well as a classifier for pedestrian detection based on Haar wavelets and edge orientation histograms as features and AdaBoost as learning machine. Both proposals are compared with relevant approaches in the literature, showing comparable results but reducing processing time by four for the sampling tasks and by ten for the classification one.
Keywords: Pedestrian Detection
|
|
|
Mohammad Rouhani and Angel Sappa. 2009. A Novel Approach to Geometric Fitting of Implicit Quadrics. 8th International Conference on Advanced Concepts for Intelligent Vision Systems. Springer Berlin Heidelberg, 121–132. (LNCS.)
Abstract: This paper presents a novel approach for estimating the geometric distance from a given point to the corresponding implicit quadric curve/surface. The proposed estimation is based on the height of a tetrahedron, which is used as a coarse but reliable estimation of the real distance. The estimated distance is then used for finding the best set of quadric parameters, by means of the Levenberg-Marquardt algorithm, which is a common framework in other geometric fitting approaches. Comparisons of the proposed approach with previous ones are provided to show both improvements in CPU time as well as in the accuracy of the obtained results.
|
|
|
Patricia Marquez and 6 others. 2014. Factors Affecting Optical Flow Performance in Tagging Magnetic Resonance Imaging. 17th International Conference on Medical Image Computing and Computer Assisted Intervention. Springer International Publishing, 231–238. (LNCS.)
Abstract: Changes in cardiac deformation patterns are correlated with cardiac pathologies. Deformation can be extracted from tagging Magnetic Resonance Imaging (tMRI) using Optical Flow (OF) techniques. For applications of OF in a clinical setting it is important to assess to what extent the performance of a particular OF method is stable across dierent clinical acquisition artifacts. This paper presents a statistical validation framework, based on ANOVA, to assess the motion and appearance factors that have the largest in uence on OF accuracy drop.
In order to validate this framework, we created a database of simulated tMRI data including the most common artifacts of MRI and test three dierent OF methods, including HARP.
Keywords: Optical flow; Performance Evaluation; Synthetic Database; ANOVA; Tagging Magnetic Resonance Imaging
|
|
|
Felipe Codevilla, Matthias Muller, Antonio Lopez, Vladlen Koltun and Alexey Dosovitskiy. 2018. End-to-end Driving via Conditional Imitation Learning. IEEE International Conference on Robotics and Automation.4693–4700.
Abstract: Deep networks trained on demonstrations of human driving have learned to follow roads and avoid obstacles. However, driving policies trained via imitation learning cannot be controlled at test time. A vehicle trained end-to-end to imitate an expert cannot be guided to take a specific turn at an upcoming intersection. This limits the utility of such systems. We propose to condition imitation learning on high-level command input. At test time, the learned driving policy functions as a chauffeur that handles sensorimotor coordination but continues to respond to navigational commands. We evaluate different architectures for conditional imitation learning in vision-based driving. We conduct experiments in realistic three-dimensional simulations of urban driving and on a 1/5 scale robotic truck that is trained to drive in a residential area. Both systems drive based on visual input yet remain responsive to high-level navigational commands. The supplementary video can be viewed at this https URL
|
|