|
German Ros, J. Guerrero, Angel Sappa and Antonio Lopez. 2013. VSLAM pose initialization via Lie groups and Lie algebras optimization. Proceedings of IEEE International Conference on Robotics and Automation. 5740–5747.
Abstract: We present a novel technique for estimating initial 3D poses in the context of localization and Visual SLAM problems. The presented approach can deal with noise, outliers and a large amount of input data and still performs in real time on a standard CPU. Our method produces solutions with an accuracy comparable to those produced by RANSAC but can be much faster when the percentage of outliers is high or for large amounts of input data. In the current work we propose to formulate the pose estimation as an optimization problem on Lie groups, considering their manifold structure as well as their associated Lie algebras. This allows us to perform a fast and simple optimization while preserving all the constraints imposed by the Lie group SE(3). Additionally, we present several key design concepts related to the cost function and its Jacobian, aspects that are critical for the good performance of the algorithm.
Keywords: SLAM
|
|
|
David Aldavert, Marçal Rusiñol, Ricardo Toledo and Josep Llados. 2013. Integrating Visual and Textual Cues for Query-by-String Word Spotting. 12th International Conference on Document Analysis and Recognition. 511–515.
Abstract: In this paper, we present a word spotting framework that follows the query-by-string paradigm where word images are represented both by textual and visual representations. The textual representation is formulated in terms of character $n$-grams while the visual one is based on the bag-of-visual-words scheme. These two representations are merged together and projected to a sub-vector space. Given a textual query, this transform makes it possible to retrieve word instances that were only represented by the visual modality. Moreover, this statistical representation can be used together with state-of-the-art indexation structures in order to deal with large-scale scenarios. The proposed method is evaluated using a collection of historical documents and outperforms state-of-the-art methods.
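As a rough illustration of the character $n$-gram idea mentioned in the abstract, the sketch below counts the contiguous character n-grams of a query string. The function names and the choice of n-gram sizes (1, 2, 3) are assumptions made for this example; the abstract does not specify how the paper builds or weights its textual descriptor.

```python
from collections import Counter


def char_ngrams(word, n):
    """Return the contiguous character n-grams of `word`, left to right."""
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def ngram_histogram(word, sizes=(1, 2, 3)):
    """Count n-grams of every size in `sizes` (a hypothetical choice)."""
    counts = Counter()
    for n in sizes:
        counts.update(char_ngrams(word, n))
    return counts


if __name__ == "__main__":
    # Bigrams of a short query string.
    print(char_ngrams("word", 2))
    # Combined unigram and bigram counts.
    print(ngram_histogram("aa", sizes=(1, 2)))
```

A histogram of this kind gives a fixed vocabulary of string fragments that can be compared across words, which is the general property a query-by-string representation relies on.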
|
|
|
Karel Paleček, David Geronimo and Frederic Lerasle. 2012. Pre-attention cues for person detection. Cognitive Behavioural Systems, COST 2102 International Training School. Springer Berlin Heidelberg, 225–235. (LNCS.)
Abstract: Current state-of-the-art person detectors have been proven reliable and achieve very good detection rates. However, the performance is often far from real time, which limits their use to low resolution images only. In this paper, we deal with the candidate window generation problem for person detection, i.e. we want to reduce the computational complexity of a person detector by reducing the number of regions that have to be evaluated. We base our work on Alexe’s paper [1], which introduced several pre-attention cues for generic object detection. We evaluate these cues in the context of person detection and show that their performance degrades rapidly for scenes containing multiple objects of interest such as pictures from urban environments. We extend this set with new cues, which better suit our class-specific task. The cues are designed to be simple and efficient, so that they can be used in the pre-attention phase of a more complex sliding window based person detector.
|
|
|
Jose Carlos Rubio, Joan Serrat and Antonio Lopez. 2012. Video Co-segmentation. 11th Asian Conference on Computer Vision. Springer Berlin Heidelberg, 13–24. (LNCS.)
Abstract: Segmentation of a single image is in general a highly underconstrained problem. A frequent approach to solve it is to somehow provide prior knowledge or constraints on what the objects of interest look like (in terms of their shape, size, color, location or structure). Image co-segmentation trades the need for such knowledge for something much easier to obtain, namely, additional images showing the object from other viewpoints. Now the segmentation problem is posed as one of differentiating the similar object regions in all the images from the more varying background. In this paper, for the first time, we extend this approach to video segmentation: given two or more video sequences showing the same object (or objects belonging to the same class) moving in a similar manner, we aim to outline its region in all the frames. In addition, the method works in an unsupervised manner, by learning to segment at testing time. We compare favorably with two state-of-the-art methods on video segmentation and report results on benchmark videos.
|
|
|
Monica Piñol, Angel Sappa and Ricardo Toledo. 2012. MultiTable Reinforcement for Visual Object Recognition. 4th International Conference on Signal and Image Processing. Springer India, 469–480. (LNCS.)
Abstract: This paper presents a bag-of-features-based method for visual object recognition. Our contribution is focussed on the selection of the best feature descriptor. It is implemented by using a novel multi-table reinforcement learning method that selects among five classical descriptors (i.e., Spin, SIFT, SURF, C-SIFT and PHOW) the one that best describes each image. Experimental results and comparisons are provided showing the improvements achieved with the proposed approach.
|
|
|
Mohammad Rouhani and Angel Sappa. 2012. Non-Rigid Shape Registration: A Single Linear Least Squares Framework. 12th European Conference on Computer Vision. Springer Berlin Heidelberg, 264–277. (LNCS.)
Abstract: This paper proposes a non-rigid registration formulation capturing both global and local deformations in a single framework. This formulation is based on a quadratic estimation of the registration distance together with a quadratic regularization term. Hence, the optimal transformation parameters are easily obtained by solving a linear system of equations, which guarantees a fast convergence. Experimental results with challenging 2D and 3D shapes are presented to show the validity of the proposed framework. Furthermore, comparisons with the most relevant approaches are provided.
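The step from a quadratic objective to a linear system can be illustrated with a generic regularized least squares problem: minimizing ||A x - b||^2 + lam * ||x||^2 leads to the normal equations (A^T A + lam * I) x = A^T b. The sketch below is only an illustration under stated assumptions: the identity regularizer, the function names and the small dense solver are choices made for this example and are not taken from the paper, whose registration distance and regularization term are not given in the abstract.

```python
def solve_linear_system(M, v):
    """Solve M x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    # Augmented matrix [M | v], copied so the inputs are not modified.
    aug = [[float(x) for x in M[i]] + [float(v[i])] for i in range(n)]
    for col in range(n):
        # Move the row with the largest entry in this column to the pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def regularized_least_squares(A, b, lam):
    """Minimize ||A x - b||^2 + lam * ||x||^2 via the normal equations."""
    m, n = len(A), len(A[0])
    # Left-hand side: A^T A + lam * I (identity regularizer is an assumption).
    lhs = [[sum(A[k][i] * A[k][j] for k in range(m)) + (lam if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    # Right-hand side: A^T b.
    rhs = [sum(A[k][i] * b[k] for k in range(m)) for i in range(n)]
    return solve_linear_system(lhs, rhs)


if __name__ == "__main__":
    # Fit y = c0 + c1 * t to three collinear points, without regularization.
    A = [[1, 1], [1, 2], [1, 3]]
    b = [1, 2, 3]
    print(regularized_least_squares(A, b, 0.0))
```

Because both terms are quadratic in the unknowns, a single linear solve yields the minimizer, with no iterative search over the parameters.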
|
|
|
Miguel Oliveira, V. Santos and Angel Sappa. 2012. Short term path planning using a multiple hypothesis evaluation approach for an autonomous driving competition. IEEE 4th Workshop on Planning, Perception and Navigation for Intelligent Vehicles.
|
|
|
Jose Manuel Alvarez, Y. LeCun, Theo Gevers and Antonio Lopez. 2012. Semantic Road Segmentation via Multi-Scale Ensembles of Learned Features. 12th European Conference on Computer Vision – Workshops and Demonstrations. Springer Berlin Heidelberg, 586–595. (LNCS.)
Abstract: Semantic segmentation refers to the process of assigning an object label (e.g., building, road, sidewalk, car, pedestrian) to every pixel in an image. Common approaches formulate the task as a random field labeling problem modeling the interactions between labels by combining local and contextual features such as color, depth, edges, SIFT or HoG. These models are trained to maximize the likelihood of the correct classification given a training set. However, these approaches rely on hand-designed features (e.g., texture, SIFT or HoG) and require a higher computational time in the inference process.
Therefore, in this paper, we focus on estimating the unary potentials of a conditional random field via ensembles of learned features. We propose an algorithm based on convolutional neural networks to learn local features from training data at different scales and resolutions. Then, diversification between these features is exploited using a weighted linear combination. Experiments on a publicly available database show the effectiveness of the proposed method to perform semantic road scene segmentation in still images. The algorithm outperforms appearance based methods and its performance is similar to that of state-of-the-art methods using other sources of information such as depth, motion or stereo.
Keywords: road detection
|
|
|
Gemma Roig, Xavier Boix, R. de Nijs, Sebastian Ramos, K. Kühnlenz and Luc Van Gool. 2013. Active MAP Inference in CRFs for Efficient Semantic Segmentation. 15th IEEE International Conference on Computer Vision. 2312–2319.
Abstract: Most MAP inference algorithms for CRFs optimize an energy function knowing all the potentials. In this paper, we focus on CRFs where the computational cost of instantiating the potentials is orders of magnitude higher than MAP inference. This is often the case in semantic image segmentation, where most potentials are instantiated by slow classifiers fed with costly features. We introduce Active MAP inference 1) to select on the fly a subset of potentials to be instantiated in the energy function, leaving the rest of the parameters of the potentials unknown, and 2) to estimate the MAP labeling from such an incomplete energy function. Results for semantic segmentation benchmarks, namely PASCAL VOC 2010 [5] and MSRC-21 [19], show that Active MAP inference achieves similar levels of accuracy but with major efficiency gains.
Keywords: Semantic Segmentation
|
|
|
Jiaolong Xu, David Vazquez, Antonio Lopez, Javier Marin and Daniel Ponsa. 2013. Learning a Multiview Part-based Model in Virtual World for Pedestrian Detection. IEEE Intelligent Vehicles Symposium. IEEE, 467–472.
Abstract: State-of-the-art deformable part-based models based on latent SVM have shown excellent results on human detection. In this paper, we propose to train a multiview deformable part-based model with automatically generated part examples from virtual-world data. The method is efficient as: (i) the part detectors are trained with precisely extracted virtual examples, thus no latent learning is needed, (ii) the multiview pedestrian detector enhances the performance of the pedestrian root model, (iii) a top-down approach is used for part detection which reduces the search space. We evaluate our model on the Daimler and Karlsruhe Pedestrian Benchmarks with the publicly available Caltech pedestrian detection evaluation framework, and the result outperforms the state-of-the-art latent SVM V4.0 on both average miss rate and speed (our detector is ten times faster).
Keywords: Pedestrian Detection; Virtual World; Part based
|
|