Jose Carlos Rubio, Joan Serrat and Antonio Lopez. 2012. Video Co-segmentation. 11th Asian Conference on Computer Vision. Springer Berlin Heidelberg, 13–24. (LNCS.)
Abstract: Segmentation of a single image is in general a highly underconstrained problem. A frequent approach to solve it is to somehow provide prior knowledge or constraints on what the objects of interest look like (in terms of their shape, size, color, location or structure). Image co-segmentation trades the need for such knowledge for something much easier to obtain, namely, additional images showing the object from other viewpoints. Now the segmentation problem is posed as one of differentiating the similar object regions in all the images from the more varying background. In this paper, for the first time, we extend this approach to video segmentation: given two or more video sequences showing the same object (or objects belonging to the same class) moving in a similar manner, we aim to outline its region in all the frames. In addition, the method works in an unsupervised manner, by learning to segment at testing time. We compare favorably with two state-of-the-art methods on video segmentation and report results on benchmark videos.
Monica Piñol, Angel Sappa and Ricardo Toledo. 2012. MultiTable Reinforcement for Visual Object Recognition. 4th International Conference on Signal and Image Processing. Springer India, 469–480. (LNCS.)
Abstract: This paper presents a bag-of-features based method for visual object recognition. Our contribution is focussed on the selection of the best feature descriptor. It is implemented by using a novel multi-table reinforcement learning method that selects, among five classical descriptors (i.e., Spin, SIFT, SURF, C-SIFT and PHOW), the one that best describes each image. Experimental results and comparisons are provided showing the improvements achieved with the proposed approach.
Mohammad Rouhani and Angel Sappa. 2012. Non-Rigid Shape Registration: A Single Linear Least Squares Framework. 12th European Conference on Computer Vision. Springer Berlin Heidelberg, 264–277. (LNCS.)
Abstract: This paper proposes a non-rigid registration formulation capturing both global and local deformations in a single framework. This formulation is based on a quadratic estimation of the registration distance together with a quadratic regularization term. Hence, the optimal transformation parameters are easily obtained by solving a linear system of equations, which guarantees a fast convergence. Experimental results with challenging 2D and 3D shapes are presented to show the validity of the proposed framework. Furthermore, comparisons with the most relevant approaches are provided.
Miguel Oliveira, V. Santos and Angel Sappa. 2012. Short term path planning using a multiple hypothesis evaluation approach for an autonomous driving competition. IEEE 4th Workshop on Planning, Perception and Navigation for Intelligent Vehicles.
Jose Manuel Alvarez, Y. LeCun, Theo Gevers and Antonio Lopez. 2012. Semantic Road Segmentation via Multi-Scale Ensembles of Learned Features. 12th European Conference on Computer Vision – Workshops and Demonstrations. Springer Berlin Heidelberg, 586–595. (LNCS.)
Abstract: Semantic segmentation refers to the process of assigning an object label (e.g., building, road, sidewalk, car, pedestrian) to every pixel in an image. Common approaches formulate the task as a random field labeling problem modeling the interactions between labels by combining local and contextual features such as color, depth, edges, SIFT or HoG. These models are trained to maximize the likelihood of the correct classification given a training set. However, these approaches rely on hand-designed features (e.g., texture, SIFT or HoG) and require a higher computational time in the inference process.
Therefore, in this paper, we focus on estimating the unary potentials of a conditional random field via ensembles of learned features. We propose an algorithm based on convolutional neural networks to learn local features from training data at different scales and resolutions. Then, diversification between these features is exploited using a weighted linear combination. Experiments on a publicly available database show the effectiveness of the proposed method to perform semantic road scene segmentation in still images. The algorithm outperforms appearance based methods and its performance is comparable to that of state-of-the-art methods using other sources of information such as depth, motion or stereo.
Keywords: road detection
Gemma Roig, Xavier Boix, R. de Nijs, Sebastian Ramos, K. Kühnlenz and Luc Van Gool. 2013. Active MAP Inference in CRFs for Efficient Semantic Segmentation. 15th IEEE International Conference on Computer Vision. 2312–2319.
Abstract: Most MAP inference algorithms for CRFs optimize an energy function knowing all the potentials. In this paper, we focus on CRFs where the computational cost of instantiating the potentials is orders of magnitude higher than MAP inference. This is often the case in semantic image segmentation, where most potentials are instantiated by slow classifiers fed with costly features. We introduce Active MAP inference 1) to select on the fly a subset of potentials to be instantiated in the energy function, leaving the rest of the parameters of the potentials unknown, and 2) to estimate the MAP labeling from such an incomplete energy function. Results for semantic segmentation benchmarks, namely PASCAL VOC 2010 [5] and MSRC-21 [19], show that Active MAP inference achieves similar levels of accuracy but with major efficiency gains.
Keywords: Semantic Segmentation
Jiaolong Xu, David Vazquez, Antonio Lopez, Javier Marin and Daniel Ponsa. 2013. Learning a Multiview Part-based Model in Virtual World for Pedestrian Detection. IEEE Intelligent Vehicles Symposium. IEEE, 467–472.
Abstract: State-of-the-art deformable part-based models based on latent SVM have shown excellent results on human detection. In this paper, we propose to train a multiview deformable part-based model with automatically generated part examples from virtual-world data. The method is efficient as: (i) the part detectors are trained with precisely extracted virtual examples, thus no latent learning is needed, (ii) the multiview pedestrian detector enhances the performance of the pedestrian root model, (iii) a top-down approach is used for part detection, which reduces the search space. We evaluate our model on the Daimler and Karlsruhe Pedestrian Benchmarks with the publicly available Caltech pedestrian detection evaluation framework, and the results outperform the state-of-the-art latent SVM V4.0 in both average miss rate and speed (our detector is ten times faster).
Keywords: Pedestrian Detection; Virtual World; Part based
Patricia Marquez, Debora Gil, Aura Hernandez-Sabate and Daniel Kondermann. 2013. When Is A Confidence Measure Good Enough? 9th International Conference on Computer Vision Systems. Springer Link, 344–353. (LNCS.)
Abstract: Confidence estimation has recently become a hot topic in image processing and computer vision. Yet, several definitions exist of the term “confidence” which are sometimes used interchangeably. This is a position paper, in which we aim to give an overview on existing definitions, thereby clarifying the meaning of the used terms to facilitate further research in this field. Based on these clarifications, we develop a theory to compare confidence measures with respect to their quality.
Keywords: Optical flow, confidence measure, performance evaluation
David Vazquez, Jiaolong Xu, Sebastian Ramos, Antonio Lopez and Daniel Ponsa. 2013. Weakly Supervised Automatic Annotation of Pedestrian Bounding Boxes. CVPR Workshop on Ground Truth – What is a good dataset?. IEEE, 706–711.
Abstract: Among the components of a pedestrian detector, its trained pedestrian classifier is crucial for achieving the desired performance. The initial task of the training process consists in collecting samples of pedestrians and background, which involves tiresome manual annotation of pedestrian bounding boxes (BBs). Thus, recent works have assessed the use of automatically collected samples from photo-realistic virtual worlds. However, learning from virtual-world samples and testing in real-world images may suffer from the dataset shift problem. Accordingly, in this paper we assess a strategy to collect samples from the real world and retrain with them, thus avoiding the dataset shift, but in such a way that no BBs of real-world pedestrians have to be provided. In particular, we train a pedestrian classifier based on virtual-world samples (no human annotation required). Then, using such a classifier we collect pedestrian samples from real-world images by detection. Afterwards, a human oracle rejects the false detections efficiently (weak annotation). Finally, a new classifier is trained with the accepted detections. We show that this classifier is competitive with respect to the counterpart trained with samples collected by manually annotating hundreds of pedestrian BBs.
Keywords: Pedestrian Detection; Domain Adaptation
Jiaolong Xu, David Vazquez, Sebastian Ramos, Antonio Lopez and Daniel Ponsa. 2013. Adapting a Pedestrian Detector by Boosting LDA Exemplar Classifiers. CVPR Workshop on Ground Truth – What is a good dataset?. 688–693.
Abstract: Training vision-based pedestrian detectors using synthetic datasets (virtual world) is a useful technique to automatically collect the training examples with their pixel-wise ground truth. However, as is often the case, these detectors must operate in real-world images, experiencing a significant drop in their performance. In fact, this effect also occurs among different real-world datasets, i.e., detectors' accuracy drops when the training data (source domain) and the application scenario (target domain) have inherent differences. Therefore, in order to avoid this problem, it is required to adapt the detector trained with synthetic data to operate in the real-world scenario. In this paper, we propose a domain adaptation approach based on boosting LDA exemplar classifiers from both virtual and real worlds. We evaluate our proposal on multiple real-world pedestrian detection datasets. The results show that our method can efficiently adapt the exemplar classifiers from virtual to real world, avoiding drops in average precision of over 15%.
Keywords: Pedestrian Detection; Domain Adaptation