|
Alejandro Gonzalez Alzate, Gabriel Villalonga, German Ros, David Vazquez and Antonio Lopez. 2015. 3D-Guided Multiscale Sliding Window for Pedestrian Detection. Pattern Recognition and Image Analysis, Proceedings of 7th Iberian Conference, ibPRIA 2015.560–568.
Abstract: The most relevant modules of a pedestrian detector are candidate generation and candidate classification. The former aims at presenting image windows to the latter so that they are classified as containing a pedestrian or not. Much attention has been paid to the classification module, while candidate generation has mainly relied on a (multiscale) sliding window pyramid. However, candidate generation is critical for achieving real-time performance. In this paper we assume a context of autonomous driving based on stereo vision. Accordingly, we evaluate the effect of taking into account the 3D information (derived from the stereo) in order to prune the hundreds of thousands of windows per image generated by the classical pyramidal sliding window. For our study we use a multimodal (RGB, disparity) and multi-descriptor (HOG, LBP, HOG+LBP) holistic ensemble based on linear SVM. Evaluation on data from the challenging KITTI benchmark suite shows the effectiveness of using 3D information to dramatically reduce the number of candidate windows, even improving the overall pedestrian detection accuracy.
Keywords: Pedestrian Detection
|
|
|
David Geronimo, Angel Sappa, Antonio Lopez and Daniel Ponsa. 2007. Adaptive Image Sampling and Windows Classification for On-board Pedestrian Detection. Proceedings of the 5th International Conference on Computer Vision Systems.
Abstract: On-board pedestrian detection is at the frontier of the state of the art, since it implies processing outdoor scenarios from a mobile platform and searching for aspect-changing objects in cluttered urban environments. The most promising approaches include the development of classifiers based on feature selection and machine learning. However, they use a large number of features, which compromises real-time performance. Thus, methods for running the classifiers in only a few image windows must be provided. In this paper we contribute to both aspects, proposing a camera pose estimation method for adaptive sparse image sampling, as well as a classifier for pedestrian detection based on Haar wavelets and edge orientation histograms as features and AdaBoost as the learning machine. Both proposals are compared with relevant approaches in the literature, showing comparable results but reducing processing time by a factor of four for the sampling task and by a factor of ten for the classification task.
Keywords: Pedestrian Detection
|
|
|
David Geronimo, Antonio Lopez and Angel Sappa. 2007. Computer Vision Approaches for Pedestrian Detection: Visible Spectrum Survey. In J. Marti et al., ed. 3rd Iberian Conference on Pattern Recognition and Image Analysis, LNCS 4477.547–554.
Abstract: Pedestrian detection from images of the visible spectrum is a highly relevant area of research given its potential impact on the design of pedestrian protection systems. There are many proposals in the literature, but they lack a comparative viewpoint. Accordingly, in this paper we first propose a common framework into which we fit the different approaches, and second we use this framework to provide a comparative view of the details of these approaches, also pointing out the main challenges to be solved in the future. In summary, we expect this survey to be useful for both novice and experienced researchers in the field: in the first case, as a clarifying snapshot of the state of the art; in the second, as a way to unveil trends and to draw conclusions from the comparative study.
Keywords: Pedestrian detection
|
|
|
David Geronimo, Antonio Lopez, Daniel Ponsa and Angel Sappa. 2007. Haar Wavelets and Edge Orientation Histograms for On-Board Pedestrian Detection. In J. Marti et al., ed. 3rd Iberian Conference on Pattern Recognition and Image Analysis, LNCS 4477.418–425.
Keywords: Pedestrian detection
|
|
|
Ishaan Gulrajani and 6 others. 2017. PixelVAE: A Latent Variable Model for Natural Images. 5th International Conference on Learning Representations.
Abstract: Natural image modeling is a landmark challenge of unsupervised learning. Variational Autoencoders (VAEs) learn a useful latent representation and generate samples that preserve global structure but tend to suffer from image blurriness. PixelCNNs model sharp contours and details very well, but lack an explicit latent representation and have difficulty modeling large-scale structure in a computationally efficient way. In this paper, we present PixelVAE, a VAE model with an autoregressive decoder based on PixelCNN. The resulting architecture achieves state-of-the-art log-likelihood on binarized MNIST. We extend PixelVAE to a hierarchy of multiple latent variables at different scales; this hierarchical model achieves competitive likelihood on 64x64 ImageNet and generates high-quality samples on LSUN bedrooms.
Keywords: Deep Learning; Unsupervised Learning
|
|
|
Josep M. Gonfaus, Xavier Boix, Joost Van de Weijer, Andrew Bagdanov, Joan Serrat and Jordi Gonzalez. 2010. Harmony Potentials for Joint Classification and Segmentation. 23rd IEEE Conference on Computer Vision and Pattern Recognition.3280–3287.
Abstract: Hierarchical conditional random fields have been successfully applied to object segmentation. One reason is their ability to incorporate contextual information at different scales. However, these models do not allow multiple labels to be assigned to a single node. At higher scales in the image, this yields an oversimplified model, since multiple classes can reasonably be expected to appear within one region. This simplified model especially limits the impact that observations at larger scales may have on the CRF model. Neglecting the information at larger scales is undesirable, since class-label estimates based on these scales are more reliable than those at smaller, noisier scales. To address this problem, we propose a new potential, called harmony potential, which can encode any possible combination of class labels. We propose an effective sampling strategy that renders the underlying optimization problem tractable. Results show that our approach obtains state-of-the-art results on two challenging datasets: Pascal VOC 2009 and MSRC-21.
|
|
|
A. Dupuy, Joan Serrat, Jordi Vitria and J. Pladellorens. 1991. Analysis of gammagraphic images by mathematical morphology. Pattern Recognition and Image Analysis: IV Spanish Symposium of Pattern Recognition and Image Analysis, World Scientific Pub.
|
|
|
Ferran Diego, Daniel Ponsa, Joan Serrat and Antonio Lopez. 2010. Vehicle geolocalization based on video synchronization. 13th Annual International Conference on Intelligent Transportation Systems.1511–1516.
Abstract: This paper proposes a novel method for estimating the geospatial localization of a vehicle. It uses as input a georeferenced video sequence recorded by a forward-facing camera attached to the windscreen. The core of the proposed method is an on-line video synchronization which finds the frame in the georeferenced video sequence that corresponds to the one recorded at each time by the camera on a second drive through the same track. Once the corresponding frame in the georeferenced video sequence has been found, we transfer the geospatial information of this frame. The key advantages of this method are: 1) the increase in update rate and geospatial accuracy with regard to a standard low-cost GPS, and 2) the ability to localize a vehicle even when a GPS is not available or is not reliable enough, as in certain urban areas. Experimental results for an urban environment are presented, showing an average relative accuracy of 1.5 meters.
Keywords: video alignment
|
|
|
Fadi Dornaika and Angel Sappa. 2008. Real Time on Board Stereo Camera Pose through Image Registration. IEEE Intelligent Vehicles Symposium.804–809.
|
|
|
Fadi Dornaika and Angel Sappa. 2007. Improving Appearance-Based 3D Face Tracking Using Sparse Stereo Data. In J. Braz, A.R., H. Araujo and J. Jorge, ed. Advances in Computer Graphics and Computer Vision. Springer Verlag, 354–366.
|
|