|
David Geronimo, Angel Sappa and Antonio Lopez. 2010. Stereo-based Candidate Generation for Pedestrian Protection Systems. Binocular Vision: Development, Depth Perception and Disorders. NOVA Publishers, 189–208.
Abstract: This chapter describes a stereo-based algorithm that provides candidate image windows to a later 2D classification stage in an on-board pedestrian detection system. The proposed algorithm, which consists of three stages, is based on the use of both stereo imaging and scene prior knowledge (i.e., pedestrians are on the ground) to reduce the candidate search space. First, a successful road surface fitting algorithm provides estimates of the relative ground-camera pose. This stage directs the search toward the road area, thus avoiding irrelevant regions like the sky. Then, three different schemes are used to scan the estimated road surface with pedestrian-sized windows: (a) uniformly distributed through the road surface (3D); (b) uniformly distributed through the image (2D); (c) not uniformly distributed but according to a quadratic function (combined 2D-3D). Finally, the set of candidate windows is reduced by analyzing their 3D content. Experimental results of the proposed algorithm, together with statistics of search space reduction, are provided.
Keywords: Pedestrian Detection
|
|
|
Niki Aifanti, Angel Sappa, N. Grammalidis and Sotiris Malassiotis. 2009. Advances in Tracking and Recognition of Human Motion. Encyclopedia of Information Science and Technology. 65–71.
|
|
|
Antonio Lopez, Jiaolong Xu, Jose L. Gomez, David Vazquez and German Ros. 2017. From Virtual to Real World Visual Perception using Domain Adaptation -- The DPM as Example. In Gabriela Csurka, ed. Domain Adaptation in Computer Vision Applications. Springer, 243–258.
Abstract: Supervised learning tends to produce more accurate classifiers than unsupervised learning in general. This implies that training data is preferred with annotations. When addressing visual perception challenges, such as localizing certain object classes within an image, the learning of the involved classifiers turns out to be a practical bottleneck. The reason is that, at least, we have to frame object examples with bounding boxes in thousands of images. A priori, the more complex the model is regarding its number of parameters, the more annotated examples are required. This annotation task is performed by human oracles, which ends up in inaccuracies and errors in the annotations (aka ground truth) since the task is inherently very cumbersome and sometimes ambiguous. As an alternative we have pioneered the use of virtual worlds for collecting such annotations automatically and with high precision. However, since the models learned with virtual data must operate in the real world, we still need to perform domain adaptation (DA). In this chapter we revisit the DA of a deformable part-based model (DPM) as an exemplifying case of virtual-to-real-world DA. As a use case, we address the challenge of vehicle detection for driver assistance, using different publicly available virtual-world data. While doing so, we investigate questions such as: how does the domain gap behave due to virtual-vs-real data with respect to dominant object appearance per domain, as well as the role of photo-realism in the virtual world.
Keywords: Domain Adaptation
|
|
|
David Vazquez. 2013. Domain Adaptation of Virtual and Real Worlds for Pedestrian Detection. (Ph.D. thesis, Ediciones Graficas Rey.)
Abstract: Pedestrian detection is of paramount interest for many applications, e.g. Advanced Driver Assistance Systems, Intelligent Video Surveillance and Multimedia systems. Most promising pedestrian detectors rely on appearance-based classifiers trained with annotated data. However, the required annotation step represents an intensive and subjective task for humans, which makes it worthwhile to minimize their intervention in this process by using computational tools like realistic virtual worlds. The reason to use this kind of tool lies in the fact that they allow the automatic generation of precise and rich annotations of visual information. Nevertheless, the use of this kind of data comes with the following question: can a pedestrian appearance model learnt with virtual-world data work successfully for pedestrian detection in real-world scenarios? To answer this question, we conduct different experiments that suggest a positive answer. However, the pedestrian classifiers trained with virtual-world data can suffer from the so-called dataset shift problem, as real-world-based classifiers do. Accordingly, we have designed different domain adaptation techniques to face this problem, all of them integrated in the same framework (V-AYLA). We have explored different methods to train domain-adapted pedestrian classifiers by collecting a few pedestrian samples from the target domain (real world) and combining them with many samples of the source domain (virtual world). The extensive experiments we present show that pedestrian detectors developed within the V-AYLA framework do achieve domain adaptation. Ideally, we would like to adapt our system without any human intervention. Therefore, as a first proof of concept we also propose an unsupervised domain adaptation technique that avoids human intervention during the adaptation process. To the best of our knowledge, this thesis work is the first demonstrating adaptation of virtual and real worlds for developing an object detector.
Last but not least, we also assessed a different strategy to avoid the dataset shift that consists of collecting real-world samples and retraining with them in such a way that no bounding boxes of real-world pedestrians have to be provided. We show that the generated classifier is competitive with respect to the counterpart trained with samples collected by manually annotating pedestrian bounding boxes. The results presented in this thesis not only end with a proposal for adapting a virtual-world pedestrian detector to the real world, but also go further by pointing out a new methodology that would allow the system to adapt to different situations, which we hope will provide the foundations for future research in this unexplored area.
Keywords: Pedestrian Detection; Domain Adaptation
|
|
|
Felipe Lumbreras, Ramon Baldrich, Maria Vanrell, Joan Serrat and Juan J. Villanueva. 1999. Multiresolution texture classification of ceramic tiles. Recent Research Developments in Optical Engineering, Research Signpost, 2: 213–228.
|
|
|
Ricardo Toledo. 2001. Cardiac workstation and dynamic model to assist in coronary tree analysis. (Ph.D. thesis.)
|
|
|
Antonio Lopez. 2000. Multilocal Methods for Ridge and Valley Delineation in Image Analysis. (Ph.D. thesis.)
|
|
|
Felipe Lumbreras. 2001. Segmentation, classification and modelization of textures by means of multiresolution decomposition techniques.
|
|
|
Angel Sappa, Niki Aifanti, N. Grammalidis and Sotiris Malassiotis. 2004. Advances in Vision-Based Human Body Modeling. In N. Sarris and M. Strintzis, eds. 3D Modeling & Animation: Synthesis and Analysis Techniques for the Human Body. 1–26.
|
|
|
Angel Sappa and Fadi Dornaika. 2006. An Edge-Based Approach to Motion Detection. 6th International Conference on Computational Science (ICCS'06), LNCS 3991: 563–570.
|
|