|
David Vazquez. 2013. Domain Adaptation of Virtual and Real Worlds for Pedestrian Detection. (Ph.D. thesis, Ediciones Graficas Rey.)
Abstract: Pedestrian detection is of paramount interest for many applications, e.g. Advanced Driver Assistance Systems, Intelligent Video Surveillance and Multimedia systems. The most promising pedestrian detectors rely on appearance-based classifiers trained with annotated data. However, the required annotation step is an intensive and subjective task for humans, which makes it worthwhile to minimize their intervention in this process by using computational tools such as realistic virtual worlds. The reason to use this kind of tool is that it allows the automatic generation of precise and rich annotations of visual information. Nevertheless, the use of this kind of data raises the following question: can a pedestrian appearance model learnt with virtual-world data work successfully for pedestrian detection in real-world scenarios? To answer this question, we conduct different experiments that suggest a positive answer. However, pedestrian classifiers trained with virtual-world data can suffer from the so-called dataset shift problem, as real-world-based classifiers do. Accordingly, we have designed different domain adaptation techniques to face this problem, all of them integrated in a single framework (V-AYLA). We have explored different methods to train domain-adapted pedestrian classifiers by collecting a few pedestrian samples from the target domain (real world) and combining them with many samples from the source domain (virtual world). The extensive experiments we present show that pedestrian detectors developed within the V-AYLA framework do achieve domain adaptation. Ideally, we would like to adapt our system without any human intervention. Therefore, as a first proof of concept, we also propose an unsupervised domain adaptation technique that avoids human intervention during the adaptation process. To the best of our knowledge, this thesis is the first work demonstrating adaptation between virtual and real worlds for developing an object detector.
Last but not least, we also assessed a different strategy to avoid the dataset shift, which consists of collecting real-world samples and retraining with them in such a way that no bounding boxes of real-world pedestrians have to be provided. We show that the resulting classifier is competitive with its counterpart trained with samples collected by manually annotating pedestrian bounding boxes. The results presented in this thesis not only yield a proposal for adapting a virtual-world pedestrian detector to the real world, but also go further by pointing out a new methodology that would allow the system to adapt to different situations, which we hope will provide the foundations for future research in this unexplored area.
Keywords: Pedestrian Detection; Domain Adaptation
|
|
|
Felipe Lumbreras, Ramon Baldrich, Maria Vanrell, Joan Serrat and Juan J. Villanueva. 1999. Multiresolution texture classification of ceramic tiles. Recent Research Developments in Optical Engineering, Research Signpost, 2: 213–228.
|
|
|
Ricardo Toledo. 2001. Cardiac workstation and dynamic model to assist in coronary tree analysis. (Ph.D. thesis.)
|
|
|
Antonio Lopez. 2000. Multilocal Methods for Ridge and Valley Delineation in Image Analysis. (Ph.D. thesis.)
|
|
|
Felipe Lumbreras. 2001. Segmentation, classification and modelization of textures by means of multiresolution decomposition techniques.
|
|
|
Angel Sappa, Niki Aifanti, N. Grammalidis and Sotiris Malassiotis. 2004. Advances in Vision-Based Human Body Modeling. In N. Sarris and M. Strintzis, eds. 3D Modeling & Animation: Synthesis and Analysis Techniques for the Human Body. 1–26.
|
|
|
Angel Sappa and Fadi Dornaika. 2006. An Edge-Based Approach to Motion Detection. 6th International Conference on Computational Science (ICCS'06), LNCS 3991: 563–570.
|
|
|
Fadi Dornaika and Angel Sappa. 2006. 3D Face Tracking using Appearance Registration and Robust Iterative Closest Point Algorithm. 21st International Symposium on Computer and Information Sciences (ISCIS'06), LNCS 4263: 532–541.
|
|
|
Fadi Dornaika and Angel Sappa. 2006. Rigid and Non-Rigid Face Motion Tracking by Aligning Texture Maps and Stereo-Based 3D Models. 8th International Conference on Advanced Concepts for Intelligent Vision Systems (ACIVS'06), LNCS 4179: 675–684.
|
|
|
Fadi Dornaika and Angel Sappa. 2006. 3D Motion from Image Derivatives using the Least Trimmed Square Regression. International Workshop on Intelligent Computing in Pattern Analysis/Synthesis (IWICPAS'06), LNCS 4153: 76–84.
|
|