TY  - CONF
AU  - David Vazquez
AU  - Antonio Lopez
AU  - Daniel Ponsa
AU  - Javier Marin
A2  - ICMI
PY  - 2011//
TI  - Virtual Worlds and Active Learning for Human Detection
BT  - 13th International Conference on Multimodal Interaction
SP  - 393
EP  - 400
PB  - ACM DL
CY  - New York, NY, USA
KW  - Pedestrian Detection
KW  - Human detection
KW  - Virtual
KW  - Domain Adaptation
KW  - Active Learning
N2  - Image-based human detection is of paramount interest due to its potential applications in fields such as advanced driving assistance, surveillance and media analysis. However, even detecting non-occluded standing humans remains a challenge and a subject of intensive research. The most promising human detectors rely on classifiers developed in the discriminative paradigm, i.e., trained with labelled samples. However, labelling is a labour-intensive manual step, especially in cases like human detection, where it is necessary to provide at least bounding boxes framing the humans for training. To overcome this problem, some authors have proposed the use of a virtual world where the labels of the different objects are obtained automatically. This means that the human models (classifiers) are learnt using the appearance of rendered images, i.e., using realistic computer graphics. Later, these models are used for human detection in images of the real world. The results of this technique are surprisingly good. However, they are not always as good as those of the classical approach of training and testing with data coming from the same camera, or similar ones. Accordingly, in this paper we address the challenge of using a virtual world for gathering (while playing a videogame) a large amount of automatically labelled samples (virtual humans and background) and then training a classifier that performs as well, in real-world images, as the one obtained by training in the same way from manually labelled real-world samples. To do so, we cast the problem as one of domain adaptation. In doing so, we assume that a small amount of manually labelled samples from real-world images is required. To collect these labelled samples we propose a non-standard active learning technique. Therefore, ultimately our human model is learnt by the combination of virtual and real world labelled samples (Fig. 1), which has not been done before. We present quantitative results showing that this approach is valid.
SN  - 978-1-4503-0641-6
L1  - http://refbase.cvc.uab.es/files/VLP2011a.pdf
UR  - http://dx.doi.org/10.1145/2070481.2070556
N1  - ADAS
ID  - David Vazquez2011
ER  - 