Records | |||||
---|---|---|---|---|---|
Author | Maedeh Aghaei; Mariella Dimiccoli; Petia Radeva | ||||
Title | Multi-Face Tracking by Extended Bag-of-Tracklets in Egocentric Videos | Type | Miscellaneous | ||
Year | 2015 | Publication | arXiv | Abbreviated Journal |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Egocentric images offer a hands-free way to record daily experiences and special events, where social interactions are of special interest. A natural question that arises is how to extract and track the appearance of multiple persons in a social event captured by a wearable camera. In this paper, we propose a novel method to find correspondences of multiple faces in low temporal resolution egocentric sequences acquired through a wearable camera. This kind of sequence imposes additional challenges on the multi-tracking problem with respect to conventional videos. Due to the free motion of the camera and to its low temporal resolution (2 fpm), abrupt changes in the field of view, in illumination conditions and in the target location are very frequent. To overcome these difficulties, we propose to generate, for each detected face, a set of correspondences along the whole sequence that we call a tracklet, and to take advantage of their redundancy to deal with both false positive face detections and unreliable tracklets. Similar tracklets are grouped into the so-called extended bag-of-tracklets (eBoT), each of which is aimed to correspond to a specific person. Finally, a prototype tracklet is extracted for each eBoT. We validated our method on a dataset of 18,000 images from 38 egocentric sequences with 52 trackable persons and compared it to state-of-the-art methods, demonstrating its effectiveness and robustness. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB | Approved | no | ||
Call Number | Admin @ si @ ADR2015b | Serial | 2713 | ||
Permanent link to this record | |||||
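The tracklet-grouping idea in the abstract above can be illustrated with a toy sketch. This is not the authors' eBoT implementation: tracklets are simplified to frame-to-box mappings, and the `iou_thr` and `sim_thr` thresholds are made-up values.

```python
def iou(a, b):
    # a, b: (x1, y1, x2, y2) boxes; intersection-over-union overlap
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def tracklet_similarity(t1, t2, iou_thr=0.5):
    """Fraction of common frames whose boxes overlap enough.
    t1, t2: dicts mapping frame index -> bounding box."""
    common = set(t1) & set(t2)
    if not common:
        return 0.0
    hits = sum(1 for f in common if iou(t1[f], t2[f]) >= iou_thr)
    return hits / len(common)

def build_bags(tracklets, sim_thr=0.6):
    """Greedily group similar tracklets into bags (eBoT-style toy)."""
    bags = []
    for t in tracklets:
        for bag in bags:
            if tracklet_similarity(t, bag[0]) >= sim_thr:
                bag.append(t)
                break
        else:
            bags.append([t])
    return bags
```

A prototype per bag could then be chosen, e.g., as the tracklet with the highest average similarity to its bag mates.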
Author | Mariella Dimiccoli; Marc Bolaños; Estefania Talavera; Maedeh Aghaei; Stavri G. Nikolov; Petia Radeva | ||||
Title | SR-Clustering: Semantic Regularized Clustering for Egocentric Photo Streams Segmentation | Type | Journal Article | ||
Year | 2017 | Publication | Computer Vision and Image Understanding | Abbreviated Journal | CVIU |
Volume | 155 | Issue | Pages | 55-69 | |
Keywords | |||||
Abstract | While wearable cameras are becoming increasingly popular, locating relevant information in large unstructured collections of egocentric images is still a tedious and time-consuming process. This paper addresses the problem of organizing egocentric photo streams acquired by a wearable camera into semantically meaningful segments. First, contextual and semantic information is extracted for each image by employing a Convolutional Neural Network approach. Later, by integrating language processing, a vocabulary of concepts is defined in a semantic space. Finally, by exploiting the temporal coherence in photo streams, images which share contextual and semantic attributes are grouped together. The resulting temporal segmentation is particularly suited for further analysis, ranging from activity and event recognition to semantic indexing and summarization. Experiments over egocentric sets of nearly 17,000 images show that the proposed approach outperforms state-of-the-art methods. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB; 601.235 | Approved | no | ||
Call Number | Admin @ si @ DBT2017 | Serial | 2714 | ||
Permanent link to this record | |||||
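The temporal grouping step described above (clustering consecutive images that share semantic attributes) can be sketched roughly as follows. The feature vectors, the cosine criterion and the `thr` value are illustrative assumptions, not the paper's SR-Clustering algorithm, which additionally regularizes the segmentation.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def segment_stream(features, thr=0.7):
    """Cut the photo stream whenever the next image's semantic
    feature vector drifts away from the current segment's mean."""
    segments = [[0]]
    mean = list(features[0])
    for i, f in enumerate(features[1:], start=1):
        if cosine(f, mean) >= thr:
            seg = segments[-1]
            seg.append(i)
            n = len(seg)
            # incremental mean over the images in the current segment
            mean = [(m * (n - 1) + x) / n for m, x in zip(mean, f)]
        else:
            segments.append([i])
            mean = list(f)
    return segments
```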
Author | Suman Ghosh; Ernest Valveny | ||||
Title | Query by String word spotting based on character bi-gram indexing | Type | Conference Article | ||
Year | 2015 | Publication | 13th International Conference on Document Analysis and Recognition ICDAR2015 | Abbreviated Journal | |
Volume | Issue | Pages | 881-885 | ||
Keywords | |||||
Abstract | In this paper we propose a segmentation-free query-by-string word spotting method. Both the documents and the query strings are encoded using a recently proposed word representation that projects images and strings into a common attribute space based on a pyramidal histogram of characters (PHOC). These attribute models are learned using linear SVMs over the Fisher Vector representation of the images along with the PHOC labels of the corresponding strings. In order to search through the whole page, document regions are indexed per character bigram using a similar attribute representation. On top of that, we propose an integral image representation of the document using a simplified version of the attribute model for efficient computation. Finally, we introduce a re-ranking step in order to boost retrieval performance. We show state-of-the-art results for segmentation-free query-by-string word spotting on single-writer and multi-writer standard datasets. | ||||
Address | Nancy; France; August 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; 600.077 | Approved | no | ||
Call Number | Admin @ si @ GhV2015a | Serial | 2715 | ||
Permanent link to this record | |||||
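A minimal sketch of the PHOC idea mentioned in the abstract: each pyramid level splits the word into regions and records which characters occur in each region. Real PHOC descriptors use several levels plus bigrams and an overlap-based occupancy rule; the `levels` default and the assignment-by-center rule here are simplifications.

```python
import string

def phoc(word, levels=(2, 3), alphabet=string.ascii_lowercase):
    """Simplified Pyramidal Histogram Of Characters: for each pyramid
    level L, split the word into L regions and mark which alphabet
    characters fall in each region (binary occupancy)."""
    word = word.lower()
    vec = []
    n = len(word)
    for L in levels:
        for r in range(L):
            region = [0] * len(alphabet)
            for i, ch in enumerate(word):
                center = (i + 0.5) / n  # normalized character position
                if r / L <= center < (r + 1) / L and ch in alphabet:
                    region[alphabet.index(ch)] = 1
            vec.extend(region)
    return vec
```

For example, `phoc("ab", levels=(2,))` marks 'a' in the left half and 'b' in the right half of a 52-dimensional vector.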
Author | Suman Ghosh; Ernest Valveny | ||||
Title | A Sliding Window Framework for Word Spotting Based on Word Attributes | Type | Conference Article | ||
Year | 2015 | Publication | Pattern Recognition and Image Analysis, Proceedings of the 7th Iberian Conference, IbPRIA 2015 | Abbreviated Journal |
Volume | 9117 | Issue | Pages | 652-661 | |
Keywords | Word spotting; Sliding window; Word attributes | ||||
Abstract | In this paper we propose a segmentation-free approach to word spotting. Word images are first encoded into feature vectors using the Fisher Vector. Then, these feature vectors are used together with pyramidal histogram of characters (PHOC) labels to learn SVM-based attribute models. Documents are represented by these PHOC-based word attributes. To efficiently compute the word attributes over a sliding window, we propose to use an integral image representation of the document using a simplified version of the attribute model. Finally, we re-rank the top word candidates using the more discriminative full version of the word attributes. We show state-of-the-art results for segmentation-free query-by-example word spotting on single-writer and multi-writer standard datasets. | ||||
Address | Santiago de Compostela; June 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Springer International Publishing | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | 0302-9743 | ISBN | 978-3-319-19389-2 | Medium | |
Area | Expedition | Conference | IbPRIA | ||
Notes | DAG; 600.077 | Approved | no | ||
Call Number | Admin @ si @ GhV2015b | Serial | 2716 | ||
Permanent link to this record | |||||
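The integral image trick used above for efficient sliding-window computation is standard: one pass of 2-D prefix sums makes any rectangular sum an O(1) lookup. A minimal sketch (pure Python, illustrative only):

```python
def integral_image(grid):
    """2-D prefix sums: I[y][x] = sum of grid[0..y-1][0..x-1]."""
    h, w = len(grid), len(grid[0])
    I = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            I[y + 1][x + 1] = (grid[y][x] + I[y][x + 1]
                               + I[y + 1][x] - I[y][x])
    return I

def window_sum(I, x1, y1, x2, y2):
    """Sum of grid over the rectangle [x1, x2) x [y1, y2) in O(1)."""
    return I[y2][x2] - I[y1][x2] - I[y2][x1] + I[y1][x1]
```

With per-pixel attribute scores in `grid`, every candidate window's aggregate score then costs four lookups regardless of window size.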
Author | Ciprian Corneanu; Marc Oliu; Jeffrey F. Cohn; Sergio Escalera | ||||
Title | Survey on RGB, 3D, Thermal, and Multimodal Approaches for Facial Expression Recognition: History | Type | Journal Article | ||
Year | 2016 | Publication | IEEE Transactions on Pattern Analysis and Machine Intelligence | Abbreviated Journal | TPAMI |
Volume | 38 | Issue | 8 | Pages | 1548-1568
Keywords | Facial expression; affect; emotion recognition; RGB; 3D; thermal; multimodal | ||||
Abstract | Facial expressions are an important way through which humans interact socially. Building a system capable of automatically recognizing facial expressions from images and video has been an intense field of study in recent years. Interpreting such expressions remains challenging, and much research is needed on the way they relate to human affect. This paper presents a general overview of automatic RGB, 3D, thermal and multimodal facial expression analysis. We define a new taxonomy for the field, encompassing all steps from face detection to facial expression recognition, and describe and classify the state-of-the-art methods accordingly. We also present the important datasets and the benchmarking of the most influential methods. We conclude with a general discussion about trends, important questions and future lines of research. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HuPBA;MILAB; | Approved | no | ||
Call Number | Admin @ si @ COC2016 | Serial | 2718 | ||
Permanent link to this record | |||||
Author | Antonio Hernandez; Sergio Escalera; Stan Sclaroff | ||||
Title | Poselet-based Contextual Rescoring for Human Pose Estimation via Pictorial Structures | Type | Journal Article | ||
Year | 2016 | Publication | International Journal of Computer Vision | Abbreviated Journal | IJCV |
Volume | 118 | Issue | 1 | Pages | 49–64 |
Keywords | Contextual rescoring; Poselets; Human pose estimation | ||||
Abstract | In this paper we propose a contextual rescoring method for predicting the position of body parts in a human pose estimation framework. A set of poselets is incorporated in the model, and their detections are used to extract spatial and score-related features relative to other body part hypotheses. A method is proposed for the automatic discovery of a compact subset of poselets that covers the different poses in a set of validation images while maximizing precision. A rescoring mechanism is defined as a set-based boosting classifier that computes a new score for each body joint detection, given its relationship to detections of other body joints and mid-level parts in the image. This new score is incorporated in the pictorial structure model as an additional unary potential, following the recent work of Pishchulin et al. Experiments on two benchmarks show comparable results to Pishchulin et al. while reducing the size of the mid-level representation by an order of magnitude, reducing the execution time by 68% accordingly. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer US | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0920-5691 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | HuPBA;MILAB; | Approved | no | ||
Call Number | Admin @ si @ HES2016 | Serial | 2719 | ||
Permanent link to this record | |||||
Author | Fadi Dornaika; Bogdan Raducanu; Alireza Bosaghzadeh | ||||
Title | Facial expression recognition based on multi observations with application to social robotics | Type | Book Chapter | ||
Year | 2015 | Publication | Emotional and Facial Expressions: Recognition, Developmental Differences and Social Importance | Abbreviated Journal | |
Volume | Issue | Pages | 153-166 | ||
Keywords | |||||
Abstract | Human-robot interaction is a hot topic nowadays in the social robotics community. One crucial aspect is represented by affective communication, which comes encoded through facial expressions. In this chapter, we propose a novel approach for facial expression recognition, which exploits an efficient and adaptive graph-based label propagation (semi-supervised mode) in a multi-observation framework. The facial features are extracted using an appearance-based, view- and texture-independent 3D face tracker. Our method has been extensively tested on the CMU dataset and compared with other methods for graph construction. With the proposed approach, we developed an application for an AIBO robot, in which it mirrors the recognized facial expression. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Nova Science publishers | Place of Publication | Editor | Bruce Flores | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | LAMP; | Approved | no | ||
Call Number | Admin @ si @ DRB2015 | Serial | 2720 | ||
Permanent link to this record | |||||
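The graph-based label propagation mentioned in the abstract can be sketched as the classic iterative scheme F ← αSF + (1−α)Y on a similarity graph. The affinity matrix, `alpha` and the row normalization below are generic textbook choices, not the chapter's adaptive graph construction.

```python
def propagate_labels(W, labels, alpha=0.8, iters=50):
    """Semi-supervised label propagation on a similarity graph.
    W: symmetric affinity matrix (list of lists);
    labels: one class id per node, or None if unlabeled."""
    n = len(W)
    classes = sorted({l for l in labels if l is not None})
    # one-hot matrix of the known labels
    Y = [[1.0 if labels[i] == c else 0.0 for c in classes] for i in range(n)]
    # row-normalize the affinity matrix
    S = [[W[i][j] / (sum(W[i]) or 1.0) for j in range(n)] for i in range(n)]
    F = [row[:] for row in Y]
    for _ in range(iters):
        F = [[alpha * sum(S[i][j] * F[j][k] for j in range(n))
              + (1 - alpha) * Y[i][k]
              for k in range(len(classes))] for i in range(n)]
    # label each node with its highest-scoring class
    return [classes[max(range(len(classes)), key=row.__getitem__)] for row in F]
```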
Author | Juan Ramon Terven Salinas; Bogdan Raducanu; Maria Elena Meza-de-Luna; Joaquin Salas | ||||
Title | Head-gestures mirroring detection in dyadic social interactions with computer vision-based wearable devices | Type | Journal Article | ||
Year | 2016 | Publication | Neurocomputing | Abbreviated Journal | NEUCOM |
Volume | 175 | Issue | B | Pages | 866–876 |
Keywords | Head gestures recognition; Mirroring detection; Dyadic social interaction analysis; Wearable devices | ||||
Abstract | During face-to-face human interaction, nonverbal communication plays a fundamental role. A relevant aspect of social interactions is mirroring, in which a person tends to mimic the non-verbal behavior (head and body gestures, vocal prosody, etc.) of the counterpart. In this paper, we introduce a computer vision-based system to detect mirroring in dyadic social interactions using a wearable platform. In our context, mirroring is inferred as simultaneous head nodding displayed by the interlocutors. Our approach consists of the following steps: (1) facial feature extraction; (2) facial feature stabilization; (3) head-nodding recognition; and (4) mirroring detection. Our system achieves a mirroring detection accuracy of 72% on a custom mirroring dataset. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | LAMP; 600.072; 600.068; | Approved | no | ||
Call Number | Admin @ si @ TRM2016 | Serial | 2721 | ||
Permanent link to this record | |||||
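The "simultaneous head nodding" criterion in step (4) can be sketched as interval matching between the two interlocutors' detected nods; the `max_lag` tolerance is a made-up parameter, not the paper's detection rule.

```python
def mirroring_episodes(nods_a, nods_b, max_lag=1.0):
    """Pair up head-nod intervals (start, end) of two interlocutors
    that overlap in time, or begin within max_lag seconds of each other."""
    episodes = []
    for sa, ea in nods_a:
        for sb, eb in nods_b:
            overlap = min(ea, eb) - max(sa, sb)
            if overlap > 0 or abs(sa - sb) <= max_lag:
                episodes.append(((sa, ea), (sb, eb)))
    return episodes
```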
Author | Juan Ramon Terven Salinas; Bogdan Raducanu; Maria Elena Meza-de-Luna; Joaquin Salas | ||||
Title | Evaluating Real-Time Mirroring of Head Gestures using Smart Glasses | Type | Conference Article | ||
Year | 2015 | Publication | 16th IEEE International Conference on Computer Vision Workshops | Abbreviated Journal | |
Volume | Issue | Pages | 452-460 | ||
Keywords | |||||
Abstract | Mirroring occurs when one person tends to mimic the non-verbal communication of their counterparts. Even though mirroring is a complex phenomenon, in this study, we focus on the detection of head-nodding as a simple non-verbal communication cue due to its significance as a gesture displayed during social interactions. This paper introduces a computer vision-based method to detect mirroring through the analysis of head gestures using wearable cameras (smart glasses). In addition, we study how such a method can be used to explore perceived competence. The proposed method has been evaluated and the experiments demonstrate how static and wearable cameras seem to be equally effective to gather the information required for the analysis. | ||||
Address | Santiago de Chile; December 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICCVW | ||
Notes | LAMP; 600.068; 600.072; | Approved | no | ||
Call Number | Admin @ si @ TRM2015 | Serial | 2722 | ||
Permanent link to this record | |||||
Author | Adriana Romero; Carlo Gatta; Gustavo Camps-Valls | ||||
Title | Unsupervised Deep Feature Extraction for Remote Sensing Image Classification | Type | Journal Article | ||
Year | 2016 | Publication | IEEE Transactions on Geoscience and Remote Sensing | Abbreviated Journal | TGRS
Volume | 54 | Issue | 3 | Pages | 1349 - 1362 |
Keywords | |||||
Abstract | This paper introduces the use of single-layer and deep convolutional networks for remote sensing data analysis. Direct application to multi- and hyperspectral imagery of supervised (shallow or deep) convolutional networks is very challenging given the high input data dimensionality and the relatively small amount of available labeled data. Therefore, we propose the use of greedy layerwise unsupervised pretraining coupled with a highly efficient algorithm for unsupervised learning of sparse features. The algorithm is rooted on sparse representations and enforces both population and lifetime sparsity of the extracted features, simultaneously. We successfully illustrate the expressive power of the extracted representations in several scenarios: classification of aerial scenes, as well as land-use classification in very high resolution or land-cover classification from multi- and hyperspectral images. The proposed algorithm clearly outperforms standard principal component analysis (PCA) and its kernel counterpart (kPCA), as well as current state-of-the-art algorithms of aerial classification, while being extremely computationally efficient at learning representations of data. Results show that single-layer convolutional networks can extract powerful discriminative features only when the receptive field accounts for neighboring pixels and are preferred when the classification requires high resolution and detailed results. However, deep architectures significantly outperform single-layer variants, capturing increasing levels of abstraction and complexity throughout the feature hierarchy. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0196-2892 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | LAMP; 600.079;MILAB | Approved | no | ||
Call Number | Admin @ si @ RGC2016 | Serial | 2723 | ||
Permanent link to this record | |||||
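The population and lifetime sparsity notions enforced by the pretraining above can be illustrated with simple occupancy measures over a feature-activation matrix (rows = samples, columns = features). These metrics are illustrative only, not the paper's training objective.

```python
def population_sparsity(acts, eps=1e-9):
    """Average fraction of near-zero features per sample
    (each row of `acts` is one sample's feature activations)."""
    return sum(sum(1 for a in row if abs(a) < eps) / len(row)
               for row in acts) / len(acts)

def lifetime_sparsity(acts, eps=1e-9):
    """Average fraction of samples on which each feature is
    near-zero (columns of `acts`)."""
    cols = list(zip(*acts))
    return sum(sum(1 for a in col if abs(a) < eps) / len(col)
               for col in cols) / len(cols)
```

Enforcing both simultaneously means each sample activates few features and each feature fires on few samples, which is what the abstract's sparse-coding algorithm targets.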
Author | M. Campos-Taberner; Adriana Romero; Carlo Gatta; Gustavo Camps-Valls | ||||
Title | Shared feature representations of LiDAR and optical images: Trading sparsity for semantic discrimination | Type | Conference Article | ||
Year | 2015 | Publication | IEEE International Geoscience and Remote Sensing Symposium IGARSS2015 | Abbreviated Journal | |
Volume | Issue | Pages | 4169 - 4172 | ||
Keywords | |||||
Abstract | This paper studies the level of complementary information conveyed by extremely high resolution LiDAR and optical images. We pursue this goal following an indirect approach via unsupervised spatial-spectral feature extraction. We used a recently presented unsupervised convolutional neural network trained to enforce both population and lifetime sparsity in the feature representation. We derived independent and joint feature representations, and analyzed the sparsity scores and the discriminative power. Interestingly, the obtained results revealed that the RGB+LiDAR representation is no longer sparse, and the derived basis functions merge color and elevation, yielding a set of more expressive colored edge filters. The joint feature representation is also more discriminative when used for clustering and topological data visualization. | ||||
Address | Milan; Italy; July 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | IGARSS | ||
Notes | LAMP; 600.079;MILAB | Approved | no | ||
Call Number | Admin @ si @ CRG2015 | Serial | 2724 | ||
Permanent link to this record | |||||
Author | R. Bertrand; Oriol Ramos Terrades; P. Gomez-Kramer; P. Franco; Jean-Marc Ogier | ||||
Title | A Conditional Random Field model for font forgery detection | Type | Conference Article | ||
Year | 2015 | Publication | 13th International Conference on Document Analysis and Recognition ICDAR2015 | Abbreviated Journal | |
Volume | Issue | Pages | 576 - 580 | ||
Keywords | |||||
Abstract | Nowadays, document forgery is becoming a real issue. A large number of documents that contain critical information, such as payment slips, invoices or contracts, are constantly subject to fraudster manipulation because of the lack of security regarding this kind of document. Previously, a system to detect fraudulent documents based on their intrinsic features was presented. It was especially designed to retrieve copy-move forgery and imperfections due to fraudster manipulation. However, when a set of characters is not present in the original document, copy-move forgery is not feasible. Hence, the fraudster will use a text toolbox to add or modify information in the document by imitating the font, or will cut and paste characters from another document where the font properties are similar. This often results in font type errors. Thus, a clue to detecting document forgery consists of finding characters, words or sentences in a document with font properties different from their surroundings. To this end, we present in this paper an automatic forgery detection method based on document font features. Using a Conditional Random Field, the probability that a character belongs to a specific font is measured by comparing the character's font features to a knowledge database. Then, the character is classified as genuine or fake by comparing its probability of belonging to a certain font type with those of the neighboring characters. | ||||
Address | Nancy; France; August 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; 600.077 | Approved | no | ||
Call Number | Admin @ si @ BRG2015 | Serial | 2725 | ||
Permanent link to this record | |||||
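The final classification step (comparing a character's font probability with its neighbors') can be caricatured without the CRF machinery as a neighborhood majority check; the `window` size and the per-character font distributions are hypothetical inputs.

```python
from collections import Counter

def flag_forged(char_font_probs, window=2):
    """Flag characters whose most likely font disagrees with the
    majority font among their neighbors (simplified, not a full CRF).
    char_font_probs: one dict {font_name: probability} per character."""
    best = [max(p, key=p.get) for p in char_font_probs]
    flags = []
    for i, font in enumerate(best):
        lo, hi = max(0, i - window), min(len(best), i + window + 1)
        neigh = best[lo:i] + best[i + 1:hi]
        majority = Counter(neigh).most_common(1)[0][0] if neigh else font
        flags.append(font != majority)
    return flags
```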
Author | Lluis Pere de las Heras; Oriol Ramos Terrades; Josep Llados; David Fernandez; Cristina Cañero | ||||
Title | Use case visual Bag-of-Words techniques for camera based identity document classification | Type | Conference Article | ||
Year | 2015 | Publication | 13th International Conference on Document Analysis and Recognition ICDAR2015 | Abbreviated Journal | |
Volume | Issue | Pages | 721 - 725 | ||
Keywords | |||||
Abstract | Nowadays, automatic identity document recognition, including passport and driving license recognition, is at the core of many applications within the administrative and service sectors, such as police, hospitality, car renting, etc. In former years, the document information was manually extracted, whereas today this data is recognized automatically from images obtained by flat-bed scanners. Yet, since these scanners tend to be expensive and voluminous, companies in the sector have recently turned their attention to cheaper, smaller and yet computationally powerful scanners: mobile devices. Identity document recognition from mobile images entails several new difficulties w.r.t. traditional scanned images, such as the loss of a controlled background, perspective, blurring, etc. In this paper we present a real application for identity document classification of images taken from mobile devices. This classification process is of extreme importance, since prior knowledge of the document type and origin strongly facilitates the subsequent information extraction. The proposed method is based on a traditional Bag-of-Words approach in which we have taken into consideration several key aspects to enhance the recognition rate. The method's performance has been studied on three datasets containing more than 2000 images from 129 different document classes. | ||||
Address | Nancy; France; August 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; 600.077; 600.061; | Approved | no | ||
Call Number | Admin @ si @ HRL2015a | Serial | 2726 | ||
Permanent link to this record | |||||
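The Bag-of-Words pipeline underlying the method can be sketched in its generic form: quantize local descriptors against a visual codebook and build a normalized word histogram per image. The toy codebook and Euclidean assignment below are illustrative, not the paper's tuned setup.

```python
def nearest(vec, centroids):
    """Index of the centroid closest to vec (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda k: sum((a - b) ** 2
                                 for a, b in zip(vec, centroids[k])))

def bow_histogram(descriptors, codebook):
    """Quantize local descriptors against a visual codebook and
    return an L1-normalized word histogram."""
    hist = [0.0] * len(codebook)
    for d in descriptors:
        hist[nearest(d, codebook)] += 1
    total = sum(hist) or 1.0
    return [h / total for h in hist]
```

Classification then reduces to comparing an image's histogram against per-class reference histograms, e.g. with a nearest-neighbor or SVM classifier.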
Author | Lluis Pere de las Heras; Oriol Ramos Terrades; Josep Llados | ||||
Title | Attributed Graph Grammar for floor plan analysis | Type | Conference Article | ||
Year | 2015 | Publication | 13th International Conference on Document Analysis and Recognition ICDAR2015 | Abbreviated Journal | |
Volume | Issue | Pages | 726 - 730 | ||
Keywords | |||||
Abstract | In this paper, we propose the use of an Attributed Graph Grammar as a unique framework to model and recognize the structure of floor plans. This grammar represents a building as a hierarchical composition of structurally and semantically related elements, where common representations are learned stochastically from annotated data. Given an input image, parsing consists of constructing the graph representation that best agrees with the probabilistic model defined by the grammar. The proposed method provides several advantages with respect to traditional floor plan analysis techniques. It uses an unsupervised statistical approach for detecting walls that adapts to different graphical notations and relaxes strong structural assumptions such as straightness and orthogonality. Moreover, the independence between the knowledge model and the parsing implementation allows the method to learn automatically different building configurations and thus to cope with the existing variability. These advantages are clearly demonstrated by comparing it with the most recent floor plan interpretation techniques on 4 datasets of real floor plans with different notations. | ||||
Address | Nancy; France; August 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; 600.077; 600.061 | Approved | no | ||
Call Number | Admin @ si @ HRL2015b | Serial | 2727 | ||
Permanent link to this record | |||||
Author | Jiaolong Xu; David Vazquez; Krystian Mikolajczyk; Antonio Lopez | ||||
Title | Hierarchical online domain adaptation of deformable part-based models | Type | Conference Article | ||
Year | 2016 | Publication | IEEE International Conference on Robotics and Automation | Abbreviated Journal | |
Volume | Issue | Pages | 5536-5541 | ||
Keywords | Domain Adaptation; Pedestrian Detection | ||||
Abstract | We propose an online domain adaptation method for the deformable part-based model (DPM). The online domain adaptation is based on a two-level hierarchical adaptation tree, which consists of instance detectors in the leaf nodes and a category detector at the root node. Moreover, combined with a multiple object tracking (MOT) procedure, our proposal neither requires target-domain annotated data nor revisits the source-domain data for performing the source-to-target domain adaptation of the DPM. From a practical point of view this means that, given a source-domain DPM and a new video for training on a new domain without object annotations, our procedure outputs a new DPM adapted to the domain represented by the video. As proof of concept, we apply our proposal to the challenging task of pedestrian detection. In this case, each instance detector is an exemplar classifier trained online with only one pedestrian per frame. The pedestrian instances are collected by MOT and the hierarchical model is constructed dynamically according to the pedestrian trajectories. Our experimental results show that the adapted detector achieves the accuracy of recent supervised domain adaptation methods (i.e., requiring manually annotated target-domain data), and improves the source detector by more than 10 percentage points. | ||||
Address | Stockholm; Sweden; May 2016 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICRA | ||
Notes | ADAS; 600.085; 600.082; 600.076 | Approved | no | ||
Call Number | Admin @ si @ XVM2016 | Serial | 2728 | ||
Permanent link to this record |
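The two-level tree described above (instance detectors at the leaves, a category detector at the root) suggests a simple score-fusion sketch; the `w` weight and the max-over-leaves rule are assumptions for illustration, not the paper's DPM adaptation procedure.

```python
def hierarchical_score(x, category_scorer, instance_scorers, w=0.5):
    """Fuse the root (category) detector with the best-matching
    leaf (instance) detector, as in a two-level adaptation tree.
    Falls back to the root score when no instance detectors exist yet."""
    root = category_scorer(x)
    leaf = max((s(x) for s in instance_scorers), default=root)
    return w * root + (1 - w) * leaf
```

New exemplar classifiers would be appended to `instance_scorers` online as the tracker collects pedestrian instances, letting the fused score drift toward the target domain.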