toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
  Records Links (up)
Author Fahad Shahbaz Khan; Jiaolong Xu; Muhammad Anwer Rao; Joost Van de Weijer; Andrew Bagdanov; Antonio Lopez edit  doi
openurl 
  Title Recognizing Actions through Action-specific Person Detection Type Journal Article
  Year 2015 Publication IEEE Transactions on Image Processing Abbreviated Journal TIP  
  Volume 24 Issue 11 Pages 4422-4432  
  Keywords  
  Abstract Action recognition in still images is a challenging problem in computer vision. To facilitate comparative evaluation independently of person detection, the standard evaluation protocol for action recognition uses an oracle person detector to obtain perfect bounding box information at both training and test time. The assumption is that, in practice, a general person detector will provide candidate bounding boxes for action recognition. In this paper, we argue that this paradigm is suboptimal and that action class labels should already be considered during the detection stage. Motivated by the observation that body pose is strongly conditioned on action class, we show that: 1) the existing state-of-the-art generic person detectors are not adequate for proposing candidate bounding boxes for action classification; 2) due to limited training examples, the direct training of action-specific person detectors is also inadequate; and 3) using only a small number of labeled action examples, the transfer learning is able to adapt an existing detector to propose higher quality bounding boxes for subsequent action classification. To the best of our knowledge, we are the first to investigate transfer learning for the task of action-specific person detection in still images. We perform extensive experiments on two benchmark data sets: 1) Stanford-40 and 2) PASCAL VOC 2012. For the action detection task (i.e., both person localization and classification of the action performed), our approach outperforms methods based on general person detection by 5.7% mean average precision (MAP) on Stanford-40 and 2.1% MAP on PASCAL VOC 2012. Our approach also significantly outperforms the state of the art with a MAP of 45.4% on Stanford-40 and 31.4% on PASCAL VOC 2012. We also evaluate our action detection approach for the task of action classification (i.e., recognizing actions without localizing them). For this task, our approach, without using any ground-truth person localization at test tim- , outperforms on both data sets state-of-the-art methods, which do use person locations.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1057-7149 ISBN Medium  
  Area Expedition Conference  
  Notes ADAS; LAMP; 600.076; 600.079 Approved no  
  Call Number Admin @ si @ KXR2015 Serial 2668  
Permanent link to this record
 

 
Author Mikhail Mozerov; Joost Van de Weijer edit  doi
openurl 
  Title Accurate stereo matching by two step global optimization Type Journal Article
  Year 2015 Publication IEEE Transactions on Image Processing Abbreviated Journal TIP  
  Volume 24 Issue 3 Pages 1153-1163  
  Keywords  
  Abstract In stereo matching cost filtering methods and energy minimization algorithms are considered as two different techniques. Due to their global extend energy minimization methods obtain good stereo matching results. However, they tend to fail in occluded regions, in which cost filtering approaches obtain better results. In this paper we intend to combine both approaches with the aim to improve overall stereo matching results. We show that a global optimization with a fully connected model can be solved by cost fil tering methods. Based on this observation we propose to perform stereo matching as a two-step energy minimization algorithm. We consider two MRF models: a fully connected model defined on the complete set of pixels in an image and a conventional locally connected model. We solve the energy minimization problem for the fully connected model, after which the marginal function of the solution is used as the unary potential in the locally connected MRF model. Experiments on the Middlebury stereo datasets show that the proposed method achieves state-of-the-arts results.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1057-7149 ISBN Medium  
  Area Expedition Conference  
  Notes ISE; LAMP; 600.079; 600.078 Approved no  
  Call Number Admin @ si @ MoW2015a Serial 2568  
Permanent link to this record
 

 
Author Fahad Shahbaz Khan; Joost Van de Weijer; Muhammad Anwer Rao; Michael Felsberg; Carlo Gatta edit   pdf
doi  openurl
  Title Semantic Pyramids for Gender and Action Recognition Type Journal Article
  Year 2014 Publication IEEE Transactions on Image Processing Abbreviated Journal TIP  
  Volume 23 Issue 8 Pages 3633-3645  
  Keywords  
  Abstract Person description is a challenging problem in computer vision. We investigated two major aspects of person description: 1) gender and 2) action recognition in still images. Most state-of-the-art approaches for gender and action recognition rely on the description of a single body part, such as face or full-body. However, relying on a single body part is suboptimal due to significant variations in scale, viewpoint, and pose in real-world images. This paper proposes a semantic pyramid approach for pose normalization. Our approach is fully automatic and based on combining information from full-body, upper-body, and face regions for gender and action recognition in still images. The proposed approach does not require any annotations for upper-body and face of a person. Instead, we rely on pretrained state-of-the-art upper-body and face detectors to automatically extract semantic information of a person. Given multiple bounding boxes from each body part detector, we then propose a simple method to select the best candidate bounding box, which is used for feature extraction. Finally, the extracted features from the full-body, upper-body, and face regions are combined into a single representation for classification. To validate the proposed approach for gender recognition, experiments are performed on three large data sets namely: 1) human attribute; 2) head-shoulder; and 3) proxemics. For action recognition, we perform experiments on four data sets most used for benchmarking action recognition in still images: 1) Sports; 2) Willow; 3) PASCAL VOC 2010; and 4) Stanford-40. Our experiments clearly demonstrate that the proposed approach, despite its simplicity, outperforms state-of-the-art methods for gender and action recognition.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1057-7149 ISBN Medium  
  Area Expedition Conference  
  Notes CIC; LAMP; 601.160; 600.074; 600.079;MILAB Approved no  
  Call Number Admin @ si @ KWR2014 Serial 2507  
Permanent link to this record
 

 
Author Shida Beigpour; Christian Riess; Joost Van de Weijer; Elli Angelopoulou edit   pdf
doi  openurl
  Title Multi-Illuminant Estimation with Conditional Random Fields Type Journal Article
  Year 2014 Publication IEEE Transactions on Image Processing Abbreviated Journal TIP  
  Volume 23 Issue 1 Pages 83-95  
  Keywords color constancy; CRF; multi-illuminant  
  Abstract Most existing color constancy algorithms assume uniform illumination. However, in real-world scenes, this is not often the case. Thus, we propose a novel framework for estimating the colors of multiple illuminants and their spatial distribution in the scene. We formulate this problem as an energy minimization task within a conditional random field over a set of local illuminant estimates. In order to quantitatively evaluate the proposed method, we created a novel data set of two-dominant-illuminant images comprised of laboratory, indoor, and outdoor scenes. Unlike prior work, our database includes accurate pixel-wise ground truth illuminant information. The performance of our method is evaluated on multiple data sets. Experimental results show that our framework clearly outperforms single illuminant estimators as well as a recently proposed multi-illuminant estimation approach.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1057-7149 ISBN Medium  
  Area Expedition Conference  
  Notes CIC; LAMP; 600.074; 600.079 Approved no  
  Call Number Admin @ si @ BRW2014 Serial 2451  
Permanent link to this record
 

 
Author Adriana Romero; Carlo Gatta; Gustavo Camps-Valls edit   pdf
doi  openurl
  Title Unsupervised Deep Feature Extraction for Remote Sensing Image Classification Type Journal Article
  Year 2016 Publication IEEE Transaction on Geoscience and Remote Sensing Abbreviated Journal TGRS  
  Volume 54 Issue 3 Pages 1349 - 1362  
  Keywords  
  Abstract This paper introduces the use of single-layer and deep convolutional networks for remote sensing data analysis. Direct application to multi- and hyperspectral imagery of supervised (shallow or deep) convolutional networks is very challenging given the high input data dimensionality and the relatively small amount of available labeled data. Therefore, we propose the use of greedy layerwise unsupervised pretraining coupled with a highly efficient algorithm for unsupervised learning of sparse features. The algorithm is rooted on sparse representations and enforces both population and lifetime sparsity of the extracted features, simultaneously. We successfully illustrate the expressive power of the extracted representations in several scenarios: classification of aerial scenes, as well as land-use classification in very high resolution or land-cover classification from multi- and hyperspectral images. The proposed algorithm clearly outperforms standard principal component analysis (PCA) and its kernel counterpart (kPCA), as well as current state-of-the-art algorithms of aerial classification, while being extremely computationally efficient at learning representations of data. Results show that single-layer convolutional networks can extract powerful discriminative features only when the receptive field accounts for neighboring pixels and are preferred when the classification requires high resolution and detailed results. However, deep architectures significantly outperform single-layer variants, capturing increasing levels of abstraction and complexity throughout the feature hierarchy.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 0196-2892 ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; 600.079;MILAB Approved no  
  Call Number Admin @ si @ RGC2016 Serial 2723  
Permanent link to this record
Select All    Deselect All
 |   | 
Details

Save Citations:
Export Records: