toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
   print
  Records Links
Author Jiaolong Xu; David Vazquez; Antonio Lopez; Javier Marin; Daniel Ponsa edit   pdf
doi  isbn
openurl 
  Title Learning a Part-based Pedestrian Detector in Virtual World Type Journal Article
  Year 2014 Publication IEEE Transactions on Intelligent Transportation Systems Abbreviated Journal TITS  
  Volume 15 Issue 5 Pages 2121-2131  
  Keywords Domain Adaptation; Pedestrian Detection; Virtual Worlds  
  Abstract Detecting pedestrians with on-board vision systems is of paramount interest for assisting drivers to prevent vehicle-to-pedestrian accidents. The core of a pedestrian detector is its classification module, which aims at deciding if a given image window contains a pedestrian. Given the difficulty of this task, many classifiers have been proposed during the last fifteen years. Among them, the so-called (deformable) part-based classifiers including multi-view modeling are usually top ranked in accuracy. Training such classifiers is not trivial since a proper aspect clustering and spatial part alignment of the pedestrian training samples are crucial for obtaining an accurate classifier. In this paper, first we perform automatic aspect clustering and part alignment by using virtual-world pedestrians, i.e., human annotations are not required. Second, we use a mixture-of-parts approach that allows part sharing among different aspects. Third, these proposals are integrated in a learning framework which also allows to incorporate real-world training data to perform domain adaptation between virtual- and real-world cameras. Overall, the obtained results on four popular on-board datasets show that our proposal clearly outperforms the state-of-the-art deformable part-based detector known as latent SVM.  
  Address  
  Corporate Author (up) Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1931-0587 ISBN 978-1-4673-2754-1 Medium  
  Area Expedition Conference  
  Notes ADAS; 600.076 Approved no  
  Call Number ADAS @ adas @ XVL2014 Serial 2433  
Permanent link to this record
 

 
Author Jiaolong Xu; Sebastian Ramos;David Vazquez; Antonio Lopez edit   pdf
doi  openurl
  Title Cost-sensitive Structured SVM for Multi-category Domain Adaptation Type Conference Article
  Year 2014 Publication 22nd International Conference on Pattern Recognition Abbreviated Journal  
  Volume Issue Pages 3886 - 3891  
  Keywords Domain Adaptation; Pedestrian Detection  
  Abstract Domain adaptation addresses the problem of accuracy drop that a classifier may suffer when the training data (source domain) and the testing data (target domain) are drawn from different distributions. In this work, we focus on domain adaptation for structured SVM (SSVM). We propose a cost-sensitive domain adaptation method for SSVM, namely COSS-SSVM. In particular, during the re-training of an adapted classifier based on target and source data, the idea that we explore consists in introducing a non-zero cost even for correctly classified source domain samples. Eventually, we aim to learn a more targetoriented classifier by not rewarding (zero loss) properly classified source-domain training samples. We assess the effectiveness of COSS-SSVM on multi-category object recognition.  
  Address Stockholm; Sweden; August 2014  
  Corporate Author (up) Thesis  
  Publisher IEEE Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1051-4651 ISBN Medium  
  Area Expedition Conference ICPR  
  Notes ADAS; 600.057; 600.054; 601.217; 600.076 Approved no  
  Call Number ADAS @ adas @ XRV2014a Serial 2434  
Permanent link to this record
 

 
Author Onur Ferhat; Fernando Vilariño; F. Javier Sanchez edit  url
openurl 
  Title A cheap portable eye-tracker solution for common setups. Type Journal Article
  Year 2014 Publication Journal of Eye Movement Research Abbreviated Journal JEMR  
  Volume 7 Issue 3 Pages 1-10  
  Keywords  
  Abstract We analyze the feasibility of a cheap eye-tracker where the hardware consists of a single webcam and a Raspberry Pi device. Our aim is to discover the limits of such a system and to see whether it provides an acceptable performance. We base our work on the open source Opengazer (Zielinski, 2013) and we propose several improvements to create a robust, real-time system which can work on a computer with 30Hz sampling rate. After assessing the accuracy of our eye-tracker in elaborated experiments involving 12 subjects under 4 different system setups, we install it on a Raspberry Pi to create a portable stand-alone eye-tracker which achieves 1.42° horizontal accuracy with 3Hz refresh rate for a building cost of 70 Euros.  
  Address  
  Corporate Author (up) Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ;SIAI Approved no  
  Call Number Admin @ si @ FVS2014 Serial 2435  
Permanent link to this record
 

 
Author J.S. Cope; P.Remagnino; S.Mannan; Katerine Diaz; Francesc J. Ferri; P.Wilkin edit  url
doi  openurl
  Title Reverse Engineering Expert Visual Observations: From Fixations To The Learning Of Spatial Filters With A Neural-Gas Algorithm Type Journal Article
  Year 2013 Publication Expert Systems with Applications Abbreviated Journal EXWA  
  Volume 40 Issue 17 Pages 6707-6712  
  Keywords Neural gas; Expert vision; Eye-tracking; Fixations  
  Abstract Human beings can become experts in performing specific vision tasks, for example, doctors analysing medical images, or botanists studying leaves. With sufficient knowledge and experience, people can become very efficient at such tasks. When attempting to perform these tasks with a machine vision system, it would be highly beneficial to be able to replicate the process which the expert undergoes. Advances in eye-tracking technology can provide data to allow us to discover the manner in which an expert studies an image. This paper presents a first step towards utilizing these data for computer vision purposes. A growing-neural-gas algorithm is used to learn a set of Gabor filters which give high responses to image regions which a human expert fixated on. These filters can then be used to identify regions in other images which are likely to be useful for a given vision task. The algorithm is evaluated by learning filters for locating specific areas of plant leaves.  
  Address  
  Corporate Author (up) Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 0957-4174 ISBN Medium  
  Area Expedition Conference  
  Notes ADAS Approved no  
  Call Number Admin @ si @ CRM2013 Serial 2438  
Permanent link to this record
 

 
Author Katerine Diaz; Francesc J. Ferri; W. Diaz edit  doi
isbn  openurl
  Title Fast Approximated Discriminative Common Vectors using rank-one SVD updates Type Conference Article
  Year 2013 Publication 20th International Conference On Neural Information Processing Abbreviated Journal  
  Volume 8228 Issue III Pages 368-375  
  Keywords  
  Abstract An efficient incremental approach to the discriminative common vector (DCV) method for dimensionality reduction and classification is presented. The proposal consists of a rank-one update along with an adaptive restriction on the rank of the null space which leads to an approximate but convenient solution. The algorithm can be implemented very efficiently in terms of matrix operations and space complexity, which enables its use in large-scale dynamic application domains. Deep comparative experimentation using publicly available high dimensional image datasets has been carried out in order to properly assess the proposed algorithm against several recent incremental formulations.
K. Diaz-Chito, F.J. Ferri, W. Diaz
 
  Address Daegu; Korea; November 2013  
  Corporate Author (up) Thesis  
  Publisher Springer Berlin Heidelberg Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN 0302-9743 ISBN 978-3-642-42050-4 Medium  
  Area Expedition Conference ICONIP  
  Notes ADAS Approved no  
  Call Number Admin @ si @ DFD2013 Serial 2439  
Permanent link to this record
 

 
Author Katerine Diaz; Francesc J. Ferri edit  url
isbn  openurl
  Title Extensiones del método de vectores comunes discriminantes Aplicadas a la clasificación de imágenes Type Book Whole
  Year 2013 Publication Extensiones del método de vectores comunes discriminantes Aplicadas a la clasificación de imágenes Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Los métodos basados en subespacios son una herramienta muy utilizada en aplicaciones de visión por computador. Aquí se presentan y validan algunos algoritmos que hemos propuesto en este campo de investigación. El primer algoritmo está relacionado con una extensión del método de vectores comunes discriminantes con kernel, que reinterpreta el espacio nulo de la matriz de dispersión intra-clase del conjunto de entrenamiento para obtener las características discriminantes. Dentro de los métodos basados en subespacios existen diferentes tipos de entrenamiento. Uno de los más populares, pero no por ello uno de los más eficientes, es el aprendizaje por lotes. En este tipo de aprendizaje, todas las muestras del conjunto de entrenamiento tienen que estar disponibles desde el inicio. De este modo, cuando nuevas muestras se ponen a disposición del algoritmo, el sistema tiene que ser reentrenado de nuevo desde cero. Una alternativa a este tipo de entrenamiento es el aprendizaje incremental. Aquí­ se proponen diferentes algoritmos incrementales del método de vectores comunes discriminantes.  
  Address  
  Corporate Author (up) Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-3-639-55339-0 Medium  
  Area Expedition Conference  
  Notes ADAS Approved no  
  Call Number Admin @ si @ DiF2013 Serial 2440  
Permanent link to this record
 

 
Author Mohammad Ali Bagheri; Qigang Gao; Sergio Escalera edit  doi
openurl 
  Title Combining Local and Global Learners in the Pairwise Multiclass Classification Type Journal Article
  Year 2015 Publication Pattern Analysis and Applications Abbreviated Journal PAA  
  Volume 18 Issue 4 Pages 845-860  
  Keywords Multiclass classification; Pairwise approach; One-versus-one  
  Abstract Pairwise classification is a well-known class binarization technique that converts a multiclass problem into a number of two-class problems, one problem for each pair of classes. However, in the pairwise technique, nuisance votes of many irrelevant classifiers may result in a wrong class prediction. To overcome this problem, a simple, but efficient method is proposed and evaluated in this paper. The proposed method is based on excluding some classes and focusing on the most probable classes in the neighborhood space, named Local Crossing Off (LCO). This procedure is performed by employing a modified version of standard K-nearest neighbor and large margin nearest neighbor algorithms. The LCO method takes advantage of nearest neighbor classification algorithm because of its local learning behavior as well as the global behavior of powerful binary classifiers to discriminate between two classes. Combining these two properties in the proposed LCO technique will avoid the weaknesses of each method and will increase the efficiency of the whole classification system. On several benchmark datasets of varying size and difficulty, we found that the LCO approach leads to significant improvements using different base learners. The experimental results show that the proposed technique not only achieves better classification accuracy in comparison to other standard approaches, but also is computationally more efficient for tackling classification problems which have a relatively large number of target classes.  
  Address  
  Corporate Author (up) Thesis  
  Publisher Springer London Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1433-7541 ISBN Medium  
  Area Expedition Conference  
  Notes HuPBA;MILAB Approved no  
  Call Number Admin @ si @ BGE2014 Serial 2441  
Permanent link to this record
 

 
Author Oscar Lopes; Miguel Reyes; Sergio Escalera; Jordi Gonzalez edit  doi
openurl 
  Title Spherical Blurred Shape Model for 3-D Object and Pose Recognition: Quantitative Analysis and HCI Applications in Smart Environments Type Journal Article
  Year 2014 Publication IEEE Transactions on Systems, Man and Cybernetics (Part B) Abbreviated Journal TSMCB  
  Volume 44 Issue 12 Pages 2379-2390  
  Keywords  
  Abstract The use of depth maps is of increasing interest after the advent of cheap multisensor devices based on structured light, such as Kinect. In this context, there is a strong need of powerful 3-D shape descriptors able to generate rich object representations. Although several 3-D descriptors have been already proposed in the literature, the research of discriminative and computationally efficient descriptors is still an open issue. In this paper, we propose a novel point cloud descriptor called spherical blurred shape model (SBSM) that successfully encodes the structure density and local variabilities of an object based on shape voxel distances and a neighborhood propagation strategy. The proposed SBSM is proven to be rotation and scale invariant, robust to noise and occlusions, highly discriminative for multiple categories of complex objects like the human hand, and computationally efficient since the SBSM complexity is linear to the number of object voxels. Experimental evaluation in public depth multiclass object data, 3-D facial expressions data, and a novel hand poses data sets show significant performance improvements in relation to state-of-the-art approaches. Moreover, the effectiveness of the proposal is also proved for object spotting in 3-D scenes and for real-time automatic hand pose recognition in human computer interaction scenarios.  
  Address  
  Corporate Author (up) Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 2168-2267 ISBN Medium  
  Area Expedition Conference  
  Notes HuPBA; ISE; 600.078;MILAB Approved no  
  Call Number Admin @ si @ LRE2014 Serial 2442  
Permanent link to this record
 

 
Author Xavier Perez Sala; Sergio Escalera; Cecilio Angulo; Jordi Gonzalez edit   pdf
doi  openurl
  Title A survey on model based approaches for 2D and 3D visual human pose recovery Type Journal Article
  Year 2014 Publication Sensors Abbreviated Journal SENS  
  Volume 14 Issue 3 Pages 4189-4210  
  Keywords human pose recovery; human body modelling; behavior analysis; computer vision  
  Abstract Human Pose Recovery has been studied in the field of Computer Vision for the last 40 years. Several approaches have been reported, and significant improvements have been obtained in both data representation and model design. However, the problem of Human Pose Recovery in uncontrolled environments is far from being solved. In this paper, we define a general taxonomy to group model based approaches for Human Pose Recovery, which is composed of five main modules: appearance, viewpoint, spatial relations, temporal consistence, and behavior. Subsequently, a methodological comparison is performed following the proposed taxonomy, evaluating current SoA approaches in the aforementioned five group categories. As a result of this comparison, we discuss the main advantages and drawbacks of the reviewed literature.  
  Address  
  Corporate Author (up) Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HuPBA; ISE; 600.046; 600.063; 600.078;MILAB Approved no  
  Call Number Admin @ si @ PEA2014 Serial 2443  
Permanent link to this record
 

 
Author Frederic Sampedro; Anna Domenech; Sergio Escalera edit  url
openurl 
  Title Obtaining quantitative global tumoral state indicators based on whole-body PET/CT scans: A breast cancer case study Type Journal Article
  Year 2014 Publication Nuclear Medicine Communications Abbreviated Journal NMC  
  Volume 35 Issue 4 Pages 362-371  
  Keywords  
  Abstract Objectives: In this work we address the need for the computation of quantitative global tumoral state indicators from oncological whole-body PET/computed tomography scans. The combination of such indicators with other oncological information such as tumor markers or biopsy results would prove useful in oncological decision-making scenarios.

Materials and methods: From an ordering of 100 breast cancer patients on the basis of oncological state through visual analysis by a consensus of nuclear medicine specialists, a set of numerical indicators computed from image analysis of the PET/computed tomography scan is presented, which attempts to summarize a patient’s oncological state in a quantitative manner taking into consideration the total tumor volume, aggressiveness, and spread.

Results: Results obtained by comparative analysis of the proposed indicators with respect to the experts’ evaluation show up to 87% Pearson’s correlation coefficient when providing expert-guided PET metabolic tumor volume segmentation and 64% correlation when using completely automatic image analysis techniques.

Conclusion: Global quantitative tumor information obtained by whole-body PET/CT image analysis can prove useful in clinical nuclear medicine settings and oncological decision-making scenarios. The completely automatic computation of such indicators would improve its impact as time efficiency and specialist independence would be achieved.
 
  Address  
  Corporate Author (up) Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HuPBA;MILAB Approved no  
  Call Number SDE2014a Serial 2444  
Permanent link to this record
 

 
Author Mohammad Ali Bagheri; Qigang Gao; Sergio Escalera edit   pdf
doi  openurl
  Title Generic Subclass Ensemble: A Novel Approach to Ensemble Classification Type Conference Article
  Year 2014 Publication 22nd International Conference on Pattern Recognition Abbreviated Journal  
  Volume Issue Pages 1254 - 1259  
  Keywords  
  Abstract Multiple classifier systems, also known as classifier ensembles, have received great attention in recent years because of their improved classification accuracy in different applications. In this paper, we propose a new general approach to ensemble classification, named generic subclass ensemble, in which each base classifier is trained with data belonging to a subset of classes, and thus discriminates among a subset of target categories. The ensemble classifiers are then fused using a combination rule. The proposed approach differs from existing methods that manipulate the target attribute, since in our approach individual classification problems are not restricted to two-class problems. We perform a series of experiments to evaluate the efficiency of the generic subclass approach on a set of benchmark datasets. Experimental results with multilayer perceptrons show that the proposed approach presents a viable alternative to the most commonly used ensemble classification approaches.  
  Address Stockholm; August 2014  
  Corporate Author (up) Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1051-4651 ISBN Medium  
  Area Expedition Conference ICPR  
  Notes HuPBA;MILAB Approved no  
  Call Number Admin @ si @ BGE2014b Serial 2445  
Permanent link to this record
 

 
Author Mohammad Ali Bagheri; Gang Hu; Qigang Gao; Sergio Escalera edit   pdf
doi  openurl
  Title A Framework of Multi-Classifier Fusion for Human Action Recognition Type Conference Article
  Year 2014 Publication 22nd International Conference on Pattern Recognition Abbreviated Journal  
  Volume Issue Pages 1260 - 1265  
  Keywords  
  Abstract The performance of different action-recognition methods using skeleton joint locations have been recently studied by several computer vision researchers. However, the potential improvement in classification through classifier fusion by ensemble-based methods has remained unattended. In this work, we evaluate the performance of an ensemble of five action learning techniques, each performing the recognition task from a different perspective. The underlying rationale of the fusion approach is that different learners employ varying structures of input descriptors/features to be trained. These varying structures cannot be attached and used by a single learner. In addition, combining the outputs of several learners can reduce the risk of an unfortunate selection of a poorly performing learner. This leads to having a more robust and general-applicable framework. Also, we propose two simple, yet effective, action description techniques. In order to improve the recognition performance, a powerful combination strategy is utilized based on the Dempster-Shafer theory, which can effectively make use of diversity of base learners trained on different sources of information. The recognition results of the individual classifiers are compared with those obtained from fusing the classifiers' output, showing advanced performance of the proposed methodology.  
  Address Stockholm; Sweden; August 2014  
  Corporate Author (up) Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1051-4651 ISBN Medium  
  Area Expedition Conference ICPR  
  Notes HuPBA;MILAB Approved no  
  Call Number Admin @ si @ BHG2014 Serial 2446  
Permanent link to this record
 

 
Author Naveen Onkarappa edit  isbn
openurl 
  Title Optical Flow in Driver Assistance Systems Type Book Whole
  Year 2013 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Motion perception is one of the most important attributes of the human brain. Visual motion perception consists in inferring speed and direction of elements in a scene based on visual inputs. Analogously, computer vision is assisted by motion cues in the scene. Motion detection in computer vision is useful in solving problems such as segmentation, depth from motion, structure from motion, compression, navigation and many others. These problems are common in several applications, for instance, video surveillance, robot navigation and advanced driver assistance systems (ADAS). One of the most widely used techniques for motion detection is the optical flow estimation. The work in this thesis attempts to make optical flow suitable for the requirements and conditions of driving scenarios. In this context, a novel space-variant representation called reverse log-polar representation is proposed that is shown to be better than the traditional log-polar space-variant representation for ADAS. The space-variant representations reduce the amount of data to be processed. Another major contribution in this research is related to the analysis of the influence of specific characteristics from driving scenarios on the optical flow accuracy. Characteristics such as vehicle speed and
road texture are considered in the aforementioned analysis. From this study, it is inferred that the regularization weight has to be adapted according to the required error measure and for different speeds and road textures. It is also shown that polar represented optical flow suits driving scenarios where predominant motion is translation. Due to the requirements of such a study and by the lack of needed datasets a new synthetic dataset is presented; it contains: i) sequences of different speeds and road textures in an urban scenario; ii) sequences with complex motion of an on-board camera; and iii) sequences with additional moving vehicles in the scene. The ground-truth optical flow is generated by the ray-tracing technique. Further, few applications of optical flow in ADAS are shown. Firstly, a robust RANSAC based technique to estimate horizon line is proposed. Then, an egomotion estimation is presented to compare the proposed space-variant representation with the classical one. As a final contribution, a modification in the regularization term is proposed that notably improves the results
in the ADAS applications. This adaptation is evaluated using a state of the art optical flow technique. The experiments on a public dataset (KITTI) validate the advantages of using the proposed modification.
 
  Address Bellaterra  
  Corporate Author (up) Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Angel Sappa  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-940902-1-9 Medium  
  Area Expedition Conference  
  Notes ADAS Approved no  
  Call Number Admin @ si @ Nav2013 Serial 2447  
Permanent link to this record
 

 
Author Jorge Bernal; Fernando Vilariño; F. Javier Sanchez; M. Arnold; Anarta Ghosh; Gerard Lacey edit   pdf
doi  isbn
openurl 
  Title Experts vs Novices: Applying Eye-tracking Methodologies in Colonoscopy Video Screening for Polyp Search Type Conference Article
  Year 2014 Publication 2014 Symposium on Eye Tracking Research and Applications Abbreviated Journal  
  Volume Issue Pages 223-226  
  Keywords  
  Abstract We present in this paper a novel study aiming at identifying the differences in visual search patterns between physicians of diverse levels of expertise during the screening of colonoscopy videos. Physicians were clustered into two groups -experts and novices- according to the number of procedures performed, and fixations were captured by an eye-tracker device during the task of polyp search in different video sequences. These fixations were integrated into heat maps, one for each cluster. The obtained maps were validated over a ground truth consisting of a mask of the polyp, and the comparison between experts and novices was performed by using metrics such as reaction time, dwelling time and energy concentration ratio. Experimental results show a statistically significant difference between experts and novices, and the obtained maps show to be a useful tool for the characterisation of the behaviour of each group.  
  Address USA; March 2014  
  Corporate Author (up) Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-1-4503-2751-0 Medium  
  Area Expedition Conference ETRA  
  Notes MV; 600.047; 600.060;SIAI Approved no  
  Call Number Admin @ si @ BVS2014 Serial 2448  
Permanent link to this record
 

 
Author Fahad Shahbaz Khan; Joost Van de Weijer; Andrew Bagdanov; Michael Felsberg edit   pdf
doi  openurl
  Title Scale Coding Bag-of-Words for Action Recognition Type Conference Article
  Year 2014 Publication 22nd International Conference on Pattern Recognition Abbreviated Journal  
  Volume Issue Pages 1514-1519  
  Keywords  
  Abstract Recognizing human actions in still images is a challenging problem in computer vision due to significant amount of scale, illumination and pose variation. Given the bounding box of a person both at training and test time, the task is to classify the action associated with each bounding box in an image.
Most state-of-the-art methods use the bag-of-words paradigm for action recognition. The bag-of-words framework employing a dense multi-scale grid sampling strategy is the de facto standard for feature detection. This results in a scale invariant image representation where all the features at multiple-scales are binned in a single histogram. We argue that such a scale invariant
strategy is sub-optimal since it ignores the multi-scale information
available with each bounding box of a person.
This paper investigates alternative approaches to scale coding for action recognition in still images. We encode multi-scale information explicitly in three different histograms for small, medium and large scale visual-words. Our first approach exploits multi-scale information with respect to the image size. In our second approach, we encode multi-scale information relative to the size of the bounding box of a person instance. In each approach, the multi-scale histograms are then concatenated into a single representation for action classification. We validate our approaches on the Willow dataset which contains seven action categories: interacting with computer, photography, playing music,
riding bike, riding horse, running and walking. Our results clearly suggest that the proposed scale coding approaches outperform the conventional scale invariant technique. Moreover, we show that our approach obtains promising results compared to more complex state-of-the-art methods.
 
  Address Stockholm; August 2014  
  Corporate Author (up) Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPR  
  Notes CIC; LAMP; 601.240; 600.074; 600.079 Approved no  
  Call Number Admin @ si @ KWB2014 Serial 2450  
Permanent link to this record
Select All    Deselect All
 |   | 
Details
   print

Save Citations:
Export Records: