Home | << 1 2 3 4 5 6 7 8 9 10 >> [11–12] |
Records | |||||
---|---|---|---|---|---|
Author | Marc Bolaños; Maite Garolera; Petia Radeva | ||||
Title | Active labeling application applied to food-related object recognition | Type | Conference Article | ||
Year | 2013 | Publication | 5th International Workshop on Multimedia for Cooking & Eating Activities | Abbreviated Journal | |
Volume | Issue | Pages | 45-50 | ||
Keywords | |||||
Abstract | Every day, lifelogging devices, available for recording different aspects of our daily life, increase in number, quality and functions, just like the multiple applications that we give to them. Applying wearable devices to analyse the nutritional habits of people is a challenging application based on acquiring and analyzing life records in long periods of time. However, to extract the information of interest related to the eating patterns of people, we need automatic methods to process large amount of life-logging data (e.g. recognition of food-related objects). Creating a rich set of manually labeled samples to train the algorithms is slow, tedious and subjective. To address this problem, we propose a novel method in the framework of Active Labeling for construct- ing a training set of thousands of images. Inspired by the hierarchical sampling method for active learning [6], we propose an Active forest that organizes hierarchically the data for easy and fast labeling. Moreover, introducing a classifier into the hierarchical structures, as well as transforming the feature space for better data clustering, additionally im- prove the algorithm. Our method is successfully tested to label 89.700 food-related objects and achieves significant reduction in expert time labelling.
Active labeling application applied to food-related object recognition ResearchGate. Available from: http://www.researchgate.net/publication/262252017Activelabelingapplicationappliedtofood-relatedobjectrecognition [accessed Jul 14, 2015]. |
||||
Address | Barcelona; October 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ACM-CEA | ||
Notes | MILAB | Approved | no | ||
Call Number | Admin @ si @ BGR2013b | Serial | 2637 | ||
Permanent link to this record | |||||
Author | Abel Gonzalez-Garcia; Robert Benavente; Olivier Penacchio; Javier Vazquez; Maria Vanrell; C. Alejandro Parraga | ||||
Title | Coloresia: An Interactive Colour Perception Device for the Visually Impaired | Type | Book Chapter | ||
Year | 2013 | Publication | Multimodal Interaction in Image and Video Applications | Abbreviated Journal | |
Volume | 48 | Issue | Pages | 47-66 | |
Keywords | |||||
Abstract | A significative percentage of the human population suffer from impairments in their capacity to distinguish or even see colours. For them, everyday tasks like navigating through a train or metro network map becomes demanding. We present a novel technique for extracting colour information from everyday natural stimuli and presenting it to visually impaired users as pleasant, non-invasive sound. This technique was implemented inside a Personal Digital Assistant (PDA) portable device. In this implementation, colour information is extracted from the input image and categorised according to how human observers segment the colour space. This information is subsequently converted into sound and sent to the user via speakers or headphones. In the original implementation, it is possible for the user to send its feedback to reconfigure the system, however several features such as these were not implemented because the current technology is limited.We are confident that the full implementation will be possible in the near future as PDA technology improves. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer Berlin Heidelberg | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1868-4394 | ISBN | 978-3-642-35931-6 | Medium | |
Area | Expedition | Conference | |||
Notes | CIC; 600.052; 605.203 | Approved | no | ||
Call Number | Admin @ si @ GBP2013 | Serial | 2266 | ||
Permanent link to this record | |||||
Author | Naveen Onkarappa; Angel Sappa | ||||
Title | A Novel Space Variant Image Representation | Type | Journal Article | ||
Year | 2013 | Publication | Journal of Mathematical Imaging and Vision | Abbreviated Journal | JMIV |
Volume | 47 | Issue | 1-2 | Pages | 48-59 |
Keywords | Space-variant representation; Log-polar mapping; Onboard vision applications | ||||
Abstract | Traditionally, in machine vision images are represented using cartesian coordinates with uniform sampling along the axes. On the contrary, biological vision systems represent images using polar coordinates with non-uniform sampling. For various advantages provided by space-variant representations many researchers are interested in space-variant computer vision. In this direction the current work proposes a novel and simple space variant representation of images. The proposed representation is compared with the classical log-polar mapping. The log-polar representation is motivated by biological vision having the characteristic of higher resolution at the fovea and reduced resolution at the periphery. On the contrary to the log-polar, the proposed new representation has higher resolution at the periphery and lower resolution at the fovea. Our proposal is proved to be a better representation in navigational scenarios such as driver assistance systems and robotics. The experimental results involve analysis of optical flow fields computed on both proposed and log-polar representations. Additionally, an egomotion estimation application is also shown as an illustrative example. The experimental analysis comprises results from synthetic as well as real sequences. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer US | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0924-9907 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | ADAS; 600.055; 605.203; 601.215 | Approved | no | ||
Call Number | Admin @ si @ OnS2013a | Serial | 2243 | ||
Permanent link to this record | |||||
Author | Mirko Arnold; Anarta Ghosh; Glen Doherty; Hugh Mulcahy; Stephen Patchett; Gerard Lacey | ||||
Title | Towards Automatic Direct Observation of Procedure and Skill (DOPS) in Colonoscopy | Type | Conference Article | ||
Year | 2013 | Publication | Proceedings of the International Conference on Computer Vision Theory and Applications | Abbreviated Journal | |
Volume | Issue | Pages | 48-53 | ||
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | 800 | Expedition | Conference | VISIGRAPP | |
Notes | MV | Approved | no | ||
Call Number | fernando @ fernando @ | Serial | 2427 | ||
Permanent link to this record | |||||
Author | Daniel Sanchez; J.C.Ortega; Miguel Angel Bautista | ||||
Title | Human Body Segmentation with Multi-limb Error-Correcting Output Codes Detection and Graph Cuts Optimization | Type | Conference Article | ||
Year | 2013 | Publication | 6th Iberian Conference on Pattern Recognition and Image Analysis | Abbreviated Journal | |
Volume | 7887 | Issue | Pages | 50-58 | |
Keywords | Human Body Segmentation; Error-Correcting Output Codes; Cascade of Classifiers; Graph Cuts | ||||
Abstract | Human body segmentation is a hard task because of the high variability in appearance produced by changes in the point of view, lighting conditions, and number of articulations of the human body. In this paper, we propose a two-stage approach for the segmentation of the human body. In a first step, a set of human limbs are described, normalized to be rotation invariant, and trained using cascade of classifiers to be split in a tree structure way. Once the tree structure is trained, it is included in a ternary Error-Correcting Output Codes (ECOC) framework. This first classification step is applied in a windowing way on a new test image, defining a body-like probability map, which is used as an initialization of a GMM color modelling and binary Graph Cuts optimization procedure. The proposed methodology is tested in a novel limb-labelled data set. Results show performance improvements of the novel approach in comparison to classical cascade of classifiers and human detector-based Graph Cuts segmentation approaches. | ||||
Address | Madeira; Portugal; June 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Springer Berlin Heidelberg | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | 0302-9743 | ISBN | 978-3-642-38627-5 | Medium | |
Area | Expedition | Conference | IbPRIA | ||
Notes | HUPBA | Approved | no | ||
Call Number | SOB2013 | Serial | 2250 | ||
Permanent link to this record | |||||
Author | Fernando Barrera; Felipe Lumbreras; Angel Sappa | ||||
Title | Multispectral Piecewise Planar Stereo using Manhattan-World Assumption | Type | Journal Article | ||
Year | 2013 | Publication | Pattern Recognition Letters | Abbreviated Journal | PRL |
Volume | 34 | Issue | 1 | Pages | 52-61 |
Keywords | Multispectral stereo rig; Dense disparity maps from multispectral stereo; Color and infrared images | ||||
Abstract | This paper proposes a new framework for extracting dense disparity maps from a multispectral stereo rig. The system is constructed with an infrared and a color camera. It is intended to explore novel multispectral stereo matching approaches that will allow further extraction of semantic information. The proposed framework consists of three stages. Firstly, an initial sparse disparity map is generated by using a cost function based on feature matching in a multiresolution scheme. Then, by looking at the color image, a set of planar hypotheses is defined to describe the surfaces on the scene. Finally, the previous stages are combined by reformulating the disparity computation as a global minimization problem. The paper has two main contributions. The first contribution combines mutual information with a shape descriptor based on gradient in a multiresolution scheme. The second contribution, which is based on the Manhattan-world assumption, extracts a dense disparity representation using the graph cut algorithm. Experimental results in outdoor scenarios are provided showing the validity of the proposed framework. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ADAS; 600.054; 600.055; 605.203 | Approved | no | ||
Call Number | Admin @ si @ BLS2013 | Serial | 2245 | ||
Permanent link to this record | |||||
Author | Volkmar Frinken; Andreas Fischer; Carlos David Martinez Hinarejos | ||||
Title | Handwriting Recognition in Historical Documents using Very Large Vocabularies | Type | Conference Article | ||
Year | 2013 | Publication | 2nd International Workshop on Historical Document Imaging and Processing | Abbreviated Journal | |
Volume | Issue | Pages | 67-72 | ||
Keywords | |||||
Abstract | Language models are used in automatic transcription system to resolve ambiguities. This is done by limiting the vocabulary of words that can be recognized as well as estimating the n-gram probability of the words in the given text. In the context of historical documents, a non-unified spelling and the limited amount of written text pose a substantial problem for the selection of the recognizable vocabulary as well as the computation of the word probabilities. In this paper we propose for the transcription of historical Spanish text to keep the corpus for the n-gram limited to a sample of the target text, but expand the vocabulary with words gathered from external resources. We analyze the performance of such a transcription system with different sizes of external vocabularies and demonstrate the applicability and the significant increase in recognition accuracy of using up to 300 thousand external words. | ||||
Address | Washington; USA; August 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-1-4503-2115-0 | Medium | ||
Area | Expedition | Conference | HIP | ||
Notes | DAG; 600.056; 600.045; 600.061; 602.006; 602.101 | Approved | no | ||
Call Number | Admin @ si @ FFM2013 | Serial | 2296 | ||
Permanent link to this record | |||||
Author | Laura Igual; Agata Lapedriza; Ricard Borras | ||||
Title | Robust Gait-Based Gender Classification using Depth Cameras | Type | Journal Article | ||
Year | 2013 | Publication | EURASIP Journal on Advances in Signal Processing | Abbreviated Journal | EURASIPJ |
Volume | 37 | Issue | 1 | Pages | 72-80 |
Keywords | |||||
Abstract | This article presents a new approach for gait-based gender recognition using depth cameras, that can run in real time. The main contribution of this study is a new fast feature extraction strategy that uses the 3D point cloud obtained from the frames in a gait cycle. For each frame, these points are aligned according to their centroid and grouped. After that, they are projected into their PCA plane, obtaining a representation of the cycle particularly robust against view changes. Then, final discriminative features are computed by first making a histogram of the projected points and then using linear discriminant analysis. To test the method we have used the DGait database, which is currently the only publicly available database for gait analysis that includes depth information. We have performed experiments on manually labeled cycles and over whole video sequences, and the results show that our method improves the accuracy significantly, compared with state-of-the-art systems which do not use depth information. Furthermore, our approach is insensitive to illumination changes, given that it discards the RGB information. That makes the method especially suitable for real applications, as illustrated in the last part of the experiments section. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB; OR;MV | Approved | no | ||
Call Number | Admin @ si @ ILB2013 | Serial | 2144 | ||
Permanent link to this record | |||||
Author | Michal Drozdzal; Santiago Segui; Carolina Malagelada; Fernando Azpiroz; Petia Radeva | ||||
Title | Adaptable image cuts for motility inspection using WCE | Type | Journal Article | ||
Year | 2013 | Publication | Computerized Medical Imaging and Graphics | Abbreviated Journal | CMIG |
Volume | 37 | Issue | 1 | Pages | 72-80 |
Keywords | |||||
Abstract | The Wireless Capsule Endoscopy (WCE) technology allows the visualization of the whole small intestine tract. Since the capsule is freely moving, mainly by the means of peristalsis, the data acquired during the study gives a lot of information about the intestinal motility. However, due to: (1) huge amount of frames, (2) complex intestinal scene appearance and (3) intestinal dynamics that make difficult the visualization of the small intestine physiological phenomena, the analysis of the WCE data requires computer-aided systems to speed up the analysis. In this paper, we propose an efficient algorithm for building a novel representation of the WCE video data, optimal for motility analysis and inspection. The algorithm transforms the 3D video data into 2D longitudinal view by choosing the most informative, from the intestinal motility point of view, part of each frame. This step maximizes the lumen visibility in its longitudinal extension. The task of finding “the best longitudinal view” has been defined as a cost function optimization problem which global minimum is obtained by using Dynamic Programming. Validation on both synthetic data and WCE data shows that the adaptive longitudinal view is a good alternative to the traditional motility analysis done by video analysis. The proposed novel data representation a new, holistic insight into the small intestine motility, allowing to easily define and analyze motility events that are difficult to spot by analyzing WCE video. Moreover, the visual inspection of small intestine motility is 4 times faster then by means of video skimming of the WCE. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB; OR; 600.046; 605.203 | Approved | no | ||
Call Number | Admin @ si @ DSM2012 | Serial | 2151 | ||
Permanent link to this record | |||||
Author | Lluis Pere de las Heras; Joan Mas; Gemma Sanchez; Ernest Valveny | ||||
Title | Notation-invariant patch-based wall detector in architectural floor plans | Type | Book Chapter | ||
Year | 2013 | Publication | Graphics Recognition. New Trends and Challenges | Abbreviated Journal | |
Volume | 7423 | Issue | Pages | 79--88 | |
Keywords | |||||
Abstract | Architectural floor plans exhibit a large variability in notation. Therefore, segmenting and identifying the elements of any kind of plan becomes a challenging task for approaches based on grouping structural primitives obtained by vectorization. Recently, a patch-based segmentation method working at pixel level and relying on the construction of a visual vocabulary has been proposed in [1], showing its adaptability to different notations by automatically learning the visual appearance of the elements in each different notation. This paper presents an evolution of that previous work, after analyzing and testing several alternatives for each of the different steps of the method: Firstly, an automatic plan-size normalization process is done. Secondly we evaluate different features to obtain the description of every patch. Thirdly, we train an SVM classifier to obtain the category of every patch instead of constructing a visual vocabulary. These variations of the method have been tested for wall detection on two datasets of architectural floor plans with different notations. After studying in deep each of the steps in the process pipeline, we are able to find the best system configuration, which highly outperforms the results on wall segmentation obtained by the original paper. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer Berlin Heidelberg | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | 0302-9743 | ISBN | 978-3-642-36823-3 | Medium | |
Area | Expedition | Conference | |||
Notes | DAG; 600.045; 600.056; 605.203 | Approved | no | ||
Call Number | Admin @ si @ HMS2013 | Serial | 2322 | ||
Permanent link to this record | |||||
Author | Jaume Amores | ||||
Title | Multiple Instance Classification: review, taxonomy and comparative study | Type | Journal Article | ||
Year | 2013 | Publication | Artificial Intelligence | Abbreviated Journal | AI |
Volume | 201 | Issue | Pages | 81-105 | |
Keywords | Multi-instance learning; Codebook; Bag-of-Words | ||||
Abstract | Multiple Instance Learning (MIL) has become an important topic in the pattern recognition community, and many solutions to this problemhave been proposed until now. Despite this fact, there is a lack of comparative studies that shed light into the characteristics and behavior of the different methods. In this work we provide such an analysis focused on the classification task (i.e.,leaving out other learning tasks such as regression). In order to perform our study, we implemented
fourteen methods grouped into three different families. We analyze the performance of the approaches across a variety of well-known databases, and we also study their behavior in synthetic scenarios in order to highlight their characteristics. As a result of this analysis, we conclude that methods that extract global bag-level information show a clearly superior performance in general. In this sense, the analysis permits us to understand why some types of methods are more successful than others, and it permits us to establish guidelines in the design of new MIL methods. |
||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Elsevier Science Publishers Ltd. Essex, UK | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0004-3702 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | ADAS; 601.042; 600.057 | Approved | no | ||
Call Number | Admin @ si @ Amo2013 | Serial | 2273 | ||
Permanent link to this record | |||||
Author | Hongxing Gao; Marçal Rusiñol; Dimosthenis Karatzas; Apostolos Antonacopoulos; Josep Llados | ||||
Title | An interactive appearance-based document retrieval system for historical newspapers | Type | Conference Article | ||
Year | 2013 | Publication | Proceedings of the International Conference on Computer Vision Theory and Applications | Abbreviated Journal | |
Volume | Issue | Pages | 84-87 | ||
Keywords | |||||
Abstract | In this paper we present a retrieval-based application aimed at assisting a user to semi-automatically segment an incoming flow of historical newspaper images by automatically detecting a particular type of pages based on their appearance. A visual descriptor is used to assess page similarity while a relevance feedback process allow refining the results iteratively. The application is tested on a large dataset of digitised historic newspapers. | ||||
Address | Barcelona; February 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | VISAPP | ||
Notes | DAG; 600.056; 600.045; 605.203 | Approved | no | ||
Call Number | Admin @ si @ GRK2013a | Serial | 2290 | ||
Permanent link to this record | |||||
Author | Carles Fernandez; Jordi Gonzalez; Joao Manuel R. S. Taveres; Xavier Roca | ||||
Title | Towards Ontological Cognitive System | Type | Book Chapter | ||
Year | 2013 | Publication | Topics in Medical Image Processing and Computational Vision | Abbreviated Journal | |
Volume | 8 | Issue | Pages | 87-99 | |
Keywords | |||||
Abstract | The increasing ubiquitousness of digital information in our daily lives has positioned video as a favored information vehicle, and given rise to an astonishing generation of social media and surveillance footage. This raises a series of technological demands for automatic video understanding and management, which together with the compromising attentional limitations of human operators, have motivated the research community to guide its steps towards a better attainment of such capabilities. As a result, current trends on cognitive vision promise to recognize complex events and self-adapt to different environments, while managing and integrating several types of knowledge. Future directions suggest to reinforce the multi-modal fusion of information sources and the communication with end-users. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer Netherlands | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 2212-9391 | ISBN | 978-94-007-0725-2 | Medium | |
Area | Expedition | Conference | |||
Notes | ISE; 605.203; 302.018; 600.049 | Approved | no | ||
Call Number | Admin @ si @ FGT2013 | Serial | 2287 | ||
Permanent link to this record | |||||
Author | Vitaliy Konovalov; Albert Clapes; Sergio Escalera | ||||
Title | Automatic Hand Detection in RGB-Depth Data Sequences | Type | Conference Article | ||
Year | 2013 | Publication | 16th Catalan Conference on Artificial Intelligence | Abbreviated Journal | |
Volume | Issue | Pages | 91-100 | ||
Keywords | |||||
Abstract | Detecting hands in multi-modal RGB-Depth visual data has become a challenging Computer Vision problem with several applications of interest. This task involves dealing with changes in illumination, viewpoint variations, the articulated nature of the human body, the high flexibility of the wrist articulation, and the deformability of the hand itself. In this work, we propose an accurate and efficient automatic hand detection scheme to be applied in Human-Computer Interaction (HCI) applications in which the user is seated at the desk and, thus, only the upper body is visible. Our main hypothesis is that hand landmarks remain at a nearly constant geodesic distance from an automatically located anatomical reference point.
In a given frame, the human body is segmented first in the depth image. Then, a graph representation of the body is built in which the geodesic paths are computed from the reference point. The dense optical flow vectors on the corresponding RGB image are used to reduce ambiguities of the geodesic paths’ connectivity, allowing to eliminate false edges interconnecting different body parts. Finally, we are able to detect the position of both hands based on invariant geodesic distances and optical flow within the body region, without involving costly learning procedures. |
||||
Address | Vic; October 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CCIA | ||
Notes | HuPBA;MILAB | Approved | no | ||
Call Number | Admin @ si @ KCE2013 | Serial | 2323 | ||
Permanent link to this record | |||||
Author | Bhaskar Chakraborty; Andrew Bagdanov; Jordi Gonzalez; Xavier Roca | ||||
Title | Human Action Recognition Using an Ensemble of Body-Part Detectors | Type | Journal Article | ||
Year | 2013 | Publication | Expert Systems | Abbreviated Journal | EXSY |
Volume | 30 | Issue | 2 | Pages | 101-114 |
Keywords | Human action recognition;body-part detection;hidden Markov model | ||||
Abstract | This paper describes an approach to human action recognition based on a probabilistic optimization model of body parts using hidden Markov model (HMM). Our method is able to distinguish between similar actions by only considering the body parts having major contribution to the actions, for example, legs for walking, jogging and running; arms for boxing, waving and clapping. We apply HMMs to model the stochastic movement of the body parts for action recognition. The HMM construction uses an ensemble of body-part detectors, followed by grouping of part detections, to perform human identification. Three example-based body-part detectors are trained to detect three components of the human body: the head, legs and arms. These detectors cope with viewpoint changes and self-occlusions through the use of ten sub-classifiers that detect body parts over a specific range of viewpoints. Each sub-classifier is a support vector machine trained on features selected for the discriminative power for each particular part/viewpoint combination. Grouping of these detections is performed using a simple geometric constraint model that yields a viewpoint-invariant human detector. We test our approach on three publicly available action datasets: the KTH dataset, Weizmann dataset and HumanEva dataset. Our results illustrate that with a simple and compact representation we can achieve robust recognition of human actions comparable to the most complex, state-of-the-art methods. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ISE | Approved | no | ||
Call Number | Admin @ si @ CBG2013 | Serial | 1809 | ||
Permanent link to this record |