Home | [111–120] << 121 122 123 124 125 126 127 128 129 130 >> [131–140] |
![]() |
Records | |||||
---|---|---|---|---|---|
Author | Marc Bolaños; Maite Garolera; Petia Radeva | ||||
Title | Active labeling application applied to food-related object recognition | Type | Conference Article | ||
Year ![]() |
2013 | Publication | 5th International Workshop on Multimedia for Cooking & Eating Activities | Abbreviated Journal | |
Volume | Issue | Pages | 45-50 | ||
Keywords | |||||
Abstract | Every day, lifelogging devices, available for recording different aspects of our daily life, increase in number, quality and functions, just like the multiple applications that we give to them. Applying wearable devices to analyse the nutritional habits of people is a challenging application based on acquiring and analyzing life records in long periods of time. However, to extract the information of interest related to the eating patterns of people, we need automatic methods to process large amount of life-logging data (e.g. recognition of food-related objects). Creating a rich set of manually labeled samples to train the algorithms is slow, tedious and subjective. To address this problem, we propose a novel method in the framework of Active Labeling for construct- ing a training set of thousands of images. Inspired by the hierarchical sampling method for active learning [6], we propose an Active forest that organizes hierarchically the data for easy and fast labeling. Moreover, introducing a classifier into the hierarchical structures, as well as transforming the feature space for better data clustering, additionally im- prove the algorithm. Our method is successfully tested to label 89.700 food-related objects and achieves significant reduction in expert time labelling.
Active labeling application applied to food-related object recognition ResearchGate. Available from: http://www.researchgate.net/publication/262252017Activelabelingapplicationappliedtofood-relatedobjectrecognition [accessed Jul 14, 2015]. |
||||
Address | Barcelona; October 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ACM-CEA | ||
Notes | MILAB | Approved | no | ||
Call Number | Admin @ si @ BGR2013b | Serial | 2637 | ||
Permanent link to this record | |||||
Author | Mohammad Rouhani; Angel Sappa | ||||
Title | The Richer Representation the Better Registration | Type | Journal Article | ||
Year ![]() |
2013 | Publication | IEEE Transactions on Image Processing | Abbreviated Journal | TIP |
Volume | 22 | Issue | 12 | Pages | 5036-5049 |
Keywords | |||||
Abstract | In this paper, the registration problem is formulated as a point to model distance minimization. Unlike most of the existing works, which are based on minimizing a point-wise correspondence term, this formulation avoids the correspondence search that is time-consuming. In the first stage, the target set is described through an implicit function by employing a linear least squares fitting. This function can be either an implicit polynomial or an implicit B-spline from a coarse to fine representation. In the second stage, we show how the obtained implicit representation is used as an interface to convert point-to-point registration into point-to-implicit problem. Furthermore, we show that this registration distance is smooth and can be minimized through the Levengberg-Marquardt algorithm. All the formulations presented for both stages are compact and easy to implement. In addition, we show that our registration method can be handled using any implicit representation though some are coarse and others provide finer representations; hence, a tradeoff between speed and accuracy can be set by employing the right implicit function. Experimental results and comparisons in 2D and 3D show the robustness and the speed of convergence of the proposed approach. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1057-7149 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | ADAS | Approved | no | ||
Call Number | Admin @ si @ RoS2013 | Serial | 2665 | ||
Permanent link to this record | |||||
Author | Lluis Pere de las Heras; David Fernandez; Alicia Fornes; Ernest Valveny; Gemma Sanchez; Josep Llados | ||||
Title | Runlength Histogram Image Signature for Perceptual Retrieval of Architectural Floor Plans | Type | Conference Article | ||
Year ![]() |
2013 | Publication | 10th IAPR International Workshop on Graphics Recognition | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | |||||
Address | Bethlehem; PA; USA; August 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | GREC | ||
Notes | DAG; 600.045; 600.061; 600.056 | Approved | no | ||
Call Number | Admin @ si @ HFF2013b | Serial | 2695 | ||
Permanent link to this record | |||||
Author | Lluis Pere de las Heras; Ernest Valveny; Gemma Sanchez | ||||
Title | Unsupervised and Notation-Independent Wall Segmentation in Floor Plans Using a Combination of Statistical and Structural Strategies | Type | Conference Article | ||
Year ![]() |
2013 | Publication | 10th IAPR International Workshop on Graphics Recognition | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | |||||
Address | Bethlehem; PA; USA; August 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | GREC | ||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ HVS2013b | Serial | 2696 | ||
Permanent link to this record | |||||
Author | A.S. Coquel; Jean-Pascal Jacob; M. Primet; A. Demarez; Mariella Dimiccoli; T. Julou; L. Moisan; A. Lindner; H. Berry | ||||
Title | Localization of protein aggregation in Escherichia coli is governed by diffusion and nucleoid macromolecular crowding effect | Type | Journal Article | ||
Year ![]() |
2013 | Publication | Plos Computational Biology | Abbreviated Journal | PCB |
Volume | 9 | Issue | 4 | Pages | |
Keywords | |||||
Abstract | Aggregates of misfolded proteins are a hallmark of many age-related diseases. Recently, they have been linked to aging of Escherichia coli (E. coli) where protein aggregates accumulate at the old pole region of the aging bacterium. Because of the potential of E. coli as a model organism, elucidating aging and protein aggregation in this bacterium may pave the way to significant advances in our global understanding of aging. A first obstacle along this path is to decipher the mechanisms by which protein aggregates are targeted to specific intercellular locations. Here, using an integrated approach based on individual-based modeling, time-lapse fluorescence microscopy and automated image analysis, we show that the movement of aging-related protein aggregates in E. coli is purely diffusive (Brownian). Using single-particle tracking of protein aggregates in live E. coli cells, we estimated the average size and diffusion constant of the aggregates. Our results provide evidence that the aggregates passively diffuse within the cell, with diffusion constants that depend on their size in agreement with the Stokes-Einstein law. However, the aggregate displacements along the cell long axis are confined to a region that roughly corresponds to the nucleoid-free space in the cell pole, thus confirming the importance of increased macromolecular crowding in the nucleoids. We thus used 3D individual-based modeling to show that these three ingredients (diffusion, aggregation and diffusion hindrance in the nucleoids) are sufficient and necessary to reproduce the available experimental data on aggregate localization in the cells. Taken together, our results strongly support the hypothesis that the localization of aging-related protein aggregates in the poles of E. coli results from the coupling of passive diffusion-aggregation with spatially non-homogeneous macromolecular crowding. They further support the importance of “soft” intracellular structuring (based on macromolecular crowding) in diffusion-based protein localization in E. coli. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | : Stanislav Shvartsman, Princeton University, United States of America | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | Approved | no | |||
Call Number | Admin @ si @CJP2013 | Serial | 2786 | ||
Permanent link to this record | |||||
Author | Mariella Dimiccoli; Benoît Girard; Alain Berthoz; Daniel Bennequin | ||||
Title | Striola Magica: a functional explanation of otolith organs | Type | Journal Article | ||
Year ![]() |
2013 | Publication | Journal of Computational Neuroscience | Abbreviated Journal | JCN |
Volume | 35 | Issue | 2 | Pages | 125-154 |
Keywords | Otolith organs ;Striola; Vestibular pathway | ||||
Abstract | Otolith end organs of vertebrates sense linear accelerations of the head and gravitation. The hair cells on their epithelia are responsible for transduction. In mammals, the striola, parallel to the line where hair cells reverse their polarization, is a narrow region centered on a curve with curvature and torsion. It has been shown that the striolar region is functionally different from the rest, being involved in a phasic vestibular pathway. We propose a mathematical and computational model that explains the necessity of this amazing geometry for the striola to be able to carry out its function. Our hypothesis, related to the biophysics of the hair cells and to the physiology of their afferent neurons, is that striolar afferents collect information from several type I hair cells to detect the jerk in a large domain of acceleration directions. This predicts a mean number of two calyces for afferent neurons, as measured in rodents. The domain of acceleration directions sensed by our striolar model is compatible with the experimental results obtained on monkeys considering all afferents. Therefore, the main result of our study is that phasic and tonic vestibular afferents cover the same geometrical fields, but at different dynamical and frequency domains. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer US | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1573-6873. 2013 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | MILAB | Approved | no | ||
Call Number | Admin @ si @DBG2013 | Serial | 2787 | ||
Permanent link to this record | |||||
Author | Jose Manuel Alvarez; Theo Gevers; Antonio Lopez | ||||
Title | Evaluating Color Representation for Online Road Detection | Type | Conference Article | ||
Year ![]() |
2013 | Publication | ICCV Workshop on Computer Vision in Vehicle Technology: From Earth to Mars | Abbreviated Journal | |
Volume | Issue | Pages | 594-595 | ||
Keywords | |||||
Abstract | Detecting traversable road areas ahead a moving vehicle is a key process for modern autonomous driving systems. Most existing algorithms use color to classify pixels as road or background. These algorithms reduce the effect of lighting variations and weather conditions by exploiting the discriminant/invariant properties of different color representations. However, up to date, no comparison between these representations have been conducted. Therefore, in this paper, we perform an evaluation of existing color representations for road detection. More specifically, we focus on color planes derived from RGB data and their most com-
mon combinations. The evaluation is done on a set of 7000 road images acquired using an on-board camera in different real-driving situations. |
||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVVT:E2M | ||
Notes | ADAS;ISE | Approved | no | ||
Call Number | Admin @ si @ AGL2013 | Serial | 2794 | ||
Permanent link to this record | |||||
Author | Albert Gordo; Florent Perronnin; Yunchao Gong; Svetlana Lazebnik | ||||
Title | Asymmetric Distances for Binary Embeddings | Type | Journal Article | ||
Year ![]() |
2014 | Publication | IEEE Transactions on Pattern Analysis and Machine Intelligence | Abbreviated Journal | TPAMI |
Volume | 36 | Issue | 1 | Pages | 33-47 |
Keywords | |||||
Abstract | In large-scale query-by-example retrieval, embedding image signatures in a binary space offers two benefits: data compression and search efficiency. While most embedding algorithms binarize both query and database signatures, it has been noted that this is not strictly a requirement. Indeed, asymmetric schemes which binarize the database signatures but not the query still enjoy the same two benefits but may provide superior accuracy. In this work, we propose two general asymmetric distances which are applicable to a wide variety of embedding techniques including Locality Sensitive Hashing (LSH), Locality Sensitive Binary Codes (LSBC), Spectral Hashing (SH), PCA Embedding (PCAE), PCA Embedding with random rotations (PCAE-RR), and PCA Embedding with iterative quantization (PCAE-ITQ). We experiment on four public benchmarks containing up to 1M images and show that the proposed asymmetric distances consistently lead to large improvements over the symmetric Hamming distance for all binary embedding techniques. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0162-8828 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | DAG; 600.045; 605.203; 600.077 | Approved | no | ||
Call Number | Admin @ si @ GPG2014 | Serial | 2272 | ||
Permanent link to this record | |||||
Author | Jiaolong Xu; Sebastian Ramos; David Vazquez; Antonio Lopez | ||||
Title | Domain Adaptation of Deformable Part-Based Models | Type | Journal Article | ||
Year ![]() |
2014 | Publication | IEEE Transactions on Pattern Analysis and Machine Intelligence | Abbreviated Journal | TPAMI |
Volume | 36 | Issue | 12 | Pages | 2367-2380 |
Keywords | Domain Adaptation; Pedestrian Detection | ||||
Abstract | The accuracy of object classifiers can significantly drop when the training data (source domain) and the application scenario (target domain) have inherent differences. Therefore, adapting the classifiers to the scenario in which they must operate is of paramount importance. We present novel domain adaptation (DA) methods for object detection. As proof of concept, we focus on adapting the state-of-the-art deformable part-based model (DPM) for pedestrian detection. We introduce an adaptive structural SVM (A-SSVM) that adapts a pre-learned classifier between different domains. By taking into account the inherent structure in feature space (e.g., the parts in a DPM), we propose a structure-aware A-SSVM (SA-SSVM). Neither A-SSVM nor SA-SSVM needs to revisit the source-domain training data to perform the adaptation. Rather, a low number of target-domain training examples (e.g., pedestrians) are used. To address the scenario where there are no target-domain annotated samples, we propose a self-adaptive DPM based on a self-paced learning (SPL) strategy and a Gaussian Process Regression (GPR). Two types of adaptation tasks are assessed: from both synthetic pedestrians and general persons (PASCAL VOC) to pedestrians imaged from an on-board camera. Results show that our proposals avoid accuracy drops as high as 15 points when comparing adapted and non-adapted detectors. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0162-8828 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | ADAS; 600.057; 600.054; 601.217; 600.076 | Approved | no | ||
Call Number | ADAS @ adas @ XRV2014b | Serial | 2436 | ||
Permanent link to this record | |||||
Author | Santiago Segui; Michal Drozdzal; Ekaterina Zaytseva; Fernando Azpiroz; Petia Radeva; Jordi Vitria | ||||
Title | Detection of wrinkle frames in endoluminal videos using betweenness centrality measures for images | Type | Journal Article | ||
Year ![]() |
2014 | Publication | IEEE Transactions on Information Technology in Biomedicine | Abbreviated Journal | TITB |
Volume | 18 | Issue | 6 | Pages | 1831-1838 |
Keywords | Wireless Capsule Endoscopy; Small Bowel Motility Dysfunction; Contraction Detection; Structured Prediction; Betweenness Centrality | ||||
Abstract | Intestinal contractions are one of the most important events to diagnose motility pathologies of the small intestine. When visualized by wireless capsule endoscopy (WCE), the sequence of frames that represents a contraction is characterized by a clear wrinkle structure in the central frames that corresponds to the folding of the intestinal wall. In this paper we present a new method to robustly detect wrinkle frames in full WCE videos by using a new mid-level image descriptor that is based on a centrality measure proposed for graphs. We present an extended validation, carried out in a very large database, that shows that the proposed method achieves state of the art performance for this task. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | OR; MILAB; 600.046;MV | Approved | no | ||
Call Number | Admin @ si @ SDZ2014 | Serial | 2385 | ||
Permanent link to this record | |||||
Author | Josep Llados; Marçal Rusiñol | ||||
Title | Graphics Recognition Techniques | Type | Book Chapter | ||
Year ![]() |
2014 | Publication | Handbook of Document Image Processing and Recognition | Abbreviated Journal | |
Volume | D | Issue | Pages | 489-521 | |
Keywords | Dimension recognition; Graphics recognition; Graphic-rich documents; Polygonal approximation; Raster-to-vector conversion; Texture-based primitive extraction; Text-graphics separation | ||||
Abstract | This chapter describes the most relevant approaches for the analysis of graphical documents. The graphics recognition pipeline can be splitted into three tasks. The low level or lexical task extracts the basic units composing the document. The syntactic level is focused on the structure, i.e., how graphical entities are constructed, and involves the location and classification of the symbols present in the document. The third level is a functional or semantic level, i.e., it models what the graphical symbols do and what they mean in the context where they appear. This chapter covers the lexical level, while the next two chapters are devoted to the syntactic and semantic level, respectively. The main problems reviewed in this chapter are raster-to-vector conversion (vectorization algorithms) and the separation of text and graphics components. The research and industrial communities have provided standard methods achieving reasonable performance levels. Hence, graphics recognition techniques can be considered to be in a mature state from a scientific point of view. Additionally this chapter provides insights on some related problems, namely, the extraction and recognition of dimensions in engineering drawings, and the recognition of hatched and tiled patterns. Both problems are usually associated, even integrated, in the vectorization process. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer London | Place of Publication | Editor | D. Doermann; K. Tombre | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-0-85729-858-4 | Medium | ||
Area | Expedition | Conference | |||
Notes | DAG; 600.077 | Approved | no | ||
Call Number | Admin @ si @ LlR2014 | Serial | 2380 | ||
Permanent link to this record | |||||
Author | Lluis Pere de las Heras; Ahmed Sheraz; Marcus Liwicki; Ernest Valveny; Gemma Sanchez | ||||
Title | Statistical Segmentation and Structural Recognition for Floor Plan Interpretation | Type | Journal Article | ||
Year ![]() |
2014 | Publication | International Journal on Document Analysis and Recognition | Abbreviated Journal | IJDAR |
Volume | 17 | Issue | 3 | Pages | 221-237 |
Keywords | |||||
Abstract | A generic method for floor plan analysis and interpretation is presented in this article. The method, which is mainly inspired by the way engineers draw and interpret floor plans, applies two recognition steps in a bottom-up manner. First, basic building blocks, i.e., walls, doors, and windows are detected using a statistical patch-based segmentation approach. Second, a graph is generated, and structural pattern recognition techniques are applied to further locate the main entities, i.e., rooms of the building. The proposed approach is able to analyze any type of floor plan regardless of the notation used. We have evaluated our method on different publicly available datasets of real architectural floor plans with different notations. The overall detection and recognition accuracy is about 95 %, which is significantly better than any other state-of-the-art method. Our approach is generic enough such that it could be easily adopted to the recognition and interpretation of any other printed machine-generated structured documents. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer Berlin Heidelberg | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1433-2833 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | DAG; ADAS; 600.076; 600.077 | Approved | no | ||
Call Number | HSL2014 | Serial | 2370 | ||
Permanent link to this record | |||||
Author | Salvatore Tabbone; Oriol Ramos Terrades | ||||
Title | An Overview of Symbol Recognition | Type | Book Chapter | ||
Year ![]() |
2014 | Publication | Handbook of Document Image Processing and Recognition | Abbreviated Journal | |
Volume | D | Issue | Pages | 523-551 | |
Keywords | Pattern recognition; Shape descriptors; Structural descriptors; Symbolrecognition; Symbol spotting | ||||
Abstract | According to the Cambridge Dictionaries Online, a symbol is a sign, shape, or object that is used to represent something else. Symbol recognition is a subfield of general pattern recognition problems that focuses on identifying, detecting, and recognizing symbols in technical drawings, maps, or miscellaneous documents such as logos and musical scores. This chapter aims at providing the reader an overview of the different existing ways of describing and recognizing symbols and how the field has evolved to attain a certain degree of maturity. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer London | Place of Publication | Editor | D. Doermann; K. Tombre | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-0-85729-858-4 | Medium | ||
Area | Expedition | Conference | |||
Notes | DAG; 600.077 | Approved | no | ||
Call Number | Admin @ si @ TaT2014 | Serial | 2489 | ||
Permanent link to this record | |||||
Author | Palaiahnakote Shivakumara; Anjan Dutta; Chew Lim Tan; Umapada Pal | ||||
Title | Multi-oriented scene text detection in video based on wavelet and angle projection boundary growing | Type | Journal Article | ||
Year ![]() |
2014 | Publication | Multimedia Tools and Applications | Abbreviated Journal | MTAP |
Volume | 72 | Issue | 1 | Pages | 515-539 |
Keywords | |||||
Abstract | In this paper, we address two complex issues: 1) Text frame classification and 2) Multi-oriented text detection in video text frame. We first divide a video frame into 16 blocks and propose a combination of wavelet and median-moments with k-means clustering at the block level to identify probable text blocks. For each probable text block, the method applies the same combination of feature with k-means clustering over a sliding window running through the blocks to identify potential text candidates. We introduce a new idea of symmetry on text candidates in each block based on the observation that pixel distribution in text exhibits a symmetric pattern. The method integrates all blocks containing text candidates in the frame and then all text candidates are mapped on to a Sobel edge map of the original frame to obtain text representatives. To tackle the multi-orientation problem, we present a new method called Angle Projection Boundary Growing (APBG) which is an iterative algorithm and works based on a nearest neighbor concept. APBG is then applied on the text representatives to fix the bounding box for multi-oriented text lines in the video frame. Directional information is used to eliminate false positives. Experimental results on a variety of datasets such as non-horizontal, horizontal, publicly available data (Hua’s data) and ICDAR-03 competition data (camera images) show that the proposed method outperforms existing methods proposed for video and the state of the art methods for scene text as well. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer US | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1380-7501 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | DAG; 600.077 | Approved | no | ||
Call Number | Admin @ si @ SDT2014 | Serial | 2357 | ||
Permanent link to this record | |||||
Author | Antonio Hernandez; Miguel Angel Bautista; Xavier Perez Sala; Victor Ponce; Sergio Escalera; Xavier Baro; Oriol Pujol; Cecilio Angulo | ||||
Title | Probability-based Dynamic Time Warping and Bag-of-Visual-and-Depth-Words for Human Gesture Recognition in RGB-D | Type | Journal Article | ||
Year ![]() |
2014 | Publication | Pattern Recognition Letters | Abbreviated Journal | PRL |
Volume | 50 | Issue | 1 | Pages | 112-121 |
Keywords | RGB-D; Bag-of-Words; Dynamic Time Warping; Human Gesture Recognition | ||||
Abstract | PATREC5825
We present a methodology to address the problem of human gesture segmentation and recognition in video and depth image sequences. A Bag-of-Visual-and-Depth-Words (BoVDW) model is introduced as an extension of the Bag-of-Visual-Words (BoVW) model. State-of-the-art RGB and depth features, including a newly proposed depth descriptor, are analysed and combined in a late fusion form. The method is integrated in a Human Gesture Recognition pipeline, together with a novel probability-based Dynamic Time Warping (PDTW) algorithm which is used to perform prior segmentation of idle gestures. The proposed DTW variant uses samples of the same gesture category to build a Gaussian Mixture Model driven probabilistic model of that gesture class. Results of the whole Human Gesture Recognition pipeline in a public data set show better performance in comparison to both standard BoVW model and DTW approach. |
||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HuPBA;MV; 605.203 | Approved | no | ||
Call Number | Admin @ si @ HBP2014 | Serial | 2353 | ||
Permanent link to this record |