Author Dena Bazazian
  Title Fully Convolutional Networks for Text Understanding in Scene Images Type Book Whole
  Year 2018 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Text understanding in scene images has gained plenty of attention in the computer vision community and it is an important task in many applications as text carries semantically rich information about scene content and context. For instance, reading text in a scene can be applied to autonomous driving, scene understanding or assisting visually impaired people. The general aim of scene text understanding is to localize and recognize text in scene images. Text regions are first localized in the original image by a trained detector model and afterwards fed into a recognition module. The tasks of localization and recognition are highly correlated since an inaccurate localization can affect the recognition task.
The main purpose of this thesis is to devise efficient methods for scene text understanding. We investigate how the latest results on deep learning can advance text understanding pipelines. Recently, Fully Convolutional Networks (FCNs) and derived methods have achieved a significant performance on semantic segmentation and pixel level classification tasks. Therefore, we took benefit of the strengths of FCN approaches in order to detect text in natural scenes. In this thesis we have focused on two challenging tasks of scene text understanding which are Text Detection and Word Spotting. For the task of text detection, we have proposed an efficient text proposal technique in scene images. We have considered the Text Proposals method as the baseline which is an approach to reduce the search space of possible text regions in an image. In order to improve the Text Proposals method we combined it with Fully Convolutional Networks to efficiently reduce the number of proposals while maintaining the same level of accuracy and thus gaining a significant speed up. Our experiments demonstrate that this text proposal approach yields significantly higher recall rates than the line based text localization techniques, while also producing better-quality localization. We have also applied this technique on compressed images such as videos from wearable egocentric cameras. For the task of word spotting, we have introduced a novel mid-level word representation method. We have proposed a technique to create and exploit an intermediate representation of images based on text attributes which roughly correspond to character probability maps. Our representation extends the concept of Pyramidal Histogram Of Characters (PHOC) by exploiting Fully Convolutional Networks to derive a pixel-wise mapping of the character distribution within candidate word regions. We call this representation the Soft-PHOC. 
Furthermore, we show how to use Soft-PHOC descriptors for word spotting tasks through an efficient text line proposal algorithm. To evaluate the detected text, we propose a novel line based evaluation along with the classic bounding box based approach. We test our method on incidental scene text images, which comprise real-life scenarios such as urban scenes. The importance of incidental scene text images is due to the complexity of backgrounds, perspective, variety of script and language, short text and little linguistic context. All of these factors together make incidental scene text images challenging.
 
  Address November 2018  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Dimosthenis Karatzas;Andrew Bagdanov  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-948531-1-1 Medium  
  Area Expedition Conference  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ Baz2018 Serial (down) 3220  

Author Albert Clapes
  Title Learning to recognize human actions: from hand-crafted to deep-learning based visual representations Type Book Whole
  Year 2019 Publication PhD Thesis, Universitat de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Action recognition is a very challenging and important problem in computer vision. Researchers working in this field aspire to provide computers with the ability to visually perceive human actions – that is, to observe, interpret, and understand human-related events that occur in the physical environment merely from visual data. The applications of this technology are numerous: human-machine interaction, e-health, monitoring/surveillance, and content-based video retrieval, among others. Hand-crafted methods dominated the field until the appearance of the first successful deep learning-based action recognition works. Although earlier deep-based methods underperformed with respect to hand-crafted approaches, these slowly but steadily improved to become state-of-the-art, eventually achieving better results than hand-crafted ones. Still, hand-crafted approaches can be advantageous in certain scenarios, especially when not enough data is available to train very large deep models, or simply to be combined with deep-based methods to further boost the performance. Hand-crafted features can thus provide extra knowledge that deep networks are not able to learn easily about human actions.
This Thesis coincides in time with this change of paradigm and, hence, reflects it in two distinct parts. In the first part, we focus on improving current successful hand-crafted approaches for action recognition and we do so from three different perspectives, using the dense trajectories framework as a backbone. First, we explore the use of multi-modal and multi-view input data to enrich the trajectory descriptors. Second, we focus on the classification part of action recognition pipelines and propose an ensemble learning approach, where each classifier learns from a different set of local spatiotemporal features to then combine their outputs following a strategy based on the Dempster-Shafer Theory. And third, we propose a novel hand-crafted feature extraction method that constructs a mid-level feature description to better model long-term spatiotemporal dynamics within action videos. Moving to the second part of the Thesis, we start with a comprehensive study of the current deep-learning based action recognition methods. We review both fundamental and cutting edge methodologies reported during the last few years and introduce a taxonomy of deep-learning methods dedicated to action recognition. In particular, we analyze and discuss how these handle the temporal dimension of data. Last but not least, we propose a residual recurrent network for action recognition that naturally integrates all our previous findings in a powerful and promising framework.
 
  Address January 2019  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Sergio Escalera  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-948531-2-8 Medium  
  Area Expedition Conference  
  Notes HUPBA Approved no  
  Call Number Admin @ si @ Cla2019 Serial (down) 3219  

Author Aymen Azaza
  Title Context, Motion and Semantic Information for Computational Saliency Type Book Whole
  Year 2018 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract The main objective of this thesis is to highlight the salient object in an image or in a video sequence. We address three important, but in our opinion insufficiently investigated, aspects of saliency detection. Firstly, we start by extending previous research on saliency which explicitly models the information provided from the context. Then, we show the importance of explicit context modelling for saliency estimation. Several important works in saliency are based on the usage of object proposals. However, these methods focus on the saliency of the object proposal itself and ignore the context. To introduce context in such saliency approaches, we couple every object proposal with its direct context. This allows us to evaluate the importance of the immediate surround (context) for its saliency. We propose several saliency features which are computed from the context proposals including features based on omni-directional and horizontal context continuity. Secondly, we investigate the usage of top-down methods (high-level semantic information) for the task of saliency prediction since most computational methods are bottom-up or only include few semantic classes. We propose to consider a wider group of object classes. These objects represent important semantic information which we will exploit in our saliency prediction approach. Thirdly, we develop a method to detect video saliency by computing saliency from supervoxels and optical flow. In addition, we apply the context features developed in this thesis for video saliency detection. The method combines shape and motion features with our proposed context features. To summarize, we prove that extending object proposals with their direct context improves the task of saliency detection in both image and video data. Also the importance of the semantic information in saliency estimation is evaluated. Finally, we propose a new motion feature to detect saliency in video data. The three proposed novelties are evaluated on standard saliency benchmark datasets and are shown to improve with respect to state-of-the-art.
 
  Address October 2018  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Joost Van de Weijer;Ali Douik  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-945373-9-4 Medium  
  Area Expedition Conference  
  Notes LAMP; 600.120 Approved no  
  Call Number Admin @ si @ Aza2018 Serial (down) 3218  

Author Suman Ghosh
  Title Word Spotting and Recognition in Images from Heterogeneous Sources A Type Book Whole
  Year 2018 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Text has been the most common way of sharing information for ages. With the recent development of personal image databases and handwritten historic manuscripts, the demand for algorithms that make these databases accessible for browsing and indexing is on the rise. Enabling search in, or understanding of, large collections of manuscripts or image databases needs fast and robust methods. Researchers have found different ways to represent cropped words for understanding and matching, which work well when words are already segmented. However, there is no trivial way to extend these to non-segmented documents. In this thesis we explore different methods for text retrieval and recognition from unsegmented document and scene images. Two different ways of representation exist in the literature: one uses a fixed-length representation learned from cropped words, and the other a sequence of features of variable length. Throughout this thesis, we have studied both of these representations for their suitability in segmentation-free understanding of text. In the first part we focus on segmentation-free word spotting using a fixed-length representation. We extended the use of the successful PHOC (Pyramidal Histogram of Characters) representation to segmentation-free retrieval. In the second part of the thesis, we explore sequence-based features and, finally, we propose a unified solution where the same framework can generate both kinds of representations.  
  Address November 2018  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Ernest Valveny  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-948531-0-4 Medium  
  Area Expedition Conference  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ Gho2018 Serial (down) 3217  

Author Gholamreza Anbarjafari; Sergio Escalera
  Title Human-Robot Interaction: Theory and Application Type Book Whole
  Year 2018 Publication Human-Robot Interaction: Theory and Application Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-1-78923-316-2 Medium  
  Area Expedition Conference  
  Notes HUPBA Approved no  
  Call Number Admin @ si @ AnE2018 Serial (down) 3216  

Author Ester Fornells; Manuel De Armas; Maria Teresa Anguera; Sergio Escalera; Marcos Antonio Catalán; Josep Moya
  Title Desarrollo del proyecto del Consell Comarcal del Baix Llobregat “Buen Trato a las personas mayores y aquellas en situación de fragilidad con sufrimiento emocional: Hacia un envejecimiento saludable” Type Journal
  Year 2018 Publication Informaciones Psiquiatricas Abbreviated Journal  
  Volume 232 Issue Pages 47-59  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 0210-7279 ISBN Medium  
  Area Expedition Conference  
  Notes HUPBA; not mentioned Approved no  
  Call Number Admin @ si @ FAA2018 Serial (down) 3214  

Author Reza Azad; Maryam Asadi-Aghbolaghi; Shohreh Kasaei; Sergio Escalera
  Title Dynamic 3D Hand Gesture Recognition by Learning Weighted Depth Motion Maps Type Journal Article
  Year 2019 Publication IEEE Transactions on Circuits and Systems for Video Technology Abbreviated Journal TCSVT  
  Volume 29 Issue 6 Pages 1729-1740  
  Keywords Hand gesture recognition; Multilevel temporal sampling; Weighted depth motion map; Spatio-temporal description; VLAD encoding  
  Abstract Hand gesture recognition from sequences of depth maps is a challenging computer vision task because of the low inter-class and high intra-class variability, the different execution rates of each gesture, and the highly articulated nature of the human hand. In this paper, a multilevel temporal sampling (MTS) method is first proposed that is based on the motion energy of key-frames of depth sequences. As a result, long, middle, and short sequences are generated that contain the relevant gesture information. The MTS results in increasing the intra-class similarity while raising the inter-class dissimilarities. The weighted depth motion map (WDMM) is then proposed to extract the spatio-temporal information from generated summarized sequences by an accumulated weighted absolute difference of consecutive frames. The histogram of gradient (HOG) and local binary pattern (LBP) are exploited to extract features from WDMM. The obtained results define the current state-of-the-art on three public benchmark datasets for 3D hand gesture recognition: MSR Gesture 3D, SKIG, and MSR Action 3D. We also achieve competitive results on the NTU action dataset.  
  Address June 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HUPBA; no proj Approved no  
  Call Number Admin @ si @ AAK2018 Serial (down) 3213  

Author Rain Eric Haamer; Eka Rusadze; Iiris Lusi; Tauseef Ahmed; Sergio Escalera; Gholamreza Anbarjafari
  Title Review on Emotion Recognition Databases Type Book Chapter
  Year 2018 Publication Human-Robot Interaction: Theory and Application Abbreviated Journal  
  Volume Issue Pages  
  Keywords emotion; computer vision; databases  
  Abstract Over the past few decades human-computer interaction has become more important in our daily lives and research has developed in many directions: memory research, depression detection, behavioural deficiency detection, lie detection, (hidden) emotion recognition, etc. Because of that, the number of generic emotion and face databases, or those tailored to specific needs, has grown immensely. Thus, a comprehensive yet compact guide is needed to help researchers find the most suitable database and understand what types of databases already exist. In this paper, different elicitation methods are discussed and the databases are primarily organized into neat and informative tables based on the format.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-1-78923-316-2 Medium  
  Area Expedition Conference  
  Notes HUPBA; 602.133 Approved no  
  Call Number Admin @ si @ HRL2018 Serial (down) 3212  

Author Gabriela Ramirez; Esau Villatoro; Bogdan Ionescu; Hugo Jair Escalante; Sergio Escalera; Martha Larson; Henning Muller; Isabelle Guyon
  Title Overview of the Multimedia Information Processing for Personality & Social Networks Analysis Contest Type Conference Article
  Year 2018 Publication Multimedia Information Processing for Personality and Social Networks Analysis (MIPPSNA 2018) Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address Beijing; China; August 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPRW  
  Notes HUPBA Approved no  
  Call Number Admin @ si @ RVI2018 Serial (down) 3211  

Author Yagmur Gucluturk; Umut Guclu; Xavier Baro; Hugo Jair Escalante; Isabelle Guyon; Sergio Escalera; Marcel A. J. van Gerven; Rob van Lier
  Title Multimodal First Impression Analysis with Deep Residual Networks Type Journal Article
  Year 2018 Publication IEEE Transactions on Affective Computing Abbreviated Journal TAC  
  Volume 8 Issue 3 Pages 316-329  
  Keywords  
  Abstract People form first impressions about the personalities of unfamiliar individuals even after very brief interactions with them. In this study we present and evaluate several models that mimic this automatic social behavior. Specifically, we present several models trained on a large dataset of short YouTube video blog posts for predicting apparent Big Five personality traits of people and whether they seem suitable to be recommended to a job interview. Along with presenting our audiovisual approach and results that won third place in the ChaLearn First Impressions Challenge, we investigate modeling in different modalities including audio only, visual only, language only, audiovisual, and a combination of audiovisual and language. Our results demonstrate that the best performance could be obtained using a fusion of all data modalities. Finally, in order to promote explainability in machine learning and to provide an example for the upcoming ChaLearn challenges, we present a simple approach for explaining the predictions for job interview recommendations.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HUPBA; no proj Approved no  
  Call Number Admin @ si @ GGB2018 Serial (down) 3210  

Author Cristina Palmero; Javier Selva; Mohammad Ali Bagheri; Sergio Escalera
  Title Recurrent CNN for 3D Gaze Estimation using Appearance and Shape Cues Type Conference Article
  Year 2018 Publication 29th British Machine Vision Conference Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Gaze behavior is an important non-verbal cue in social signal processing and human-computer interaction. In this paper, we tackle the problem of person- and head pose-independent 3D gaze estimation from remote cameras, using a multi-modal recurrent convolutional neural network (CNN). We propose to combine face, eyes region, and face landmarks as individual streams in a CNN to estimate gaze in still images. Then, we exploit the dynamic nature of gaze by feeding the learned features of all the frames in a sequence to a many-to-one recurrent module that predicts the 3D gaze vector of the last frame. Our multi-modal static solution is evaluated on a wide range of head poses and gaze directions, achieving a significant improvement of 14.6% over the state of the art on the EYEDIAP dataset, further improved by 4% when the temporal modality is included.
 
  Address Newcastle; UK; September 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference BMVC  
  Notes HUPBA; no proj Approved no  
  Call Number Admin @ si @ PSB2018 Serial (down) 3208  

Author Mohamed Ilyes Lakhal; Albert Clapes; Sergio Escalera; Oswald Lanz; Andrea Cavallaro
  Title Residual Stacked RNNs for Action Recognition Type Conference Article
  Year 2018 Publication 9th International Workshop on Human Behavior Understanding Abbreviated Journal  
  Volume Issue Pages 534-548  
  Keywords Action recognition; Deep residual learning; Two-stream RNN  
  Abstract Action recognition pipelines that use Recurrent Neural Networks (RNN) are currently 5–10% less accurate than Convolutional Neural Networks (CNN). While most works that use RNNs employ a 2D CNN on each frame to extract descriptors for action recognition, we extract spatiotemporal features from a 3D CNN and then learn the temporal relationship of these descriptors through a stacked residual recurrent neural network (Res-RNN). We introduce for the first time residual learning to counter the degradation problem in multi-layer RNNs, which have been successful for temporal aggregation in two-stream action recognition pipelines. Finally, we use a late fusion strategy to combine RGB and optical flow data of the two-stream Res-RNN. Experimental results show that the proposed pipeline achieves competitive results on UCF-101 and state-of-the-art results for RNN-like architectures on the challenging HMDB-51 dataset.  
  Address Munich; September 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ECCVW  
  Notes HUPBA; no proj Approved no  
  Call Number Admin @ si @ LCE2018b Serial (down) 3206  

Author Ciprian Corneanu; Meysam Madadi; Sergio Escalera
  Title Deep Structure Inference Network for Facial Action Unit Recognition Type Conference Article
  Year 2018 Publication 15th European Conference on Computer Vision Abbreviated Journal  
  Volume 11216 Issue Pages 309-324  
  Keywords Computer Vision; Machine Learning; Deep Learning; Facial Expression Analysis; Facial Action Units; Structure Inference  
  Abstract Facial expressions are combinations of basic components called Action Units (AU). Recognizing AUs is key for general facial expression analysis. Recently, efforts in automatic AU recognition have been dedicated to learning combinations of local features and to exploiting correlations between AUs. We propose a deep neural architecture that tackles both problems by combining learned local and global features in its initial stages and replicating a message passing algorithm between classes similar to a graphical model inference approach in later stages. We show that by training the model end-to-end with increased supervision we improve state-of-the-art performance by 5.3% and 8.2% on the BP4D and DISFA datasets, respectively.  
  Address Munich; September 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ECCV  
  Notes HUPBA; no proj Approved no  
  Call Number Admin @ si @ CME2018 Serial (down) 3205  

Author Marc Oliu; Javier Selva; Sergio Escalera
  Title Folded Recurrent Neural Networks for Future Video Prediction Type Conference Article
  Year 2018 Publication 15th European Conference on Computer Vision Abbreviated Journal  
  Volume 11218 Issue Pages 745-761  
  Keywords  
  Abstract Future video prediction is an ill-posed Computer Vision problem that has recently received much attention. Its main challenges are the high variability in video content, the propagation of errors through time, and the non-specificity of the future frames: given a sequence of past frames there is a continuous distribution of possible futures. This work introduces bijective Gated Recurrent Units, a double mapping between the input and output of a GRU layer. This allows for recurrent auto-encoders with state sharing between encoder and decoder, stratifying the sequence representation and helping to prevent capacity problems. We show how with this topology only the encoder or decoder needs to be applied for input encoding and prediction, respectively. This reduces the computational cost and avoids re-encoding the predictions when generating a sequence of frames, mitigating the propagation of errors. Furthermore, it is possible to remove layers from an already trained model, giving an insight into the role performed by each layer and making the model more explainable. We evaluate our approach on three video datasets, outperforming state-of-the-art prediction results on MMNIST and UCF101, and obtaining competitive results on KTH with 2 and 3 times less memory usage and computational cost than the best scored approach.  
  Address Munich; September 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ECCV  
  Notes HUPBA; not mentioned Approved no  
  Call Number Admin @ si @ OSE2018 Serial (down) 3204  

Author Meysam Madadi; Sergio Escalera; Alex Carruesco Llorens; Carlos Andujar; Xavier Baro; Jordi Gonzalez
  Title Top-down model fitting for hand pose recovery in sequences of depth images Type Journal Article
  Year 2018 Publication Image and Vision Computing Abbreviated Journal IMAVIS  
  Volume 79 Issue Pages 63-75  
  Keywords  
  Abstract State-of-the-art approaches on hand pose estimation from depth images have reported promising results under quite controlled considerations. In this paper we propose a two-step pipeline for recovering the hand pose from a sequence of depth images. The pipeline has been designed to deal with images taken from any viewpoint and exhibiting a high degree of finger occlusion. In a first step we initialize the hand pose using a part-based model, fitting a set of hand components in the depth images. In a second step we consider temporal data and estimate the parameters of a trained bilinear model consisting of shape and trajectory bases. We evaluate our approach on a newly created synthetic hand dataset along with the NYU and MSRA real datasets. Results demonstrate that the proposed method outperforms the most recent pose recovering approaches, including those based on CNNs.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HUPBA; 600.098 Approved no  
  Call Number Admin @ si @ MEC2018 Serial (down) 3203  