|
Bhaskar Chakraborty, Marco Pedersoli, & Jordi Gonzalez. (2008). View-Invariant Human Action Detection using Component-Wise HMM of Body Parts. In Articulated Motion and Deformable Objects, 5th International Conference (Vol. 5098, 208–217). LNCS.
|
|
|
Bhaskar Chakraborty, Michael Holte, Thomas B. Moeslund, & Jordi Gonzalez. (2012). Selective Spatio-Temporal Interest Points. CVIU - Computer Vision and Image Understanding, 116(3), 396–410.
Abstract: Recent progress in the field of human action recognition points towards the use of Spatio-TemporalInterestPoints (STIPs) for local descriptor-based recognition strategies. In this paper, we present a novel approach for robust and selective STIP detection, by applying surround suppression combined with local and temporal constraints. This new method is significantly different from existing STIP detection techniques and improves the performance by detecting more repeatable, stable and distinctive STIPs for human actors, while suppressing unwanted background STIPs. For action representation we use a bag-of-video words (BoV) model of local N-jet features to build a vocabulary of visual-words. To this end, we introduce a novel vocabulary building strategy by combining spatial pyramid and vocabulary compression techniques, resulting in improved performance and efficiency. Action class specific Support Vector Machine (SVM) classifiers are trained for categorization of human actions. A comprehensive set of experiments on popular benchmark datasets (KTH and Weizmann), more challenging datasets of complex scenes with background clutter and camera motion (CVC and CMU), movie and YouTube video clips (Hollywood 2 and YouTube), and complex scenes with multiple actors (MSR I and Multi-KTH), validates our approach and show state-of-the-art performance. Due to the unavailability of ground truth action annotation data for the Multi-KTH dataset, we introduce an actor specific spatio-temporal clustering of STIPs to address the problem of automatic action annotation of multiple simultaneous actors. Additionally, we perform cross-data action recognition by training on source datasets (KTH and Weizmann) and testing on completely different and more challenging target datasets (CVC, CMU, MSR I and Multi-KTH). This documents the robustness of our proposed approach in the realistic scenario, using separate training and test datasets, which in general has been a shortcoming in the performance evaluation of human action recognition techniques.
|
|
|
Bhaskar Chakraborty, Michael Holte, Thomas B. Moeslund, Jordi Gonzalez, & Xavier Roca. (2011). A Selective Spatio-Temporal Interest Point Detector for Human Action Recognition in Complex Scenes. In 13th IEEE International Conference on Computer Vision (pp. 1776–1783).
Abstract: Recent progress in the field of human action recognition points towards the use of Spatio-Temporal Interest Points (STIPs) for local descriptor-based recognition strategies. In this paper we present a new approach for STIP detection by applying surround suppression combined with local and temporal constraints. Our method is significantly different from existing STIP detectors and improves the performance by detecting more repeatable, stable and distinctive STIPs for human actors, while suppressing unwanted background STIPs. For action representation we use a bag-of-visual words (BoV) model of local N-jet features to build a vocabulary of visual-words. To this end, we introduce a novel vocabulary building strategy by combining spatial pyramid and vocabulary compression techniques, resulting in improved performance and efficiency. Action class specific Support Vector Machine (SVM) classifiers are trained for categorization of human actions. A comprehensive set of experiments on existing benchmark datasets, and more challenging datasets of complex scenes, validate our approach and show state-of-the-art performance.
|
|
|
Bhaskar Chakraborty, Ognjen Rudovic, & Jordi Gonzalez. (2008). View-Invariant Human-Body Detection with Extension to Human Action Recognition using Component-Wise HMM of Body Parts. In 8th IEEE International Conference on Automatic Face and Gesture Recognition.
|
|
|
Bogdan Raducanu, Alireza Bosaghzadeh, & Fadi Dornaika. (2015). Multi-observation Face Recognition in Videos based on Label Propagation. In 6th Workshop on Analysis and Modeling of Faces and Gestures AMFG2015 (pp. 10–17).
Abstract: In order to deal with the huge amount of content generated by social media, especially for indexing and retrieval purposes, the focus shifted from single object recognition to multi-observation object recognition. Of particular interest is the problem of face recognition (used as primary cue for persons’ identity assessment), since it is highly required by popular social media search engines like Facebook and Youtube. Recently, several approaches for graph-based label propagation were proposed. However, the associated graphs were constructed in an ad-hoc manner (e.g., using the KNN graph) that cannot cope properly with the rapid and frequent changes in data appearance, a phenomenon intrinsically related with video sequences. In this paper, we
propose a novel approach for efficient and adaptive graph construction, based on a two-phase scheme: (i) the first phase is used to adaptively find the neighbors of a sample and also to find the adequate weights for the minimization function of the second phase; (ii) in the second phase, the
selected neighbors along with their corresponding weights are used to locally and collaboratively estimate the sparse affinity matrix weights. Experimental results performed on Honda Video Database (HVDB) and a subset of video
sequences extracted from the popular TV-series ’Friends’ show a distinct advantage of the proposed method over the existing standard graph construction methods.
|
|
|
Bogdan Raducanu, Alireza Bosaghzadeh, & Fadi Dornaika. (2014). Facial Expression Recognition based on Multi-view Observations with Application to Social Robotics. In 1st Workshop on Computer Vision for Affective Computing (pp. 1–8).
Abstract: Human-robot interaction is a hot topic nowadays in the social robotics community. One crucial aspect is represented by the affective communication which comes encoded through the facial expressions. In this paper, we propose a novel approach for facial expression recognition, which exploits an efficient and adaptive graph-based label propagation (semi-supervised mode) in a multi-observation framework. The facial features are extracted using an appearance-based 3D face tracker, view- and texture independent. Our method has been extensively tested on the CMU dataset, and has been conveniently compared with other methods for graph construction. With the proposed approach, we developed an application for an AIBO robot, in which it mirrors the recognized facial
expression.
|
|
|
Bogdan Raducanu, & D. Gatica-Perez. (2012). Inferring competitive role patterns in reality TV show through nonverbal analysis. MTAP - Multimedia Tools and Applications, 56(1), 207–226.
Abstract: This paper introduces a new facet of social media, namely that depicting social interaction. More concretely, we address this problem from the perspective of nonverbal behavior-based analysis of competitive meetings. For our study, we made use of “The Apprentice” reality TV show, which features a competition for a real, highly paid corporate job. Our analysis is centered around two tasks regarding a person's role in a meeting: predicting the person with the highest status, and predicting the fired candidates. We address this problem by adopting both supervised and unsupervised strategies. The current study was carried out using nonverbal audio cues. Our approach is based only on the nonverbal interaction dynamics during the meeting without relying on the spoken words. The analysis is based on two types of data: individual and relational measures. Results obtained from the analysis of a full season of the show are promising (up to 85.7% of accuracy in the first case and up to 92.8% in the second case). Our approach has been conveniently compared with the Influence Model, demonstrating its superiority.
|
|
|
Bogdan Raducanu, & Fadi Dornaika. (2014). Embedding new observations via sparse-coding for non-linear manifold learning. PR - Pattern Recognition, 47(1), 480–492.
Abstract: Non-linear dimensionality reduction techniques are affected by two critical aspects: (i) the design of the adjacency graphs, and (ii) the embedding of new test data-the out-of-sample problem. For the first aspect, the proposed solutions, in general, were heuristically driven. For the second aspect, the difficulty resides in finding an accurate mapping that transfers unseen data samples into an existing manifold. Past works addressing these two aspects were heavily parametric in the sense that the optimal performance is only achieved for a suitable parameter choice that should be known in advance. In this paper, we demonstrate that the sparse representation theory not only serves for automatic graph construction as shown in recent works, but also represents an accurate alternative for out-of-sample embedding. Considering for a case study the Laplacian Eigenmaps, we applied our method to the face recognition problem. To evaluate the effectiveness of the proposed out-of-sample embedding, experiments are conducted using the K-nearest neighbor (KNN) and Kernel Support Vector Machines (KSVM) classifiers on six public face datasets. The experimental results show that the proposed model is able to achieve high categorization effectiveness as well as high consistency with non-linear embeddings/manifolds obtained in batch modes.
|
|
|
Bogdan Raducanu, & Fadi Dornaika. (2013). Texture-independent recognition of facial expressions in image snapshots and videos. MVA - Machine Vision and Applications, 24(4), 811–820.
Abstract: This paper addresses the static and dynamic recognition of basic facial expressions. It has two main contributions. First, we introduce a view- and texture-independent scheme that exploits facial action parameters estimated by an appearance-based 3D face tracker. We represent the learned facial actions associated with different facial expressions by time series. Second, we compare this dynamic scheme with a static one based on analyzing individual snapshots and show that the former performs better than the latter. We provide evaluations of performance using three subspace learning techniques: linear discriminant analysis, non-parametric discriminant analysis and support vector machines.
|
|
|
Bogdan Raducanu, & Fadi Dornaika. (2012). Pose-Invariant Face Recognition in Videos for Human-Machine Interaction. In 12th European Conference on Computer Vision (Vol. 7584, 566.575). LNCS. Springer Berlin Heidelberg.
Abstract: Human-machine interaction is a hot topic nowadays in the communities of computer vision and robotics. In this context, face recognition algorithms (used as primary cue for a person’s identity assessment) work well under controlled conditions but degrade significantly when tested in real-world environments. This is mostly due to the difficulty of simultaneously handling variations in illumination, pose, and occlusions. In this paper, we propose a novel approach for robust pose-invariant face recognition for human-robot interaction based on the real-time fitting of a 3D deformable model to input images taken from video sequences. More concrete, our approach generates a rectified face image irrespective with the actual head-pose orientation. Experimental results performed on Honda video database, using several manifold learning techniques, show a distinct advantage of the proposed method over the standard 2D appearance-based snapshot approach.
|
|
|
Bogdan Raducanu, & Fadi Dornaika. (2012). Appearance-based Face Recognition Using A Supervised Manifold Learning Framework. In IEEE Workshop on the Applications of Computer Vision (pp. 465–470). IEEE Xplore.
Abstract: Many natural image sets, depicting objects whose appearance is changing due to motion, pose or light variations, can be considered samples of a low-dimension nonlinear manifold embedded in the high-dimensional observation space (the space of all possible images). The main contribution of our work is represented by a Supervised Laplacian Eigemaps (S-LE) algorithm, which exploits the class label information for mapping the original data in the embedded space. Our proposed approach benefits from two important properties: i) it is discriminative, and ii) it adaptively selects the neighbors of a sample without using any predefined neighborhood size. Experiments were conducted on four face databases and the results demonstrate that the proposed algorithm significantly outperforms many linear and non-linear embedding techniques. Although we've focused on the face recognition problem, the proposed approach could also be extended to other category of objects characterized by large variance in their appearance.
|
|
|
Bogdan Raducanu, & Fadi Dornaika. (2012). A Supervised Non-linear Dimensionality Reduction Approach for Manifold Learning. PR - Pattern Recognition, 45(6), 2432–2444.
Abstract: IF= 2.61
IF=2.61 (2010)
In this paper we introduce a novel supervised manifold learning technique called Supervised Laplacian Eigenmaps (S-LE), which makes use of class label information to guide the procedure of non-linear dimensionality reduction by adopting the large margin concept. The graph Laplacian is split into two components: within-class graph and between-class graph to better characterize the discriminant property of the data. Our approach has two important characteristics: (i) it adaptively estimates the local neighborhood surrounding each sample based on data density and similarity and (ii) the objective function simultaneously maximizes the local margin between heterogeneous samples and pushes the homogeneous samples closer to each other.
Our approach has been tested on several challenging face databases and it has been conveniently compared with other linear and non-linear techniques, demonstrating its superiority. Although we have concentrated in this paper on the face recognition problem, the proposed approach could also be applied to other category of objects characterized by large variations in their appearance (such as hand or body pose, for instance.
|
|
|
Bogdan Raducanu, & Fadi Dornaika. (2012). Out-of-Sample Embedding by Sparse Representation. In Structural, Syntactic, and Statistical Pattern Recognition, Joint IAPR International Workshop (Vol. 7626, pp. 336–344). Springer Berlin Heidelberg.
Abstract: A critical aspect of non-linear dimensionality reduction techniques is represented by the construction of the adjacency graph. The difficulty resides in finding the optimal parameters, a process which, in general, is heuristically driven. Recently, sparse representation has been proposed as a non-parametric solution to overcome this problem. In this paper, we demonstrate that this approach not only serves for the graph construction, but also represents an efficient and accurate alternative for out-of-sample embedding. Considering for a case study the Laplacian Eigenmaps, we applied our method to the face recognition problem. Experimental results conducted on some challenging datasets confirmed the robustness of our approach and its superiority when compared to existing techniques.
|
|
|
Bogdan Raducanu, & Fadi Dornaika. (2011). A Discriminative Non-Linear Manifold Learning Technique for Face Recognition. In Informatics Engineering and Information Science (Vol. 254, pp. 339–353). Springer Berlin Heidelberg.
Abstract: In this paper we propose a novel non-linear discriminative analysis technique for manifold learning. The proposed approach is a discriminant version of Laplacian Eigenmaps which takes into account the class label information in order to guide the procedure of non-linear dimensionality reduction. By following the large margin concept, the graph Laplacian is split in two components: within-class graph and between-class graph to better characterize the discriminant property of the data.
Our approach has been tested on several challenging face databases and it has been conveniently compared with other linear and non-linear techniques. The experimental results confirm that our method outperforms, in general, the existing ones. Although we have concentrated in this paper on the face recognition problem, the proposed approach could also be applied to other category of objects characterized by large variance in their appearance.
|
|
|
Bogdan Raducanu, & Fadi Dornaika. (2010). Dynamic Facial Expression Recognition Using Laplacian Eigenmaps-Based Manifold Learning. In IEEE International Conference on Robotics and Automation (156–161).
Abstract: In this paper, we propose an integrated framework for tracking, modelling and recognition of facial expressions. The main contributions are: (i) a view- and texture independent scheme that exploits facial action parameters estimated by an appearance-based 3D face tracker; (ii) the complexity of the non-linear facial expression space is modelled through a manifold, whose structure is learned using Laplacian Eigenmaps. The projected facial expressions are afterwards recognized based on Nearest Neighbor classifier; (iii) with the proposed approach, we developed an application for an AIBO robot, in which it mirrors the perceived facial expression.
|
|