2017 |
|
Jose Garcia-Rodriguez, Isabelle Guyon, Sergio Escalera, Alexandra Psarrou, Andrew Lewis, & Miguel Cazorla. (2017). Editorial: Special Issue on Computational Intelligence for Vision and Robotics. Neural Computing and Applications - Neural Computing and Applications, 28(5), 853–854.
|
|
|
Mohammad Ali Bagheri, Qigang Gao, Sergio Escalera, Huamin Ren, Thomas B. Moeslund, & Elham Etemad. (2017). Locality Regularized Group Sparse Coding for Action Recognition. CVIU - Computer Vision and Image Understanding, 158, 106–114.
Abstract: Bag of visual words (BoVW) models are widely utilized in image/ video representation and recognition. The cornerstone of these models is the encoding stage, in which local features are decomposed over a codebook in order to obtain a representation of features. In this paper, we propose a new encoding algorithm by jointly encoding the set of local descriptors of each sample and considering the locality structure of descriptors. The proposed method takes advantages of locality coding such as its stability and robustness to noise in descriptors, as well as the strengths of the group coding strategy by taking into account the potential relation among descriptors of a sample. To efficiently implement our proposed method, we consider the Alternating Direction Method of Multipliers (ADMM) framework, which results in quadratic complexity in the problem size. The method is employed for a challenging classification problem: action recognition by depth cameras. Experimental results demonstrate the outperformance of our methodology compared to the state-of-the-art on the considered datasets.
Keywords: Bag of words; Feature encoding; Locality constrained coding; Group sparse coding; Alternating direction method of multipliers; Action recognition
|
|
|
Xavier Perez Sala, Fernando De la Torre, Laura Igual, Sergio Escalera, & Cecilio Angulo. (2017). Subspace Procrustes Analysis. IJCV - International Journal of Computer Vision, 121(3), 327–343.
Abstract: Procrustes Analysis (PA) has been a popular technique to align and build 2-D statistical models of shapes. Given a set of 2-D shapes PA is applied to remove rigid transformations. Then, a non-rigid 2-D model is computed by modeling (e.g., PCA) the residual. Although PA has been widely used, it has several limitations for modeling 2-D shapes: occluded landmarks and missing data can result in local minima solutions, and there is no guarantee that the 2-D shapes provide a uniform sampling of the 3-D space of rotations for the object. To address previous issues, this paper proposes Subspace PA (SPA). Given several
instances of a 3-D object, SPA computes the mean and a 2-D subspace that can simultaneously model all rigid and non-rigid deformations of the 3-D object. We propose a discrete (DSPA) and continuous (CSPA) formulation for SPA, assuming that 3-D samples of an object are provided. DSPA extends the traditional PA, and produces unbiased 2-D models by uniformly sampling different views of the 3-D object. CSPA provides a continuous approach to uniformly sample the space of 3-D rotations, being more efficient in space and time. Experiments using SPA to learn 2-D models of bodies from motion capture data illustrate the benefits of our approach.
|
|
2016 |
|
Anastasios Doulamis, Nikolaos Doulamis, Marco Bertini, Jordi Gonzalez, & Thomas B. Moeslund. (2016). Introduction to the Special Issue on the Analysis and Retrieval of Events/Actions and Workflows in Video Streams. MTAP - Multimedia Tools and Applications, 75(22), 14985–14990.
|
|
|
Antonio Hernandez, Sergio Escalera, & Stan Sclaroff. (2016). Poselet-basedContextual Rescoring for Human Pose Estimation via Pictorial Structures. IJCV - International Journal of Computer Vision, 118(1), 49–64.
Abstract: In this paper we propose a contextual rescoring method for predicting the position of body parts in a human pose estimation framework. A set of poselets is incorporated in the model, and their detections are used to extract spatial and score-related features relative to other body part hypotheses. A method is proposed for the automatic discovery of a compact subset of poselets that covers the different poses in a set of validation images while maximizing precision. A rescoring mechanism is defined as a set-based boosting classifier that computes a new score for each body joint detection, given its relationship to detections of other body joints and mid-level parts in the image. This new score is incorporated in the pictorial structure model as an additional unary potential, following the recent work of Pishchulin et al. Experiments on two benchmarks show comparable results to Pishchulin et al. while reducing the size of the mid-level representation by an order of magnitude, reducing the execution time by 68 % accordingly.
Keywords: Contextual rescoring; Poselets; Human pose estimation
|
|