|
Josep Llados, Enric Marti, & Juan J.Villanueva. (2001). Symbol recognition by error-tolerant subgraph matching between region adjacency graphs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(10), 1137–1143.
Abstract: The recognition of symbols in graphic documents is an intensive research activity in the community of pattern recognition and document analysis. A key issue in the interpretation of maps, engineering drawings, diagrams, etc. is the recognition of domain dependent symbols according to a symbol database. In this work we first review the most outstanding symbol recognition methods from two different points of view: application domains and pattern recognition methods. In the second part of the paper, open and unaddressed problems involved in symbol recognition are described, analyzing their current state of art and discussing future research challenges. Thus, issues such as symbol representation, matching, segmentation, learning, scalability of recognition methods and performance evaluation are addressed in this work. Finally, we discuss the perspectives of symbol recognition concerning to new paradigms such as user interfaces in handheld computers or document database and WWW indexing by graphical content.
|
|
|
Ariel Amato, Mikhail Mozerov, Andrew Bagdanov, & Jordi Gonzalez. (2011). Accurate Moving Cast Shadow Suppression Based on Local Color Constancy detection. TIP - IEEE Transactions on Image Processing, 20(10), 2954–2966.
Abstract: This paper describes a novel framework for detection and suppression of properly shadowed regions for most possible scenarios occurring in real video sequences. Our approach requires no prior knowledge about the scene, nor is it restricted to specific scene structures. Furthermore, the technique can detect both achromatic and chromatic shadows even in the presence of camouflage that occurs when foreground regions are very similar in color to shadowed regions. The method exploits local color constancy properties due to reflectance suppression over shadowed regions. To detect shadowed regions in a scene, the values of the background image are divided by values of the current frame in the RGB color space. We show how this luminance ratio can be used to identify segments with low gradient constancy, which in turn distinguish shadows from foreground. Experimental results on a collection of publicly available datasets illustrate the superior performance of our method compared with the most sophisticated, state-of-the-art shadow detection algorithms. These results show that our approach is robust and accurate over a broad range of shadow types and challenging video conditions.
|
|
|
Bhaskar Chakraborty, Jordi Gonzalez, & Xavier Roca. (2013). Large scale continuous visual event recognition using max-margin Hough transformation framework. CVIU - Computer Vision and Image Understanding, 117(10), 1356–1368.
Abstract: In this paper we propose a novel method for continuous visual event recognition (CVER) on a large scale video dataset using max-margin Hough transformation framework. Due to high scalability, diverse real environmental state and wide scene variability direct application of action recognition/detection methods such as spatio-temporal interest point (STIP)-local feature based technique, on the whole dataset is practically infeasible. To address this problem, we apply a motion region extraction technique which is based on motion segmentation and region clustering to identify possible candidate “event of interest” as a preprocessing step. On these candidate regions a STIP detector is applied and local motion features are computed. For activity representation we use generalized Hough transform framework where each feature point casts a weighted vote for possible activity class centre. A max-margin frame work is applied to learn the feature codebook weight. For activity detection, peaks in the Hough voting space are taken into account and initial event hypothesis is generated using the spatio-temporal information of the participating STIPs. For event recognition a verification Support Vector Machine is used. An extensive evaluation on benchmark large scale video surveillance dataset (VIRAT) and as well on a small scale benchmark dataset (MSR) shows that the proposed method is applicable on a wide range of continuous visual event recognition applications having extremely challenging conditions.
|
|
|
Ignasi Rius, Jordi Gonzalez, Javier Varona, & Xavier Roca. (2009). Action-specific motion prior for efficient bayesian 3D human body tracking. PR - Pattern Recognition, 42(11), 2907–2921.
Abstract: In this paper, we aim to reconstruct the 3D motion parameters of a human body
model from the known 2D positions of a reduced set of joints in the image plane.
Towards this end, an action-specific motion model is trained from a database of real
motion-captured performances. The learnt motion model is used within a particle
filtering framework as a priori knowledge on human motion. First, our dynamic
model guides the particles according to similar situations previously learnt. Then, the solution space is constrained so only feasible human postures are accepted as valid solutions at each time step. As a result, we are able to track the 3D configuration of the full human body from several cycles of walking motion sequences using only the 2D positions of a very reduced set of joints from lateral or frontal viewpoints.
|
|
|
Oscar Lopes, Miguel Reyes, Sergio Escalera, & Jordi Gonzalez. (2014). Spherical Blurred Shape Model for 3-D Object and Pose Recognition: Quantitative Analysis and HCI Applications in Smart Environments. TSMCB - IEEE Transactions on Systems, Man and Cybernetics (Part B), 44(12), 2379–2390.
Abstract: The use of depth maps is of increasing interest after the advent of cheap multisensor devices based on structured light, such as Kinect. In this context, there is a strong need of powerful 3-D shape descriptors able to generate rich object representations. Although several 3-D descriptors have been already proposed in the literature, the research of discriminative and computationally efficient descriptors is still an open issue. In this paper, we propose a novel point cloud descriptor called spherical blurred shape model (SBSM) that successfully encodes the structure density and local variabilities of an object based on shape voxel distances and a neighborhood propagation strategy. The proposed SBSM is proven to be rotation and scale invariant, robust to noise and occlusions, highly discriminative for multiple categories of complex objects like the human hand, and computationally efficient since the SBSM complexity is linear to the number of object voxels. Experimental evaluation in public depth multiclass object data, 3-D facial expressions data, and a novel hand poses data sets show significant performance improvements in relation to state-of-the-art approaches. Moreover, the effectiveness of the proposal is also proved for object spotting in 3-D scenes and for real-time automatic hand pose recognition in human computer interaction scenarios.
|
|