Wenjuan Gong, Jordi Gonzalez, Joao Manuel R. S. Taveres, & Xavier Roca. (2012). A New Image Dataset on Human Interactions. In 7th Conference on Articulated Motion and Deformable Objects (Vol. 7378, pp. 204–209). Springer Berlin Heidelberg.
Abstract: This article describes a new collection of still image dataset which are dedicated to interactions between people. Human action recognition from still images have been a hot topic recently, but most of them are actions performed by a single person, like running, walking, riding bikes, phoning and so on and there is no interactions between people in one image. The dataset collected in this paper are concentrating on human interaction between two people aiming to explore this new topic in the research area of action recognition from still images.
|
Sergio Escalera. (2012). Human Behavior Analysis From Depth Maps. In F.J. Perales, R.B. Fisher, & T.B. Moeslund (Eds.), 7th Conference on Articulated Motion and Deformable Objects (Vol. 7378, pp. 282–292). Springer Heidelberg.
Abstract: Pose Recovery (PR) and Human Behavior Analysis (HBA) have been a main focus of interest from the beginnings of Computer Vision and Machine Learning. PR and HBA were originally addressed by the analysis of still images and image sequences. More recent strategies consisted of Motion Capture technology (MOCAP), based on the synchronization of multiple cameras in controlled environments; and the analysis of depth maps from Time-of-Flight (ToF) technology, based on range image recording from distance sensor measurements. Recently, with the appearance of the multi-modal RGBD information provided by the low cost Kinect \textsfTM sensor (from RGB and Depth, respectively), classical methods for PR and HBA have been redefined, and new strategies have been proposed. In this paper, the recent contributions and future trends of multi-modal RGBD data analysis for PR and HBA are reviewed and discussed.
|
Mohammad Ali Bagheri, Qigang Gao, & Sergio Escalera. (2012). Efficient pairwise classification using Local Cross Off strategy. In 25th Canadian Conference on Artificial Intelligence (Vol. 7310, pp. 25–36). LNCS.
Abstract: The pairwise classification approach tends to perform better than other well-known approaches when dealing with multiclass classification problems. In the pairwise approach, however, the nuisance votes of many irrelevant classifiers may result in a wrong prediction class. To overcome this problem, a novel method, Local Crossing Off (LCO), is presented and evaluated in this paper. The proposed LCO system takes advantage of nearest neighbor classification algorithm because of its simplicity and speed, as well as the strength of other two powerful binary classifiers to discriminate between two classes. This paper provides a set of experimental results on 20 datasets using two base learners: Neural Networks and Support Vector Machines. The results show that the proposed technique not only achieves better classification accuracy, but also is computationally more efficient for tackling classification problems which have a relatively large number of target classes.
|
Ekaterina Zaytseva, Santiago Segui, & Jordi Vitria. (2012). Sketchable Histograms of Oriented Gradients for Object Detection. In 17th Iberomerican Conference on Pattern Recognition (Vol. 7441, pp. 374–381). Springer Berlin Heidelberg.
Abstract: In this paper we investigate a new representation approach for visual object recognition. The new representation, called sketchable-HoG, extends the classical histogram of oriented gradients (HoG) feature by adding two different aspects: the stability of the majority orientation and the continuity of gradient orientations. In this way, the sketchable-HoG locally characterizes the complexity of an object model and introduces global structure information while still keeping simplicity, compactness and robustness. We evaluated the proposed image descriptor on publicly Catltech 101 dataset. The obtained results outperforms classical HoG descriptor as well as other reported descriptors in the literature.
|
Volkmar Frinken, Alicia Fornes, Josep Llados, & Jean-Marc Ogier. (2012). Bidirectional Language Model for Handwriting Recognition. In Structural, Syntactic, and Statistical Pattern Recognition, Joint IAPR International Workshop (Vol. 7626, pp. 611–619). LNCS. Springer Berlin Heidelberg.
Abstract: In order to improve the results of automatically recognized handwritten text, information about the language is commonly included in the recognition process. A common approach is to represent a text line as a sequence. It is processed in one direction and the language information via n-grams is directly included in the decoding. This approach, however, only uses context on one side to estimate a word’s probability. Therefore, we propose a bidirectional recognition in this paper, using distinct forward and a backward language models. By combining decoding hypotheses from both directions, we achieve a significant increase in recognition accuracy for the off-line writer independent handwriting recognition task. Both language models are of the same type and can be estimated on the same corpus. Hence, the increase in recognition accuracy comes without any additional need for training data or language modeling complexity.
|
Laura Igual, Joan Carles Soliva, Roger Gimeno, Sergio Escalera, Oscar Vilarroya, & Petia Radeva. (2012). Automatic Internal Segmentation of Caudate Nucleus for Diagnosis of Attention Deficit Hyperactivity Disorder. In 9th International Conference on Image Analysis and Recognition (Vol. 7325, pp. 222–229). LNCS.
Abstract: Poster
Studies on volumetric brain Magnetic Resonance Imaging (MRI) showed neuroanatomical abnormalities in pediatric Attention-Deficit/Hyperactivity Disorder (ADHD). In particular, the diminished right caudate volume is one of the most replicated findings among ADHD samples in morphometric MRI studies. In this paper, we propose a fully-automatic method for internal caudate nucleus segmentation based on machine learning. Moreover, the ratio between right caudate body volume and the bilateral caudate body volume is applied in a ADHD diagnostic test. We separately validate the automatic internal segmentation of caudate in head and body structures and the diagnostic test using real data from ADHD and control subjects. As a result, we show accurate internal caudate segmentation and similar performance among the proposed automatic diagnostic test and the manual annotation.
|
Fahad Shahbaz Khan, Joost Van de Weijer, Sadiq Ali, & Michael Felsberg. (2013). Evaluating the impact of color on texture recognition. In 15th International Conference on Computer Analysis of Images and Patterns (Vol. 8047, pp. 154–162). Springer Berlin Heidelberg.
Abstract: State-of-the-art texture descriptors typically operate on grey scale images while ignoring color information. A common way to obtain a joint color-texture representation is to combine the two visual cues at the pixel level. However, such an approach provides sub-optimal results for texture categorisation task.
In this paper we investigate how to optimally exploit color information for texture recognition. We evaluate a variety of color descriptors, popular in image classification, for texture categorisation. In addition we analyze different fusion approaches to combine color and texture cues. Experiments are conducted on the challenging scenes and 10 class texture datasets. Our experiments clearly suggest that in all cases color names provide the best performance. Late fusion is the best strategy to combine color and texture. By selecting the best color descriptor with optimal fusion strategy provides a gain of 5% to 8% compared to texture alone on scenes and texture datasets.
Keywords: Color; Texture; image representation
|
Francesco Ciompi, Simone Balocco, Carles Caus, Josepa Mauri, & Petia Radeva. (2013). Stent shape estimation through a comprehensive interpretation of intravascular ultrasound images. In 16th International Conference on Medical Image Computing and Computer Assisted Intervention (Vol. 8150, pp. 345–352). LNCS. Springer Berlin Heidelberg.
Abstract: We present a method for automatic struts detection and stent shape estimation in cross-sectional intravascular ultrasound images. A stent shape is first estimated through a comprehensive interpretation of the vessel morphology, performed using a supervised context-aware multi-class classification scheme. Then, the successive strut identification exploits both local appearance and the defined stent shape. The method is tested on 589 images obtained from 80 patients, achieving a F-measure of 74.1% and an averaged distance between manual and automatic struts of 0.10 mm.
|
D.Sanchez, J.C.Ortega, & Miguel Angel Bautista. (2013). Human Body Segmentation with Multi-limb Error-Correcting Output Codes Detection and Graph Cuts Optimization. In 6th Iberian Conference on Pattern Recognition and Image Analysis (Vol. 7887, pp. 50–58). LNCS. Springer Berlin Heidelberg.
Abstract: Human body segmentation is a hard task because of the high variability in appearance produced by changes in the point of view, lighting conditions, and number of articulations of the human body. In this paper, we propose a two-stage approach for the segmentation of the human body. In a first step, a set of human limbs are described, normalized to be rotation invariant, and trained using cascade of classifiers to be split in a tree structure way. Once the tree structure is trained, it is included in a ternary Error-Correcting Output Codes (ECOC) framework. This first classification step is applied in a windowing way on a new test image, defining a body-like probability map, which is used as an initialization of a GMM color modelling and binary Graph Cuts optimization procedure. The proposed methodology is tested in a novel limb-labelled data set. Results show performance improvements of the novel approach in comparison to classical cascade of classifiers and human detector-based Graph Cuts segmentation approaches.
Keywords: Human Body Segmentation; Error-Correcting Output Codes; Cascade of Classifiers; Graph Cuts
|
Mohammad Ali Bagheri, Qigang Gao, & Sergio Escalera. (2013). Logo recognition Based on the Dempster-Shafer Fusion of Multiple Classifiers. In 26th Canadian Conference on Artificial Intelligence (Vol. 7884, pp. 1–12). Springer Berlin Heidelberg.
Abstract: Best paper award
The performance of different feature extraction and shape description methods in trademark image recognition systems have been studied by several researchers. However, the potential improvement in classification through feature fusion by ensemble-based methods has remained unattended. In this work, we evaluate the performance of an ensemble of three classifiers, each trained on different feature sets. Three promising shape description techniques, including Zernike moments, generic Fourier descriptors, and shape signature are used to extract informative features from logo images, and each set of features is fed into an individual classifier. In order to reduce recognition error, a powerful combination strategy based on the Dempster-Shafer theory is utilized to fuse the three classifiers trained on different sources of information. This combination strategy can effectively make use of diversity of base learners generated with different set of features. The recognition results of the individual classifiers are compared with those obtained from fusing the classifiers’ output, showing significant performance improvements of the proposed methodology.
Keywords: Logo recognition; ensemble classification; Dempster-Shafer fusion; Zernike moments; generic Fourier descriptor; shape signature
|
Naveen Onkarappa, & Angel Sappa. (2013). Laplacian Derivative based Regularization for Optical Flow Estimation in Driving Scenario. In 15th International Conference on Computer Analysis of Images and Patterns (Vol. 8048, pp. 483–490). LNCS. Springer Berlin Heidelberg.
Abstract: Existing state of the art optical flow approaches, which are evaluated on standard datasets such as Middlebury, not necessarily have a similar performance when evaluated on driving scenarios. This drop on performance is due to several challenges arising on real scenarios during driving. Towards this direction, in this paper, we propose a modification to the regularization term in a variational optical flow formulation, that notably improves the results, specially in driving scenarios. The proposed modification consists on using the Laplacian derivatives of flow components in the regularization term instead of gradients of flow components. We show the improvements in results on a standard real image sequences dataset (KITTI).
Keywords: Optical flow; regularization; Driver Assistance Systems; Performance Evaluation
|
Miguel Angel Bautista, Antonio Hernandez, Victor Ponce, Xavier Perez Sala, Xavier Baro, Oriol Pujol, et al. (2012). Probability-based Dynamic TimeWarping for Gesture Recognition on RGB-D data. In 21st International Conference on Pattern Recognition International Workshop on Depth Image Analysis (Vol. 7854, pp. 126–135). Springer Berlin Heidelberg.
Abstract: Dynamic Time Warping (DTW) is commonly used in gesture recognition tasks in order to tackle the temporal length variability of gestures. In the DTW framework, a set of gesture patterns are compared one by one to a maybe infinite test sequence, and a query gesture category is recognized if a warping cost below a certain threshold is found within the test sequence. Nevertheless, either taking one single sample per gesture category or a set of isolated samples may not encode the variability of such gesture category. In this paper, a probability-based DTW for gesture recognition is proposed. Different samples of the same gesture pattern obtained from RGB-Depth data are used to build a Gaussian-based probabilistic model of the gesture. Finally, the cost of DTW has been adapted accordingly to the new model. The proposed approach is tested in a challenging scenario, showing better performance of the probability-based DTW in comparison to state-of-the-art approaches for gesture recognition on RGB-D data.
|
Miguel Reyes, Albert Clapes, Luis Felipe Mejia, Jose Ramirez, Juan R Revilla, & Sergio Escalera. (2012). Posture Analysis and Range of Movement Estimation using Depth Maps. In 21st International Conference on Pattern Recognition International Workshop on Depth Image Analysis (Vol. 7854, pp. 97–105). Springer Berlin Heidelberg.
Abstract: World Health Organization estimates that 80% of the world population is affected of back pain during his life. Current practices to analyze back problems are expensive, subjective, and invasive. In this work, we propose a novel tool for posture and range of movement estimation based on the analysis of 3D information from depth maps. Given a set of keypoints defined by the user, RGB and depth data are aligned, depth surface is reconstructed, keypoints are matching using a novel point-to-point fitting procedure, and accurate measurements about posture, spinal curvature, and range of movement are computed. The system shows high precision and reliable measurements, being useful for posture reeducation purposes to prevent musculoskeletal disorders, such as back pain, as well as tracking the posture evolution of patients in rehabilitation treatments.
|
Klaus Broelemann, Anjan Dutta, Xiaoyi Jiang, & Josep Llados. (2012). Hierarchical graph representation for symbol spotting in graphical document images. In Structural, Syntactic, and Statistical Pattern Recognition, Joint IAPR International Workshop (Vol. 7626, pp. 529–538). LNCS. Springer Berlin Heidelberg.
Abstract: Symbol spotting can be defined as locating given query symbol in a large collection of graphical documents. In this paper we present a hierarchical graph representation for symbols. This representation allows graph matching methods to deal with low-level vectorization errors and, thus, to perform a robust symbol spotting. To show the potential of this approach, we conduct an experiment with the SESYD dataset.
|
David Masip, Alexander Todorov, & Jordi Vitria. (2012). The Role of Facial Regions in Evaluating Social Dime. In Rita Cucchiara V. M. Andrea Fusiello (Ed.), 12th European Conference on Computer Vision – Workshops and Demonstrations (Vol. 7584, pp. 210–219). LNCS. Springer Berlin Heidelberg.
Abstract: Facial trait judgments are an important information cue for people. Recent works in the Psychology field have stated the basis of face evaluation, defining a set of traits that we evaluate from faces (e.g. dominance, trustworthiness, aggressiveness, attractiveness, threatening or intelligence among others). We rapidly infer information from others faces, usually after a short period of time (< 1000ms) we perceive a certain degree of dominance or trustworthiness of another person from the face. Although these perceptions are not necessarily accurate, they influence many important social outcomes (such as the results of the elections or the court decisions). This topic has also attracted the attention of Computer Vision scientists, and recently a computational model to automatically predict trait evaluations from faces has been proposed. These systems try to mimic the human perception by means of applying machine learning classifiers to a set of labeled data. In this paper we perform an experimental study on the specific facial features that trigger the social inferences. Using previous results from the literature, we propose to use simple similarity maps to evaluate which regions of the face influence the most the trait inferences. The correlation analysis is performed using only appearance, and the results from the experiments suggest that each trait is correlated with specific facial characteristics.
Keywords: Workshops and Demonstrations
|