|
Marco Pedersoli, Jordi Gonzalez, & Juan J. Villanueva. (2009). High-Speed Human Detection Using a Multiresolution Cascade of Histograms of Oriented Gradients. In 4th Iberian Conference on Pattern Recognition and Image Analysis (Vol. 5524). LNCS. Springer Berlin Heidelberg.
Abstract: This paper presents a new method for human detection based on a multiresolution cascade of Histograms of Oriented Gradients (HOG) that can greatly reduce the computational cost of the detection search without affecting accuracy. The method consists of a cascade of sliding window detectors. Each detector is a Support Vector Machine (SVM) composed of features at a different resolution, from coarse for the first level to fine for the last one.
Because the spatial stride of the sliding window search depends on the HOG feature size, we can, unlike previous methods based on AdaBoost cascades, adopt a spatial stride inversely proportional to the feature resolution. As a result, the speed-up of the cascade is due not only to the low number of features that need to be computed in the first levels, but also to the lower number of detection windows that need to be evaluated.
Experimental results show that our method achieves a detection rate comparable with the state of the art while speeding up the detection search by 10-20 times, depending on the cascade configuration.
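The effect of the stride on the number of evaluated windows can be illustrated with a small sketch. The image size, window size, and strides below are illustrative assumptions, not values from the paper, and `count_windows` is a hypothetical helper.

```python
def count_windows(image_w, image_h, win_w, win_h, stride):
    """Number of sliding-window positions at a single scale.

    Assumes the window must lie fully inside the image and that the
    same stride is used horizontally and vertically.
    """
    if win_w > image_w or win_h > image_h:
        return 0
    cols = (image_w - win_w) // stride + 1
    rows = (image_h - win_h) // stride + 1
    return cols * rows


# Assumed example: a 640x480 image and a 64x128 detection window.
coarse = count_windows(640, 480, 64, 128, stride=16)  # coarse features, large stride
fine = count_windows(640, 480, 64, 128, stride=4)     # fine features, small stride
print(coarse, fine)
```

In a cascade, only the windows accepted at the coarse level would need to be re-examined at the finer strides, which is where the second source of speed-up described in the abstract comes from.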
|
|
|
Bhaskar Chakraborty, Andrew Bagdanov, & Jordi Gonzalez. (2009). Towards Real-Time Human Action Recognition. In 4th Iberian Conference on Pattern Recognition and Image Analysis (Vol. 5524). LNCS. Springer Berlin Heidelberg.
Abstract: This work presents a novel approach to real-time action recognition based on human detection. To realize this goal, our method first detects humans in different poses using a correlation-based approach. Actions are then recognized from the change in the angular values subtended by various body parts. Real-time human detection and action recognition are very challenging, and most state-of-the-art approaches employ complex feature extraction and classification techniques, which ultimately become a handicap for real-time recognition. Our correlation-based method, on the other hand, is computationally efficient and uses very simple gradient-based features. For action recognition, angular features of body parts are extracted using a skeleton technique. Results for action recognition are comparable with the present state of the art.
|
|
|
Murad Al Haj, Andrew Bagdanov, Jordi Gonzalez, & Xavier Roca. (2009). Robust and Efficient Multipose Face Detection Using Skin Color Segmentation. In 4th Iberian Conference on Pattern Recognition and Image Analysis (Vol. 5524). LNCS. Springer Berlin Heidelberg.
Abstract: In this paper we describe an efficient technique for detecting faces in arbitrary images and video sequences. The approach is based on segmentation of images or video frames into skin-colored blobs using a pixel-based heuristic. Scale and translation invariant features are then computed from these segmented blobs which are used to perform statistical discrimination between face and non-face classes. We train and evaluate our method on a standard, publicly available database of face images and analyze its performance over a range of statistical pattern classifiers. The generalization of our approach is illustrated by testing on an independent sequence of frames containing many faces and non-faces. These experiments indicate that our proposed approach obtains false positive rates comparable to more complex, state-of-the-art techniques, and that it generalizes better to new data. Furthermore, the use of skin blobs and invariant features requires fewer training samples since significantly fewer non-face candidate regions must be considered when compared to AdaBoost-based approaches.
|
|
|
Miquel Ferrer, Ernest Valveny, F. Serratosa, I. Bardaji, & Horst Bunke. (2009). Graph-based k-means clustering: A comparison of the set versus the generalized median graph. In 13th International Conference on Computer Analysis of Images and Patterns (Vol. 5702, 342–350). LNCS. Springer Berlin Heidelberg.
Abstract: In this paper we propose the application of the generalized median graph in a graph-based k-means clustering algorithm. In the graph-based k-means algorithm, the centers of the clusters have traditionally been represented using the set median graph. We propose an approximate method for computing the generalized median graph that allows it to be used to represent the centers of the clusters. Experiments on three databases show that using the generalized median graph as the cluster representative yields better results than the set median graph.
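The difference between the two medians can be shown on a toy example. The sketch below is not the approximate method proposed in the paper: it assumes graphs over a fixed set of labelled nodes, represented as sets of edges, and uses the size of the symmetric difference of the edge sets as a stand-in for a graph edit distance. Under that assumed distance, a majority vote over edges gives a generalized median. All function names are hypothetical.

```python
from collections import Counter


def edge_distance(g1, g2):
    """Toy distance: number of edges present in exactly one of the graphs."""
    return len(g1 ^ g2)


def sum_of_distances(candidate, graphs):
    return sum(edge_distance(candidate, g) for g in graphs)


def set_median(graphs):
    """Member of the set with the smallest sum of distances to all members."""
    return min(graphs, key=lambda g: sum_of_distances(g, graphs))


def toy_generalized_median(graphs):
    """Graph (not necessarily in the set) minimizing the sum of distances.

    For the toy edge distance above, keeping every edge that occurs in
    more than half of the graphs is optimal.
    """
    counts = Counter(edge for g in graphs for edge in g)
    return frozenset(e for e, c in counts.items() if 2 * c > len(graphs))


graphs = [
    frozenset({("a", "b"), ("b", "c")}),
    frozenset({("b", "c"), ("c", "d")}),
    frozenset({("a", "b"), ("c", "d")}),
]
print(sum_of_distances(set_median(graphs), graphs))
print(sum_of_distances(toy_generalized_median(graphs), graphs))
```

Here the generalized median contains all three edges and is not a member of the set, which is why it can sit closer to the cluster as a whole than any set median.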
|
|
|
Alicia Fornes, Josep Llados, Gemma Sanchez, & Horst Bunke. (2009). Symbol-independent writer identification in old handwritten music scores. In 8th IAPR International Workshop on Graphics Recognition (186–197). Springer Berlin Heidelberg.
|
|
|
Francesco Ciompi, Oriol Pujol, E Fernandez-Nofrerias, J. Mauri, & Petia Radeva. (2009). ECOC Random Fields for Lumen Segmentation in Radial Artery IVUS Sequences. In 12th International Conference on Medical Image and Computer Assisted Intervention (Vol. 5762). LNCS. Springer Berlin Heidelberg.
Abstract: The measurement of lumen volume in radial arteries can be used to evaluate the vessel response to different vasodilators. In this paper, we present a framework for automatic lumen segmentation in longitudinal cut images of the radial artery from intravascular ultrasound (IVUS) sequences. The segmentation is tackled as a classification problem where the contextual information is exploited by means of Conditional Random Fields (CRFs). A multi-class classification framework is proposed, and inference is achieved by combining binary CRFs according to the Error-Correcting Output Codes (ECOC) technique. The results are validated against manually segmented sequences. Finally, the method is compared with other state-of-the-art classifiers.
|
|
|
David Aldavert, Ricardo Toledo, Arnau Ramisa, & Ramon Lopez de Mantaras. (2009). Efficient Object Pixel-Level Categorization using Bag of Features: Advances in Visual Computing. In 5th International Symposium on Visual Computing (Vol. 5875, 44–55). Springer Berlin Heidelberg.
Abstract: In this paper we present a pixel-level object categorization method suitable for use under real-time constraints. Since pixels are categorized using a bag-of-features scheme, the major bottleneck of such an approach would be the feature pooling in local histograms of visual words. Therefore, we propose to bypass this time-consuming step and directly obtain the score from a linear Support Vector Machine classifier. This is achieved by creating an integral image of the components of the SVM, from which the classification score for any image sub-window can be obtained with only 10 additions and 2 products, regardless of its size. In addition, we evaluate the performance of two efficient feature quantization methods: Hierarchical K-Means and the Extremely Randomized Forest. All experiments were run on the Graz02 database, showing results comparable to, or even better than, related work at a lower computational cost.
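The idea of scoring a sub-window without building its histogram can be sketched as follows. This is a simplified illustration rather than the authors' implementation: it assumes one visual word per pixel, a histogram normalized by window area, and a single integral image, so its operation count differs from the 10 additions and 2 products quoted above. The names `build_integral` and `window_score` are hypothetical.

```python
def build_integral(word_map, svm_weights):
    """Integral image of the SVM weight attached to each pixel's visual word.

    word_map[y][x] is the visual-word index of pixel (x, y) and
    svm_weights[k] is the linear SVM weight of word k. The result has
    one extra row and column of zeros to simplify window lookups.
    """
    height, width = len(word_map), len(word_map[0])
    integral = [[0.0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0.0
        for x in range(width):
            row_sum += svm_weights[word_map[y][x]]
            integral[y + 1][x + 1] = integral[y][x + 1] + row_sum
    return integral


def window_score(integral, top, left, bottom, right, bias=0.0):
    """Linear SVM score of the window [top, bottom) x [left, right).

    Equals w . h + bias, where h is the area-normalized histogram of
    visual words inside the window, computed from four lookups.
    """
    total = (integral[bottom][right] - integral[top][right]
             - integral[bottom][left] + integral[top][left])
    area = (bottom - top) * (right - left)
    return total / area + bias


# Assumed toy data: a 2x2 map over three visual words.
word_map = [[0, 1],
            [1, 2]]
svm_weights = [1.0, -2.0, 0.5]
integral = build_integral(word_map, svm_weights)
print(window_score(integral, 0, 0, 2, 2))
```

Because a linear score is a sum of per-word weights, summing the weights of the pixels in a window gives the same result as pooling a histogram first and then taking the dot product, which is what makes the integral image applicable.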
|
|
|
David Aldavert, Ricardo Toledo, Arnau Ramisa, & Ramon Lopez de Mantaras. (2009). Visual Registration Method For A Low Cost Robot: Computer Vision Systems. In 7th International Conference on Computer Vision Systems (Vol. 5815, 204–214). LNCS. Springer Berlin Heidelberg.
Abstract: An autonomous mobile robot must face the correspondence or data association problem in order to carry out tasks like place recognition or unknown environment mapping. To put two maps into correspondence, most methods estimate the transformation relating the maps from matches established between low-level features extracted from sensor data. However, finding explicit matches between features is a challenging and computationally expensive task. In this paper, we propose a new method to align obstacle maps without searching for explicit matches between features. The maps are obtained from a stereo pair. Then, we use a vocabulary tree approach to identify putative corresponding maps, followed by the Newton minimization algorithm to find the transformation that relates both maps. The proposed method is evaluated in a typical office environment, showing good performance.
|
|
|
Oscar Camara, Estanislao Oubel, Gemma Piella, Simone Balocco, Mathieu De Craene, & Alejandro F. Frangi. (2009). Multi-sequence Registration of Cine, Tagged and Delay-Enhancement MRI with Shift Correction and Steerable Pyramid-Based Detagging. In 5th International Conference on Functional Imaging and Modeling of the Heart (Vol. 5528, 330–338). LNCS. Springer Berlin Heidelberg.
Abstract: In this work, we present a registration framework for cardiac cine MRI (cMRI), tagged MRI (tMRI) and delay-enhancement MRI (deMRI) that takes into account the two main obstacles to an accurate alignment between these images: the presence of tags in tMRI and respiration artifacts in all sequences. A steerable pyramid image decomposition is used for detagging since it is suitable for extracting high-order oriented structures by directional adaptive filtering. Shift correction of cMRI is achieved by first maximizing the similarity between the Long Axis and Short Axis cMRI. Subsequently, these shift-corrected images are used as target images in a rigid registration procedure with their corresponding tMRI/deMRI in order to correct their shift. The proposed registration framework has been evaluated on 840 registration tests, considerably improving the alignment of the MR images (mean RMS error of 2.04 mm vs. 5.44 mm).
|
|
|
Bogdan Raducanu, & Fadi Dornaika. (2009). Natural Facial Expression Recognition Using Dynamic and Static Schemes. In 5th International Symposium on Visual Computing (Vol. 5875, 730–739). LNCS. Springer Berlin Heidelberg.
Abstract: Affective computing is at the core of a new paradigm in HCI and AI represented by human-centered computing. Within this paradigm, it is expected that machines will be given perceptual capabilities, making them aware of users’ affective state. The current paper addresses the problem of facial expression recognition from monocular video sequences. We propose a dynamic facial expression recognition scheme, which is proven to be very efficient. Furthermore, it is compared with several static-based systems adopting different magnitudes of facial expression. We provide evaluations of performance using Linear Discriminant Analysis (LDA), Non-parametric Discriminant Analysis (NDA), and Support Vector Machines (SVM). We also provide performance evaluations using arbitrary test video sequences.
|
|
|
Oriol Pujol, Eloi Puertas, & Carlo Gatta. (2009). Multi-scale Stacked Sequential Learning. In 8th International Workshop of Multiple Classifier Systems (Vol. 5519, 262–271). Springer Berlin Heidelberg.
Abstract: One of the most widely used assumptions in supervised learning is that data is independent and identically distributed. This assumption does not hold true in many real cases. Sequential learning is the discipline of machine learning that deals with dependent data such that neighboring examples exhibit some kind of relationship. In the literature, there are different approaches that try to capture and exploit this correlation by means of different methodologies. In this paper we focus on meta-learning strategies and, in particular, the stacked sequential learning approach. The main contribution of this work is two-fold: first, we generalize stacked sequential learning. This generalization reflects the key role of modeling neighboring interactions. Second, we propose an effective and efficient way of capturing and exploiting sequential correlations that takes into account long-range interactions by means of a multi-scale pyramidal decomposition of the predicted labels. Additionally, this new method subsumes the standard stacked sequential learning approach. We tested the proposed method on two different classification tasks: text line classification in a FAQ data set and image classification. Results on these tasks clearly show that our approach outperforms standard stacked sequential learning. Moreover, we show that the proposed method makes it possible to control the trade-off between the detail and the desired range of the interactions.
|
|
|
Debora Gil, Aura Hernandez-Sabate, Mireia Burnat, Steven Jansen, & Jordi Martinez-Vilalta. (2009). Structure-Preserving Smoothing of Biomedical Images. In 13th International Conference on Computer Analysis of Images and Patterns (Vol. 5702, 427–434). LNCS. Springer Berlin Heidelberg.
Abstract: Smoothing of biomedical images should preserve gray-level transitions between adjacent tissues, while restoring contours consistent with anatomical structures. Anisotropic diffusion operators are based on image appearance discontinuities (either local or contextual) and might fail at weak inter-tissue transitions. Meanwhile, the output of block-wise and morphological operations is prone to present a block structure due to the shape and size of the considered pixel neighborhood. In this contribution, we use differential geometry concepts to define a diffusion operator that restricts to image consistent level-sets. In this manner, the final state is a non-uniform intensity image presenting homogeneous inter-tissue transitions along anatomical structures, while smoothing intra-structure texture. Experiments on different types of medical images (magnetic resonance, computerized tomography) illustrate its benefit on a further process (such as segmentation) of images.
Keywords: non-linear smoothing; differential geometry; anatomical structures segmentation; cardiac magnetic resonance; computerized tomography.
|
|
|
L. Tarazon, D. Perez, N. Serrano, V. Alabau, Oriol Ramos Terrades, A. Sanchis, et al. (2009). Confidence Measures for Error Correction in Interactive Transcription of Handwritten Text. In 15th International Conference on Image Analysis and Processing (Vol. 5716, 567–574). LNCS. Springer Berlin Heidelberg.
Abstract: An effective approach to transcribing old text documents is to follow an interactive-predictive paradigm in which the system is guided by the human supervisor and the supervisor is assisted by the system, so that the transcription task is completed as efficiently as possible. In this paper, we focus on a particular system prototype called GIDOC, which can be seen as a first attempt to provide user-friendly, integrated support for interactive-predictive page layout analysis, text line detection and handwritten text transcription. More specifically, we focus on the handwriting recognition part of GIDOC, for which we propose the use of confidence measures to guide the human supervisor in locating possible system errors and deciding how to proceed. Empirical results are reported on two datasets, showing that a word error rate no larger than 10% can be achieved by checking only the 32% of words that are recognised with the lowest confidence.
|
|
|
Petia Radeva, Jordi Vitria, Fernando Vilariño, Panagiota Spyridonos, Fernando Azpiroz, Juan Malagelada, et al. (2009). Cascade analysis for intestinal contraction detection. US Patent Office.
Abstract: A method and system of cascade analysis for intestinal contraction detection is provided by extracting from image frames captured in vivo. The method and system also relate to the detection of turbid liquids in intestinal tracts, to automatic detection of video image frames taken in the gastrointestinal tract including a field of view obstructed by turbid media, and more particularly, to extraction of image data obstructed by turbid media.
|
|