|
Ricardo Toledo and 6 others. 2000. Eigensnakes for vessel segmentation in angiography. 15 th International Conference on Pattern Recognition.340–343.
|
|
|
Fahad Shahbaz Khan, Muhammad Anwer Rao, Joost Van de Weijer, Michael Felsberg and J.Laaksonen. 2015. Deep semantic pyramids for human attributes and action recognition. Image Analysis, Proceedings of 19th Scandinavian Conference , SCIA 2015. Springer International Publishing, 341–353.
Abstract: Describing persons and their actions is a challenging problem due to variations in pose, scale and viewpoint in real-world images. Recently, semantic pyramids approach [1] for pose normalization has shown to provide excellent results for gender and action recognition. The performance of semantic pyramids approach relies on robust image description and is therefore limited due to the use of shallow local features. In the context of object recognition [2] and object detection [3], convolutional neural networks (CNNs) or deep features have shown to improve the performance over the conventional shallow features.
We propose deep semantic pyramids for human attributes and action recognition. The method works by constructing spatial pyramids based on CNNs of different part locations. These pyramids are then combined to obtain a single semantic representation. We validate our approach on the Berkeley and 27 Human Attributes datasets for attributes classification. For action recognition, we perform experiments on two challenging datasets: Willow and PASCAL VOC 2010. The proposed deep semantic pyramids provide a significant gain of 17.2%, 13.9%, 24.3% and 22.6% compared to the standard shallow semantic pyramids on Berkeley, 27 Human Attributes, Willow and PASCAL VOC 2010 datasets respectively. Our results also show that deep semantic pyramids outperform conventional CNNs based on the full bounding box of the person. Finally, we compare our approach with state-of-the-art methods and show a gain in performance compared to best methods in literature.
Keywords: Action recognition; Human attributes; Semantic pyramids
|
|
|
Patricia Marquez, Debora Gil, Aura Hernandez-Sabate and Daniel Kondermann. 2013. When Is A Confidence Measure Good Enough? 9th International Conference on Computer Vision Systems. Springer Link, 344–353. (LNCS.)
Abstract: Confidence estimation has recently become a hot topic in image processing and computer vision.Yet, several definitions exist of the term “confidence” which are sometimes used interchangeably. This is a position paper, in which we aim to give an overview on existing definitions,
thereby clarifying the meaning of the used terms to facilitate further research in this field. Based on these clarifications, we develop a theory to compare confidence measures with respect to their quality.
Keywords: Optical flow, confidence measure, performance evaluation
|
|
|
Fadi Dornaika and Angel Sappa. 2007. Improving Appearance-Based 3D Face Tracking Using Sparse Stereo Data. In J. Braz, A.R., H. Araujo and J. Jorge,, ed. Advances in Computer Graphics and Computer Vision,. Springer Verlag, 354–366.
|
|
|
Gioacchino Vino and Angel Sappa. 2013. Revisiting Harris Corner Detector Algorithm: a Gradual Thresholding Approach. 10th International Conference on Image Analysis and Recognition. Springer Berlin Heidelberg, 354–363. (LNCS.)
Abstract: This paper presents an adaptive thresholding approach intended to increase the number of detected corners, while reducing the amount of those ones corresponding to noisy data. The proposed approach works by using the classical Harris corner detector algorithm and overcome the difficulty in finding a general threshold that work well for all the images in a given data set by proposing a novel adaptive thresholding scheme. Initially, two thresholds are used to discern between strong corners and flat regions. Then, a region based criteria is used to discriminate between weak corners and noisy points in the midway interval. Experimental results show that the proposed approach has a better capability to reject false corners and, at the same time, to detect weak ones. Comparisons with the state of the art are provided showing the validity of the proposed approach.
|
|
|
Alejandro Gonzalez Alzate, Gabriel Villalonga, Jiaolong Xu, David Vazquez, Jaume Amores and Antonio Lopez. 2015. Multiview Random Forest of Local Experts Combining RGB and LIDAR data for Pedestrian Detection. IEEE Intelligent Vehicles Symposium IV2015.356–361.
Abstract: Despite recent significant advances, pedestrian detection continues to be an extremely challenging problem in real scenarios. In order to develop a detector that successfully operates under these conditions, it becomes critical to leverage upon multiple cues, multiple imaging modalities and a strong multi-view classifier that accounts for different pedestrian views and poses. In this paper we provide an extensive evaluation that gives insight into how each of these aspects (multi-cue, multimodality and strong multi-view classifier) affect performance both individually and when integrated together. In the multimodality component we explore the fusion of RGB and depth maps obtained by high-definition LIDAR, a type of modality that is only recently starting to receive attention. As our analysis reveals, although all the aforementioned aspects significantly help in improving the performance, the fusion of visible spectrum and depth information allows to boost the accuracy by a much larger margin. The resulting detector not only ranks among the top best performers in the challenging KITTI benchmark, but it is built upon very simple blocks that are easy to implement and computationally efficient. These simple blocks can be easily replaced with more sophisticated ones recently proposed, such as the use of convolutional neural networks for feature representation, to further improve the accuracy.
Keywords: Pedestrian Detection
|
|
|
Muhammad Anwer Rao, David Vazquez and Antonio Lopez. 2011. Opponent Colors for Human Detection. In J. Vitria, J.M. Sanches and M. Hernandez, eds. 5th Iberian Conference on Pattern Recognition and Image Analysis. Berlin Heidelberg, Springer, 363–370. (LNCS.)
Abstract: Human detection is a key component in fields such as advanced driving assistance and video surveillance. However, even detecting non-occluded standing humans remains a challenge of intensive research. Finding good features to build human models for further detection is probably one of the most important issues to face. Currently, shape, texture and motion features have deserve extensive attention in the literature. However, color-based features, which are important in other domains (e.g., image categorization), have received much less attention. In fact, the use of RGB color space has become a kind of choice by default. The focus has been put in developing first and second order features on top of RGB space (e.g., HOG and co-occurrence matrices, resp.). In this paper we evaluate the opponent colors (OPP) space as a biologically inspired alternative for human detection. In particular, by feeding OPP space in the baseline framework of Dalal et al. for human detection (based on RGB, HOG and linear SVM), we will obtain better detection performance than by using RGB space. This is a relevant result since, up to the best of our knowledge, OPP space has not been previously used for human detection. This suggests that in the future it could be worth to compute co-occurrence matrices, self-similarity features, etc., also on top of OPP space, i.e., as we have done with HOG in this paper.
Keywords: Pedestrian Detection; Color; Part Based Models
|
|
|
Ferran Diego, G.D. Evangelidis and Joan Serrat. 2012. Night-time outdoor surveillance by mobile cameras. 1st International Conference on Pattern Recognition Applications and Methods.365–371.
Abstract: This paper addresses the problem of video surveillance by mobile cameras. We present a method that allows online change detection in night-time outdoor surveillance. Because of the camera movement, background frames are not available and must be “localized” in former sequences and registered with the current frames. To this end, we propose a Frame Localization And Registration (FLAR) approach that solves the problem efficiently. Frames of former sequences define a database which is queried by current frames in turn. To quickly retrieve nearest neighbors, database is indexed through a visual dictionary method based on the SURF descriptor. Furthermore, the frame localization is benefited by a temporal filter that exploits the temporal coherence of videos. Next, the recently proposed ECC alignment scheme is used to spatially register the synchronized frames. Finally, change detection methods apply to aligned frames in order to mark suspicious areas. Experiments with real night sequences recorded by in-vehicle cameras demonstrate the performance of the proposed method and verify its efficiency and effectiveness against other methods.
|
|
|
Angel Sappa, Rosa Herrero, Fadi Dornaika, David Geronimo and Antonio Lopez. 2007. Road Approximation in Euclidean and v-Disparity Space: A Comparative Study. EUROCAST2007, Workshop on Cybercars and Intelligent Vehicles.368–369.
Abstract: This paper presents a comparative study between two road approximation techniques—planar surfaces—from stereo vision data. The first approach is carried out in the v-disparity space and is based on a voting scheme, the Hough transform. The second one consists in computing the best fitting plane for the whole 3D road data points, directly in the Euclidean space, by using least squares fitting. The comparative study is initially performed over a set of different synthetic surfaces
(e.g., plane, quadratic surface, cubic surface) digitized by a virtual stereo head; then real data obtained with a commercial stereo head are used. The comparative study is intended to be used as a criterion for fining the best technique according to the road geometry. Additionally, it highlights common problems driven from a wrong assumption about the scene’s prior knowledge.
|
|
|
Katerine Diaz, Francesc J. Ferri and W. Diaz. 2013. Fast Approximated Discriminative Common Vectors using rank-one SVD updates. 20th International Conference On Neural Information Processing. Springer Berlin Heidelberg, 368–375. (LNCS.)
Abstract: An efficient incremental approach to the discriminative common vector (DCV) method for dimensionality reduction and classification is presented. The proposal consists of a rank-one update along with an adaptive restriction on the rank of the null space which leads to an approximate but convenient solution. The algorithm can be implemented very efficiently in terms of matrix operations and space complexity, which enables its use in large-scale dynamic application domains. Deep comparative experimentation using publicly available high dimensional image datasets has been carried out in order to properly assess the proposed algorithm against several recent incremental formulations.
K. Diaz-Chito, F.J. Ferri, W. Diaz
|
|