|
Fahad Shahbaz Khan, Muhammad Anwer Rao, Joost Van de Weijer, Michael Felsberg and J.Laaksonen. 2015. Deep semantic pyramids for human attributes and action recognition. Image Analysis, Proceedings of 19th Scandinavian Conference , SCIA 2015. Springer International Publishing, 341–353.
Abstract: Describing persons and their actions is a challenging problem due to variations in pose, scale and viewpoint in real-world images. Recently, semantic pyramids approach [1] for pose normalization has shown to provide excellent results for gender and action recognition. The performance of semantic pyramids approach relies on robust image description and is therefore limited due to the use of shallow local features. In the context of object recognition [2] and object detection [3], convolutional neural networks (CNNs) or deep features have shown to improve the performance over the conventional shallow features.
We propose deep semantic pyramids for human attributes and action recognition. The method works by constructing spatial pyramids based on CNNs of different part locations. These pyramids are then combined to obtain a single semantic representation. We validate our approach on the Berkeley and 27 Human Attributes datasets for attributes classification. For action recognition, we perform experiments on two challenging datasets: Willow and PASCAL VOC 2010. The proposed deep semantic pyramids provide a significant gain of 17.2%, 13.9%, 24.3% and 22.6% compared to the standard shallow semantic pyramids on Berkeley, 27 Human Attributes, Willow and PASCAL VOC 2010 datasets respectively. Our results also show that deep semantic pyramids outperform conventional CNNs based on the full bounding box of the person. Finally, we compare our approach with state-of-the-art methods and show a gain in performance compared to best methods in literature.
Keywords: Action recognition; Human attributes; Semantic pyramids
|
|
|
Patricia Marquez and 6 others. 2014. Factors Affecting Optical Flow Performance in Tagging Magnetic Resonance Imaging. 17th International Conference on Medical Image Computing and Computer Assisted Intervention. Springer International Publishing, 231–238. (LNCS.)
Abstract: Changes in cardiac deformation patterns are correlated with cardiac pathologies. Deformation can be extracted from tagging Magnetic Resonance Imaging (tMRI) using Optical Flow (OF) techniques. For applications of OF in a clinical setting it is important to assess to what extent the performance of a particular OF method is stable across dierent clinical acquisition artifacts. This paper presents a statistical validation framework, based on ANOVA, to assess the motion and appearance factors that have the largest in uence on OF accuracy drop.
In order to validate this framework, we created a database of simulated tMRI data including the most common artifacts of MRI and test three dierent OF methods, including HARP.
Keywords: Optical flow; Performance Evaluation; Synthetic Database; ANOVA; Tagging Magnetic Resonance Imaging
|
|
|
Petia Radeva and Joan Serrat. 1993. Rubber Snake: Implementation on Signed Distance Potential. Vision Conference.187–194.
|
|
|
Fadi Dornaika and Angel Sappa. 2007. Improving Appearance-Based 3D Face Tracking Using Sparse Stereo Data. In J. Braz, A.R., H. Araujo and J. Jorge,, ed. Advances in Computer Graphics and Computer Vision,. Springer Verlag, 354–366.
|
|
|
Diego Cheda, Daniel Ponsa and Antonio Lopez. 2012. Monocular Depth-based Background Estimation. 7th International Conference on Computer Vision Theory and Applications.323–328.
Abstract: In this paper, we address the problem of reconstructing the background of a scene from a video sequence with occluding objects. The images are taken by hand-held cameras. Our method composes the background by selecting the appropriate pixels from previously aligned input images. To do that, we minimize a cost function that penalizes the deviations from the following assumptions: background represents objects whose distance to the camera is maximal, and background objects are stationary. Distance information is roughly obtained by a supervised learning approach that allows us to distinguish between close and distant image regions. Moving foreground objects are filtered out by using stationariness and motion boundary constancy measurements. The cost function is minimized by a graph cuts method. We demonstrate the applicability of our approach to recover an occlusion-free background in a set of sequences.
|
|
|
Patricia Marquez, Debora Gil, R.Mester and Aura Hernandez-Sabate. 2014. Local Analysis of Confidence Measures for Optical Flow Quality Evaluation. 9th International Conference on Computer Vision Theory and Applications.450–457.
Abstract: Optical Flow (OF) techniques facing the complexity of real sequences have been developed in the last years. Even using the most appropriate technique for our specific problem, at some points the output flow might fail to achieve the minimum error required for the system. Confidence measures computed from either input data or OF output should discard those points where OF is not accurate enough for its further use. It follows that evaluating the capabilities of a confidence measure for bounding OF error is as important as the definition
itself. In this paper we analyze different confidence measures and point out their advantages and limitations for their use in real world settings. We also explore the agreement with current tools for their evaluation of confidence measures performance.
Keywords: Optical Flow; Confidence Measure; Performance Evaluation.
|
|
|
P. Ricaurte, C. Chilan, Cristhian A. Aguilera-Carrasco, Boris X. Vintimilla and Angel Sappa. 2014. Performance Evaluation of Feature Point Descriptors in the Infrared Domain. 9th International Conference on Computer Vision Theory and Applications.545–550.
Abstract: This paper presents a comparative evaluation of classical feature point descriptors when they are used in the long-wave infrared spectral band. Robustness to changes in rotation, scaling, blur, and additive noise are evaluated using a state of the art framework. Statistical results using an outdoor image data set are presented together with a discussion about the differences with respect to the results obtained when images from the visible spectrum are considered.
Keywords: Infrared Imaging; Feature Point Descriptors
|
|
|
Naveen Onkarappa, Cristhian A. Aguilera-Carrasco, Boris X. Vintimilla and Angel Sappa. 2014. Cross-spectral Stereo Correspondence using Dense Flow Fields. 9th International Conference on Computer Vision Theory and Applications.613–617.
Abstract: This manuscript addresses the cross-spectral stereo correspondence problem. It proposes the usage of a dense flow field based representation instead of the original cross-spectral images, which have a low correlation. In this way, working in the flow field space, classical cost functions can be used as similarity measures. Preliminary experimental results on urban environments have been obtained showing the validity of the proposed approach.
Keywords: Cross-spectral Stereo Correspondence; Dense Optical Flow; Infrared and Visible Spectrum
|
|
|
Ariel Amato, Felipe Lumbreras and Angel Sappa. 2014. A General-purpose Crowdsourcing Platform for Mobile Devices. 9th International Conference on Computer Vision Theory and Applications.211–215.
Abstract: This paper presents details of a general purpose micro-task on-demand platform based on the crowdsourcing philosophy. This platform was specifically developed for mobile devices in order to exploit the strengths of such devices; namely: i) massivity, ii) ubiquity and iii) embedded sensors. The combined use of mobile platforms and the crowdsourcing model allows to tackle from the simplest to the most complex tasks. Users experience is the highlighted feature of this platform (this fact is extended to both task-proposer and tasksolver). Proper tools according with a specific task are provided to a task-solver in order to perform his/her job in a simpler, faster and appealing way. Moreover, a task can be easily submitted by just selecting predefined templates, which cover a wide range of possible applications. Examples of its usage in computer vision and computer games are provided illustrating the potentiality of the platform.
Keywords: Crowdsourcing Platform; Mobile Crowdsourcing
|
|
|
Chris Bahnsen, David Vazquez, Antonio Lopez and Thomas B. Moeslund. 2019. Learning to Remove Rain in Traffic Surveillance by Using Synthetic Data. 14th International Conference on Computer Vision Theory and Applications.123–130.
Abstract: Rainfall is a problem in automated traffic surveillance. Rain streaks occlude the road users and degrade the overall visibility which in turn decrease object detection performance. One way of alleviating this is by artificially removing the rain from the images. This requires knowledge of corresponding rainy and rain-free images. Such images are often produced by overlaying synthetic rain on top of rain-free images. However, this method fails to incorporate the fact that rain fall in the entire three-dimensional volume of the scene. To overcome this, we introduce training data from the SYNTHIA virtual world that models rain streaks in the entirety of a scene. We train a conditional Generative Adversarial Network for rain removal and apply it on traffic surveillance images from SYNTHIA and the AAU RainSnow datasets. To measure the applicability of the rain-removed images in a traffic surveillance context, we run the YOLOv2 object detection algorithm on the original and rain-removed frames. The results on SYNTHIA show an 8% increase in detection accuracy compared to the original rain image. Interestingly, we find that high PSNR or SSIM scores do not imply good object detection performance.
Keywords: Rain Removal; Traffic Surveillance; Image Denoising
|
|