|
Fahad Shahbaz Khan, Muhammad Anwer Rao, Joost Van de Weijer, Michael Felsberg and J. Laaksonen. 2015. Deep semantic pyramids for human attributes and action recognition. Image Analysis, Proceedings of 19th Scandinavian Conference, SCIA 2015. Springer International Publishing, 341–353.
Abstract: Describing persons and their actions is a challenging problem due to variations in pose, scale and viewpoint in real-world images. Recently, the semantic pyramids approach [1] for pose normalization has been shown to provide excellent results for gender and action recognition. The performance of the semantic pyramids approach relies on robust image description and is therefore limited by its use of shallow local features. In the context of object recognition [2] and object detection [3], convolutional neural networks (CNNs), or deep features, have been shown to improve performance over conventional shallow features.
We propose deep semantic pyramids for human attributes and action recognition. The method works by constructing spatial pyramids based on CNNs of different part locations. These pyramids are then combined to obtain a single semantic representation. We validate our approach on the Berkeley and 27 Human Attributes datasets for attribute classification. For action recognition, we perform experiments on two challenging datasets: Willow and PASCAL VOC 2010. The proposed deep semantic pyramids provide significant gains of 17.2%, 13.9%, 24.3% and 22.6% over the standard shallow semantic pyramids on the Berkeley, 27 Human Attributes, Willow and PASCAL VOC 2010 datasets, respectively. Our results also show that deep semantic pyramids outperform conventional CNNs based on the full bounding box of the person. Finally, we compare our approach with state-of-the-art methods and show a gain in performance over the best methods in the literature.
Keywords: Action recognition; Human attributes; Semantic pyramids
|
|
|
Felipe Lumbreras and 7 others. 2001. Visual Inspection of Safety Belts. International Conference on Quality Control by Artificial Vision. 526–531.
|
|
|
Petia Radeva and Joan Serrat. 1993. Rubber Snake: Implementation on Signed Distance Potential. Vision Conference. 187–194.
|
|
|
X. Orriols, Ricardo Toledo, X. Binefa, Petia Radeva, Jordi Vitria and Juan J. Villanueva. 2000. Probabilistic Saliency Approach for Elongated Structure Detection using Deformable Models. 15th International Conference on Pattern Recognition. 1006–1009.
|
|
|
David Lloret, Joan Serrat, Antonio Lopez, A. Soler and Juan J. Villanueva. 2000. Retinal image registration using creases as anatomical landmarks. 15th International Conference on Pattern Recognition. 207–2010.
Abstract: Retinal images are routinely used in ophthalmology to study the optic nerve head and the retina. To assess objectively the evolution of an illness, images taken at different times must be registered. Most methods so far have been designed specifically for a single image modality, like temporal series or stereo pairs of angiographies, fluorescein angiographies or scanning laser ophthalmoscope (SLO) images, which makes them prone to fail when conditions vary. In contrast, the method we propose has been shown to be accurate and reliable on all of these modalities. It has been adapted from the 3D registration of CT and MR images to 2D. Relevant features (also known as landmarks) are extracted by means of a robust creaseness operator, and the resulting images are iteratively transformed until a maximum in their correlation is achieved. Our method has succeeded in more than 100 pairs tried so far, in all cases also including scaling as a parameter to be optimized.
|
|
|
Ricardo Toledo and 6 others. 2000. Eigensnakes for vessel segmentation in angiography. 15th International Conference on Pattern Recognition. 340–343.
|
|
|
A. Pujol, Felipe Lumbreras, Javier Varona and Juan J. Villanueva. 2000. Locating people in indoor scenes for real applications. 15th International Conference on Pattern Recognition. 632–635.
|
|
|
Cristina Cañero, Petia Radeva, Ricardo Toledo, Juan J. Villanueva and J. Mauri. 2000. 3D Curve Reconstruction by Biplane Snakes. 15th International Conference on Pattern Recognition. 563–566.
|
|
|
M.J. Yzuel, J. Pladellorens, Joan Serrat and A. Dupuy. 1993. Application restauration and edge detection techniques in the calculation of left ventricular volumes. Optics in Medicine, Biology and Environmental Research: Selected contributions to the first International Conference on Optics within Life Sciences (OWLS I). Elsevier, 374–375.
|
|
|
Joan Serrat, J. Argemi and Juan J. Villanueva. 1991. Automatization of TW2 method using a knowledge-based image analysis system. VIth International Congress of Auxology.
|
|