Author Fahad Shahbaz Khan; Jiaolong Xu; Muhammad Anwer Rao; Joost Van de Weijer; Andrew Bagdanov; Antonio Lopez
  Title Recognizing Actions through Action-specific Person Detection Type Journal Article
  Year 2015 Publication IEEE Transactions on Image Processing Abbreviated Journal TIP  
  Volume 24 Issue 11 Pages 4422-4432  
  Keywords  
  Abstract Action recognition in still images is a challenging problem in computer vision. To facilitate comparative evaluation independently of person detection, the standard evaluation protocol for action recognition uses an oracle person detector to obtain perfect bounding box information at both training and test time. The assumption is that, in practice, a general person detector will provide candidate bounding boxes for action recognition. In this paper, we argue that this paradigm is suboptimal and that action class labels should already be considered during the detection stage. Motivated by the observation that body pose is strongly conditioned on action class, we show that: 1) the existing state-of-the-art generic person detectors are not adequate for proposing candidate bounding boxes for action classification; 2) due to limited training examples, the direct training of action-specific person detectors is also inadequate; and 3) using only a small number of labeled action examples, the transfer learning is able to adapt an existing detector to propose higher quality bounding boxes for subsequent action classification. To the best of our knowledge, we are the first to investigate transfer learning for the task of action-specific person detection in still images. We perform extensive experiments on two benchmark data sets: 1) Stanford-40 and 2) PASCAL VOC 2012. For the action detection task (i.e., both person localization and classification of the action performed), our approach outperforms methods based on general person detection by 5.7% mean average precision (MAP) on Stanford-40 and 2.1% MAP on PASCAL VOC 2012. Our approach also significantly outperforms the state of the art with a MAP of 45.4% on Stanford-40 and 31.4% on PASCAL VOC 2012. We also evaluate our action detection approach for the task of action classification (i.e., recognizing actions without localizing them). 
For this task, our approach, without using any ground-truth person localization at test time, outperforms state-of-the-art methods on both data sets, even though those methods do use person locations.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1057-7149 ISBN Medium  
  Area Expedition Conference  
  Notes ADAS; LAMP; 600.076; 600.079 Approved no  
  Call Number Admin @ si @ KXR2015 Serial 2668  
 

 
Author Lluis Garrido; M.Guerrieri; Laura Igual
  Title Image Segmentation with Cage Active Contours Type Journal Article
  Year 2015 Publication IEEE Transactions on Image Processing Abbreviated Journal TIP  
  Volume 24 Issue 12 Pages 5557-5566  
  Keywords Level sets; Mean value coordinates; Parametrized active contours  
  Abstract In this paper, we present a framework for image segmentation based on parametrized active contours. The evolving contour is parametrized according to a reduced set of control points that form a closed polygon and have a clear visual interpretation. The parametrization, called mean value coordinates, stems from the techniques used in computer graphics to animate virtual models. Our framework makes it easy to formulate region-based energies to segment an image. In particular, we present three different local region-based energy terms: 1) the mean model; 2) the Gaussian model; and 3) the histogram model. We show the behavior of our method on synthetic and real images and compare the performance with state-of-the-art level set methods.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1057-7149 ISBN Medium  
  Area Expedition Conference  
  Notes MILAB Approved no  
  Call Number Admin @ si @ GGI2015 Serial 2673  
 

 
Author Mikhail Mozerov; Joost Van de Weijer
  Title Global Color Sparseness and a Local Statistics Prior for Fast Bilateral Filtering Type Journal Article
  Year 2015 Publication IEEE Transactions on Image Processing Abbreviated Journal TIP  
  Volume 24 Issue 12 Pages 5842-5853  
  Keywords  
  Abstract The property of smoothing while preserving edges makes the bilateral filter a very popular image processing tool. However, its non-linear nature results in a computationally costly operation. Various works propose fast approximations to the bilateral filter. However, the majority does not generalize to vector input as is the case with color images. We propose a fast approximation to the bilateral filter for color images. The filter is based on two ideas. First, the number of colors, which occur in a single natural image, is limited. We exploit this color sparseness to rewrite the initial non-linear bilateral filter as a number of linear filter operations. Second, we impose a statistical prior to the image values that are locally present within the filter window. We show that this statistical prior leads to a closed-form solution of the bilateral filter. Finally, we combine both ideas into a single fast and accurate bilateral filter for color images. Experimental results show that our bilateral filter based on the local prior yields an extremely fast bilateral filter approximation, but with limited accuracy, which has potential application in real-time video filtering. Our bilateral filter, which combines color sparseness and local statistics, yields a fast and accurate bilateral filter approximation and obtains the state-of-the-art results.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1057-7149 ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; 600.079;ISE Approved no  
  Call Number Admin @ si @ MoW2015b Serial 2689  
 

 
Author I. Sorodoc; S. Pezzelle; A. Herbelot; Mariella Dimiccoli; R. Bernardi
  Title Learning quantification from images: A structured neural architecture Type Journal Article
  Year 2018 Publication Natural Language Engineering Abbreviated Journal NLE  
  Volume 24 Issue 3 Pages 363-392  
  Keywords  
  Abstract Major advances have recently been made in merging language and vision representations. Most tasks considered so far have confined themselves to the processing of objects and lexicalised relations amongst objects (content words). We know, however, that humans (even pre-school children) can abstract over raw multimodal data to perform certain types of higher level reasoning, expressed in natural language by function words. A case in point is given by their ability to learn quantifiers, i.e. expressions like few, some and all. From formal semantics and cognitive linguistics, we know that quantifiers are relations over sets which, as a simplification, we can see as proportions. For instance, in most fish are red, most encodes the proportion of fish which are red fish. In this paper, we study how well current neural network strategies model such relations. We propose a task where, given an image and a query expressed by an object–property pair, the system must return a quantifier expressing which proportions of the queried object have the queried property. Our contributions are twofold. First, we show that the best performance on this task involves coupling state-of-the-art attention mechanisms with a network architecture mirroring the logical structure assigned to quantifiers by classic linguistic formalisation. Second, we introduce a new balanced dataset of image scenarios associated with quantification queries, which we hope will foster further research in this area.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MILAB; no menciona Approved no  
  Call Number Admin @ si @ SPH2018 Serial 3021  
 

 
Author Estefania Talavera; Maria Leyva-Vallina; Md. Mostafa Kamal Sarker; Domenec Puig; Nicolai Petkov; Petia Radeva
  Title Hierarchical approach to classify food scenes in egocentric photo-streams Type Journal Article
  Year 2020 Publication IEEE Journal of Biomedical and Health Informatics Abbreviated Journal J-BHI  
  Volume 24 Issue 3 Pages 866-877  
  Keywords  
  Abstract Recent studies have shown that the environment where people eat can affect their nutritional behaviour. In this work, we provide automatic tools for a personalised analysis of a person's health habits by the examination of daily recorded egocentric photo-streams. Specifically, we propose a new automatic approach for the classification of food-related environments, that is able to classify up to 15 such scenes. In this way, people can monitor the context around their food intake in order to get an objective insight into their daily eating routine. We propose a model that classifies food-related scenes organized in a semantic hierarchy. Additionally, we present and make available a new egocentric dataset composed of more than 33000 images recorded by a wearable camera, over which our proposed model has been tested. Our approach obtains an accuracy and F-score of 56% and 65%, respectively, clearly outperforming the baseline methods.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MILAB; no proj Approved no  
  Call Number Admin @ si @ TLM2020 Serial 3380  
 

 
Author Sanket Biswas; Pau Riba; Josep Llados; Umapada Pal
  Title Beyond Document Object Detection: Instance-Level Segmentation of Complex Layouts Type Journal Article
  Year 2021 Publication International Journal on Document Analysis and Recognition Abbreviated Journal IJDAR  
  Volume 24 Issue Pages 269–281  
  Keywords  
  Abstract Information extraction is a fundamental task of many business intelligence services that entail massive document processing. Understanding a document page structure in terms of its layout provides contextual support which is helpful in the semantic interpretation of the document terms. In this paper, inspired by the progress of deep learning methodologies applied to the task of object recognition, we transfer these models to the specific case of document object detection, reformulating the traditional problem of document layout analysis. Moreover, we importantly contribute to prior arts by defining the task of instance segmentation on the document image domain. An instance segmentation paradigm is especially important in complex layouts whose contents should interact for the proper rendering of the page, i.e., the proper text wrapping around an image. Finally, we provide an extensive evaluation, both qualitative and quantitative, that demonstrates the superior performance of the proposed methodology over the current state of the art.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.121; 600.140; 110.312 Approved no  
  Call Number Admin @ si @ BRL2021b Serial 3574  
 

 
Author Minesh Mathew; Lluis Gomez; Dimosthenis Karatzas; C.V. Jawahar
  Title Asking questions on handwritten document collections Type Journal Article
  Year 2021 Publication International Journal on Document Analysis and Recognition Abbreviated Journal IJDAR  
  Volume 24 Issue Pages 235-249  
  Keywords  
  Abstract This work addresses the problem of Question Answering (QA) on handwritten document collections. Unlike typical QA and Visual Question Answering (VQA) formulations where the answer is a short text, we aim to locate a document snippet where the answer lies. The proposed approach works without recognizing the text in the documents. We argue that the recognition-free approach is suitable for handwritten documents and historical collections where robust text recognition is often difficult. At the same time, for human users, document image snippets containing answers act as a valid alternative to textual answers. The proposed approach uses an off-the-shelf deep embedding network which can project both textual words and word images into a common sub-space. This embedding bridges the textual and visual domains and helps us retrieve document snippets that potentially answer a question. We evaluate results of the proposed approach on two new datasets: (i) HW-SQuAD: a synthetic, handwritten document image counterpart of SQuAD1.0 dataset and (ii) BenthamQA: a smaller set of QA pairs defined on documents from the popular Bentham manuscripts collection. We also present a thorough analysis of the proposed recognition-free approach compared to a recognition-based approach which uses text recognized from the images using an OCR. Datasets presented in this work are available to download at docvqa.org.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ MGK2021 Serial 3621  
 

 
Author Aura Hernandez-Sabate; Jose Elias Yauri; Pau Folch; Daniel Alvarez; Debora Gil
  Title EEG Dataset Collection for Mental Workload Predictions in Flight-Deck Environment Type Journal Article
  Year 2024 Publication Sensors Abbreviated Journal SENS  
  Volume 24 Issue 4 Pages 1174  
  Keywords  
  Abstract High mental workload reduces human performance and the ability to correctly carry out complex tasks. In particular, aircraft pilots enduring high mental workloads are at high risk of failure, even with catastrophic outcomes. Despite progress, there is still a lack of knowledge about the interrelationship between mental workload and brain functionality, and there is still limited data on flight-deck scenarios. Although recent emerging deep-learning (DL) methods using physiological data have presented new ways to find new physiological markers to detect and assess cognitive states, they demand large amounts of properly annotated datasets to achieve good performance. We present a new dataset of electroencephalogram (EEG) recordings specifically collected for the recognition of different levels of mental workload. The data were recorded from three experiments, where participants were induced to different levels of workload through tasks of increasing cognition demand. The first involved playing the N-back test, which combines memory recall with arithmetical skills. The second was playing Heat-the-Chair, a serious game specifically designed to emphasize and monitor subjects under controlled concurrent tasks. The third was flying in an Airbus320 simulator and solving several critical situations. The design of the dataset has been validated on three different levels: (1) correlation of the theoretical difficulty of each scenario to the self-perceived difficulty and performance of subjects; (2) significant difference in EEG temporal patterns across the theoretical difficulties and (3) usefulness for the training and evaluation of AI models.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes IAM Approved no  
  Call Number Admin @ si @ HYF2024 Serial 4019  
 

 
Author Gemma Sanchez; Josep Llados; K. Tombre
  Title A mean string algorithm to compute the average among a set of 2D shapes Type Journal Article
  Year 2002 Publication Pattern Recognition Letters Abbreviated Journal PRL  
  Volume 23 Issue 1-3 Pages 203–214  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; IF: 0.409 Approved no  
  Call Number DAG @ dag @ SLT2002 Serial 275  
 

 
Author Carles Fernandez; Pau Baiget; Xavier Roca; Jordi Gonzalez
  Title Interpretation of Complex Situations in a Semantic-based Surveillance Framework Type Journal
  Year 2008 Publication Signal Processing: Image Communication, Special Issue on Semantic Analysis for Interactive Multimedia Services Abbreviated Journal  
  Volume 23 Issue 7 Pages 554-569  
  Keywords Cognitive vision system; Situation analysis; Applied ontologies  
  Abstract The integration of cognitive capabilities in computer vision systems requires both to enable high semantic expressiveness and to deal with high computational costs as large amounts of data are involved in the analysis. This contribution describes a cognitive vision system conceived to automatically provide high-level interpretations of complex real-time situations in outdoor and indoor scenarios, and to eventually maintain communication with casual end users in multiple languages. The main contributions are: (i) the design of an integrative multilevel architecture for cognitive surveillance purposes; (ii) the proposal of a coherent taxonomy of knowledge to guide the process of interpretation, which leads to the conception of a situation-based ontology; (iii) the use of situational analysis for content detection and a progressive interpretation of semantically rich scenes, by managing incomplete or uncertain knowledge, and (iv) the use of such an ontological background to enable multilingual capabilities and advanced end-user interfaces. Experimental results are provided to show the feasibility of the proposed approach.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ISE Approved no  
  Call Number ISE @ ise @ FBR2008 Serial 954  
 

 
Author Josep Llados; Enric Marti; Juan J.Villanueva
  Title Symbol recognition by error-tolerant subgraph matching between region adjacency graphs Type Journal Article
  Year 2001 Publication IEEE Transactions on Pattern Analysis and Machine Intelligence Abbreviated Journal  
  Volume 23 Issue 10 Pages 1137-1143  
  Keywords  
  Abstract The recognition of symbols in graphic documents is an intensive research activity in the community of pattern recognition and document analysis. A key issue in the interpretation of maps, engineering drawings, diagrams, etc. is the recognition of domain dependent symbols according to a symbol database. In this work we first review the most outstanding symbol recognition methods from two different points of view: application domains and pattern recognition methods. In the second part of the paper, open and unaddressed problems involved in symbol recognition are described, analyzing their current state of art and discussing future research challenges. Thus, issues such as symbol representation, matching, segmentation, learning, scalability of recognition methods and performance evaluation are addressed in this work. Finally, we discuss the perspectives of symbol recognition concerning to new paradigms such as user interfaces in handheld computers or document database and WWW indexing by graphical content.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG;IAM;ISE; Approved no  
  Call Number IAM @ iam @ LMV2001 Serial 1581  
 

 
Author Oriol Pujol; Debora Gil; Petia Radeva
  Title Fundamentals of Stop and Go active models Type Journal Article
  Year 2005 Publication Image and Vision Computing Abbreviated Journal  
  Volume 23 Issue 8 Pages 681-691  
  Keywords Deformable models; Geodesic snakes; Region-based segmentation  
  Abstract An efficient snake formulation should conform to the idea of picking the smoothest curve among all the shapes approximating an object of interest. In current geodesic snakes, the regularizing curvature also affects the convergence stage, hindering the latter at concave regions. In the present work, we make use of characteristic functions to define a novel geodesic formulation that decouples regularity and convergence. This term decoupling endows the snake with higher adaptability to non-convex shapes. Convergence is ensured by splitting the definition of the external force into an attractive vector field and a repulsive one. In our paper, we propose to use likelihood maps as approximation of characteristic functions of object appearance. The better efficiency and accuracy of our decoupled scheme are illustrated in the particular case of feature space-based segmentation.  
  Address  
  Corporate Author Thesis  
  Publisher Butterworth-Heinemann Place of Publication Newton, MA, USA Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 0262-8856 ISBN Medium  
  Area Expedition Conference  
  Notes IAM;MILAB;HuPBA Approved no  
  Call Number IAM @ iam @ PGR2005 Serial 1629  
 

 
Author Xavier Carrillo; E Fernandez-Nofrerias; Francesco Ciompi; Oriol Rodriguez-Leor; Petia Radeva; Neus Salvatella; Oriol Pujol; J. Mauri; A. Bayes
  Title Changes in Radial Artery Volume Assessed Using Intravascular Ultrasound: A Comparison of Two Vasodilator Regimens in Transradial Coronary Intervention Type Journal Article
  Year 2011 Publication Journal of Invasive Cardiology Abbreviated Journal JOIC  
  Volume 23 Issue 10 Pages 401-404  
  Keywords radial; vasodilator treatment; percutaneous coronary intervention; IVUS; volumetric IVUS analysis  
  Abstract OBJECTIVES:
This study used intravascular ultrasound (IVUS) to evaluate radial artery volume changes after intraarterial administration of nitroglycerin and/or verapamil.
BACKGROUND:
Radial artery spasm, which is associated with radial artery size, is the main limitation of the transradial approach in percutaneous coronary interventions (PCI).
METHODS:
This prospective, randomized study compared the effect of two intra-arterial vasodilator regimens on radial artery volume: 0.2 mg of nitroglycerin plus 2.5 mg of verapamil (Group 1; n = 15) versus 2.5 mg of verapamil alone (Group 2; n = 15). Radial artery lumen volume was assessed using IVUS at two time points: at baseline (5 minutes after sheath insertion) and post-vasodilator (1 minute after drug administration). The luminal volume of the radial artery was computed using ECOC Random Fields (ECOC-RF), a technique used for automatic segmentation of luminal borders in longitudinal cut images from IVUS sequences.
RESULTS:
There was a significant increase in arterial lumen volume in both groups, with an increase from 451 ± 177 mm³ to 508 ± 192 mm³ (p = 0.001) in Group 1 and from 456 ± 188 mm³ to 509 ± 170 mm³ (p = 0.001) in Group 2. There were no significant differences between the groups in terms of absolute volume increase (58 mm³ versus 53 mm³, respectively; p = 0.65) or in relative volume increase (14% versus 20%, respectively; p = 0.69).
CONCLUSIONS:
Administration of nitroglycerin plus verapamil or verapamil alone to the radial artery resulted in similar increases in arterial lumen volume according to ECOC-RF IVUS measurements.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MILAB;HuPBA Approved no  
  Call Number Admin @ si @ CFC2011 Serial 1797  
 

 
Author Shida Beigpour; Christian Riess; Joost Van de Weijer; Elli Angelopoulou
  Title Multi-Illuminant Estimation with Conditional Random Fields Type Journal Article
  Year 2014 Publication IEEE Transactions on Image Processing Abbreviated Journal TIP  
  Volume 23 Issue 1 Pages 83-95  
  Keywords color constancy; CRF; multi-illuminant  
  Abstract Most existing color constancy algorithms assume uniform illumination. However, in real-world scenes, this is not often the case. Thus, we propose a novel framework for estimating the colors of multiple illuminants and their spatial distribution in the scene. We formulate this problem as an energy minimization task within a conditional random field over a set of local illuminant estimates. In order to quantitatively evaluate the proposed method, we created a novel data set of two-dominant-illuminant images comprised of laboratory, indoor, and outdoor scenes. Unlike prior work, our database includes accurate pixel-wise ground truth illuminant information. The performance of our method is evaluated on multiple data sets. Experimental results show that our framework clearly outperforms single illuminant estimators as well as a recently proposed multi-illuminant estimation approach.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1057-7149 ISBN Medium  
  Area Expedition Conference  
  Notes CIC; LAMP; 600.074; 600.079 Approved no  
  Call Number Admin @ si @ BRW2014 Serial 2451  
 

 
Author Fahad Shahbaz Khan; Joost Van de Weijer; Muhammad Anwer Rao; Michael Felsberg; Carlo Gatta
  Title Semantic Pyramids for Gender and Action Recognition Type Journal Article
  Year 2014 Publication IEEE Transactions on Image Processing Abbreviated Journal TIP  
  Volume 23 Issue 8 Pages 3633-3645  
  Keywords  
  Abstract Person description is a challenging problem in computer vision. We investigated two major aspects of person description: 1) gender and 2) action recognition in still images. Most state-of-the-art approaches for gender and action recognition rely on the description of a single body part, such as face or full-body. However, relying on a single body part is suboptimal due to significant variations in scale, viewpoint, and pose in real-world images. This paper proposes a semantic pyramid approach for pose normalization. Our approach is fully automatic and based on combining information from full-body, upper-body, and face regions for gender and action recognition in still images. The proposed approach does not require any annotations for upper-body and face of a person. Instead, we rely on pretrained state-of-the-art upper-body and face detectors to automatically extract semantic information of a person. Given multiple bounding boxes from each body part detector, we then propose a simple method to select the best candidate bounding box, which is used for feature extraction. Finally, the extracted features from the full-body, upper-body, and face regions are combined into a single representation for classification. To validate the proposed approach for gender recognition, experiments are performed on three large data sets namely: 1) human attribute; 2) head-shoulder; and 3) proxemics. For action recognition, we perform experiments on four data sets most used for benchmarking action recognition in still images: 1) Sports; 2) Willow; 3) PASCAL VOC 2010; and 4) Stanford-40. Our experiments clearly demonstrate that the proposed approach, despite its simplicity, outperforms state-of-the-art methods for gender and action recognition.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1057-7149 ISBN Medium  
  Area Expedition Conference  
  Notes CIC; LAMP; 601.160; 600.074; 600.079;MILAB Approved no  
  Call Number Admin @ si @ KWR2014 Serial 2507  