Publicacions CVC -- Query Results

	916–930 of 3413 records found matching your query:


	Records
	Author	Fahad Shahbaz Khan; Jiaolong Xu; Muhammad Anwer Rao; Joost Van de Weijer; Andrew Bagdanov; Antonio Lopez
	Title	Recognizing Actions through Action-specific Person Detection			Type	Journal Article
	Year	2015	Publication	IEEE Transactions on Image Processing	Abbreviated Journal	TIP
	Volume	24	Issue	11	Pages	4422-4432
	Keywords
	Abstract	Action recognition in still images is a challenging problem in computer vision. To facilitate comparative evaluation independently of person detection, the standard evaluation protocol for action recognition uses an oracle person detector to obtain perfect bounding box information at both training and test time. The assumption is that, in practice, a general person detector will provide candidate bounding boxes for action recognition. In this paper, we argue that this paradigm is suboptimal and that action class labels should already be considered during the detection stage. Motivated by the observation that body pose is strongly conditioned on action class, we show that: 1) the existing state-of-the-art generic person detectors are not adequate for proposing candidate bounding boxes for action classification; 2) due to limited training examples, the direct training of action-specific person detectors is also inadequate; and 3) using only a small number of labeled action examples, the transfer learning is able to adapt an existing detector to propose higher quality bounding boxes for subsequent action classification. To the best of our knowledge, we are the first to investigate transfer learning for the task of action-specific person detection in still images. We perform extensive experiments on two benchmark data sets: 1) Stanford-40 and 2) PASCAL VOC 2012. For the action detection task (i.e., both person localization and classification of the action performed), our approach outperforms methods based on general person detection by 5.7% mean average precision (MAP) on Stanford-40 and 2.1% MAP on PASCAL VOC 2012. Our approach also significantly outperforms the state of the art with a MAP of 45.4% on Stanford-40 and 31.4% on PASCAL VOC 2012. We also evaluate our action detection approach for the task of action classification (i.e., recognizing actions without localizing them). 
For this task, our approach, without using any ground-truth person localization at test time, outperforms state-of-the-art methods, which do use person locations, on both data sets.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	1057-7149	ISBN		Medium
	Area		Expedition		Conference
	Notes	ADAS; LAMP; 600.076; 600.079			Approved	no
	Call Number	Admin @ si @ KXR2015			Serial	2668



	Author	Lluis Garrido; M. Guerrieri; Laura Igual
	Title	Image Segmentation with Cage Active Contours			Type	Journal Article
	Year	2015	Publication	IEEE Transactions on Image Processing	Abbreviated Journal	TIP
	Volume	24	Issue	12	Pages	5557-5566
	Keywords	Level sets; Mean value coordinates; Parametrized active contours
	Abstract	In this paper, we present a framework for image segmentation based on parametrized active contours. The evolving contour is parametrized according to a reduced set of control points that form a closed polygon and have a clear visual interpretation. The parametrization, called mean value coordinates, stems from the techniques used in computer graphics to animate virtual models. Our framework makes it easy to formulate region-based energies to segment an image. In particular, we present three different local region-based energy terms: 1) the mean model; 2) the Gaussian model; and 3) the histogram model. We show the behavior of our method on synthetic and real images and compare the performance with state-of-the-art level set methods.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	1057-7149	ISBN		Medium
	Area		Expedition		Conference
	Notes	MILAB			Approved	no
	Call Number	Admin @ si @ GGI2015			Serial	2673



	Author	Mikhail Mozerov; Joost Van de Weijer
	Title	Global Color Sparseness and a Local Statistics Prior for Fast Bilateral Filtering			Type	Journal Article
	Year	2015	Publication	IEEE Transactions on Image Processing	Abbreviated Journal	TIP
	Volume	24	Issue	12	Pages	5842-5853
	Keywords
	Abstract	The property of smoothing while preserving edges makes the bilateral filter a very popular image processing tool. However, its non-linear nature results in a computationally costly operation. Various works propose fast approximations to the bilateral filter. However, the majority do not generalize to vector input, as is the case with color images. We propose a fast approximation to the bilateral filter for color images. The filter is based on two ideas. First, the number of colors that occur in a single natural image is limited. We exploit this color sparseness to rewrite the initial non-linear bilateral filter as a number of linear filter operations. Second, we impose a statistical prior on the image values that are locally present within the filter window. We show that this statistical prior leads to a closed-form solution of the bilateral filter. Finally, we combine both ideas into a single fast and accurate bilateral filter for color images. Experimental results show that our bilateral filter based on the local prior yields an extremely fast bilateral filter approximation, but with limited accuracy, which has potential application in real-time video filtering. Our bilateral filter, which combines color sparseness and local statistics, yields a fast and accurate bilateral filter approximation and obtains state-of-the-art results.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	1057-7149	ISBN		Medium
	Area		Expedition		Conference
	Notes	LAMP; 600.079; ISE			Approved	no
	Call Number	Admin @ si @ MoW2015b			Serial	2689



	Author	I. Sorodoc; S. Pezzelle; A. Herbelot; Mariella Dimiccoli; R. Bernardi
	Title	Learning quantification from images: A structured neural architecture			Type	Journal Article
	Year	2018	Publication	Natural Language Engineering	Abbreviated Journal	NLE
	Volume	24	Issue	3	Pages	363-392
	Keywords
	Abstract	Major advances have recently been made in merging language and vision representations. Most tasks considered so far have confined themselves to the processing of objects and lexicalised relations amongst objects (content words). We know, however, that humans (even pre-school children) can abstract over raw multimodal data to perform certain types of higher level reasoning, expressed in natural language by function words. A case in point is given by their ability to learn quantifiers, i.e. expressions like few, some and all. From formal semantics and cognitive linguistics, we know that quantifiers are relations over sets which, as a simplification, we can see as proportions. For instance, in most fish are red, most encodes the proportion of fish which are red fish. In this paper, we study how well current neural network strategies model such relations. We propose a task where, given an image and a query expressed by an object–property pair, the system must return a quantifier expressing which proportions of the queried object have the queried property. Our contributions are twofold. First, we show that the best performance on this task involves coupling state-of-the-art attention mechanisms with a network architecture mirroring the logical structure assigned to quantifiers by classic linguistic formalisation. Second, we introduce a new balanced dataset of image scenarios associated with quantification queries, which we hope will foster further research in this area.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	MILAB; no menciona			Approved	no
	Call Number	Admin @ si @ SPH2018			Serial	3021



	Author	Estefania Talavera; Maria Leyva-Vallina; Md. Mostafa Kamal Sarker; Domenec Puig; Nicolai Petkov; Petia Radeva
	Title	Hierarchical approach to classify food scenes in egocentric photo-streams			Type	Journal Article
	Year	2020	Publication	IEEE Journal of Biomedical and Health Informatics	Abbreviated Journal	J-BHI
	Volume	24	Issue	3	Pages	866-877
	Keywords
	Abstract	Recent studies have shown that the environment where people eat can affect their nutritional behaviour. In this work, we provide automatic tools for a personalised analysis of a person's health habits by the examination of daily recorded egocentric photo-streams. Specifically, we propose a new automatic approach for the classification of food-related environments, that is able to classify up to 15 such scenes. In this way, people can monitor the context around their food intake in order to get an objective insight into their daily eating routine. We propose a model that classifies food-related scenes organized in a semantic hierarchy. Additionally, we present and make available a new egocentric dataset composed of more than 33000 images recorded by a wearable camera, over which our proposed model has been tested. Our approach obtains an accuracy and F-score of 56% and 65%, respectively, clearly outperforming the baseline methods.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	MILAB; no proj			Approved	no
	Call Number	Admin @ si @ TLM2020			Serial	3380



	Author	Sanket Biswas; Pau Riba; Josep Llados; Umapada Pal
	Title	Beyond Document Object Detection: Instance-Level Segmentation of Complex Layouts			Type	Journal Article
	Year	2021	Publication	International Journal on Document Analysis and Recognition	Abbreviated Journal	IJDAR
	Volume	24	Issue		Pages	269–281
	Keywords
	Abstract	Information extraction is a fundamental task of many business intelligence services that entail massive document processing. Understanding a document page structure in terms of its layout provides contextual support which is helpful in the semantic interpretation of the document terms. In this paper, inspired by the progress of deep learning methodologies applied to the task of object recognition, we transfer these models to the specific case of document object detection, reformulating the traditional problem of document layout analysis. Moreover, we extend prior art by defining the task of instance segmentation on the document image domain. An instance segmentation paradigm is especially important in complex layouts whose contents should interact for the proper rendering of the page, i.e., the proper text wrapping around an image. Finally, we provide an extensive evaluation, both qualitative and quantitative, that demonstrates the superior performance of the proposed methodology over the current state of the art.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.121; 600.140; 110.312			Approved	no
	Call Number	Admin @ si @ BRL2021b			Serial	3574



	Author	Minesh Mathew; Lluis Gomez; Dimosthenis Karatzas; C.V. Jawahar
	Title	Asking questions on handwritten document collections			Type	Journal Article
	Year	2021	Publication	International Journal on Document Analysis and Recognition	Abbreviated Journal	IJDAR
	Volume	24	Issue		Pages	235-249
	Keywords
	Abstract	This work addresses the problem of Question Answering (QA) on handwritten document collections. Unlike typical QA and Visual Question Answering (VQA) formulations where the answer is a short text, we aim to locate a document snippet where the answer lies. The proposed approach works without recognizing the text in the documents. We argue that the recognition-free approach is suitable for handwritten documents and historical collections where robust text recognition is often difficult. At the same time, for human users, document image snippets containing answers act as a valid alternative to textual answers. The proposed approach uses an off-the-shelf deep embedding network which can project both textual words and word images into a common sub-space. This embedding bridges the textual and visual domains and helps us retrieve document snippets that potentially answer a question. We evaluate results of the proposed approach on two new datasets: (i) HW-SQuAD: a synthetic, handwritten document image counterpart of SQuAD1.0 dataset and (ii) BenthamQA: a smaller set of QA pairs defined on documents from the popular Bentham manuscripts collection. We also present a thorough analysis of the proposed recognition-free approach compared to a recognition-based approach which uses text recognized from the images using an OCR. Datasets presented in this work are available to download at docvqa.org.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.121			Approved	no
	Call Number	Admin @ si @ MGK2021			Serial	3621



	Author	Aura Hernandez-Sabate; Jose Elias Yauri; Pau Folch; Daniel Alvarez; Debora Gil
	Title	EEG Dataset Collection for Mental Workload Predictions in Flight-Deck Environment			Type	Journal Article
	Year	2024	Publication	Sensors	Abbreviated Journal	SENS
	Volume	24	Issue	4	Pages	1174
	Keywords
	Abstract	High mental workload reduces human performance and the ability to correctly carry out complex tasks. In particular, aircraft pilots enduring high mental workloads are at high risk of failure, even with catastrophic outcomes. Despite progress, there is still a lack of knowledge about the interrelationship between mental workload and brain functionality, and there is still limited data on flight-deck scenarios. Although recent emerging deep-learning (DL) methods using physiological data have presented new ways to find new physiological markers to detect and assess cognitive states, they demand large amounts of properly annotated datasets to achieve good performance. We present a new dataset of electroencephalogram (EEG) recordings specifically collected for the recognition of different levels of mental workload. The data were recorded from three experiments, where participants were induced to different levels of workload through tasks of increasing cognition demand. The first involved playing the N-back test, which combines memory recall with arithmetical skills. The second was playing Heat-the-Chair, a serious game specifically designed to emphasize and monitor subjects under controlled concurrent tasks. The third was flying in an Airbus320 simulator and solving several critical situations. The design of the dataset has been validated on three different levels: (1) correlation of the theoretical difficulty of each scenario to the self-perceived difficulty and performance of subjects; (2) significant difference in EEG temporal patterns across the theoretical difficulties and (3) usefulness for the training and evaluation of AI models.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	IAM			Approved	no
	Call Number	Admin @ si @ HYF2024			Serial	4019



	Author	Gemma Sanchez; Josep Llados; K. Tombre
	Title	A mean string algorithm to compute the average among a set of 2D shapes			Type	Journal Article
	Year	2002	Publication	Pattern Recognition Letters	Abbreviated Journal	PRL
	Volume	23	Issue	1-3	Pages	203–214
	Keywords
	Abstract
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; IF: 0.409			Approved	no
	Call Number	DAG @ dag @ SLT2002			Serial	275



	Author	Carles Fernandez; Pau Baiget; Xavier Roca; Jordi Gonzalez
	Title	Interpretation of Complex Situations in a Semantic-based Surveillance Framework			Type	Journal
	Year	2008	Publication	Signal Processing: Image Communication, Special Issue on Semantic Analysis for Interactive Multimedia Services	Abbreviated Journal
	Volume	23	Issue	7	Pages	554-569
	Keywords	Cognitive vision system; Situation analysis; Applied ontologies
	Abstract	The integration of cognitive capabilities in computer vision systems requires both to enable high semantic expressiveness and to deal with high computational costs as large amounts of data are involved in the analysis. This contribution describes a cognitive vision system conceived to automatically provide high-level interpretations of complex real-time situations in outdoor and indoor scenarios, and to eventually maintain communication with casual end users in multiple languages. The main contributions are: (i) the design of an integrative multilevel architecture for cognitive surveillance purposes; (ii) the proposal of a coherent taxonomy of knowledge to guide the process of interpretation, which leads to the conception of a situation-based ontology; (iii) the use of situational analysis for content detection and a progressive interpretation of semantically rich scenes, by managing incomplete or uncertain knowledge, and (iv) the use of such an ontological background to enable multilingual capabilities and advanced end-user interfaces. Experimental results are provided to show the feasibility of the proposed approach.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	ISE			Approved	no
	Call Number	ISE @ ise @ FBR2008			Serial	954



	Author	Josep Llados; Enric Marti; Juan J. Villanueva
	Title	Symbol recognition by error-tolerant subgraph matching between region adjacency graphs			Type	Journal Article
	Year	2001	Publication	IEEE Transactions on Pattern Analysis and Machine Intelligence	Abbreviated Journal
	Volume	23	Issue	10	Pages	1137-1143
	Keywords
	Abstract	The recognition of symbols in graphic documents is an intensive research activity in the community of pattern recognition and document analysis. A key issue in the interpretation of maps, engineering drawings, diagrams, etc. is the recognition of domain dependent symbols according to a symbol database. In this work we first review the most outstanding symbol recognition methods from two different points of view: application domains and pattern recognition methods. In the second part of the paper, open and unaddressed problems involved in symbol recognition are described, analyzing their current state of the art and discussing future research challenges. Thus, issues such as symbol representation, matching, segmentation, learning, scalability of recognition methods and performance evaluation are addressed in this work. Finally, we discuss the perspectives of symbol recognition with respect to new paradigms such as user interfaces in handheld computers or document database and WWW indexing by graphical content.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; IAM; ISE			Approved	no
	Call Number	IAM @ iam @ LMV2001			Serial	1581



	Author	Oriol Pujol; Debora Gil; Petia Radeva
	Title	Fundamentals of Stop and Go active models			Type	Journal Article
	Year	2005	Publication	Image and Vision Computing	Abbreviated Journal
	Volume	23	Issue	8	Pages	681-691
	Keywords	Deformable models; Geodesic snakes; Region-based segmentation
	Abstract	An efficient snake formulation should conform to the idea of picking the smoothest curve among all the shapes approximating an object of interest. In current geodesic snakes, the regularizing curvature also affects the convergence stage, hindering the latter at concave regions. In the present work, we make use of characteristic functions to define a novel geodesic formulation that decouples regularity and convergence. This term decoupling endows the snake with higher adaptability to non-convex shapes. Convergence is ensured by splitting the definition of the external force into an attractive vector field and a repulsive one. In our paper, we propose to use likelihood maps as approximation of characteristic functions of object appearance. The better efficiency and accuracy of our decoupled scheme are illustrated in the particular case of feature space-based segmentation.
	Address
	Corporate Author				Thesis
	Publisher	Butterworth-Heinemann	Place of Publication	Newton, MA, USA	Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	0262-8856	ISBN		Medium
	Area		Expedition		Conference
	Notes	IAM; MILAB; HuPBA			Approved	no
	Call Number	IAM @ iam @ PGR2005			Serial	1629



	Author	Xavier Carrillo; E Fernandez-Nofrerias; Francesco Ciompi; Oriol Rodriguez-Leor; Petia Radeva; Neus Salvatella; Oriol Pujol; J. Mauri; A. Bayes
	Title	Changes in Radial Artery Volume Assessed Using Intravascular Ultrasound: A Comparison of Two Vasodilator Regimens in Transradial Coronary Intervention			Type	Journal Article
	Year	2011	Publication	Journal of Invasive Cardiology	Abbreviated Journal	JOIC
	Volume	23	Issue	10	Pages	401-404
	Keywords	radial; vasodilator treatment; percutaneous coronary intervention; IVUS; volumetric IVUS analysis
	Abstract	OBJECTIVES: This study used intravascular ultrasound (IVUS) to evaluate radial artery volume changes after intraarterial administration of nitroglycerin and/or verapamil. BACKGROUND: Radial artery spasm, which is associated with radial artery size, is the main limitation of the transradial approach in percutaneous coronary interventions (PCI). METHODS: This prospective, randomized study compared the effect of two intra-arterial vasodilator regimens on radial artery volume: 0.2 mg of nitroglycerin plus 2.5 mg of verapamil (Group 1; n = 15) versus 2.5 mg of verapamil alone (Group 2; n = 15). Radial artery lumen volume was assessed using IVUS at two time points: at baseline (5 minutes after sheath insertion) and post-vasodilator (1 minute after drug administration). The luminal volume of the radial artery was computed using ECOC Random Fields (ECOC-RF), a technique used for automatic segmentation of luminal borders in longitudinal cut images from IVUS sequences. RESULTS: There was a significant increase in arterial lumen volume in both groups, with an increase from 451 ± 177 mm³ to 508 ± 192 mm³ (p = 0.001) in Group 1 and from 456 ± 188 mm³ to 509 ± 170 mm³ (p = 0.001) in Group 2. There were no significant differences between the groups in terms of absolute volume increase (58 mm³ versus 53 mm³, respectively; p = 0.65) or in relative volume increase (14% versus 20%, respectively; p = 0.69). CONCLUSIONS: Administration of nitroglycerin plus verapamil or verapamil alone to the radial artery resulted in similar increases in arterial lumen volume according to ECOC-RF IVUS measurements.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	MILAB; HuPBA			Approved	no
	Call Number	Admin @ si @ CFC2011			Serial	1797



	Author	Shida Beigpour; Christian Riess; Joost Van de Weijer; Elli Angelopoulou
	Title	Multi-Illuminant Estimation with Conditional Random Fields			Type	Journal Article
	Year	2014	Publication	IEEE Transactions on Image Processing	Abbreviated Journal	TIP
	Volume	23	Issue	1	Pages	83-95
	Keywords	color constancy; CRF; multi-illuminant
	Abstract	Most existing color constancy algorithms assume uniform illumination. However, in real-world scenes, this is not often the case. Thus, we propose a novel framework for estimating the colors of multiple illuminants and their spatial distribution in the scene. We formulate this problem as an energy minimization task within a conditional random field over a set of local illuminant estimates. In order to quantitatively evaluate the proposed method, we created a novel data set of two-dominant-illuminant images comprised of laboratory, indoor, and outdoor scenes. Unlike prior work, our database includes accurate pixel-wise ground truth illuminant information. The performance of our method is evaluated on multiple data sets. Experimental results show that our framework clearly outperforms single illuminant estimators as well as a recently proposed multi-illuminant estimation approach.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	1057-7149	ISBN		Medium
	Area		Expedition		Conference
	Notes	CIC; LAMP; 600.074; 600.079			Approved	no
	Call Number	Admin @ si @ BRW2014			Serial	2451



	Author	Fahad Shahbaz Khan; Joost Van de Weijer; Muhammad Anwer Rao; Michael Felsberg; Carlo Gatta
	Title	Semantic Pyramids for Gender and Action Recognition			Type	Journal Article
	Year	2014	Publication	IEEE Transactions on Image Processing	Abbreviated Journal	TIP
	Volume	23	Issue	8	Pages	3633-3645
	Keywords
	Abstract	Person description is a challenging problem in computer vision. We investigated two major aspects of person description: 1) gender and 2) action recognition in still images. Most state-of-the-art approaches for gender and action recognition rely on the description of a single body part, such as face or full-body. However, relying on a single body part is suboptimal due to significant variations in scale, viewpoint, and pose in real-world images. This paper proposes a semantic pyramid approach for pose normalization. Our approach is fully automatic and based on combining information from full-body, upper-body, and face regions for gender and action recognition in still images. The proposed approach does not require any annotations for upper-body and face of a person. Instead, we rely on pretrained state-of-the-art upper-body and face detectors to automatically extract semantic information of a person. Given multiple bounding boxes from each body part detector, we then propose a simple method to select the best candidate bounding box, which is used for feature extraction. Finally, the extracted features from the full-body, upper-body, and face regions are combined into a single representation for classification. To validate the proposed approach for gender recognition, experiments are performed on three large data sets namely: 1) human attribute; 2) head-shoulder; and 3) proxemics. For action recognition, we perform experiments on four data sets most used for benchmarking action recognition in still images: 1) Sports; 2) Willow; 3) PASCAL VOC 2010; and 4) Stanford-40. Our experiments clearly demonstrate that the proposed approach, despite its simplicity, outperforms state-of-the-art methods for gender and action recognition.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	1057-7149	ISBN		Medium
	Area		Expedition		Conference
	Notes	CIC; LAMP; 601.160; 600.074; 600.079; MILAB			Approved	no
	Call Number	Admin @ si @ KWR2014			Serial	2507