Publicacions CVC -- Query Results

	1906–1920 of 3413 records found matching your query:

	Records
	Author	Jiaolong Xu; David Vazquez; Antonio Lopez; Javier Marin; Daniel Ponsa
	Title	Learning a Part-based Pedestrian Detector in Virtual World			Type	Journal Article
	Year	2014	Publication	IEEE Transactions on Intelligent Transportation Systems	Abbreviated Journal	TITS
	Volume	15	Issue	5	Pages	2121-2131
	Keywords	Domain Adaptation; Pedestrian Detection; Virtual Worlds
	Abstract	Detecting pedestrians with on-board vision systems is of paramount interest for assisting drivers to prevent vehicle-to-pedestrian accidents. The core of a pedestrian detector is its classification module, which aims at deciding if a given image window contains a pedestrian. Given the difficulty of this task, many classifiers have been proposed during the last fifteen years. Among them, the so-called (deformable) part-based classifiers including multi-view modeling are usually top ranked in accuracy. Training such classifiers is not trivial since a proper aspect clustering and spatial part alignment of the pedestrian training samples are crucial for obtaining an accurate classifier. In this paper, first we perform automatic aspect clustering and part alignment by using virtual-world pedestrians, i.e., human annotations are not required. Second, we use a mixture-of-parts approach that allows part sharing among different aspects. Third, these proposals are integrated in a learning framework which also allows to incorporate real-world training data to perform domain adaptation between virtual- and real-world cameras. Overall, the obtained results on four popular on-board datasets show that our proposal clearly outperforms the state-of-the-art deformable part-based detector known as latent SVM.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	1931-0587	ISBN	978-1-4673-2754-1	Medium
	Area		Expedition		Conference
	Notes	ADAS; 600.076			Approved	no
	Call Number	ADAS @ adas @ XVL2014			Serial	2433



	Author	Jiaolong Xu; Sebastian Ramos; David Vazquez; Antonio Lopez
	Title	Cost-sensitive Structured SVM for Multi-category Domain Adaptation			Type	Conference Article
	Year	2014	Publication	22nd International Conference on Pattern Recognition	Abbreviated Journal
	Volume		Issue		Pages	3886 - 3891
	Keywords	Domain Adaptation; Pedestrian Detection
	Abstract	Domain adaptation addresses the problem of accuracy drop that a classifier may suffer when the training data (source domain) and the testing data (target domain) are drawn from different distributions. In this work, we focus on domain adaptation for structured SVM (SSVM). We propose a cost-sensitive domain adaptation method for SSVM, namely COSS-SSVM. In particular, during the re-training of an adapted classifier based on target and source data, the idea that we explore consists in introducing a non-zero cost even for correctly classified source domain samples. Eventually, we aim to learn a more target-oriented classifier by not rewarding (zero loss) properly classified source-domain training samples. We assess the effectiveness of COSS-SSVM on multi-category object recognition.
	Address	Stockholm; Sweden; August 2014
	Corporate Author				Thesis
	Publisher	IEEE	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	1051-4651	ISBN		Medium
	Area		Expedition		Conference	ICPR
	Notes	ADAS; 600.057; 600.054; 601.217; 600.076			Approved	no
	Call Number	ADAS @ adas @ XRV2014a			Serial	2434



	Author	Onur Ferhat; Fernando Vilariño; F. Javier Sanchez
	Title	A cheap portable eye-tracker solution for common setups.			Type	Journal Article
	Year	2014	Publication	Journal of Eye Movement Research	Abbreviated Journal	JEMR
	Volume	7	Issue	3	Pages	1-10
	Keywords
	Abstract	We analyze the feasibility of a cheap eye-tracker where the hardware consists of a single webcam and a Raspberry Pi device. Our aim is to discover the limits of such a system and to see whether it provides an acceptable performance. We base our work on the open source Opengazer (Zielinski, 2013) and we propose several improvements to create a robust, real-time system which can work on a computer with 30Hz sampling rate. After assessing the accuracy of our eye-tracker in elaborated experiments involving 12 subjects under 4 different system setups, we install it on a Raspberry Pi to create a portable stand-alone eye-tracker which achieves 1.42° horizontal accuracy with 3Hz refresh rate for a building cost of 70 Euros.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	SIAI			Approved	no
	Call Number	Admin @ si @ FVS2014			Serial	2435



	Author	J.S. Cope; P. Remagnino; S. Mannan; Katerine Diaz; Francesc J. Ferri; P. Wilkin
	Title	Reverse Engineering Expert Visual Observations: From Fixations To The Learning Of Spatial Filters With A Neural-Gas Algorithm			Type	Journal Article
	Year	2013	Publication	Expert Systems with Applications	Abbreviated Journal	EXWA
	Volume	40	Issue	17	Pages	6707-6712
	Keywords	Neural gas; Expert vision; Eye-tracking; Fixations
	Abstract	Human beings can become experts in performing specific vision tasks, for example, doctors analysing medical images, or botanists studying leaves. With sufficient knowledge and experience, people can become very efficient at such tasks. When attempting to perform these tasks with a machine vision system, it would be highly beneficial to be able to replicate the process which the expert undergoes. Advances in eye-tracking technology can provide data to allow us to discover the manner in which an expert studies an image. This paper presents a first step towards utilizing these data for computer vision purposes. A growing-neural-gas algorithm is used to learn a set of Gabor filters which give high responses to image regions which a human expert fixated on. These filters can then be used to identify regions in other images which are likely to be useful for a given vision task. The algorithm is evaluated by learning filters for locating specific areas of plant leaves.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	0957-4174	ISBN		Medium
	Area		Expedition		Conference
	Notes	ADAS			Approved	no
	Call Number	Admin @ si @ CRM2013			Serial	2438



	Author	Katerine Diaz; Francesc J. Ferri; W. Diaz
	Title	Fast Approximated Discriminative Common Vectors using rank-one SVD updates			Type	Conference Article
	Year	2013	Publication	20th International Conference On Neural Information Processing	Abbreviated Journal
	Volume	8228	Issue	III	Pages	368-375
	Keywords
	Abstract	An efficient incremental approach to the discriminative common vector (DCV) method for dimensionality reduction and classification is presented. The proposal consists of a rank-one update along with an adaptive restriction on the rank of the null space which leads to an approximate but convenient solution. The algorithm can be implemented very efficiently in terms of matrix operations and space complexity, which enables its use in large-scale dynamic application domains. Deep comparative experimentation using publicly available high dimensional image datasets has been carried out in order to properly assess the proposed algorithm against several recent incremental formulations.
	Address	Daegu; Korea; November 2013
	Corporate Author				Thesis
	Publisher	Springer Berlin Heidelberg	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN	0302-9743	ISBN	978-3-642-42050-4	Medium
	Area		Expedition		Conference	ICONIP
	Notes	ADAS			Approved	no
	Call Number	Admin @ si @ DFD2013			Serial	2439



	Author	Katerine Diaz; Francesc J. Ferri
	Title	Extensiones del método de vectores comunes discriminantes Aplicadas a la clasificación de imágenes			Type	Book Whole
	Year	2013	Publication	Extensiones del método de vectores comunes discriminantes Aplicadas a la clasificación de imágenes	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Subspace-based methods are a widely used tool in computer vision applications. Here we present and validate several algorithms that we have proposed in this field of research. The first algorithm is an extension of the kernel discriminative common vectors method that reinterprets the null space of the within-class scatter matrix of the training set to obtain the discriminative features. Subspace-based methods admit different types of training. One of the most popular, though not one of the most efficient, is batch learning. In this type of learning, all samples of the training set must be available from the start. Consequently, when new samples become available to the algorithm, the system has to be retrained from scratch. An alternative to this type of training is incremental learning. Here we propose several incremental algorithms for the discriminative common vectors method.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-3-639-55339-0	Medium
	Area		Expedition		Conference
	Notes	ADAS			Approved	no
	Call Number	Admin @ si @ DiF2013			Serial	2440



	Author	Mohammad Ali Bagheri; Qigang Gao; Sergio Escalera
	Title	Combining Local and Global Learners in the Pairwise Multiclass Classification			Type	Journal Article
	Year	2015	Publication	Pattern Analysis and Applications	Abbreviated Journal	PAA
	Volume	18	Issue	4	Pages	845-860
	Keywords	Multiclass classification; Pairwise approach; One-versus-one
	Abstract	Pairwise classification is a well-known class binarization technique that converts a multiclass problem into a number of two-class problems, one problem for each pair of classes. However, in the pairwise technique, nuisance votes of many irrelevant classifiers may result in a wrong class prediction. To overcome this problem, a simple, but efficient method is proposed and evaluated in this paper. The proposed method is based on excluding some classes and focusing on the most probable classes in the neighborhood space, named Local Crossing Off (LCO). This procedure is performed by employing a modified version of standard K-nearest neighbor and large margin nearest neighbor algorithms. The LCO method takes advantage of nearest neighbor classification algorithm because of its local learning behavior as well as the global behavior of powerful binary classifiers to discriminate between two classes. Combining these two properties in the proposed LCO technique will avoid the weaknesses of each method and will increase the efficiency of the whole classification system. On several benchmark datasets of varying size and difficulty, we found that the LCO approach leads to significant improvements using different base learners. The experimental results show that the proposed technique not only achieves better classification accuracy in comparison to other standard approaches, but also is computationally more efficient for tackling classification problems which have a relatively large number of target classes.
	Address
	Corporate Author				Thesis
	Publisher	Springer London	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	1433-7541	ISBN		Medium
	Area		Expedition		Conference
	Notes	HuPBA; MILAB			Approved	no
	Call Number	Admin @ si @ BGE2014			Serial	2441



	Author	Oscar Lopes; Miguel Reyes; Sergio Escalera; Jordi Gonzalez
	Title	Spherical Blurred Shape Model for 3-D Object and Pose Recognition: Quantitative Analysis and HCI Applications in Smart Environments			Type	Journal Article
	Year	2014	Publication	IEEE Transactions on Systems, Man and Cybernetics (Part B)	Abbreviated Journal	TSMCB
	Volume	44	Issue	12	Pages	2379-2390
	Keywords
	Abstract	The use of depth maps is of increasing interest after the advent of cheap multisensor devices based on structured light, such as Kinect. In this context, there is a strong need of powerful 3-D shape descriptors able to generate rich object representations. Although several 3-D descriptors have been already proposed in the literature, the research of discriminative and computationally efficient descriptors is still an open issue. In this paper, we propose a novel point cloud descriptor called spherical blurred shape model (SBSM) that successfully encodes the structure density and local variabilities of an object based on shape voxel distances and a neighborhood propagation strategy. The proposed SBSM is proven to be rotation and scale invariant, robust to noise and occlusions, highly discriminative for multiple categories of complex objects like the human hand, and computationally efficient since the SBSM complexity is linear to the number of object voxels. Experimental evaluation in public depth multiclass object data, 3-D facial expressions data, and a novel hand poses data sets show significant performance improvements in relation to state-of-the-art approaches. Moreover, the effectiveness of the proposal is also proved for object spotting in 3-D scenes and for real-time automatic hand pose recognition in human computer interaction scenarios.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	2168-2267	ISBN		Medium
	Area		Expedition		Conference
	Notes	HuPBA; ISE; 600.078; MILAB			Approved	no
	Call Number	Admin @ si @ LRE2014			Serial	2442



	Author	Xavier Perez Sala; Sergio Escalera; Cecilio Angulo; Jordi Gonzalez
	Title	A survey on model based approaches for 2D and 3D visual human pose recovery			Type	Journal Article
	Year	2014	Publication	Sensors	Abbreviated Journal	SENS
	Volume	14	Issue	3	Pages	4189-4210
	Keywords	human pose recovery; human body modelling; behavior analysis; computer vision
	Abstract	Human Pose Recovery has been studied in the field of Computer Vision for the last 40 years. Several approaches have been reported, and significant improvements have been obtained in both data representation and model design. However, the problem of Human Pose Recovery in uncontrolled environments is far from being solved. In this paper, we define a general taxonomy to group model based approaches for Human Pose Recovery, which is composed of five main modules: appearance, viewpoint, spatial relations, temporal consistence, and behavior. Subsequently, a methodological comparison is performed following the proposed taxonomy, evaluating current SoA approaches in the aforementioned five group categories. As a result of this comparison, we discuss the main advantages and drawbacks of the reviewed literature.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	HuPBA; ISE; 600.046; 600.063; 600.078; MILAB			Approved	no
	Call Number	Admin @ si @ PEA2014			Serial	2443



	Author	Frederic Sampedro; Anna Domenech; Sergio Escalera
	Title	Obtaining quantitative global tumoral state indicators based on whole-body PET/CT scans: A breast cancer case study			Type	Journal Article
	Year	2014	Publication	Nuclear Medicine Communications	Abbreviated Journal	NMC
	Volume	35	Issue	4	Pages	362-371
	Keywords
	Abstract	Objectives: In this work we address the need for the computation of quantitative global tumoral state indicators from oncological whole-body PET/computed tomography scans. The combination of such indicators with other oncological information such as tumor markers or biopsy results would prove useful in oncological decision-making scenarios. Materials and methods: From an ordering of 100 breast cancer patients on the basis of oncological state through visual analysis by a consensus of nuclear medicine specialists, a set of numerical indicators computed from image analysis of the PET/computed tomography scan is presented, which attempts to summarize a patient’s oncological state in a quantitative manner taking into consideration the total tumor volume, aggressiveness, and spread. Results: Results obtained by comparative analysis of the proposed indicators with respect to the experts’ evaluation show up to 87% Pearson’s correlation coefficient when providing expert-guided PET metabolic tumor volume segmentation and 64% correlation when using completely automatic image analysis techniques. Conclusion: Global quantitative tumor information obtained by whole-body PET/CT image analysis can prove useful in clinical nuclear medicine settings and oncological decision-making scenarios. The completely automatic computation of such indicators would improve its impact as time efficiency and specialist independence would be achieved.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	HuPBA; MILAB			Approved	no
	Call Number	SDE2014a			Serial	2444



	Author	Mohammad Ali Bagheri; Qigang Gao; Sergio Escalera
	Title	Generic Subclass Ensemble: A Novel Approach to Ensemble Classification			Type	Conference Article
	Year	2014	Publication	22nd International Conference on Pattern Recognition	Abbreviated Journal
	Volume		Issue		Pages	1254 - 1259
	Keywords
	Abstract	Multiple classifier systems, also known as classifier ensembles, have received great attention in recent years because of their improved classification accuracy in different applications. In this paper, we propose a new general approach to ensemble classification, named generic subclass ensemble, in which each base classifier is trained with data belonging to a subset of classes, and thus discriminates among a subset of target categories. The ensemble classifiers are then fused using a combination rule. The proposed approach differs from existing methods that manipulate the target attribute, since in our approach individual classification problems are not restricted to two-class problems. We perform a series of experiments to evaluate the efficiency of the generic subclass approach on a set of benchmark datasets. Experimental results with multilayer perceptrons show that the proposed approach presents a viable alternative to the most commonly used ensemble classification approaches.
	Address	Stockholm; August 2014
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	1051-4651	ISBN		Medium
	Area		Expedition		Conference	ICPR
	Notes	HuPBA; MILAB			Approved	no
	Call Number	Admin @ si @ BGE2014b			Serial	2445



	Author	Mohammad Ali Bagheri; Gang Hu; Qigang Gao; Sergio Escalera
	Title	A Framework of Multi-Classifier Fusion for Human Action Recognition			Type	Conference Article
	Year	2014	Publication	22nd International Conference on Pattern Recognition	Abbreviated Journal
	Volume		Issue		Pages	1260 - 1265
	Keywords
	Abstract	The performance of different action-recognition methods using skeleton joint locations have been recently studied by several computer vision researchers. However, the potential improvement in classification through classifier fusion by ensemble-based methods has remained unattended. In this work, we evaluate the performance of an ensemble of five action learning techniques, each performing the recognition task from a different perspective. The underlying rationale of the fusion approach is that different learners employ varying structures of input descriptors/features to be trained. These varying structures cannot be attached and used by a single learner. In addition, combining the outputs of several learners can reduce the risk of an unfortunate selection of a poorly performing learner. This leads to having a more robust and general-applicable framework. Also, we propose two simple, yet effective, action description techniques. In order to improve the recognition performance, a powerful combination strategy is utilized based on the Dempster-Shafer theory, which can effectively make use of diversity of base learners trained on different sources of information. The recognition results of the individual classifiers are compared with those obtained from fusing the classifiers' output, showing advanced performance of the proposed methodology.
	Address	Stockholm; Sweden; August 2014
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	1051-4651	ISBN		Medium
	Area		Expedition		Conference	ICPR
	Notes	HuPBA; MILAB			Approved	no
	Call Number	Admin @ si @ BHG2014			Serial	2446



	Author	Naveen Onkarappa
	Title	Optical Flow in Driver Assistance Systems			Type	Book Whole
	Year	2013	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Motion perception is one of the most important attributes of the human brain. Visual motion perception consists in inferring speed and direction of elements in a scene based on visual inputs. Analogously, computer vision is assisted by motion cues in the scene. Motion detection in computer vision is useful in solving problems such as segmentation, depth from motion, structure from motion, compression, navigation and many others. These problems are common in several applications, for instance, video surveillance, robot navigation and advanced driver assistance systems (ADAS). One of the most widely used techniques for motion detection is the optical flow estimation. The work in this thesis attempts to make optical flow suitable for the requirements and conditions of driving scenarios. In this context, a novel space-variant representation called reverse log-polar representation is proposed that is shown to be better than the traditional log-polar space-variant representation for ADAS. The space-variant representations reduce the amount of data to be processed. Another major contribution in this research is related to the analysis of the influence of specific characteristics from driving scenarios on the optical flow accuracy. Characteristics such as vehicle speed and road texture are considered in the aforementioned analysis. From this study, it is inferred that the regularization weight has to be adapted according to the required error measure and for different speeds and road textures. It is also shown that polar represented optical flow suits driving scenarios where predominant motion is translation. Due to the requirements of such a study and by the lack of needed datasets a new synthetic dataset is presented; it contains: i) sequences of different speeds and road textures in an urban scenario; ii) sequences with complex motion of an on-board camera; and iii) sequences with additional moving vehicles in the scene. The ground-truth optical flow is generated by the ray-tracing technique. Further, few applications of optical flow in ADAS are shown. Firstly, a robust RANSAC based technique to estimate horizon line is proposed. Then, an egomotion estimation is presented to compare the proposed space-variant representation with the classical one. As a final contribution, a modification in the regularization term is proposed that notably improves the results in the ADAS applications. This adaptation is evaluated using a state of the art optical flow technique. The experiments on a public dataset (KITTI) validate the advantages of using the proposed modification.
	Address	Bellaterra
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Angel Sappa
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-84-940902-1-9	Medium
	Area		Expedition		Conference
	Notes	ADAS			Approved	no
	Call Number	Admin @ si @ Nav2013			Serial	2447



	Author	Jorge Bernal; Fernando Vilariño; F. Javier Sanchez; M. Arnold; Anarta Ghosh; Gerard Lacey
	Title	Experts vs Novices: Applying Eye-tracking Methodologies in Colonoscopy Video Screening for Polyp Search			Type	Conference Article
	Year	2014	Publication	2014 Symposium on Eye Tracking Research and Applications	Abbreviated Journal
	Volume		Issue		Pages	223-226
	Keywords
	Abstract	We present in this paper a novel study aiming at identifying the differences in visual search patterns between physicians of diverse levels of expertise during the screening of colonoscopy videos. Physicians were clustered into two groups -experts and novices- according to the number of procedures performed, and fixations were captured by an eye-tracker device during the task of polyp search in different video sequences. These fixations were integrated into heat maps, one for each cluster. The obtained maps were validated over a ground truth consisting of a mask of the polyp, and the comparison between experts and novices was performed by using metrics such as reaction time, dwelling time and energy concentration ratio. Experimental results show a statistically significant difference between experts and novices, and the obtained maps show to be a useful tool for the characterisation of the behaviour of each group.
	Address	USA; March 2014
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-1-4503-2751-0	Medium
	Area		Expedition		Conference	ETRA
	Notes	MV; 600.047; 600.060; SIAI			Approved	no
	Call Number	Admin @ si @ BVS2014			Serial	2448



	Author	Fahad Shahbaz Khan; Joost Van de Weijer; Andrew Bagdanov; Michael Felsberg
	Title	Scale Coding Bag-of-Words for Action Recognition			Type	Conference Article
	Year	2014	Publication	22nd International Conference on Pattern Recognition	Abbreviated Journal
	Volume		Issue		Pages	1514-1519
	Keywords
	Abstract	Recognizing human actions in still images is a challenging problem in computer vision due to significant amount of scale, illumination and pose variation. Given the bounding box of a person both at training and test time, the task is to classify the action associated with each bounding box in an image. Most state-of-the-art methods use the bag-of-words paradigm for action recognition. The bag-of-words framework employing a dense multi-scale grid sampling strategy is the de facto standard for feature detection. This results in a scale invariant image representation where all the features at multiple-scales are binned in a single histogram. We argue that such a scale invariant strategy is sub-optimal since it ignores the multi-scale information available with each bounding box of a person. This paper investigates alternative approaches to scale coding for action recognition in still images. We encode multi-scale information explicitly in three different histograms for small, medium and large scale visual-words. Our first approach exploits multi-scale information with respect to the image size. In our second approach, we encode multi-scale information relative to the size of the bounding box of a person instance. In each approach, the multi-scale histograms are then concatenated into a single representation for action classification. We validate our approaches on the Willow dataset which contains seven action categories: interacting with computer, photography, playing music, riding bike, riding horse, running and walking. Our results clearly suggest that the proposed scale coding approaches outperform the conventional scale invariant technique. Moreover, we show that our approach obtains promising results compared to more complex state-of-the-art methods.
	Address	Stockholm; August 2014
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICPR
	Notes	CIC; LAMP; 601.240; 600.074; 600.079			Approved	no
	Call Number	Admin @ si @ KWB2014			Serial	2450