Publicacions CVC -- Query Results

	1741–1755 of 3413 records found matching your query:


	Records
	Author	Antonio Hernandez
	Title	From pixels to gestures: learning visual representations for human analysis in color and depth data sequences			Type	Book Whole
	Year	2015	Publication	PhD Thesis, Universitat de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	The visual analysis of humans from images is an important topic of interest due to its relevance to many computer vision applications like pedestrian detection, monitoring and surveillance, human-computer interaction, e-health or content-based image retrieval, among others. In this dissertation we are interested in learning different visual representations of the human body that are helpful for the visual analysis of humans in images and video sequences. To that end, we analyze both RGB and depth image modalities and address the problem from three different research lines, at different levels of abstraction; from pixels to gestures: human segmentation, human pose estimation and gesture recognition. First, we show how binary segmentation (object vs. background) of the human body in image sequences is helpful to remove all the background clutter present in the scene. The presented method, based on Graph cuts optimization, enforces spatio-temporal consistency of the produced segmentation masks among consecutive frames. Secondly, we present a framework for multi-label segmentation for obtaining much more detailed segmentation masks: instead of just obtaining a binary representation separating the human body from the background, finer segmentation masks can be obtained separating the different body parts. At a higher level of abstraction, we aim for a simpler yet descriptive representation of the human body. Human pose estimation methods usually rely on skeletal models of the human body, formed by segments (or rectangles) that represent the body limbs, appropriately connected following the kinematic constraints of the human body. In practice, such skeletal models must fulfill some constraints in order to allow for efficient inference, while actually limiting the expressiveness of the model. 
In order to cope with this, we introduce a top-down approach for predicting the position of the body parts in the model, using a mid-level part representation based on Poselets. Finally, we propose a framework for gesture recognition based on the bag of visual words framework. We leverage the benefits of RGB and depth image modalities by combining modality-specific visual vocabularies in a late fusion fashion. A new rotation-variant depth descriptor is presented, yielding better results than other state-of-the-art descriptors. Moreover, spatio-temporal pyramids are used to encode rough spatial and temporal structure. In addition, we present a probabilistic reformulation of Dynamic Time Warping for gesture segmentation in video sequences. A Gaussian-based probabilistic model of a gesture is learnt, implicitly encoding possible deformations in both spatial and time domains.
	Address	January 2015
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Sergio Escalera;Stan Sclaroff
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-84-940902-0-2	Medium
	Area		Expedition		Conference
	Notes	HuPBA;MILAB			Approved	no
	Call Number	Admin @ si @ Her2015			Serial	2576



	Author	Hongxing Gao
	Title	Focused Structural Document Image Retrieval in Digital Mailroom Applications			Type	Book Whole
	Year	2015	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	In this work, we develop a generic framework that is able to handle the document retrieval problem in various scenarios such as searching for full page matches or retrieving the counterparts for specific document areas, focusing on their structural similarity or letting their visual resemblance play a dominant role. Based on the spatial indexing technique, we propose to search the collection for matches of local key-region pairs carrying both structural and visual information, and we present a scheme that adjusts the relative contribution of structural and visual similarity. Based on the fact that the structure of documents is tightly linked with the distance among their elements, we first introduce an efficient detector named Distance Transform based Maximally Stable Extremal Regions (DTMSER). We illustrate that this detector is able to efficiently extract the structure of a document image as a dendrogram (hierarchical tree) of multi-scale key-regions that roughly correspond to letters, words and paragraphs. We demonstrate that, without benefiting from the structure information, the key-regions extracted by the DTMSER algorithm achieve better results than state-of-the-art methods while employing far fewer key-regions. We subsequently propose a pair-wise Bag of Words (BoW) framework to efficiently embed the explicit structure extracted by the DTMSER algorithm. We represent each document as a list of key-region pairs that correspond to the edges in the dendrogram, where the inclusion relationship is encoded. By employing those structural key-region pairs as the pooling elements for generating the histogram of features, the proposed method is able to encode the explicit inclusion relations into a BoW representation. The experimental results illustrate that the pair-wise BoW, powered by the embedded structural information, achieves remarkable improvement over the conventional BoW and spatial pyramidal BoW methods. 
To handle various retrieval scenarios in one framework, we propose to directly query a series of key-region pairs, carrying both structure and visual information, from the collection. We introduce spatial indexing techniques to the document retrieval community to speed up the structural relationship computation for key-region pairs. We first test the proposed framework in a full page retrieval scenario where structurally similar matches are expected. In this case, the pair-wise querying method achieves notable improvement over the BoW and spatial pyramidal BoW frameworks. Furthermore, we illustrate that the proposed method is also able to handle focused retrieval situations where the queries are defined as specific partial areas of interest in the images. We examine our method on two types of focused queries: structure-focused and exact queries. The experimental results show that the proposed generic framework obtains nearly perfect precision on both types of focused queries while it is the first framework able to tackle structure-focused queries, setting a new state of the art in the field. Besides, we introduce a line verification method to check the spatial consistency among the matched key-region pairs. We propose a computationally efficient version of line verification through a two-step implementation. We first compute tentative localizations of the query and subsequently employ them to divide the matched key-region pairs into several groups; line verification is then performed within each group while more precise bounding boxes are computed. We demonstrate that, compared with the standard approach (based on RANSAC), the proposed line verification generally achieves much higher recall with a slight loss of precision on specific queries.
	Address	January 2015
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Josep Llados;Dimosthenis Karatzas;Marçal Rusiñol
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-84-943427-0-7	Medium
	Area		Expedition		Conference
	Notes	DAG; 600.077			Approved	no
	Call Number	Admin @ si @ Gao2015			Serial	2577



	Author	Antonio Esteban Lansaque
	Title	3D reconstruction and recognition using structured light			Type	Report
	Year	2014	Publication	CVC Technical Report	Abbreviated Journal
	Volume	179	Issue		Pages
	Keywords
	Abstract	This work covers the problem of 3D reconstruction, recognition and 6DOF pose estimation. The goal of this project is to reconstruct a 3D scene and to align an object model of the industrial pieces onto the reconstructed scene. The reconstruction algorithm is based on stereo techniques and the recognition algorithm is based on SHOT descriptors computed on a set of uniform keypoints. Correspondences are used to estimate a first 6DOF transformation that maps the model onto the scene and then the ICP algorithm is used to refine the transformation. In order to check the effectiveness of the proposed algorithm, several experiments were performed. These experiments were conducted in a lab environment in order to get results under the same conditions in all of them. Although the results are not obtained in real time, the proposed algorithm achieves high object recognition rates.
	Address	UAB; September 2014
	Corporate Author				Thesis	Master's thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	IAM; 600.075			Approved	no
	Call Number	Admin @ si @ Est2014			Serial	2578



	Author	Ricard Balague
	Title	Exploring the combination of color cues for intrinsic image decomposition			Type	Report
	Year	2014	Publication	CVC Technical Report	Abbreviated Journal
	Volume	178	Issue		Pages
	Keywords
	Abstract	Intrinsic image decomposition is a challenging problem that consists in separating an image into its physical characteristics: reflectance and shading. This problem can be solved in different ways, but most methods have combined information from several visual cues. In this work we describe an extension of an existing method proposed by Serra et al. which considers two color descriptors and combines them by means of a Markov Random Field. We analyze in depth the weak points of the method and we explore more possibilities to use in both descriptors. The proposed extension depends on the combination of the cues considered to overcome some of the limitations of the original method. Our approach is tested on the MIT dataset and Beigpour et al. dataset, which contain images of real objects acquired under controlled conditions and synthetic images respectively, with their corresponding ground truth.
	Address	UAB; September 2014
	Corporate Author				Thesis	Master's thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	CIC; 600.074			Approved	no
	Call Number	Admin @ si @ Bal2014			Serial	2579



	Author	Sebastian Ramos
	Title	Vision-based Detection of Road Hazards for Autonomous Driving			Type	Report
	Year	2014	Publication	CVC Technical Report	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract
	Address	UAB; September 2014
	Corporate Author				Thesis	Master's thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	ADAS; 600.076			Approved	no
	Call Number	Admin @ si @ Ram2014			Serial	2580



	Author	Fahad Shahbaz Khan; Muhammad Anwer Rao; Joost Van de Weijer; Michael Felsberg; J.Laaksonen
	Title	Compact color texture description for texture classification			Type	Journal Article
	Year	2015	Publication	Pattern Recognition Letters	Abbreviated Journal	PRL
	Volume	51	Issue		Pages	16-22
	Keywords
	Abstract	Describing textures is a challenging problem in computer vision and pattern recognition. The classification problem involves assigning a category label to the texture class it belongs to. Several factors such as variations in scale, illumination and viewpoint make the problem of texture description extremely challenging. A variety of histogram based texture representations exist in the literature. However, combining multiple texture descriptors and assessing their complementarity is still an open research problem. In this paper, we first show that combining multiple local texture descriptors significantly improves the recognition performance compared to using a single best method alone. This gain in performance is achieved at the cost of a high-dimensional final image representation. To counter this problem, we propose to use an information-theoretic compression technique to obtain a compact texture description without any significant loss in accuracy. In addition, we perform a comprehensive evaluation of pure color descriptors, popular in object recognition, for the problem of texture classification. Experiments are performed on four challenging texture datasets namely, KTH-TIPS-2a, KTH-TIPS-2b, FMD and Texture-10. The experiments clearly demonstrate that our proposed compact multi-texture approach outperforms the single best texture method alone. In all cases, discriminative color names outperform other color features for texture classification. Finally, we show that combining discriminative color names with compact texture representation outperforms state-of-the-art methods by 7.8%, 4.3% and 5.0% on KTH-TIPS-2a, KTH-TIPS-2b and Texture-10 datasets respectively.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	LAMP; 600.068; 600.079;ADAS			Approved	no
	Call Number	Admin @ si @ KRW2015a			Serial	2587



	Author	Sergio Escalera; Jordi Gonzalez; Xavier Baro; Pablo Pardo; Junior Fabian; Marc Oliu; Hugo Jair Escalante; Ivan Huerta; Isabelle Guyon
	Title	ChaLearn Looking at People 2015 new competitions: Age Estimation and Cultural Event Recognition			Type	Conference Article
	Year	2015	Publication	IEEE International Joint Conference on Neural Networks IJCNN2015	Abbreviated Journal
	Volume		Issue		Pages	1-8
	Keywords
	Abstract	Following previous series on Looking at People (LAP) challenges [1], [2], [3], in 2015 ChaLearn runs two new competitions within the field of Looking at People: age and cultural event recognition in still images. We propose the first crowdsourcing application to collect and label data about apparent age of people instead of the real age. In terms of cultural event recognition, tens of categories have to be recognized. This involves scene understanding and human analysis. This paper summarizes both challenges and data, providing some initial baselines. The results of the first round of the competition were presented at ChaLearn LAP 2015 IJCNN special session on computer vision and robotics http://www.dtic.ua.es/~jgarcia/IJCNN2015. Details of the ChaLearn LAP competitions can be found at http://gesture.chalearn.org/.
	Address	Killarney; Ireland; July 2015
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	IJCNN
	Notes	HuPBA; ISE; 600.063; 600.078;MV			Approved	no
	Call Number	Admin @ si @ EGB2015			Serial	2591



	Author	Frederic Sampedro; Anna Domenech; Sergio Escalera; Ignasi Carrio
	Title	Deriving global quantitative tumor response parameters from 18F-FDG PET-CT scans in patients with non-Hodgkins lymphoma			Type	Journal Article
	Year	2015	Publication	Nuclear Medicine Communications	Abbreviated Journal	NMC
	Volume	36	Issue	4	Pages	328-333
	Keywords
	Abstract	OBJECTIVES: The aim of the study was to address the need for quantifying the global cancer time evolution magnitude from a pair of time-consecutive positron emission tomography-computed tomography (PET-CT) scans. In particular, we focus on the computation of indicators using image-processing techniques that seek to model non-Hodgkin's lymphoma (NHL) progression or response severity. MATERIALS AND METHODS: A total of 89 pairs of time-consecutive PET-CT scans from NHL patients were stored in a nuclear medicine station for subsequent analysis. These were classified by a consensus of nuclear medicine physicians into progressions, partial responses, mixed responses, complete responses, and relapses. The cases of each group were ordered by magnitude following visual analysis. Thereafter, a set of quantitative indicators designed to model the cancer evolution magnitude within each group were computed using semiautomatic and automatic image-processing techniques. Performance evaluation of the proposed indicators was measured by a correlation analysis with the expert-based visual analysis. RESULTS: The set of proposed indicators achieved Pearson's correlation results in each group with respect to the expert-based visual analysis: 80.2% in progressions, 77.1% in partial response, 68.3% in mixed response, 88.5% in complete response, and 100% in relapse. In the progression and mixed response groups, the proposed indicators outperformed the common indicators used in clinical practice [changes in metabolic tumor volume, mean, maximum, peak standardized uptake value (SUV mean, SUV max, SUV peak), and total lesion glycolysis] by more than 40%. 
CONCLUSION: Computing global indicators of NHL response using PET-CT imaging techniques offers a strong correlation with the associated expert-based visual analysis, motivating the future incorporation of such quantitative and highly observer-independent indicators in oncological decision making or treatment response evaluation scenarios.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	HuPBA;MILAB			Approved	no
	Call Number	Admin @ si @ SDE2015			Serial	2605



	Author	Wenjuan Gong; W.Zhang; Jordi Gonzalez; Y.Ren; Z.Li
	Title	Enhanced Asymmetric Bilinear Model for Face Recognition			Type	Journal Article
	Year	2015	Publication	International Journal of Distributed Sensor Networks	Abbreviated Journal	IJDSN
	Volume		Issue		Pages	Article ID 218514
	Keywords
	Abstract	Bilinear models have been successfully applied to separate two factors, for example, pose variances and different identities in face recognition problems. The asymmetric model is a type of bilinear model which models a system in the most concise way. However, few works have explored the application of the asymmetric bilinear model to face recognition under illumination changes. In this work, we propose an enhanced asymmetric model for illumination-robust face recognition. Instead of initializing the factor probabilities randomly, we initialize them with the nearest neighbor method and optimize them for the test data. In addition, we update the factor model to be identified. We validate the proposed method on a designed data sample and the extended Yale B dataset. The experimental results show that the enhanced asymmetric models give promising results and good recognition accuracies.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	ISE; 600.063; 600.078			Approved	no
	Call Number	Admin @ si @ GZG2015			Serial	2592



	Author	Adriana Romero; Petia Radeva; Carlo Gatta
	Title	Meta-parameter free unsupervised sparse feature learning			Type	Journal Article
	Year	2015	Publication	IEEE Transactions on Pattern Analysis and Machine Intelligence	Abbreviated Journal	TPAMI
	Volume	37	Issue	8	Pages	1716-1722
	Keywords
	Abstract	We propose a meta-parameter free, off-the-shelf, simple and fast unsupervised feature learning algorithm, which exploits a new way of optimizing for sparsity. Experiments on CIFAR-10, STL-10 and UCMerced show that the method achieves state-of-the-art performance, providing discriminative features that generalize well.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	MILAB; 600.068; 600.079; 601.160			Approved	no
	Call Number	Admin @ si @ RRG2014b			Serial	2594



	Author	Manuel Graña; Bogdan Raducanu
	Title	Special Issue on Bioinspired and knowledge based techniques and applications			Type	Journal Article
	Year	2015	Publication	Neurocomputing	Abbreviated Journal	NEUCOM
	Volume		Issue		Pages	1-3
	Keywords
	Abstract
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	LAMP;			Approved	no
	Call Number	Admin @ si @ GrR2015			Serial	2598



	Author	Bogdan Raducanu; Alireza Bosaghzadeh; Fadi Dornaika
	Title	Facial Expression Recognition based on Multi-view Observations with Application to Social Robotics			Type	Conference Article
	Year	2014	Publication	1st Workshop on Computer Vision for Affective Computing	Abbreviated Journal
	Volume		Issue		Pages	1-8
	Keywords
	Abstract	Human-robot interaction is a hot topic nowadays in the social robotics community. One crucial aspect is affective communication, which is encoded through facial expressions. In this paper, we propose a novel approach for facial expression recognition, which exploits an efficient and adaptive graph-based label propagation (semi-supervised mode) in a multi-observation framework. The facial features are extracted using an appearance-based 3D face tracker that is view- and texture-independent. Our method has been extensively tested on the CMU dataset, and has been compared with other methods for graph construction. With the proposed approach, we developed an application for an AIBO robot, in which it mirrors the recognized facial expression.
	Address	Singapore; November 2014
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ACCV
	Notes	LAMP;			Approved	no
	Call Number	Admin @ si @ RBD2014			Serial	2599



	Author	C. Alejandro Parraga
	Title	Perceptual Psychophysics			Type	Book Chapter
	Year	2015	Publication	Biologically-Inspired Computer Vision: Fundamentals and Applications	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor	G.Cristobal; M.Keil; L.Perrinet
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-3-527-41264-8	Medium
	Area		Expedition		Conference
	Notes	CIC; 600.074			Approved	no
	Call Number	Admin @ si @ Par2015			Serial	2600



	Author	Firat Ismailoglu; Ida G. Sprinkhuizen-Kuyper; Evgueni Smirnov; Sergio Escalera; Ralf Peeters
	Title	Fractional Programming Weighted Decoding for Error-Correcting Output Codes			Type	Conference Article
	Year	2015	Publication	Multiple Classifier Systems, Proceedings of 12th International Workshop, MCS 2015	Abbreviated Journal
	Volume		Issue		Pages	38-50
	Keywords
	Abstract	In order to increase the classification performance obtained using Error-Correcting Output Codes (ECOC) designs, introducing weights in the decoding phase of the ECOC has attracted a lot of interest. In this work, we present a method for ECOC designs that focuses on increasing the hypothesis margin on the data samples given a base classifier. While achieving this, we implicitly reward the base classifiers with high performance and penalize those with low performance. The resulting objective function is of the fractional programming type and we deal with this problem through Dinkelbach’s algorithm. The conducted tests over well-known UCI datasets show that the presented method is superior to unweighted decoding and that it outperforms the results of the state-of-the-art weighted decoding methods in most of the performed experiments.
	Address	Gunzburg; Germany; June 2015
	Corporate Author				Thesis
	Publisher	Springer International Publishing	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-3-319-20247-1	Medium
	Area		Expedition		Conference	MCS
	Notes	HuPBA;MILAB			Approved	no
	Call Number	Admin @ si @ ISS2015			Serial	2601



	Author	Hugo Jair Escalante; Jose Martinez; Sergio Escalera; Victor Ponce; Xavier Baro
	Title	Improving Bag of Visual Words Representations with Genetic Programming			Type	Conference Article
	Year	2015	Publication	IEEE International Joint Conference on Neural Networks IJCNN2015	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	The bag of visual words is a well established representation in diverse computer vision problems. Taking inspiration from the fields of text mining and retrieval, this representation has proved to be very effective in a large number of domains. In most cases, a standard term-frequency weighting scheme is considered for representing images and videos in computer vision. This is somewhat surprising, as there are many alternative ways of generating bag of words representations within the text processing community. This paper explores the use of alternative weighting schemes for landmark tasks in computer vision: image categorization and gesture recognition. We study the suitability of using well-known supervised and unsupervised weighting schemes for such tasks. More importantly, we devise a genetic program that learns new ways of representing images and videos under the bag of visual words representation. The proposed method learns to combine term-weighting primitives trying to maximize the classification performance. Experimental results are reported in standard image and video data sets showing the effectiveness of the proposed evolutionary algorithm.
	Address	Killarney; Ireland; July 2015
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	IJCNN
	Notes	HuPBA;MV			Approved	no
	Call Number	Admin @ si @ EME2015			Serial	2603