Publicacions CVC -- Query Results

Records
Author	Fahad Shahbaz Khan
Title	Coloring bag-of-words based image representations			Type	Book Whole
Year	2011	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Put succinctly, the bag-of-words based image representation is the most successful approach for object and scene recognition. Within the bag-of-words framework the optimal fusion of multiple cues, such as shape, texture and color, still remains an active research domain. There exist two main approaches to combine color and shape information within the bag-of-words framework. The first approach, called early fusion, fuses color and shape at the feature level, as a result of which a joint color-shape vocabulary is produced. The second approach, called late fusion, concatenates the histogram representations of color and shape, obtained independently. In the first part of this thesis, we analyze the theoretical implications of both early and late feature fusion. We demonstrate that both these approaches are suboptimal for a subset of object categories. Consequently, we propose a novel method for recognizing object categories when using multiple cues by separately processing the shape and color cues and combining them by modulating the shape features by category-specific color attention. Color is used to compute bottom-up and top-down attention maps. Subsequently, the color attention maps are used to modulate the weights of the shape features. Shape features are given more weight in regions with higher attention and vice versa. The approach is tested on several benchmark object recognition data sets and the results clearly demonstrate the effectiveness of our proposed method. In the second part of the thesis, we investigate the problem of obtaining compact spatial pyramid representations for object and scene recognition. Spatial pyramids have been successfully applied to incorporate spatial information into bag-of-words based image representations. However, a major drawback of spatial pyramids is that they lead to high-dimensional image representations. We present a novel framework for obtaining compact pyramid representations.
The approach reduces the size of a high-dimensional pyramid representation by up to an order of magnitude without any significant reduction in accuracy. Moreover, we also investigate the optimal combination of multiple features such as color and shape within the context of our compact pyramid representation. Finally, we describe a novel technique to build discriminative visual words from multiple cues learned independently from training images. To this end, we use an information-theoretic vocabulary compression technique to find discriminative combinations of visual cues; the resulting visual vocabulary is compact, has the cue binding property, and supports individual weighting of cues in the final image representation. The approach is tested on standard object recognition data sets. The results obtained clearly demonstrate the effectiveness of our approach.
Address
Corporate Author				Thesis	Ph.D. thesis
Publisher		Place of Publication		Editor	Joost Van de Weijer;Maria Vanrell
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	CIC			Approved	no
Call Number	Admin @ si @ Kha2011			Serial	1838



Author	Jürgen Brauer; Wenjuan Gong; Jordi Gonzalez; Michael Arens
Title	On the Effect of Temporal Information on Monocular 3D Human Pose Estimation			Type	Conference Article
Year	2011	Publication	2nd IEEE International Workshop on Analysis and Retrieval of Tracked Events and Motion in Imagery Streams	Abbreviated Journal
Volume		Issue		Pages	906 - 913
Keywords
Abstract	We address the task of estimating 3D human poses from monocular camera sequences. Many works make use of multiple consecutive frames for the estimation of a 3D pose in a frame. Although such an approach should ease the pose estimation task substantially, since multiple consecutive frames allow 2D projection ambiguities to be resolved in principle, it has not yet been investigated systematically how much the 3D pose estimates can be improved by using multiple consecutive frames as opposed to single-frame information. In this paper we analyze the difference in quality of 3D pose estimates based on different numbers of consecutive frames from which 2D pose estimates are available. We validate the use of temporal information on two major, different approaches for human pose estimation: modeling and learning approaches. The results of our experiments show that both learning and modeling approaches benefit from using multiple frames as opposed to single-frame input, but that the benefit is small when the 2D pose estimates are of high quality in terms of precision.
Address	Barcelona
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN	978-1-4673-0062-9	Medium
Area		Expedition		Conference	ARTEMIS
Notes	ISE			Approved	no
Call Number	Admin @ si @BGG 2011			Serial	1860



Author	Carles Sanchez
Title	Tracheal ring detection in bronchoscopy			Type	Report
Year	2011	Publication	CVC Technical Report	Abbreviated Journal
Volume	168	Issue		Pages
Keywords	Bronchoscopy, tracheal ring, segmentation
Abstract	Endoscopy is the process in which a camera is introduced inside a human. Given that endoscopy provides realistic images (in contrast to other modalities) and allows non-invasive, minimal-intervention procedures (which can aid in diagnosis and surgical interventions), its use has spread during the last decades. In this project we focus on bronchoscopic procedures, during which the camera is introduced through the trachea in order to obtain a diagnosis of the patient. The diagnostic interventions are focused on: degree of stenosis (reduction in tracheal area), prosthesis or early diagnosis of tumors. In the first case, assessment of the luminal area and calculation of the diameters of the tracheal rings are required. A main limitation is that the whole process is done by hand, which means that the doctor takes all the measurements and decisions just by looking at the screen. As far as we know, there is no computational framework for helping doctors in the diagnosis. This project consists of analysing bronchoscopic videos in order to extract useful information for diagnosing the degree of stenosis. In particular, we focus on segmentation of the tracheal rings. As a result of this project, several strategies for detecting tracheal rings have been implemented in order to compare their performance.
Address
Corporate Author				Thesis	Master's thesis
Publisher		Place of Publication		Editor	Debora Gil; F. Javier Sanchez
Language	english	Summary Language	english	Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	IAM;MV			Approved	no
Call Number	IAM @ iam @ San2011			Serial	1841



Author	G.D. Evangelidis; Ferran Diego; Joan Serrat; Antonio Lopez
Title	Slice Matching for Accurate Spatio-Temporal Alignment			Type	Conference Article
Year	2011	Publication	In ICCV Workshop on Visual Surveillance	Abbreviated Journal
Volume		Issue		Pages
Keywords	video alignment
Abstract	Video synchronization and alignment is a rather recent topic in computer vision. It usually deals with the problem of aligning sequences recorded simultaneously by static, jointly- or independently-moving cameras. In this paper, we investigate the more difficult problem of matching videos captured at different times from independently-moving cameras, whose trajectories are approximately coincident or parallel. To this end, we propose a novel method that aligns videos pixel-wise and thus allows their differences to be highlighted automatically. This primarily aims at visual surveillance, but the method can be adopted as is by other related video applications, like object transfer (augmented reality) or high dynamic range video. We build upon a slice matching scheme to first synchronize the sequences, and we develop a spatio-temporal alignment scheme to spatially register corresponding frames and refine the temporal mapping. We investigate the performance of the proposed method on videos recorded from vehicles driven along different types of roads and compare it with related previous works.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	VS
Notes	ADAS			Approved	no
Call Number	Admin @ si @ EDS2011; ADAS @ adas @ eds2011a			Serial	1861



Author	Gemma Roig; Xavier Boix; F. de la Torre; Joan Serrat; C. Vilella
Title	Hierarchical CRF with product label spaces for parts-based Models			Type	Conference Article
Year	2011	Publication	IEEE Conference on Automatic Face and Gesture Recognition	Abbreviated Journal
Volume		Issue		Pages	657-664
Keywords	Shape; Computational modeling; Principal component analysis; Random variables; Color; Upper bound; Facial features
Abstract	Non-rigid object detection is a challenging and open research problem in computer vision. It is a critical part of many applications such as image search, surveillance, human-computer interaction or image auto-annotation. Most successful approaches to non-rigid object detection make use of part-based models. In particular, Conditional Random Fields (CRF) have been successfully embedded into a discriminative parts-based model framework due to their effectiveness for learning and inference (usually based on a tree structure). However, CRF-based approaches do not incorporate global constraints and only model pairwise interactions. This is especially important when modeling object classes that may have complex part interactions (e.g. facial features or body articulations), because neglecting them yields an oversimplified model with suboptimal performance. To overcome this limitation, this paper proposes a novel hierarchical CRF (HCRF). The main contribution is to build a hierarchy of part combinations by extending the label set to a hierarchy of product label spaces. In order to keep the inference computation tractable, we propose an effective method to reduce the new label set. We test our method on two applications: facial feature detection on the Multi-PIE database and human pose estimation on the Buffy dataset.
Address	Santa Barbara, CA, USA, 2011
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	FG
Notes	ADAS			Approved	no
Call Number	Admin @ si @ RBT2011			Serial	1862



Author	Fahad Shahbaz Khan; Joost Van de Weijer; Andrew Bagdanov; Maria Vanrell
Title	Portmanteau Vocabularies for Multi-Cue Image Representation			Type	Conference Article
Year	2011	Publication	25th Annual Conference on Neural Information Processing Systems	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	We describe a novel technique for feature combination in the bag-of-words model of image classification. Our approach builds discriminative compound words from primitive cues learned independently from training images. Our main observation is that modeling joint-cue distributions independently is more statistically robust for typical classification problems than attempting to empirically estimate the dependent, joint-cue distribution directly. We use information-theoretic vocabulary compression to find discriminative combinations of cues; the resulting vocabulary of portmanteau words is compact, has the cue binding property, and supports individual weighting of cues in the final image representation. State-of-the-art results on both the Oxford Flower-102 and Caltech-UCSD Bird-200 datasets demonstrate the effectiveness of our technique compared to other, significantly more complex approaches to multi-cue image representation.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	NIPS
Notes	CIC			Approved	no
Call Number	Admin @ si @ KWB2011			Serial	1865



Author	Naila Murray; Sandra Skaff; Luca Marchesotti; Florent Perronnin
Title	Towards Automatic Concept Transfer			Type	Conference Article
Year	2011	Publication	Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Non-Photorealistic Animation and Rendering	Abbreviated Journal
Volume		Issue		Pages	167-176
Keywords	chromatic modeling, color concepts, color transfer, concept transfer
Abstract	This paper introduces a novel approach to automatic concept transfer; examples of concepts are “romantic”, “earthy”, and “luscious”. The approach modifies the color content of an input image given only a concept specified by a user in natural language, thereby requiring minimal user input. This approach is particularly useful for users who are aware of the message they wish to convey in the transferred image while being unsure of the color combination needed to achieve the corresponding transfer. The user may adjust the intensity level of the concept transfer to his/her liking with a single parameter. The proposed approach uses a convex clustering algorithm, with a novel pruning mechanism, to automatically set the complexity of models of chromatic content. It also uses the Earth-Mover's Distance to compute a mapping between the models of the input image and the target chromatic concept. Results show that our approach yields transferred images which effectively represent concepts, as confirmed by a user study.
Address
Corporate Author				Thesis
Publisher	ACM Press	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN	978-1-4503-0907-3	Medium
Area		Expedition		Conference	NPAR
Notes	CIC			Approved	no
Call Number	Admin @ si @ MSM2011			Serial	1866



Author	Jordi Roca; C. Alejandro Parraga; Maria Vanrell
Title	Categorical Focal Colours are Structurally Invariant Under Illuminant Changes			Type	Conference Article
Year	2011	Publication	European Conference on Visual Perception	Abbreviated Journal
Volume		Issue		Pages	196
Keywords
Abstract	The visual system perceives the colour of surfaces as approximately constant under changes of illumination. In this work, we investigate how stable the perception of categorical “focal” colours and their interrelations is under varying illuminants and simple chromatic backgrounds. It has been proposed that the best examples of colour categories across languages cluster in small regions of the colour space and are restricted to a set of 11 basic terms (Kay and Regier, 2003 Proceedings of the National Academy of Sciences of the USA 100 9085–9089). Following this, we developed a psychophysical paradigm that exploits the ability of subjects to reliably reproduce the most representative examples of each category, adjusting multiple test patches embedded in a coloured Mondrian. The experiment was run on a CRT monitor (inside a dark room) under various simulated illuminants. We modelled the recorded data for each subject and adaptation state as a 3D interconnected structure (graph) in Lab space. The graph nodes were the subject’s focal colours at each adaptation state. The model allowed us to get a better distance measure between focal structures under different illuminants. We found that perceptual focal structures tend to be preserved better than the structures of the physical “ideal” colours under illuminant changes.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title	Perception 40	Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ECVP
Notes	CIC			Approved	no
Call Number	Admin @ si @ RPV2011			Serial	1867



Author	Miguel Angel Bautista; Oriol Pujol; Xavier Baro; Sergio Escalera
Title	Introducing the Separability Matrix for Error Correcting Output Codes Coding			Type	Conference Article
Year	2011	Publication	10th International Conference on Multiple Classifier Systems	Abbreviated Journal
Volume	6713	Issue		Pages	227-236
Keywords
Abstract	Error Correcting Output Codes (ECOC) have proven to be a powerful tool for treating multi-class problems. Nevertheless, predefined ECOC designs may not benefit from error-correcting principles for particular multi-class data. In this paper, we introduce the Separability matrix as a tool to study and enhance designs for ECOC coding. In addition, a novel problem-dependent coding design based on the Separability matrix is tested over a wide set of challenging multi-class problems, obtaining very satisfactory results.
Address	Naples, Italy
Corporate Author				Thesis
Publisher	Springer-Verlag Berlin, Heidelberg	Place of Publication		Editor	Carlo Sansone; Josef Kittler; Fabio Roli
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN	0302-9743	ISBN	978-3-642-21556-8	Medium
Area		Expedition		Conference	MCS
Notes	MILAB; OR;HuPBA;MV			Approved	no
Call Number	Admin @ si @ BPB2011b			Serial	1887



Author	Ruth Aylett; Ginevra Castellano; Bogdan Raducanu; Ana Paiva; Marc Hanheide
Title	Long-term socially perceptive and interactive robot companions: challenges and future perspectives			Type	Conference Article
Year	2011	Publication	13th International Conference on Multimodal Interaction	Abbreviated Journal
Volume		Issue		Pages	323-326
Keywords	human-robot interaction, multimodal interaction, social robotics
Abstract	This paper gives a brief overview of the challenges for multi-modal perception and generation applied to robot companions located in human social environments. It reviews the current position in both perception and generation and the immediate technical challenges, and goes on to consider the extra issues raised by embodiment and social context. Finally, it briefly discusses the impact of systems that must function continually over months rather than just for a few hours.
Address	Alicante
Corporate Author				Thesis
Publisher	ACM	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN	978-1-4503-0641-6	Medium
Area		Expedition		Conference	ICMI
Notes	OR;MV			Approved	no
Call Number	Admin @ si @ ACR2011			Serial	1888



Author	Antonio Hernandez; Carlos Primo; Sergio Escalera
Title	Automatic user interaction correction via Multi-label Graph cuts			Type	Conference Article
Year	2011	Publication	In ICCV 2011 1st IEEE International Workshop on Human Interaction in Computer Vision HICV	Abbreviated Journal
Volume		Issue		Pages	1276-1281
Keywords
Abstract	Most applications in image segmentation require user interaction in order to achieve accurate results. However, users want to achieve the desired segmentation accuracy while reducing the effort of manual labelling. In this work, we extend the standard multi-label α-expansion Graph Cut algorithm so that it analyzes the interaction of the user in order to modify the object model and improve the final segmentation of objects. The approach is inspired by the fact that fast user interactions may introduce some pixel errors that confuse object and background. Our results with different degrees of user interaction and input errors show high performance of the proposed approach on a multi-label human limb segmentation problem compared with the classical α-expansion algorithm.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN	978-1-4673-0062-9	Medium
Area		Expedition		Conference	HICV
Notes	MILAB; HuPBA			Approved	no
Call Number	Admin @ si @ HPE2011			Serial	1892



Author	Miguel Reyes; Gabriel Dominguez; Sergio Escalera
Title	Feature Weighting in Dynamic Time Warping for Gesture Recognition in Depth Data			Type	Conference Article
Year	2011	Publication	1st IEEE Workshop on Consumer Depth Cameras for Computer Vision	Abbreviated Journal
Volume		Issue		Pages	1182-1188
Keywords
Abstract	We present a gesture recognition approach for depth video data based on a novel Feature Weighting approach within the Dynamic Time Warping framework. Depth features from human joints are compared through video sequences using Dynamic Time Warping, and weights are assigned to features based on inter-intra class gesture variability. Feature Weighting in Dynamic Time Warping is then applied for recognizing the beginning and end of gestures in data sequences. The results obtained when recognizing several gestures in depth data show high performance compared with the classical Dynamic Time Warping approach.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN	978-1-4673-0062-9	Medium
Area		Expedition		Conference	CDC4CV
Notes	HuPBA;MILAB			Approved	no
Call Number	Admin @ si @ RDE2011			Serial	1893



Author	Michal Drozdzal; Santiago Segui; Petia Radeva; Jordi Vitria; Laura Igual
Title	System and Method for Displaying Motility Events in an in Vivo Image Stream			Type	Patent
Year	2011	Publication	US 61/592,786	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract
Address	Given Imaging
Corporate Author	US Patent Office			Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	MILAB; OR;MV			Approved	no
Call Number	Admin @ si @ DSR2011			Serial	1897



Author	Alejandro Gonzalez Alzate
Title	Evaluation of spatiotemporal descriptors for pedestrian detection in video sequences			Type	Report
Year	2011	Publication	CVC Technical Report	Abbreviated Journal
Volume	166	Issue		Pages
Keywords
Abstract
Address	Bellaterra (Spain)
Corporate Author	Computer Vision Center			Thesis	Master's thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ADAS			Approved	no
Call Number	Admin @ si @ Gon2011			Serial	1932



Author	Yainuvis Socarras
Title	Image segmentation for improving pedestrian detection			Type	Report
Year	2011	Publication	CVC Technical Report	Abbreviated Journal
Volume	167	Issue		Pages
Keywords
Abstract
Address	Bellaterra (Spain)
Corporate Author	Computer Vision Center			Thesis	Master's thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ADAS			Approved	no
Call Number	Admin @ si @ Soc2011			Serial	1933