Records | |||||
---|---|---|---|---|---|
Author | Fahad Shahbaz Khan | ||||
Title | Coloring bag-of-words based image representations | Type | Book Whole | ||
Year | 2011 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Put succinctly, the bag-of-words based image representation is the most successful approach for object and scene recognition. Within the bag-of-words framework, the optimal fusion of multiple cues, such as shape, texture and color, still remains an active research domain. There exist two main approaches to combine color and shape information within the bag-of-words framework. The first approach, called early fusion, fuses color and shape at the feature level, as a result of which a joint color-shape vocabulary is produced. The second approach, called late fusion, concatenates the histogram representations of color and shape, obtained independently. In the first part of this thesis, we analyze the theoretical implications of both early and late feature fusion. We demonstrate that both these approaches are suboptimal for a subset of object categories. Consequently, we propose a novel method for recognizing object categories when using multiple cues by separately processing the shape and color cues and combining them by modulating the shape features by category-specific color attention. Color is used to compute bottom-up and top-down attention maps. Subsequently, the color attention maps are used to modulate the weights of the shape features. Shape features are given more weight in regions with higher attention and vice versa. The approach is tested on several benchmark object recognition data sets and the results clearly demonstrate the effectiveness of our proposed method. In the second part of the thesis, we investigate the problem of obtaining compact spatial pyramid representations for object and scene recognition. Spatial pyramids have been successfully applied to incorporate spatial information into bag-of-words based image representation. However, a major drawback of spatial pyramids is that they lead to high dimensional image representations. We present a novel framework for obtaining compact pyramid representation. The approach reduces the size of a high dimensional pyramid representation by up to an order of magnitude without any significant reduction in accuracy. Moreover, we also investigate the optimal combination of multiple features such as color and shape within the context of our compact pyramid representation. Finally, we describe a novel technique to build discriminative visual words from multiple cues learned independently from training images. To this end, we use an information-theoretic vocabulary compression technique to find discriminative combinations of visual cues; the resulting visual vocabulary is compact, has the cue binding property, and supports individual weighting of cues in the final image representation. The approach is tested on standard object recognition data sets. The results obtained clearly demonstrate the effectiveness of our approach. | ||||
Address | |||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Place of Publication | Editor | Joost Van de Weijer;Maria Vanrell | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | CIC | Approved | no | ||
Call Number | Admin @ si @ Kha2011 | Serial | 1838 | ||
Author | Jürgen Brauer; Wenjuan Gong; Jordi Gonzalez; Michael Arens | ||||
Title | On the Effect of Temporal Information on Monocular 3D Human Pose Estimation | Type | Conference Article | ||
Year | 2011 | Publication | 2nd IEEE International Workshop on Analysis and Retrieval of Tracked Events and Motion in Imagery Streams | Abbreviated Journal | |
Volume | Issue | Pages | 906-913 | ||
Keywords | |||||
Abstract | We address the task of estimating 3D human poses from monocular camera sequences. Many works make use of multiple consecutive frames for the estimation of a 3D pose in a frame. Although such an approach should ease the pose estimation task substantially, since multiple consecutive frames in principle allow 2D projection ambiguities to be resolved, it has not yet been investigated systematically how much the 3D pose estimates improve when using multiple consecutive frames as opposed to single-frame information. In this paper we analyze the difference in quality of 3D pose estimates based on different numbers of consecutive frames from which 2D pose estimates are available. We validate the use of temporal information on two major, different approaches for human pose estimation: modeling and learning approaches. The results of our experiments show that both learning and modeling approaches benefit from using multiple frames as opposed to single-frame input, but that the benefit is small when the 2D pose estimates are of high quality in terms of precision. | ||||
Address | Barcelona | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-1-4673-0062-9 | Medium | ||
Area | Expedition | Conference | ARTEMIS | ||
Notes | ISE | Approved | no | ||
Call Number | Admin @ si @BGG 2011 | Serial | 1860 | ||
Author | Carles Sanchez | ||||
Title | Tracheal ring detection in bronchoscopy | Type | Report | ||
Year | 2011 | Publication | CVC Technical Report | Abbreviated Journal | |
Volume | 168 | Issue | Pages | ||
Keywords | Bronchoscopy, tracheal ring, segmentation | ||||
Abstract | Endoscopy is the process in which a camera is introduced inside a human. Given that endoscopy provides realistic images (in contrast to other modalities) and allows non-invasive, minimal-intervention procedures (which can aid in diagnosis and surgical interventions), its use has spread during the last decades. In this project we will focus on bronchoscopic procedures, during which the camera is introduced through the trachea in order to diagnose the patient. The diagnostic interventions are focused on: degree of stenosis (reduction in tracheal area), prosthesis or early diagnosis of tumors. In the first case, assessment of the luminal area and the calculation of the diameters of the tracheal rings are required. A main limitation is that the whole process is done by hand, which means that the doctor takes all the measurements and decisions just by looking at the screen. As far as we know, there is no computational framework for helping doctors in the diagnosis. This project will consist of analysing bronchoscopic videos in order to extract useful information for the diagnosis of the degree of stenosis. In particular, we will focus on segmentation of the tracheal rings. As a result of this project, several strategies for detecting tracheal rings have been implemented in order to compare their performance. | ||||
Address | |||||
Corporate Author | Thesis | Master's thesis | |||
Publisher | Place of Publication | Editor | Debora Gil; F. Javier Sanchez | ||
Language | English | Summary Language | English | Original Title | |
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | IAM;MV | Approved | no | ||
Call Number | IAM @ iam @ San2011 | Serial | 1841 | ||
Author | G.D. Evangelidis; Ferran Diego; Joan Serrat; Antonio Lopez | ||||
Title | Slice Matching for Accurate Spatio-Temporal Alignment | Type | Conference Article | ||
Year | 2011 | Publication | In ICCV Workshop on Visual Surveillance | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | video alignment | ||||
Abstract | Video synchronization and alignment is a rather recent topic in computer vision. It usually deals with the problem of aligning sequences recorded simultaneously by static, jointly- or independently-moving cameras. In this paper, we investigate the more difficult problem of matching videos captured at different times from independently-moving cameras, whose trajectories are approximately coincident or parallel. To this end, we propose a novel method that aligns videos pixel-wise and thus allows their differences to be highlighted automatically. This primarily aims at visual surveillance, but the method can be adopted as is by other related video applications, like object transfer (augmented reality) or high dynamic range video. We build upon a slice matching scheme to first synchronize the sequences, and we develop a spatio-temporal alignment scheme to spatially register corresponding frames and refine the temporal mapping. We investigate the performance of the proposed method on videos recorded from vehicles driven along different types of roads and compare it with related previous works. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | VS | ||
Notes | ADAS | Approved | no | ||
Call Number | Admin @ si @ EDS2011; ADAS @ adas @ eds2011a | Serial | 1861 | ||
Author | Gemma Roig; Xavier Boix; F. de la Torre; Joan Serrat; C. Vilella | ||||
Title | Hierarchical CRF with product label spaces for parts-based Models | Type | Conference Article | ||
Year | 2011 | Publication | IEEE Conference on Automatic Face and Gesture Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 657-664 | ||
Keywords | Shape; Computational modeling; Principal component analysis; Random variables; Color; Upper bound; Facial features | ||||
Abstract | Non-rigid object detection is a challenging and open research problem in computer vision. It is a critical part of many applications such as image search, surveillance, human-computer interaction or image auto-annotation. Most successful approaches to non-rigid object detection make use of part-based models. In particular, Conditional Random Fields (CRF) have been successfully embedded into a discriminative parts-based model framework due to their effectiveness for learning and inference (usually based on a tree structure). However, CRF-based approaches do not incorporate global constraints and only model pairwise interactions. This is especially important when modeling object classes that may have complex part interactions (e.g. facial features or body articulations), because neglecting them yields an oversimplified model with suboptimal performance. To overcome this limitation, this paper proposes a novel hierarchical CRF (HCRF). The main contribution is to build a hierarchy of part combinations by extending the label set to a hierarchy of product label spaces. In order to keep the inference computation tractable, we propose an effective method to reduce the new label set. We test our method on two applications: facial feature detection on the Multi-PIE database and human pose estimation on the Buffy dataset. | ||||
Address | Santa Barbara, CA, USA, 2011 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | FG | ||
Notes | ADAS | Approved | no | ||
Call Number | Admin @ si @ RBT2011 | Serial | 1862 | ||
Author | Fahad Shahbaz Khan; Joost Van de Weijer; Andrew Bagdanov; Maria Vanrell | ||||
Title | Portmanteau Vocabularies for Multi-Cue Image Representation | Type | Conference Article | ||
Year | 2011 | Publication | 25th Annual Conference on Neural Information Processing Systems | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | We describe a novel technique for feature combination in the bag-of-words model of image classification. Our approach builds discriminative compound words from primitive cues learned independently from training images. Our main observation is that modeling joint-cue distributions independently is more statistically robust for typical classification problems than attempting to empirically estimate the dependent, joint-cue distribution directly. We use information-theoretic vocabulary compression to find discriminative combinations of cues; the resulting vocabulary of portmanteau words is compact, has the cue binding property, and supports individual weighting of cues in the final image representation. State-of-the-art results on both the Oxford Flower-102 and Caltech-UCSD Bird-200 datasets demonstrate the effectiveness of our technique compared to other, significantly more complex approaches to multi-cue image representation. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | NIPS | ||
Notes | CIC | Approved | no | ||
Call Number | Admin @ si @ KWB2011 | Serial | 1865 | ||
Author | Naila Murray; Sandra Skaff; Luca Marchesotti; Florent Perronnin | ||||
Title | Towards Automatic Concept Transfer | Type | Conference Article | ||
Year | 2011 | Publication | Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Non-Photorealistic Animation and Rendering | Abbreviated Journal | |
Volume | Issue | Pages | 167-176 | ||
Keywords | chromatic modeling, color concepts, color transfer, concept transfer | ||||
Abstract | This paper introduces a novel approach to automatic concept transfer; examples of concepts are “romantic”, “earthy”, and “luscious”. The approach modifies the color content of an input image given only a concept specified by a user in natural language, thereby requiring minimal user input. This approach is particularly useful for users who are aware of the message they wish to convey in the transferred image while being unsure of the color combination needed to achieve the corresponding transfer. The user may adjust the intensity level of the concept transfer to his/her liking with a single parameter. The proposed approach uses a convex clustering algorithm, with a novel pruning mechanism, to automatically set the complexity of models of chromatic content. It also uses the Earth-Mover's Distance to compute a mapping between the models of the input image and the target chromatic concept. Results show that our approach yields transferred images which effectively represent concepts, as confirmed by a user study. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | ACM Press | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-1-4503-0907-3 | Medium | ||
Area | Expedition | Conference | NPAR | ||
Notes | CIC | Approved | no | ||
Call Number | Admin @ si @ MSM2011 | Serial | 1866 | ||
Author | Jordi Roca; C. Alejandro Parraga; Maria Vanrell | ||||
Title | Categorical Focal Colours are Structurally Invariant Under Illuminant Changes | Type | Conference Article | ||
Year | 2011 | Publication | European Conference on Visual Perception | Abbreviated Journal | |
Volume | Issue | Pages | 196 | ||
Keywords | |||||
Abstract | The visual system perceives the colour of surfaces as approximately constant under changes of illumination. In this work, we investigate how stable the perception of categorical “focal” colours and their interrelations is under varying illuminants and simple chromatic backgrounds. It has been proposed that best examples of colour categories across languages cluster in small regions of the colour space and are restricted to a set of 11 basic terms (Kay and Regier, 2003 Proceedings of the National Academy of Sciences of the USA 100 9085–9089). Following this, we developed a psychophysical paradigm that exploits the ability of subjects to reliably reproduce the most representative examples of each category, adjusting multiple test patches embedded in a coloured Mondrian. The experiment was run on a CRT monitor (inside a dark room) under various simulated illuminants. We modelled the recorded data for each subject and adapted state as a 3D interconnected structure (graph) in Lab space. The graph nodes were the subject’s focal colours at each adaptation state. The model allowed us to get a better distance measure between focal structures under different illuminants. We found that perceptual focal structures tend to be preserved better than the structures of the physical “ideal” colours under illuminant changes. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Perception 40 | Abbreviated Series Title | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ECVP | ||
Notes | CIC | Approved | no | ||
Call Number | Admin @ si @ RPV2011 | Serial | 1867 | ||
Author | Miguel Angel Bautista; Oriol Pujol; Xavier Baro; Sergio Escalera | ||||
Title | Introducing the Separability Matrix for Error Correcting Output Codes Coding | Type | Conference Article | ||
Year | 2011 | Publication | 10th International Conference on Multiple Classifier Systems | Abbreviated Journal | |
Volume | 6713 | Issue | Pages | 227-236 | |
Keywords | |||||
Abstract | Error Correcting Output Codes (ECOC) have proven to be a powerful tool for treating multi-class problems. Nevertheless, predefined ECOC designs may not benefit from error-correcting principles for particular multi-class data. In this paper, we introduce the Separability matrix as a tool to study and enhance designs for ECOC coding. In addition, a novel problem-dependent coding design based on the Separability matrix is tested over a wide set of challenging multi-class problems, obtaining very satisfactory results. | ||||
Address | Naples, Italy | ||||
Corporate Author | Thesis | ||||
Publisher | Springer-Verlag Berlin, Heidelberg | Place of Publication | Editor | Carlo Sansone; Josef Kittler; Fabio Roli | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | 0302-9743 | ISBN | 978-3-642-21556-8 | Medium | |
Area | Expedition | Conference | MCS | ||
Notes | MILAB; OR;HuPBA;MV | Approved | no | ||
Call Number | Admin @ si @ BPB2011b | Serial | 1887 | ||
Author | Ruth Aylett; Ginevra Castellano; Bogdan Raducanu; Ana Paiva; Marc Hanheide | ||||
Title | Long-term socially perceptive and interactive robot companions: challenges and future perspectives | Type | Conference Article | ||
Year | 2011 | Publication | 13th International Conference on Multimodal Interaction | Abbreviated Journal | |
Volume | Issue | Pages | 323-326 | ||
Keywords | human-robot interaction, multimodal interaction, social robotics | ||||
Abstract | This paper gives a brief overview of the challenges for multimodal perception and generation applied to robot companions located in human social environments. It reviews the current position in both perception and generation and the immediate technical challenges and goes on to consider the extra issues raised by embodiment and social context. Finally, it briefly discusses the impact of systems that must function continually over months rather than just for a few hours. | ||||
Address | Alicante | ||||
Corporate Author | Thesis | ||||
Publisher | ACM | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-1-4503-0641-6 | Medium | ||
Area | Expedition | Conference | ICMI | ||
Notes | OR;MV | Approved | no | ||
Call Number | Admin @ si @ ACR2011 | Serial | 1888 | ||
Author | Antonio Hernandez; Carlos Primo; Sergio Escalera | ||||
Title | Automatic user interaction correction via Multi-label Graph cuts | Type | Conference Article | ||
Year | 2011 | Publication | In ICCV 2011 1st IEEE International Workshop on Human Interaction in Computer Vision HICV | Abbreviated Journal | |
Volume | Issue | Pages | 1276-1281 | ||
Keywords | |||||
Abstract | Most applications in image segmentation require user interaction in order to achieve accurate results. However, users want to achieve the desired segmentation accuracy while reducing the effort of manual labelling. In this work, we extend the standard multi-label α-expansion Graph Cut algorithm so that it analyzes the interaction of the user in order to modify the object model and improve the final segmentation of objects. The approach is inspired by the fact that fast user interactions may introduce some pixel errors that confuse object and background. Our results with different degrees of user interaction and input errors show high performance of the proposed approach on a multi-label human limb segmentation problem compared with the classical α-expansion algorithm. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-1-4673-0062-9 | Medium | ||
Area | Expedition | Conference | HICV | ||
Notes | MILAB; HuPBA | Approved | no | ||
Call Number | Admin @ si @ HPE2011 | Serial | 1892 | ||
Author | Miguel Reyes; Gabriel Dominguez; Sergio Escalera | ||||
Title | Feature Weighting in Dynamic Time Warping for Gesture Recognition in Depth Data | Type | Conference Article | ||
Year | 2011 | Publication | 1st IEEE Workshop on Consumer Depth Cameras for Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | 1182-1188 | ||
Keywords | |||||
Abstract | We present a gesture recognition approach for depth video data based on a novel Feature Weighting approach within the Dynamic Time Warping framework. Depth features from human joints are compared through video sequences using Dynamic Time Warping, and weights are assigned to features based on inter-intra class gesture variability. Feature Weighting in Dynamic Time Warping is then applied to recognize the beginning and end of gestures in data sequences. The results obtained when recognizing several gestures in depth data show high performance compared with the classical Dynamic Time Warping approach. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-1-4673-0062-9 | Medium | ||
Area | Expedition | Conference | CDC4CV | ||
Notes | HuPBA;MILAB | Approved | no | ||
Call Number | Admin @ si @ RDE2011 | Serial | 1893 | ||
Author | Michal Drozdzal; Santiago Segui; Petia Radeva; Jordi Vitria; Laura Igual | ||||
Title | System and Method for Displaying Motility Events in an in Vivo Image Stream | Type | Patent | ||
Year | 2011 | Publication | US 61/592,786 | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | |||||
Address | Given Imaging | ||||
Corporate Author | US Patent Office | Thesis | |||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB; OR;MV | Approved | no | ||
Call Number | Admin @ si @ DSR2011 | Serial | 1897 | ||
Author | Alejandro Gonzalez Alzate | ||||
Title | Evaluation of spatiotemporal descriptors for pedestrian detection in video sequences | Type | Report | ||
Year | 2011 | Publication | CVC Technical Report | Abbreviated Journal | |
Volume | 166 | Issue | Pages | ||
Keywords | |||||
Abstract | |||||
Address | Bellaterra (Spain) | ||||
Corporate Author | Computer Vision Center | Thesis | Master's thesis | ||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ADAS | Approved | no | ||
Call Number | Admin @ si @ Gon2011 | Serial | 1932 | ||
Author | Yainuvis Socarras | ||||
Title | Image segmentation for improving pedestrian detection | Type | Report | ||
Year | 2011 | Publication | CVC Technical Report | Abbreviated Journal | |
Volume | 167 | Issue | Pages | ||
Keywords | |||||
Abstract | |||||
Address | Bellaterra (Spain) | ||||
Corporate Author | Computer Vision Center | Thesis | Master's thesis | ||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ADAS | Approved | no | ||
Call Number | Admin @ si @ Soc2011 | Serial | 1933 | ||