Records | |||||
---|---|---|---|---|---|
Author | Jose Seabra; Francesco Ciompi; Oriol Pujol; J. Mauri; Petia Radeva; Joao Sanchez | ||||
Title | Rayleigh Mixture Model for Plaque Characterization in Intravascular Ultrasound | Type | Journal Article | ||
Year | 2011 | Publication | IEEE Transactions on Biomedical Engineering | Abbreviated Journal | TBME |
Volume | 58 | Issue | 5 | Pages | 1314-1324 |
Keywords | |||||
Abstract | Vulnerable plaques are the major cause of carotid and coronary vascular problems, such as heart attack or stroke. A correct modeling of plaque echomorphology and composition can help the identification of such lesions. The Rayleigh distribution is widely used to describe (nearly) homogeneous areas in ultrasound images. Since plaques may contain tissues with heterogeneous regions, more complex distributions depending on multiple parameters are usually needed, such as Rice, K or Nakagami distributions. In such cases, the problem formulation becomes more complex, and the optimization procedure to estimate the plaque echomorphology is more difficult. Here, we propose to model the tissue echomorphology by means of a mixture of Rayleigh distributions, known as the Rayleigh mixture model (RMM). The problem formulation is still simple, but its ability to describe complex textural patterns is very powerful. In this paper, we present a method for the automatic estimation of the RMM mixture parameters by means of the expectation maximization algorithm, which aims at characterizing tissue echomorphology in ultrasound (US). The performance of the proposed model is evaluated with a database of in vitro intravascular US cases. We show that the mixture coefficients and Rayleigh parameters explicitly derived from the mixture model are able to accurately describe different plaque types and to significantly improve the characterization performance of an already existing methodology. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB;HuPBA | Approved | no | ||
Call Number | Admin @ si @ SCP2011 | Serial | 1712 | ||
Permanent link to this record | |||||
Author | Maria Salamo; Sergio Escalera | ||||
Title | Increasing Retrieval Quality in Conversational Recommenders | Type | Journal Article | ||
Year | 2011 | Publication | IEEE Transactions on Knowledge and Data Engineering | Abbreviated Journal | TKDE |
Volume | 99 | Issue | Pages | 1-1 | |
Keywords | |||||
Abstract | IF JCR CCIA 2.286 2009 24/103; JCR Impact Factor 2010: 1.851. A major task of research in conversational recommender systems is personalization. Critiquing is a common and powerful form of feedback, where a user can express her feature preferences by applying a series of directional critiques over the recommendations instead of providing specific preference values. Incremental Critiquing is a conversational recommender system that uses critiquing as feedback to efficiently personalize products. The expectation is that in each cycle the system retrieves the products that best satisfy the user’s soft product preferences from a minimal information input. In this paper, we present a novel technique that increases retrieval quality based on a combination of compatibility and similarity scores. Under the hypothesis that a user learns during the recommendation process, we propose two novel exponential reinforcement learning approaches for compatibility that take into account both the instant at which the user makes a critique and the number of satisfied critiques. Moreover, we consider that the impact of features on the similarity differs according to the preferences manifested by the user. We propose a global weighting approach that uses a common weight for nearest cases in order to focus on groups of relevant products. We show that our methodology significantly improves recommendation efficiency in four data sets of different sizes in terms of session length in comparison with state-of-the-art approaches. Moreover, our recommender shows higher robustness against noisy user data when compared to classical approaches. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | IEEE | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1041-4347 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | MILAB; HuPBA | Approved | no | ||
Call Number | Admin @ si @ SaE2011 | Serial | 1713 | ||
Permanent link to this record | |||||
Author | Sergio Escalera; David Masip; Eloi Puertas; Petia Radeva; Oriol Pujol | ||||
Title | Online Error-Correcting Output Codes | Type | Journal Article | ||
Year | 2011 | Publication | Pattern Recognition Letters | Abbreviated Journal | PRL |
Volume | 32 | Issue | 3 | Pages | 458-467 |
Keywords | |||||
Abstract | IF JCR CCIA 1.303 2009 54/103. This article proposes a general extension of the error-correcting output codes framework to the online learning scenario. As a result, the final classifier handles the addition of new classes independently of the base classifier used. In particular, this extension supports the use of both online example incremental and batch classifiers as base learners. The extension of the traditional problem-independent codings one-versus-all and one-versus-one is introduced. Furthermore, two new codings are proposed: unbalanced online ECOC and a problem-dependent online ECOC. This last online coding technique takes advantage of the problem data to minimize the number of dichotomizers used in the ECOC framework while preserving high accuracy. These techniques are validated on an online setting of 11 data sets from the UCI database and applied to two real machine vision applications: traffic sign recognition and face recognition. As a result, the online ECOC techniques proposed provide a feasible and robust way of handling new classes using any base classifier. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Elsevier | Place of Publication | North Holland | Editor | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0167-8655 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | MILAB;OR;HuPBA;MV | Approved | no | ||
Call Number | Admin @ si @ EMP2011 | Serial | 1714 | ||
Permanent link to this record | |||||
Author | Ariel Amato; Mikhail Mozerov; Andrew Bagdanov; Jordi Gonzalez | ||||
Title | Accurate Moving Cast Shadow Suppression Based on Local Color Constancy detection | Type | Journal Article | ||
Year | 2011 | Publication | IEEE Transactions on Image Processing | Abbreviated Journal | TIP |
Volume | 20 | Issue | 10 | Pages | 2954-2966 |
Keywords | |||||
Abstract | This paper describes a novel framework for detection and suppression of properly shadowed regions for most possible scenarios occurring in real video sequences. Our approach requires no prior knowledge about the scene, nor is it restricted to specific scene structures. Furthermore, the technique can detect both achromatic and chromatic shadows even in the presence of camouflage that occurs when foreground regions are very similar in color to shadowed regions. The method exploits local color constancy properties due to reflectance suppression over shadowed regions. To detect shadowed regions in a scene, the values of the background image are divided by values of the current frame in the RGB color space. We show how this luminance ratio can be used to identify segments with low gradient constancy, which in turn distinguish shadows from foreground. Experimental results on a collection of publicly available datasets illustrate the superior performance of our method compared with the most sophisticated, state-of-the-art shadow detection algorithms. These results show that our approach is robust and accurate over a broad range of shadow types and challenging video conditions. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1057-7149 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | ISE | Approved | no | ||
Call Number | Admin @ si @ AMB2011 | Serial | 1716 | ||
Permanent link to this record | |||||
Author | Arjan Gijsenij; Theo Gevers; Joost Van de Weijer | ||||
Title | Computational Color Constancy: Survey and Experiments | Type | Journal Article | ||
Year | 2011 | Publication | IEEE Transactions on Image Processing | Abbreviated Journal | TIP |
Volume | 20 | Issue | 9 | Pages | 2475-2489 |
Keywords | computational color constancy;computer vision application;gamut-based method;learning-based method;static method;colour vision;computer vision;image colour analysis;learning (artificial intelligence);lighting | ||||
Abstract | Computational color constancy is a fundamental prerequisite for many computer vision applications. This paper presents a survey of many recent developments and state-of-the-art methods. Several criteria are proposed that are used to assess the approaches. A taxonomy of existing algorithms is proposed and methods are separated into three groups: static methods, gamut-based methods and learning-based methods. Further, the experimental setup is discussed including an overview of publicly available data sets. Finally, various freely available methods, of which some are considered to be state-of-the-art, are evaluated on two data sets. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1057-7149 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | ISE;CIC | Approved | no | ||
Call Number | Admin @ si @ GGW2011 | Serial | 1717 | ||
Permanent link to this record | |||||
Author | Xavier Boix; Josep M. Gonfaus; Joost Van de Weijer; Andrew Bagdanov; Joan Serrat; Jordi Gonzalez | ||||
Title | Harmony Potentials: Fusing Global and Local Scale for Semantic Image Segmentation | Type | Journal Article | ||
Year | 2012 | Publication | International Journal of Computer Vision | Abbreviated Journal | IJCV |
Volume | 96 | Issue | 1 | Pages | 83-102 |
Keywords | |||||
Abstract | The Hierarchical Conditional Random Field (HCRF) model has been successfully applied to a number of image labeling problems, including image segmentation. However, existing HCRF models of image segmentation do not allow multiple classes to be assigned to a single region, which limits their ability to incorporate contextual information across multiple scales. At higher scales in the image, this representation yields an oversimplified model since multiple classes can be reasonably expected to appear within large regions. This simplified model particularly limits the impact of information at higher scales. Since class-label information at these scales is usually more reliable than at lower, noisier scales, neglecting this information is undesirable. To address these issues, we propose a new consistency potential for image labeling problems, which we call the harmony potential. It can encode any possible combination of labels, penalizing only unlikely combinations of classes. We also propose an effective sampling strategy over this expanded label set that renders tractable the underlying optimization problem. Our approach obtains state-of-the-art results on two challenging, standard benchmark datasets for semantic image segmentation: PASCAL VOC 2010 and MSRC-21. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0920-5691 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | ISE;CIC;ADAS | Approved | no | ||
Call Number | Admin @ si @ BGW2012 | Serial | 1718 | ||
Permanent link to this record | |||||
Author | Olivier Penacchio; C. Alejandro Parraga | ||||
Title | What is the best criterion for an efficient design of retinal photoreceptor mosaics? | Type | Journal Article | ||
Year | 2011 | Publication | Perception | Abbreviated Journal | PER |
Volume | 40 | Issue | Pages | 197 | |
Keywords | |||||
Abstract | The proportions of L, M and S photoreceptors in the primate retina are arguably determined by evolutionary pressure and the statistics of the visual environment. Two information theory-based approaches have been recently proposed for explaining the asymmetrical spatial densities of photoreceptors in humans. In the first approach (Garrigan et al, 2010 PLoS ONE 6 e1000677), a model for computing the information transmitted by cone arrays which considers the differential blurring produced by the long-wavelength accommodation of the eye’s lens is proposed. Their results explain the sparsity of S-cones but the optimum depends weakly on the L:M cone ratio. In the second approach (Penacchio et al, 2010 Perception 39 ECVP Supplement, 101), we show that human cone arrays make the visual representation scale-invariant, allowing the total entropy of the signal to be preserved while decreasing individual neurons’ entropy in further retinotopic representations. This criterion provides a thorough description of the distribution of L:M cone ratios and does not depend on differential blurring of the signal by the lens. Here, we investigate the similarities and differences of both approaches when applied to the same database. Our results support a 2-criteria optimization in the space of cone ratios whose components are arguably important and mostly unrelated. [This work was partially funded by projects TIN2010-21771-C02-1 and Consolider-Ingenio 2010-CSD2007-00018 from the Spanish MICINN. CAP was funded by grant RYC-2007-00484] | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | CIC | Approved | no | ||
Call Number | Admin @ si @ PeP2011a | Serial | 1719 | ||
Permanent link to this record | |||||
Author | C. Alejandro Parraga; Olivier Penacchio; Maria Vanrell | ||||
Title | Retinal Filtering Matches Natural Image Statistics at Low Luminance Levels | Type | Journal Article | ||
Year | 2011 | Publication | Perception | Abbreviated Journal | PER |
Volume | 40 | Issue | Pages | 96 | |
Keywords | |||||
Abstract | The assumption that the retina’s main objective is to provide a minimum entropy representation to higher visual areas (ie efficient coding principle) allows one to predict retinal filtering in space–time and colour (Atick, 1992 Network 3 213–251). This is achieved by considering the power spectrum of natural images (which is proportional to 1/f^2) and the suppression of retinal and image noise. However, most studies consider images within a limited range of lighting conditions (eg near noon) whereas the visual system’s spatial filtering depends on light intensity and the spatiochromatic properties of natural scenes depend on the time of the day. Here, we explore whether the dependence of visual spatial filtering on luminance matches the changes in power spectrum of natural scenes at different times of the day. Using human cone-activation based naturalistic stimuli (from the Barcelona Calibrated Images Database), we show that for a range of luminance levels, the shape of the retinal CSF reflects the slope of the power spectrum at low spatial frequencies. Accordingly, the retina implements the filtering which best decorrelates the input signal at every luminance level. This result is in line with the body of work that places efficient coding as a guiding neural principle. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | CIC | Approved | no | ||
Call Number | Admin @ si @ PPV2011 | Serial | 1720 | ||
Permanent link to this record | |||||
Author | Olivier Penacchio | ||||
Title | Mixed Hodge Structures and Equivariant Sheaves on the Projective Plane | Type | Journal Article | ||
Year | 2011 | Publication | Mathematische Nachrichten | Abbreviated Journal | MN |
Volume | 284 | Issue | 4 | Pages | 526-542 |
Keywords | Mixed Hodge structures, equivariant sheaves, MSC (2010) Primary: 14C30, Secondary: 14F05, 14M25 | ||||
Abstract | We describe an equivalence of categories between the category of mixed Hodge structures and a category of equivariant vector bundles on a toric model of the complex projective plane which verify some semistability condition. We then apply this correspondence to define an invariant which generalizes the notion of R-split mixed Hodge structure and give calculations for the first cohomology group of possibly non-smooth or non-complete curves of genus 0 and 1. Finally, we describe some extension groups of mixed Hodge structures in terms of equivariant extensions of coherent sheaves. © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | WILEY-VCH Verlag | Place of Publication | Editor | R. Mennicken | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1522-2616 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | CIC | Approved | no | ||
Call Number | Admin @ si @ Pen2011 | Serial | 1721 | ||
Permanent link to this record | |||||
Author | Carles Fernandez; Pau Baiget; Xavier Roca; Jordi Gonzalez | ||||
Title | Determining the Best Suited Semantic Events for Cognitive Surveillance | Type | Journal Article | ||
Year | 2011 | Publication | Expert Systems with Applications | Abbreviated Journal | EXSY |
Volume | 38 | Issue | 4 | Pages | 4068–4079 |
Keywords | Cognitive surveillance; Event modeling; Content-based video retrieval; Ontologies; Advanced user interfaces | ||||
Abstract | State-of-the-art systems on cognitive surveillance identify and describe complex events in selected domains, thus providing end-users with tools to easily access the contents of massive video footage. Nevertheless, as the complexity of events increases in semantics and the types of indoor/outdoor scenarios diversify, it becomes difficult to assess which events better describe the scene, and how to model them at a pixel level to fulfill natural language requests. We present an ontology-based methodology that guides the identification, step-by-step modeling, and generalization of the most relevant events to a specific domain. Our approach considers three steps: (1) end-users provide textual evidence from surveilled video sequences; (2) transcriptions are analyzed top-down to build the knowledge bases for event description; and (3) the obtained models are used to generalize event detection to different image sequences from the surveillance domain. This framework produces user-oriented knowledge that improves on existing advanced interfaces for video indexing and retrieval, by determining the best suited events for video understanding according to end-users. We have conducted experiments with outdoor and indoor scenes showing thefts, chases, and vandalism, demonstrating the feasibility and generalization of this proposal. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Elsevier | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ISE | Approved | no | ||
Call Number | Admin @ si @ FBR2011a | Serial | 1722 | ||
Permanent link to this record | |||||
Author | Carles Fernandez; Pau Baiget; Xavier Roca; Jordi Gonzalez | ||||
Title | Augmenting Video Surveillance Footage with Virtual Agents for Incremental Event Evaluation | Type | Journal Article | ||
Year | 2011 | Publication | Pattern Recognition Letters | Abbreviated Journal | PRL |
Volume | 32 | Issue | 6 | Pages | 878–889 |
Keywords | |||||
Abstract | The fields of segmentation, tracking and behavior analysis demand challenging video resources to test, in a scalable manner, complex scenarios like crowded environments or scenes with high semantics. Nevertheless, existing public databases cannot scale the presence of appearing agents, which would be useful to study long-term occlusions and crowds. Moreover, creating these resources is expensive and often too particularized to specific needs. We propose an augmented reality framework to increase the complexity of image sequences in terms of occlusions and crowds, in a scalable and controllable manner. Existing datasets can be increased with augmented sequences containing virtual agents. Such sequences are automatically annotated, thus facilitating evaluation in terms of segmentation, tracking, and behavior recognition. In order to easily specify the desired contents, we propose a natural language interface to convert input sentences into virtual agent behaviors. Experimental tests and validation in indoor, street, and soccer environments are provided to show the feasibility of the proposed approach in terms of robustness, scalability, and semantics. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Elsevier | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ISE | Approved | no | ||
Call Number | Admin @ si @ FBR2011b | Serial | 1723 | ||
Permanent link to this record | |||||
Author | Arjan Gijsenij; Theo Gevers | ||||
Title | Color Constancy Using Natural Image Statistics and Scene Semantics | Type | Journal Article | ||
Year | 2011 | Publication | IEEE Transactions on Pattern Analysis and Machine Intelligence | Abbreviated Journal | TPAMI |
Volume | 33 | Issue | 4 | Pages | 687-698 |
Keywords | |||||
Abstract | Existing color constancy methods are all based on specific assumptions such as the spatial and spectral characteristics of images. As a consequence, no algorithm can be considered as universal. However, with the large variety of available methods, the question is how to select the method that performs best for a specific image. To achieve selection and combining of color constancy algorithms, in this paper natural image statistics are used to identify the most important characteristics of color images. Then, based on these image characteristics, the proper color constancy algorithm (or best combination of algorithms) is selected for a specific image. To capture the image characteristics, the Weibull parameterization (e.g., grain size and contrast) is used. It is shown that the Weibull parameterization is related to the image attributes to which the used color constancy methods are sensitive. An MoG-classifier is used to learn the correlation and weighting between the Weibull-parameters and the image attributes (number of edges, amount of texture, and SNR). The output of the classifier is the selection of the best performing color constancy method for a certain image. Experimental results show a large improvement over state-of-the-art single algorithms. On a data set consisting of more than 11,000 images, an increase in color constancy performance up to 20 percent (median angular error) can be obtained compared to the best-performing single algorithm. Further, it is shown that for certain scene categories, one specific color constancy algorithm can be used instead of the classifier considering several algorithms. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0162-8828 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | ISE | Approved | no | ||
Call Number | Admin @ si @ GiG2011 | Serial | 1724 | ||
Permanent link to this record | |||||
Author | Albert Ali Salah; Theo Gevers; Nicu Sebe; Alessandro Vinciarelli | ||||
Title | Computer Vision for Ambient Intelligence | Type | Journal Article | ||
Year | 2011 | Publication | Journal of Ambient Intelligence and Smart Environments | Abbreviated Journal | JAISE |
Volume | 3 | Issue | 3 | Pages | 187-191 |
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ISE | Approved | no | ||
Call Number | Admin @ si @ SGS2011a | Serial | 1725 | ||
Permanent link to this record | |||||
Author | Arnau Ramisa; Alex Goldhoorn; David Aldavert; Ricardo Toledo; Ramon Lopez de Mantaras | ||||
Title | Combining Invariant Features and the ALV Homing Method for Autonomous Robot Navigation Based on Panoramas | Type | Journal Article | ||
Year | 2011 | Publication | Journal of Intelligent and Robotic Systems | Abbreviated Journal | JIRC |
Volume | 64 | Issue | 3-4 | Pages | 625-649 |
Keywords | |||||
Abstract | Biologically inspired homing methods, such as the Average Landmark Vector, are an interesting solution for local navigation due to their simplicity. However, they usually require a modification of the environment by placing artificial landmarks in order to work reliably. In this paper we combine the Average Landmark Vector with invariant feature points automatically detected in panoramic images to overcome this limitation. The proposed approach has been evaluated first in simulation and, as promising results were found, also on two data sets of panoramas from real world environments. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer Netherlands | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0921-0296 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | RV;ADAS | Approved | no | ||
Call Number | Admin @ si @ RGA2011 | Serial | 1728 | ||
Permanent link to this record | |||||
Author | Koen E.A. van de Sande; Theo Gevers; Cees G.M. Snoek | ||||
Title | Empowering Visual Categorization with the GPU | Type | Journal Article | ||
Year | 2011 | Publication | IEEE Transactions on Multimedia | Abbreviated Journal | TMM |
Volume | 13 | Issue | 1 | Pages | 60-70 |
Keywords | |||||
Abstract | Visual categorization is important to manage large collections of digital images and video, where textual meta-data is often incomplete or simply unavailable. The bag-of-words model has become the most powerful method for visual categorization of images and video. Despite its high accuracy, a severe drawback of this model is its high computational cost. As the trend to increase computational power in newer CPU and GPU architectures is to increase their level of parallelism, exploiting this parallelism becomes an important direction to handle the computational cost of the bag-of-words approach. When optimizing a system based on the bag-of-words approach, the goal is to minimize the time it takes to process batches of images. Additionally, we also consider power usage as an evaluation metric. In this paper, we analyze the bag-of-words model for visual categorization in terms of computational cost and identify two major bottlenecks: the quantization step and the classification step. We address these two bottlenecks by proposing two efficient algorithms for quantization and classification by exploiting the GPU hardware and the CUDA parallel programming model. The algorithms are designed to (1) keep categorization accuracy intact, (2) decompose the problem and (3) give the same numerical results. In the experiments on large scale datasets it is shown that, by using a parallel implementation on the Geforce GTX260 GPU, classifying unseen images is 4.8 times faster than a quad-core CPU version on the Core i7 920, while giving the exact same numerical results. In addition, we show how the algorithms can be generalized to other applications, such as text retrieval and video retrieval. Moreover, when the obtained speedup is used to process extra video frames in a video retrieval benchmark, the accuracy of visual categorization is improved by 29%. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ISE | Approved | no | ||
Call Number | Admin @ si @ SGS2011b | Serial | 1729 | ||
Permanent link to this record |