Author Susana Alvarez; Maria Vanrell
Title Texton theory revisited: a bag-of-words approach to combine textons Type Journal Article
Year 2012 Publication Pattern Recognition Abbreviated Journal PR  
Volume 45 Issue 12 Pages 4312-4325  
Keywords  
Abstract The aim of this paper is to revisit an old theory of texture perception and update its computational implementation by extending it to colour. With this in mind we try to capture the optimality of perceptual systems. The proposed approach achieves this by sharing well-known early stages of the visual processes and extracting low-dimensional features that perfectly encode adequate properties for a large variety of textures without needing further learning stages. We propose several descriptors in a bag-of-words framework that are derived from different quantisation models on the feature spaces. Our perceptual features are directly given by the shape and colour attributes of image blobs, which are the textons. In this way we avoid learning visual words and directly build the vocabularies on these low-dimensional texton spaces. The main differences between the proposed descriptors lie in how the co-occurrence of blob attributes is represented in the vocabularies. Our approach outperforms the current state of the art in colour texture description, as demonstrated in several experiments on large texture datasets.
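The vocabulary-histogram step the abstract describes — assigning each low-dimensional texton (blob-attribute) vector to its nearest vocabulary word and counting — can be sketched as below. This is an illustrative sketch only, not the authors' code; `bow_histogram` and the toy arrays are hypothetical.

```python
import numpy as np

def bow_histogram(features, vocabulary):
    """Assign each feature vector to its nearest vocabulary word
    (Euclidean distance) and return the normalized word-count histogram."""
    # Pairwise distances: shape (n_features, n_words).
    d = np.linalg.norm(features[:, None, :] - vocabulary[None, :, :], axis=2)
    words = d.argmin(axis=1)  # nearest word per feature
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return hist / hist.sum()

# Four 2-D "texton" features quantised against a 2-word vocabulary.
features = np.array([[0.1, 0.0], [0.9, 1.0], [1.1, 0.9], [0.0, 0.1]])
vocabulary = np.array([[0.0, 0.0], [1.0, 1.0]])
print(bow_histogram(features, vocabulary))  # → [0.5 0.5]
```

Because the texton spaces are low-dimensional, the vocabulary can be laid out directly on the feature axes rather than learned, which is the point the abstract makes.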
 
Address  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN 0031-3203 ISBN Medium  
Area Expedition Conference  
Notes CIC Approved no  
Call Number Admin @ si @ AlV2012a Serial 2130  
 

 
Author Javier Vazquez; Robert Benavente; Maria Vanrell
Title Naming constraints constancy Type Conference Article
Year 2012 Publication 2nd Joint AVA / BMVA Meeting on Biological and Machine Vision Abbreviated Journal  
Volume Issue Pages  
Keywords  
Abstract Different studies have shown that languages from industrialized cultures share a set of 11 basic colour terms: red, green, blue, yellow, pink, purple, brown, orange, black, white, and grey (Berlin & Kay, 1969, Basic Color Terms, University of California Press; Kay & Regier, 2003, PNAS, 100, 9085-9089). Some of these studies have also reported the best representatives, or focal values, of each colour (Boynton & Olson, 1990, Vision Res., 30, 1311-1317; Sturges & Whitfield, 1995, CRA, 20:6, 364-376). Further studies have provided fuzzy datasets for colour naming by asking human observers to rate colours in terms of membership values (Benavente et al., 2006, CRA, 31:1, 48-56). Recently, a computational model based on these human ratings has been developed (Benavente et al., 2008, JOSA A, 25:10, 2582-2593). This computational model follows a fuzzy approach to assign a colour name to a particular RGB value. For example, a pixel with a value of (255, 0, 0) will be named 'red' with membership 1, while a cyan pixel with an RGB value of (0, 200, 200) will be considered to be 0.5 green and 0.5 blue. In this work, we show how this colour naming paradigm can be applied to different computer vision tasks. In particular, we report results in colour constancy (Vazquez-Corral et al., 2012, IEEE TIP, in press) showing that the classical constraints on either illumination or surface reflectance can be substituted by the statistical properties encoded in the colour names. [Supported by projects TIN2010-21771-C02-1, CSD2007-00018.]
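The fuzzy naming step described above can be illustrated with a toy membership function: inverse distance to a handful of focal colours, normalized so memberships sum to 1. The actual model of Benavente et al. (2008) uses fitted fuzzy membership functions, so this sketch only reproduces the qualitative behaviour of the example in the abstract; all names below are hypothetical.

```python
import numpy as np

# Toy focal colours; the real model covers all 11 basic colour terms.
FOCALS = {
    "red":   np.array([255.0, 0.0, 0.0]),
    "green": np.array([0.0, 255.0, 0.0]),
    "blue":  np.array([0.0, 0.0, 255.0]),
}

def colour_memberships(rgb, eps=1e-9):
    """Toy fuzzy colour naming: membership of an RGB value in each colour
    category, proportional to inverse distance to the focal colour."""
    rgb = np.asarray(rgb, dtype=float)
    inv = {name: 1.0 / (np.linalg.norm(rgb - focal) + eps)
           for name, focal in FOCALS.items()}
    total = sum(inv.values())
    return {name: v / total for name, v in inv.items()}
```

An exact focal colour gets membership close to 1, while the cyan pixel of the abstract splits its membership almost evenly between green and blue.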
 
Address  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN ISBN Medium  
Area Expedition Conference AVA  
Notes CIC Approved no  
Call Number Admin @ si @ VBV2012 Serial 2131  
 

 
Author Xavier Otazu; Olivier Penacchio; Laura Dempere-Marco
Title An investigation into plausible neural mechanisms related to the CIWaM computational model for brightness induction Type Conference Article
Year 2012 Publication 2nd Joint AVA / BMVA Meeting on Biological and Machine Vision Abbreviated Journal  
Volume Issue Pages  
Keywords  
Abstract Brightness induction is the modulation of the perceived intensity of an area by the luminance of surrounding areas. From a purely computational perspective, we built a low-level computational model (CIWaM) of early sensory processing based on multi-resolution wavelets with the aim of replicating brightness and colour induction effects (Otazu et al., 2010, Journal of Vision, 10(12):5). Furthermore, we successfully used the CIWaM architecture to define a computational saliency model (Murray et al., 2011, CVPR, 433-440; Vanrell et al., submitted to AVA/BMVA'12). From a biological perspective, neurophysiological evidence suggests that perceived brightness information may be explicitly represented in V1. In this work we investigate possible neural mechanisms that offer a plausible explanation for such effects. To this end, we consider the model by Z. Li (Li, 1999, Network: Comput. Neural Syst., 10, 187-212), which is based on biological data and focuses on the part of V1 responsible for contextual influences, namely, layer 2-3 pyramidal cells, interneurons, and horizontal intracortical connections. This model has proven to account for phenomena such as visual saliency, which share with brightness induction the relevant effect of contextual influences (the ones modelled by CIWaM). In the proposed model, the input to the network is derived from a complete multiscale and multiorientation wavelet decomposition taken from the computational model (CIWaM). This model successfully accounts for well-known psychophysical effects (among them: the White's and modified White's effects, the Todorovic, Chevreul, achromatic ring patterns, and grating induction effects) for static contexts, and also for brightness induction in dynamic contexts defined by modulating the luminance of surrounding areas. From a methodological point of view, we conclude that the results obtained by the computational model (CIWaM) are compatible with those obtained by the neurodynamical model proposed here.
 
Address  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN ISBN Medium  
Area Expedition Conference AVA  
Notes CIC Approved no  
Call Number Admin @ si @ OPD2012a Serial 2132  
 

 
Author David Geronimo; Joan Serrat; Antonio Lopez; Ramon Baldrich
Title Traffic sign recognition for computer vision project-based learning Type Journal Article
Year 2013 Publication IEEE Transactions on Education Abbreviated Journal T-EDUC  
Volume 56 Issue 3 Pages 364-371  
Keywords traffic signs  
Abstract This paper presents a graduate course project on computer vision. The aim of the project is to detect and recognize traffic signs in video sequences recorded by an on-board vehicle camera. This is a demanding problem, given that traffic sign recognition is one of the most challenging problems for driving assistance systems. Equally, it is motivating for the students given that it is a real-life problem. Furthermore, it gives them the opportunity to appreciate the difficulty of real-world vision problems and to assess the extent to which this problem can be solved by modern computer vision and pattern classification techniques taught in the classroom. The learning objectives of the course are introduced, as are the constraints imposed on its design, such as the diversity of students' background and the amount of time they and their instructors dedicate to the course. The paper also describes the course contents, schedule, and how the project-based learning approach is applied. The outcomes of the course are discussed, including both the students' marks and their personal feedback.  
Address  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN 0018-9359 ISBN Medium  
Area Expedition Conference  
Notes ADAS; CIC Approved no  
Call Number Admin @ si @ GSL2013; ADAS @ adas @ Serial 2160  
 

 
Author Jaime Moreno; Xavier Otazu
Title Image compression algorithm based on Hilbert scanning of embedded quadTrees: an introduction of the Hi-SET coder Type Conference Article
Year 2011 Publication IEEE International Conference on Multimedia and Expo Abbreviated Journal  
Volume Issue Pages 1-6  
Keywords  
Abstract In this work we present an effective and computationally simple algorithm for image compression based on Hilbert Scanning of Embedded quadTrees (Hi-SET). It represents an image as an embedded bitstream along a fractal (Hilbert) scanning function. Embedding is an important feature of modern image compression algorithms; as Salomon notes [1, p. 614], a further, and perhaps unique, feature is achieving the best quality for the number of bits input by the decoder at any point during decoding. Hi-SET also possesses this feature. Furthermore, the coder is based on a quadtree partition strategy which, applied to image transforms such as the discrete cosine or wavelet transform, yields energy clustering in both frequency and space. The coding algorithm consists of three general steps and uses just a list of significant pixels. The proposed coder is implemented for both grey-scale and colour image compression. Hi-SET compressed images are, on average, 6.20 dB better than those obtained by other compression techniques based on Hilbert scanning. Moreover, Hi-SET improves image quality by 1.39 dB and 1.00 dB for grey-scale and colour compression, respectively, when compared with the JPEG2000 coder.  
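The locality-preserving pixel ordering behind Hi-SET can be sketched with the standard Hilbert-curve index-to-coordinate conversion below. This is a generic routine (the name `hilbert_d2xy` is hypothetical), not the Hi-SET coder itself; it only shows the fractal scanning order along which the embedded bitstream is built.

```python
def hilbert_d2xy(n, d):
    """Map index d along the Hilbert curve of an n x n grid (n a power
    of two) to (x, y) pixel coordinates. Consecutive indices always map
    to 4-neighbouring pixels, which is what preserves spatial locality."""
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:  # rotate/flip the quadrant as the curve recurses
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

# Scanning order of a 4x4 image: each step moves to an adjacent pixel.
order = [hilbert_d2xy(4, d) for d in range(16)]
```

Scanning wavelet or DCT coefficients in this order keeps spatially close coefficients close in the bitstream, which favours the energy clustering the abstract mentions.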
Address  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN 1945-7871 ISBN 978-1-61284-348-3 Medium  
Area Expedition Conference ICME  
Notes CIC Approved no  
Call Number Admin @ si @ MoO2011a Serial 2176  
 

 
Author Jaime Moreno; Xavier Otazu
Title Image coder based on Hilbert scanning of embedded quadTrees Type Conference Article
Year 2011 Publication Data Compression Conference Abbreviated Journal  
Volume Issue Pages 470-470  
Keywords  
Abstract In this work we present an effective and computationally simple algorithm for image compression based on Hilbert Scanning of Embedded quadTrees (Hi-SET). It represents an image as an embedded bitstream along a fractal (Hilbert) scanning function. Embedding is an important feature of modern image compression algorithms; as Salomon notes [1, p. 614], a further, and perhaps unique, feature is achieving the best quality for the number of bits input by the decoder at any point during decoding. Hi-SET also possesses this feature. Furthermore, the coder is based on a quadtree partition strategy which, applied to image transforms such as the discrete cosine or wavelet transform, yields energy clustering in both frequency and space. The coding algorithm consists of three general steps and uses just a list of significant pixels.  
Address  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN ISBN Medium  
Area Expedition Conference DCC  
Notes CIC Approved no  
Call Number Admin @ si @ MoO2011b Serial 2177  
 

 
Author Xavier Otazu; Olivier Penacchio; Laura Dempere-Marco
Title Brightness induction by contextual influences in V1: a neurodynamical account Type Abstract
Year 2012 Publication Journal of Vision Abbreviated Journal VSS  
Volume 12 Issue 9 Pages  
Keywords  
Abstract Brightness induction is the modulation of the perceived intensity of an area by the luminance of surrounding areas and reveals fundamental properties of neural organization in the visual system. Several phenomenological models have been proposed that successfully account for psychophysical data (Pessoa et al., 1995; Blakeslee and McCourt, 2004; Barkan et al., 2008; Otazu et al., 2008). Neurophysiological evidence suggests that brightness information is explicitly represented in V1, and neuronal response modulations have been observed following luminance changes outside their receptive fields (Rossi and Paradiso, 1999). In this work we investigate possible neural mechanisms that offer a plausible explanation for such effects. To this end, we consider the model by Z. Li (1999), which is based on biological data and focuses on the part of V1 responsible for contextual influences, namely, layer 2-3 pyramidal cells, interneurons, and horizontal intracortical connections. This model has proven to account for phenomena such as contour detection and preattentive segmentation, which share with brightness induction the relevant effect of contextual influences. In our model, the input to the network is derived from a complete multiscale and multiorientation wavelet decomposition which makes it possible to recover an image reflecting the perceived intensity. The proposed model successfully accounts for well-known psychophysical effects (among them: the White's and modified White's effects, the Todorović, Chevreul, achromatic ring patterns, and grating induction effects). Our work suggests that intra-cortical interactions in the primary visual cortex could partially explain perceptual brightness induction effects and reveals how a common general architecture may account for several different fundamental processes emerging early in the visual pathway.
 
Address  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN ISBN Medium  
Area Expedition Conference  
Notes CIC Approved no  
Call Number Admin @ si @ OPD2012b Serial 2178  
 

 
Author Xavier Otazu
Title Perceptual tone-mapping operator based on multiresolution contrast decomposition Type Abstract
Year 2012 Publication Perception Abbreviated Journal PER  
Volume 41 Issue Pages 86  
Keywords  
Abstract Tone-mapping operators (TMOs) are used to display high dynamic range (HDR) images on low dynamic range (LDR) displays. Many computational and biologically inspired approaches have been used in the literature, several of them based on multiresolution decompositions. In this work, a simple two-stage model for tone mapping is presented. The first stage is a novel multiresolution contrast decomposition, inspired by a pyramidal contrast decomposition (Peli, 1990, Journal of the Optical Society of America A, 7(10), 2032-2040). This novel multiresolution decomposition represents the Michelson contrast of the image at different spatial scales. This multiresolution contrast representation, applied to the intensity channel of an opponent colour decomposition, is processed by a non-linear saturating model of V1 neurons (Albrecht et al., 2002, Journal of Neurophysiology, 88(2), 888-913). This saturation model depends on the visual frequency, and it has been modified to include information from the extended Contrast Sensitivity Function (e-CSF) (Otazu et al., 2010, Journal of Vision, 10(12), 5). A set of HDR images in Radiance RGBE format (from the CIS HDR Photographic Survey and Greg Ward's database) has been used to test the model, obtaining a set of LDR images. The resulting LDR images do not show the usual halo or colour modification artifacts.
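The quantity at the core of the first stage is the Michelson contrast, C = (Lmax − Lmin) / (Lmax + Lmin). A minimal multiscale sketch is given below, using plain 2×2 block averaging as a stand-in for the wavelet-style machinery of the actual decomposition; function names are illustrative, not the paper's.

```python
import numpy as np

def michelson_contrast(patch):
    """Michelson contrast C = (Lmax - Lmin) / (Lmax + Lmin) of a
    non-negative luminance patch."""
    lmax, lmin = float(patch.max()), float(patch.min())
    if lmax + lmin == 0:
        return 0.0
    return (lmax - lmin) / (lmax + lmin)

def contrast_pyramid(image, levels):
    """Global Michelson contrast at successively coarser scales,
    halving resolution each level by 2x2 block averaging."""
    out = []
    img = image.astype(float)
    for _ in range(levels):
        out.append(michelson_contrast(img))
        h, w = img.shape
        img = img[:h // 2 * 2, :w // 2 * 2] \
            .reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    return out
```

A fine checkerboard, for instance, has full contrast at the finest scale and none once averaging has washed it out, which is the scale-dependence the decomposition captures.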
 
Address  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN 0301-0066 ISBN Medium  
Area Expedition Conference  
Notes CIC Approved no  
Call Number Admin @ si @ Ota2012 Serial 2179  
 

 
Author Olivier Penacchio; Laura Dempere-Marco; Xavier Otazu
Title Switching off brightness induction through induction-reversed images Type Abstract
Year 2012 Publication Perception Abbreviated Journal PER  
Volume 41 Issue Pages 208  
Keywords  
Abstract Brightness induction is the modulation of the perceived intensity of an area by the luminance of surrounding areas. Although V1 is traditionally regarded as an area mostly responsive to retinal information, neurophysiological evidence suggests that it may explicitly represent brightness information. In this work, we investigate possible neural mechanisms underlying brightness induction. To this end, we consider the model by Z. Li (1999, Network: Comput. Neural Syst., 10, 187-212), which is constrained by neurophysiological data and focuses on the part of V1 responsible for contextual influences. This model, which has proven to account for phenomena such as contour detection and preattentive segmentation, shares with brightness induction the relevant effect of contextual influences. Importantly, the input to our network model derives from a complete multiscale and multiorientation wavelet decomposition, which makes it possible to recover an image reflecting the perceived luminance and successfully accounts for well-known psychophysical effects for both static and dynamic contexts. By further considering inverse-problem techniques we define induction-reversed images: given a target image, we build an image whose perceived luminance matches the actual luminance of the original stimulus, thus effectively cancelling out brightness induction effects. We suggest that induction-reversed images may help remove undesired perceptual effects and can find potential applications in fields such as radiological image interpretation.
 
Address  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN ISBN Medium  
Area Expedition Conference  
Notes CIC Approved no  
Call Number Admin @ si @ PDO2012a Serial 2180  
 

 
Author Olivier Penacchio; Laura Dempere-Marco; Xavier Otazu
Title A Neurodynamical Model Of Brightness Induction In V1 Following Static And Dynamic Contextual Influences Type Abstract
Year 2012 Publication 8th Federation of European Neurosciences Abbreviated Journal  
Volume 6 Issue Pages 63-64  
Keywords  
Abstract Brightness induction is the modulation of the perceived intensity of an area by the luminance of surrounding areas. Although striate cortex is traditionally regarded as an area mostly responsive to sensory (i.e. retinal) information, neurophysiological evidence suggests that perceived brightness information might be explicitly represented in V1. Such evidence has been observed both in anaesthetised cats, where neuronal response modulations have been found to follow luminance changes outside the receptive fields, and in human fMRI measurements. In this work, possible neural mechanisms that offer a plausible explanation for such phenomena are investigated. To this end, we consider the model proposed by Z. Li (Li, Network: Comput. Neural Syst., 10 (1999)), which is based on neurophysiological evidence and focuses on the part of V1 responsible for contextual influences, i.e. layer 2-3 pyramidal cells, interneurons, and horizontal intracortical connections. This model has reproduced other phenomena such as contour detection and preattentive segmentation, which share with brightness induction the relevant effect of contextual influences. We have extended the original model such that the input to the network is obtained from a complete multiscale and multiorientation wavelet decomposition, thereby allowing the recovery of an image reflecting the perceived intensity. The proposed model successfully accounts for well-known psychophysical effects for static contexts (among them: the White's and modified White's effects, the Todorovic, Chevreul, achromatic ring patterns, and grating induction effects) and also for brightness induction in dynamic contexts defined by modulating the luminance of surrounding areas (e.g. the brightness of a static central area is perceived to vary in antiphase to the sinusoidal luminance changes of its surroundings). This work thus suggests that intra-cortical interactions in V1 could partially explain perceptual brightness induction effects and reveals how a common general architecture may account for several different fundamental processes emerging early in the visual processing pathway.
 
Address  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN ISBN Medium  
Area Expedition Conference FENS  
Notes CIC Approved no  
Call Number Admin @ si @ PDO2012b Serial 2181  
 

 
Author Jordi Roca; C. Alejandro Parraga; Maria Vanrell
Title Predicting categorical colour perception in successive colour constancy Type Abstract
Year 2012 Publication Perception Abbreviated Journal PER  
Volume 41 Issue Pages 138  
Keywords  
Abstract Colour constancy is a perceptual mechanism that seeks to keep the colour of objects relatively stable under an illumination shift. Experiments have shown that its effects depend on the number of colours present in the scene. We studied categorical colour changes under different adaptation states; in particular, whether the colour categories seen under a chromatically neutral illuminant are the same after a shift in the chromaticity of the illumination. To do this, we developed the chromatic setting paradigm (2011, Journal of Vision, 11, 349), which is an extension of achromatic setting to colour categories. The paradigm exploits the ability of subjects to reliably reproduce the most representative examples of each category, adjusting multiple test patches embedded in a coloured Mondrian. Our experiments were run on a CRT monitor (inside a dark room) under various simulated illuminants, restricting the number of colours of the Mondrian background to three, thus weakening the adaptation effect. Our results show a change in the colour categories present before (under neutral illumination) and after adaptation (under coloured illuminants), with a tendency for adapted colours to be less saturated than before adaptation. This behaviour was predicted by a simple affine matrix model adjusted to the chromatic setting results.
 
Address  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN 0301-0066 ISBN Medium  
Area Expedition Conference  
Notes CIC Approved no  
Call Number Admin @ si @ RPV2012 Serial 2188  
 

 
Author Jordi Roca; Maria Vanrell; C. Alejandro Parraga
Title What is constant in colour constancy? Type Conference Article
Year 2012 Publication 6th European Conference on Colour in Graphics, Imaging and Vision Abbreviated Journal  
Volume Issue Pages 337-343  
Keywords  
Abstract Color constancy refers to the ability of the human visual system to stabilize the color appearance of surfaces under an illuminant change. In this work we studied how the interrelations among nine colors are perceived under illuminant changes, particularly whether they remain stable across 10 different conditions (5 illuminants and 2 backgrounds). To do so we have used a paradigm that measures several colors under an immersive state of adaptation. From our measures we defined a perceptual structure descriptor that is up to 87% stable over all conditions, suggesting that color category features could be used to predict color constancy. This is in agreement with previous results on the stability of border categories [1,2] and with computational color constancy algorithms [3] for estimating the scene illuminant.
 
Address  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN ISBN 9781622767014 Medium  
Area Expedition Conference CGIV  
Notes CIC Approved no  
Call Number RVP2012 Serial 2189  
 

 
Author Adria Ruiz; Joost Van de Weijer; Xavier Binefa
Title Regularized Multi-Concept MIL for weakly-supervised facial behavior categorization Type Conference Article
Year 2014 Publication 25th British Machine Vision Conference Abbreviated Journal  
Volume Issue Pages  
Keywords  
Abstract We address the problem of estimating high-level semantic labels for videos of recorded people by analysing their facial expressions. This problem, which we refer to as facial behavior categorization, is a weakly-supervised learning problem where we do not have access to frame-by-frame facial gesture annotations; only weak labels at the video level are available. The goal is therefore to learn a set of discriminative expressions and how they determine the video weak-labels. Facial behavior categorization can be posed as a Multi-Instance-Learning (MIL) problem, and we propose a novel MIL method called Regularized Multi-Concept MIL (RMC-MIL) to solve it. In contrast to previous approaches applied in facial behavior analysis, RMC-MIL follows a Multi-Concept assumption which allows different facial expressions (concepts) to contribute differently to the video-label. Moreover, to handle the high-dimensional nature of facial descriptors, RMC-MIL uses a discriminative approach to model the concepts and structured sparsity regularization to discard non-informative features. RMC-MIL is posed as a convex-constrained optimization problem where all the parameters are jointly learned using the Projected Quasi-Newton method. In our experiments, we use two public datasets to show the advantages of the Regularized Multi-Concept approach and its improvement over existing MIL methods. RMC-MIL outperforms state-of-the-art results on the UNBC dataset for pain detection.  
Address Nottingham; UK; September 2014  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN ISBN Medium  
Area Expedition Conference BMVC  
Notes LAMP; CIC; 600.074; 600.079 Approved no  
Call Number Admin @ si @ RWB2014 Serial 2508  
 

 
Author Fahad Shahbaz Khan; Joost Van de Weijer; Andrew Bagdanov; Michael Felsberg
Title Scale Coding Bag-of-Words for Action Recognition Type Conference Article
Year 2014 Publication 22nd International Conference on Pattern Recognition Abbreviated Journal  
Volume Issue Pages 1514-1519  
Keywords  
Abstract Recognizing human actions in still images is a challenging problem in computer vision due to significant amounts of scale, illumination and pose variation. Given the bounding box of a person both at training and test time, the task is to classify the action associated with each bounding box in an image. Most state-of-the-art methods use the bag-of-words paradigm for action recognition. The bag-of-words framework employing a dense multi-scale grid sampling strategy is the de facto standard for feature detection. This results in a scale-invariant image representation where all the features at multiple scales are binned in a single histogram. We argue that such a scale-invariant strategy is sub-optimal since it ignores the multi-scale information available with each bounding box of a person. This paper investigates alternative approaches to scale coding for action recognition in still images. We encode multi-scale information explicitly in three different histograms for small, medium and large scale visual words. Our first approach exploits multi-scale information with respect to the image size. In our second approach, we encode multi-scale information relative to the size of the bounding box of a person instance. In each approach, the multi-scale histograms are then concatenated into a single representation for action classification. We validate our approaches on the Willow dataset, which contains seven action categories: interacting with computer, photography, playing music, riding bike, riding horse, running and walking. Our results clearly suggest that the proposed scale coding approaches outperform the conventional scale-invariant technique. Moreover, we show that our approach obtains promising results compared to more complex state-of-the-art methods.
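The scale-coding idea — separate histograms for small, medium and large scale visual words, concatenated into one representation — can be sketched as follows. Thresholds and names are illustrative, not the paper's values.

```python
import numpy as np

def scale_coded_bow(word_ids, scales, n_words,
                    bins=((0, 8), (8, 16), (16, float("inf")))):
    """Bin visual words into small/medium/large scale groups (illustrative
    thresholds) and concatenate one normalized histogram per group, instead
    of pooling all scales into a single scale-invariant histogram."""
    hists = []
    for lo, hi in bins:
        sel = np.array([w for w, s in zip(word_ids, scales) if lo <= s < hi],
                       dtype=int)
        h = np.bincount(sel, minlength=n_words).astype(float)
        if h.sum() > 0:
            h /= h.sum()
        hists.append(h)
    return np.concatenate(hists)
```

The resulting vector is three times longer than a conventional bag-of-words histogram, but it preserves which scale each visual word was detected at; binning scales relative to the person bounding box instead of the image would follow the same pattern.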
 
Address Stockholm; August 2014  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN ISBN Medium  
Area Expedition Conference ICPR  
Notes CIC; LAMP; 601.240; 600.074; 600.079 Approved no  
Call Number Admin @ si @ KWB2014 Serial 2450  
 

 
Author Shida Beigpour; Christian Riess; Joost Van de Weijer; Elli Angelopoulou
Title Multi-Illuminant Estimation with Conditional Random Fields Type Journal Article
Year 2014 Publication IEEE Transactions on Image Processing Abbreviated Journal TIP  
Volume 23 Issue 1 Pages 83-95  
Keywords color constancy; CRF; multi-illuminant  
Abstract Most existing color constancy algorithms assume uniform illumination. However, in real-world scenes, this is not often the case. Thus, we propose a novel framework for estimating the colors of multiple illuminants and their spatial distribution in the scene. We formulate this problem as an energy minimization task within a conditional random field over a set of local illuminant estimates. In order to quantitatively evaluate the proposed method, we created a novel data set of two-dominant-illuminant images comprised of laboratory, indoor, and outdoor scenes. Unlike prior work, our database includes accurate pixel-wise ground truth illuminant information. The performance of our method is evaluated on multiple data sets. Experimental results show that our framework clearly outperforms single illuminant estimators as well as a recently proposed multi-illuminant estimation approach.  
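The energy-minimization formulation can be sketched at toy scale: one candidate illuminant label per grid cell, a data term tying each cell to its local illuminant estimate, and a smoothness term penalizing neighbouring cells that disagree. This is only a hedged illustration of the CRF idea; the paper's actual data and pairwise terms differ, and every name below is hypothetical.

```python
import numpy as np

def labeling_energy(labels, local_estimates, candidates, lam=1.0):
    """Energy of assigning one candidate illuminant per grid cell.
    data term:   distance between each cell's local illuminant estimate
                 and the candidate illuminant its label selects;
    smoothness:  lam per pair of 4-neighbouring cells with different labels.
    Minimizing this over labelings is the toy analogue of the CRF inference."""
    h, w = labels.shape
    data = sum(np.linalg.norm(local_estimates[i, j] - candidates[labels[i, j]])
               for i in range(h) for j in range(w))
    smooth = sum(labels[i, j] != labels[i + 1, j]
                 for i in range(h - 1) for j in range(w)) \
           + sum(labels[i, j] != labels[i, j + 1]
                 for i in range(h) for j in range(w - 1))
    return data + lam * smooth
```

A labeling that matches the local estimates and is spatially coherent gets a lower energy than one that flips illuminants between neighbouring cells, which is the trade-off the conditional random field encodes.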
Address  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN 1057-7149 ISBN Medium  
Area Expedition Conference  
Notes CIC; LAMP; 600.074; 600.079 Approved no  
Call Number Admin @ si @ BRW2014 Serial 2451  
 

 
Author Fahad Shahbaz Khan; Joost Van de Weijer; Muhammad Anwer Rao; Michael Felsberg; Carlo Gatta
Title Semantic Pyramids for Gender and Action Recognition Type Journal Article
Year 2014 Publication IEEE Transactions on Image Processing Abbreviated Journal TIP  
Volume 23 Issue 8 Pages 3633-3645  
Keywords  
Abstract Person description is a challenging problem in computer vision. We investigated two major aspects of person description: 1) gender and 2) action recognition in still images. Most state-of-the-art approaches for gender and action recognition rely on the description of a single body part, such as face or full-body. However, relying on a single body part is suboptimal due to significant variations in scale, viewpoint, and pose in real-world images. This paper proposes a semantic pyramid approach for pose normalization. Our approach is fully automatic and based on combining information from full-body, upper-body, and face regions for gender and action recognition in still images. The proposed approach does not require any annotations for upper-body and face of a person. Instead, we rely on pretrained state-of-the-art upper-body and face detectors to automatically extract semantic information of a person. Given multiple bounding boxes from each body part detector, we then propose a simple method to select the best candidate bounding box, which is used for feature extraction. Finally, the extracted features from the full-body, upper-body, and face regions are combined into a single representation for classification. To validate the proposed approach for gender recognition, experiments are performed on three large data sets namely: 1) human attribute; 2) head-shoulder; and 3) proxemics. For action recognition, we perform experiments on four data sets most used for benchmarking action recognition in still images: 1) Sports; 2) Willow; 3) PASCAL VOC 2010; and 4) Stanford-40. Our experiments clearly demonstrate that the proposed approach, despite its simplicity, outperforms state-of-the-art methods for gender and action recognition.  
Address  
Corporate Author Thesis (down)  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN 1057-7149 ISBN Medium  
Area Expedition Conference  
Notes CIC; LAMP; 601.160; 600.074; 600.079;MILAB Approved no  
Call Number Admin @ si @ KWR2014 Serial 2507  
Permanent link to this record
 

 
Author Marc Serra; Olivier Penacchio; Robert Benavente; Maria Vanrell; Dimitris Samaras edit   pdf
doi  openurl
Title The Photometry of Intrinsic Images Type Conference Article
Year 2014 Publication 27th IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal  
Volume Issue Pages 1494-1501  
Keywords  
Abstract Intrinsic characterization of scenes is often the best way to overcome the illumination variability artifacts that complicate most computer vision problems, from 3D reconstruction to object or material recognition. This paper examines the deficiency of existing intrinsic image models to accurately account for the effects of illuminant color and sensor characteristics in the estimation of intrinsic images, and presents a generic framework which incorporates insights from color constancy research into the intrinsic image decomposition problem. The proposed mathematical formulation includes information about the color of the illuminant and the effects of the camera sensors, both of which modify the observed color of the reflectance of the objects in the scene during the acquisition process. By modeling these effects, we get a “truly intrinsic” reflectance image, which we call absolute reflectance, which is invariant to changes of illuminant or camera sensors. This model allows us to represent a wide range of intrinsic image decompositions depending on the specific assumptions on the geometric properties of the scene configuration and the spectral properties of the light source and the acquisition system, thus unifying previous models in a single general framework. We demonstrate that even partial information about sensors significantly improves the estimated reflectance images, thus making our method applicable for a wide range of sensors. We validate our general intrinsic image framework experimentally with both synthetic data and natural images.  
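The photometric idea in the abstract can be illustrated with a toy forward model. This is a simplified sketch under assumptions of my own (a scalar shading field and a diagonal, per-channel illuminant), not the paper's full formulation, which also integrates the camera sensor responses:

```python
import numpy as np

# Simplified image formation model per pixel x and channel c:
#     I_c(x) = s(x) * e_c * R_c(x)
# with s the scalar shading, e = (e_R, e_G, e_B) the illuminant colour,
# and R the reflectance. Under this model the "absolute reflectance" is
# recovered by dividing out both shading and illuminant colour.
rng = np.random.default_rng(1)
H, W = 4, 4
R_true = rng.uniform(0.2, 0.9, size=(H, W, 3))   # ground-truth reflectance
s = rng.uniform(0.5, 1.5, size=(H, W, 1))        # shading field
e = np.array([1.2, 1.0, 0.8])                    # yellowish illuminant

I = s * e * R_true                               # observed image

# Invert the model given (estimates of) shading and illuminant:
R_est = I / (s * e)

print(np.allclose(R_est, R_true))  # True
```

The point the paper makes is precisely that classical intrinsic image models stop at I = s * R and fold the illuminant colour into the reflectance; keeping e (and the sensor model) explicit is what makes the recovered reflectance invariant to illuminant and camera changes.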
Address Columbus; Ohio; USA; June 2014  
Corporate Author Thesis (down)  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN ISBN Medium  
Area Expedition Conference CVPR  
Notes CIC; 600.052; 600.051; 600.074 Approved no  
Call Number Admin @ si @ SPB2014 Serial 2506  
Permanent link to this record
 

 
Author M. Danelljan; Fahad Shahbaz Khan; Michael Felsberg; Joost Van de Weijer edit   pdf
doi  openurl
Title Adaptive color attributes for real-time visual tracking Type Conference Article
Year 2014 Publication 27th IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal  
Volume Issue Pages 1090 - 1097  
Keywords  
Abstract Visual tracking is a challenging problem in computer vision. Most state-of-the-art visual trackers either rely on luminance information or use simple color representations for image description. Contrary to visual tracking, for object recognition and detection, sophisticated color features combined with luminance have been shown to provide excellent performance. Due to the complexity of the tracking problem, the desired color feature should be computationally efficient and possess a certain amount of photometric invariance while maintaining high discriminative power. This paper investigates the contribution of color in a tracking-by-detection framework. Our results suggest that color attributes provide superior performance for visual tracking. We further propose an adaptive low-dimensional variant of color attributes. Both quantitative and attribute-based evaluations are performed on 41 challenging benchmark color sequences. The proposed approach improves the baseline intensity-based tracker by 24% in median distance precision. Furthermore, we show that our approach outperforms state-of-the-art tracking methods while running at more than 100 frames per second.  
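The color attributes mentioned in the abstract map RGB values to probabilities over the 11 basic English colour names, and the "adaptive low-dimensional variant" projects that 11-D representation down per sequence. The sketch below illustrates the two steps only in spirit: the random lookup table stands in for the learned RGB-to-colour-name mapping, and a plain PCA stands in for the paper's adaptive dimensionality reduction.

```python
import numpy as np

COLOR_NAMES = ["black", "blue", "brown", "grey", "green", "orange",
               "pink", "purple", "red", "white", "yellow"]

rng = np.random.default_rng(2)
# Hypothetical 32x32x32 lookup table of colour-name probabilities
# (a stand-in for the mapping learned from real-world images).
lut = rng.random((32, 32, 32, 11))
lut /= lut.sum(axis=-1, keepdims=True)           # each entry sums to 1

def color_attributes(rgb_patch):
    # Quantise 8-bit RGB into 32 bins per channel, then look up the
    # 11-D colour-name probability vector for every pixel.
    idx = (rgb_patch // 8).astype(int)
    return lut[idx[..., 0], idx[..., 1], idx[..., 2]]  # shape (..., 11)

patch = rng.integers(0, 256, size=(16, 16, 3))   # dummy tracked patch
attrs = color_attributes(patch).reshape(-1, 11)

# Low-dimensional variant: project the 11-D attributes onto their
# top 2 principal components (the paper adapts this per sequence).
centred = attrs - attrs.mean(axis=0)
_, _, vt = np.linalg.svd(centred, full_matrices=False)
low_dim = centred @ vt[:2].T
print(attrs.shape, low_dim.shape)  # (256, 11) (256, 2)
```

Reducing the 11-D attribute map to a few dimensions is what keeps the per-frame cost low enough for the 100+ fps figure quoted in the abstract, since the correlation-filter tracker's cost grows with the number of feature channels.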
Address Columbus; Ohio; USA; June 2014  
Corporate Author Thesis (down)  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN ISBN Medium  
Area Expedition Conference CVPR  
Notes CIC; LAMP; 600.074; 600.079 Approved no  
Call Number Admin @ si @ DKF2014 Serial 2509  
Permanent link to this record