Author Anna Salvatella; Maria Vanrell; Ramon Baldrich
Title Subtexture Components for Texture Description Type Miscellaneous
Year 2003 Publication Lecture Notes in Computer Science, vol 2652, pp 884–892 Abbreviated Journal  
Volume Issue Pages  
Keywords  
Abstract  
Address Springer-Verlag
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN ISBN Medium  
Area Expedition Conference  
Notes CIC Approved no  
Call Number CAT @ cat @ SVR2003 Serial 421  
Permanent link to this record
 

 
Author Hassan Ahmed Sial
Title Estimating Light Effects from a Single Image: Deep Architectures and Ground-Truth Generation Type Book Whole
Year 2021 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
Volume Issue Pages  
Keywords  
Abstract In this thesis, we explore how to estimate the effects of light interacting with scene objects from a single image. To achieve this goal, we focus on recovering intrinsic components such as reflectance, shading, or light properties like color and position using deep architectures. The success of these approaches relies on training on large and diversified image datasets. Therefore, we present several contributions to this end: (a) a data-augmentation technique; (b) a ground-truth for an existing multi-illuminant dataset; (c) a family of synthetic datasets, SID for Surreal Intrinsic Datasets, with diversified backgrounds and coherent light conditions; and (d) a practical pipeline to create hybrid ground-truths that overcomes the complexity of acquiring realistic light conditions at scale. In parallel with the creation of these datasets, we trained different flexible encoder-decoder deep architectures incorporating physical constraints from the image formation model.

In the last part of the thesis, we apply all the previous experience to two different problems. First, we create a large hybrid Doc3DShade dataset with real shading and synthetic reflectance under complex illumination conditions, which is used to train a two-stage architecture that improves character recognition on unwrapped documents in complex lighting conditions. Second, we tackle the problem of single-image scene relighting by extending both the SID dataset, to present stronger shading and shadow effects, and the deep architectures, to use intrinsic components to estimate newly relit images.
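Since the thesis leans on physical constraints from the image formation model, the following is a minimal sketch (not the author's code) of how such a constraint can enter an encoder-decoder objective: the two decoder outputs are supervised directly, and a reconstruction term enforces the Lambertian relation I = R * S. The function name, loss weights, and the use of L1 penalties are illustrative assumptions only.

    import torch.nn.functional as F

    def intrinsic_losses(pred_reflectance, pred_shading, image,
                         gt_reflectance, gt_shading, w_rec=0.5):
        # Supervised terms on the two intrinsic components.
        l_r = F.l1_loss(pred_reflectance, gt_reflectance)
        l_s = F.l1_loss(pred_shading, gt_shading)
        # Physical constraint: the predicted components must re-compose the input image.
        l_rec = F.l1_loss(pred_reflectance * pred_shading, image)
        return l_r + l_s + w_rec * l_rec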
 
Address September 2021
Corporate Author Thesis Ph.D. thesis  
Publisher IMPRIMA Place of Publication Editor Maria Vanrell; Ramon Baldrich  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN ISBN 978-84-122714-8-5 Medium  
Area Expedition Conference  
Notes CIC; Approved no  
Call Number Admin @ si @ Sia2021 Serial 3607  
Permanent link to this record
 

 
Author Marc Serra
Title Modeling, estimation and evaluation of intrinsic images considering color information Type Book Whole
Year 2015 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
Volume Issue Pages  
Keywords  
Abstract Image values are the result of a combination of visual information coming from multiple sources. Recovering information about the multiple factors that produced an image is a hard and ill-posed problem. However, it is important to observe that humans develop the ability to interpret images and to recognize and isolate specific physical properties of the scene.

Images describing a single physical characteristic of a scene are called intrinsic images. These images would benefit most computer vision tasks, which are often affected by the multiple complex effects usually found in natural images (e.g. cast shadows, specularities, interreflections...).

In this thesis we analyze the problem of intrinsic image estimation from different perspectives, including the theoretical formulation of the problem, the visual cues that can be used to estimate the intrinsic components and the evaluation mechanisms of the problem.
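As a toy illustration of the decomposition discussed above (not taken from the thesis), a Lambertian image is the pixel-wise product of reflectance and shading, I(x) = R(x) * S(x), which makes recovery ill-posed: any scaling of one component can be absorbed by the other.

    import numpy as np

    rng = np.random.default_rng(0)
    R = rng.uniform(0.1, 1.0, size=(4, 4, 3))   # reflectance (albedo)
    S = rng.uniform(0.2, 1.0, size=(4, 4, 1))   # grayscale shading
    I = R * S                                   # observed image

    k = 2.0
    # The same image is explained by infinitely many (R, S) pairs.
    assert np.allclose(I, (R / k) * (S * k))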
 
Address September 2015
Corporate Author Thesis Ph.D. thesis  
Publisher Ediciones Graficas Rey Place of Publication Editor Robert Benavente; Olivier Penacchio  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN ISBN 978-84-943427-4-5 Medium  
Area Expedition Conference  
Notes CIC; 600.074 Approved no  
Call Number Admin @ si @ Ser2015 Serial 2688  
Permanent link to this record
 

 
Author Justine Giroux; Mohammad Reza Karimi Dastjerdi; Yannick Hold-Geoffroy; Javier Vazquez; Jean François Lalonde
Title Towards a Perceptual Evaluation Framework for Lighting Estimation Type Conference Article
Year 2024 Publication Arxiv Abbreviated Journal  
Volume Issue Pages  
Keywords  
Abstract Progress in lighting estimation is tracked by computing existing image quality assessment (IQA) metrics on images from standard datasets. While this may appear to be a reasonable approach, we demonstrate that doing so does not correlate to human preference when the estimated lighting is used to relight a virtual scene into a real photograph. To study this, we design a controlled psychophysical experiment where human observers must choose their preference amongst rendered scenes lit using a set of lighting estimation algorithms selected from the recent literature, and use it to analyse how these algorithms perform according to human perception. Then, we demonstrate that none of the most popular IQA metrics from the literature, taken individually, correctly represent human perception. Finally, we show that by learning a combination of existing IQA metrics, we can more accurately represent human preference. This provides a new perceptual framework to help evaluate future lighting estimation algorithms.  
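A minimal sketch of the kind of metric combination the abstract describes, under assumptions of ours: a linear combination of per-image metric scores fitted to pairwise human preferences with a Bradley-Terry style loss. The paper's actual learning procedure may differ.

    import numpy as np

    def fit_metric_combination(X, pairs, lr=0.1, epochs=200):
        """X: (n_images, n_metrics) IQA scores; pairs: list of (i, j) where
        observers preferred image i over image j."""
        w = np.zeros(X.shape[1])
        for _ in range(epochs):
            for i, j in pairs:
                d = X[i] - X[j]
                p = 1.0 / (1.0 + np.exp(-w @ d))   # P(i preferred over j)
                w += lr * (1.0 - p) * d            # gradient ascent on the log-likelihood
        return w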
Address Seattle; USA; June 2024  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN ISBN Medium  
Area Expedition Conference CVPR  
Notes MACO; CIC Approved no  
Call Number Admin @ si @ GDH2024 Serial 3999  
Permanent link to this record
 

 
Author Sandra Jimenez; Xavier Otazu; Valero Laparra; Jesus Malo
Title Chromatic induction and contrast masking: similar models, different goals? Type Conference Article
Year 2013 Publication Human Vision and Electronic Imaging XVIII Abbreviated Journal  
Volume 8651 Issue Pages  
Keywords  
Abstract Normalization of signals coming from linear sensors is a ubiquitous mechanism of neural adaptation [1]. Local interaction between sensors tuned to a particular feature at a certain spatial position and neighboring sensors explains a wide range of psychophysical facts including (1) masking of spatial patterns, (2) non-linearities of motion sensors, (3) adaptation of color perception, (4) brightness and chromatic induction, and (5) image quality assessment. Although the above models have formal and qualitative similarities, it does not necessarily mean that the mechanisms involved are pursuing the same statistical goal. For instance, in the case of chromatic mechanisms (disregarding spatial information), different parameters in the normalization give rise to optimal discrimination or adaptation, and different non-linearities may give rise to error minimization or component independence. In the case of spatial sensors (disregarding color information), a number of studies have pointed out the benefits of masking in statistical independence terms. However, such statistical analysis has not been performed for spatio-chromatic induction models, where chromatic perception depends on spatial configuration. In this work we investigate whether successful spatio-chromatic induction models [6] increase component independence, as previously reported for masking models. Mutual information analysis suggests that seeking an efficient chromatic representation may explain the prevalence of induction effects in spatially simple images.  
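The normalization mechanism referred to above usually takes a divisive form; the following is a minimal, generic sketch (parameter names and values are illustrative, not those of the cited models): each linear sensor response is rescaled by a weighted sum of its neighbours' rectified responses.

    import numpy as np

    def divisive_normalization(x, w, beta=0.1, gamma=2.0):
        """x: 1-D array of linear sensor responses; w: (n, n) matrix of
        interaction weights between sensors."""
        e = np.abs(x) ** gamma
        return np.sign(x) * e / (beta ** gamma + w @ e)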
Address San Francisco CA; USA; February 2013  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN ISBN Medium  
Area Expedition Conference HVEI  
Notes CIC Approved no  
Call Number Admin @ si @ JOL2013 Serial 2240  
Permanent link to this record
 

 
Author Josep M. Gonfaus; Xavier Boix; Joost Van de Weijer; Andrew Bagdanov; Joan Serrat; Jordi Gonzalez
Title Harmony Potentials for Joint Classification and Segmentation Type Conference Article
Year 2010 Publication 23rd IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal  
Volume Issue Pages 3280–3287  
Keywords  
Abstract Hierarchical conditional random fields have been successfully applied to object segmentation. One reason is their ability to incorporate contextual information at different scales. However, these models do not allow multiple labels to be assigned to a single node. At higher scales in the image, this yields an oversimplified model, since multiple classes can reasonably be expected to appear within one region. This simplified model especially limits the impact that observations at larger scales may have on the CRF model. Neglecting the information at larger scales is undesirable, since class-label estimates based on these scales are more reliable than those at smaller, noisier scales. To address this problem, we propose a new potential, called the harmony potential, which can encode any possible combination of class labels. We propose an effective sampling strategy that renders the underlying optimization problem tractable. Results show that our approach obtains state-of-the-art results on two challenging datasets: Pascal VOC 2009 and MSRC-21.  
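A minimal sketch of the intuition behind the harmony potential, under assumptions of ours: the image-level node carries a set of labels, and a region only pays a fixed penalty gamma when its label falls outside that set, so any combination of labels remains representable. The actual potentials and inference in the paper are richer.

    def harmony_potential(region_label, image_label_set, gamma=1.0):
        # Zero cost if the region's label is consistent with the image-level label set.
        return 0.0 if region_label in image_label_set else gamma

    def toy_energy(region_labels, image_label_set, unary, gamma=1.0):
        # Per-region unary costs plus the harmony terms linking regions to the image node.
        return sum(unary[r][l] for r, l in enumerate(region_labels)) + \
               sum(harmony_potential(l, image_label_set, gamma) for l in region_labels)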
Address San Francisco CA, USA  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN 1063-6919 ISBN 978-1-4244-6984-0 Medium  
Area Expedition Conference CVPR  
Notes ADAS;CIC;ISE Approved no  
Call Number ADAS @ adas @ GBW2010 Serial 1296  
Permanent link to this record
 

 
Author Ivet Rafegas; Maria Vanrell
Title Color spaces emerging from deep convolutional networks Type Conference Article
Year 2016 Publication 24th Color and Imaging Conference Abbreviated Journal  
Volume Issue Pages 225-230  
Keywords  
Abstract Award for the best interactive session
Defining color spaces that provide a good encoding of the spatio-chromatic properties of color surfaces is an open problem in color science [8, 22]. Related to this, in computer vision the fusion of color with local image features has been studied and evaluated [16]. In human vision research, the cells that are selective to specific color hues along the visual pathway are also a focus of attention [7, 14]. In line with these research aims, in this paper we study how color is encoded in a deep Convolutional Neural Network (CNN) that has been trained on more than one million natural images for object recognition. These convolutional nets achieve impressive performance in computer vision and rival the representations in the human brain. We explore how color is represented in a CNN architecture that can give some intuition about efficient spatio-chromatic representations. In convolutional layers, the activation of a neuron is related to a spatial filter that combines spatio-chromatic representations; we use an inverted version of this filter to explore its properties. Using a series of unsupervised methods, we classify different types of neurons depending on the color axes they define, and we propose an index of color selectivity of a neuron. We estimate the main color axes that emerge from this trained net and we show that the color selectivity of neurons decreases from early to deeper layers.
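As an illustration of what a color-selectivity index can measure, here is a simple proxy of our own (not the index defined in the paper): compare a neuron's activation on its top-scoring images with its activation on grayscale versions of the same images.

    import numpy as np

    def color_selectivity_proxy(neuron_activation, top_images):
        """neuron_activation(img) -> scalar activation of the neuron (assumed given).
        top_images: list of HxWx3 arrays that maximally activate the neuron."""
        def to_gray(img):
            g = img.mean(axis=2, keepdims=True)
            return np.repeat(g, 3, axis=2)      # keep 3 channels so the net accepts it
        a_color = sum(neuron_activation(img) for img in top_images)
        a_gray = sum(neuron_activation(to_gray(img)) for img in top_images)
        # Near 1: the response collapses without chroma; near 0: achromatic neuron.
        return 1.0 - a_gray / max(a_color, 1e-8)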
 
Address San Diego; USA; November 2016  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN ISBN Medium  
Area Expedition Conference CIC  
Notes CIC Approved no  
Call Number Admin @ si @ RaV2016a Serial 2894  
Permanent link to this record
 

 
Author Maria Vanrell; Jordi Vitria
Title Mathematical Morphology, Granulometries and Texture Perception. Type Miscellaneous
Year 1993 Publication SPIE International Symposium on Optical Instrumentation and Applied Science (Conference on Image Algebra and Morphological Image Processing IV). Abbreviated Journal  
Volume Issue Pages  
Keywords  
Abstract  
Address San Diego; CA; USA  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN ISBN Medium  
Area Expedition Conference  
Notes OR;CIC;MV Approved no  
Call Number BCNPCL @ bcnpcl @ VaV1993 Serial 178  
Permanent link to this record
 

 
Author Bojana Gajic; Ramon Baldrich
Title Cross-domain fashion image retrieval Type Conference Article
Year 2018 Publication CVPR 2018 Workshop on Women in Computer Vision (WiCV 2018, 4th Edition) Abbreviated Journal  
Volume Issue Pages 19500-19502  
Keywords  
Abstract Cross-domain image retrieval is a challenging task that implies matching images from one domain to their pairs from another domain. In this paper we focus on fashion image retrieval, which involves matching an image of a fashion item taken by users to images of the same item taken in controlled conditions, usually by a professional photographer. When facing this problem, we have different products at train and test time, and we use a triplet loss to train the network. We stress the importance of properly training a simple architecture, as well as of adapting general models to the specific task.
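Since the abstract states that a triplet loss is used for training, the following is a minimal sketch of a standard triplet loss; the margin value and the PyTorch formulation are our assumptions, not details from the paper.

    import torch.nn.functional as F

    def triplet_loss(anchor, positive, negative, margin=0.2):
        # anchor: embeddings of user (street) photos; positive: shop photos of the
        # same product; negative: shop photos of different products. All (batch, dim).
        d_pos = F.pairwise_distance(anchor, positive)
        d_neg = F.pairwise_distance(anchor, negative)
        return F.relu(d_pos - d_neg + margin).mean()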
 
Address Salt Lake City, USA; 22 June 2018  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN ISBN Medium  
Area Expedition Conference CVPRW  
Notes CIC; 600.087 Approved no  
Call Number Admin @ si @ Serial 3709  
Permanent link to this record
 

 
Author Xavier Otazu; Maria Vanrell
Title Several lightness induction effects with a computational multiresolution wavelet framework Type Journal
Year 2006 Publication 29th European Conference on Visual Perception (ECVP’06), Perception Suppl., 32: 56–56 Abbreviated Journal  
Volume Issue Pages  
Keywords  
Abstract  
Address Saint-Petersburg (Russia)  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN ISBN Medium  
Area Expedition Conference  
Notes CIC Approved no  
Call Number CAT @ cat @ OtV2006 Serial 659  
Permanent link to this record
 

 
Author Danna Xue; Luis Herranz; Javier Vazquez; Yanning Zhang
Title Burst Perception-Distortion Tradeoff: Analysis and Evaluation Type Conference Article
Year 2023 Publication IEEE International Conference on Acoustics, Speech and Signal Processing Abbreviated Journal  
Volume Issue Pages  
Keywords  
Abstract Burst image restoration attempts to effectively utilize the complementary cues appearing in sequential images to produce a high-quality image. Most current methods use all the available images to obtain the reconstructed image. However, using more images for burst restoration is not always the best option regarding reconstruction quality and efficiency, as the images acquired by handheld imaging devices suffer from degradation and misalignment caused by camera noise and shake. In this paper, we extend the perception-distortion tradeoff theory by introducing multiple-frame information. We propose the area of the unattainable region as a new metric for perception-distortion tradeoff evaluation and comparison. Based on this metric, we analyse the performance of burst restoration from the perspective of the perception-distortion tradeoff in both aligned-burst and misaligned-burst situations. Our analysis reveals the importance of inter-frame alignment for burst restoration and shows that the optimal burst length for the restoration model depends on both the degree of degradation and the degree of misalignment.  
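A rough numerical sketch of how an "area of the unattainable region" style quantity could be computed from observed (distortion, perception) scores; the exact definition and bounds used in the paper may differ, and everything below is an illustrative assumption.

    def unattainable_area(distortion, perception, d_max=None):
        """distortion, perception: per-restoration scores (lower is better for both).
        Traces the lower-left Pareto frontier and integrates the area beneath it."""
        pts = sorted(zip(distortion, perception))
        fd, fp, best = [], [], float("inf")
        for d, p in pts:
            if p < best:
                fd.append(d); fp.append(p); best = p
        d_max = d_max if d_max is not None else max(distortion)
        fd.append(d_max); fp.append(fp[-1])        # close the region on the right
        # Trapezoidal integration of the frontier.
        return sum(0.5 * (fp[k] + fp[k - 1]) * (fd[k] - fd[k - 1]) for k in range(1, len(fd)))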
Address Rhodes Island; Greece; June 2023  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN ISBN Medium  
Area Expedition Conference ICASSP  
Notes CIC; MACO Approved no  
Call Number Admin @ si @ XHV2023 Serial 3909  
Permanent link to this record
 

 
Author Fahad Shahbaz Khan; Muhammad Anwer Rao; Joost Van de Weijer; Andrew Bagdanov; Maria Vanrell; Antonio Lopez
Title Color Attributes for Object Detection Type Conference Article
Year 2012 Publication 25th IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal  
Volume Issue Pages 3306-3313  
Keywords pedestrian detection  
Abstract State-of-the-art object detectors typically use shape information as a low-level feature representation to capture the local structure of an object. This paper shows that early fusion of shape and color, as is popular in image classification, leads to a significant drop in performance for object detection. Moreover, such approaches also yield suboptimal results for object categories with varying importance of color and shape.

In this paper we propose the use of color attributes as an explicit color representation for object detection. Color attributes are compact, computationally efficient, and, when combined with traditional shape features, provide state-of-the-art results for object detection. Our method is tested on the PASCAL VOC 2007 and 2009 datasets and results clearly show that our method improves over state-of-the-art techniques despite its simplicity. We also introduce a new dataset consisting of cartoon character images in which color plays a pivotal role. On this dataset, our approach yields a significant gain of 14% in mean AP over conventional state-of-the-art methods.
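As an illustration of what an explicit color-attribute representation can look like (a sketch of ours, not the paper's exact pipeline), each pixel of a detection window can be mapped to a distribution over the 11 basic color names via a pre-learned lookup table, here the hypothetical colorname_lut, and pooled into a histogram that is concatenated with a shape feature such as HOG.

    import numpy as np

    def color_attribute_descriptor(window_rgb, colorname_lut):
        """window_rgb: (H, W, 3) uint8 detection window.
        colorname_lut: (32, 32, 32, 11) table with p(color name | quantized RGB),
        assumed to be learned offline."""
        q = (window_rgb // 8).astype(int)                         # quantize 0-255 into 32 bins
        probs = colorname_lut[q[..., 0], q[..., 1], q[..., 2]]    # (H, W, 11)
        hist = probs.reshape(-1, 11).sum(axis=0)
        return hist / max(hist.sum(), 1e-8)                       # normalized 11-D color attribute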
 
Address Providence; Rhode Island; USA  
Corporate Author Thesis  
Publisher IEEE Xplore Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN 1063-6919 ISBN 978-1-4673-1226-4 Medium  
Area Expedition Conference CVPR  
Notes ADAS; CIC; Approved no  
Call Number Admin @ si @ KRW2012 Serial 1935  
Permanent link to this record
 

 
Author Marc Serra; Olivier Penacchio; Robert Benavente; Maria Vanrell
Title Names and Shades of Color for Intrinsic Image Estimation Type Conference Article
Year 2012 Publication 25th IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal  
Volume Issue Pages 278-285  
Keywords  
Abstract In the last years, intrinsic image decomposition has gained attention. Most of the state-of-the-art methods are based on the assumption that reflectance changes come along with strong image edges. Recently, user intervention in the recovery problem has proved to be a remarkable source of improvement. In this paper, we propose a novel approach that aims to overcome the shortcomings of pure edge-based methods by introducing strong surface descriptors, such as the color-name descriptor which introduces high-level considerations resembling top-down intervention. We also use a second surface descriptor, termed color-shade, which allows us to include physical considerations derived from the image formation model capturing gradual color surface variations. Both color cues are combined by means of a Markov Random Field. The method is quantitatively tested on the MIT ground truth dataset using different error metrics, achieving state-of-the-art performance.  
Address Providence, Rhode Island  
Corporate Author Thesis  
Publisher IEEE Xplore Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN 1063-6919 ISBN 978-1-4673-1226-4 Medium  
Area Expedition Conference CVPR  
Notes CIC Approved no  
Call Number Admin @ si @ SPB2012 Serial 2026  
Permanent link to this record
 

 
Author Naila Murray; Luca Marchesotti; Florent Perronnin
Title AVA: A Large-Scale Database for Aesthetic Visual Analysis Type Conference Article
Year 2012 Publication 25th IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal  
Volume Issue Pages 2408-2415  
Keywords  
Abstract With the ever-expanding volume of visual content available, the ability to organize and navigate such content by aesthetic preference is becoming increasingly important. While still in its nascent stage, research into computational models of aesthetic preference already shows great potential. However, to advance research, realistic, diverse and challenging databases are needed. To this end, we introduce a new large-scale database for conducting Aesthetic Visual Analysis: AVA. It contains over 250,000 images along with a rich variety of meta-data including a large number of aesthetic scores for each image, semantic labels for over 60 categories as well as labels related to photographic style. We show the advantages of AVA with respect to existing databases in terms of scale, diversity, and heterogeneity of annotations. We then describe several key insights into aesthetic preference afforded by AVA. Finally, we demonstrate, through three applications, how the large scale of AVA can be leveraged to improve performance on existing preference tasks  
Address Providence, Rhode Island  
Corporate Author Thesis  
Publisher IEEE Xplore Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN 1063-6919 ISBN 978-1-4673-1226-4 Medium  
Area Expedition Conference CVPR  
Notes CIC Approved no  
Call Number Admin @ si @ MMP2012a Serial 2025  
Permanent link to this record
 

 
Author Bojana Gajic; Eduard Vazquez; Ramon Baldrich
Title Evaluation of Deep Image Descriptors for Texture Retrieval Type Conference Article
Year 2017 Publication Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2017) Abbreviated Journal  
Volume Issue Pages 251-257  
Keywords Texture Representation; Texture Retrieval; Convolutional Neural Networks; Psychophysical Evaluation  
Abstract The increasing complexity learnt in the layers of a Convolutional Neural Network has proven to be of great help for the task of classification, and the topic has received great attention in recently published literature. Nonetheless, just a handful of works study low-level representations, commonly associated with the lower layers. In this paper, we explore recent findings which conclude, counterintuitively, that the last layer of the VGG convolutional network is the best to describe a low-level property such as texture. To shed some light on this issue, we propose a psychophysical experiment to evaluate the adequacy of different layers of the VGG network for texture retrieval. The results obtained suggest that, whereas the last convolutional layer is a good choice for a specific classification task, it might not be the best choice as a texture descriptor, showing very poor performance on texture retrieval. Intermediate layers show the best performance, combining basic filters, as in the primary visual cortex, with a degree of higher-level information that describes more complex textures.
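A minimal sketch of the kind of layer-wise retrieval setup the abstract describes, under our assumptions: torchvision's VGG-16, global average pooling of one layer's activations, and cosine-similarity ranking. The layer index below is only illustrative, not the layer selected in the paper.

    import torch
    import torch.nn.functional as F
    import torchvision.models as models

    vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features.eval()

    def describe(img, layer=16):
        """img: normalized (3, H, W) tensor; returns pooled activations of one layer."""
        x = img.unsqueeze(0)
        with torch.no_grad():
            for i, module in enumerate(vgg):
                x = module(x)
                if i == layer:
                    break
        return x.mean(dim=(2, 3)).squeeze(0)

    def rank_database(query_feat, db_feats):
        # db_feats: (N, dim) tensor of database descriptors; returns indices best-first.
        sims = F.cosine_similarity(query_feat.unsqueeze(0), db_feats)
        return torch.argsort(sims, descending=True)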
 
Address Porto, Portugal; 27 February – 1 March 2017  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN ISBN Medium  
Area Expedition Conference VISIGRAPP  
Notes CIC; 600.087 Approved no  
Call Number Admin @ si @ Serial 3710  
Permanent link to this record
 

 
Author Rahat Khan; Joost Van de Weijer; Fahad Shahbaz Khan; Damien Muselet; Christophe Ducottet; Cecile Barat
Title Discriminative Color Descriptors Type Conference Article
Year 2013 Publication IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal  
Volume Issue Pages 2866 - 2873  
Keywords  
Abstract Color description is a challenging task because of large variations in RGB values which occur due to scene accidental events, such as shadows, shading, specularities, illuminant color changes, and changes in viewing geometry. Traditionally, this challenge has been addressed by capturing the variations in physics-based models, and deriving invariants for the undesired variations. The drawback of this approach is that sets of distinguishable colors in the original color space are mapped to the same value in the photometric invariant space. This results in a drop of discriminative power of the color description. In this paper we take an information theoretic approach to color description. We cluster color values together based on their discriminative power in a classification problem. The clustering has the explicit objective to minimize the drop of mutual information of the final representation. We show that such a color description automatically learns a certain degree of photometric invariance. We also show that a universal color representation, which is based on other data sets than the one at hand, can obtain competing performance. Experiments show that the proposed descriptor outperforms existing photometric invariants. Furthermore, we show that combined with shape description these color descriptors obtain excellent results on four challenging datasets, namely, PASCAL VOC 2007, Flowers-102, Stanford dogs-120 and Birds-200.  
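A small sketch of the information-theoretic idea described above, under our own simplification: greedy agglomerative merging of color bins, each merge chosen to lose as little mutual information I(color cluster; class) as possible. The paper's actual clustering algorithm differs in its details.

    import numpy as np

    def mutual_info(p):
        # p: (n_clusters, n_classes) joint distribution p(cluster, class).
        pc = p.sum(axis=1, keepdims=True)
        py = p.sum(axis=0, keepdims=True)
        nz = p > 0
        return float((p[nz] * np.log(p[nz] / (pc @ py)[nz])).sum())

    def merge_color_bins(joint, n_clusters):
        """joint: (n_bins, n_classes) array of p(color bin, class).
        Returns a list mapping each final cluster to its original bins."""
        clusters = [joint[i].copy() for i in range(joint.shape[0])]
        members = [[i] for i in range(joint.shape[0])]
        while len(clusters) > n_clusters:
            best = None
            for i in range(len(clusters)):
                for j in range(i + 1, len(clusters)):
                    trial = [c for k, c in enumerate(clusters) if k not in (i, j)]
                    trial.append(clusters[i] + clusters[j])
                    mi = mutual_info(np.stack(trial))
                    if best is None or mi > best[0]:
                        best = (mi, i, j)
            _, i, j = best
            clusters[i] = clusters[i] + clusters[j]    # merge joint probabilities
            members[i] = members[i] + members[j]
            del clusters[j], members[j]
        return members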
Address Portland; Oregon; June 2013  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN 1063-6919 ISBN Medium  
Area Expedition Conference CVPR  
Notes CIC; 600.048 Approved no  
Call Number Admin @ si @ KWK2013a Serial 2262  
Permanent link to this record
 

 
Author Agnes Borras; Francesc Tous; Josep Llados; Maria Vanrell
Title High-Level Clothes Description Based on Colour-Texture and Structural Features Type Conference Article
Year 2003 Publication 1st Iberian Conference on Pattern Recognition and Image Analysis IbPRIA 2003 Abbreviated Journal  
Volume Issue Pages  
Keywords  
Abstract  
Address Palma de Mallorca  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN ISBN Medium  
Area Expedition Conference  
Notes DAG;CIC Approved no  
Call Number CAT @ cat @ BTL2003b Serial 369  
Permanent link to this record
 

 
Author Ivet Rafegas
Title Color in Visual Recognition: from flat to deep representations and some biological parallelisms Type Book Whole
Year 2017 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
Volume Issue Pages  
Keywords  
Abstract Visual recognition is one of the main problems in computer vision that attempts to solve image understanding by deciding what objects are in images. This problem can be computationally solved by using relevant sets of visual features, such as edges, corners, color or more complex object parts. This thesis contributes to how color features have to be represented for recognition tasks.

Image features can be extracted following two different approaches. A first approach is defining handcrafted descriptors of images, which is then followed by a learning scheme to classify the content (named flat schemes in Kruger et al. (2013)). In this approach, perceptual considerations are habitually used to define efficient color features. Here we propose a new flat color descriptor, based on the extension of color channels to boost the representation of spatio-chromatic contrast, that surpasses state-of-the-art approaches. However, flat schemes lack the generality of biological systems. A second approach proposes evolving these flat schemes into a hierarchical process, as in the visual cortex. This includes an automatic process to learn optimal features. These deep schemes, and more specifically Convolutional Neural Networks (CNNs), have shown impressive performance in solving various vision problems. However, there is a lack of understanding of the internal representation obtained as a result of automatic learning. In this thesis we propose a new methodology to explore the internal representation of trained CNNs by defining the Neuron Feature as a visualization of the intrinsic features encoded in each individual neuron. Additionally, and inspired by physiological techniques, we propose to compute different neuron selectivity indexes (e.g., color, class, orientation or symmetry, amongst others) to label and classify the full CNN neuron population in order to understand the learned representations.

Finally, using the proposed methodology, we present an in-depth study of how color is represented in a specific CNN, trained for object recognition, that competes with primate representational abilities (Cadieu et al. (2014)). We found several parallelisms with biological visual systems: (a) a significant number of color-selective neurons throughout all the layers; (b) an opponent and low-frequency representation of color-oriented edges, with a higher sampling of frequency selectivity in brightness than in color in the first layer, as in V1; (c) a higher sampling of color hue in the second layer, aligned with observed hue maps in V2; (d) a strong color and shape entanglement in all layers, from basic features in shallower layers (V1 and V2) to object and background shapes in deeper layers (V4 and IT); and (e) a strong correlation between neuron color selectivities and color dataset bias.
 
Address November 2017  
Corporate Author Thesis Ph.D. thesis  
Publisher Ediciones Graficas Rey Place of Publication Editor Maria Vanrell  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN ISBN 978-84-945373-7-0 Medium  
Area Expedition Conference  
Notes CIC Approved no  
Call Number Admin @ si @ Raf2017 Serial 3100  
Permanent link to this record