Records
Author | Trevor Canham; Javier Vazquez; Elise Mathieu; Marcelo Bertalmío |
Title | Matching visual induction effects on screens of different size | Type | Journal Article | |||
Year | 2021 | Publication | Journal of Vision | Abbreviated Journal | JOV | |
Volume | 21 | Issue | 6(10) | Pages | 1-22 | |
Keywords | ||||||
Abstract | In the film industry, the same movie is expected to be watched on displays of vastly different sizes, from cinema screens to mobile phones. But visual induction, the perceptual phenomenon by which the appearance of a scene region is affected by its surroundings, will be different for the same image shown on two displays of different dimensions. This phenomenon presents a practical challenge for the preservation of the artistic intentions of filmmakers, because it can lead to shifts in image appearance between viewing destinations. In this work, we show that a neural field model based on the efficient representation principle is able to predict induction effects and how, by regularizing its associated energy functional, the model is still able to represent induction but is now invertible. From this finding, we propose a method to preprocess an image in a screen-size-dependent way so that its perception, in terms of visual induction, may remain constant across displays of different size. The potential of the method is demonstrated through psychophysical experiments on synthetic images and qualitative examples on natural images. |
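As a loose, hypothetical illustration of the induction phenomenon this abstract describes (not the authors' neural field model), the Python sketch below shifts each pixel away from its local mean, so an identical grey patch reads lighter on a dark surround and darker on a light one. The window size and gain are arbitrary choices.

```python
import numpy as np

def toy_induction(image, gain=0.3, win=21):
    """Shift each pixel away from its local mean (contrast-type induction).
    A toy center-surround step, not the paper's neural field model."""
    pad = win // 2
    padded = np.pad(image, pad, mode="edge")
    h, w = image.shape
    local_mean = np.empty_like(image, dtype=float)
    for i in range(h):
        for j in range(w):
            local_mean[i, j] = padded[i:i + win, j:j + win].mean()
    return np.clip(image + gain * (image - local_mean), 0.0, 1.0)

# The same 0.5 patch on a dark vs. a light surround:
dark = np.full((64, 64), 0.2); dark[28:36, 28:36] = 0.5
light = np.full((64, 64), 0.8); light[28:36, 28:36] = 0.5
# The patch center is pushed above 0.5 on the dark surround and below 0.5 on
# the light one -- the appearance shift such preprocessing must compensate.
print(toy_induction(dark)[32, 32], toy_induction(light)[32, 32])
```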
Address | ||||||
Corporate Author | Thesis | |||||
Publisher | Place of Publication | Editor |
Language | Summary Language | Original Title | ||||
Series Editor | Series Title | Abbreviated Series Title | ||||
Series Volume | Series Issue | Edition | ||||
ISSN | ISBN | Medium | ||||
Area | Expedition | Conference | ||||
Notes | CIC | Approved | no | |||
Call Number | Admin @ si @ CVM2021 | Serial | 3595 | |||
Permanent link to this record | ||||||
Author | Graham D. Finlayson; Javier Vazquez; Fufu Fang |
Title | The Discrete Cosine Maximum Ignorance Assumption | Type | Conference Article | |||
Year | 2021 | Publication | 29th Color and Imaging Conference | Abbreviated Journal | ||
Volume | Issue | Pages | 13-18 | |||
Keywords | ||||||
Abstract | The performance of colour correction algorithms is dependent on the reflectance sets used. Sometimes, when the testing reflectance set is changed, the ranking of colour correction algorithms also changes. To remove this dependence on the dataset, we can make assumptions about the set of all possible reflectances. In the Maximum Ignorance with Positivity (MIP) assumption we assume that all reflectances with per-wavelength values between 0 and 1 are equally likely. A weakness in the MIP is that it fails to take into account the correlation of reflectance functions between wavelengths (many of the assumed reflectances are, in reality, not possible). In this paper, we take the view that the maximum ignorance assumption has merit but, hitherto, it has been calculated with respect to the wrong coordinate basis. Here, we propose the Discrete Cosine Maximum Ignorance assumption (DCMI), where all reflectances that have coordinates between max and min bounds in the Discrete Cosine Basis coordinate system are equally likely. Here, the correlation between wavelengths is encoded and this results in the set of all plausible reflectances 'looking like' typical reflectances that occur in nature. This said, the DCMI model is also a superset of all measured reflectance sets. Experiments show that, in colour correction, adopting the DCMI results in similar colour correction performance as using a particular reflectance set. |
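A minimal sketch of the DCMI idea as described in the abstract: draw reflectances uniformly between per-coefficient bounds in a discrete cosine basis rather than per wavelength. The bounds below are made-up placeholders, not the paper's fitted values.

```python
import numpy as np
from scipy.fft import idct

n_wavelengths = 31            # e.g. 400-700 nm in 10 nm steps
rng = np.random.default_rng(0)

# Hypothetical per-coefficient bounds: a wide DC range and rapidly shrinking
# AC ranges. The shrinking ranges are what encode the correlation between
# neighbouring wavelengths that the classic MIP assumption lacks.
lo = -1.0 / (1.0 + np.arange(n_wavelengths)) ** 2
hi = -lo
lo[0], hi[0] = 0.0, 1.0       # keep the mean (DC) level non-negative

coeffs = rng.uniform(lo, hi, size=(1000, n_wavelengths))
reflectances = idct(coeffs, norm="ortho", axis=1)   # back to wavelength domain

# Smoothness shows up as strong correlation between neighbouring wavelengths:
print(np.corrcoef(reflectances[:, 10], reflectances[:, 11])[0, 1])
```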
Address | Virtual; November 2021 | |||||
Corporate Author | Thesis | |||||
Publisher | Place of Publication | Editor |
Language | Summary Language | Original Title | ||||
Series Editor | Series Title | Abbreviated Series Title | ||||
Series Volume | Series Issue | Edition | ||||
ISSN | ISBN | Medium | ||||
Area | Expedition | Conference | CIC | |||
Notes | CIC | Approved | no | |||
Call Number | FVF2021 | Serial | 3596 | |||
Permanent link to this record | ||||||
Author | Yasuko Sugito; Trevor Canham; Javier Vazquez; Marcelo Bertalmio |
Title | A Study of Objective Quality Metrics for HLG-Based HDR/WCG Image Coding | Type | Journal Article |
Year | 2021 | Publication | SMPTE Motion Imaging Journal | Abbreviated Journal | SMPTE | |
Volume | 130 | Issue | 4 | Pages | 53-65 |
Keywords | ||||||
Abstract | In this work, we study the suitability of high dynamic range, wide color gamut (HDR/WCG) objective quality metrics to assess the perceived deterioration of compressed images encoded using the hybrid log-gamma (HLG) method, which is the standard for HDR television. Several image quality metrics have been developed to deal specifically with HDR content, although in previous work we showed that the best results (i.e., better matches to the opinion of human expert observers) are obtained by an HDR metric that consists simply in applying a given standard dynamic range metric, called visual information fidelity (VIF), directly to HLG-encoded images. However, all these HDR metrics ignore the chroma components for their calculations, that is, they consider only the luminance channel. For this reason, in the current work, we conduct subjective evaluation experiments in a professional setting using compressed HDR/WCG images encoded with HLG and analyze the ability of the best HDR metric to detect perceivable distortions in the chroma components, as well as the suitability of popular color metrics (including ΔITPR, which supports parameters for HLG) to correlate with the opinion scores. Our first contribution is to show that there is a need to consider the chroma components in HDR metrics, as there are color distortions that subjects perceive but that the best HDR metric fails to detect. Our second contribution is the surprising result that VIF, which utilizes only the luminance channel, correlates much better with the subjective evaluation scores than the metrics investigated that do consider the color components. |
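A minimal sketch of the metric recipe the abstract reports working best: a standard-dynamic-range VIF applied directly to HLG-encoded images. It assumes the third-party `sewar` package for a pixel-domain VIF implementation; the image arrays are random stand-ins for real frames.

```python
import numpy as np
from sewar.full_ref import vifp   # pixel-domain VIF from the 'sewar' package

rng = np.random.default_rng(0)
# Random stand-ins for 10-bit HLG-encoded reference and compressed frames:
reference = rng.integers(0, 1024, size=(256, 256), dtype=np.uint16)
noise = rng.integers(-8, 9, size=reference.shape)
compressed = np.clip(reference.astype(int) + noise, 0, 1023).astype(np.uint16)

# Apply the SDR metric directly to the HLG-encoded (luminance) images:
print("VIF:", vifp(reference, compressed))
```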
Address | ||||||
Corporate Author | Thesis | |||||
Publisher | Place of Publication | Editor |
Language | Summary Language | Original Title | ||||
Series Editor | Series Title | Abbreviated Series Title | ||||
Series Volume | Series Issue | Edition | ||||
ISSN | ISBN | Medium | ||||
Area | Expedition | Conference | ||||
Notes | CIC | Approved | no | |||
Call Number | SCV2021 | Serial | 3671 | |||
Permanent link to this record | ||||||
Author | Bojana Gajic; Ariel Amato; Ramon Baldrich; Joost Van de Weijer; Carlo Gatta |
Title | Area Under the ROC Curve Maximization for Metric Learning | Type | Conference Article | |||
Year | 2022 | Publication | CVPR 2022 Workshop on Efficient Deep Learning for Computer Vision (ECV 2022, 5th Edition) | Abbreviated Journal | |
Volume | Issue | Pages | ||||
Keywords | Training; Computer vision; Conferences; Area measurement; Benchmark testing; Pattern recognition | |||||
Abstract | Most popular metric learning losses have no direct relation with the evaluation metrics that are subsequently applied to evaluate their performance. We hypothesize that training a metric learning model by maximizing the area under the ROC curve (which is a typical performance measure of recognition systems) can induce an implicit ranking suitable for retrieval problems. This hypothesis is supported by previous work that proved that a curve dominates in ROC space if and only if it dominates in Precision-Recall space. To test this hypothesis, we design and maximize an approximated, derivable relaxation of the area under the ROC curve. The proposed AUC loss achieves state-of-the-art results on two large-scale retrieval benchmark datasets (Stanford Online Products and DeepFashion In-Shop). Moreover, the AUC loss achieves comparable performance to more complex, domain-specific, state-of-the-art methods for vehicle re-identification. |
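A sketch of one way to build a derivable relaxation of AUC, not necessarily the paper's exact formulation: AUC counts how often a positive pair scores above a negative pair, so the step function over all positive-vs-negative score differences is replaced with a sigmoid.

```python
import torch

def soft_auc_loss(scores, labels, tau=0.1):
    """Differentiable relaxation of 1 - AUC: a sigmoid of temperature tau
    over every positive-vs-negative score gap (~ fraction of mis-ranked pairs)."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]        # all positive-vs-negative gaps
    return torch.sigmoid(-diff / tau).mean()

# Example: similarity scores for matching (1) and non-matching (0) pairs.
scores = torch.tensor([0.9, 0.7, 0.4, 0.2], requires_grad=True)
labels = torch.tensor([1, 1, 0, 0])
loss = soft_auc_loss(scores, labels)
loss.backward()
print(float(loss), scores.grad)
```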
Address | New Orleans, USA; 20 June 2022 | |||||
Corporate Author | Thesis | |||||
Publisher | Place of Publication | Editor |
Language | Summary Language | Original Title | ||||
Series Editor | Series Title | Abbreviated Series Title | ||||
Series Volume | Series Issue | Edition | ||||
ISSN | ISBN | Medium | ||||
Area | Expedition | Conference | CVPRW | |||
Notes | CIC; LAMP | Approved | no |
Call Number | Admin @ si @ GAB2022 | Serial | 3700 | |||
Permanent link to this record | ||||||
Author | Bojana Gajic; Ramon Baldrich |
Title | Cross-domain fashion image retrieval | Type | Conference Article | |||
Year | 2018 | Publication | CVPR 2018 Workshop on Women in Computer Vision (WiCV 2018, 4th Edition) | Abbreviated Journal | ||
Volume | Issue | Pages | 19500-19502 | |||
Keywords | ||||||
Abstract | Cross-domain image retrieval is a challenging task that implies matching images from one domain to their pairs from another domain. In this paper we focus on fashion image retrieval, which involves matching an image of a fashion item taken by a user to images of the same item taken under controlled conditions, usually by a professional photographer. When facing this problem, we have different products at training and test time, and we use a triplet loss to train the network. We stress the importance of properly training a simple architecture, as well as adapting general models to the specific task. |
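A minimal sketch of the triplet-loss training signal mentioned in the abstract, with placeholder embeddings: pull a user photo (anchor) toward the studio photo of the same product (positive) and away from a different product (negative). The embedding network itself is out of scope here.

```python
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge on the gap between anchor-positive and anchor-negative distances."""
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin).mean()

anchor = F.normalize(torch.randn(8, 128), dim=1)    # user ("consumer") images
positive = F.normalize(torch.randn(8, 128), dim=1)  # same item, studio shots
negative = F.normalize(torch.randn(8, 128), dim=1)  # different items
print(triplet_loss(anchor, positive, negative))
```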
Address | Salt Lake City, USA; 22 June 2018 | |||||
Corporate Author | Thesis | |||||
Publisher | Place of Publication | Editor |
Language | Summary Language | Original Title | ||||
Series Editor | Series Title | Abbreviated Series Title | ||||
Series Volume | Series Issue | Edition | ||||
ISSN | ISBN | Medium | ||||
Area | Expedition | Conference | CVPRW | |||
Notes | CIC; 600.087 | Approved | no | |||
Call Number | Admin @ si @ | Serial | 3709 | |||
Permanent link to this record | ||||||
Author | Bojana Gajic; Eduard Vazquez; Ramon Baldrich |
Title | Evaluation of Deep Image Descriptors for Texture Retrieval | Type | Conference Article | |||
Year | 2017 | Publication | Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2017) | Abbreviated Journal | ||
Volume | Issue | Pages | 251-257 | |||
Keywords | Texture Representation; Texture Retrieval; Convolutional Neural Networks; Psychophysical Evaluation | |||||
Abstract | The increasing complexity learnt in the layers of a Convolutional Neural Network has proven to be of great help for the task of classification. The topic has received great attention in recently published literature. Nonetheless, just a handful of works study low-level representations, commonly associated with lower layers. In this paper, we explore recent findings which conclude, counterintuitively, that the last layer of the VGG convolutional network is the best to describe a low-level property such as texture. To shed some light on this issue, we propose a psychophysical experiment to evaluate the adequacy of different layers of the VGG network for texture retrieval. The results obtained suggest that, whereas the last convolutional layer is a good choice for a specific task of classification, it might not be the best choice as a texture descriptor, performing very poorly on texture retrieval. Intermediate layers show the best performance, combining basic filters, as in the primary visual cortex, with a degree of higher-level information to describe more complex textures. |
Address | Porto, Portugal; 27 February – 1 March 2017 | |||||
Corporate Author | Thesis | |||||
Publisher | Place of Publication | Editor |
Language | Summary Language | Original Title | ||||
Series Editor | Series Title | Abbreviated Series Title | ||||
Series Volume | Series Issue | Edition | ||||
ISSN | ISBN | Medium | ||||
Area | Expedition | Conference | VISIGRAPP | |||
Notes | CIC; 600.087 | Approved | no | |||
Call Number | Admin @ si @ | Serial | 3710 | |||
Permanent link to this record | ||||||
Author | Marcos V Conde; Javier Vazquez; Michael S Brown; Radu Timofte |
Title | NILUT: Conditional Neural Implicit 3D Lookup Tables for Image Enhancement | Type | Conference Article | |||
Year | 2024 | Publication | 38th AAAI Conference on Artificial Intelligence | Abbreviated Journal | ||
Volume | Issue | Pages | ||||
Keywords | ||||||
Abstract | 3D lookup tables (3D LUTs) are a key component for image enhancement. Modern image signal processors (ISPs) have dedicated support for these as part of the camera rendering pipeline. Cameras typically provide multiple options for picture styles, where each style is usually obtained by applying a unique handcrafted 3D LUT. Current approaches for learning and applying 3D LUTs are notably fast, yet not so memory-efficient, as storing multiple 3D LUTs is required. For this reason and other implementation limitations, their use on mobile devices is less popular. In this work, we propose a Neural Implicit LUT (NILUT), an implicitly defined continuous 3D color transformation parameterized by a neural network. We show that NILUTs are capable of accurately emulating real 3D LUTs. Moreover, a NILUT can be extended to incorporate multiple styles into a single network with the ability to blend styles implicitly. Our novel approach is memory-efficient, controllable and can complement previous methods, including learned ISPs. | |||||
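A minimal sketch of the core NILUT idea as the abstract states it (a 3D color transform represented implicitly by a small MLP queried per pixel), not the authors' architecture. The target transform here is a toy gamma curve standing in for a real handcrafted 3D LUT.

```python
import torch
import torch.nn as nn

class TinyImplicitLUT(nn.Module):
    """A 3D LUT is a function RGB -> RGB, so it can be represented implicitly
    by a small MLP instead of being stored as a dense lattice."""
    def __init__(self, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),
        )

    def forward(self, rgb):                 # rgb: (N, 3) in [0, 1]
        return torch.sigmoid(self.net(rgb))

# Fit the network to emulate a handcrafted transform (a toy gamma curve
# standing in for a real 3D LUT) by regressing on random color samples.
lut = TinyImplicitLUT()
opt = torch.optim.Adam(lut.parameters(), lr=1e-3)
for _ in range(200):
    x = torch.rand(1024, 3)
    loss = ((lut(x) - x ** (1 / 2.2)) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
print("emulation MSE:", float(loss))
```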
Address | ||||||
Corporate Author | Thesis | |||||
Publisher | Place of Publication | Editor |
Language | Summary Language | Original Title | ||||
Series Editor | Series Title | Abbreviated Series Title | ||||
Series Volume | Series Issue | Edition | ||||
ISSN | ISBN | Medium | ||||
Area | Expedition | Conference | AAAI | |||
Notes | CIC; MACO | Approved | no | |||
Call Number | Admin @ si @ CVB2024 | Serial | 3872 | |||
Permanent link to this record | ||||||
Author | Danna Xue; Javier Vazquez; Luis Herranz; Yang Zhang; Michael S Brown |
Title | Integrating High-Level Features for Consistent Palette-based Multi-image Recoloring | Type | Journal Article | |||
Year | 2023 | Publication | Computer Graphics Forum | Abbreviated Journal | CGF | |
Volume | Issue | Pages | ||||
Keywords | ||||||
Abstract | Achieving visually consistent colors across multiple images is important when images are used in photo albums, websites, and brochures. Unfortunately, only a handful of methods address multi-image color consistency compared to one-to-one color transfer techniques. Furthermore, existing methods do not incorporate high-level features that can assist graphic designers in their work. To address these limitations, we introduce a framework that builds upon a previous palette-based color consistency method and incorporates three high-level features: white balance, saliency, and color naming. We show how these features overcome the limitations of the prior multi-consistency workflow and showcase the user-friendly nature of our framework. | |||||
Address | ||||||
Corporate Author | Thesis | |||||
Publisher | Place of Publication | Editor |
Language | Summary Language | Original Title | ||||
Series Editor | Series Title | Abbreviated Series Title | ||||
Series Volume | Series Issue | Edition | ||||
ISSN | ISBN | Medium | ||||
Area | Expedition | Conference | ||||
Notes | CIC; MACO | Approved | no | |||
Call Number | Admin @ si @ XVH2023 | Serial | 3883 | |||
Permanent link to this record | ||||||
Author | Jaykishan Patel; Alban Flachot; Javier Vazquez; David H. Brainard; Thomas S. A. Wallis; Marcus A. Brubaker; Richard F. Murray |
Title | A deep convolutional neural network trained to infer surface reflectance is deceived by mid-level lightness illusions | Type | Journal Article | |||
Year | 2023 | Publication | Journal of Vision | Abbreviated Journal | JOV |
Volume | 23 | Issue | 9 | Pages | 4817-4817 | |
Keywords | ||||||
Abstract | A long-standing view is that lightness illusions are by-products of strategies employed by the visual system to stabilize its perceptual representation of surface reflectance against changes in illumination. Computationally, one such strategy is to infer reflectance from the retinal image, and to base the lightness percept on this inference. CNNs trained to infer reflectance from images have proven successful at solving this problem under limited conditions. To evaluate whether these CNNs provide suitable starting points for computational models of human lightness perception, we tested a state-of-the-art CNN on several lightness illusions, and compared its behaviour to prior measurements of human performance. We trained a CNN (Yu & Smith, 2019) to infer reflectance from luminance images. The network had a 30-layer hourglass architecture with skip connections. We trained the network via supervised learning on 100K images, rendered in Blender, each showing randomly placed geometric objects (surfaces, cubes, tori, etc.), with random Lambertian reflectance patterns (solid, Voronoi, or low-pass noise), under randomized point+ambient lighting. The renderer also provided the ground-truth reflectance images required for training. After training, we applied the network to several visual illusions. These included the argyle, Koffka-Adelson, snake, White’s, checkerboard assimilation, and simultaneous contrast illusions, along with their controls where appropriate. The CNN correctly predicted larger illusions in the argyle, Koffka-Adelson, and snake images than in their controls. It also correctly predicted an assimilation effect in White's illusion. It did not, however, account for the checkerboard assimilation or simultaneous contrast effects. These results are consistent with the view that at least some lightness phenomena are by-products of a rational approach to inferring stable representations of physical properties from intrinsically ambiguous retinal images. Furthermore, they suggest that CNN models may be a promising starting point for new models of human lightness perception. | |||||
Address | ||||||
Corporate Author | Thesis | |||||
Publisher | Place of Publication | Editor |
Language | Summary Language | Original Title | ||||
Series Editor | Series Title | Abbreviated Series Title | ||||
Series Volume | Series Issue | Edition | ||||
ISSN | ISBN | Medium | ||||
Area | Expedition | Conference | ||||
Notes | MACO; CIC | Approved | no | |||
Call Number | Admin @ si @ PFV2023 | Serial | 3890 | |||
Permanent link to this record | ||||||
Author | Marcos V Conde; Florin Vasluianu; Javier Vazquez; Radu Timofte |
Title | Perceptual image enhancement for smartphone real-time applications | Type | Conference Article | |||
Year | 2023 | Publication | Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision | Abbreviated Journal | ||
Volume | Issue | Pages | 1848-1858 | |||
Keywords | ||||||
Abstract | Recent advances in camera designs and imaging pipelines allow us to capture high-quality images using smartphones. However, due to the small size and lens limitations of smartphone cameras, we commonly find artifacts or degradation in the processed images. The most common unpleasant effects are noise artifacts, diffraction artifacts, blur, and HDR overexposure. Deep learning methods for image restoration can successfully remove these artifacts. However, most approaches are not suitable for real-time applications on mobile devices due to their heavy computation and memory requirements. In this paper, we propose LPIENet, a lightweight network for perceptual image enhancement, with a focus on deploying it on smartphones. Our experiments show that, with much fewer parameters and operations, our model can deal with the mentioned artifacts and achieve competitive performance compared with state-of-the-art methods on standard benchmarks. Moreover, to prove the efficiency and reliability of our approach, we deployed the model directly on commercial smartphones and evaluated its performance. Our model can process 2K-resolution images in under 1 second on mid-level commercial smartphones. |
Address | Waikoloa; Hawaii; USA; January 2023 |
Corporate Author | Thesis | |||||
Publisher | Place of Publication | Editor |
Language | Summary Language | Original Title | ||||
Series Editor | Series Title | Abbreviated Series Title | ||||
Series Volume | Series Issue | Edition | ||||
ISSN | ISBN | Medium | ||||
Area | Expedition | Conference | WACV | |||
Notes | MACO; CIC | Approved | no | |||
Call Number | Admin @ si @ CVV2023 | Serial | 3900 | |||
Permanent link to this record | ||||||
Author | Danna Xue; Luis Herranz; Javier Vazquez; Yanning Zhang |
Title | Burst Perception-Distortion Tradeoff: Analysis and Evaluation | Type | Conference Article | |||
Year | 2023 | Publication | IEEE International Conference on Acoustics, Speech and Signal Processing | Abbreviated Journal | ||
Volume | Issue | Pages | ||||
Keywords | ||||||
Abstract | Burst image restoration attempts to effectively utilize the complementary cues appearing in sequential images to produce a high-quality image. Most current methods use all the available images to obtain the reconstructed image. However, using more images for burst restoration is not always the best option regarding reconstruction quality and efficiency, as the images acquired by handheld imaging devices suffer from degradation and misalignment caused by camera noise and shake. In this paper, we extend the perception-distortion tradeoff theory by introducing multiple-frame information. We propose the area of the unattainable region as a new metric for perception-distortion tradeoff evaluation and comparison. Based on this metric, we analyse the performance of burst restoration from the perspective of the perception-distortion tradeoff under both aligned and misaligned burst situations. Our analysis reveals the importance of inter-frame alignment for burst restoration and shows that the optimal burst length for the restoration model depends both on the degree of degradation and misalignment. |
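A hedged sketch of how an "area of the unattainable region" style metric could be estimated from sampled operating points in the perception-distortion plane; the exact definition is in the paper, and the box bounds and points below are illustrative.

```python
import numpy as np

def unattainable_area(points, d_max=1.0, p_max=1.0):
    """Area of the lower-left region no sampled operating point reaches,
    inside the box [0, d_max] x [0, p_max] (step-envelope estimate)."""
    pts = points[np.argsort(points[:, 0])]       # sort by distortion
    frontier, best_p = [], np.inf
    for d, p in pts:
        if p < best_p:                           # keep Pareto-optimal points
            frontier.append((d, p)); best_p = p
    area = frontier[0][0] * p_max                # strip left of the first point
    steps = frontier[1:] + [(d_max, None)]
    for (d, p), (d_next, _) in zip(frontier, steps):
        area += (d_next - d) * p                 # strip below each frontier point
    return area

# Each row: (distortion, perceptual index); lower is better on both axes.
points = np.array([[0.2, 0.9], [0.4, 0.5], [0.6, 0.45], [0.8, 0.2]])
print(unattainable_area(points))
```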
Address | Rhodes Island; Greece; June 2023 |
Corporate Author | Thesis | |||||
Publisher | Place of Publication | Editor |
Language | Summary Language | Original Title | ||||
Series Editor | Series Title | Abbreviated Series Title | ||||
Series Volume | Series Issue | Edition | ||||
ISSN | ISBN | Medium | ||||
Area | Expedition | Conference | ICASSP | |||
Notes | CIC; MACO | Approved | no | |||
Call Number | Admin @ si @ XHV2023 | Serial | 3909 | |||
Permanent link to this record | ||||||
Author | Yawei Li; Yulun Zhang; Radu Timofte; Luc Van Gool; Zhijun Tu; Kunpeng Du; Hailing Wang; Hanting Chen; Wei Li; Xiaofei Wang; Jie Hu; Yunhe Wang; Xiangyu Kong; Jinlong Wu; Dafeng Zhang; Jianxing Zhang; Shuai Liu; Furui Bai; Chaoyu Feng; Hao Wang; Yuqian Zhang; Guangqi Shao; Xiaotao Wang; Lei Lei; Rongjian Xu; Zhilu Zhang; Yunjin Chen; Dongwei Ren; Wangmeng Zuo; Qi Wu; Mingyan Han; Shen Cheng; Haipeng Li; Ting Jiang; Chengzhi Jiang; Xinpeng Li; Jinting Luo; Wenjie Lin; Lei Yu; Haoqiang Fan; Shuaicheng Liu; Aditya Arora; Syed Waqas Zamir; Javier Vazquez; Konstantinos G. Derpanis; Michael S. Brown; Hao Li; Zhihao Zhao; Jinshan Pan; Jiangxin Dong; Jinhui Tang; Bo Yang; Jingxiang Chen; Chenghua Li; Xi Zhang; Zhao Zhang; Jiahuan Ren; Zhicheng Ji; Kang Miao; Suiyi Zhao; Huan Zheng; YanYan Wei; Kangliang Liu; Xiangcheng Du; Sijie Liu; Yingbin Zheng; Xingjiao Wu; Cheng Jin; Rajeev Irny; Sriharsha Koundinya; Vighnesh Kamath; Gaurav Khandelwal; Sunder Ali Khowaja; Jiseok Yoon; Ik Hyun Lee; Shijie Chen; Chengqiang Zhao; Huabin Yang; Zhongjian Zhang; Junjia Huang; Yanru Zhang |
Title | NTIRE 2023 challenge on image denoising: Methods and results | Type | Conference Article | |||
Year | 2023 | Publication | Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops | Abbreviated Journal | ||
Volume | Issue | Pages | 1904-1920 | |||
Keywords | ||||||
Abstract | This paper reviews the NTIRE 2023 challenge on image denoising (σ = 50) with a focus on the proposed solutions and results. The aim is to obtain a network design capable of producing high-quality results with the best performance measured by PSNR for image denoising. Independent additive white Gaussian noise (AWGN) is assumed and the noise level is 50. The challenge had 225 registered participants, and 16 teams made valid submissions. They gauge the state-of-the-art for image denoising. |
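The challenge setup is fully specified in the abstract (AWGN with σ = 50, scored by PSNR); the sketch below reproduces that degradation model and metric on a random stand-in image.

```python
import numpy as np

rng = np.random.default_rng(0)
clean = rng.integers(0, 256, size=(128, 128, 3)).astype(np.float64)  # stand-in image
noisy = clean + rng.normal(0.0, 50.0, size=clean.shape)              # AWGN, sigma = 50

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((reference - estimate) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

# An identity "denoiser" scores the noisy input itself (around 14 dB here);
# challenge entries aim to push this as high as possible.
print("PSNR of noisy input:", psnr(clean, np.clip(noisy, 0, 255)))
```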
Address | Vancouver; Canada; June 2023 | |||||
Corporate Author | Thesis | |||||
Publisher | Place of Publication | Editor |
Language | Summary Language | Original Title | ||||
Series Editor | Series Title | Abbreviated Series Title | ||||
Series Volume | Series Issue | Edition | ||||
ISSN | ISBN | Medium | ||||
Area | Expedition | Conference | CVPRW | |||
Notes | MACO; CIC | Approved | no | |||
Call Number | Admin @ si @ LZT2023 | Serial | 3910 | |||
Permanent link to this record | ||||||
Author | Justine Giroux; Mohammad Reza Karimi Dastjerdi; Yannick Hold-Geoffroy; Javier Vazquez; Jean François Lalonde |
Title | Towards a Perceptual Evaluation Framework for Lighting Estimation | Type | Conference Article | |||
Year | 2024 | Publication | arXiv | Abbreviated Journal | |
Volume | Issue | Pages | ||||
Keywords | ||||||
Abstract | Progress in lighting estimation is tracked by computing existing image quality assessment (IQA) metrics on images from standard datasets. While this may appear to be a reasonable approach, we demonstrate that doing so does not correlate to human preference when the estimated lighting is used to relight a virtual scene into a real photograph. To study this, we design a controlled psychophysical experiment where human observers must choose their preference amongst rendered scenes lit using a set of lighting estimation algorithms selected from the recent literature, and use it to analyse how these algorithms perform according to human perception. Then, we demonstrate that none of the most popular IQA metrics from the literature, taken individually, correctly represent human perception. Finally, we show that by learning a combination of existing IQA metrics, we can more accurately represent human preference. This provides a new perceptual framework to help evaluate future lighting estimation algorithms. |
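A hedged sketch of the final step the abstract describes, learning a combination of IQA metrics to predict pairwise human preference. The metric names and data below are stand-ins, and logistic regression is one plausible combiner, not necessarily the paper's.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Each row: per-metric score differences between two candidate relit renders
# (e.g. dPSNR, dSSIM, dLPIPS, dFID -- placeholder names); the label is which
# render an observer preferred in a 2AFC trial. All synthetic here.
metric_diffs = rng.normal(size=(500, 4))
true_w = np.array([0.2, 1.5, -2.0, 0.5])        # unknown "perceptual" weighting
prefers_first = (metric_diffs @ true_w + rng.normal(0, 0.5, 500)) > 0

model = LogisticRegression().fit(metric_diffs, prefers_first)
print("learned weights:", model.coef_.round(2))  # recovers the mixture up to scale
```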
Address | Seattle; USA; June 2024 | |||||
Corporate Author | Thesis | |||||
Publisher | Place of Publication | Editor |
Language | Summary Language | Original Title | ||||
Series Editor | Series Title | Abbreviated Series Title | ||||
Series Volume | Series Issue | Edition | ||||
ISSN | ISBN | Medium | ||||
Area | Expedition | Conference | CVPR | |||
Notes | MACO; CIC | Approved | no | |||
Call Number | Admin @ si @ GDH2024 | Serial | 3999 | |||
Permanent link to this record | ||||||
Author | Trevor Canham; Javier Vazquez; D Long; Richard F. Murray; Michael S Brown |
Title | Noise Prism: A Novel Multispectral Visualization Technique | Type | Conference Article |
Year | 2021 | Publication | 31st Color and Imaging Conference | Abbreviated Journal | ||
Volume | Issue | Pages | ||||
Keywords | ||||||
Abstract | A novel technique for visualizing multispectral images is proposed. Inspired by how prisms work, our method spreads spectral information over a chromatic noise pattern. This is accomplished by populating the pattern with pixels representing each measurement band at a count proportional to its measured intensity. The method is advantageous because it allows for lightweight encoding and visualization of spectral information while maintaining the color appearance of the stimulus. A four-alternative forced choice (4AFC) experiment was conducted to validate the method’s information-carrying capacity in displaying metameric stimuli of varying colors and spectral basis functions. The scores ranged from 100% to 20% (less than chance given the 4AFC task), with many conditions falling somewhere in between at statistically significant intervals. Using this data, color and texture difference metrics can be evaluated and optimized to predict the legibility of the visualization technique. |
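A minimal sketch of the population rule the abstract describes: each spectral band contributes pixels to a shuffled noise patch in proportion to its measured intensity. The band colors are arbitrary placeholder RGB codes, not a calibrated mapping.

```python
import numpy as np

rng = np.random.default_rng(0)
band_rgb = np.array([[0.2, 0.2, 1.0],    # "blue" band
                     [0.2, 1.0, 0.2],    # "green" band
                     [1.0, 0.8, 0.2],    # "yellow" band
                     [1.0, 0.2, 0.2]])   # "red" band
intensity = np.array([0.1, 0.4, 0.3, 0.2])   # one multispectral measurement

h = w = 64
counts = np.round(intensity / intensity.sum() * h * w).astype(int)
counts[-1] = h * w - counts[:-1].sum()        # make counts total exactly h*w
labels = np.repeat(np.arange(len(counts)), counts)
rng.shuffle(labels)                           # spread bands as chromatic noise
patch = band_rgb[labels].reshape(h, w, 3)
print(patch.shape, np.bincount(labels) / (h * w))  # per-band pixel proportions
```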
Address | ||||||
Corporate Author | Thesis | |||||
Publisher | Place of Publication | Editor |
Language | Summary Language | Original Title | ||||
Series Editor | Series Title | Abbreviated Series Title | ||||
Series Volume | Series Issue | Edition | ||||
ISSN | ISBN | Medium | ||||
Area | Expedition | Conference | CIC | |||
Notes | MACO; CIC | Approved | no | |||
Call Number | Admin @ si @ CVL2021 | Serial | 4000 | |||
Permanent link to this record | ||||||
Author | Susana Alvarez; Anna Salvatella; Maria Vanrell; Xavier Otazu |
Title | 3D Texton Spaces for color-texture retrieval | Type | Conference Article | |||
Year | 2010 | Publication | 7th International Conference on Image Analysis and Recognition | Abbreviated Journal | ||
Volume | 6111 | Issue | Pages | 354–363 | ||
Keywords | ||||||
Abstract | Color and texture are visual cues of different nature, and their integration into a useful visual descriptor is not an easy problem. One way to combine both features is to compute spatial texture descriptors independently on each color channel. Another way is to do the integration at the descriptor level. In this case the problem of normalizing both cues arises. In this paper we solve the latter problem by fusing color and texture through distances in texton spaces. Textons are the attributes of image blobs and they are responsible for texture discrimination, as defined in Julesz’s Texton theory. We describe them in two low-dimensional and uniform spaces, namely, shape and color. The dissimilarity between color texture images is computed by combining the distances in these two spaces. Following this approach, we propose our TCD descriptor, which outperforms current state-of-the-art methods in the two different approaches mentioned above, early combination with LBP and late combination with MPEG-7. This is done on an image retrieval experiment over a highly diverse texture dataset from Corel. |
Address | ||||||
Corporate Author | Thesis | |||||
Publisher | Springer Berlin Heidelberg | Place of Publication | Editor | A.C. Campilho and M.S. Kamel |
Language | Summary Language | Original Title | ||||
Series Editor | Series Title | Abbreviated Series Title | LNCS | |||
Series Volume | Series Issue | Edition | ||||
ISSN | 0302-9743 | ISBN | 978-3-642-13771-6 | Medium | ||
Area | Expedition | Conference | ICIAR | |||
Notes | CIC | Approved | no | |||
Call Number | CAT @ cat @ ASV2010a | Serial | 1325 | |||
Permanent link to this record | ||||||
Author | Joost Van de Weijer; Fahad Shahbaz Khan; Marc Masana |
Title | Interactive Visual and Semantic Image Retrieval | Type | Book Chapter | |||
Year | 2013 | Publication | Multimodal Interaction in Image and Video Applications | Abbreviated Journal | ||
Volume | 48 | Issue | Pages | 31-35 | ||
Keywords | ||||||
Abstract | One direct consequence of recent advances in digital visual data generation, and the direct availability of this information through the World-Wide Web, is an urgent demand for efficient image retrieval systems. The objective of image retrieval is to allow users to efficiently browse through this abundance of images. Due to the non-expert nature of the majority of internet users, such systems should be user friendly, and therefore avoid complex user interfaces. In this chapter we investigate how high-level information provided by recently developed object recognition techniques can improve interactive image retrieval. We apply a bag-of-words based image representation method to automatically classify images into a number of categories. These additional labels are then applied to improve the image retrieval system. In addition to these high-level semantic labels, we also apply a low-level image description to describe the composition and color scheme of the scene. Both descriptions are incorporated in a user-feedback image retrieval setting. The main objective is to show that automatic labeling of images with semantic labels can improve image retrieval results. |
Address | ||||||
Corporate Author | Thesis | |||||
Publisher | Springer Berlin Heidelberg | Place of Publication | Editor | Angel Sappa; Jordi Vitria |
Language | Summary Language | Original Title | ||||
Series Editor | Series Title | Abbreviated Series Title | ||||
Series Volume | Series Issue | Edition | ||||
ISSN | 1868-4394 | ISBN | 978-3-642-35931-6 | Medium | ||
Area | Expedition | Conference | ||||
Notes | CIC; 605.203; 600.048 | Approved | no | |||
Call Number | Admin @ si @ WKC2013 | Serial | 2284 | |||
Permanent link to this record | ||||||
Author | C. Alejandro Parraga |
Title | Color Vision, Computational Methods for | Type | Book Chapter | |||
Year | 2014 | Publication | Encyclopedia of Computational Neuroscience | Abbreviated Journal | ||
Volume | Issue | Pages | 1-11 | |||
Keywords | Color computational vision; Computational neuroscience of color | |||||
Abstract | The study of color vision has been aided by a whole battery of computational methods that attempt to describe the mechanisms that lead to our perception of colors in terms of the information-processing properties of the visual system. Their scope is highly interdisciplinary, linking apparently dissimilar disciplines such as mathematics, physics, computer science, neuroscience, cognitive science, and psychology. Since the sensation of color is a feature of our brains, computational approaches usually include biological features of neural systems in their descriptions, from retinal light-receptor interaction to subcortical color opponency, cortical signal decoding, and color categorization. They produce hypotheses that are usually tested by behavioral or psychophysical experiments. | |||||
Address | ||||||
Corporate Author | Thesis | |||||
Publisher | Springer-Verlag Berlin Heidelberg | Place of Publication | Editor | Dieter Jaeger; Ranu Jung |
Language | Summary Language | Original Title | ||||
Series Editor | Series Title | Abbreviated Series Title | ||||
Series Volume | Series Issue | Edition | ||||
ISSN | ISBN | 978-1-4614-7320-6 | Medium | |||
Area | Expedition | Conference | ||||
Notes | CIC; 600.074 | Approved | no | |||
Call Number | Admin @ si @ Par2014 | Serial | 2512 | |||
Permanent link to this record | ||||||
Author | C. Alejandro Parraga |
Title | Perceptual Psychophysics | Type | Book Chapter | |||
Year | 2015 | Publication | Biologically-Inspired Computer Vision: Fundamentals and Applications | Abbreviated Journal | ||
Volume | Issue | Pages | ||||
Keywords | ||||||
Abstract | ||||||
Address | ||||||
Corporate Author | Thesis | |||||
Publisher | Place of Publication | Editor | G.Cristobal; M.Keil; L.Perrinet |
Language | Summary Language | Original Title | ||||
Series Editor | Series Title | Abbreviated Series Title | ||||
Series Volume | Series Issue | Edition | ||||
ISSN | ISBN | 978-3-527-41264-8 | Medium | |||
Area | Expedition | Conference | ||||
Notes | CIC; 600.074 | Approved | no | |||
Call Number | Admin @ si @ Par2015 | Serial | 2600 | |||
Permanent link to this record |