Publicacions CVC -- Query Results

	2386–2400 of 3413 records found matching your query:

	Records
	Author	Javad Zolfaghari Bengar
	Title	Reducing Label Effort with Deep Active Learning			Type	Book Whole
	Year	2021	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Deep convolutional neural networks (CNNs) have achieved superior performance in many visual recognition applications, such as image classification, detection and segmentation. Training deep CNNs requires huge amounts of labeled data, which is expensive and labor intensive to collect. Active learning is a paradigm aimed at reducing the annotation effort by training the model on actively selected informative and/or representative samples. In this thesis we study several aspects of active learning including video object detection for autonomous driving systems, image classification on balanced and imbalanced datasets and the incorporation of self-supervised learning in active learning. We briefly describe our approach in each of these areas to reduce the labeling effort. In chapter two we introduce a novel active learning approach for object detection in videos by exploiting temporal coherence. Our criterion is based on the estimated number of errors in terms of false positives and false negatives. Additionally, we introduce a synthetic video dataset, called SYNTHIA-AL, specially designed to evaluate active learning for video object detection in road scenes. Finally, we show that our approach outperforms active learning baselines tested on two outdoor datasets. In the next chapter we address the well-known problem of overconfidence in neural networks. As an alternative to network confidence, we propose a new informativeness-based active learning method that captures the learning dynamics of the neural network with a metric called label-dispersion. This metric is low when the network consistently assigns the same label to the sample during the course of training and high when the assigned label changes frequently. We show that label-dispersion is a promising predictor of the uncertainty of the network, and show on two benchmark datasets that an active learning algorithm based on label-dispersion obtains excellent results.
In chapter four, we tackle the problem of sampling bias in active learning methods on imbalanced datasets. Active learning is generally studied on balanced datasets where an equal number of images per class is available. However, real-world datasets suffer from severely imbalanced classes, the so-called long-tail distribution. We argue that this further complicates the active learning process, since the imbalanced data pool can result in suboptimal classifiers. To address this problem in the context of active learning, we propose a general optimization framework that explicitly takes class-balancing into account. Results on three datasets show that the method is general (it can be combined with most existing active learning algorithms) and can be effectively applied to boost the performance of both informative and representative-based active learning methods. In addition, we show that our method generally results in a performance gain on balanced datasets as well. Another paradigm to reduce the annotation effort is self-training that learns from a large amount of unlabeled data in an unsupervised way and fine-tunes on few labeled samples. Recent advancements in self-training have achieved very impressive results rivaling supervised learning on some datasets. In the last chapter we focus on whether active learning and self-supervised learning can benefit from each other. We study object recognition datasets with several labeling budgets for the evaluations. Our experiments reveal that self-training is remarkably more efficient than active learning at reducing the labeling effort, that for a low labeling budget, active learning offers no benefit to self-training, and finally that the combination of active learning and self-training is fruitful when the labeling budget is high.
	Address	December 2021
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	IMPRIMA	Place of Publication		Editor	Joost Van de Weijer;Bogdan Raducanu
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-84-122714-9-2	Medium
	Area		Expedition		Conference
	Notes	LAMP;			Approved	no
	Call Number	Admin @ si @ Zol2021			Serial	3609

	Author	AN Ruchai; VI Kober; KA Dorofeev; VN Karnaukhov; Mikhail Mozerov
	Title	Classification of breast abnormalities using a deep convolutional neural network and transfer learning			Type	Journal Article
	Year	2021	Publication	Journal of Communications Technology and Electronics	Abbreviated Journal
	Volume	66	Issue	6	Pages	778–783
	Keywords
	Abstract	A new algorithm for classification of breast pathologies in digital mammography using a convolutional neural network and transfer learning is proposed. The following pretrained neural networks were chosen: MobileNetV2, InceptionResNetV2, Xception, and ResNetV2. All mammographic images were pre-processed to improve classification reliability. Transfer training was carried out using additional data augmentation and fine-tuning. The performance of the proposed algorithm for classification of breast pathologies in terms of accuracy on real data is discussed and compared with that of state-of-the-art algorithms on the available MIAS database.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	LAMP;			Approved	no
	Call Number	Admin @ si @ RKD2022			Serial	3680

	Author	Yaxing Wang; L. Zhang; Joost Van de Weijer
	Title	Ensembles of generative adversarial networks			Type	Conference Article
	Year	2016	Publication	30th Annual Conference on Neural Information Processing Systems Workshops	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Ensembles are a popular way to improve results of discriminative CNNs. The combination of several networks trained starting from different initializations improves results significantly. In this paper we investigate the usage of ensembles of GANs. The specific nature of GANs opens up several new ways to construct ensembles. The first one is based on the fact that in the minimax game which is played to optimize the GAN objective the generator network keeps on changing even after the network can be considered optimal. As such, ensembles of GANs can be constructed based on the same network initialization but just taking models which have different numbers of iterations. These so-called self ensembles are much faster to train than traditional ensembles. The second method, called cascade GANs, redirects part of the training data which is badly modeled by the first GAN to another GAN. In experiments on the CIFAR10 dataset we show that ensembles of GANs obtain model probability distributions which better model the data distribution. In addition, we show that these improved results can be obtained at little additional computational cost.
	Address	Barcelona; Spain; December 2016
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	NIPSW
	Notes	LAMP; 600.068			Approved	no
	Call Number	Admin @ si @ WZW2016			Serial	2905

	Author	Adria Ruiz; Joost Van de Weijer; Xavier Binefa
	Title	From emotions to action units with hidden and semi-hidden-task learning			Type	Conference Article
	Year	2015	Publication	16th IEEE International Conference on Computer Vision	Abbreviated Journal
	Volume		Issue		Pages	3703-3711
	Keywords
	Abstract	Limited annotated training data is a challenging problem in Action Unit recognition. In this paper, we investigate how the use of large databases labelled according to the 6 universal facial expressions can increase the generalization ability of Action Unit classifiers. For this purpose, we propose a novel learning framework: Hidden-Task Learning. HTL aims to learn a set of Hidden-Tasks (Action Units) for which samples are not available but, in contrast, training data is easier to obtain from a set of related Visible-Tasks (Facial Expressions). To that end, HTL is able to exploit prior knowledge about the relation between Hidden and Visible-Tasks. In our case, we base this prior knowledge on empirical psychological studies providing statistical correlations between Action Units and universal facial expressions. Additionally, we extend HTL to Semi-Hidden Task Learning (SHTL) assuming that Action Unit training samples are also provided. Performing exhaustive experiments over four different datasets, we show that HTL and SHTL improve the generalization ability of AU classifiers by training them with additional facial expression data. Additionally, we show that SHTL achieves competitive performance compared with state-of-the-art Transductive Learning approaches which face the problem of limited training data by using unlabelled test samples during training.
	Address	Santiago de Chile; Chile; December 2015
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICCV
	Notes	LAMP; 600.068; 600.079			Approved	no
	Call Number	Admin @ si @ RWB2015			Serial	2671

	Author	Fahad Shahbaz Khan; Joost Van de Weijer; Muhammad Anwer Rao; Andrew Bagdanov; Michael Felsberg; Jorma
	Title	Scale coding bag of deep features for human attribute and action recognition			Type	Journal Article
	Year	2018	Publication	Machine Vision and Applications	Abbreviated Journal	MVAP
	Volume	29	Issue	1	Pages	55-71
	Keywords	Action recognition; Attribute recognition; Bag of deep features
	Abstract	Most approaches to human attribute and action recognition in still images are based on image representation in which multi-scale local features are pooled across scale into a single, scale-invariant encoding. Both in bag-of-words and the recently popular representations based on convolutional neural networks, local features are computed at multiple scales. However, these multi-scale convolutional features are pooled into a single scale-invariant representation. We argue that entirely scale-invariant image representations are sub-optimal and investigate approaches to scale coding within a bag of deep features framework. Our approach encodes multi-scale information explicitly during the image encoding stage. We propose two strategies to encode multi-scale information explicitly in the final image representation. We validate our two scale coding techniques on five datasets: Willow, PASCAL VOC 2010, PASCAL VOC 2012, Stanford-40 and Human Attributes (HAT-27). On all datasets, the proposed scale coding approaches outperform both the scale-invariant method and the standard deep features of the same network. Further, combining our scale coding approaches with standard deep features leads to consistent improvement over the state of the art.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	LAMP; 600.068; 600.079; 600.106; 600.120			Approved	no
	Call Number	Admin @ si @ KWR2018			Serial	3107

	Author	Fahad Shahbaz Khan; Muhammad Anwer Rao; Joost Van de Weijer; Michael Felsberg; J. Laaksonen
	Title	Compact color texture description for texture classification			Type	Journal Article
	Year	2015	Publication	Pattern Recognition Letters	Abbreviated Journal	PRL
	Volume	51	Issue		Pages	16-22
	Keywords
	Abstract	Describing textures is a challenging problem in computer vision and pattern recognition. The classification problem involves assigning a category label to the texture class it belongs to. Several factors such as variations in scale, illumination and viewpoint make the problem of texture description extremely challenging. A variety of histogram based texture representations exist in the literature. However, combining multiple texture descriptors and assessing their complementarity is still an open research problem. In this paper, we first show that combining multiple local texture descriptors significantly improves the recognition performance compared to using a single best method alone. This gain in performance is achieved at the cost of high-dimensional final image representation. To counter this problem, we propose to use an information-theoretic compression technique to obtain a compact texture description without any significant loss in accuracy. In addition, we perform a comprehensive evaluation of pure color descriptors, popular in object recognition, for the problem of texture classification. Experiments are performed on four challenging texture datasets namely, KTH-TIPS-2a, KTH-TIPS-2b, FMD and Texture-10. The experiments clearly demonstrate that our proposed compact multi-texture approach outperforms the single best texture method alone. In all cases, discriminative color names outperform other color features for texture classification. Finally, we show that combining discriminative color names with compact texture representation outperforms state-of-the-art methods by 7.8%, 4.3% and 5.0% on KTH-TIPS-2a, KTH-TIPS-2b and Texture-10 datasets respectively.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	LAMP; 600.068; 600.079; ADAS			Approved	no
	Call Number	Admin @ si @ KRW2015a			Serial	2587

	Author	Fahad Shahbaz Khan; Muhammad Anwer Rao; Joost Van de Weijer; Michael Felsberg; J. Laaksonen
	Title	Deep semantic pyramids for human attributes and action recognition			Type	Conference Article
	Year	2015	Publication	Image Analysis, Proceedings of 19th Scandinavian Conference, SCIA 2015	Abbreviated Journal
	Volume	9127	Issue		Pages	341-353
	Keywords	Action recognition; Human attributes; Semantic pyramids
	Abstract	Describing persons and their actions is a challenging problem due to variations in pose, scale and viewpoint in real-world images. Recently, the semantic pyramids approach [1] for pose normalization has been shown to provide excellent results for gender and action recognition. The performance of the semantic pyramids approach relies on robust image description and is therefore limited due to the use of shallow local features. In the context of object recognition [2] and object detection [3], convolutional neural networks (CNNs) or deep features have been shown to improve the performance over the conventional shallow features. We propose deep semantic pyramids for human attributes and action recognition. The method works by constructing spatial pyramids based on CNNs of different part locations. These pyramids are then combined to obtain a single semantic representation. We validate our approach on the Berkeley and 27 Human Attributes datasets for attributes classification. For action recognition, we perform experiments on two challenging datasets: Willow and PASCAL VOC 2010. The proposed deep semantic pyramids provide a significant gain of 17.2%, 13.9%, 24.3% and 22.6% compared to the standard shallow semantic pyramids on Berkeley, 27 Human Attributes, Willow and PASCAL VOC 2010 datasets respectively. Our results also show that deep semantic pyramids outperform conventional CNNs based on the full bounding box of the person. Finally, we compare our approach with state-of-the-art methods and show a gain in performance compared to the best methods in the literature.
	Address	Copenhagen; Denmark; June 2015
	Corporate Author				Thesis
	Publisher	Springer International Publishing	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	0302-9743	ISBN	978-3-319-19664-0	Medium
	Area		Expedition		Conference	SCIA
	Notes	LAMP; 600.068; 600.079; ADAS			Approved	no
	Call Number	Admin @ si @ KRW2015b			Serial	2672

	Author	Muhammad Anwer Rao; Fahad Shahbaz Khan; Joost Van de Weijer; Jorma Laaksonen
	Title	Combining Holistic and Part-based Deep Representations for Computational Painting Categorization			Type	Conference Article
	Year	2016	Publication	6th International Conference on Multimedia Retrieval	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Automatic analysis of visual art, such as paintings, is a challenging inter-disciplinary research problem. Conventional approaches only rely on global scene characteristics by encoding holistic information for computational painting categorization. We argue that such approaches are sub-optimal and that discriminative common visual structures provide complementary information for painting classification. We present an approach that encodes both the global scene layout and discriminative latent common structures for computational painting categorization. The regions of interest are automatically extracted, without any manual part labeling, by training class-specific deformable part-based models. Both the holistic image and the regions of interest are then described using multi-scale dense convolutional features. These features are pooled separately using Fisher vector encoding and concatenated afterwards in a single image representation. Experiments are performed on a challenging dataset with 91 different painters and 13 diverse painting styles. Our approach outperforms the standard method, which only employs the global scene characteristics. Furthermore, our method achieves state-of-the-art results outperforming a recent multi-scale deep features based approach [11] by 6.4% and 3.8% respectively on artist and style classification.
	Address	New York; USA; June 2016
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICMR
	Notes	LAMP; 600.068; 600.079; ADAS			Approved	no
	Call Number	Admin @ si @ RKW2016			Serial	2763

	Author	Marc Masana; Joost Van de Weijer; Andrew Bagdanov
	Title	On-the-fly Network pruning for object detection			Type	Conference Article
	Year	2016	Publication	International conference on learning representations	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Object detection with deep neural networks is often performed by passing a few thousand candidate bounding boxes through a deep neural network for each image. These bounding boxes are highly correlated since they originate from the same image. In this paper we investigate how to exploit feature occurrence at the image scale to prune the neural network which is subsequently applied to all bounding boxes. We show that removing units which have near-zero activation in the image allows us to significantly reduce the number of parameters in the network. Results on the PASCAL 2007 Object Detection Challenge demonstrate that up to 40% of units in some fully-connected layers can be entirely eliminated with little change in the detection result.
	Address	Puerto Rico; May 2016
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICLR
	Notes	LAMP; 600.068; 600.106; 600.079			Approved	no
	Call Number	Admin @ si @ MWB2016			Serial	2758

	Author	Laura Lopez-Fuentes; Andrew Bagdanov; Joost Van de Weijer; Harald Skinnemoen
	Title	Bandwidth Limited Object Recognition in High Resolution Imagery			Type	Conference Article
	Year	2017	Publication	IEEE Winter conference on Applications of Computer Vision	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	This paper proposes a novel method to optimize bandwidth usage for object detection in critical communication scenarios. We develop two operating models of active information seeking. The first model identifies promising regions in low resolution imagery and progressively requests higher resolution regions on which to perform recognition of higher semantic quality. The second model identifies promising regions in low resolution imagery while simultaneously predicting the approximate location of the object of higher semantic quality. From this general framework, we develop a car recognition system via identification of its license plate and evaluate the performance of both models on a car dataset that we introduce. Results are compared with traditional JPEG compression and demonstrate that our system saves up to one order of magnitude of bandwidth while sacrificing little in terms of recognition performance.
	Address	Santa Rosa; CA; USA; March 2017
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	WACV
	Notes	LAMP; 600.068; 600.109; 600.084; 600.106; 600.079; 600.120			Approved	no
	Call Number	Admin @ si @ LBW2017			Serial	2973

	Author	Laura Lopez-Fuentes; Joost Van de Weijer; Manuel Gonzalez-Hidalgo; Harald Skinnemoen; Andrew Bagdanov
	Title	Review on computer vision techniques in emergency situations			Type	Journal Article
	Year	2018	Publication	Multimedia Tools and Applications	Abbreviated Journal	MTAP
	Volume	77	Issue	13	Pages	17069–17107
	Keywords	Emergency management; Computer vision; Decision makers; Situational awareness; Critical situation
	Abstract	In emergency situations, actions that save lives and limit the impact of hazards are crucial. In order to act, situational awareness is needed to decide what to do. Geolocalized photos and video of the situations as they evolve can be crucial in better understanding them and making decisions faster. Cameras are almost everywhere these days, either in terms of smartphones, installed CCTV cameras, UAVs or others. However, this poses challenges in big data and information overflow. Moreover, most of the time there are no disasters at any given location, so humans aiming to detect sudden situations may not be as alert as needed at any point in time. Consequently, computer vision tools can be an excellent decision support. The number of emergencies where computer vision tools have been considered or used is very wide, and there is a great overlap across related emergency research. Researchers tend to focus on state-of-the-art systems that cover the same emergency as they are studying, obviating important research in other fields. In order to unveil this overlap, the survey is divided along four main axes: the types of emergencies that have been studied in computer vision, the objective that the algorithms can address, the type of hardware needed and the algorithms used. Therefore, this review provides a broad overview of the progress of computer vision covering all sorts of emergencies.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	LAMP; 600.068; 600.120			Approved	no
	Call Number	Admin @ si @ LWG2018			Serial	3041

	Author	Claudio Baecchi; Francesco Turchini; Lorenzo Seidenari; Andrew Bagdanov; Alberto del Bimbo
	Title	Fisher vectors over random density forest for object recognition			Type	Conference Article
	Year	2014	Publication	22nd International Conference on Pattern Recognition	Abbreviated Journal
	Volume		Issue		Pages	4328-4333
	Keywords
	Abstract
	Address	Stockholm; Sweden; August 2014
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICPR
	Notes	LAMP; 600.079			Approved	no
	Call Number	Admin @ si @ BTS2014			Serial	2518

	Author	Federico Bartoli; Giuseppe Lisanti; Svebor Karaman; Andrew Bagdanov; Alberto del Bimbo
	Title	Unsupervised scene adaptation for faster multi-scale pedestrian detection			Type	Conference Article
	Year	2014	Publication	22nd International Conference on Pattern Recognition	Abbreviated Journal
	Volume		Issue		Pages	3534-3539
	Keywords
	Abstract
	Address	Stockholm; Sweden; August 2014
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICPR
	Notes	LAMP; 600.079			Approved	no
	Call Number	Admin @ si @ BLK2014			Serial	2519

	Author	Svebor Karaman; Giuseppe Lisanti; Andrew Bagdanov; Alberto del Bimbo
	Title	From re-identification to identity inference: Labeling consistency by local similarity constraints			Type	Book Chapter
	Year	2014	Publication	Person Re-Identification	Abbreviated Journal
	Volume	2	Issue		Pages	287-307
	Keywords	re-identification; Identity inference; Conditional random fields; Video surveillance
	Abstract	In this chapter, we introduce the problem of identity inference as a generalization of person re-identification. It is most appropriate to distinguish identity inference from re-identification in situations where a large number of observations must be identified without knowing a priori that groups of test images represent the same individual. The standard single- and multishot person re-identification common in the literature are special cases of our formulation. We present an approach to solving identity inference by modeling it as a labeling problem in a Conditional Random Field (CRF). The CRF model ensures that the final labeling gives similar labels to detections that are similar in feature space. Experimental results are given on the ETHZ, i-LIDS and CAVIAR datasets. Our approach yields state-of-the-art performance for multishot re-identification, and our results on the more general identity inference problem demonstrate that we are able to infer the identity of very many examples even with very few labeled images in the gallery.
	Address
	Corporate Author				Thesis
	Publisher	Springer London	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	2191-6586	ISBN	978-1-4471-6295-7	Medium
	Area		Expedition		Conference
	Notes	LAMP; 600.079			Approved	no
	Call Number	Admin @ si @ KLB2014b			Serial	2521

	Author	Lorenzo Seidenari; Giuseppe Serra; Andrew Bagdanov; Alberto del Bimbo
	Title	Local pyramidal descriptors for image recognition			Type	Journal Article
	Year	2014	Publication	IEEE Transactions on Pattern Analysis and Machine Intelligence	Abbreviated Journal	TPAMI
	Volume	36	Issue	5	Pages	1033-1040
	Keywords	Object categorization; local features; kernel methods
	Abstract	In this paper we present a novel method to improve the flexibility of descriptor matching for image recognition by using local multiresolution pyramids in feature space. We propose that image patches be represented at multiple levels of descriptor detail and that these levels be defined in terms of local spatial pooling resolution. Preserving multiple levels of detail in local descriptors is a way of hedging one’s bets on which levels will be most relevant for matching during learning and recognition. We introduce the Pyramid SIFT (P-SIFT) descriptor and show that its use in four state-of-the-art image recognition pipelines improves accuracy and yields state-of-the-art results. Our technique is applicable independently of spatial pyramid matching and we show that spatial pyramids can be combined with local pyramids to obtain further improvement. We achieve state-of-the-art results on Caltech-101 (80.1%) and Caltech-256 (52.6%) when compared to other approaches based on SIFT features over intensity images. Our technique is efficient and is extremely easy to integrate into image recognition pipelines.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	0162-8828	ISBN		Medium
	Area		Expedition		Conference
	Notes	LAMP; 600.079			Approved	no
	Call Number	Admin @ si @ SSB2014			Serial	2524