Records | |||||
---|---|---|---|---|---|
Author | Patricia Marquez; Debora Gil; Aura Hernandez-Sabate | ||||
Title | Evaluation of the Capabilities of Confidence Measures for Assessing Optical Flow Quality | Type | Conference Article | ||
Year | 2013 | Publication | ICCV Workshop on Computer Vision in Vehicle Technology: From Earth to Mars | Abbreviated Journal | |
Volume | Issue | Pages | 624-631 | ||
Keywords | |||||
Abstract | Assessing Optical Flow (OF) quality is essential for its further use in reliable decision support systems. The absence of ground truth in such situations leads to the computation of OF Confidence Measures (CM) obtained from either input or output data. A fair comparison across the capabilities of the different CM for bounding OF error is required in order to choose the best OF-CM pair for discarding points where OF computation is not reliable. This paper presents a statistical probabilistic framework for assessing the quality of a given CM. Our quality measure is given in terms of the percentage of pixels whose OF error bound cannot be determined by CM values. We also provide statistical tools for the computation of CM values that ensure a given accuracy of the flow field. | ||||
Address | Sydney; Australia; December 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVTT:E2M | |
Notes | IAM; ADAS; 600.044; 600.057; 601.145 | Approved | no | ||
Call Number | Admin @ si @ MGH2013b | Serial | 2351 | ||
Permanent link to this record | |||||
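The record above defines its quality measure as the percentage of pixels whose OF error bound cannot be determined from CM values. The sketch below is a minimal, assumed interpretation of that idea (not the paper's statistical framework): pixels are ranked by confidence and we report the fraction that cannot be kept under a chosen end-point-error bound. The arrays `cm` and `epe` are hypothetical stand-ins for per-pixel confidence and ground-truth error.

```python
"""Minimal sketch (assumed formulation): relate a per-pixel optical-flow
confidence measure (CM) to an end-point-error (EPE) bound."""
import numpy as np

def unbounded_pixel_fraction(cm, epe, error_bound):
    """Fraction of pixels for which no CM threshold guarantees epe <= error_bound.

    Pixels are sorted by decreasing confidence; we keep the largest prefix
    whose errors all stay below the bound and report the remaining fraction.
    """
    order = np.argsort(-cm)                  # most confident first
    sorted_epe = epe[order]
    within = sorted_epe <= error_bound
    bad = np.argmax(~within) if (~within).any() else len(within)
    return 1.0 - bad / len(within)

# Toy example with synthetic data (illustration only).
rng = np.random.default_rng(0)
cm = rng.uniform(0, 1, 10000)
epe = (1.0 - cm) * rng.exponential(1.0, 10000)   # noisy inverse relation
print(unbounded_pixel_fraction(cm, epe, error_bound=1.0))
```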
Author | Fadi Dornaika; Bogdan Raducanu | ||||
Title | Single Snapshot 3D Head Pose Initialization for Tracking in Human Robot Interaction Scenario | Type | Conference Article | ||
Year | 2010 | Publication | 1st International Workshop on Computer Vision for Human-Robot Interaction | Abbreviated Journal | |
Volume | Issue | Pages | 32–39 | ||
Keywords | 1st International Workshop on Computer Vision for Human-Robot Interaction, in conjunction with IEEE CVPR 2010 | ||||
Abstract | This paper presents an automatic 3D head pose initialization scheme for a real-time face tracker with application to human-robot interaction. It has two main contributions. First, we propose an automatic 3D head pose and person-specific face shape estimation, based on a 3D deformable model. The proposed approach serves to initialize our real-time 3D face tracker. What makes this contribution very attractive is that the initialization step can cope with faces under arbitrary pose, so it is not limited only to near-frontal views. Second, the previous framework is used to develop an application in which the orientation of an AIBO's camera can be controlled through the imitation of the user's head pose. In our scenario, this application is used to build panoramic images from overlapping snapshots. Experiments on real videos confirm the robustness and usefulness of the proposed methods. | ||||
Address | San Francisco; CA; USA; June 2010 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 2160-7508 | ISBN | 978-1-4244-7029-7 | Medium | |
Area | Expedition | Conference | CVPRW | |
Notes | OR;MV | Approved | no | ||
Call Number | BCNPCL @ bcnpcl @ DoR2010a | Serial | 1309 | ||
Permanent link to this record | |||||
Author | Andreas Møgelmose; Chris Bahnsen; Thomas B. Moeslund; Albert Clapes; Sergio Escalera | ||||
Title | Tri-modal Person Re-identification with RGB, Depth and Thermal Features | Type | Conference Article | ||
Year | 2013 | Publication | 9th IEEE Workshop on Perception Beyond the Visible Spectrum, Computer Vision and Pattern Recognition | Abbreviated Journal |
Volume | Issue | Pages | 301-307 | ||
Keywords | |||||
Abstract | Person re-identification is about recognizing people who have passed by a sensor earlier. Previous work is mainly based on RGB data, but in this work we present, for the first time, a system that combines RGB, depth, and thermal data for re-identification purposes. First, from each of the three modalities, we obtain some particular features: from RGB data, we model color information from different regions of the body; from depth data, we compute different soft body biometrics; and from thermal data, we extract local structural information. Then, the three information types are combined in a joint classifier. The tri-modal system is evaluated on a new RGB-D-T dataset, showing successful results in re-identification scenarios. | ||||
Address | Portland; Oregon; June 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-0-7695-4990-3 | Medium | ||
Area | Expedition | Conference | CVPRW | |
Notes | HUPBA;MILAB | Approved | no | ||
Call Number | Admin @ si @ MBM2013 | Serial | 2253 | ||
Permanent link to this record | |||||
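As a rough illustration of the record above, the sketch combines per-modality descriptors into one joint classifier by simple early fusion. The feature dimensions and the logistic-regression classifier are assumptions for illustration; the paper's actual colour, soft-biometric and thermal-structure features are not reproduced here.

```python
"""Minimal sketch of a tri-modal joint classifier (assumed setup)."""
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

def fuse(rgb_feat, depth_feat, thermal_feat):
    # simple early fusion: one joint descriptor per observation
    return np.concatenate([rgb_feat, depth_feat, thermal_feat], axis=1)

# Hypothetical training data: N observations of known identities.
rng = np.random.default_rng(0)
N, ids = 200, 10
X = fuse(rng.normal(size=(N, 48)),    # e.g. RGB colour histograms
         rng.normal(size=(N, 8)),     # e.g. depth soft biometrics
         rng.normal(size=(N, 32)))    # e.g. thermal structure descriptors
y = rng.integers(0, ids, N)

clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X, y)
print(clf.predict(X[:5]))             # re-identify new observations the same way
```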
Author | Fadi Dornaika; Bogdan Raducanu | ||||
Title | Out-of-Sample Embedding for Manifold Learning Applied to Face Recognition | Type | Conference Article | ||
Year | 2013 | Publication | IEEE International Workshop on Analysis and Modeling of Faces and Gestures | Abbreviated Journal | |
Volume | Issue | Pages | 862-868 | ||
Keywords | |||||
Abstract | Manifold learning techniques are affected by two critical aspects: (i) the design of the adjacency graphs, and (ii) the embedding of new test data---the out-of-sample problem. For the first aspect, the proposed schemes were heuristically driven. For the second aspect, the difficulty resides in finding an accurate mapping that transfers unseen data samples into an existing manifold. Past works addressing these two aspects were heavily parametric in the sense that optimal performance is only reached for a suitable parameter choice that should be known in advance. In this paper, we demonstrate that sparse coding theory not only serves for automatic graph reconstruction, as shown in recent works, but also represents an accurate alternative for out-of-sample embedding. Taking Laplacian Eigenmaps as a case study, we applied our method to the face recognition problem. To evaluate the effectiveness of the proposed out-of-sample embedding, experiments are conducted using the k-nearest neighbor (KNN) and Kernel Support Vector Machine (KSVM) classifiers on four public face databases. The experimental results show that the proposed model is able to achieve high categorization effectiveness as well as high consistency with non-linear embeddings/manifolds obtained in batch mode. | ||||
Address | Portland; USA; June 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVPRW | |
Notes | OR; 600.046;MV | Approved | no | ||
Call Number | Admin @ si @ DoR2013 | Serial | 2236 | ||
Permanent link to this record | |||||
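A minimal sketch of the sparse-coding idea described in the record above, under the common formulation: an unseen sample is reconstructed as a sparse combination of training samples, and the same weights are applied to the training samples' manifold coordinates. `SpectralEmbedding` stands in for Laplacian Eigenmaps, and the Lasso-based coder and its `alpha` are illustrative choices, not the authors' exact solver.

```python
"""Minimal sketch of sparse-coding-based out-of-sample embedding."""
import numpy as np
from sklearn.manifold import SpectralEmbedding
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X_train = rng.normal(size=(300, 50))                               # hypothetical face features
Y_train = SpectralEmbedding(n_components=5).fit_transform(X_train) # batch embedding

def embed_out_of_sample(x, X_train, Y_train, alpha=0.01):
    """Embed one unseen sample without recomputing the manifold."""
    coder = Lasso(alpha=alpha, positive=True, max_iter=5000)
    coder.fit(X_train.T, x)            # x ~= X_train.T @ w, with sparse w
    w = coder.coef_
    if w.sum() > 0:
        w = w / w.sum()                # normalise the sparse weights
    return Y_train.T @ w               # same combination in embedding space

x_new = rng.normal(size=50)
print(embed_out_of_sample(x_new, X_train, Y_train))
# A KNN or kernel-SVM classifier can then operate in this embedding space.
```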
Author | David Vazquez; Jiaolong Xu; Sebastian Ramos; Antonio Lopez; Daniel Ponsa | ||||
Title | Weakly Supervised Automatic Annotation of Pedestrian Bounding Boxes | Type | Conference Article | ||
Year | 2013 | Publication | CVPR Workshop on Ground Truth – What is a good dataset? | Abbreviated Journal | |
Volume | Issue | Pages | 706 - 711 | ||
Keywords | Pedestrian Detection; Domain Adaptation | ||||
Abstract | Among the components of a pedestrian detector, its trained pedestrian classifier is crucial for achieving the desired performance. The initial task of the training process consists in collecting samples of pedestrians and background, which involves tiresome manual annotation of pedestrian bounding boxes (BBs). Thus, recent works have assessed the use of automatically collected samples from photo-realistic virtual worlds. However, learning from virtual-world samples and testing in real-world images may suffer from the dataset shift problem. Accordingly, in this paper we assess a strategy to collect samples from the real world and retrain with them, thus avoiding the dataset shift, but in such a way that no BBs of real-world pedestrians have to be provided. In particular, we train a pedestrian classifier based on virtual-world samples (no human annotation required). Then, using such a classifier, we collect pedestrian samples from real-world images by detection. Afterwards, a human oracle efficiently rejects the false detections (weak annotation). Finally, a new classifier is trained with the accepted detections. We show that this classifier is competitive with respect to the counterpart trained with samples collected by manually annotating hundreds of pedestrian BBs. | ||||
Address | Portland; Oregon; June 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | IEEE | Place of Publication | Editor | ||
Language | English | Summary Language | English | Original Title | |
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVPRW | |
Notes | ADAS; 600.054; 600.057; 601.217 | Approved | no | ||
Call Number | ADAS @ adas @ VXR2013a | Serial | 2219 | ||
Permanent link to this record | |||||
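The record above describes a train-detect-verify-retrain loop. Below is a schematic sketch of that loop with entirely hypothetical features and a placeholder oracle; it only shows the control flow, not the authors' detector or features.

```python
"""Schematic sketch of the weak-annotation loop (all data and the oracle are placeholders)."""
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

# 1) Train an initial pedestrian classifier on virtual-world samples
#    (features and labels come "for free" from the virtual world).
X_virtual = rng.normal(size=(500, 64))
y_virtual = rng.integers(0, 2, 500)
clf = LinearSVC().fit(X_virtual, y_virtual)

# 2) Run the classifier on real-world windows to collect candidate detections.
X_real_windows = rng.normal(size=(2000, 64))           # hypothetical HOG-like features
candidates = X_real_windows[clf.decision_function(X_real_windows) > 0]

# 3) A human oracle only accepts or rejects detections (weak annotation):
#    no bounding boxes are drawn by hand.
def oracle_accepts(window_features):
    return rng.random() > 0.3                           # placeholder for a human decision

accepted = np.array([w for w in candidates if oracle_accepts(w)])

# 4) Retrain with the accepted real-world positives plus virtual negatives.
if len(accepted):
    X_new = np.vstack([accepted, X_virtual[y_virtual == 0]])
    y_new = np.concatenate([np.ones(len(accepted)), np.zeros((y_virtual == 0).sum())])
    clf = LinearSVC().fit(X_new, y_new)
```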
Author | Jiaolong Xu; David Vazquez; Sebastian Ramos; Antonio Lopez; Daniel Ponsa | ||||
Title | Adapting a Pedestrian Detector by Boosting LDA Exemplar Classifiers | Type | Conference Article | ||
Year | 2013 | Publication | CVPR Workshop on Ground Truth – What is a good dataset? | Abbreviated Journal | |
Volume | Issue | Pages | 688 - 693 | ||
Keywords | Pedestrian Detection; Domain Adaptation | ||||
Abstract | Training vision-based pedestrian detectors using synthetic datasets (virtual world) is a useful technique to automatically collect training examples together with their pixel-wise ground truth. However, as is often the case, these detectors must operate in real-world images, where they experience a significant drop in performance. In fact, this effect also occurs among different real-world datasets, i.e. detectors' accuracy drops when the training data (source domain) and the application scenario (target domain) have inherent differences. Therefore, in order to avoid this problem, the detector trained with synthetic data must be adapted to operate in the real-world scenario. In this paper, we propose a domain adaptation approach based on boosting LDA exemplar classifiers from both virtual and real worlds. We evaluate our proposal on multiple real-world pedestrian detection datasets. The results show that our method can efficiently adapt the exemplar classifiers from virtual to real world, avoiding drops in average precision of more than 15%. | ||||
Address | Portland; Oregon; June 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | English | Summary Language | English | Original Title | |
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVPRW | |
Notes | ADAS; 600.054; 600.057; 601.217 | Approved | yes | ||
Call Number | XVR2013; ADAS @ adas @ xvr2013a | Serial | 2220 | ||
Permanent link to this record | |||||
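The record above boosts LDA exemplar classifiers. The sketch shows the standard exemplar-LDA construction such classifiers are usually built from, w = Σ⁻¹(x_pos − μ_bg) using background statistics; the boosting stage itself is only hinted at by a max aggregation, and the data are synthetic placeholders.

```python
"""Minimal sketch of LDA exemplar classifiers (standard exemplar-LDA form)."""
import numpy as np

rng = np.random.default_rng(0)
negatives = rng.normal(size=(5000, 64))          # hypothetical background features
positives = rng.normal(loc=0.5, size=(20, 64))   # hypothetical pedestrian exemplars

mu_bg = negatives.mean(axis=0)
sigma = np.cov(negatives, rowvar=False) + 1e-3 * np.eye(64)   # regularised covariance
sigma_inv = np.linalg.inv(sigma)

# One cheap linear classifier per exemplar: no hard-negative mining needed.
exemplar_ws = (positives - mu_bg) @ sigma_inv    # shape (20, 64)

def score(x):
    """Max response over exemplar classifiers (a simple aggregation; the paper
    instead learns a boosted combination of the exemplar responses)."""
    return np.max(exemplar_ws @ x)

print(score(rng.normal(loc=0.5, size=64)))
```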
Author | Carlo Gatta; Adriana Romero; Joost Van de Weijer | ||||
Title | Unrolling loopy top-down semantic feedback in convolutional deep networks | Type | Conference Article | ||
Year | 2014 | Publication | Workshop on Deep Vision: Deep Learning for Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | 498-505 | ||
Keywords | |||||
Abstract | In this paper, we propose a novel way to perform top-down semantic feedback in convolutional deep networks for efficient and accurate image parsing. We also show how to add global appearance/semantic features, which have been shown to improve image parsing performance in state-of-the-art methods but were not present in previous convolutional approaches. The proposed method is characterised by efficient training and sufficiently fast testing. We use the well-known SIFTflow dataset to numerically show the advantages provided by our contributions, and to compare with state-of-the-art convolutional image parsing approaches. | ||||
Address | Columbus; Ohio; June 2014 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVPRW | |
Notes | LAMP; MILAB; 601.160; 600.079 | Approved | no | ||
Call Number | Admin @ si @ GRW2014 | Serial | 2490 | ||
Permanent link to this record | |||||
Author | Bogdan Raducanu; Alireza Bosaghzadeh; Fadi Dornaika | ||||
Title | Multi-observation Face Recognition in Videos based on Label Propagation | Type | Conference Article | ||
Year | 2015 | Publication | 6th Workshop on Analysis and Modeling of Faces and Gestures (AMFG 2015) | Abbreviated Journal |
Volume | Issue | Pages | 10-17 | ||
Keywords | |||||
Abstract | In order to deal with the huge amount of content generated by social media, especially for indexing and retrieval purposes, the focus has shifted from single-object recognition to multi-observation object recognition. Of particular interest is the problem of face recognition (used as the primary cue for persons' identity assessment), since it is highly required by popular social media search engines like Facebook and YouTube. Recently, several approaches for graph-based label propagation were proposed. However, the associated graphs were constructed in an ad-hoc manner (e.g., using the KNN graph) that cannot cope properly with the rapid and frequent changes in data appearance, a phenomenon intrinsically related to video sequences. In this paper, we propose a novel approach for efficient and adaptive graph construction, based on a two-phase scheme: (i) the first phase is used to adaptively find the neighbors of a sample and also to find the adequate weights for the minimization function of the second phase; (ii) in the second phase, the selected neighbors along with their corresponding weights are used to locally and collaboratively estimate the sparse affinity matrix weights. Experimental results performed on the Honda Video Database (HVDB) and a subset of video sequences extracted from the popular TV series 'Friends' show a distinct advantage of the proposed method over the existing standard graph construction methods. | ||||
Address | Boston; USA; June 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVPRW | |
Notes | OR; 600.068; 600.072;MV | Approved | no | ||
Call Number | Admin @ si @ RBD2015 | Serial | 2627 | ||
Permanent link to this record | |||||
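For context on the record above: given any affinity matrix W, graph-based label propagation can be run with the standard closed form F = (I − αS)⁻¹Y. The paper's contribution is the adaptive, sparse-coding-based construction of W; the sketch below only shows the propagation step on a toy graph.

```python
"""Minimal sketch of standard graph-based label propagation (Zhou et al. closed form)."""
import numpy as np

def propagate_labels(W, Y, alpha=0.9):
    """W: (n,n) symmetric non-negative affinities; Y: (n,c) one-hot labels
    for labelled rows, zeros for unlabelled rows."""
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    S = D_inv_sqrt @ W @ D_inv_sqrt            # normalised affinity
    n = W.shape[0]
    F = np.linalg.solve(np.eye(n) - alpha * S, Y)   # F = (I - alpha*S)^-1 Y
    return F.argmax(axis=1)

# Toy example: 6 face samples, 2 identities, 2 labelled samples.
W = np.array([[0, 1, 1, 0, 0, 0],
              [1, 0, 1, 0, 0, 0],
              [1, 1, 0, 0, 0, 0],
              [0, 0, 0, 0, 1, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 0, 0, 1, 1, 0]], float)
Y = np.zeros((6, 2)); Y[0, 0] = 1; Y[3, 1] = 1
print(propagate_labels(W, Y))                  # -> [0 0 0 1 1 1]
```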
Author | Santiago Segui; Oriol Pujol; Jordi Vitria | ||||
Title | Learning to count with deep object features | Type | Conference Article | ||
Year | 2015 | Publication | Deep Vision: Deep Learning in Computer Vision, CVPR 2015 Workshop | Abbreviated Journal | |
Volume | Issue | Pages | 90-96 | ||
Keywords | |||||
Abstract | Learning to count is a learning strategy that has been recently proposed in the literature for dealing with problems where estimating the number of object instances in a scene is the final objective. In this framework, the task of learning to detect and localize individual object instances is seen as a harder task that can be evaded by casting the problem as that of computing a regression value from hand-crafted image features. In this paper we explore the features that are learned when training a counting convolutional neural network in order to understand their underlying representation. To this end we define a counting problem for MNIST data and show that the internal representation of the network is able to classify digits in spite of the fact that no direct supervision was provided for them during training. We also present preliminary results about a deep network that is able to count the number of pedestrians in a scene. | ||||
Address | Boston; USA; June 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVPRW | |
Notes | MILAB; HuPBA; OR;MV | Approved | no | ||
Call Number | Admin @ si @ SPV2015 | Serial | 2636 | ||
Permanent link to this record | |||||
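A minimal sketch of a counting network in the spirit of the record above: a small CNN regresses the number of instances in an image with no per-instance supervision. The architecture and the synthetic 64x64 inputs are assumptions for illustration, not the authors' model or their MNIST collage setup.

```python
"""Minimal sketch of a count-regression CNN (assumed architecture)."""
import torch
import torch.nn as nn

class CountingCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(32, 1))   # scalar count

    def forward(self, x):
        return self.head(self.features(x)).squeeze(1)

# Toy training step on synthetic 64x64 images with known counts.
model, loss_fn = CountingCNN(), nn.MSELoss()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
images = torch.randn(8, 1, 64, 64)           # placeholder for MNIST-digit collages
counts = torch.randint(0, 5, (8,)).float()   # ground-truth number of digits
loss = loss_fn(model(images), counts)
opt.zero_grad(); loss.backward(); opt.step()
print(float(loss))
```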
Author | Xavier Baro; Jordi Gonzalez; Junior Fabian; Miguel Angel Bautista; Marc Oliu; Hugo Jair Escalante; Isabelle Guyon; Sergio Escalera | ||||
Title | ChaLearn Looking at People 2015 challenges: action spotting and cultural event recognition | Type | Conference Article | ||
Year | 2015 | Publication | 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) | Abbreviated Journal |
Volume | Issue | Pages | 1-9 | ||
Keywords | |||||
Abstract | Following previous series on Looking at People (LAP) challenges [6, 5, 4], ChaLearn ran two competitions to be presented at CVPR 2015: action/interaction spotting and cultural event recognition in RGB data. We ran a second round on human activity recognition on RGB data sequences. In terms of cultural event recognition, tens of categories have to be recognized. This involves scene understanding and human analysis. This paper summarizes the two performed challenges and obtained results. Details of the ChaLearn LAP competitions can be found at http://gesture.chalearn.org/. | ||||
Address | Boston; USA; June 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVPRW | |
Notes | HuPBA;MV | Approved | no | ||
Call Number | Serial | 2652 | |||
Permanent link to this record | |||||
Author | Andres Traumann; Sergio Escalera; Gholamreza Anbarjafari | ||||
Title | A New Retexturing Method for Virtual Fitting Room Using Kinect 2 Camera | Type | Conference Article | ||
Year | 2015 | Publication | 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) | Abbreviated Journal |
Volume | Issue | Pages | 75-79 | ||
Keywords | |||||
Abstract | |||||
Address | Boston; USA; June 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVPRW | |
Notes | HuPBA;MILAB | Approved | no | ||
Call Number | Admin @ si @ TEA2015 | Serial | 2653 | ||
Permanent link to this record | |||||
Author | Ramin Irani; Kamal Nasrollahi; Chris Bahnsen; D.H. Lundtoft; Thomas B. Moeslund; Marc O. Simon; Ciprian Corneanu; Sergio Escalera; Tanja L. Pedersen; Maria-Louise Klitgaard; Laura Petrini | ||||
Title | Spatio-temporal Analysis of RGB-D-T Facial Images for Multimodal Pain Level Recognition | Type | Conference Article | ||
Year | 2015 | Publication | 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) | Abbreviated Journal |
Volume | Issue | Pages | 88-95 | ||
Keywords | |||||
Abstract | Pain is a vital sign of human health and its automatic detection can be of crucial importance in many different contexts, including medical scenarios. While most available computer vision techniques are based on RGB, in this paper we investigate the effect of combining RGB, depth, and thermal facial images for pain detection and pain intensity level recognition. For this purpose, we extract energies released by facial pixels using a spatiotemporal filter. Experiments on a group of 12 elderly people applying the multimodal approach show that the proposed method successfully detects pain and distinguishes between three intensity levels in 82% of the analyzed frames, improving by more than 6% over RGB-only analysis in similar conditions. | ||||
Address | Boston; USA; June 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVPRW | |
Notes | HuPBA;MILAB | Approved | no | ||
Call Number | Admin @ si @ INB2015 | Serial | 2654 | ||
Permanent link to this record | |||||
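The record above speaks of "energies released by facial pixels" obtained with a spatio-temporal filter. The sketch below is one assumed reading of that idea: band-pass filter per-pixel intensity time series over a facial region and accumulate the squared response per frame; the multimodal version would repeat this for the RGB, depth and thermal streams and fuse the scores. The sampling rate, band and data are placeholders.

```python
"""Sketch of a per-frame facial energy score via temporal band-pass filtering."""
import numpy as np
from scipy.signal import butter, filtfilt

def facial_energy(pixel_series, fs=30.0, band=(0.3, 5.0)):
    """pixel_series: (T, P) intensities of P facial pixels over T frames."""
    b, a = butter(2, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="band")
    filtered = filtfilt(b, a, pixel_series, axis=0)   # temporal band-pass
    return (filtered ** 2).sum(axis=1)                # energy per frame

rng = np.random.default_rng(0)
T, P = 300, 500                                       # 10 s of video, 500 face pixels
series = rng.normal(size=(T, P)) + np.sin(np.linspace(0, 20, T))[:, None]
energy = facial_energy(series)
print(energy[:5])           # e.g. threshold or classify these scores to flag pain episodes
```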
Author | Mohammad Ali Bagheri; Qigang Gao; Sergio Escalera; Albert Clapes; Kamal Nasrollahi; Michael Holte; Thomas B. Moeslund | ||||
Title | Keep it Accurate and Diverse: Enhancing Action Recognition Performance by Ensemble Learning | Type | Conference Article | ||
Year | 2015 | Publication | IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) | Abbreviated Journal |
Volume | Issue | Pages | 22-29 | ||
Keywords | |||||
Abstract | The performance of different action recognition techniques has recently been studied by several computer vision researchers. However, the potential improvement in classification through classifier fusion by ensemble-based methods has remained unattended. In this work, we evaluate the performance of an ensemble of action learning techniques, each performing the recognition task from a different perspective. The underlying idea is that instead of aiming at a very sophisticated and powerful representation/learning technique, we can learn action categories using a set of relatively simple and diverse classifiers, each trained with a different feature set. In addition, combining the outputs of several learners can reduce the risk of an unfortunate selection of a learner on an unseen action recognition scenario. This leads to a more robust and generally applicable framework. In order to improve the recognition performance, a powerful combination strategy is utilized based on the Dempster-Shafer theory, which can effectively make use of the diversity of base learners trained on different sources of information. The recognition results of the individual classifiers are compared with those obtained from fusing the classifiers' output, showing enhanced performance of the proposed methodology. | ||||
Address | Boston; USA; June 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVPRW | |
Notes | HuPBA;MILAB | Approved | no | ||
Call Number | Admin @ si @ BGE2015 | Serial | 2655 | ||
Permanent link to this record | |||||
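The combination strategy named in the record above is Dempster-Shafer theory. The sketch implements Dempster's rule for the simplest case, where each base classifier assigns mass only to singleton action classes; the paper's setting is more general, and the three score vectors are invented for illustration.

```python
"""Minimal sketch of Dempster's rule of combination over singleton hypotheses."""
import numpy as np

def dempster_combine(m1, m2):
    """Combine two mass vectors defined over the same set of singleton classes."""
    joint = m1 * m2                       # agreement on each class
    conflict = 1.0 - joint.sum()          # mass assigned to contradictory pairs
    if conflict >= 1.0:
        raise ValueError("totally conflicting sources")
    return joint / joint.sum()            # normalise out the conflict

# Three base action classifiers, four action classes (illustrative scores).
scores = np.array([[0.6, 0.2, 0.1, 0.1],     # e.g. depth-based classifier
                   [0.5, 0.3, 0.1, 0.1],     # e.g. skeleton-based classifier
                   [0.2, 0.5, 0.2, 0.1]])    # e.g. RGB-based classifier
fused = scores[0]
for m in scores[1:]:
    fused = dempster_combine(fused, m)
print(fused.argmax(), fused)                  # fused decision and masses
```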
Author | Jun Wan; Yibing Zhao; Shuai Zhou; Isabelle Guyon; Sergio Escalera | ||||
Title | ChaLearn Looking at People RGB-D Isolated and Continuous Datasets for Gesture Recognition | Type | Conference Article | ||
Year | 2016 | Publication | 29th IEEE Conference on Computer Vision and Pattern Recognition Workshops | Abbreviated Journal |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | In this paper, we present two large video multi-modal datasets for RGB and RGB-D gesture recognition: the ChaLearn LAP RGB-D Isolated Gesture Dataset (IsoGD) and the Continuous Gesture Dataset (ConGD). Both datasets are derived from the ChaLearn Gesture Dataset (CGD), which has a total of more than 50000 gestures for the "one-shot-learning" competition. To increase the potential of the old dataset, we designed new, well-curated datasets composed of 249 gesture labels and including 47933 gestures with manually labeled begin and end frames in sequences. Using these datasets we will open two competitions on the CodaLab platform so that researchers can test and compare their methods for "user independent" gesture recognition. The first challenge is designed for gesture spotting and recognition in continuous sequences of gestures, while the second one is designed for gesture classification from segmented data. The baseline method based on the bag of visual words model is also presented. | ||||
Address | Las Vegas; USA; July 2016 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVPRW | |
Notes | HuPBA;MILAB; | Approved | no | ||
Call Number | Admin @ si @ WZZ2016 | Serial | 2771 | ||
Permanent link to this record | |||||
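The record above mentions a bag-of-visual-words baseline. The sketch below is a generic version of such a pipeline (local descriptors, K-means codebook, normalised word histograms, linear SVM); descriptor type, codebook size and data are assumptions, not the challenge's actual baseline code.

```python
"""Minimal sketch of a bag-of-visual-words classification pipeline."""
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

# Hypothetical data: each gesture clip yields a variable number of local descriptors.
clips = [rng.normal(size=(rng.integers(50, 100), 32)) for _ in range(60)]
labels = rng.integers(0, 5, 60)                       # 5 gesture classes

codebook = KMeans(n_clusters=64, n_init=4, random_state=0)
codebook.fit(np.vstack(clips))                        # learn the visual words

def bow_histogram(descriptors):
    words = codebook.predict(descriptors)             # quantise against the codebook
    hist = np.bincount(words, minlength=64).astype(float)
    return hist / hist.sum()                          # L1-normalised histogram

X = np.array([bow_histogram(c) for c in clips])
clf = LinearSVC().fit(X, labels)
print(clf.predict(X[:5]))
```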
Author | Cristhian A. Aguilera-Carrasco; F. Aguilera; Angel Sappa; C. Aguilera; Ricardo Toledo | ||||
Title | Learning cross-spectral similarity measures with deep convolutional neural networks | Type | Conference Article | ||
Year | 2016 | Publication | 29th IEEE Conference on Computer Vision and Pattern Recognition Workshops | Abbreviated Journal |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | The simultaneous use of images from different spectra can be helpful to improve the performance of many computer vision tasks. The core idea behind the use of cross-spectral approaches is to take advantage of the strengths of each spectral band, providing a richer representation of a scene which cannot be obtained with images from just one spectral band. In this work we tackle the cross-spectral image similarity problem by using Convolutional Neural Networks (CNNs). We explore three different CNN architectures to compare the similarity of cross-spectral image patches. Specifically, we train each network with images from the visible and the near-infrared spectrum, and then test the result with two public cross-spectral datasets. Experimental results show that CNN approaches outperform the current state-of-the-art on both cross-spectral datasets. Additionally, our experiments show that some CNN architectures are capable of generalizing between different cross-spectral domains. | ||||
Address | Las Vegas; USA; June 2016 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVPRW | |
Notes | ADAS; 600.086; 600.076 | Approved | no | ||
Call Number | Admin @ si @ AAS2016 | Serial | 2809 | |
Permanent link to this record |
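As an illustration of the record above, the sketch defines a 2-channel patch-similarity network: a visible patch and a NIR patch are stacked as input channels and the network outputs a similarity logit. This is one of the architecture families such work typically compares, not necessarily the authors' best-performing model; all sizes are placeholders.

```python
"""Minimal sketch of a 2-channel cross-spectral patch-similarity CNN."""
import torch
import torch.nn as nn

class CrossSpectralSimNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, 32, 5), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3), nn.ReLU(), nn.MaxPool2d(2),
            nn.Flatten(),
            nn.LazyLinear(64), nn.ReLU(),
            nn.Linear(64, 1),                      # similarity logit
        )

    def forward(self, visible, nir):
        x = torch.cat([visible, nir], dim=1)       # stack the two spectra as channels
        return self.net(x).squeeze(1)

# Toy step: 64x64 patch pairs, label 1 = same scene point, 0 = different.
model = CrossSpectralSimNet()
vis, nir = torch.randn(8, 1, 64, 64), torch.randn(8, 1, 64, 64)
target = torch.randint(0, 2, (8,)).float()
loss = nn.BCEWithLogitsLoss()(model(vis, nir), target)
loss.backward()
print(float(loss))
```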