|
Records |
Links |
|
Author |
Sounak Dey; Anjan Dutta; Suman Ghosh; Ernest Valveny; Josep Llados; Umapada Pal |
|
|
Title |
Learning Cross-Modal Deep Embeddings for Multi-Object Image Retrieval using Text and Sketch |
Type |
Conference Article |
|
Year |
2018 |
Publication |
24th International Conference on Pattern Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
916 - 921 |
|
|
Keywords |
|
|
|
Abstract |
In this work we introduce a cross modal image retrieval system that allows both text and sketch as input modalities for the query. A cross-modal deep network architecture is formulated to jointly model the sketch and text input modalities as well as the the image output modality, learning a common embedding between text and images and between sketches and images. In addition, an attention model is used to selectively focus the attention on the different objects of the image, allowing for retrieval with multiple objects in the query. Experiments show that the proposed method performs the best in both single and multiple object image retrieval in standard datasets. |
|
|
Address |
Beijing; China; August 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICPR |
|
|
Notes |
DAG; 602.167; 602.168; 600.097; 600.084; 600.121; 600.129 |
Approved |
no |
|
|
Call Number |
Admin @ si @ DDG2018b |
Serial |
3152 |
|
Permanent link to this record |
|
|
|
|
Author |
Pau Riba; Andreas Fischer; Josep Llados; Alicia Fornes |
|
|
Title |
Learning Graph Distances with Message Passing Neural Networks |
Type |
Conference Article |
|
Year |
2018 |
Publication |
24th International Conference on Pattern Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
2239-2244 |
|
|
Keywords |
★Best Paper Award★ |
|
|
Abstract |
Graph representations have been widely used in pattern recognition thanks to their powerful representation formalism and rich theoretical background. A number of error-tolerant graph matching algorithms such as graph edit distance have been proposed for computing a distance between two labelled graphs. However, they typically suffer from a high
computational complexity, which makes it difficult to apply
these matching algorithms in a real scenario. In this paper, we propose an efficient graph distance based on the emerging field of geometric deep learning. Our method employs a message passing neural network to capture the graph structure and learns a metric with a siamese network approach. The performance of the proposed graph distance is validated in two application cases, graph classification and graph retrieval of handwritten words, and shows a promising performance when compared with
(approximate) graph edit distance benchmarks. |
|
|
Address |
Beijing; China; August 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICPR |
|
|
Notes |
DAG; 600.097; 603.057; 601.302; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RFL2018 |
Serial |
3168 |
|
Permanent link to this record |
|
|
|
|
Author |
Gabriela Ramirez; Esau Villatoro; Bogdan Ionescu; Hugo Jair Escalante; Sergio Escalera; Martha Larson; Henning Muller; Isabelle Guyon |
|
|
Title |
Overview of the Multimedia Information Processing for Personality & Social Networks Analysis Contes |
Type |
Conference Article |
|
Year |
2018 |
Publication |
Multimedia Information Processing for Personality and Social Networks Analysis (MIPPSNA 2018) |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Beijing; China; August 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICPRW |
|
|
Notes |
HUPBA |
Approved |
no |
|
|
Call Number |
Admin @ si @ RVI2018 |
Serial |
3211 |
|
Permanent link to this record |
|
|
|
|
Author |
Felipe Codevilla; Matthias Muller; Antonio Lopez; Vladlen Koltun; Alexey Dosovitskiy |
|
|
Title |
End-to-end Driving via Conditional Imitation Learning |
Type |
Conference Article |
|
Year |
2018 |
Publication |
IEEE International Conference on Robotics and Automation |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
4693 - 4700 |
|
|
Keywords |
|
|
|
Abstract |
Deep networks trained on demonstrations of human driving have learned to follow roads and avoid obstacles. However, driving policies trained via imitation learning cannot be controlled at test time. A vehicle trained end-to-end to imitate an expert cannot be guided to take a specific turn at an upcoming intersection. This limits the utility of such systems. We propose to condition imitation learning on high-level command input. At test time, the learned driving policy functions as a chauffeur that handles sensorimotor coordination but continues to respond to navigational commands. We evaluate different architectures for conditional imitation learning in vision-based driving. We conduct experiments in realistic three-dimensional simulations of urban driving and on a 1/5 scale robotic truck that is trained to drive in a residential area. Both systems drive based on visual input yet remain responsive to high-level navigational commands. The supplementary video can be viewed at this https URL |
|
|
Address |
Brisbane; Australia; May 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICRA |
|
|
Notes |
ADAS; 600.116; 600.124; 600.118 |
Approved |
no |
|
|
Call Number |
Admin @ si @ CML2018 |
Serial |
3108 |
|
Permanent link to this record |
|
|
|
|
Author |
Ozan Caglayan; Adrien Bardet; Fethi Bougares; Loic Barrault; Kai Wang; Marc Masana; Luis Herranz; Joost Van de Weijer |
|
|
Title |
LIUM-CVC Submissions for WMT18 Multimodal Translation Task |
Type |
Conference Article |
|
Year |
2018 |
Publication |
3rd Conference on Machine Translation |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
This paper describes the multimodal Neural Machine Translation systems developed by LIUM and CVC for WMT18 Shared Task on Multimodal Translation. This year we propose several modifications to our previou multimodal attention architecture in order to better integrate convolutional features and refine them using encoder-side information. Our final constrained submissions
ranked first for English→French and second for English→German language pairs among the constrained submissions according to the automatic evaluation metric METEOR. |
|
|
Address |
Brussels; Belgium; October 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
WMT |
|
|
Notes |
LAMP; 600.106; 600.120 |
Approved |
no |
|
|
Call Number |
Admin @ si @ CBB2018 |
Serial |
3240 |
|
Permanent link to this record |
|
|
|
|
Author |
Simone Balocco; Mauricio Gonzalez; Ricardo Ñancule; Petia Radeva; Gabriel Thomas |
|
|
Title |
Calcified Plaque Detection in IVUS Sequences: Preliminary Results Using Convolutional Nets |
Type |
Conference Article |
|
Year |
2018 |
Publication |
International Workshop on Artificial Intelligence and Pattern Recognition |
Abbreviated Journal |
|
|
|
Volume |
11047 |
Issue |
|
Pages |
34-42 |
|
|
Keywords |
Intravascular ultrasound images; Convolutional nets; Deep learning; Medical image analysis |
|
|
Abstract |
The manual inspection of intravascular ultrasound (IVUS) images to detect clinically relevant patterns is a difficult and laborious task performed routinely by physicians. In this paper, we present a framework based on convolutional nets for the quick selection of IVUS frames containing arterial calcification, a pattern whose detection plays a vital role in the diagnosis of atherosclerosis. Preliminary experiments on a dataset acquired from eighty patients show that convolutional architectures improve detections of a shallow classifier in terms of 𝐹1-measure, precision and recall. |
|
|
Address |
Cuba; September 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
IWAIPR |
|
|
Notes |
MILAB; no menciona |
Approved |
no |
|
|
Call Number |
Admin @ si @ BGÑ2018 |
Serial |
3237 |
|
Permanent link to this record |
|
|
|
|
Author |
Mohamed Ilyes Lakhal; Hakan Cevikalp; Sergio Escalera |
|
|
Title |
CRN: End-to-end Convolutional Recurrent Network Structure Applied to Vehicle Classification |
Type |
Conference Article |
|
Year |
2018 |
Publication |
13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications |
Abbreviated Journal |
|
|
|
Volume |
5 |
Issue |
|
Pages |
137-144 |
|
|
Keywords |
Vehicle Classification; Deep Learning; End-to-end Learning |
|
|
Abstract |
Vehicle type classification is considered to be a central part of Intelligent Traffic Systems. In the recent years, deep learning methods have emerged in as being the state-of-the-art in many computer vision tasks. In this paper, we present a novel yet simple deep learning framework for the vehicle type classification problem. We propose an end-to-end trainable system, that combines convolution neural network for feature extraction and recurrent neural network as a classifier. The recurrent network structure is used to handle various types of feature inputs, and at the same time allows to produce a single or a set of class predictions. In order to assess the effectiveness of our solution, we have conducted a set of experiments in two public datasets, obtaining state of the art results. In addition, we also report results on the newly released MIO-TCD dataset. |
|
|
Address |
Funchal; Madeira; Portugal; January 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
VISAPP |
|
|
Notes |
HUPBA |
Approved |
no |
|
|
Call Number |
Admin @ si @ LCE2018a |
Serial |
3094 |
|
Permanent link to this record |
|
|
|
|
Author |
Md. Mostafa Kamal Sarker; Hatem A. Rashwan; Farhan Akram; Syeda Furruka Banu; Adel Saleh; Vivek Kumar Singh; Forhad U. H. Chowdhury; Saddam Abdulwahab; Santiago Romani; Petia Radeva; Domenec Puig |
|
|
Title |
SLSDeep: Skin Lesion Segmentation Based on Dilated Residual and Pyramid Pooling Networks. |
Type |
Conference Article |
|
Year |
2018 |
Publication |
21st International Conference on Medical Image Computing & Computer Assisted Intervention |
Abbreviated Journal |
|
|
|
Volume |
2 |
Issue |
|
Pages |
21-29 |
|
|
Keywords |
|
|
|
Abstract |
Skin lesion segmentation (SLS) in dermoscopic images is a crucial task for automated diagnosis of melanoma. In this paper, we present a robust deep learning SLS model, so-called SLSDeep, which is represented as an encoder-decoder network. The encoder network is constructed by dilated residual layers, in turn, a pyramid pooling network followed by three convolution layers is used for the decoder. Unlike the traditional methods employing a cross-entropy loss, we investigated a loss function by combining both Negative Log Likelihood (NLL) and End Point Error (EPE) to accurately segment the melanoma regions with sharp boundaries. The robustness of the proposed model was evaluated on two public databases: ISBI 2016 and 2017 for skin lesion analysis towards melanoma detection challenge. The proposed model outperforms the state-of-the-art methods in terms of segmentation accuracy. Moreover, it is capable to segment more than 100 images of size 384x384 per second on a recent GPU. |
|
|
Address |
Granada; Espanya; September 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
MICCAI |
|
|
Notes |
MILAB; no proj |
Approved |
no |
|
|
Call Number |
Admin @ si @ SRA2018 |
Serial |
3112 |
|
Permanent link to this record |
|
|
|
|
Author |
Esmitt Ramirez; Carles Sanchez; Agnes Borras; Marta Diez-Ferrer; Antoni Rosell; Debora Gil |
|
|
Title |
Image-Based Bronchial Anatomy Codification for Biopsy Guiding in Video Bronchoscopy |
Type |
Conference Article |
|
Year |
2018 |
Publication |
OR 2.0 Context-Aware Operating Theaters, Computer Assisted Robotic Endoscopy, Clinical Image-Based Procedures, and Skin Image Analysis |
Abbreviated Journal |
|
|
|
Volume |
11041 |
Issue |
|
Pages |
|
|
|
Keywords |
Biopsy guiding; Bronchoscopy; Lung biopsy; Intervention guiding; Airway codification |
|
|
Abstract |
Bronchoscopy examinations allow biopsy of pulmonary nodules with minimum risk for the patient. Even for experienced bronchoscopists, it is difficult to guide the bronchoscope to most distal lesions and obtain an accurate diagnosis. This paper presents an image-based codification of the bronchial anatomy for bronchoscopy biopsy guiding. The 3D anatomy of each patient is codified as a binary tree with nodes representing bronchial levels and edges labeled using their position on images projecting the 3D anatomy from a set of branching points. The paths from the root to leaves provide a codification of navigation routes with spatially consistent labels according to the anatomy observes in video bronchoscopy explorations. We evaluate our labeling approach as a guiding system in terms of the number of bronchial levels correctly codified, also in the number of labels-based instructions correctly supplied, using generalized mixed models and computer-generated data. Results obtained for three independent observers prove the consistency and reproducibility of our guiding system. We trust that our codification based on viewer’s projection might be used as a foundation for the navigation process in Virtual Bronchoscopy systems. |
|
|
Address |
Granada; September 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
MICCAIW |
|
|
Notes |
IAM; 600.096; 600.075; 601.323; 600.145 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RSB2018b |
Serial |
3137 |
|
Permanent link to this record |
|
|
|
|
Author |
Jon Almazan; Bojana Gajic; Naila Murray; Diane Larlus |
|
|
Title |
Re-ID done right: towards good practices for person re-identification |
Type |
Miscellaneous |
|
Year |
2018 |
Publication |
Arxiv |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
Training a deep architecture using a ranking loss has become standard for the person re-identification task. Increasingly, these deep architectures include additional components that leverage part detections, attribute predictions, pose estimators and other auxiliary information, in order to more effectively localize and align discriminative image regions. In this paper we adopt a different approach and carefully design each component of a simple deep architecture and, critically, the strategy for training it effectively for person re-identification. We extensively evaluate each design choice, leading to a list of good practices for person re-identification. By following these practices, our approach outperforms the state of the art, including more complex methods with auxiliary components, by large margins on four benchmark datasets. We also provide a qualitative analysis of our trained representation which indicates that, while compact, it is able to capture information from localized and discriminative regions, in a manner akin to an implicit attention mechanism. |
|
|
Address |
January 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
|
Approved |
no |
|
|
Call Number |
Admin @ si @ |
Serial |
3711 |
|
Permanent link to this record |
|
|
|
|
Author |
Mohammad N. S. Jahromi; Morten Bojesen Bonderup; Maryam Asadi-Aghbolaghi; Egils Avots; Kamal Nasrollahi; Sergio Escalera; Shohreh Kasaei; Thomas B. Moeslund; Gholamreza Anbarjafari |
|
|
Title |
Automatic Access Control Based on Face and Hand Biometrics in a Non-cooperative Context |
Type |
Conference Article |
|
Year |
2018 |
Publication |
IEEE Winter Applications of Computer Vision Workshops |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
28-36 |
|
|
Keywords |
IEEE Winter Applications of Computer Vision Workshops |
|
|
Abstract |
Automatic access control systems (ACS) based on the human biometrics or physical tokens are widely employed in public and private areas. Yet these systems, in their conventional forms, are restricted to active interaction from the users. In scenarios where users are not cooperating with the system, these systems are challenged. Failure in cooperation with the biometric systems might be intentional or because the users are incapable of handling the interaction procedure with the biometric system or simply forget to cooperate with it, due to for example, illness like dementia. This work introduces a challenging bimodal database, including face and hand information of the users when they approach a door to open it by its handle in a noncooperative context. We have defined two (an easy and a challenging) protocols on how to use the database. We have reported results on many baseline methods, including deep learning techniques as well as conventional methods on the database. The obtained results show the merit of the proposed database and the challenging nature of access control with non-cooperative users. |
|
|
Address |
Lake Tahoe; USA; March 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
WACVW |
|
|
Notes |
HUPBA; 602.133 |
Approved |
no |
|
|
Call Number |
Admin @ si @ JBA2018 |
Serial |
3121 |
|
Permanent link to this record |
|
|
|
|
Author |
Xavier Soria; Angel Sappa |
|
|
Title |
Improving Edge Detection in RGB Images by Adding NIR Channel |
Type |
Conference Article |
|
Year |
2018 |
Publication |
14th IEEE International Conference on Signal Image Technology & Internet Based System |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
Edge detection; Contour detection; VGG; CNN; RGB-NIR; Near infrared images |
|
|
Abstract |
The edge detection is yet a critical problem in many computer vision and image processing tasks. The manuscript presents an Holistically-Nested Edge Detection based approach to study the inclusion of Near-Infrared in the Visible spectrum
images. To do so, a Single Sensor based dataset has been acquired in the range of 400nm to 1100nm wavelength spectral band. Prominent results have been obtained even when the ground truth (annotated edge-map) is based in the visible wavelength spectrum. |
|
|
Address |
Las Palmas de Gran Canaria; November 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
SITIS |
|
|
Notes |
MSIAU; 600.122 |
Approved |
no |
|
|
Call Number |
Admin @ si @ SoS2018 |
Serial |
3192 |
|
Permanent link to this record |
|
|
|
|
Author |
Patricia Suarez; Angel Sappa; Boris X. Vintimilla |
|
|
Title |
Cross-spectral image dehaze through a dense stacked conditional GAN based approach |
Type |
Conference Article |
|
Year |
2018 |
Publication |
14th IEEE International Conference on Signal Image Technology & Internet Based System |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
Infrared imaging; Dense; Stacked CGAN; Crossspectral; Convolutional networks |
|
|
Abstract |
This paper proposes a novel approach to remove haze from RGB images using a near infrared images based on a dense stacked conditional Generative Adversarial Network (CGAN). The architecture of the deep network implemented
receives, besides the images with haze, its corresponding image in the near infrared spectrum, which serve to accelerate the learning process of the details of the characteristics of the images. The model uses a triplet layer that allows the independence learning of each channel of the visible spectrum image to remove the haze on each color channel separately. A multiple loss function scheme is proposed, which ensures balanced learning between the colors
and the structure of the images. Experimental results have shown that the proposed method effectively removes the haze from the images. Additionally, the proposed approach is compared with a state of the art approach showing better results. |
|
|
Address |
Las Palmas de Gran Canaria; November 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-1-5386-9385-8 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
SITIS |
|
|
Notes |
MSIAU; 600.086; 600.130; 600.122 |
Approved |
no |
|
|
Call Number |
Admin @ si @ SSV2018a |
Serial |
3193 |
|
Permanent link to this record |
|
|
|
|
Author |
Jorge Charco; Boris X. Vintimilla; Angel Sappa |
|
|
Title |
Deep learning based camera pose estimation in multi-view environment |
Type |
Conference Article |
|
Year |
2018 |
Publication |
14th IEEE International Conference on Signal Image Technology & Internet Based System |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
Deep learning; Camera pose estimation; Multiview environment; Siamese architecture |
|
|
Abstract |
This paper proposes to use a deep learning network architecture for relative camera pose estimation on a multi-view environment. The proposed network is a variant architecture of AlexNet to use as regressor for prediction the relative translation and rotation as output. The proposed approach is trained from
scratch on a large data set that takes as input a pair of imagesfrom the same scene. This new architecture is compared with a previous approach using standard metrics, obtaining better results on the relative camera pose. |
|
|
Address |
Las Palmas de Gran Canaria; November 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
SITIS |
|
|
Notes |
MSIAU; 600.086; 600.130; 600.122 |
Approved |
no |
|
|
Call Number |
Admin @ si @ CVS2018 |
Serial |
3194 |
|
Permanent link to this record |
|
|
|
|
Author |
Boris N. Oreshkin; Pau Rodriguez; Alexandre Lacoste |
|
|
Title |
TADAM: Task dependent adaptive metric for improved few-shot learning |
Type |
Conference Article |
|
Year |
2018 |
Publication |
32nd Annual Conference on Neural Information Processing Systems |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
Few-shot learning has become essential for producing models that generalize from few examples. In this work, we identify that metric scaling and metric task conditioning are important to improve the performance of few-shot algorithms. Our analysis reveals that simple metric scaling completely changes the nature of few-shot algorithm parameter updates. Metric scaling provides improvements up to 14% in accuracy for certain metrics on the mini-Imagenet 5-way 5-shot classification task. We further propose a simple and effective way of conditioning a learner on the task sample set, resulting in learning a task-dependent metric space. Moreover, we propose and empirically test a practical end-to-end optimization procedure based on auxiliary task co-training to learn a task-dependent metric space. The resulting few-shot learning model based on the task-dependent scaled metric achieves state of the art on mini-Imagenet. We confirm these results on another few-shot dataset that we introduce in this paper based on CIFAR100. |
|
|
Address |
Montreal; Canada; December 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
NIPS |
|
|
Notes |
ISE; 600.098; 600.119 |
Approved |
no |
|
|
Call Number |
Admin @ si @ ORL2018 |
Serial |
3140 |
|
Permanent link to this record |