Author M. Campos-Taberner; Adriana Romero; Carlo Gatta; Gustavo Camps-Valls
  Title Shared feature representations of LiDAR and optical images: Trading sparsity for semantic discrimination Type Conference Article
  Year 2015 Publication IEEE International Geoscience and Remote Sensing Symposium IGARSS2015 Abbreviated Journal  
  Volume Issue Pages 4169 - 4172  
  Keywords  
  Abstract This paper studies the level of complementary information conveyed by extremely high resolution LiDAR and optical images. We pursue this goal following an indirect approach via unsupervised spatial-spectral feature extraction. We used a recently presented unsupervised convolutional neural network trained to enforce both population and lifetime sparsity in the feature representation. We derived independent and joint feature representations, and analyzed the sparsity scores and the discriminative power. Interestingly, the obtained results revealed that the RGB+LiDAR representation is no longer sparse, and the derived basis functions merge color and elevation, yielding a set of more expressive colored edge filters. The joint feature representation is also more discriminative when used for clustering and topological data visualization.  
  Address Milan; Italy; July 2015  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference IGARSS  
  Notes LAMP; 600.079; MILAB Approved no  
  Call Number Admin @ si @ CRG2015 Serial 2724  
 

 
Author Laura Lopez-Fuentes; Joost Van de Weijer; Marc Bolaños; Harald Skinnemoen
  Title Multi-modal Deep Learning Approach for Flood Detection Type Conference Article
  Year 2017 Publication MediaEval Benchmarking Initiative for Multimedia Evaluation Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract In this paper we propose a multi-modal deep learning approach to detect floods in social media posts. Social media posts normally contain metadata and/or visual information, which we use to detect floods. The model is based on a Convolutional Neural Network which extracts the visual features and a bidirectional Long Short-Term Memory network which extracts the semantic features from the textual metadata. We validate the method on images extracted from Flickr which contain both visual information and metadata, and compare the results when using both modalities, visual information only, or metadata only. This work has been done in the context of the MediaEval Multimedia Satellite Task.
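The fusion described in the abstract (CNN visual features combined with biLSTM text features) can be sketched as a late-fusion classifier. This is a minimal pure-Python illustration, not the authors' implementation; the function name, feature dimensions and logistic output layer are assumptions:

```python
import math

def flood_probability(visual_feats, text_feats, weights, bias):
    """Fuse visual and textual feature vectors by concatenation,
    then apply a logistic output unit (a stand-in for the final
    classification layer of a multi-modal network)."""
    fused = list(visual_feats) + list(text_feats)  # concatenation
    z = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-z))  # probability the post shows a flood
```

In the paper's setting the visual vector would come from a CNN and the text vector from a bidirectional LSTM over the post's metadata; here they are plain lists for illustration.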
 
  Address Dublin; Ireland; September 2017  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference MediaEval  
  Notes LAMP; 600.084; 600.109; 600.120 Approved no  
  Call Number Admin @ si @ LWB2017a Serial 2974  
 

 
Author Laura Lopez-Fuentes; Alessandro Farasin; Harald Skinnemoen; Paolo Garza
  Title Deep Learning models for passability detection of flooded roads Type Conference Article
  Year 2018 Publication MediaEval 2018 Multimedia Benchmark Workshop Abbreviated Journal  
  Volume 2283 Issue Pages  
  Keywords  
  Abstract In this paper we study and compare several approaches to detecting floods, and evidence for the passability of roads by conventional means, in Twitter data. We focus on tweets containing both visual information (a picture shared by the user) and metadata, a combination of text and related extra information intrinsic to the Twitter API. This work has been done in the context of the MediaEval 2018 Multimedia Satellite Task.  
  Address Sophia Antipolis; France; October 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference MediaEval  
  Notes LAMP; 600.084; 600.109; 600.120 Approved no  
  Call Number Admin @ si @ LFS2018 Serial 3224  
 

 
Author Laura Lopez-Fuentes; Claudio Rossi; Harald Skinnemoen
  Title River segmentation for flood monitoring Type Conference Article
  Year 2017 Publication Data Science for Emergency Management at Big Data 2017 Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Floods are major natural disasters which cause deaths and material damage every year. Monitoring these events is crucial in order to reduce both the number of people affected and the economic losses. In this work we train and test three different Deep Learning segmentation algorithms to estimate the water area from river images, and compare their performance. We discuss the implementation of a novel data chain aimed at monitoring river water levels by automatically processing data collected from surveillance cameras, and giving alerts in case of sharp increases in the water level or flooding. We also create and openly publish the first image dataset for river water segmentation.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; 600.084; 600.120 Approved no  
  Call Number Admin @ si @ LRS2017 Serial 3078  
 

 
Author Mikhail Mozerov; Joost Van de Weijer
  Title One-view occlusion detection for stereo matching with a fully connected CRF model Type Journal Article
  Year 2019 Publication IEEE Transactions on Image Processing Abbreviated Journal TIP  
  Volume 28 Issue 6 Pages 2936-2947  
  Keywords Stereo matching; energy minimization; fully connected MRF model; geodesic distance filter  
  Abstract In this paper, we extend the standard belief propagation (BP) sequential technique proposed in the tree-reweighted sequential method [15] to fully connected CRF models with geodesic distance affinity. The proposed method has been applied to the stereo matching problem. We also propose a new approach to the BP marginal solution that we call one-view occlusion detection (OVOD). In contrast to the standard winner-takes-all (WTA) estimation, the proposed OVOD solution makes it possible to find occluded regions in the disparity map and simultaneously improve the matching result. As a result we can perform only one energy minimization process and avoid the cost calculation for the second view and the left-right check procedure. We show that the OVOD approach considerably improves results for cost augmentation and energy minimization techniques in comparison with the standard one-view affinity space implementation. We apply our method to the Middlebury data set and reach state-of-the-art results, especially for the median, average and mean squared error metrics.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; 600.098; 600.109; 602.133; 600.120 Approved no  
  Call Number Admin @ si @ MoW2019 Serial 3221  
 

 
Author Esteve Cervantes; Long Long Yu; Andrew Bagdanov; Marc Masana; Joost Van de Weijer
  Title Hierarchical Part Detection with Deep Neural Networks Type Conference Article
  Year 2016 Publication 23rd IEEE International Conference on Image Processing Abbreviated Journal  
  Volume Issue Pages  
  Keywords Object Recognition; Part Detection; Convolutional Neural Networks  
  Abstract Part detection is an important aspect of object recognition. Most approaches apply object proposals to generate hundreds of possible part bounding box candidates which are then evaluated by part classifiers. Recently several methods have investigated directly regressing to a limited set of bounding boxes from deep neural network representations. However, for object parts such methods may be infeasible due to the parts' relatively small size with respect to the image. We propose a hierarchical method for object and part detection. In a single network we first detect the object and then regress to part location proposals based only on the feature representation inside the object. Experiments show that our hierarchical approach outperforms a network which directly regresses the part locations. We also show that our approach obtains part detection accuracy comparable to or better than the state of the art on the CUB-200 bird and Fashionista clothing item datasets with only a fraction of the number of part proposals.  
  Address Phoenix; Arizona; USA; September 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICIP  
  Notes LAMP; 600.106 Approved no  
  Call Number Admin @ si @ CLB2016 Serial 2762  
 

 
Author Ozan Caglayan; Walid Aransa; Yaxing Wang; Marc Masana; Mercedes Garcia-Martinez; Fethi Bougares; Loic Barrault; Joost Van de Weijer
  Title Does Multimodality Help Human and Machine for Translation and Image Captioning? Type Conference Article
  Year 2016 Publication 1st conference on machine translation Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract This paper presents the systems developed by LIUM and CVC for the WMT16 Multimodal Machine Translation challenge. We explored various comparative methods, namely phrase-based systems and attentional recurrent neural network models trained using monomodal or multimodal data. We also performed a human evaluation in order to estimate the usefulness of multimodal data for human machine translation and image description generation. Our systems obtained the best results for both tasks according to the automatic evaluation metrics BLEU and METEOR.  
  Address Berlin; Germany; August 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WMT  
  Notes LAMP; 600.106; 600.068 Approved no  
  Call Number Admin @ si @ CAW2016 Serial 2761  
 

 
Author Xialei Liu; Joost Van de Weijer; Andrew Bagdanov
  Title RankIQA: Learning from Rankings for No-reference Image Quality Assessment Type Conference Article
  Year 2017 Publication 17th IEEE International Conference on Computer Vision Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract We propose a no-reference image quality assessment (NR-IQA) approach that learns from rankings (RankIQA). To address the problem of limited IQA dataset size, we train a Siamese Network to rank images in terms of image quality by using synthetically generated distortions for which relative image quality is known. These ranked image sets can be automatically generated without laborious human labeling. We then use fine-tuning to transfer the knowledge represented in the trained Siamese Network to a traditional CNN that estimates absolute image quality from single images. We demonstrate how our approach can be made significantly more efficient than traditional Siamese Networks by forward propagating a batch of images through a single network and backpropagating gradients derived from all pairs of images in the batch. Experiments on the TID2013 benchmark show that we improve the state-of-the-art by over 5%. Furthermore, on the LIVE benchmark we show that our approach is superior to existing NR-IQA techniques and that we even outperform the state-of-the-art in full-reference IQA (FR-IQA) methods without having to resort to high-quality reference images to infer IQA.  
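The ranking step in the abstract can be illustrated with a pairwise hinge loss computed over all pairs in a batch of predicted quality scores, which is the trick that lets a single forward pass contribute gradients from every pair. This is a minimal sketch under assumed names (`pairwise_ranking_hinge`, a margin of 1.0), not the authors' code:

```python
def pairwise_ranking_hinge(scores, margin=1.0):
    """Average hinge loss over all ordered pairs in a batch.
    `scores` are predicted quality scores, ordered from the
    highest-quality image to the lowest (the ordering is known
    because the distortions are generated synthetically)."""
    loss, pairs = 0.0, 0
    for i in range(len(scores)):
        for j in range(i + 1, len(scores)):
            # penalize whenever the better image is not scored
            # at least `margin` above the worse one
            loss += max(0.0, margin - (scores[i] - scores[j]))
            pairs += 1
    return loss / pairs
```

A correctly ordered batch with gaps of at least the margin incurs zero loss; any inversion contributes a positive penalty.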
  Address Venice; Italy; October 2017  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICCV  
  Notes LAMP; 600.106; 600.109; 600.120 Approved no  
  Call Number Admin @ si @ LWB2017b Serial 3036  
 

 
Author Yaxing Wang; Abel Gonzalez-Garcia; Joost Van de Weijer; Luis Herranz
  Title SDIT: Scalable and Diverse Cross-domain Image Translation Type Conference Article
  Year 2019 Publication 27th ACM International Conference on Multimedia Abbreviated Journal  
  Volume Issue Pages 1267–1276  
  Keywords  
  Abstract Recently, image-to-image translation research has witnessed remarkable progress. Although current approaches successfully generate diverse outputs or perform scalable image transfer, these properties have not been combined into a single method. To address this limitation, we propose SDIT: Scalable and Diverse image-to-image translation. These properties are combined into a single generator. The diversity is determined by a latent variable which is randomly sampled from a normal distribution. The scalability is obtained by conditioning the network on the domain attributes. Additionally, we also exploit an attention mechanism that permits the generator to focus on the domain-specific attribute. We empirically demonstrate the performance of the proposed method on face mapping and other datasets beyond faces.  
  Address Nice; France; October 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ACM-MM  
  Notes LAMP; 600.106; 600.109; 600.141; 600.120 Approved no  
  Call Number Admin @ si @ WGW2019 Serial 3363  
 

 
Author Chenshen Wu; Luis Herranz; Xialei Liu; Joost Van de Weijer; Bogdan Raducanu
  Title Memory Replay GANs: Learning to Generate New Categories without Forgetting Type Conference Article
  Year 2018 Publication 32nd Annual Conference on Neural Information Processing Systems Abbreviated Journal  
  Volume Issue Pages 5966-5976  
  Keywords  
  Abstract Previous works on sequential learning address the problem of forgetting in discriminative models. In this paper we consider the case of generative models. In particular, we investigate generative adversarial networks (GANs) in the task of learning new categories in a sequential fashion. We first show that sequential fine-tuning renders the network unable to properly generate images from previous categories (i.e., forgetting). To address this problem, we propose Memory Replay GANs (MeRGANs), a conditional GAN framework that integrates a memory replay generator. We study two methods to prevent forgetting by leveraging these replays, namely joint training with replay and replay alignment. Qualitative and quantitative experimental results on the MNIST, SVHN and LSUN datasets show that our memory replay approach can generate competitive images while significantly mitigating the forgetting of previous categories.  
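The replay-alignment idea in the abstract (keeping the current generator close to a frozen snapshot on previously learned categories) can be sketched as an L2 penalty between the two generators' outputs for the same latent code. The function name and the toy generator interface are assumptions for illustration, not the paper's implementation:

```python
def replay_alignment_loss(gen_current, gen_frozen, latents, past_categories):
    """Mean squared difference between images produced by the
    current generator and a frozen snapshot, for the same latent
    code and category label; penalizing it discourages forgetting."""
    total, count = 0.0, 0
    for z in latents:
        for c in past_categories:
            current_img = gen_current(z, c)
            replay_img = gen_frozen(z, c)
            total += sum((a - b) ** 2 for a, b in zip(current_img, replay_img))
            count += 1
    return total / count
```

Here a "generator" is any callable mapping a latent code and category label to a flat list of pixel values; in the paper it would be the conditional GAN generator before and after learning a new category.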
  Address Montreal; Canada; December 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference NIPS  
  Notes LAMP; 600.106; 600.109; 602.200; 600.120 Approved no  
  Call Number Admin @ si @ WHL2018 Serial 3249  
 

 
Author Ozan Caglayan; Walid Aransa; Adrien Bardet; Mercedes Garcia-Martinez; Fethi Bougares; Loic Barrault; Marc Masana; Luis Herranz; Joost Van de Weijer
  Title LIUM-CVC Submissions for WMT17 Multimodal Translation Task Type Conference Article
  Year 2017 Publication 2nd Conference on Machine Translation Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract This paper describes the monomodal and multimodal Neural Machine Translation systems developed by LIUM and CVC for WMT17 Shared Task on Multimodal Translation. We mainly explored two multimodal architectures where either global visual features or convolutional feature maps are integrated in order to benefit from visual context. Our final systems ranked first for both En-De and En-Fr language pairs according to the automatic evaluation metrics METEOR and BLEU.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WMT  
  Notes LAMP; 600.106; 600.120 Approved no  
  Call Number Admin @ si @ CAB2017 Serial 3035  
 

 
Author Ozan Caglayan; Adrien Bardet; Fethi Bougares; Loic Barrault; Kai Wang; Marc Masana; Luis Herranz; Joost Van de Weijer
  Title LIUM-CVC Submissions for WMT18 Multimodal Translation Task Type Conference Article
  Year 2018 Publication 3rd Conference on Machine Translation Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract This paper describes the multimodal Neural Machine Translation systems developed by LIUM and CVC for the WMT18 Shared Task on Multimodal Translation. This year we propose several modifications to our previous multimodal attention architecture in order to better integrate convolutional features and refine them using encoder-side information. Our final submissions ranked first for English→French and second for English→German among the constrained submissions, according to the automatic evaluation metric METEOR.
 
  Address Brussels; Belgium; October 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WMT  
  Notes LAMP; 600.106; 600.120 Approved no  
  Call Number Admin @ si @ CBB2018 Serial 3240  
 

 
Author Muhammad Anwer Rao; Fahad Shahbaz Khan; Joost Van de Weijer; Jorma Laaksonen
  Title Tex-Nets: Binary Patterns Encoded Convolutional Neural Networks for Texture Recognition Type Conference Article
  Year 2017 Publication 19th International Conference on Multimodal Interaction Abbreviated Journal  
  Volume Issue Pages  
  Keywords Convolutional Neural Networks; Texture Recognition; Local Binary Patterns  
  Abstract Recognizing materials and textures in realistic imaging conditions is a challenging computer vision problem. For many years, local features based orderless representations were a dominant approach for texture recognition. Recently deep local features, extracted from the intermediate layers of a Convolutional Neural Network (CNN), are used as filter banks. These dense local descriptors from a deep model, when encoded with Fisher Vectors, have been shown to provide excellent results for texture recognition. The CNN models, employed in such approaches, take RGB patches as input and train on a large amount of labeled images. We show that CNN models, which we call TEX-Nets, trained using mapped coded images with explicit texture information provide complementary information to the standard deep models trained on RGB patches. We further investigate two deep architectures, namely early and late fusion, to combine the texture and color information. Experiments on benchmark texture datasets clearly demonstrate that TEX-Nets provide complementary information to the standard RGB deep network. Our approach provides a large gain of 4.8%, 3.5%, 2.6% and 4.1% in accuracy on the DTD, KTH-TIPS-2a, KTH-TIPS-2b and Texture-10 datasets respectively, compared to the standard RGB network of the same architecture. Further, our final combination leads to consistent improvements over the state-of-the-art on all four datasets.  
  Address Glasgow; Scotland; November 2017  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ACM  
  Notes LAMP; 600.109; 600.068; 600.120 Approved no  
  Call Number Admin @ si @ RKW2017 Serial 3038  
 

 
Author Muhammad Anwer Rao; Fahad Shahbaz Khan; Joost Van de Weijer; Jorma Laaksonen
  Title Top-Down Deep Appearance Attention for Action Recognition Type Conference Article
  Year 2017 Publication 20th Scandinavian Conference on Image Analysis Abbreviated Journal  
  Volume 10269 Issue Pages 297-309  
  Keywords Action recognition; CNNs; Feature fusion  
  Abstract Recognizing human actions in videos is a challenging problem in computer vision. Recently, convolutional neural network based deep features have shown promising results for action recognition. In this paper, we investigate the problem of fusing deep appearance and motion cues for action recognition. We propose a video representation which combines deep appearance and motion based local convolutional features within the bag-of-deep-features framework. Firstly, dense deep appearance and motion based local convolutional features are extracted from spatial (RGB) and temporal (flow) networks, respectively. Both visual cues are processed in parallel by constructing separate visual vocabularies for appearance and motion. A category-specific appearance map is then learned to modulate the weights of the deep motion features. The proposed representation is discriminative and binds the deep local convolutional features to their spatial locations. Experiments are performed on two challenging datasets: the JHMDB dataset with 21 action classes and the ACT dataset with 43 categories. The results clearly demonstrate that our approach outperforms both standard approaches of early and late feature fusion. Further, our approach employs only action labels, without exploiting body part information, yet achieves competitive performance compared to the state-of-the-art deep feature based approaches.  
  Address Tromso; June 2017  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference SCIA  
  Notes LAMP; 600.109; 600.068; 600.120 Approved no  
  Call Number Admin @ si @ RKW2017b Serial 3039  
 

 
Author Aitor Alvarez-Gila; Joost Van de Weijer; Estibaliz Garrote
  Title Adversarial Networks for Spatial Context-Aware Spectral Image Reconstruction from RGB Type Conference Article
  Year 2017 Publication 1st International Workshop on Physics Based Vision meets Deep Learning Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Hyperspectral signal reconstruction aims at recovering the original spectral input that produced a certain trichromatic (RGB) response from a capturing device or observer. Given the heavily underconstrained, non-linear nature of the problem, traditional techniques leverage different statistical properties of the spectral signal in order to build informative priors from real-world object reflectances for constructing such an RGB to spectral signal mapping. However, most of them treat each sample independently, and thus do not benefit from the contextual information that the spatial dimensions can provide. We pose hyperspectral natural image reconstruction as an image-to-image mapping learning problem, and apply a conditional generative adversarial framework to help capture spatial semantics. This is the first time Convolutional Neural Networks (and, in particular, Generative Adversarial Networks) are used to solve this task. Quantitative evaluation shows a Root Mean Squared Error (RMSE) drop of 44.7% and a Relative RMSE drop of 47.0% on the ICVL natural hyperspectral image dataset.
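The two evaluation metrics quoted in the abstract can be written out explicitly. These are common definitions of RMSE and relative RMSE (error normalized per sample by the reference value), assumed here for illustration rather than taken from the paper:

```python
import math

def rmse(pred, ref):
    """Root mean squared error between a reconstructed spectral
    signal and its ground-truth reference."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(pred))

def relative_rmse(pred, ref):
    """RMSE with each error normalized by the reference value,
    so bright and dark spectral bands contribute comparably."""
    return math.sqrt(sum(((p - r) / r) ** 2 for p, r in zip(pred, ref)) / len(pred))
```

Both take flat lists of per-band values; the relative variant assumes non-zero reference values.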
 
  Address Venice; Italy; October 2017  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICCV-PBDL  
  Notes LAMP; 600.109; 600.106; 600.120 Approved no  
  Call Number Admin @ si @ AWG2017 Serial 2969  