toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
  Records Links
Author Muhammad Anwer Rao; Fahad Shahbaz Khan; Joost Van de Weijer; Matthieu Molinier; Jorma Laaksonen edit   pdf
url  openurl
  Title Binary patterns encoded convolutional neural networks for texture recognition and remote sensing scene classification Type Journal Article
  Year 2018 Publication ISPRS Journal of Photogrammetry and Remote Sensing Abbreviated Journal ISPRS J  
  Volume 138 Issue Pages 74-85  
  Keywords Remote sensing; Deep learning; Scene classification; Local Binary Patterns; Texture analysis  
  Abstract Designing discriminative powerful texture features robust to realistic imaging conditions is a challenging computer vision problem with many applications, including material recognition and analysis of satellite or aerial imagery. In the past, most texture description approaches were based on dense orderless statistical distribution of local features. However, most recent approaches to texture recognition and remote sensing scene classification are based on Convolutional Neural Networks (CNNs). The de facto practice when learning these CNN models is to use RGB patches as input with training performed on large amounts of labeled data (ImageNet). In this paper, we show that Local Binary Patterns (LBP) encoded CNN models, codenamed TEX-Nets, trained using mapped coded images with explicit LBP based texture information provide complementary information to the standard RGB deep models. Additionally, two deep architectures, namely early and late fusion, are investigated to combine the texture and color information. To the best of our knowledge, we are the first to investigate Binary Patterns encoded CNNs and different deep network fusion architectures for texture recognition and remote sensing scene classification. We perform comprehensive experiments on four texture recognition datasets and four remote sensing scene classification benchmarks: UC-Merced with 21 scene categories, WHU-RS19 with 19 scene classes, RSSCN7 with 7 categories and the recently introduced large scale aerial image dataset (AID) with 30 aerial scene types. We demonstrate that TEX-Nets provide complementary information to standard RGB deep model of the same network architecture. Our late fusion TEX-Net architecture always improves the overall performance compared to the standard RGB network on both recognition problems. Furthermore, our final combination leads to consistent improvement over the state-of-the-art for remote sensing scene  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes (up) LAMP; 600.109; 600.106; 600.120 Approved no  
  Call Number Admin @ si @ RKW2018 Serial 3158  
Permanent link to this record
 

 
Author Xialei Liu; Joost Van de Weijer; Andrew Bagdanov edit   pdf
url  doi
openurl 
  Title Exploiting Unlabeled Data in CNNs by Self-Supervised Learning to Rank Type Journal Article
  Year 2019 Publication IEEE Transactions on Pattern Analysis and Machine Intelligence Abbreviated Journal TPAMI  
  Volume 41 Issue 8 Pages 1862-1878  
  Keywords Task analysis;Training;Image quality;Visualization;Uncertainty;Labeling;Neural networks;Learning from rankings;image quality assessment;crowd counting;active learning  
  Abstract For many applications the collection of labeled data is expensive laborious. Exploitation of unlabeled data during training is thus a long pursued objective of machine learning. Self-supervised learning addresses this by positing an auxiliary task (different, but related to the supervised task) for which data is abundantly available. In this paper, we show how ranking can be used as a proxy task for some regression problems. As another contribution, we propose an efficient backpropagation technique for Siamese networks which prevents the redundant computation introduced by the multi-branch network architecture. We apply our framework to two regression problems: Image Quality Assessment (IQA) and Crowd Counting. For both we show how to automatically generate ranked image sets from unlabeled data. Our results show that networks trained to regress to the ground truth targets for labeled data and to simultaneously learn to rank unlabeled data obtain significantly better, state-of-the-art results for both IQA and crowd counting. In addition, we show that measuring network uncertainty on the self-supervised proxy task is a good measure of informativeness of unlabeled data. This can be used to drive an algorithm for active learning and we show that this reduces labeling effort by up to 50 percent.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes (up) LAMP; 600.109; 600.106; 600.120 Approved no  
  Call Number LWB2019 Serial 3267  
Permanent link to this record
 

 
Author Yaxing Wang; Luis Herranz; Joost Van de Weijer edit   pdf
url  doi
openurl 
  Title Mix and match networks: multi-domain alignment for unpaired image-to-image translation Type Journal Article
  Year 2020 Publication International Journal of Computer Vision Abbreviated Journal IJCV  
  Volume 128 Issue Pages 2849–2872  
  Keywords  
  Abstract This paper addresses the problem of inferring unseen cross-modal image-to-image translations between multiple modalities. We assume that only some of the pairwise translations have been seen (i.e. trained) and infer the remaining unseen translations (where training pairs are not available). We propose mix and match networks, an approach where multiple encoders and decoders are aligned in such a way that the desired translation can be obtained by simply cascading the source encoder and the target decoder, even when they have not interacted during the training stage (i.e. unseen). The main challenge lies in the alignment of the latent representations at the bottlenecks of encoder-decoder pairs. We propose an architecture with several tools to encourage alignment, including autoencoders and robust side information and latent consistency losses. We show the benefits of our approach in terms of effectiveness and scalability compared with other pairwise image-to-image translation approaches. We also propose zero-pair cross-modal image translation, a challenging setting where the objective is inferring semantic segmentation from depth (and vice-versa) without explicit segmentation-depth pairs, and only from two (disjoint) segmentation-RGB and depth-RGB training sets. We observe that a certain part of the shared information between unseen modalities might not be reachable, so we further propose a variant that leverages pseudo-pairs which allows us to exploit this shared information between the unseen modalities  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes (up) LAMP; 600.109; 600.106; 600.141; 600.120 Approved no  
  Call Number Admin @ si @ WHW2020 Serial 3424  
Permanent link to this record
 

 
Author Aymen Azaza; Joost Van de Weijer; Ali Douik; Marc Masana edit   pdf
url  openurl
  Title Context Proposals for Saliency Detection Type Journal Article
  Year 2018 Publication Computer Vision and Image Understanding Abbreviated Journal CVIU  
  Volume 174 Issue Pages 1-11  
  Keywords  
  Abstract One of the fundamental properties of a salient object region is its contrast
with the immediate context. The problem is that numerous object regions
exist which potentially can all be salient. One way to prevent an exhaustive
search over all object regions is by using object proposal algorithms. These
return a limited set of regions which are most likely to contain an object. Several saliency estimation methods have used object proposals. However, they focus on the saliency of the proposal only, and the importance of its immediate context has not been evaluated.
In this paper, we aim to improve salient object detection. Therefore, we extend object proposal methods with context proposals, which allow to incorporate the immediate context in the saliency computation. We propose several saliency features which are computed from the context proposals. In the experiments, we evaluate five object proposal methods for the task of saliency segmentation, and find that Multiscale Combinatorial Grouping outperforms the others. Furthermore, experiments show that the proposed context features improve performance, and that our method matches results on the FT datasets and obtains competitive results on three other datasets (PASCAL-S, MSRA-B and ECSSD).
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes (up) LAMP; 600.109; 600.109; 600.120 Approved no  
  Call Number Admin @ si @ AWD2018 Serial 3241  
Permanent link to this record
 

 
Author Maria Elena Meza de Luna; Juan Ramon Terven Salinas; Bogdan Raducanu; Joaquin Salas edit   pdf
url  openurl
  Title A Social-Aware Assistant to support individuals with visual impairments during social interaction: A systematic requirements analysis Type Journal Article
  Year 2019 Publication International Journal of Human-Computer Studies Abbreviated Journal IJHC  
  Volume 122 Issue Pages 50-60  
  Keywords  
  Abstract Visual impairment affects the normal course of activities in everyday life including mobility, education, employment, and social interaction. Most of the existing technical solutions devoted to empowering the visually impaired people are in the areas of navigation (obstacle avoidance), access to printed information and object recognition. Less effort has been dedicated so far in developing solutions to support social interactions. In this paper, we introduce a Social-Aware Assistant (SAA) that provides visually impaired people with cues to enhance their face-to-face conversations. The system consists of a perceptive component (represented by smartglasses with an embedded video camera) and a feedback component (represented by a haptic belt). When the vision system detects a head nodding, the belt vibrates, thus suggesting the user to replicate (mirror) the gesture. In our experiments, sighted persons interacted with blind people wearing the SAA. We instructed the former to mirror the noddings according to the vibratory signal, while the latter interacted naturally. After the face-to-face conversation, the participants had an interview to express their experience regarding the use of this new technological assistant. With the data collected during the experiment, we have assessed quantitatively and qualitatively the device usefulness and user satisfaction.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes (up) LAMP; 600.109; 600.120 Approved no  
  Call Number Admin @ si @ MTR2019 Serial 3142  
Permanent link to this record
Select All    Deselect All
 |   | 
Details

Save Citations:
Export Records: