toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
  Records Links
Author Mikhail Mozerov; Fei Yang; Joost Van de Weijer edit   pdf
doi  openurl
  Title Sparse Data Interpolation Using the Geodesic Distance Affinity Space Type Journal Article
  Year 2019 Publication IEEE Signal Processing Letters Abbreviated Journal SPL  
  Volume (up) 26 Issue 6 Pages 943 - 947  
  Keywords  
  Abstract In this letter, we adapt the geodesic distance-based recursive filter to the sparse data interpolation problem. The proposed technique is general and can be easily applied to any kind of sparse data. We demonstrate its superiority over other interpolation techniques in three experiments for qualitative and quantitative evaluation. In addition, we compare our method with the popular interpolation algorithm presented in the paper on EpicFlow optical flow, which is intuitively motivated by a similar geodesic distance principle. The comparison shows that our algorithm is more accurate and considerably faster than the EpicFlow interpolation technique.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; 600.120 Approved no  
  Call Number Admin @ si @ MYW2019 Serial 3261  
Permanent link to this record
 

 
Author Fei Yang; Luis Herranz; Joost Van de Weijer; Jose Antonio Iglesias; Antonio Lopez; Mikhail Mozerov edit   pdf
url  doi
openurl 
  Title Variable Rate Deep Image Compression with Modulated Autoencoder Type Journal Article
  Year 2020 Publication IEEE Signal Processing Letters Abbreviated Journal SPL  
  Volume (up) 27 Issue Pages 331-335  
  Keywords  
  Abstract Variable rate is a requirement for flexible and adaptable image and video compression. However, deep image compression methods (DIC) are optimized for a single fixed rate-distortion (R-D) tradeoff. While this can be addressed by training multiple models for different tradeoffs, the memory requirements increase proportionally to the number of models. Scaling the bottleneck representation of a shared autoencoder can provide variable rate compression with a single shared autoencoder. However, the R-D performance using this simple mechanism degrades in low bitrates, and also shrinks the effective range of bitrates. To address these limitations, we formulate the problem of variable R-D optimization for DIC, and propose modulated autoencoders (MAEs), where the representations of a shared autoencoder are adapted to the specific R-D tradeoff via a modulation network. Jointly training this modulated autoencoder and the modulation network provides an effective way to navigate the R-D operational curve. Our experiments show that the proposed method can achieve almost the same R-D performance of independent models with significantly fewer parameters.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; ADAS; 600.141; 600.120; 600.118 Approved no  
  Call Number Admin @ si @ YHW2020 Serial 3346  
Permanent link to this record
 

 
Author Mikhail Mozerov; Joost Van de Weijer edit   pdf
doi  openurl
  Title One-view occlusion detection for stereo matching with a fully connected CRF model Type Journal Article
  Year 2019 Publication IEEE Transactions on Image Processing Abbreviated Journal TIP  
  Volume (up) 28 Issue 6 Pages 2936-2947  
  Keywords Stereo matching; energy minimization; fully connected MRF model; geodesic distance filter  
  Abstract In this paper, we extend the standard belief propagation (BP) sequential technique proposed in the tree-reweighted sequential method [15] to the fully connected CRF models with the geodesic distance affinity. The proposed method has been applied to the stereo matching problem. Also a new approach to the BP marginal solution is proposed that we call one-view occlusion detection (OVOD). In contrast to the standard winner takes all (WTA) estimation, the proposed OVOD solution allows to find occluded regions in the disparity map and simultaneously improve the matching result. As a result we can perform only
one energy minimization process and avoid the cost calculation for the second view and the left-right check procedure. We show that the OVOD approach considerably improves results for cost augmentation and energy minimization techniques in comparison with the standard one-view affinity space implementation. We apply our method to the Middlebury data set and reach state-ofthe-art especially for median, average and mean squared error metrics.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; 600.098; 600.109; 602.133; 600.120 Approved no  
  Call Number Admin @ si @ MoW2019 Serial 3221  
Permanent link to this record
 

 
Author Lichao Zhang; Abel Gonzalez-Garcia; Joost Van de Weijer; Martin Danelljan; Fahad Shahbaz Khan edit   pdf
doi  openurl
  Title Synthetic Data Generation for End-to-End Thermal Infrared Tracking Type Journal Article
  Year 2019 Publication IEEE Transactions on Image Processing Abbreviated Journal TIP  
  Volume (up) 28 Issue 4 Pages 1837 - 1850  
  Keywords  
  Abstract The usage of both off-the-shelf and end-to-end trained deep networks have significantly improved the performance of visual tracking on RGB videos. However, the lack of large labeled datasets hampers the usage of convolutional neural networks for tracking in thermal infrared (TIR) images. Therefore, most state-of-the-art methods on tracking for TIR data are still based on handcrafted features. To address this problem, we propose to use image-to-image translation models. These models allow us to translate the abundantly available labeled RGB data to synthetic TIR data. We explore both the usage of paired and unpaired image translation models for this purpose. These methods provide us with a large labeled dataset of synthetic TIR sequences, on which we can train end-to-end optimal features for tracking. To the best of our knowledge, we are the first to train end-to-end features for TIR tracking. We perform extensive experiments on the VOT-TIR2017 dataset. We show that a network trained on a large dataset of synthetic TIR data obtains better performance than one trained on the available real TIR data. Combining both data sources leads to further improvement. In addition, when we combine the network with motion features, we outperform the state of the art with a relative gain of over 10%, clearly showing the efficiency of using synthetic data to train end-to-end TIR trackers.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; 600.141; 600.120 Approved no  
  Call Number Admin @ si @ YGW2019 Serial 3228  
Permanent link to this record
 

 
Author Xinhang Song; Shuqiang Jiang; Luis Herranz; Chengpeng Chen edit   pdf
url  doi
openurl 
  Title Learning Effective RGB-D Representations for Scene Recognition Type Journal Article
  Year 2019 Publication IEEE Transactions on Image Processing Abbreviated Journal TIP  
  Volume (up) 28 Issue 2 Pages 980-993  
  Keywords  
  Abstract Deep convolutional networks can achieve impressive results on RGB scene recognition thanks to large data sets such as places. In contrast, RGB-D scene recognition is still underdeveloped in comparison, due to two limitations of RGB-D data we address in this paper. The first limitation is the lack of depth data for training deep learning models. Rather than fine tuning or transferring RGB-specific features, we address this limitation by proposing an architecture and a two-step training approach that directly learns effective depth-specific features using weak supervision via patches. The resulting RGB-D model also benefits from more complementary multimodal features. Another limitation is the short range of depth sensors (typically 0.5 m to 5.5 m), resulting in depth images not capturing distant objects in the scenes that RGB images can. We show that this limitation can be addressed by using RGB-D videos, where more comprehensive depth information is accumulated as the camera travels across the scenes. Focusing on this scenario, we introduce the ISIA RGB-D video data set to evaluate RGB-D scene recognition with videos. Our video recognition architecture combines convolutional and recurrent neural networks that are trained in three steps with increasingly complex data to learn effective features (i.e., patches, frames, and sequences). Our approach obtains the state-of-the-art performances on RGB-D image (NYUD2 and SUN RGB-D) and video (ISIA RGB-D) scene recognition.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; 600.141; 600.120 Approved no  
  Call Number Admin @ si @ SJH2019 Serial 3247  
Permanent link to this record
Select All    Deselect All
 |   | 
Details

Save Citations:
Export Records: