Author | Marc Masana; Joost Van de Weijer; Luis Herranz; Andrew Bagdanov; Jose Manuel Alvarez | |||||
Title | Domain-adaptive deep network compression | Type | Conference Article | |||
Year | 2017 | Publication | 16th IEEE International Conference on Computer Vision | Abbreviated Journal | ||
Volume | Issue | Pages | ||||
Keywords | ||||||
Abstract | Deep Neural Networks trained on large datasets can be easily transferred to new domains with far fewer labeled examples by a process called fine-tuning. This has the advantage that representations learned in the large source domain can be exploited on smaller target domains. However, networks designed to be optimal for the source task are often prohibitively large for the target task. In this work we address the compression of networks after domain transfer. We focus on compression algorithms based on low-rank matrix decomposition. Existing methods base compression solely on learned network weights and ignore the statistics of network activations. We show that domain transfer leads to large shifts in network activations and that it is desirable to take this into account when compressing. We demonstrate that considering activation statistics when compressing weights leads to a rank-constrained regression problem with a closed-form solution. Because our method takes the target domain into account, it can better remove the redundancy in the weights. Experiments show that our Domain Adaptive Low Rank (DALR) method significantly outperforms existing low-rank compression techniques. With our approach, the fc6 layer of VGG19 can be compressed more than 4x further than with truncated SVD alone, with only a minor or no loss in accuracy. When applied to domain-transferred networks it allows for compression down to only 5-20% of the original number of parameters with only a minor drop in performance. | |||||
Address | Venice; Italy; October 2017 | |||||
Corporate Author | Thesis | |||||
Publisher | Place of Publication | Editor | ||||
Language | Summary Language | Original Title | ||||
Series Editor | Series Title | Abbreviated Series Title | ||||
Series Volume | Series Issue | Edition | ||||
ISSN | ISBN | Medium | ||||
Area | Expedition | Conference | ICCV | |||
Notes | LAMP; 601.305; 600.106; 600.120;CIC | Approved | no | |||
Call Number | Admin @ si @ | Serial | 3034 | |||
Permanent link to this record | ||||||
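
A minimal sketch of the activation-aware factorization described in the DALR record above. This is not the authors' released code; NumPy, the variable names and the toy dimensions are assumptions. The closed form for minimizing ||WX − ABX||_F over rank-k factorizations projects the layer's responses Y = WX on target-domain activations onto their top-k left singular vectors:

```python
import numpy as np

def dalr_compress(W, X, rank):
    """Rank-constrained factorization of a fully connected layer.

    W: (out_dim, in_dim) weight matrix.
    X: (in_dim, n_samples) activations entering the layer on the *target* domain.
    rank: target rank k.

    Returns A (out_dim, k) and B (k, in_dim) minimizing ||W X - A B X||_F
    over rank-k factorizations (reduced-rank regression).
    """
    Y = W @ X                                   # layer responses on target data
    U, _, _ = np.linalg.svd(Y, full_matrices=False)
    A = U[:, :rank]                             # top-k left singular vectors of Y
    B = A.T @ W                                 # projection of W onto that subspace
    return A, B

# Toy usage: compress a 512 -> 1024 layer to rank 64 using 2000 activations.
rng = np.random.default_rng(0)
W = rng.standard_normal((1024, 512))
X = rng.standard_normal((512, 2000))
A, B = dalr_compress(W, X, 64)
err = np.linalg.norm(W @ X - A @ B @ X) / np.linalg.norm(W @ X)
print(f"relative response error at rank 64: {err:.3f}")
```

Replacing the layer W with the two smaller layers B and A cuts its parameters from out·in to k·(out + in), and the approximation error is measured on the responses the target domain actually produces rather than on the weights alone.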
Author | Xialei Liu; Joost Van de Weijer; Andrew Bagdanov | |||||
Title | RankIQA: Learning from Rankings for No-reference Image Quality Assessment | Type | Conference Article | |||
Year | 2017 | Publication | 16th IEEE International Conference on Computer Vision | Abbreviated Journal | ||
Volume | Issue | Pages | ||||
Keywords | ||||||
Abstract | We propose a no-reference image quality assessment (NR-IQA) approach that learns from rankings (RankIQA). To address the problem of limited IQA dataset size, we train a Siamese Network to rank images in terms of image quality by using synthetically generated distortions for which relative image quality is known. These ranked image sets can be automatically generated without laborious human labeling. We then use fine-tuning to transfer the knowledge represented in the trained Siamese Network to a traditional CNN that estimates absolute image quality from single images. We demonstrate how our approach can be made significantly more efficient than traditional Siamese Networks by forward propagating a batch of images through a single network and backpropagating gradients derived from all pairs of images in the batch. Experiments on the TID2013 benchmark show that we improve the state-of-the-art by over 5%. Furthermore, on the LIVE benchmark we show that our approach is superior to existing NR-IQA techniques and that we even outperform state-of-the-art full-reference IQA (FR-IQA) methods without having to resort to high-quality reference images to infer IQA. | |||||
Address | Venice; Italy; October 2017 | |||||
Corporate Author | Thesis | |||||
Publisher | Place of Publication | Editor | ||||
Language | Summary Language | Original Title | ||||
Series Editor | Series Title | Abbreviated Series Title | ||||
Series Volume | Series Issue | Edition | ||||
ISSN | ISBN | Medium | ||||
Area | Expedition | Conference | ICCV | |||
Notes | LAMP; 600.106; 600.109; 600.120;CIC | Approved | no | |||
Call Number | Admin @ si @ LWB2017b | Serial | 3036 | |||
Permanent link to this record | ||||||
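
The batch-efficiency trick in the RankIQA record above (one forward pass, gradients from all pairs) can be sketched as follows. This is a hedged illustration, not the paper's implementation; PyTorch, the margin value and the `levels` convention (lower distortion level = higher quality) are assumptions.

```python
import torch
import torch.nn.functional as F

def batch_ranking_loss(scores, levels, margin=1.0):
    """Hinge ranking loss over all valid pairs in one batch.

    scores: (B,) predicted quality scores from a single forward pass.
    levels: (B,) integer distortion levels; a lower level means higher
            quality, so scores[i] should exceed scores[j] when
            levels[i] < levels[j].
    """
    s_i = scores.unsqueeze(1)                           # (B, 1)
    s_j = scores.unsqueeze(0)                           # (1, B)
    valid = levels.unsqueeze(1) < levels.unsqueeze(0)   # pairs where i is less distorted
    losses = F.relu(margin - (s_i - s_j))               # want s_i > s_j by a margin
    return losses[valid].mean()

# Toy usage: 8 images at 4 distortion levels, scored by a dummy network output.
scores = torch.randn(8, requires_grad=True)
levels = torch.tensor([0, 0, 1, 1, 2, 2, 3, 3])
loss = batch_ranking_loss(scores, levels)
loss.backward()   # gradients flow to all pairs from one forward pass
```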
Author | Rada Deeb; Damien Muselet; Mathieu Hebert; Alain Tremeau; Joost Van de Weijer | |||||
Title | 3D color charts for camera spectral sensitivity estimation | Type | Conference Article | |||
Year | 2017 | Publication | 28th British Machine Vision Conference | Abbreviated Journal | ||
Volume | Issue | Pages | ||||
Keywords | ||||||
Abstract | Estimating spectral data such as camera sensor responses or illuminant spectral power distributions from raw RGB camera outputs is crucial in many computer vision applications. Usually, 2D color charts with various patches of known spectral reflectance are used as a reference for this purpose. Deducing n-D spectral data (n ≫ 3) from 3D RGB inputs is an ill-posed problem that requires a large number of inputs. Unfortunately, most natural color surfaces have spectral reflectances that are well described by low-dimensional linear models, i.e. each spectral reflectance can be approximated by a weighted sum of the others. It has been shown that adding patches to color charts does not help in practice, because the information they add is redundant with the information provided by the first set of patches. In this paper, we propose to use spectral data of higher dimensionality by using 3D color charts that create inter-reflections between the surfaces. These inter-reflections produce products of natural spectral curves and so provide non-linear spectral curves. We show that such data provide enough information for accurate spectral data estimation. | |||||
Address | London; September 2017 | |||||
Corporate Author | Thesis | |||||
Publisher | Place of Publication | Editor | ||||
Language | Summary Language | Original Title | ||||
Series Editor | Series Title | Abbreviated Series Title | ||||
Series Volume | Series Issue | Edition | ||||
ISSN | ISBN | Medium | ||||
Area | Expedition | Conference | BMVC | |||
Notes | LAMP; 600.109; 600.120;CIC | Approved | no | |||
Call Number | Admin @ si @ DMH2017b | Serial | 3037 | |||
Permanent link to this record | ||||||
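
For context, the chart-based estimation problem in the record above is a linear inverse problem. The sketch below shows the generic regularized least-squares estimator such methods build on; the paper's actual contribution, 3D charts whose inter-reflections raise the effective rank of the system, is not reproduced here, and all names and the Tikhonov regularizer are assumptions.

```python
import numpy as np

def estimate_sensitivities(reflectances, illuminant, responses, lam=1e-3):
    """Least-squares camera spectral sensitivity estimation.

    reflectances: (n_patches, n_wavelengths) known patch reflectances.
    illuminant:   (n_wavelengths,) spectral power distribution.
    responses:    (n_patches, 3) raw RGB camera outputs for the patches.
    lam:          Tikhonov regularization weight (the problem is ill-posed).

    Returns (n_wavelengths, 3) estimated RGB sensitivities.
    """
    A = reflectances * illuminant[None, :]        # radiance reaching the sensor
    n = A.shape[1]
    # Regularized normal equations: (A^T A + lam * I) S = A^T C
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ responses)

# Toy usage with synthetic data over 31 wavelength samples.
rng = np.random.default_rng(0)
R = rng.uniform(0.0, 1.0, size=(24, 31))     # 24 patches
E = rng.uniform(0.5, 1.5, size=31)           # illuminant SPD
S_true = rng.uniform(0.0, 1.0, size=(31, 3))
C = (R * E[None, :]) @ S_true                # ideal camera responses
S_hat = estimate_sensitivities(R, E, C)
```

With a flat 2D chart the rows of `A` span a low-dimensional subspace, so the system stays badly conditioned no matter how many patches are added; inter-reflections contribute products of reflectance spectra, which enlarge that subspace.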
Author | Muhammad Anwer Rao; Fahad Shahbaz Khan; Joost Van de Weijer; Jorma Laaksonen | |||||
Title | Tex-Nets: Binary Patterns Encoded Convolutional Neural Networks for Texture Recognition | Type | Conference Article | |||
Year | 2017 | Publication | 19th International Conference on Multimodal Interaction | Abbreviated Journal | ||
Volume | Issue | Pages | ||||
Keywords | Convolutional Neural Networks; Texture Recognition; Local Binary Patterns | |||||
Abstract | Recognizing materials and textures in realistic imaging conditions is a challenging computer vision problem. For many years, orderless representations based on local features were a dominant approach for texture recognition. Recently, deep local features extracted from the intermediate layers of a Convolutional Neural Network (CNN) have been used as filter banks. These dense local descriptors from a deep model, when encoded with Fisher Vectors, have been shown to provide excellent results for texture recognition. The CNN models employed in such approaches take RGB patches as input and are trained on large amounts of labeled images. We show that CNN models, which we call TEX-Nets, trained using mapped coded images with explicit texture information provide complementary information to the standard deep models trained on RGB patches. We further investigate two deep architectures, namely early and late fusion, to combine the texture and color information. Experiments on benchmark texture datasets clearly demonstrate that TEX-Nets provide complementary information to the standard RGB deep network. Our approach provides large gains of 4.8%, 3.5%, 2.6% and 4.1% in accuracy on the DTD, KTH-TIPS-2a, KTH-TIPS-2b and Texture-10 datasets, respectively, compared to the standard RGB network of the same architecture. Further, our final combination leads to consistent improvements over the state-of-the-art on all four datasets. | |||||
Address | Glasgow; Scotland; November 2017 | |||||
Corporate Author | Thesis | |||||
Publisher | Place of Publication | Editor | ||||
Language | Summary Language | Original Title | ||||
Series Editor | Series Title | Abbreviated Series Title | ||||
Series Volume | Series Issue | Edition | ||||
ISSN | ISBN | Medium | ||||
Area | Expedition | Conference | ACM ICMI | |||
Notes | LAMP; 600.109; 600.068; 600.120;CIC | Approved | no | |||
Call Number | Admin @ si @ RKW2017 | Serial | 3038 | |||
Permanent link to this record | ||||||
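
A minimal sketch of the kind of "mapped coded image" the TEX-Nets record above refers to: a basic 8-neighbour Local Binary Pattern code map computed from a grayscale image. The exact texture coding and mapping used in the paper may differ; NumPy and the neighbour ordering are assumptions.

```python
import numpy as np

def lbp_map(gray):
    """Basic 8-neighbour Local Binary Pattern codes for a 2D image.

    gray: (H, W) uint8 or float array. Returns (H-2, W-2) uint8 codes,
    one per interior pixel.
    """
    c = gray[1:-1, 1:-1]                       # centre pixels
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros(c.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = gray[1 + dy: gray.shape[0] - 1 + dy,
                         1 + dx: gray.shape[1] - 1 + dx]
        code |= (neighbour >= c).astype(np.uint8) << bit   # set bit if neighbour >= centre
    return code

# Toy usage on a random grayscale image.
img = np.random.default_rng(0).uniform(0, 255, (64, 64)).astype(np.uint8)
codes = lbp_map(img)          # values in [0, 255]
```

In a late-fusion setup, one CNN stream would be trained on RGB patches and another on such coded maps, with their features combined before the classifier.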
Author | Ivet Rafegas | |||||
Title | Color in Visual Recognition: from flat to deep representations and some biological parallelisms | Type | Book Whole | |||
Year | 2017 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | ||
Volume | Issue | Pages | ||||
Keywords | ||||||
Abstract | Visual recognition is one of the main problems in computer vision that attempts to solve image understanding by deciding what objects are in images. This problem can be computationally solved by using relevant sets of visual features, such as edges, corners, color or more complex object parts. This thesis contributes to how color features should be represented for recognition tasks. Image features can be extracted following two different approaches. The first approach defines handcrafted image descriptors and then applies a learning scheme to classify the content (named flat schemes in Kruger et al. (2013)). In this approach, perceptual considerations are habitually used to define efficient color features. Here we propose a new flat color descriptor based on the extension of color channels to boost the representation of spatio-chromatic contrast that surpasses state-of-the-art approaches. However, flat schemes lack the generality of biological systems. The second approach evolves these flat schemes into a hierarchical process, like in the visual cortex, which includes an automatic process to learn optimal features. These deep schemes, and more specifically Convolutional Neural Networks (CNNs), have shown impressive performance on various vision problems. However, the internal representations obtained through automatic learning remain poorly understood. In this thesis we propose a new methodology to explore the internal representation of trained CNNs by defining the Neuron Feature as a visualization of the intrinsic features encoded in each individual neuron. Additionally, and inspired by physiological techniques, we propose to compute different neuron selectivity indexes (e.g., color, class, orientation or symmetry, amongst others) to label and classify the full CNN neuron population in order to understand learned representations. Finally, using the proposed methodology, we present an in-depth study of how color is represented in a specific CNN, trained for object recognition, that competes with primate representational abilities (Cadieu et al. (2014)). We found several parallelisms with biological visual systems: (a) a significant number of color-selective neurons throughout all the layers; (b) an opponent and low-frequency representation of color-oriented edges and a higher sampling of frequency selectivity in brightness than in color in the first layer, as in V1; (c) a higher sampling of color hue in the second layer, aligned to observed hue maps in V2; (d) a strong color and shape entanglement in all layers, from basic features in shallower layers (V1 and V2) to object and background shapes in deeper layers (V4 and IT); and (e) a strong correlation between neuron color selectivities and color dataset bias. | |||||
Address | November 2017 | |||||
Corporate Author | Thesis | Ph.D. thesis | ||||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | Maria Vanrell | ||
Language | Summary Language | Original Title | ||||
Series Editor | Series Title | Abbreviated Series Title | ||||
Series Volume | Series Issue | Edition | ||||
ISSN | ISBN | 978-84-945373-7-0 | Medium | |||
Area | Expedition | Conference | ||||
Notes | CIC | Approved | no | |||
Call Number | Admin @ si @ Raf2017 | Serial | 3100 | |||
Permanent link to this record | ||||||
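
As a toy illustration of the neuron selectivity indexes mentioned in the thesis record above, one plausible color selectivity score compares a neuron's activations on its top-scoring images with its activations on grayscale versions of the same images. This is a hypothetical formulation for illustration, not the index defined in the thesis; PyTorch and both argument names are assumptions.

```python
import torch

def color_selectivity(acts_color, acts_gray, eps=1e-8):
    """Toy color selectivity index for a single neuron (hypothetical).

    acts_color: (N,) activations on the neuron's top-N images.
    acts_gray:  (N,) activations on grayscale versions of the same images.
    Returns a score in [0, 1]: 0 = indifferent to color, 1 = fully color-driven.
    """
    a_color = acts_color.clamp(min=0).sum()
    a_gray = acts_gray.clamp(min=0).sum()
    return max(0.0, float(1.0 - a_gray / (a_color + eps)))
```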
Author | Hassan Ahmed Sial; S. Sancho; Ramon Baldrich; Robert Benavente; Maria Vanrell | |||||
Title | Color-based data augmentation for Reflectance Estimation | Type | Conference Article | |||
Year | 2018 | Publication | 26th Color Imaging Conference | Abbreviated Journal | ||
Volume | Issue | Pages | 284-289 | |||
Keywords | ||||||
Abstract | Deep convolutional architectures have been shown to be successful frameworks for solving generic computer vision problems. Estimating intrinsic reflectance from a single image, however, is not a solved problem yet. Encoder-decoder architectures are a natural approach for pixel-wise reflectance estimation, although they usually suffer from the lack of large datasets. Lack of data can be partially solved with data augmentation, but the usual techniques focus on geometric changes, which do not help reflectance estimation. In this paper we propose a color-based data augmentation technique that extends the training data by increasing the variability of chromaticity. Rotations on the red-green/blue-yellow plane of an opponent color space extend the training set in a coherent and sound way that improves the network's generalization capability for reflectance estimation. We perform experiments on the Sintel dataset showing that our color-based augmentation increases performance and outperforms one of the state-of-the-art methods. | |||||
Address | Vancouver; November 2018 | |||||
Corporate Author | Thesis | |||||
Publisher | Place of Publication | Editor | ||||
Language | Summary Language | Original Title | ||||
Series Editor | Series Title | Abbreviated Series Title | ||||
Series Volume | Series Issue | Edition | ||||
ISSN | ISBN | Medium | ||||
Area | Expedition | Conference | CIC | |||
Notes | CIC | Approved | no | |||
Call Number | Admin @ si @ SSB2018a | Serial | 3129 | |||
Permanent link to this record | ||||||
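
A minimal sketch of the chromaticity rotation described in the record above, assuming the standard orthonormal opponent transform (one common choice; the paper's exact transform may differ) and NumPy:

```python
import numpy as np

# Orthonormal RGB -> opponent transform: O1 = red-green, O2 = blue-yellow,
# O3 = intensity. Rows are orthonormal, so the inverse is the transpose.
RGB2OPP = np.array([
    [1 / np.sqrt(2), -1 / np.sqrt(2), 0],
    [1 / np.sqrt(6), 1 / np.sqrt(6), -2 / np.sqrt(6)],
    [1 / np.sqrt(3), 1 / np.sqrt(3), 1 / np.sqrt(3)],
])

def chromatic_rotation(img, theta):
    """Rotate the chromatic (O1, O2) plane by `theta` radians, keeping intensity.

    img: (H, W, 3) float RGB image. Returns the augmented RGB image.
    """
    opp = img.reshape(-1, 3) @ RGB2OPP.T          # to opponent space
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])
    opp = opp @ rot.T                             # rotate chromatic plane only
    out = opp @ RGB2OPP                           # inverse of orthonormal = transpose
    return out.reshape(img.shape)

# Toy usage: rotate chromaticity by 30 degrees and clip back into gamut.
img = np.random.default_rng(0).uniform(0, 1, (8, 8, 3))
aug = np.clip(chromatic_rotation(img, theta=np.pi / 6), 0.0, 1.0)
```

Because the transform is orthonormal and the rotation leaves the intensity axis O3 untouched, shading and geometry are preserved while chromaticity varies, which is exactly what reflectance estimation needs from its augmented samples.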
Author | Yaxing Wang; Joost Van de Weijer; Luis Herranz | |||||
Title | Mix and match networks: encoder-decoder alignment for zero-pair image translation | Type | Conference Article | |||
Year | 2018 | Publication | 31st IEEE Conference on Computer Vision and Pattern Recognition | Abbreviated Journal | ||
Volume | Issue | Pages | 5467 - 5476 | |||
Keywords | ||||||
Abstract | We address the problem of image translation between domains or modalities for which no direct paired data is available (i.e. zero-pair translation). We propose mix and match networks, based on multiple encoders and decoders aligned in such a way that other encoder-decoder pairs can be composed at test time to perform unseen image translation tasks between domains or modalities for which explicit paired samples were not seen during training. We study the impact of autoencoders, side information and losses in improving the alignment and transferability of trained pairwise translation models to unseen translations. We show our approach is scalable and can perform colorization and style transfer between unseen combinations of domains. We evaluate our system in a challenging cross-modal setting where semantic segmentation is estimated from depth images, without explicit access to any depth-semantic segmentation training pairs. Our model outperforms baselines based on pix2pix and CycleGAN models. | |||||
Address | Salt Lake City; USA; June 2018 | |||||
Corporate Author | Thesis | |||||
Publisher | Place of Publication | Editor | ||||
Language | Summary Language | Original Title | ||||
Series Editor | Series Title | Abbreviated Series Title | ||||
Series Volume | Series Issue | Edition | ||||
ISSN | ISBN | Medium | ||||
Area | Expedition | Conference | CVPR | |||
Notes | LAMP; 600.109; 600.106; 600.120;CIC | Approved | no | |||
Call Number | Admin @ si @ WWH2018b | Serial | 3131 | |||
Permanent link to this record | ||||||
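
The test-time composition idea from the record above in a toy sketch. The encoders and decoders here are single convolutions only to keep the example runnable; the paper's architectures, training losses, domains and channel counts are stand-ins.

```python
import torch
import torch.nn as nn

# Toy per-domain encoders/decoders sharing a 64-channel latent space.
# In the paper the alignment comes from training; the point here is only
# the test-time composition of pairs never trained together.
domains = ["rgb", "depth", "semseg"]
channels = {"rgb": 3, "depth": 1, "semseg": 8}
encoders = {d: nn.Conv2d(channels[d], 64, 3, padding=1) for d in domains}
decoders = {d: nn.Conv2d(64, channels[d], 3, padding=1) for d in domains}

def translate(x, src, tgt):
    """Zero-pair translation: compose the src encoder with the tgt decoder."""
    return decoders[tgt](encoders[src](x))

x = torch.randn(1, 3, 32, 32)           # an RGB input
y = translate(x, "rgb", "semseg")       # a pair possibly never seen in training
print(y.shape)                          # torch.Size([1, 8, 32, 32])
```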
Author | Adrian Galdran; Aitor Alvarez-Gila; Alessandro Bria; Javier Vazquez; Marcelo Bertalmio | |||||
Title | On the Duality Between Retinex and Image Dehazing | Type | Conference Article | |||
Year | 2018 | Publication | 31st IEEE Conference on Computer Vision and Pattern Recognition | Abbreviated Journal | ||
Volume | Issue | Pages | 8212–8221 | |||
Keywords | Image color analysis; Task analysis; Atmospheric modeling; Computer vision; Computational modeling; Lighting | |||||
Abstract | Image dehazing deals with the removal of the undesired loss of visibility in outdoor images due to the presence of fog. Retinex is a color vision model mimicking the ability of the Human Visual System to robustly discount varying illuminations when observing a scene under different spectral lighting conditions. Retinex has been widely explored in the computer vision literature for image enhancement and other related tasks. While these two problems are apparently unrelated, the goal of this work is to show that they can be connected by a simple linear relationship. Specifically, most Retinex-based algorithms have the characteristic feature of always increasing image brightness, which turns them into ideal candidates for effective image dehazing by directly applying Retinex to a hazy image whose intensities have been inverted. In this paper, we give a theoretical proof that Retinex on inverted intensities is a solution to the image dehazing problem. Comprehensive qualitative and quantitative results indicate that several classical and modern implementations of Retinex can be transformed into competing image dehazing algorithms performing on par with more complex fog removal methods, and can overcome some of the main challenges associated with this problem. | |||||
Address | Salt Lake City; USA; June 2018 | |||||
Corporate Author | Thesis | |||||
Publisher | Place of Publication | Editor | ||||
Language | Summary Language | Original Title | ||||
Series Editor | Series Title | Abbreviated Series Title | ||||
Series Volume | Series Issue | Edition | ||||
ISSN | ISBN | Medium | ||||
Area | Expedition | Conference | CVPR | |||
Notes | LAMP; 600.120;CIC | Approved | no | |||
Call Number | Admin @ si @ GAB2018 | Serial | 3146 | |||
Permanent link to this record | ||||||
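
The duality in the record above reduces to a three-line recipe: invert the hazy image, run a Retinex algorithm, invert back. The sketch below uses a basic single-scale Retinex, one classical implementation among the several the paper evaluates; SciPy, the surround scale `sigma` and the normalization are assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def single_scale_retinex(img, sigma=40.0, eps=1e-6):
    """Single-scale Retinex: log image minus log of its Gaussian surround."""
    surround = gaussian_filter(img, sigma=(sigma, sigma, 0))  # blur spatially, not across channels
    out = np.log(img + eps) - np.log(surround + eps)
    out -= out.min()
    return out / (out.max() + eps)          # rescale to [0, 1]

def dehaze(img, sigma=40.0):
    """Dehazing via the duality: Retinex applied to the inverted image.

    img: (H, W, 3) hazy image with values in [0, 1].
    """
    return 1.0 - single_scale_retinex(1.0 - img, sigma=sigma)

# Toy usage on a synthetic bright (haze-like) image.
hazy = np.random.default_rng(0).uniform(0.3, 1.0, (64, 64, 3))
clear = np.clip(dehaze(hazy), 0.0, 1.0)
```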
Author | Abel Gonzalez-Garcia; Joost Van de Weijer; Yoshua Bengio | |||||
Title | Image-to-image translation for cross-domain disentanglement | Type | Conference Article | |||
Year | 2018 | Publication | 32nd Annual Conference on Neural Information Processing Systems | Abbreviated Journal | ||
Volume | Issue | Pages | ||||
Keywords | ||||||
Abstract | ||||||
Address | Montreal; Canada; December 2018 | |||||
Corporate Author | Thesis | |||||
Publisher | Place of Publication | Editor | ||||
Language | Summary Language | Original Title | ||||
Series Editor | Series Title | Abbreviated Series Title | ||||
Series Volume | Series Issue | Edition | ||||
ISSN | ISBN | Medium | ||||
Area | Expedition | Conference | NIPS | |||
Notes | LAMP; 600.120;CIC | Approved | no | |||
Call Number | Admin @ si @ GWB2018 | Serial | 3155 | |||
Permanent link to this record | ||||||
Author | Marc Masana; Idoia Ruiz; Joan Serrat; Joost Van de Weijer; Antonio Lopez | |||||
Title | Metric Learning for Novelty and Anomaly Detection | Type | Conference Article | |||
Year | 2018 | Publication | 29th British Machine Vision Conference | Abbreviated Journal | ||
Volume | Issue | Pages | ||||
Keywords | ||||||
Abstract | When neural networks process images which do not resemble the distribution seen during training, so-called out-of-distribution images, they often make wrong predictions, and do so too confidently. The capability to detect out-of-distribution images is therefore crucial for many real-world applications. We divide out-of-distribution detection between novelty detection (images of classes which are not in the training set but are related to those) and anomaly detection (images of classes which are unrelated to the training set). By related we mean they contain the same type of objects, like digits in MNIST and SVHN. Most existing work has focused on anomaly detection, and has addressed this problem considering networks trained with the cross-entropy loss. In contrast, we propose to use metric learning, which does not have the drawback of the softmax layer (inherent to cross-entropy methods) that forces the network to divide its prediction power over the learned classes. We perform extensive experiments and evaluate both novelty and anomaly detection, even in a relevant application such as traffic sign recognition, obtaining comparable or better results than previous works. | |||||
Address | Newcastle; UK; September 2018 | |||||
Corporate Author | Thesis | |||||
Publisher | Place of Publication | Editor | ||||
Language | Summary Language | Original Title | ||||
Series Editor | Series Title | Abbreviated Series Title | ||||
Series Volume | Series Issue | Edition | ||||
ISSN | ISBN | Medium | ||||
Area | Expedition | Conference | BMVC | |||
Notes | LAMP; ADAS; 601.305; 600.124; 600.106; 602.200; 600.120; 600.118;CIC | Approved | no | |||
Call Number | Admin @ si @ MRS2018 | Serial | 3156 | |||
Permanent link to this record | ||||||
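
The scoring side of the approach in the record above can be sketched as distance to the nearest class centroid in the learned embedding space; a large distance flags a novel or anomalous input. How the embedding itself is trained (the paper's metric-learning loss) is not shown here, and the centroid-based scoring is an illustrative choice.

```python
import torch

def class_centroids(embeddings, labels, n_classes):
    """Mean embedding per training class."""
    centroids = torch.zeros(n_classes, embeddings.shape[1])
    for c in range(n_classes):
        centroids[c] = embeddings[labels == c].mean(dim=0)
    return centroids

def ood_score(embedding, centroids):
    """Distance to the nearest class centroid; large = likely out-of-distribution."""
    return torch.cdist(embedding.unsqueeze(0), centroids).min().item()

# Toy usage with random embeddings for 5 in-distribution classes.
emb = torch.randn(100, 32)
labels = torch.randint(0, 5, (100,))
centroids = class_centroids(emb, labels, n_classes=5)
print(ood_score(torch.randn(32), centroids))
```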
Author | Marco Buzzelli; Joost Van de Weijer; Raimondo Schettini | |||||
Title | Learning Illuminant Estimation from Object Recognition | Type | Conference Article | |||
Year | 2018 | Publication | 25th International Conference on Image Processing | Abbreviated Journal | ||
Volume | Issue | Pages | 3234 - 3238 | |||
Keywords | Illuminant estimation; computational color constancy; semi-supervised learning; deep learning; convolutional neural networks | |||||
Abstract | In this paper we present a deep learning method to estimate the illuminant of an image. Our model is not trained with illuminant annotations, but with the objective of improving performance on an auxiliary task such as object recognition. To the best of our knowledge, this is the first example of a deep learning architecture for illuminant estimation that is trained without ground-truth illuminants. We evaluate our solution on standard datasets for color constancy, and compare it with state-of-the-art methods. Our proposal is shown to outperform most deep learning methods in a cross-dataset evaluation setup, and to present competitive results in a comparison with parametric solutions. | |||||
Address | Athens; Greece; October 2018 | |||||
Corporate Author | Thesis | |||||
Publisher | Place of Publication | Editor | ||||
Language | Summary Language | Original Title | ||||
Series Editor | Series Title | Abbreviated Series Title | ||||
Series Volume | Series Issue | Edition | ||||
ISSN | ISBN | Medium | ||||
Area | Expedition | Conference | ICIP | |||
Notes | LAMP; 600.109; 600.120;CIC | Approved | no | |||
Call Number | Admin @ si @ BWS2018 | Serial | 3157 | |||
Permanent link to this record | ||||||
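
A hedged sketch of how a network can learn illuminant estimation without illuminant labels, as in the record above: a small estimation net predicts a per-image RGB illuminant, a von Kries division corrects the image, and the only training signal is the downstream recognition loss. The architecture and all names here are assumptions, not the paper's model.

```python
import torch
import torch.nn as nn

class IlluminantNet(nn.Module):
    """Predicts one unit-norm RGB illuminant per image from pooled features."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1))
        self.fc = nn.Linear(16, 3)

    def forward(self, x):
        ill = self.fc(self.features(x).flatten(1)).abs() + 1e-4  # keep strictly positive
        return ill / ill.norm(dim=1, keepdim=True)

def color_correct(img, ill):
    """Von Kries correction: divide each channel by the estimated illuminant."""
    return img / ill.view(-1, 3, 1, 1)

# Training signal comes only from the downstream recognizer, e.g.:
#   logits = classifier(color_correct(img, IlluminantNet()(img)))
#   loss = cross_entropy(logits, object_labels)   # no illuminant ground truth
```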
Author | Xialei Liu; Joost Van de Weijer; Andrew Bagdanov | |||||
Title | Leveraging Unlabeled Data for Crowd Counting by Learning to Rank | Type | Conference Article | |||
Year | 2018 | Publication | 31st IEEE Conference on Computer Vision and Pattern Recognition | Abbreviated Journal | ||
Volume | Issue | Pages | 7661 - 7669 | |||
Keywords | Task analysis; Training; Computer vision; Visualization; Estimation; Head; Context modeling | |||||
Abstract | We propose a novel crowd counting approach that leverages abundantly available unlabeled crowd imagery in a learning-to-rank framework. To induce a ranking of cropped images, we use the observation that any sub-image of a crowded scene image is guaranteed to contain the same number or fewer persons than the super-image. This allows us to address the problem of the limited size of existing datasets for crowd counting. We collect two crowd scene datasets from Google using keyword searches and query-by-example image retrieval, respectively. We demonstrate how to efficiently learn from these unlabeled datasets by incorporating learning-to-rank in a multi-task network which simultaneously ranks images and estimates crowd density maps. Experiments on two of the most challenging crowd counting datasets show that our approach obtains state-of-the-art results. | |||||
Address | Salt Lake City; USA; June 2018 | |||||
Corporate Author | Thesis | |||||
Publisher | Place of Publication | Editor | ||||
Language | Summary Language | Original Title | ||||
Series Editor | Series Title | Abbreviated Series Title | ||||
Series Volume | Series Issue | Edition | ||||
ISSN | ISBN | Medium | ||||
Area | Expedition | Conference | CVPR | |||
Notes | LAMP; 600.109; 600.106; 600.120;CIC | Approved | no | |||
Call Number | Admin @ si @ LWB2018 | Serial | 3159 | |||
Permanent link to this record | ||||||
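
The self-supervised constraint in the record above (a crop can never contain more people than the image it was cropped from) turns directly into a pairwise hinge loss on predicted counts. A minimal sketch, assuming PyTorch and density-map outputs whose spatial sums are counts:

```python
import torch
import torch.nn.functional as F

def containment_ranking_loss(counts_sub, counts_super, margin=0.0):
    """Hinge loss enforcing count(sub-image) <= count(super-image).

    counts_sub, counts_super: (B,) predicted counts (density-map sums) for
    crop pairs where each sub-image is contained in the matching super-image.
    Penalizes only violations of the containment ordering.
    """
    return F.relu(counts_sub - counts_super + margin).mean()

# Toy usage: density maps from a counting network; counts are spatial sums.
density_super = torch.rand(4, 1, 64, 64, requires_grad=True)
density_sub = torch.rand(4, 1, 64, 64, requires_grad=True)
loss = containment_ranking_loss(density_sub.sum(dim=(1, 2, 3)),
                                density_super.sum(dim=(1, 2, 3)))
loss.backward()   # no person annotations needed for this term
```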
Author | Xialei Liu; Marc Masana; Luis Herranz; Joost Van de Weijer; Antonio Lopez; Andrew Bagdanov | |||||
Title | Rotate your Networks: Better Weight Consolidation and Less Catastrophic Forgetting | Type | Conference Article | |||
Year | 2018 | Publication | 24th International Conference on Pattern Recognition | Abbreviated Journal | ||
Volume | Issue | Pages | 2262-2268 | |||
Keywords | ||||||
Abstract | In this paper we propose an approach to avoiding catastrophic forgetting in sequential task learning scenarios. Our technique is based on a network reparameterization that approximately diagonalizes the Fisher Information Matrix of the network parameters. This reparameterization takes the form of a factorized rotation of parameter space which, when used in conjunction with Elastic Weight Consolidation (which assumes a diagonal Fisher Information Matrix), leads to significantly better performance on lifelong learning of sequential tasks. Experimental results on the MNIST, CIFAR-100, CUB-200 and Stanford-40 datasets demonstrate that we significantly improve the results of standard elastic weight consolidation, and that we obtain competitive results when compared to the state-of-the-art in lifelong learning without forgetting. | |||||
Address | ||||||
Corporate Author | Thesis | |||||
Publisher | Place of Publication | Editor | ||||
Language | Summary Language | Original Title | ||||
Series Editor | Series Title | Abbreviated Series Title | ||||
Series Volume | Series Issue | Edition | ||||
ISSN | ISBN | Medium | ||||
Area | Expedition | Conference | ICPR | |||
Notes | LAMP; ADAS; 601.305; 601.109; 600.124; 600.106; 602.200; 600.120; 600.118;CIC | Approved | no | |||
Call Number | Admin @ si @ LMH2018 | Serial | 3160 | |||
Permanent link to this record | ||||||
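
For reference, the diagonal Elastic Weight Consolidation penalty that the record above builds on is sketched below; the paper's contribution is the factorized rotation of parameter space that makes the diagonal Fisher assumption more accurate before this penalty is applied. PyTorch and the dict-based bookkeeping are assumptions.

```python
import torch

def ewc_penalty(model, fisher_diag, old_params, lam=1.0):
    """Diagonal Elastic Weight Consolidation penalty.

    fisher_diag, old_params: dicts keyed by parameter name, holding the
    diagonal Fisher estimate and the parameter snapshot taken after the
    previous task. Each parameter is anchored in proportion to how much
    the old task's likelihood depends on it.
    """
    loss = torch.zeros(())
    for name, p in model.named_parameters():
        loss = loss + (fisher_diag[name] * (p - old_params[name]) ** 2).sum()
    return 0.5 * lam * loss

# After task A: snapshot {name: p.detach().clone()} and a diagonal Fisher
# estimate (mean squared gradients of the log-likelihood); while training
# task B, add ewc_penalty(model, fisher, snapshot) to the task-B loss.
```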
Author | Ozan Caglayan; Adrien Bardet; Fethi Bougares; Loic Barrault; Kai Wang; Marc Masana; Luis Herranz; Joost Van de Weijer | |||||
Title | LIUM-CVC Submissions for WMT18 Multimodal Translation Task | Type | Conference Article | |||
Year | 2018 | Publication | 3rd Conference on Machine Translation | Abbreviated Journal | ||
Volume | Issue | Pages | ||||
Keywords | ||||||
Abstract | This paper describes the multimodal Neural Machine Translation systems developed by LIUM and CVC for the WMT18 Shared Task on Multimodal Translation. This year we propose several modifications to our previous multimodal attention architecture in order to better integrate convolutional features and refine them using encoder-side information. Our final submissions ranked first for English→French and second for English→German among the constrained submissions, according to the automatic evaluation metric METEOR. | |||||
Address | Brussels; Belgium; October 2018 | |||||
Corporate Author | Thesis | |||||
Publisher | Place of Publication | Editor | ||||
Language | Summary Language | Original Title | ||||
Series Editor | Series Title | Abbreviated Series Title | ||||
Series Volume | Series Issue | Edition | ||||
ISSN | ISBN | Medium | ||||
Area | Expedition | Conference | WMT | |||
Notes | LAMP; 600.106; 600.120;CIC | Approved | no | |||
Call Number | Admin @ si @ CBB2018 | Serial | 3240 | |||
Permanent link to this record | ||||||
Author | Lu Yu; Yongmei Cheng; Joost Van de Weijer | |||||
Title | Weakly Supervised Domain-Specific Color Naming Based on Attention | Type | Conference Article | |||
Year | 2018 | Publication | 24th International Conference on Pattern Recognition | Abbreviated Journal | ||
Volume | Issue | Pages | 3019 - 3024 | |||
Keywords | ||||||
Abstract | The majority of existing color naming methods focuses on the eleven basic color terms of the English language. However, in many applications, different sets of color names are used for the accurate description of objects. Labeling data to learn these domain-specific color names is an expensive and laborious task. Therefore, in this article we aim to learn color names from weakly labeled data. For this purpose, we add an attention branch to the color naming network. The attention branch is used to modulate the pixel-wise color naming predictions of the network. In experiments, we illustrate that the attention branch correctly identifies the relevant regions. Furthermore, we show that our method obtains state-of-the-art results for pixel-wise and image-wise classification on the EBAY dataset and is able to learn color names for various domains. | |||||
Address | Beijing; August 2018 | |||||
Corporate Author | Thesis | |||||
Publisher | Place of Publication | Editor | ||||
Language | Summary Language | Original Title | ||||
Series Editor | Series Title | Abbreviated Series Title | ||||
Series Volume | Series Issue | Edition | ||||
ISSN | ISBN | Medium | ||||
Area | Expedition | Conference | ICPR | |||
Notes | LAMP; 600.109; 602.200; 600.120;CIC | Approved | no | |||
Call Number | Admin @ si @ YCW2018 | Serial | 3243 | |||
Permanent link to this record | ||||||
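
A minimal sketch of the attention modulation described in the record above: per-pixel color-name logits are weighted by a predicted attention map, and their attention-weighted spatial average gives an image-level prediction that can be trained with weak, image-level labels. The tiny backbone and all layer sizes are stand-ins, not the paper's network.

```python
import torch
import torch.nn as nn

class AttentionModulatedColorNaming(nn.Module):
    """Pixel-wise color-name predictions modulated by an attention map."""
    def __init__(self, n_colors=11, feat_ch=32):
        super().__init__()
        self.backbone = nn.Conv2d(3, feat_ch, 3, padding=1)
        self.naming = nn.Conv2d(feat_ch, n_colors, 1)      # per-pixel logits
        self.attention = nn.Conv2d(feat_ch, 1, 1)          # relevance map

    def forward(self, x):
        f = torch.relu(self.backbone(x))
        pixel_logits = self.naming(f)                      # (B, C, H, W)
        att = torch.sigmoid(self.attention(f))             # (B, 1, H, W)
        # Attention-weighted spatial average -> image-level prediction,
        # trainable with only image-level color labels.
        image_logits = (pixel_logits * att).sum(dim=(2, 3)) / (att.sum(dim=(2, 3)) + 1e-6)
        return pixel_logits, att, image_logits

model = AttentionModulatedColorNaming()
pixel_logits, att, image_logits = model(torch.randn(2, 3, 32, 32))
```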
Author | Bojana Gajic; Ariel Amato; Ramon Baldrich; Carlo Gatta | |||||
Title | Bag of Negatives for Siamese Architectures | Type | Conference Article | |||
Year | 2019 | Publication | 30th British Machine Vision Conference | Abbreviated Journal | ||
Volume | Issue | Pages | ||||
Keywords | ||||||
Abstract | Training a Siamese architecture for re-identification with a large number of identities is a challenging task due to the difficulty of finding relevant negative samples efficiently. In this work we present Bag of Negatives (BoN), a method for accelerated and improved training of Siamese networks that scales well on datasets with a very large number of identities. BoN is an efficient and loss-independent method, able to select a bag of high-quality negatives based on a novel online hashing strategy. | |||||
Address | Cardiff; United Kingdom; September 2019 | |||||
Corporate Author | Thesis | |||||
Publisher | Place of Publication | Editor | ||||
Language | Summary Language | Original Title | ||||
Series Editor | Series Title | Abbreviated Series Title | ||||
Series Volume | Series Issue | Edition | ||||
ISSN | ISBN | Medium | ||||
Area | Expedition | Conference | BMVC | |||
Notes | CIC; 600.140; 600.118;MILAB | Approved | no | |||
Call Number | Admin @ si @ GAB2019b | Serial | 3263 | |||
Permanent link to this record | ||||||
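
A heavily simplified illustration of the bucketing idea behind the record above: hash identity embeddings with random sign projections so that identities colliding in a bucket are close in embedding space and therefore likely hard negatives. The paper's online hashing strategy is more elaborate; everything below is an assumption for illustration.

```python
import numpy as np

class NegativeBag:
    """Toy LSH-style bucketing of identity embeddings.

    Identities hashing to the same bucket are close in embedding space,
    so sampling negatives from one's own bucket yields hard negatives.
    """
    def __init__(self, dim, n_bits=8, seed=0):
        rng = np.random.default_rng(seed)
        self.planes = rng.standard_normal((n_bits, dim))   # random hyperplanes

    def bucket(self, embedding):
        bits = (self.planes @ embedding) > 0               # sign pattern
        return int(sum(int(b) << i for i, b in enumerate(bits)))

# Toy usage: bucket 1000 random identities, then draw negatives for an
# anchor from the other identities in the anchor's own bucket.
bag = NegativeBag(dim=128)
rng = np.random.default_rng(1)
buckets = {}
for identity in range(1000):
    e = rng.standard_normal(128)
    buckets.setdefault(bag.bucket(e), []).append(identity)
```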
Author | Lichao Zhang; Abel Gonzalez-Garcia; Joost Van de Weijer; Martin Danelljan; Fahad Shahbaz Khan | |||||
Title | Learning the Model Update for Siamese Trackers | Type | Conference Article | |||
Year | 2019 | Publication | 17th IEEE International Conference on Computer Vision | Abbreviated Journal | ||
Volume | Issue | Pages | 4009-4018 | |||
Keywords | ||||||
Abstract | Siamese approaches address the visual tracking problem by extracting an appearance template from the current frame, which is used to localize the target in the next frame. In general, this template is linearly combined with the accumulated template from the previous frame, resulting in an exponential decay of information over time. While such an approach to updating has led to improved results, its simplicity limits the potential gain likely to be obtained by learning to update. Therefore, we propose to replace the handcrafted update function with a method which learns to update. We use a convolutional neural network, called UpdateNet, which, given the initial template, the accumulated template and the template of the current frame, aims to estimate the optimal template for the next frame. The UpdateNet is compact and can easily be integrated into existing Siamese trackers. We demonstrate the generality of the proposed approach by applying it to two Siamese trackers, SiamFC and DaSiamRPN. Extensive experiments on the VOT2016, VOT2018, LaSOT and TrackingNet datasets demonstrate that our UpdateNet effectively predicts the new target template, outperforming the standard linear update. On the large-scale TrackingNet dataset, our UpdateNet improves the results of DaSiamRPN with an absolute gain of 3.9% in terms of success score. | |||||
Address | Seoul; Korea; October 2019 | |||||
Corporate Author | Thesis | |||||
Publisher | Place of Publication | Editor | ||||
Language | Summary Language | Original Title | ||||
Series Editor | Series Title | Abbreviated Series Title | ||||
Series Volume | Series Issue | Edition | ||||
ISSN | ISBN | Medium | ||||
Area | Expedition | Conference | ICCV | |||
Notes | LAMP; 600.109; 600.141; 600.120;CIC | Approved | no | |||
Call Number | Admin @ si @ ZGW2019 | Serial | 3295 | |||
Permanent link to this record | ||||||
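
The update-rule contrast from the record above in a sketch: the handcrafted baseline is a linear blend T_new = (1 − γ)·T_acc + γ·T_cur, while UpdateNet replaces it with a small network over the three templates. The layer configuration and residual form below are assumptions, not the published architecture.

```python
import torch
import torch.nn as nn

class UpdateNet(nn.Module):
    """Learned template update for a Siamese tracker (sketch).

    Takes the initial template t0, the accumulated template t_acc and the
    current-frame template t_cur (all (B, C, H, W)) and predicts the
    template for the next frame, replacing the handcrafted linear rule
        t_new = (1 - gamma) * t_acc + gamma * t_cur.
    """
    def __init__(self, ch=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 * ch, ch, 1), nn.ReLU(),
            nn.Conv2d(ch, ch, 1))

    def forward(self, t0, t_acc, t_cur):
        # Residual form: start from the accumulated template and learn a correction.
        return t_acc + self.net(torch.cat([t0, t_acc, t_cur], dim=1))

net = UpdateNet(ch=256)
t_new = net(torch.randn(1, 256, 6, 6), torch.randn(1, 256, 6, 6),
            torch.randn(1, 256, 6, 6))
```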
Author | Lichao Zhang; Martin Danelljan; Abel Gonzalez-Garcia; Joost Van de Weijer; Fahad Shahbaz Khan | |||||
Title | Multi-Modal Fusion for End-to-End RGB-T Tracking | Type | Conference Article | |||
Year | 2019 | Publication | IEEE International Conference on Computer Vision Workshops | Abbreviated Journal | ||
Volume | Issue | Pages | 2252-2261 | |||
Keywords | ||||||
Abstract | We propose an end-to-end tracking framework for fusing the RGB and TIR modalities in RGB-T tracking. Our baseline tracker is DiMP (Discriminative Model Prediction), which employs a carefully designed target prediction network trained end-to-end using a discriminative loss. We analyze the effectiveness of modality fusion in each of the main components of DiMP, i.e. the feature extractor, the target estimation network, and the classifier. We consider several fusion mechanisms acting at different levels of the framework, including pixel-level, feature-level and response-level. Our tracker is trained in an end-to-end manner, enabling the components to learn how to fuse the information from both modalities. As data to train our model, we generate a large-scale RGB-T dataset by considering an annotated RGB tracking dataset (GOT-10k) and synthesizing paired TIR images using an image-to-image translation approach. We perform extensive experiments on the VOT-RGBT2019 and RGBT210 datasets, evaluating each type of modality fusion on each model component. The results show that the proposed fusion mechanisms improve the performance of the single-modality counterparts. We obtain our best results when fusing at the feature level on both the IoU-Net and the model predictor, obtaining an EAO score of 0.391 on the VOT-RGBT2019 dataset. With this fusion mechanism we achieve state-of-the-art performance on the RGBT210 dataset. | |||||
Address | Seoul; Korea; October 2019 | |||||
Corporate Author | Thesis | |||||
Publisher | Place of Publication | Editor | ||||
Language | Summary Language | Original Title | ||||
Series Editor | Series Title | Abbreviated Series Title | ||||
Series Volume | Series Issue | Edition | ||||
ISSN | ISBN | Medium | ||||
Area | Expedition | Conference | ICCVW | |||
Notes | LAMP; 600.109; 600.141; 600.120;CIC | Approved | no | |||
Call Number | Admin @ si @ ZDG2019 | Serial | 3279 | |||
Permanent link to this record |
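
One of the fusion levels discussed in the record above, feature-level fusion, can be sketched as concatenating the RGB and TIR backbone feature maps and mixing them with a 1x1 convolution. The actual fusion operators evaluated in the paper may differ; the channel count and module below are assumptions.

```python
import torch
import torch.nn as nn

class FeatureLevelFusion(nn.Module):
    """Feature-level fusion of RGB and TIR backbone features.

    Concatenates the two modality feature maps along the channel axis and
    mixes them with a 1x1 convolution; the fused map then feeds the
    tracker's target estimation and classification heads.
    """
    def __init__(self, ch=256):
        super().__init__()
        self.mix = nn.Conv2d(2 * ch, ch, kernel_size=1)

    def forward(self, feat_rgb, feat_tir):
        return self.mix(torch.cat([feat_rgb, feat_tir], dim=1))

fusion = FeatureLevelFusion(ch=256)
fused = fusion(torch.randn(1, 256, 18, 18), torch.randn(1, 256, 18, 18))
print(fused.shape)   # torch.Size([1, 256, 18, 18])
```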