Records
Author Lichao Zhang; Abel Gonzalez-Garcia; Joost Van de Weijer; Martin Danelljan; Fahad Shahbaz Khan
Title Learning the Model Update for Siamese Trackers Type Conference Article
Year 2019 Publication 18th IEEE International Conference on Computer Vision Abbreviated Journal
Volume Issue Pages 4009-4018
Keywords
Abstract Siamese approaches address the visual tracking problem by extracting an appearance template from the current frame, which is used to localize the target in the next frame. In general, this template is linearly combined with the accumulated template from the previous frame, resulting in an exponential decay of information over time. While such an approach to updating has led to improved results, its simplicity limits the potential gain likely to be obtained by learning to update. Therefore, we propose to replace the handcrafted update function with a method which learns to update. We use a convolutional neural network, called UpdateNet, which, given the initial template, the accumulated template, and the template of the current frame, aims to estimate the optimal template for the next frame. The UpdateNet is compact and can easily be integrated into existing Siamese trackers. We demonstrate the generality of the proposed approach by applying it to two Siamese trackers, SiamFC and DaSiamRPN. Extensive experiments on the VOT2016, VOT2018, LaSOT, and TrackingNet datasets demonstrate that our UpdateNet effectively predicts the new target template, outperforming the standard linear update. On the large-scale TrackingNet dataset, our UpdateNet improves the results of DaSiamRPN with an absolute gain of 3.9% in terms of success score. (An illustrative sketch of the linear versus learned update follows this record.)
Address Seoul; Korea; October 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICCV
Notes LAMP; 600.109; 600.141; 600.120 Approved no
Call Number Admin @ si @ ZGW2019 Serial 3295
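The linear update mentioned in the abstract above, and the learned alternative that replaces it, can be illustrated with a short sketch. This is a minimal illustration in PyTorch under assumed tensor shapes and channel sizes, not the authors' released implementation:

import torch
import torch.nn as nn

# Standard linear (running-average) template update used by many Siamese
# trackers: each step keeps a fraction (1 - gamma) of the old template, so
# the contribution of each frame decays exponentially over time.
def linear_update(accumulated, current, gamma=0.01):
    return (1 - gamma) * accumulated + gamma * current

# UpdateNet-style learned update: a small conv net that maps the initial,
# accumulated, and current templates to the template for the next frame.
# Channel sizes and the residual connection are illustrative assumptions.
class UpdateNet(nn.Module):
    def __init__(self, channels=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 * channels, 96, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(96, channels, kernel_size=1),
        )

    def forward(self, initial, accumulated, current):
        x = torch.cat([initial, accumulated, current], dim=1)
        return self.net(x) + initial  # skip connection to the first template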
 

 
Author Ali Furkan Biten; Ruben Tito; Andres Mafla; Lluis Gomez; Marçal Rusiñol; C.V. Jawahar; Ernest Valveny; Dimosthenis Karatzas
Title Scene Text Visual Question Answering Type Conference Article
Year 2019 Publication 18th IEEE International Conference on Computer Vision Abbreviated Journal
Volume Issue Pages 4291-4301
Keywords
Abstract Current visual question answering datasets do not consider the rich semantic information conveyed by text within an image. In this work, we present a new dataset, ST-VQA, that aims to highlight the importance of exploiting high-level semantic information present in images as textual cues in the Visual Question Answering process. We use this dataset to define a series of tasks of increasing difficulty for which reading the scene text in the context provided by the visual information is necessary to reason and generate an appropriate answer. We propose a new evaluation metric for these tasks to account for both reasoning errors and shortcomings of the text recognition module. In addition, we put forward a series of baseline methods, which provide further insight into the newly released dataset, and set the scene for further research. (A sketch of the soft-matching idea behind such a metric follows this record.)
Address Seoul; Korea; October 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICCV
Notes DAG; 600.129; 600.135; 601.338; 600.121 Approved no
Call Number Admin @ si @ BTM2019b Serial 3285
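The evaluation metric the abstract refers to rewards answers by string similarity rather than exact match (the published ST-VQA metric is the Average Normalized Levenshtein Similarity, ANLS). A minimal Python sketch of that soft-matching idea, with the 0.5 threshold taken as an assumed default:

# Score a predicted answer against ground-truth answers: predictions close
# to a ground truth under normalized edit distance earn partial credit, so
# text-recognition errors are penalized gradually instead of being scored
# as outright failures.
def levenshtein(a, b):
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def soft_answer_score(prediction, answers, tau=0.5):
    best = 0.0
    for gt in answers:
        nl = levenshtein(prediction.lower(), gt.lower()) / max(len(prediction), len(gt), 1)
        if nl <= tau:  # close enough to count, with credit 1 - distance
            best = max(best, 1.0 - nl)
    return best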
 

 
Author Axel Barroso-Laguna; Edgar Riba; Daniel Ponsa; Krystian Mikolajczyk
Title Key.Net: Keypoint Detection by Handcrafted and Learned CNN Filters Type Conference Article
Year 2019 Publication 18th IEEE International Conference on Computer Vision Abbreviated Journal
Volume Issue Pages 5835-5843
Keywords
Abstract We introduce a novel approach to the keypoint detection task that combines handcrafted and learned CNN filters within a shallow multi-scale architecture. Handcrafted filters provide anchor structures for learned filters, which localize, score, and rank repeatable features. A scale-space representation is used within the network to extract keypoints at different levels. We design a loss function to detect robust features that exist across a range of scales and to maximize the repeatability score. Our Key.Net model is trained on data synthetically created from ImageNet and evaluated on the HPatches benchmark. Results show that our approach outperforms state-of-the-art detectors in terms of repeatability, matching performance, and complexity. (A toy sketch of the handcrafted-plus-learned idea follows this record.)
Address Seoul; Korea; October 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICCV
Notes MSIAU; 600.122 Approved no
Call Number Admin @ si @ BRP2019 Serial 3290
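A toy sketch of the handcrafted-plus-learned combination described in the abstract above: fixed derivative filters supply anchor structures, and a small learned head turns them into a keypoint score map. Filter choices and layer sizes are assumptions here, not the released Key.Net architecture:

import torch
import torch.nn as nn
import torch.nn.functional as F

SOBEL_X = torch.tensor([[-1., 0., 1.],
                        [-2., 0., 2.],
                        [-1., 0., 1.]]).view(1, 1, 3, 3)

class HandcraftedAndLearned(nn.Module):
    def __init__(self):
        super().__init__()
        self.register_buffer("gx", SOBEL_X)                 # fixed, not trained
        self.register_buffer("gy", SOBEL_X.transpose(2, 3))
        self.learned = nn.Sequential(
            nn.Conv2d(5, 8, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(8, 1, 3, padding=1),                  # per-pixel score map
        )

    def forward(self, gray):                                # gray: (B, 1, H, W)
        dx = F.conv2d(gray, self.gx, padding=1)
        dy = F.conv2d(gray, self.gy, padding=1)
        # Handcrafted anchor structures: first-order derivatives and their
        # second-order combinations; the learned filters score and rank them.
        feats = torch.cat([dx, dy, dx * dx, dy * dy, dx * dy], dim=1)
        return self.learned(feats)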
 

 
Author Hamed H. Aghdam; Abel Gonzalez-Garcia; Joost Van de Weijer; Antonio Lopez
Title Active Learning for Deep Detection Neural Networks Type Conference Article
Year 2019 Publication 18th IEEE International Conference on Computer Vision Abbreviated Journal
Volume Issue Pages 3672-3680
Keywords
Abstract The cost of drawing object bounding boxes (i.e., labeling) for millions of images is prohibitively high. For instance, labeling pedestrians in a regular urban image could take 35 seconds on average. Active learning aims to reduce the cost of labeling by selecting only those images that are informative for improving the detection network's accuracy. In this paper, we propose a method to perform active learning of object detectors based on convolutional neural networks. We propose a new image-level scoring process to rank unlabeled images for their automatic selection, which clearly outperforms classical scores. The proposed method can be applied to videos and to sets of still images. In the former case, temporal selection rules can complement our scoring process. As a relevant use case, we extensively study the performance of our method on the task of pedestrian detection. Overall, the experiments show that the proposed method performs better than random selection. (A generic sketch of such a selection round follows this record.)
Address Seoul; Korea; October 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICCV
Notes ADAS; LAMP; 600.124; 600.109; 600.141; 600.120; 600.118 Approved no
Call Number Admin @ si @ AGW2019 Serial 3321
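The selection loop from the abstract above can be sketched generically, as below. The image-level score (mean per-pixel entropy of assumed detector probabilities) and the function names are stand-ins; the paper's actual scoring process differs in detail:

import math

def image_score(probs):
    # probs: 2D iterable of per-pixel pedestrian probabilities in [0, 1].
    def entropy(p):
        p = min(max(p, 1e-6), 1 - 1e-6)
        return -(p * math.log(p) + (1 - p) * math.log(1 - p))
    flat = [p for row in probs for p in row]
    return sum(entropy(p) for p in flat) / len(flat)

def select_for_labeling(unlabeled_images, detector, budget):
    # Rank unlabeled images by informativeness and label only the top ones,
    # instead of labeling everything or sampling at random.
    ranked = sorted(unlabeled_images, key=lambda im: image_score(detector(im)), reverse=True)
    return ranked[:budget]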
 

 
Author Felipe Codevilla; Eder Santana; Antonio Lopez; Adrien Gaidon
Title Exploring the Limitations of Behavior Cloning for Autonomous Driving Type Conference Article
Year 2019 Publication 18th IEEE International Conference on Computer Vision Abbreviated Journal
Volume Issue Pages 9328-9337
Keywords
Abstract Driving requires reacting to a wide variety of complex environment conditions and agent behaviors. Explicitly modeling each possible scenario is unrealistic. In contrast, imitation learning can, in theory, leverage data from large fleets of human-driven cars. Behavior cloning in particular has been successfully used to learn simple visuomotor policies end-to-end, but scaling to the full spectrum of driving behaviors remains an unsolved problem. In this paper, we propose a new benchmark to experimentally investigate the scalability and limitations of behavior cloning. We show that behavior cloning leads to state-of-the-art results, executing complex lateral and longitudinal maneuvers, even in unseen environments, without being explicitly programmed to do so. However, we confirm some limitations of the behavior cloning approach: some well-known limitations (e.g., dataset bias and overfitting), new generalization issues (e.g., dynamic objects and the lack of causal modeling), and training instabilities, all requiring further research before behavior cloning can graduate to real-world driving. The code, dataset, benchmark, and agent studied in this paper can be found on GitHub. (A minimal behavior-cloning training step is sketched after this record.)
Address Seoul; Korea; October 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICCV
Notes ADAS; 600.124; 600.118 Approved no
Call Number Admin @ si @ CSL2019 Serial 3322
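Behavior cloning, as studied in the abstract above, reduces to supervised regression from observations to expert controls. A minimal PyTorch sketch with an assumed architecture and L1 loss (the paper's conditional models additionally take navigation commands as input):

import torch
import torch.nn as nn
import torch.nn.functional as F

# Tiny visuomotor policy: camera image in, [steering, throttle] out.
policy = nn.Sequential(
    nn.Conv2d(3, 32, 5, stride=2), nn.ReLU(),
    nn.Conv2d(32, 64, 5, stride=2), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(64, 2),
)
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)

def bc_step(images, expert_controls):
    # One supervised step: imitate the expert's recorded controls.
    loss = F.l1_loss(policy(images), expert_controls)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()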
 

 
Author David Berga; Xose R. Fernandez-Vidal; Xavier Otazu; Xose M. Pardo
Title SID4VAM: A Benchmark Dataset with Synthetic Images for Visual Attention Modeling Type Conference Article
Year 2019 Publication 18th IEEE International Conference on Computer Vision Abbreviated Journal
Volume Issue Pages 8788-8797
Keywords
Abstract A benchmark of saliency model performance on a synthetic image dataset is provided. Model performance is evaluated through saliency metrics, as well as through the influence of model inspiration and consistency with human psychophysics. SID4VAM is composed of 230 synthetic images with known salient regions. Images were generated with 15 distinct types of low-level features (e.g., orientation, brightness, color, size, etc.) in a target-distractor pop-out type of synthetic pattern. We have used Free-Viewing and Visual Search task instructions and 7 feature contrasts for each feature category. Our study reveals that state-of-the-art deep learning saliency models do not perform well on synthetic pattern images; instead, models with spectral/Fourier inspiration outperform the others on saliency metrics and are more consistent with human psychophysical experimentation. This study proposes a new way to evaluate saliency models in the forthcoming literature, accounting for synthetic images with uniquely low-level feature contexts, distinct from previous eye-tracking image datasets. (A toy generator for such pop-out stimuli is sketched after this record.)
Address Seoul; Korea; October 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICCV
Notes NEUROBIT; 600.128 Approved no
Call Number Admin @ si @ BFO2019b Serial 3372
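The target-distractor pop-out stimuli the abstract describes can be illustrated with a toy generator for an orientation contrast. Grid size, angles, and bar length are assumptions here, not the SID4VAM generation code:

import numpy as np

def orientation_popout(grid=8, cell=32, base_deg=0.0, contrast_deg=45.0, seed=0):
    # Draw a grid of bars at a base orientation with a single target bar
    # rotated by `contrast_deg`, so the salient region is known by
    # construction and can serve as ground truth for saliency metrics.
    rng = np.random.default_rng(seed)
    img = np.zeros((grid * cell, grid * cell), dtype=np.float32)
    target = (int(rng.integers(grid)), int(rng.integers(grid)))
    ts = np.linspace(-cell * 0.35, cell * 0.35, cell)
    for i in range(grid):
        for j in range(grid):
            ang = np.deg2rad(base_deg + (contrast_deg if (i, j) == target else 0.0))
            cy, cx = i * cell + cell // 2, j * cell + cell // 2
            ys = np.rint(cy + ts * np.sin(ang)).astype(int)
            xs = np.rint(cx + ts * np.cos(ang)).astype(int)
            img[ys, xs] = 1.0
    return img, target  # target cell indexes the known salient region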