toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
   print
  Records Links (down)
Author Axel Barroso-Laguna; Edgar Riba; Daniel Ponsa; Krystian Mikolajczyk edit   pdf
url  doi
openurl 
  Title Key.Net: Keypoint Detection by Handcrafted and Learned CNN Filters Type Conference Article
  Year 2019 Publication 18th IEEE International Conference on Computer Vision Abbreviated Journal  
  Volume Issue Pages 5835-5843  
  Keywords  
  Abstract We introduce a novel approach for keypoint detection task that combines handcrafted and learned CNN filters within a shallow multi-scale architecture. Handcrafted filters provide anchor structures for learned filters, which localize, score and rank repeatable features. Scale-space representation is used within the network to extract keypoints at different levels. We design a loss function to detect robust features that exist across a range of scales and to maximize the repeatability score. Our Key.Net model is trained on data synthetically created from ImageNet and evaluated on HPatches benchmark. Results show that our approach outperforms state-of-the-art detectors in terms of repeatability, matching performance and complexity.  
  Address Seul; Corea; October 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICCV  
  Notes MSIAU; 600.122 Approved no  
  Call Number Admin @ si @ BRP2019 Serial 3290  
Permanent link to this record
 

 
Author Hamed H. Aghdam; Abel Gonzalez-Garcia; Joost Van de Weijer; Antonio Lopez edit   pdf
url  doi
openurl 
  Title Active Learning for Deep Detection Neural Networks Type Conference Article
  Year 2019 Publication 18th IEEE International Conference on Computer Vision Abbreviated Journal  
  Volume Issue Pages 3672-3680  
  Keywords  
  Abstract The cost of drawing object bounding boxes (ie labeling) for millions of images is prohibitively high. For instance, labeling pedestrians in a regular urban image could take 35 seconds on average. Active learning aims to reduce the cost of labeling by selecting only those images that are informative to improve the detection network accuracy. In this paper, we propose a method to perform active learning of object detectors based on convolutional neural networks. We propose a new image-level scoring process to rank unlabeled images for their automatic selection, which clearly outperforms classical scores. The proposed method can be applied to videos and sets of still images. In the former case, temporal selection rules can complement our scoring process. As a relevant use case, we extensively study the performance of our method on the task of pedestrian detection. Overall, the experiments show that the proposed method performs better than random selection.  
  Address Seul; Korea; October 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICCV  
  Notes ADAS; LAMP; 600.124; 600.109; 600.141; 600.120; 600.118 Approved no  
  Call Number Admin @ si @ AGW2019 Serial 3321  
Permanent link to this record
 

 
Author Felipe Codevilla; Eder Santana; Antonio Lopez; Adrien Gaidon edit   pdf
url  doi
openurl 
  Title Exploring the Limitations of Behavior Cloning for Autonomous Driving Type Conference Article
  Year 2019 Publication 18th IEEE International Conference on Computer Vision Abbreviated Journal  
  Volume Issue Pages 9328-9337  
  Keywords  
  Abstract Driving requires reacting to a wide variety of complex environment conditions and agent behaviors. Explicitly modeling each possible scenario is unrealistic. In contrast, imitation learning can, in theory, leverage data from large fleets of human-driven cars. Behavior cloning in particular has been successfully used to learn simple visuomotor policies end-to-end, but scaling to the full spectrum of driving behaviors remains an unsolved problem. In this paper, we propose a new benchmark to experimentally investigate the scalability and limitations of behavior cloning. We show that behavior cloning leads to state-of-the-art results, executing complex lateral and longitudinal maneuvers, even in unseen environments, without being explicitly programmed to do so. However, we confirm some limitations of the behavior cloning approach: some well-known limitations (eg, dataset bias and overfitting), new generalization issues (eg, dynamic objects and the lack of a causal modeling), and training instabilities, all requiring further research before behavior cloning can graduate to real-world driving. The code, dataset, benchmark, and agent studied in this paper can be found at github.  
  Address Seul; Korea; October 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICCV  
  Notes ADAS; 600.124; 600.118 Approved no  
  Call Number Admin @ si @ CSL2019 Serial 3322  
Permanent link to this record
 

 
Author David Berga; Xose R. Fernandez-Vidal; Xavier Otazu; Xose M. Pardo edit   pdf
url  doi
openurl 
  Title SID4VAM: A Benchmark Dataset with Synthetic Images for Visual Attention Modeling Type Conference Article
  Year 2019 Publication 18th IEEE International Conference on Computer Vision Abbreviated Journal  
  Volume Issue Pages 8788-8797  
  Keywords  
  Abstract A benchmark of saliency models performance with a synthetic image dataset is provided. Model performance is evaluated through saliency metrics as well as the influence of model inspiration and consistency with human psychophysics. SID4VAM is composed of 230 synthetic images, with known salient regions. Images were generated with 15 distinct types of low-level features (e.g. orientation, brightness, color, size...) with a target-distractor popout type of synthetic patterns. We have used Free-Viewing and Visual Search task instructions and 7 feature contrasts for each feature category. Our study reveals that state-ofthe-art Deep Learning saliency models do not perform well with synthetic pattern images, instead, models with Spectral/Fourier inspiration outperform others in saliency metrics and are more consistent with human psychophysical experimentation. This study proposes a new way to evaluate saliency models in the forthcoming literature, accounting for synthetic images with uniquely low-level feature contexts, distinct from previous eye tracking image datasets.  
  Address Seul; Corea; October 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICCV  
  Notes NEUROBIT; 600.128 Approved no  
  Call Number Admin @ si @ BFO2019b Serial 3372  
Permanent link to this record
 

 
Author Lichao Zhang; Abel Gonzalez-Garcia; Joost Van de Weijer; Martin Danelljan; Fahad Shahbaz Khan edit   pdf
url  doi
openurl 
  Title Learning the Model Update for Siamese Trackers Type Conference Article
  Year 2019 Publication 18th IEEE International Conference on Computer Vision Abbreviated Journal  
  Volume Issue Pages 4009-4018  
  Keywords  
  Abstract Siamese approaches address the visual tracking problem by extracting an appearance template from the current frame, which is used to localize the target in the next frame. In general, this template is linearly combined with the accumulated template from the previous frame, resulting in an exponential decay of information over time. While such an approach to updating has led to improved results, its simplicity limits the potential gain likely to be obtained by learning to update. Therefore, we propose to replace the handcrafted update function with a method which learns to update. We use a convolutional neural network, called UpdateNet, which given the initial template, the accumulated template and the template of the current frame aims to estimate the optimal template for the next frame. The UpdateNet is compact and can easily be integrated into existing Siamese trackers. We demonstrate the generality of the proposed approach by applying it to two Siamese trackers, SiamFC and DaSiamRPN. Extensive experiments on VOT2016, VOT2018, LaSOT, and TrackingNet datasets demonstrate that our UpdateNet effectively predicts the new target template, outperforming the standard linear update. On the large-scale TrackingNet dataset, our UpdateNet improves the results of DaSiamRPN with an absolute gain of 3.9% in terms of success score.  
  Address Seul; Corea; October 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICCV  
  Notes LAMP; 600.109; 600.141; 600.120 Approved no  
  Call Number Admin @ si @ ZGW2019 Serial 3295  
Permanent link to this record
 

 
Author Shifeng Zhang; Ajian Liu; Jun Wan; Yanyan Liang; Guogong Guo; Sergio Escalera; Hugo Jair Escalante; Stan Z. Li edit  url
doi  openurl
  Title CASIA-SURF: A Dataset and Benchmark for Large-scale Multi-modal Face Anti-spoofing Type Journal
  Year 2020 Publication IEEE Transactions on Biometrics, Behavior, and Identity Science Abbreviated Journal TTBIS  
  Volume 2 Issue 2 Pages 182 - 193  
  Keywords  
  Abstract Face anti-spoofing is essential to prevent face recognition systems from a security breach. Much of the progresses have been made by the availability of face anti-spoofing benchmark datasets in recent years. However, existing face anti-spoofing benchmarks have limited number of subjects (≤170) and modalities (≤2), which hinder the further development of the academic community. To facilitate face anti-spoofing research, we introduce a large-scale multi-modal dataset, namely CASIA-SURF, which is the largest publicly available dataset for face anti-spoofing in terms of both subjects and modalities. Specifically, it consists of 1,000 subjects with 21,000 videos and each sample has 3 modalities ( i.e. , RGB, Depth and IR). We also provide comprehensive evaluation metrics, diverse evaluation protocols, training/validation/testing subsets and a measurement tool, developing a new benchmark for face anti-spoofing. Moreover, we present a novel multi-modal multi-scale fusion method as a strong baseline, which performs feature re-weighting to select the more informative channel features while suppressing the less useful ones for each modality across different scales. Extensive experiments have been conducted on the proposed dataset to verify its significance and generalization capability. The dataset is available at https://sites.google.com/qq.com/face-anti-spoofing/welcome/challengecvpr2019?authuser=0  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HuPBA; no proj Approved no  
  Call Number Admin @ si @ ZLW2020 Serial 3412  
Permanent link to this record
 

 
Author Margarita Torre; Beatriz Remeseiro; Petia Radeva; Fernando Martinez edit  url
doi  openurl
  Title DeepNEM: Deep Network Energy-Minimization for Agricultural Field Segmentation Type Journal Article
  Year 2020 Publication IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing Abbreviated Journal JSTAEOR  
  Volume 13 Issue Pages 726-737  
  Keywords  
  Abstract One of the main characteristics of agricultural fields is that the appearance of different crops and their growth status, in an aerial image, is varied, and has a wide range of radiometric values and high level of variability. The extraction of these fields and their monitoring are activities that require a high level of human intervention. In this article, we propose a novel automatic algorithm, named deep network energy-minimization (DeepNEM), to extract agricultural fields in aerial images. The model-guided process selects the most relevant image clues extracted by a deep network, completes them and finally generates regions that represent the agricultural fields under a minimization scheme. DeepNEM has been tested over a broad range of fields in terms of size, shape, and content. Different measures were used to compare the DeepNEM with other methods, and to prove that it represents an improved approach to achieve a high-quality segmentation of agricultural fields. Furthermore, this article also presents a new public dataset composed of 1200 images with their parcels boundaries annotations.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MILAB Approved no  
  Call Number Admin @ si @ TRR2020 Serial 3410  
Permanent link to this record
 

 
Author Ali Furkan Biten; R. Tito; Andres Mafla; Lluis Gomez; Marçal Rusiñol; M. Mathew; C.V. Jawahar; Ernest Valveny; Dimosthenis Karatzas edit   pdf
url  doi
openurl 
  Title ICDAR 2019 Competition on Scene Text Visual Question Answering Type Conference Article
  Year 2019 Publication 15th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume Issue Pages 1563-1570  
  Keywords  
  Abstract This paper presents final results of ICDAR 2019 Scene Text Visual Question Answering competition (ST-VQA). ST-VQA introduces an important aspect that is not addressed by any Visual Question Answering system up to date, namely the incorporation of scene text to answer questions asked about an image. The competition introduces a new dataset comprising 23,038 images annotated with 31,791 question / answer pairs where the answer is always grounded on text instances present in the image. The images are taken from 7 different public computer vision datasets, covering a wide range of scenarios. The competition was structured in three tasks of increasing difficulty, that require reading the text in a scene and understanding it in the context of the scene, to correctly answer a given question. A novel evaluation metric is presented, which elegantly assesses both key capabilities expected from an optimal model: text recognition and image understanding. A detailed analysis of results from different participants is showcased, which provides insight into the current capabilities of VQA systems that can read. We firmly believe the dataset proposed in this challenge will be an important milestone to consider towards a path of more robust and general models that can exploit scene text to achieve holistic image understanding.  
  Address Sydney; Australia; September 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG; 600.129; 601.338; 600.121 Approved no  
  Call Number Admin @ si @ BTM2019c Serial 3286  
Permanent link to this record
 

 
Author Chee-Kheng Chng; Yuliang Liu; Yipeng Sun; Chun Chet Ng; Canjie Luo; Zihan Ni; ChuanMing Fang; Shuaitao Zhang; Junyu Han; Errui Ding; Jingtuo Liu; Dimosthenis Karatzas; Chee Seng Chan; Lianwen Jin edit   pdf
url  doi
openurl 
  Title ICDAR2019 Robust Reading Challenge on Arbitrary-Shaped Text – RRC-ArT Type Conference Article
  Year 2019 Publication 15th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume Issue Pages 1571-1576  
  Keywords  
  Abstract This paper reports the ICDAR2019 Robust Reading Challenge on Arbitrary-Shaped Text – RRC-ArT that consists of three major challenges: i) scene text detection, ii) scene text recognition, and iii) scene text spotting. A total of 78 submissions from 46 unique teams/individuals were received for this competition. The top performing score of each challenge is as follows: i) T1 – 82.65%, ii) T2.1 – 74.3%, iii) T2.2 – 85.32%, iv) T3.1 – 53.86%, and v) T3.2 – 54.91%. Apart from the results, this paper also details the ArT dataset, tasks description, evaluation metrics and participants' methods. The dataset, the evaluation kit as well as the results are publicly available at the challenge website.  
  Address Sydney; Australia; September 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG; 600.121; 600.129 Approved no  
  Call Number Admin @ si @ CLS2019 Serial 3340  
Permanent link to this record
 

 
Author Yipeng Sun; Zihan Ni; Chee-Kheng Chng; Yuliang Liu; Canjie Luo; Chun Chet Ng; Junyu Han; Errui Ding; Jingtuo Liu; Dimosthenis Karatzas; Chee Seng Chan; Lianwen Jin edit   pdf
url  doi
openurl 
  Title ICDAR 2019 Competition on Large-Scale Street View Text with Partial Labeling – RRC-LSVT Type Conference Article
  Year 2019 Publication 15th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume Issue Pages 1557-1562  
  Keywords  
  Abstract Robust text reading from street view images provides valuable information for various applications. Performance improvement of existing methods in such a challenging scenario heavily relies on the amount of fully annotated training data, which is costly and in-efficient to obtain. To scale up the amount of training data while keeping the labeling procedure cost-effective, this competition introduces a new challenge on Large-scale Street View Text with Partial Labeling (LSVT), providing 50, 000 and 400, 000 images in full and weak annotations, respectively. This competition aims to explore the abilities of state-of-the-art methods to detect and recognize text instances from large-scale street view images, closing the gap between research benchmarks and real applications. During the competition period, a total of 41 teams participated in the two proposed tasks with 132 valid submissions, ie, text detection and end-to-end text spotting. This paper includes dataset descriptions, task definitions, evaluation protocols and results summaries of the ICDAR 2019-LSVT challenge.  
  Address Sydney; Australia; September 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG; 600.129; 600.121 Approved no  
  Call Number Admin @ si @ SNC2019 Serial 3339  
Permanent link to this record
 

 
Author Rui Zhang; Yongsheng Zhou; Qianyi Jiang; Qi Song; Nan Li; Kai Zhou; Lei Wang; Dong Wang; Minghui Liao; Mingkun Yang; Xiang Bai; Baoguang Shi; Dimosthenis Karatzas; Shijian Lu; CV Jawahar edit   pdf
url  doi
openurl 
  Title ICDAR 2019 Robust Reading Challenge on Reading Chinese Text on Signboard Type Conference Article
  Year 2019 Publication 15th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume Issue Pages 1577-1581  
  Keywords  
  Abstract Chinese scene text reading is one of the most challenging problems in computer vision and has attracted great interest. Different from English text, Chinese has more than 6000 commonly used characters and Chinesecharacters can be arranged in various layouts with numerous fonts. The Chinese signboards in street view are a good choice for Chinese scene text images since they have different backgrounds, fonts and layouts. We organized a competition called ICDAR2019-ReCTS, which mainly focuses on reading Chinese text on signboard. This report presents the final results of the competition. A large-scale dataset of 25,000 annotated signboard images, in which all the text lines and characters are annotated with locations and transcriptions, were released. Four tasks, namely character recognition, text line recognition, text line detection and end-to-end recognition were set up. Besides, considering the Chinese text ambiguity issue, we proposed a multi ground truth (multi-GT) evaluation method to make evaluation fairer. The competition started on March 1, 2019 and ended on April 30, 2019. 262 submissions from 46 teams are received. Most of the participants come from universities, research institutes, and tech companies in China. There are also some participants from the United States, Australia, Singapore, and Korea. 21 teams submit results for Task 1, 23 teams submit results for Task 2, 24 teams submit results for Task 3, and 13 teams submit results for Task 4.  
  Address Sydney; Australia; September 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG; 600.129; 600.121 Approved no  
  Call Number Admin @ si @ LZZ2019 Serial 3335  
Permanent link to this record
 

 
Author Nibal Nayef; Yash Patel; Michal Busta; Pinaki Nath Chowdhury; Dimosthenis Karatzas; Wafa Khlif; Jiri Matas; Umapada Pal; Jean-Christophe Burie; Cheng-lin Liu; Jean-Marc Ogier edit   pdf
url  doi
openurl 
  Title ICDAR2019 Robust Reading Challenge on Multi-lingual Scene Text Detection and Recognition — RRC-MLT-2019 Type Conference Article
  Year 2019 Publication 15th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume Issue Pages 1582-1587  
  Keywords  
  Abstract With the growing cosmopolitan culture of modern cities, the need of robust Multi-Lingual scene Text (MLT) detection and recognition systems has never been more immense. With the goal to systematically benchmark and push the state-of-the-art forward, the proposed competition builds on top of the RRC-MLT-2017 with an additional end-to-end task, an additional language in the real images dataset, a large scale multi-lingual synthetic dataset to assist the training, and a baseline End-to-End recognition method. The real dataset consists of 20,000 images containing text from 10 languages. The challenge has 4 tasks covering various aspects of multi-lingual scene text: (a) text detection, (b) cropped word script classification, (c) joint text detection and script classification and (d) end-to-end detection and recognition. In total, the competition received 60 submissions from the research and industrial communities. This paper presents the dataset, the tasks and the findings of the presented RRC-MLT-2019 challenge.  
  Address Sydney; Australia; September 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG; 600.121; 600.129 Approved no  
  Call Number Admin @ si @ NPB2019 Serial 3341  
Permanent link to this record
 

 
Author Pau Riba; Anjan Dutta; Lutz Goldmann; Alicia Fornes; Oriol Ramos Terrades; Josep Llados edit   pdf
url  doi
openurl 
  Title Table Detection in Invoice Documents by Graph Neural Networks Type Conference Article
  Year 2019 Publication 15th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume Issue Pages 122-127  
  Keywords  
  Abstract Tabular structures in documents offer a complementary dimension to the raw textual data, representing logical or quantitative relationships among pieces of information. In digital mail room applications, where a large amount of
administrative documents must be processed with reasonable accuracy, the detection and interpretation of tables is crucial. Table recognition has gained interest in document image analysis, in particular in unconstrained formats (absence of rule lines, unknown information of rows and columns). In this work, we propose a graph-based approach for detecting tables in document images. Instead of using the raw content (recognized text), we make use of the location, context and content type, thus it is purely a structure perception approach, not dependent on the language and the quality of the text
reading. Our framework makes use of Graph Neural Networks (GNNs) in order to describe the local repetitive structural information of tables in invoice documents. Our proposed model has been experimentally validated in two invoice datasets and achieved encouraging results. Additionally, due to the scarcity
of benchmark datasets for this task, we have contributed to the community a novel dataset derived from the RVL-CDIP invoice data. It will be publicly released to facilitate future research.
 
  Address Sydney; Australia; September 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG; 600.140; 601.302; 602.167; 600.121; 600.141 Approved no  
  Call Number Admin @ si @ RDG2019 Serial 3355  
Permanent link to this record
 

 
Author Ekta Vats; Anders Hast; Alicia Fornes edit   pdf
url  doi
openurl 
  Title Training-Free and Segmentation-Free Word Spotting using Feature Matching and Query Expansion Type Conference Article
  Year 2019 Publication 15th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume Issue Pages 1294-1299  
  Keywords Word spotting; Segmentation-free; Trainingfree; Query expansion; Feature matching  
  Abstract Historical handwritten text recognition is an interesting yet challenging problem. In recent times, deep learning based methods have achieved significant performance in handwritten text recognition. However, handwriting recognition using deep learning needs training data, and often, text must be previously segmented into lines (or even words). These limitations constrain the application of HTR techniques in document collections, because training data or segmented words are not always available. Therefore, this paper proposes a training-free and segmentation-free word spotting approach that can be applied in unconstrained scenarios. The proposed word spotting framework is based on document query word expansion and relaxed feature matching algorithm, which can easily be parallelised. Since handwritten words posses distinct shape and characteristics, this work uses a combination of different keypoint detectors
and Fourier-based descriptors to obtain a sufficient degree of relaxed matching. The effectiveness of the proposed method is empirically evaluated on well-known benchmark datasets using standard evaluation measures. The use of informative features along with query expansion significantly contributed in efficient performance of the proposed method.
 
  Address Sydney; Australia; September 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG; 600.140; 600.121 Approved no  
  Call Number Admin @ si @ VHF2019 Serial 3356  
Permanent link to this record
 

 
Author Albert Berenguel; Oriol Ramos Terrades; Josep Llados; Cristina Cañero edit  url
doi  openurl
  Title Recurrent Comparator with attention models to detect counterfeit documents Type Conference Article
  Year 2019 Publication 15th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract This paper is focused on the detection of counterfeit documents via the recurrent comparison of the security textured background regions of two images. The main contributions are twofold: first we apply and adapt a recurrent comparator architecture with attention mechanism to the counterfeit detection task, which constructs a representation of the background regions by recurrently condition the next observation, learning the difference between genuine and counterfeit images through iterative glimpses. Second we propose a new counterfeit document dataset to ensure the generalization of the learned model towards the detection of the lack of resolution during the counterfeit manufacturing. The presented network, outperforms state-of-the-art classification approaches for counterfeit detection as demonstrated in the evaluation.  
  Address Sidney; Australia; September 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG; 600.140; 600.121; 601.269 Approved no  
  Call Number Admin @ si @ BRL2019 Serial 3456  
Permanent link to this record
Select All    Deselect All
 |   | 
Details
   print

Save Citations:
Export Records: