toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
  Records Links
Author Mohammed Al Rawi; Dimosthenis Karatzas edit   pdf
openurl 
  Title On the Labeling Correctness in Computer Vision Datasets Type Conference Article
  Year 2018 Publication Proceedings of the Workshop on Interactive Adaptive Learning, co-located with European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Image datasets have heavily been used to build computer vision systems.
These datasets are either manually or automatically labeled, which is a
problem as both labeling methods are prone to errors. To investigate this problem, we use a majority voting ensemble that combines the results from several Convolutional Neural Networks (CNNs). Majority voting ensembles not only enhance the overall performance, but can also be used to estimate the confidence level of each sample. We also examined Softmax as another form to estimate posterior probability. We have designed various experiments with a range of different ensembles built from one or different, or temporal/snapshot CNNs, which have been trained multiple times stochastically. We analyzed CIFAR10, CIFAR100, EMNIST, and SVHN datasets and we found quite a few incorrect
labels, both in the training and testing sets. We also present detailed confidence analysis on these datasets and we found that the ensemble is better than the Softmax when used estimate the per-sample confidence. This work thus proposes an approach that can be used to scrutinize and verify the labeling of computer vision datasets, which can later be applied to weakly/semi-supervised learning. We propose a measure, based on the Odds-Ratio, to quantify how many of these incorrectly classified labels are actually incorrectly labeled and how many of these are confusing. The proposed methods are easily scalable to larger datasets, like ImageNet, LSUN and SUN, as each CNN instance is trained for 60 epochs; or even faster, by implementing a temporal (snapshot) ensemble.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ECML-PKDDW  
  Notes DAG; 600.121; 600.129 Approved no  
  Call Number Admin @ si @ RaK2018 Serial (down) 3144  
Permanent link to this record
 

 
Author Lluis Gomez; Andres Mafla; Marçal Rusiñol; Dimosthenis Karatzas edit   pdf
url  openurl
  Title Single Shot Scene Text Retrieval Type Conference Article
  Year 2018 Publication 15th European Conference on Computer Vision Abbreviated Journal  
  Volume 11218 Issue Pages 728-744  
  Keywords Image retrieval; Scene text; Word spotting; Convolutional Neural Networks; Region Proposals Networks; PHOC  
  Abstract Textual information found in scene images provides high level semantic information about the image and its context and it can be leveraged for better scene understanding. In this paper we address the problem of scene text retrieval: given a text query, the system must return all images containing the queried text. The novelty of the proposed model consists in the usage of a single shot CNN architecture that predicts at the same time bounding boxes and a compact text representation of the words in them. In this way, the text based image retrieval task can be casted as a simple nearest neighbor search of the query text representation over the outputs of the CNN over the entire image
database. Our experiments demonstrate that the proposed architecture
outperforms previous state-of-the-art while it offers a significant increase
in processing speed.
 
  Address Munich; September 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ECCV  
  Notes DAG; 600.084; 601.338; 600.121; 600.129 Approved no  
  Call Number Admin @ si @ GMR2018 Serial (down) 3143  
Permanent link to this record
 

 
Author V. Poulain d'Andecy; Emmanuel Hartmann; Marçal Rusiñol edit   pdf
doi  openurl
  Title Field Extraction by hybrid incremental and a-priori structural templates Type Conference Article
  Year 2018 Publication 13th IAPR International Workshop on Document Analysis Systems Abbreviated Journal  
  Volume Issue Pages 251 - 256  
  Keywords Layout Analysis; information extraction; incremental learning  
  Abstract In this paper, we present an incremental framework for extracting information fields from administrative documents. First, we demonstrate some limits of the existing state-of-the-art methods such as the delay of the system efficiency. This is a concern in industrial context when we have only few samples of each document class. Based on this analysis, we propose a hybrid system combining incremental learning by means of itf-df statistics and a-priori generic
models. We report in the experimental section our results obtained with a dataset of real invoices.
 
  Address Viena; Austria; April 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference DAS  
  Notes DAG; 600.084; 600.129; 600.121 Approved no  
  Call Number Admin @ si @ PHR2018 Serial (down) 3106  
Permanent link to this record
 

 
Author David Aldavert; Marçal Rusiñol edit   pdf
doi  openurl
  Title Synthetically generated semantic codebook for Bag-of-Visual-Words based word spotting Type Conference Article
  Year 2018 Publication 13th IAPR International Workshop on Document Analysis Systems Abbreviated Journal  
  Volume Issue Pages 223 - 228  
  Keywords Word Spotting; Bag of Visual Words; Synthetic Codebook; Semantic Information  
  Abstract Word-spotting methods based on the Bag-ofVisual-Words framework have demonstrated a good retrieval performance even when used in a completely unsupervised manner. Although unsupervised approaches are suitable for
large document collections due to the cost of acquiring labeled data, these methods also present some drawbacks. For instance, having to train a suitable “codebook” for a certain dataset has a high computational cost. Therefore, in
this paper we present a database agnostic codebook which is trained from synthetic data. The aim of the proposed approach is to generate a codebook where the only information required is the type of script used in the document. The use of synthetic data also allows to easily incorporate semantic
information in the codebook generation. So, the proposed method is able to determine which set of codewords have a semantic representation of the descriptor feature space. Experimental results show that the resulting codebook attains a state-of-the-art performance while having a more compact representation.
 
  Address Viena; Austria; April 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference DAS  
  Notes DAG; 600.084; 600.129; 600.121 Approved no  
  Call Number Admin @ si @ AlR2018b Serial (down) 3105  
Permanent link to this record
 

 
Author David Aldavert; Marçal Rusiñol edit   pdf
doi  openurl
  Title Manuscript text line detection and segmentation using second-order derivatives analysis Type Conference Article
  Year 2018 Publication 13th IAPR International Workshop on Document Analysis Systems Abbreviated Journal  
  Volume Issue Pages 293 - 298  
  Keywords text line detection; text line segmentation; text region detection; second-order derivatives  
  Abstract In this paper, we explore the use of second-order derivatives to detect text lines on handwritten document images. Taking advantage that the second derivative gives a minimum response when a dark linear element over a
bright background has the same orientation as the filter, we use this operator to create a map with the local orientation and strength of putative text lines in the document. Then, we detect line segments by selecting and merging the filter responses that have a similar orientation and scale. Finally, text lines are found by merging the segments that are within the same text region. The proposed segmentation algorithm, is learning-free while showing a performance similar to the state of the art methods in publicly available datasets.
 
  Address Viena; Austria; April 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference DAS  
  Notes DAG; 600.084; 600.129; 302.065; 600.121 Approved no  
  Call Number Admin @ si @ AlR2018a Serial (down) 3104  
Permanent link to this record
 

 
Author Dimosthenis Karatzas; Lluis Gomez; Marçal Rusiñol; Anguelos Nicolaou edit   pdf
url  openurl
  Title The Robust Reading Competition Annotation and Evaluation Platform Type Conference Article
  Year 2018 Publication 13th IAPR International Workshop on Document Analysis Systems Abbreviated Journal  
  Volume Issue Pages 61-66  
  Keywords  
  Abstract The ICDAR Robust Reading Competition (RRC), initiated in 2003 and reestablished in 2011, has become the defacto evaluation standard for the international community. Concurrent with its second incarnation in 2011, a continuous
effort started to develop an online framework to facilitate the hosting and management of competitions. This short paper briefly outlines the Robust Reading Competition Annotation and Evaluation Platform, the backbone of the
Robust Reading Competition, comprising a collection of tools and processes that aim to simplify the management and annotation of data, and to provide online and offline performance evaluation and analysis services.
 
  Address Viena; Austria; April 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference DAS  
  Notes DAG; 600.084; 600.121 Approved no  
  Call Number KGR2018 Serial (down) 3103  
Permanent link to this record
 

 
Author Lluis Gomez; Marçal Rusiñol; Dimosthenis Karatzas edit   pdf
url  doi
openurl 
  Title Cutting Sayre's Knot: Reading Scene Text without Segmentation. Application to Utility Meters Type Conference Article
  Year 2018 Publication 13th IAPR International Workshop on Document Analysis Systems Abbreviated Journal  
  Volume Issue Pages 97-102  
  Keywords Robust Reading; End-to-end Systems; CNN; Utility Meters  
  Abstract In this paper we present a segmentation-free system for reading text in natural scenes. A CNN architecture is trained in an end-to-end manner, and is able to directly output readings without any explicit text localization step. In order to validate our proposal, we focus on the specific case of reading utility meters. We present our results in a large dataset of images acquired by different users and devices, so text appears in any location, with different sizes, fonts and lengths, and the images present several distortions such as
dirt, illumination highlights or blur.
 
  Address Viena; Austria; April 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference DAS  
  Notes DAG; 600.084; 600.121; 600.129 Approved no  
  Call Number Admin @ si @ GRK2018 Serial (down) 3102  
Permanent link to this record
 

 
Author ChunYang; Xu Cheng Yin; Hong Yu; Dimosthenis Karatzas; Yu Cao edit  doi
isbn  openurl
  Title ICDAR2017 Robust Reading Challenge on Text Extraction from Biomedical Literature Figures (DeTEXT) Type Conference Article
  Year 2017 Publication 14th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume Issue Pages 1444-1447  
  Keywords  
  Abstract Hundreds of millions of figures are available in the biomedical literature, representing important biomedical experimental evidence. Since text is a rich source of information in figures, automatically extracting such text may assist in the task of mining figure information and understanding biomedical documents. Unlike images in the open domain, biomedical figures present a variety of unique challenges. For example, biomedical figures typically have complex layouts, small font sizes, short text, specific text, complex symbols and irregular text arrangements. This paper presents the final results of the ICDAR 2017 Competition on Text Extraction from Biomedical Literature Figures (ICDAR2017 DeTEXT Competition), which aims at extracting (detecting and recognizing) text from biomedical literature figures. Similar to text extraction from scene images and web pictures, ICDAR2017 DeTEXT Competition includes three major tasks, i.e., text detection, cropped word recognition and end-to-end text recognition. Here, we describe in detail the data set, tasks, evaluation protocols and participants of this competition, and report the performance of the participating methods.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-1-5386-3586-5 Medium  
  Area Expedition Conference ICDAR  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ YCY2017 Serial (down) 3098  
Permanent link to this record
 

 
Author N.Nayef; F.Yin; I.Bizid; H.Choi; Y.Feng; Dimosthenis Karatzas; Z.Luo; Umapada Pal; Christophe Rigaud; J. Chazalon; W.Khlif; Muhammad Muzzamil Luqman; Jean-Christophe Burie; C.L.Liu; Jean-Marc Ogier edit  doi
isbn  openurl
  Title ICDAR2017 Robust Reading Challenge on Multi-Lingual Scene Text Detection and Script Identification – RRC-MLT Type Conference Article
  Year 2017 Publication 14th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume Issue Pages 1454-1459  
  Keywords  
  Abstract Text detection and recognition in a natural environment are key components of many applications, ranging from business card digitization to shop indexation in a street. This competition aims at assessing the ability of state-of-the-art methods to detect Multi-Lingual Text (MLT) in scene images, such as in contents gathered from the Internet media and in modern cities where multiple cultures live and communicate together. This competition is an extension of the Robust Reading Competition (RRC) which has been held since 2003 both in ICDAR and in an online context. The proposed competition is presented as a new challenge of the RRC. The dataset built for this challenge largely extends the previous RRC editions in many aspects: the multi-lingual text, the size of the dataset, the multi-oriented text, the wide variety of scenes. The dataset is comprised of 18,000 images which contain text belonging to 9 languages. The challenge is comprised of three tasks related to text detection and script classification. We have received a total of 16 participations from the research and industrial communities. This paper presents the dataset, the tasks and the findings of this RRC-MLT challenge.  
  Address Kyoto; Japan; November 2017  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-1-5386-3586-5 Medium  
  Area Expedition Conference ICDAR  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ NYB2017 Serial (down) 3097  
Permanent link to this record
 

 
Author Sangheeta Roy; Palaiahnakote Shivakumara; Namita Jain; Vijeta Khare; Anjan Dutta; Umapada Pal; Tong Lu edit  doi
openurl 
  Title Rough-Fuzzy based Scene Categorization for Text Detection and Recognition in Video Type Journal Article
  Year 2018 Publication Pattern Recognition Abbreviated Journal PR  
  Volume 80 Issue Pages 64-82  
  Keywords Rough set; Fuzzy set; Video categorization; Scene image classification; Video text detection; Video text recognition  
  Abstract Scene image or video understanding is a challenging task especially when number of video types increases drastically with high variations in background and foreground. This paper proposes a new method for categorizing scene videos into different classes, namely, Animation, Outlet, Sports, e-Learning, Medical, Weather, Defense, Economics, Animal Planet and Technology, for the performance improvement of text detection and recognition, which is an effective approach for scene image or video understanding. For this purpose, at first, we present a new combination of rough and fuzzy concept to study irregular shapes of edge components in input scene videos, which helps to classify edge components into several groups. Next, the proposed method explores gradient direction information of each pixel in each edge component group to extract stroke based features by dividing each group into several intra and inter planes. We further extract correlation and covariance features to encode semantic features located inside planes or between planes. Features of intra and inter planes of groups are then concatenated to get a feature matrix. Finally, the feature matrix is verified with temporal frames and fed to a neural network for categorization. Experimental results show that the proposed method outperforms the existing state-of-the-art methods, at the same time, the performances of text detection and recognition methods are also improved significantly due to categorization.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.097; 600.121 Approved no  
  Call Number Admin @ si @ RSJ2018 Serial (down) 3096  
Permanent link to this record
Select All    Deselect All
 |   | 
Details

Save Citations:
Export Records: