|   | 
Details
   web
Records
Author Suman Ghosh; Ernest Valveny
Title (down) Visual attention models for scene text recognition Type Conference Article
Year 2017 Publication 14th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume Issue Pages
Keywords
Abstract arXiv:1706.01487
In this paper we propose an approach to lexicon-free recognition of text in scene images. Our approach relies on a LSTM-based soft visual attention model learned from convolutional features. A set of feature vectors are derived from an intermediate convolutional layer corresponding to different areas of the image. This permits encoding of spatial information into the image representation. In this way, the framework is able to learn how to selectively focus on different parts of the image. At every time step the recognizer emits one character using a weighted combination of the convolutional feature vectors according to the learned attention model. Training can be done end-to-end using only word level annotations. In addition, we show that modifying the beam search algorithm by integrating an explicit language model leads to significantly better recognition results. We validate the performance of our approach on standard SVT and ICDAR'03 scene text datasets, showing state-of-the-art performance in unconstrained text recognition.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICDAR
Notes DAG; 600.121 Approved no
Call Number Admin @ si @ GhV2017b Serial 3080
Permanent link to this record
 

 
Author Suman Ghosh; Ernest Valveny
Title (down) R-PHOC: Segmentation-Free Word Spotting using CNN Type Conference Article
Year 2017 Publication 14th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume Issue Pages
Keywords Convolutional neural network; Image segmentation; Artificial neural network; Nearest neighbor search
Abstract arXiv:1707.01294
This paper proposes a region based convolutional neural network for segmentation-free word spotting. Our network takes as input an image and a set of word candidate bound- ing boxes and embeds all bounding boxes into an embedding space, where word spotting can be casted as a simple nearest neighbour search between the query representation and each of the candidate bounding boxes. We make use of PHOC embedding as it has previously achieved significant success in segmentation- based word spotting. Word candidates are generated using a simple procedure based on grouping connected components using some spatial constraints. Experiments show that R-PHOC which operates on images directly can improve the current state-of- the-art in the standard GW dataset and performs as good as PHOCNET in some cases designed for segmentation based word spotting.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICDAR
Notes DAG; 600.121 Approved no
Call Number Admin @ si @ GhV2017a Serial 3079
Permanent link to this record
 

 
Author Suman Ghosh; Ernest Valveny
Title (down) Query by String word spotting based on character bi-gram indexing Type Conference Article
Year 2015 Publication 13th International Conference on Document Analysis and Recognition ICDAR2015 Abbreviated Journal
Volume Issue Pages 881-885
Keywords
Abstract In this paper we propose a segmentation-free query by string word spotting method. Both the documents and query strings are encoded using a recently proposed word representa- tion that projects images and strings into a common atribute space based on a pyramidal histogram of characters(PHOC). These attribute models are learned using linear SVMs over the Fisher Vector representation of the images along with the PHOC labels of the corresponding strings. In order to search through the whole page, document regions are indexed per character bi- gram using a similar attribute representation. On top of that, we propose an integral image representation of the document using a simplified version of the attribute model for efficient computation. Finally we introduce a re-ranking step in order to boost retrieval performance. We show state-of-the-art results for segmentation-free query by string word spotting in single-writer and multi-writer standard datasets
Address Nancy; France; August 2015
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICDAR
Notes DAG; 600.077 Approved no
Call Number Admin @ si @ GhV2015a Serial 2715
Permanent link to this record
 

 
Author Suman Ghosh; Ernest Valveny
Title (down) A Sliding Window Framework for Word Spotting Based on Word Attributes Type Conference Article
Year 2015 Publication Pattern Recognition and Image Analysis, Proceedings of 7th Iberian Conference , ibPRIA 2015 Abbreviated Journal
Volume 9117 Issue Pages 652-661
Keywords Word spotting; Sliding window; Word attributes
Abstract In this paper we propose a segmentation-free approach to word spotting. Word images are first encoded into feature vectors using Fisher Vector. Then, these feature vectors are used together with pyramidal histogram of characters labels (PHOC) to learn SVM-based attribute models. Documents are represented by these PHOC based word attributes. To efficiently compute the word attributes over a sliding window, we propose to use an integral image representation of the document using a simplified version of the attribute model. Finally we re-rank the top word candidates using the more discriminative full version of the word attributes. We show state-of-the-art results for segmentation-free query-by-example word spotting in single-writer and multi-writer standard datasets.
Address Santiago de Compostela; June 2015
Corporate Author Thesis
Publisher Springer International Publishing Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN 0302-9743 ISBN 978-3-319-19389-2 Medium
Area Expedition Conference IbPRIA
Notes DAG; 600.077 Approved no
Call Number Admin @ si @ GhV2015b Serial 2716
Permanent link to this record