|
Records |
Links |
|
Author |
Jon Almazan; Albert Gordo; Alicia Fornes; Ernest Valveny |
|
|
Title |
Efficient Exemplar Word Spotting |
Type |
Conference Article |
|
Year |
2012 |
Publication |
23rd British Machine Vision Conference |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
67.1- 67.11 |
|
|
Keywords |
|
|
|
Abstract |
In this paper we propose an unsupervised segmentation-free method for word spotting in document images.
Documents are represented with a grid of HOG descriptors, and a sliding window approach is used to locate the document regions that are most similar to the query. We use the exemplar SVM framework to produce a better representation of the query in an unsupervised way. Finally, the document descriptors are precomputed and compressed with Product Quantization. This offers two advantages: first, a large number of documents can be kept in RAM memory at the same time. Second, the sliding window becomes significantly faster since distances between quantized HOG descriptors can be precomputed. Our results significantly outperform other segmentation-free methods in the literature, both in accuracy and in speed and memory usage. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
1-901725-46-4 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
BMVC |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
DAG @ dag @ AGF2012 |
Serial |
1984 |
|
Permanent link to this record |
|
|
|
|
Author |
Suman Ghosh; Lluis Gomez; Dimosthenis Karatzas; Ernest Valveny |
|
|
Title |
Efficient indexing for Query By String text retrieval |
Type |
Conference Article |
|
Year |
2015 |
Publication |
6th IAPR International Workshop on Camera Based Document Analysis and Recognition CBDAR2015 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
1236 - 1240 |
|
|
Keywords |
|
|
|
Abstract |
This paper deals with Query By String word spotting in scene images. A hierarchical text segmentation algorithm based on text specific selective search is used to find text regions. These regions are indexed per character n-grams present in the text region. An attribute representation based on Pyramidal Histogram of Characters (PHOC) is used to compare text regions with the query text. For generation of the index a similar attribute space based Pyramidal Histogram of character n-grams is used. These attribute models are learned using linear SVMs over the Fisher Vector [1] representation of the images along with the PHOC labels of the corresponding strings. |
|
|
Address |
Nancy; France; August 2015 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CBDAR |
|
|
Notes |
DAG; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ GGK2015 |
Serial |
2693 |
|
Permanent link to this record |
|
|
|
|
Author |
Marçal Rusiñol; Josep Llados |
|
|
Title |
Efficient Logo Retrieval Through Hashing Shape Context Descriptors |
Type |
Conference Article |
|
Year |
2010 |
Publication |
9th IAPR International Workshop on Document Analysis Systems |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
215–222 |
|
|
Keywords |
|
|
|
Abstract |
In this paper, we present an approach towards the retrieval of words from graphical document images. In graphical documents, due to presence of multi-oriented characters in non-structured layout, word indexing is a challenging task. The proposed approach uses recognition results of individual components to form character pairs with the neighboring components. An indexing scheme is designed to store the spatial description of components and to access them efficiently. Given a query text word (ascii/unicode format), the character pairs present in it are searched in the document. Next the retrieved character pairs are linked sequentially to form character string. Dynamic programming is applied to find different instances of query words. A string edit distance is used here to match the query word as the objective function. Recognition of multi-scale and multi-oriented character component is done using Support Vector Machine classifier. To consider multi-oriented character strings the features used in the SVM are invariant to character orientation. Experimental results show that the method is efficient to locate a query word from multi-oriented text in graphical documents. |
|
|
Address |
Boston; USA |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
DAS |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
DAG @ dag @ RuL2010b |
Serial |
1434 |
|
Permanent link to this record |
|
|
|
|
Author |
Marçal Rusiñol; David Aldavert; Ricardo Toledo; Josep Llados |
|
|
Title |
Efficient segmentation-free keyword spotting in historical document collections |
Type |
Journal Article |
|
Year |
2015 |
Publication |
Pattern Recognition |
Abbreviated Journal |
PR |
|
|
Volume |
48 |
Issue |
2 |
Pages |
545–555 |
|
|
Keywords |
Historical documents; Keyword spotting; Segmentation-free; Dense SIFT features; Latent semantic analysis; Product quantization |
|
|
Abstract |
In this paper we present an efficient segmentation-free word spotting method, applied in the context of historical document collections, that follows the query-by-example paradigm. We use a patch-based framework where local patches are described by a bag-of-visual-words model powered by SIFT descriptors. By projecting the patch descriptors to a topic space with the latent semantic analysis technique and compressing the descriptors with the product quantization method, we are able to efficiently index the document information both in terms of memory and time. The proposed method is evaluated using four different collections of historical documents achieving good performances on both handwritten and typewritten scenarios. The yielded performances outperform the recent state-of-the-art keyword spotting approaches. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; ADAS; 600.076; 600.077; 600.061; 601.223; 602.006; 600.055 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RAT2015a |
Serial |
2544 |
|
Permanent link to this record |
|
|
|
|
Author |
Juan Ignacio Toledo; Alicia Fornes; Jordi Cucurull; Josep Llados |
|
|
Title |
Election Tally Sheets Processing System |
Type |
Conference Article |
|
Year |
2016 |
Publication |
12th IAPR Workshop on Document Analysis Systems |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
364-368 |
|
|
Keywords |
|
|
|
Abstract |
In paper based elections, manual tallies at polling station level produce myriads of documents. These documents share a common form-like structure and a reduced vocabulary worldwide. On the other hand, each tally sheet is filled by a different writer and on different countries, different scripts are used. We present a complete document analysis system for electoral tally sheet processing combining state of the art techniques with a new handwriting recognition subprocess based on unsupervised feature discovery with Variational Autoencoders and sequence classification with BLSTM neural networks. The whole system is designed to be script independent and allows a fast and reliable results consolidation process with reduced operational cost. |
|
|
Address |
Santorini; Greece; April 2016 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
DAS |
|
|
Notes |
DAG; 602.006; 600.061; 601.225; 600.077; 600.097 |
Approved |
no |
|
|
Call Number |
TFC2016 |
Serial |
2752 |
|
Permanent link to this record |
|
|
|
|
Author |
Francisco Cruz; Oriol Ramos Terrades |
|
|
Title |
EM-Based Layout Analysis Method for Structured Documents |
Type |
Conference Article |
|
Year |
2014 |
Publication |
22nd International Conference on Pattern Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
315-320 |
|
|
Keywords |
|
|
|
Abstract |
In this paper we present a method to perform layout analysis in structured documents. We proposed an EM-based algorithm to fit a set of Gaussian mixtures to the different regions according to the logical distribution along the page. After the convergence, we estimate the final shape of the regions according
to the parameters computed for each component of the mixture. We evaluated our method in the task of record detection in a collection of historical structured documents and performed a comparison with other previous works in this task. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1051-4651 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICPR |
|
|
Notes |
DAG; 602.006; 600.061; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ CrR2014 |
Serial |
2530 |
|
Permanent link to this record |
|
|
|
|
Author |
Hongxing Gao; Marçal Rusiñol; Dimosthenis Karatzas; Josep Llados |
|
|
Title |
Embedding Document Structure to Bag-of-Words through Pair-wise Stable Key-regions |
Type |
Conference Article |
|
Year |
2014 |
Publication |
22nd International Conference on Pattern Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
2903 - 2908 |
|
|
Keywords |
|
|
|
Abstract |
Since the document structure carries valuable discriminative information, plenty of efforts have been made for extracting and understanding document structure among which layout analysis approaches are the most commonly used. In this paper, Distance Transform based MSER (DTMSER) is employed to efficiently extract the document structure as a dendrogram of key-regions which roughly correspond to structural elements such as characters, words and paragraphs. Inspired by the Bag
of Words (BoW) framework, we propose an efficient method for structural document matching by representing the document image as a histogram of key-region pairs encoding structural relationships.
Applied to the scenario of document image retrieval, experimental results demonstrate a remarkable improvement when comparing the proposed method with typical BoW and pyramidal BoW methods. |
|
|
Address |
Stockholm; Sweden; August 2014 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICPR |
|
|
Notes |
DAG; 600.056; 600.061; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ GRK2014b |
Serial |
2497 |
|
Permanent link to this record |
|
|
|
|
Author |
Jaume Gibert; Ernest Valveny; Horst Bunke |
|
|
Title |
Embedding of Graphs with Discrete Attributes Via Label Frequencies |
Type |
Journal Article |
|
Year |
2013 |
Publication |
International Journal of Pattern Recognition and Artificial Intelligence |
Abbreviated Journal |
IJPRAI |
|
|
Volume |
27 |
Issue |
3 |
Pages |
1360002-1360029 |
|
|
Keywords |
Discrete attributed graphs; graph embedding; graph classification |
|
|
Abstract |
Graph-based representations of patterns are very flexible and powerful, but they are not easily processed due to the lack of learning algorithms in the domain of graphs. Embedding a graph into a vector space solves this problem since graphs are turned into feature vectors and thus all the statistical learning machinery becomes available for graph input patterns. In this work we present a new way of embedding discrete attributed graphs into vector spaces using node and edge label frequencies. The methodology is experimentally tested on graph classification problems, using patterns of different nature, and it is shown to be competitive to state-of-the-art classification algorithms for graphs, while being computationally much more efficient. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ GVB2013 |
Serial |
2305 |
|
Permanent link to this record |
|
|
|
|
Author |
Manuel Carbonell; Joan Mas; Mauricio Villegas; Alicia Fornes; Josep Llados |
|
|
Title |
End-to-End Handwritten Text Detection and Transcription in Full Pages |
Type |
Conference Article |
|
Year |
2019 |
Publication |
2nd International Workshop on Machine Learning |
Abbreviated Journal |
|
|
|
Volume |
5 |
Issue |
|
Pages |
29-34 |
|
|
Keywords |
Handwritten Text Recognition; Layout Analysis; Text segmentation; Deep Neural Networks; Multi-task learning |
|
|
Abstract |
When transcribing handwritten document images, inaccuracies in the text segmentation step often cause errors in the subsequent transcription step. For this reason, some recent methods propose to perform the recognition at paragraph level. But still, errors in the segmentation of paragraphs can affect
the transcription performance. In this work, we propose an end-to-end framework to transcribe full pages. The joint text detection and transcription allows to remove the layout analysis requirement at test time. The experimental results show that our approach can achieve comparable results to models that assume
segmented paragraphs, and suggest that joining the two tasks brings an improvement over doing the two tasks separately. |
|
|
Address |
Sydney; Australia; September 2019 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR WML |
|
|
Notes |
DAG; 600.140; 601.311; 600.140 |
Approved |
no |
|
|
Call Number |
Admin @ si @ CMV2019 |
Serial |
3353 |
|
Permanent link to this record |
|
|
|
|
Author |
S.K. Jemni; Mohamed Ali Souibgui; Yousri Kessentini; Alicia Fornes |
|
|
Title |
Enhance to Read Better: A Multi-Task Adversarial Network for Handwritten Document Image Enhancement |
Type |
Journal Article |
|
Year |
2022 |
Publication |
Pattern Recognition |
Abbreviated Journal |
PR |
|
|
Volume |
123 |
Issue |
|
Pages |
108370 |
|
|
Keywords |
|
|
|
Abstract |
Handwritten document images can be highly affected by degradation for different reasons: Paper ageing, daily-life scenarios (wrinkles, dust, etc.), bad scanning process and so on. These artifacts raise many readability issues for current Handwritten Text Recognition (HTR) algorithms and severely devalue their efficiency. In this paper, we propose an end to end architecture based on Generative Adversarial Networks (GANs) to recover the degraded documents into a and form. Unlike the most well-known document binarization methods, which try to improve the visual quality of the degraded document, the proposed architecture integrates a handwritten text recognizer that promotes the generated document image to be more readable. To the best of our knowledge, this is the first work to use the text information while binarizing handwritten documents. Extensive experiments conducted on degraded Arabic and Latin handwritten documents demonstrate the usefulness of integrating the recognizer within the GAN architecture, which improves both the visual quality and the readability of the degraded document images. Moreover, we outperform the state of the art in H-DIBCO challenges, after fine tuning our pre-trained model with synthetically degraded Latin handwritten images, on this task. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.124; 600.121; 602.230 |
Approved |
no |
|
|
Call Number |
Admin @ si @ JSK2022 |
Serial |
3613 |
|
Permanent link to this record |