Records |
Links |
Author |
Lluis Gomez; Marçal Rusiñol; Dimosthenis Karatzas |
Title |
Cutting Sayre's Knot: Reading Scene Text without Segmentation. Application to Utility Meters |
Type |
Conference Article |
Year |
2018 |
Publication |
13th IAPR International Workshop on Document Analysis Systems |
Abbreviated Journal |
Volume |
Issue |
Pages |
97-102 |
Keywords |
Robust Reading; End-to-end Systems; CNN; Utility Meters |
Abstract |
In this paper we present a segmentation-free system for reading text in natural scenes. A CNN architecture is trained in an end-to-end manner, and is able to directly output readings without any explicit text localization step. In order to validate our proposal, we focus on the specific case of reading utility meters. We present our results on a large dataset of images acquired by different users and devices, so text appears in any location, with different sizes, fonts and lengths, and the images present several distortions such as dirt, illumination highlights or blur. |
Address |
Vienna; Austria; April 2018 |
Corporate Author |
Thesis |
Publisher |
Place of Publication |
Editor |
Language |
Summary Language |
Original Title |
Series Editor |
Series Title |
Abbreviated Series Title |
Series Volume |
Series Issue |
Edition |
Medium |
Area |
Expedition |
Conference |
Notes |
DAG; 600.084; 600.121; 600.129 |
Approved |
no |
Call Number |
Admin @ si @ GRK2018 |
Serial |
3102 |
Permanent link to this record |
Author |
Dena Bazazian; Raul Gomez; Anguelos Nicolaou; Lluis Gomez; Dimosthenis Karatzas; Andrew Bagdanov |
Title |
FAST: Facilitated and Accurate Scene Text Proposals through FCN Guided Pruning |
Type |
Journal Article |
Year |
2019 |
Publication |
Pattern Recognition Letters |
Abbreviated Journal |
Volume |
119 |
Issue |
Pages |
112-120 |
Keywords |
Abstract |
Class-specific text proposal algorithms can efficiently reduce the search space for possible text object locations in an image. In this paper we combine the Text Proposals algorithm with Fully Convolutional Networks to efficiently reduce the number of proposals while maintaining the same recall level and thus gaining a significant speed up. Our experiments demonstrate that such text proposal approaches yield significantly higher recall rates than state-of-the-art text localization techniques, while also producing better-quality localizations. Our results on the ICDAR 2015 Robust Reading Competition (Challenge 4) and the COCO-text datasets show that, when combined with strong word classifiers, this recall margin leads to state-of-the-art results in end-to-end scene text recognition. |
Address |
Corporate Author |
Thesis |
Publisher |
Place of Publication |
Editor |
Language |
Summary Language |
Original Title |
Series Editor |
Series Title |
Abbreviated Series Title |
Series Volume |
Series Issue |
Edition |
Medium |
Area |
Expedition |
Conference |
Notes |
DAG; 600.084; 600.121; 600.129 |
Approved |
no |
Call Number |
Admin @ si @ BGN2019 |
Serial |
3342 |
Permanent link to this record |
Author |
David Aldavert; Marçal Rusiñol; Ricardo Toledo |
Title |
Automatic Static/Variable Content Separation in Administrative Document Images |
Type |
Conference Article |
Year |
2017 |
Publication |
14th International Conference on Document Analysis and Recognition |
Abbreviated Journal |
Volume |
Issue |
Pages |
Keywords |
Abstract |
In this paper we present an automatic method for separating static and variable content from administrative document images. An alignment approach builds probabilistic templates, without supervision, from a set of examples of the same document kind. Such templates define the likelihood of every pixel being either static or variable content. In the extraction step, the same alignment technique is used to match an incoming image with the template and to locate the positions where variable fields appear. We validate our approach on the public NIST Structured Tax Forms Dataset. |
Address |
Kyoto; Japan; November 2017 |
Corporate Author |
Thesis |
Publisher |
Place of Publication |
Editor |
Language |
Summary Language |
Original Title |
Series Editor |
Series Title |
Abbreviated Series Title |
Series Volume |
Series Issue |
Edition |
Medium |
Area |
Expedition |
Conference |
Notes |
DAG; 600.084; 600.121; ADAS |
Approved |
no |
Call Number |
Admin @ si @ ART2017 |
Serial |
3001 |
Permanent link to this record |
Author |
David Aldavert; Marçal Rusiñol |
Title |
Manuscript text line detection and segmentation using second-order derivatives analysis |
Type |
Conference Article |
Year |
2018 |
Publication |
13th IAPR International Workshop on Document Analysis Systems |
Abbreviated Journal |
Volume |
Issue |
Pages |
293-298 |
Keywords |
text line detection; text line segmentation; text region detection; second-order derivatives |
Abstract |
In this paper, we explore the use of second-order derivatives to detect text lines on handwritten document images. Taking advantage of the fact that the second derivative gives a minimum response when a dark linear element over a bright background has the same orientation as the filter, we use this operator to create a map with the local orientation and strength of putative text lines in the document. Then, we detect line segments by selecting and merging the filter responses that have a similar orientation and scale. Finally, text lines are found by merging the segments that are within the same text region. The proposed segmentation algorithm is learning-free while showing a performance similar to the state-of-the-art methods on publicly available datasets. |
Address |
Vienna; Austria; April 2018 |
Corporate Author |
Thesis |
Publisher |
Place of Publication |
Editor |
Language |
Summary Language |
Original Title |
Series Editor |
Series Title |
Abbreviated Series Title |
Series Volume |
Series Issue |
Edition |
Medium |
Area |
Expedition |
Conference |
Notes |
DAG; 600.084; 600.129; 302.065; 600.121; ADAS |
Approved |
no |
Call Number |
Admin @ si @ AlR2018a |
Serial |
3104 |
Permanent link to this record |
Author |
V. Poulain d'Andecy; Emmanuel Hartmann; Marçal Rusiñol |
Title |
Field Extraction by hybrid incremental and a-priori structural templates |
Type |
Conference Article |
Year |
2018 |
Publication |
13th IAPR International Workshop on Document Analysis Systems |
Abbreviated Journal |
Volume |
Issue |
Pages |
251-256 |
Keywords |
Layout Analysis; information extraction; incremental learning |
Abstract |
In this paper, we present an incremental framework for extracting information fields from administrative documents. First, we demonstrate some limits of the existing state-of-the-art methods such as the delay of the system efficiency. This is a concern in an industrial context when we have only a few samples of each document class. Based on this analysis, we propose a hybrid system combining incremental learning by means of itf-df statistics and a-priori generic models. We report in the experimental section our results obtained with a dataset of real invoices. |
Address |
Vienna; Austria; April 2018 |
Corporate Author |
Thesis |
Publisher |
Place of Publication |
Editor |
Language |
Summary Language |
Original Title |
Series Editor |
Series Title |
Abbreviated Series Title |
Series Volume |
Series Issue |
Edition |
Medium |
Area |
Expedition |
Conference |
Notes |
DAG; 600.084; 600.129; 600.121 |
Approved |
no |
Call Number |
Admin @ si @ PHR2018 |
Serial |
3106 |
Permanent link to this record |
Author |
David Aldavert; Marçal Rusiñol |
Title |
Synthetically generated semantic codebook for Bag-of-Visual-Words based word spotting |
Type |
Conference Article |
Year |
2018 |
Publication |
13th IAPR International Workshop on Document Analysis Systems |
Abbreviated Journal |
Volume |
Issue |
Pages |
223-228 |
Keywords |
Word Spotting; Bag of Visual Words; Synthetic Codebook; Semantic Information |
Abstract |
Word-spotting methods based on the Bag-of-Visual-Words framework have demonstrated a good retrieval performance even when used in a completely unsupervised manner. Although unsupervised approaches are suitable for large document collections due to the cost of acquiring labeled data, these methods also present some drawbacks. For instance, having to train a suitable “codebook” for a certain dataset has a high computational cost. Therefore, in this paper we present a database-agnostic codebook which is trained from synthetic data. The aim of the proposed approach is to generate a codebook where the only information required is the type of script used in the document. The use of synthetic data also makes it easy to incorporate semantic information in the codebook generation. So, the proposed method is able to determine which set of codewords have a semantic representation of the descriptor feature space. Experimental results show that the resulting codebook attains a state-of-the-art performance while having a more compact representation. |
Address |
Vienna; Austria; April 2018 |
Corporate Author |
Thesis |
Publisher |
Place of Publication |
Editor |
Language |
Summary Language |
Original Title |
Series Editor |
Series Title |
Abbreviated Series Title |
Series Volume |
Series Issue |
Edition |
Medium |
Area |
Expedition |
Conference |
Notes |
DAG; 600.084; 600.129; 600.121; ADAS |
Approved |
no |
Call Number |
Admin @ si @ AlR2018b |
Serial |
3105 |
Permanent link to this record |
Author |
Marçal Rusiñol |
Title |
Classificació semàntica i visual de documents digitals |
Type |
Journal |
Year |
2019 |
Publication |
Revista de biblioteconomia i documentacio |
Abbreviated Journal |
Volume |
Issue |
Pages |
75-86 |
Keywords |
Abstract |
This article analyzes automatic processing systems that work on digitized documents with the aim of describing their contents. In this way they help to facilitate access, enable automatic indexing, and make documents accessible to search engines. The goal of these technologies is to train computational models capable of classifying, clustering, or searching digital documents. Accordingly, the tasks of classification, clustering, and search are described. When we use artificial intelligence technologies in classification systems, we expect the tool to return semantic labels; in clustering systems, to return documents grouped into meaningful clusters; and in search systems, given a query, to return a list of documents ranked by relevance. Next, an overview is given of the methods that allow us to describe digital documents, both visually (what they look like) and in terms of their semantic contents (what they are about). Regarding the visual description of documents, the state of the art in numerical representations of digitized documents is reviewed, covering both classical methods and methods based on deep learning. Regarding the semantic description of contents, the techniques analyzed include optical character recognition (OCR); the computation of basic statistics on the occurrence of the different words in a text (bag-of-words model); and deep-learning-based methods such as word2vec, which relies on a neural network that, given a few words of a text, must predict which word will come next. Knowledge is being transferred from the engineering field and integrated into products and services in the areas of archival science, library science, documentation, and mass-consumer platforms; however, the algorithms must be efficient enough not only for recognition and literal transcription but also for interpreting the contents. |
Address |
Corporate Author |
Thesis |
Publisher |
Place of Publication |
Editor |
Language |
Summary Language |
Original Title |
Series Editor |
Series Title |
Abbreviated Series Title |
Series Volume |
Series Issue |
Edition |
Medium |
Area |
Expedition |
Conference |
Notes |
DAG; 600.084; 600.135; 600.121; 600.129 |
Approved |
no |
Call Number |
Admin @ si @ Rus2019 |
Serial |
3282 |
Permanent link to this record |
Author |
Lluis Gomez; Marçal Rusiñol; Ali Furkan Biten; Dimosthenis Karatzas |
Title |
Subtitulació automàtica d'imatges. Estat de l'art i limitacions en el context arxivístic |
Type |
Conference Article |
Year |
2018 |
Publication |
Jornades Imatge i Recerca |
Abbreviated Journal |
Volume |
Issue |
Pages |
Keywords |
Abstract |
Address |
Corporate Author |
Thesis |
Publisher |
Place of Publication |
Editor |
Language |
Summary Language |
Original Title |
Series Editor |
Series Title |
Abbreviated Series Title |
Series Volume |
Series Issue |
Edition |
Medium |
Area |
Expedition |
Conference |
Notes |
DAG; 600.084; 600.135; 601.338; 600.121; 600.129 |
Approved |
no |
Call Number |
Admin @ si @ GRB2018 |
Serial |
3173 |
Permanent link to this record |
Author |
Marçal Rusiñol; J. Chazalon; Jean-Marc Ogier; Josep Llados |
Title |
A Comparative Study of Local Detectors and Descriptors for Mobile Document Classification |
Type |
Conference Article |
Year |
2015 |
Publication |
13th International Conference on Document Analysis and Recognition ICDAR2015 |
Abbreviated Journal |
Volume |
Issue |
Pages |
596-600 |
Keywords |
Abstract |
In this paper we conduct a comparative study of local key-point detectors and local descriptors for the specific task of mobile document classification. A classification architecture based on direct matching of local descriptors is used as baseline for the comparative study. A set of four different key-point detectors and four different local descriptors are tested in all the possible combinations. The experiments are conducted on a database consisting of 30 model documents acquired on 6 different backgrounds, totaling more than 36,000 test images. |
Address |
Nancy; France; August 2015 |
Corporate Author |
Thesis |
Publisher |
Place of Publication |
Editor |
Language |
Summary Language |
Original Title |
Series Editor |
Series Title |
Abbreviated Series Title |
Series Volume |
Series Issue |
Edition |
Medium |
Area |
Expedition |
Conference |
Notes |
DAG; 600.084; 600.61; 601.223; 600.077 |
Approved |
no |
Call Number |
Admin @ si @ RCO2015 |
Serial |
2684 |
Permanent link to this record |
Author |
Lluis Gomez; Dimosthenis Karatzas |
Title |
TextProposals: a Text‐specific Selective Search Algorithm for Word Spotting in the Wild |
Type |
Journal Article |
Year |
2017 |
Publication |
Pattern Recognition |
Abbreviated Journal |
PR |
Volume |
70 |
Issue |
Pages |
60-74 |
Keywords |
Abstract |
Motivated by the success of powerful yet expensive techniques to recognize words in a holistic way (Goel et al., 2013; Almazán et al., 2014; Jaderberg et al., 2016), object proposal techniques emerge as an alternative to the traditional text detectors. In this paper we introduce a novel object proposals method that is specifically designed for text. We rely on a similarity-based region grouping algorithm that generates a hierarchy of word hypotheses. Over the nodes of this hierarchy it is possible to apply a holistic word recognition method in an efficient way.
Our experiments demonstrate that the presented method is superior in its ability to produce good-quality word proposals when compared with class-independent algorithms. We show impressive recall rates with a few thousand proposals in different standard benchmarks, including focused or incidental text datasets, and multi-language scenarios. Moreover, the combination of our object proposals with existing whole-word recognizers (Almazán et al., 2014; Jaderberg et al., 2016) shows competitive performance in end-to-end word spotting, and, in some benchmarks, outperforms previously published results. Concretely, in the challenging ICDAR2015 Incidental Text dataset, we outperform by more than 10% F-score the best-performing method in the last ICDAR Robust Reading Competition (Karatzas, 2015). Source code of the complete end-to-end system is available at https://github.com/lluisgomez/TextProposals. |
Address |
Corporate Author |
Thesis |
Publisher |
Place of Publication |
Editor |
Language |
Summary Language |
Original Title |
Series Editor |
Series Title |
Abbreviated Series Title |
Series Volume |
Series Issue |
Edition |
Medium |
Area |
Expedition |
Conference |
Notes |
DAG; 600.084; 601.197; 600.121; 600.129 |
Approved |
no |
Call Number |
Admin @ si @ GoK2017 |
Serial |
2886 |
Permanent link to this record |