|
Records |
Links |
|
Author |
Juan Ignacio Toledo; Alicia Fornes; Jordi Cucurull; Josep Llados |
|
|
Title |
Election Tally Sheets Processing System |
Type |
Conference Article |
|
Year |
2016 |
Publication |
12th IAPR Workshop on Document Analysis Systems |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
364-368 |
|
|
Keywords |
|
|
|
Abstract |
In paper-based elections, manual tallies at polling station level produce myriads of documents. These documents share a common form-like structure and a reduced vocabulary worldwide. On the other hand, each tally sheet is filled in by a different writer, and different scripts are used in different countries. We present a complete document analysis system for electoral tally sheet processing that combines state-of-the-art techniques with a new handwriting recognition subprocess based on unsupervised feature discovery with Variational Autoencoders and sequence classification with BLSTM neural networks. The whole system is designed to be script independent and allows a fast and reliable results consolidation process with reduced operational cost. |
|
|
Address |
Santorini; Greece; April 2016 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
DAS |
|
|
Notes |
DAG; 602.006; 600.061; 601.225; 600.077; 600.097 |
Approved |
no |
|
|
Call Number |
TFC2016 |
Serial |
2752 |
|
|
|
|
|
Author |
Pau Riba; Josep Llados; Alicia Fornes; Anjan Dutta |
|
|
Title |
Large-scale graph indexing using binary embeddings of node contexts for information spotting in document image databases |
Type |
Journal Article |
|
Year |
2017 |
Publication |
Pattern Recognition Letters |
Abbreviated Journal |
PRL |
|
|
Volume |
87 |
Issue |
|
Pages |
203-211 |
|
|
Keywords |
|
|
|
Abstract |
Graph-based representations are seeing growing use in visual recognition and retrieval because of their representational power compared with classical appearance-based representations. However, retrieving a query graph from a large dataset of graphs implies a high computational complexity. The most important property for large-scale retrieval is that the search time complexity be sub-linear in the number of database examples. With this aim, in this paper we propose a graph indexation formalism applied to visual retrieval. A binary embedding is defined as hashing keys for graph nodes. Given a database of labeled graphs, graph nodes are complemented with vectors of attributes representing their local context. Then, each attribute vector is converted to a binary code by applying a binary-valued hash function. Graph retrieval is therefore formulated in terms of finding target graphs in the database whose nodes have a small Hamming distance from the query nodes, which is easily computed with bitwise logical operators. As an application example, we validate the performance of the proposed methods in different real scenarios such as handwritten word spotting in images of historical documents or symbol spotting in architectural floor plans. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.097; 602.006; 603.053; 600.121 |
Approved |
no |
|
|
Call Number |
RLF2017b |
Serial |
2873 |
|
|
|
|
|
Author |
Christophe Rigaud; Clement Guerin; Dimosthenis Karatzas; Jean-Christophe Burie; Jean-Marc Ogier |
|
|
Title |
Knowledge-driven understanding of images in comic books |
Type |
Journal Article |
|
Year |
2015 |
Publication |
International Journal on Document Analysis and Recognition |
Abbreviated Journal |
IJDAR |
|
|
Volume |
18 |
Issue |
3 |
Pages |
199-221 |
|
|
Keywords |
Document Understanding; comics analysis; expert system |
|
|
Abstract |
Document analysis is an active field of research, which can attain a complete understanding of the semantics of a given document. One example of the document understanding process is enabling a computer to identify the key elements of a comic book story and arrange them according to a predefined domain knowledge. In this study, we propose a knowledge-driven system that can interact with bottom-up and top-down information to progressively understand the content of a document. We model the knowledge of both the comic book domain and the image processing domain for information consistency analysis. In addition, different image processing methods are improved or developed to extract panels, balloons, tails, texts, comic characters and their semantic relations in an unsupervised way. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1433-2833 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.056; 600.077 |
Approved |
no |
|
|
Call Number |
RGK2015 |
Serial |
2595 |
|
|
|
|
|
Author |
Joan Mas; Gemma Sanchez; Josep Llados |
|
|
Title |
SSP: Sketching slide Presentations, a Syntactic Approach |
Type |
Book Chapter |
|
Year |
2010 |
Publication |
Graphics Recognition. Achievements, Challenges, and Evolution. 8th International Workshop, GREC 2009. Selected Papers |
Abbreviated Journal |
|
|
|
Volume |
6020 |
Issue |
|
Pages |
118-129 |
|
|
Keywords |
|
|
|
Abstract |
The design of a slide presentation is a creative process. In this process, humans first visualize in their minds what they want to explain. Then, they have to be able to represent this knowledge in an understandable way. A lot of commercial software exists that allows users to create their own slide presentations, but the creativity of the user is rather limited. In this article we present an application that allows the user to create and visualize a slide presentation from a sketch. A slide may be seen as a graphical document or a diagram where its elements are placed in a particular spatial arrangement. To describe and recognize slides, a syntactic approach is proposed. This approach is based on an Adjacency Grammar and a parsing methodology to cope with this kind of grammar. The experimental evaluation shows the performance of our methodology from a qualitative and a quantitative point of view. Six different slides containing different numbers of symbols, from 4 to 7, were given to the users, who drew them without restrictions on the order of the elements. The quantitative results give an idea of how suitable our methodology is for describing and recognizing the different elements in a slide. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0302-9743 |
ISBN |
978-3-642-13727-3 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
GREC |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
MSL2010 |
Serial |
2405 |
|
|
|
|
|
Author |
Minesh Mathew; Viraj Bagal; Ruben Tito; Dimosthenis Karatzas; Ernest Valveny; C.V. Jawahar |
|
|
Title |
InfographicVQA |
Type |
Conference Article |
|
Year |
2022 |
Publication |
Winter Conference on Applications of Computer Vision |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
1697-1706 |
|
|
Keywords |
Document Analysis Datasets; Evaluation and Comparison of Vision Algorithms; Vision and Languages |
|
|
Abstract |
Infographics communicate information using a combination of textual, graphical and visual elements. This work explores the automatic understanding of infographic images by using a Visual Question Answering technique. To this end, we present InfographicVQA, a new dataset comprising a diverse collection of infographics and question-answer annotations. The questions require methods that jointly reason over the document layout, textual content, graphical elements, and data visualizations. We curate the dataset with an emphasis on questions that require elementary reasoning and basic arithmetic skills. For VQA on the dataset, we evaluate two strong Transformer-based baselines. Both baselines yield unsatisfactory results compared to near-perfect human performance on the dataset. The results suggest that VQA on infographics, images that are designed to communicate information quickly and clearly to the human brain, is ideal for benchmarking machine understanding of complex document images. The dataset is available for download at docvqa.org |
|
|
Address |
Virtual; Waikoloa; Hawaii; USA; January 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
WACV |
|
|
Notes |
DAG; 600.155 |
Approved |
no |
|
|
Call Number |
MBT2022 |
Serial |
3625 |
|
|
|
|
|
Author |
Dimosthenis Karatzas; V. Poulain d'Andecy; Marçal Rusiñol |
|
|
Title |
Human-Document Interaction – a new frontier for document image analysis |
Type |
Conference Article |
|
Year |
2016 |
Publication |
12th IAPR Workshop on Document Analysis Systems |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
369-374 |
|
|
Keywords |
|
|
|
Abstract |
All indications show that paper documents will not cede in favour of their digital counterparts, but will instead be used increasingly in conjunction with digital information. An open challenge is how to seamlessly link the physical with the digital – how to continue taking advantage of the important affordances of paper, without missing out on digital functionality. This paper presents the authors’ experience with developing systems for Human-Document Interaction based on augmented document interfaces and examines new challenges and opportunities arising for the document image analysis field in this area. The system presented combines state-of-the-art camera-based document image analysis techniques with a range of complementary technologies to offer fluid Human-Document Interaction. Both fixed and nomadic setups that have gone through user testing in real-life environments are discussed, and use cases are presented that span the spectrum from business to educational applications. |
|
|
Address |
Santorini; Greece; April 2016 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
DAS |
|
|
Notes |
DAG; 600.084; 600.077 |
Approved |
no |
|
|
Call Number |
KPR2016 |
Serial |
2756 |
|
|
|
|
|
Author |
Dimosthenis Karatzas; Lluis Gomez; Marçal Rusiñol; Anguelos Nicolaou |
|
|
Title |
The Robust Reading Competition Annotation and Evaluation Platform |
Type |
Conference Article |
|
Year |
2018 |
Publication |
13th IAPR International Workshop on Document Analysis Systems |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
61-66 |
|
|
Keywords |
|
|
|
Abstract |
The ICDAR Robust Reading Competition (RRC), initiated in 2003 and reestablished in 2011, has become the de facto evaluation standard for the international community. Concurrent with its second incarnation in 2011, a continuous effort started to develop an online framework to facilitate the hosting and management of competitions. This short paper briefly outlines the Robust Reading Competition Annotation and Evaluation Platform, the backbone of the Robust Reading Competition, comprising a collection of tools and processes that aim to simplify the management and annotation of data, and to provide online and offline performance evaluation and analysis services. |
|
|
Address |
Vienna; Austria; April 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
DAS |
|
|
Notes |
DAG; 600.084; 600.121 |
Approved |
no |
|
|
Call Number |
KGR2018 |
Serial |
3103 |
|
|
|
|
|
Author |
Ernest Valveny; Ricardo Toledo; Ramon Baldrich; Enric Marti |
|
|
Title |
Combining recognition-based and segmentation-based approaches for graphic symbol recognition using deformable template matching |
Type |
Conference Article |
|
Year |
2002 |
Publication |
Proceedings of the Second IASTED International Conference on Visualization, Imaging and Image Processing (VIIP 2002) |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
502-507 |
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG;RV;CAT;IAM;CIC;ADAS |
Approved |
no |
|
|
Call Number |
IAM @ iam @ VTB2002 |
Serial |
1660 |
|
|
|
|
|
Author |
Ernest Valveny; Enric Marti |
|
|
Title |
Learning of structural descriptions of graphic symbols using deformable template matching |
Type |
Conference Article |
|
Year |
2001 |
Publication |
Proc. Sixth Int Document Analysis and Recognition Conf |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
455-459 |
|
|
Keywords |
|
|
|
Abstract |
Accurate symbol recognition in graphic documents needs an accurate representation of the symbols to be recognized. If structural approaches are used for recognition, symbols have to be described in terms of their shape, using structural relationships among extracted features. Unlike in statistical pattern recognition, in structural methods symbols are usually defined manually from expert knowledge, and not automatically inferred from sample images. In this work we explain one approach to learning a representative structural description of a symbol from examples, thus providing better information about shape variability. The description of a symbol is based on a probabilistic model. It consists of a set of lines described by the mean and the variance of line parameters, which provide information about the model of the symbol and its shape variability, respectively. The representation of each image in the sample set as a set of lines is achieved using deformable template matching. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG;IAM; |
Approved |
no |
|
|
Call Number |
IAM @ iam @ VMA2001 |
Serial |
1654 |
|
|
|
|
|
Author |
Ernest Valveny; Enric Marti |
|
|
Title |
A model for image generation and symbol recognition through the deformation of lineal shapes |
Type |
Journal Article |
|
Year |
2003 |
Publication |
Pattern Recognition Letters |
Abbreviated Journal |
PRL |
|
|
Volume |
24 |
Issue |
15 |
Pages |
2857-2867 |
|
|
Keywords |
|
|
|
Abstract |
We describe a general framework for the recognition of distorted images of lineal shapes, which relies on three items: a model to represent lineal shapes and their deformations, a model for the generation of distorted binary images and the combination of both models in a common probabilistic framework, where the generation of deformations is related to an internal energy, and the generation of binary images to an external energy. Then, recognition consists in the minimization of a global energy function, performed by using the EM algorithm. This general framework has been applied to the recognition of hand-drawn lineal symbols in graphic documents. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Elsevier Science Inc. |
Place of Publication |
New York, NY, USA |
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0167-8655 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; IAM |
Approved |
no |
|
|
Call Number |
IAM @ iam @ VAM2003 |
Serial |
1653 |
|