Author |
David Fernandez; Josep Llados; Alicia Fornes |
Title |
Handwritten Word Spotting in Old Manuscript Images Using a Pseudo-Structural Descriptor Organized in a Hash Structure |
Type |
Conference Article |
Year |
2011 |
Publication |
5th Iberian Conference on Pattern Recognition and Image Analysis |
Abbreviated Journal |
Volume |
6669 |
Issue |
Pages |
628-635 |
Keywords |
Abstract |
There are lots of historical handwritten documents with information that can be used for several studies and projects. The Document Image Analysis and Recognition community is interested in preserving these documents and extracting all the valuable information from them. Handwritten word spotting is the pattern classification task that consists of detecting handwritten word images. In this work, we have used a query-by-example formalism: we have matched an input image with one or multiple images from handwritten documents to determine the distance that might indicate a correspondence. We have developed an approach based on characteristic Loci features stored in a hash structure. Document images of the marriage licences of the Cathedral of Barcelona are used as the benchmarking database. |
Address |
Las Palmas de Gran Canaria. Spain |
Corporate Author |
Thesis |
Publisher |
Place of Publication |
Editor |
Jordi Vitria; Joao Miguel Raposo; Mario Hernandez |
Language |
Summary Language |
Original Title |
Series Editor |
Series Title |
Abbreviated Series Title |
Series Volume |
Series Issue |
Edition |
978-3-642-21256-7 |
Medium |
Area |
Expedition |
Conference |
Notes |
Approved |
no |
Call Number |
Admin @ si @ FLF2011 |
Serial |
1742 |
Permanent link to this record |
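The abstract of this record describes query-by-example word spotting with descriptors stored in a hash structure. The following is only a minimal sketch of that general idea, not the authors' implementation: it assumes each word image has already been reduced to a list of quantized feature codes (the tuples, word ids, and function names below are hypothetical), and it ranks indexed words by the number of codes they share with the query.

```python
from collections import Counter, defaultdict


def build_index(word_descriptors):
    """Map each quantized feature code to the set of word ids containing it."""
    index = defaultdict(set)
    for word_id, codes in word_descriptors.items():
        for code in codes:
            index[code].add(word_id)
    return index


def query(index, query_codes):
    """Rank indexed words by the number of distinct query codes they share."""
    votes = Counter()
    for code in set(query_codes):
        # .get avoids creating empty buckets for codes that were never indexed
        for word_id in index.get(code, ()):
            votes[word_id] += 1
    return votes.most_common()


# Hypothetical data: every tuple stands in for one quantized feature vector.
words = {
    "w1": [(1, 0, 2, 1), (0, 1, 1, 1), (2, 2, 0, 1)],
    "w2": [(1, 0, 2, 1), (0, 0, 0, 1)],
    "w3": [(2, 1, 1, 0)],
}
index = build_index(words)
ranking = query(index, [(1, 0, 2, 1), (2, 2, 0, 1)])
```

A hash lookup per code keeps the query cost proportional to the number of query codes rather than to the size of the collection, which is the usual motivation for this kind of indexing.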
Author |
Jean-Marc Ogier; Wenyin Liu; Josep Llados (eds) |
Title |
Graphics Recognition: Achievements, Challenges, and Evolution |
Type |
Book Whole |
Year |
2010 |
Publication |
8th International Workshop GREC 2009. |
Abbreviated Journal |
Volume |
6020 |
Issue |
Pages |
Keywords |
Abstract |
Address |
La Rochelle |
Corporate Author |
Thesis |
Publisher |
Springer Link |
Place of Publication |
Editor |
Jean-Marc Ogier; Wenyin Liu; Josep Llados |
Language |
Summary Language |
Original Title |
Series Editor |
Series Title |
Lecture Notes in Computer Science |
Abbreviated Series Title |
Series Volume |
Series Issue |
Edition |
978-3-642-13727-3 |
Medium |
Area |
Expedition |
Conference |
Notes |
Approved |
no |
Call Number |
Admin @ si @ OLL2010 |
Serial |
1976 |
Permanent link to this record |
Author |
Partha Pratim Roy; Eduard Vazquez; Josep Llados; Ramon Baldrich; Umapada Pal |
Title |
A System to Retrieve Text/Symbols from Color Maps using Connected Component and Skeleton Analysis |
Type |
Conference Article |
Year |
2007 |
Publication |
Seventh IAPR International Workshop on Graphics Recognition |
Abbreviated Journal |
Volume |
Issue |
Pages |
79–78 |
Keywords |
Abstract |
Address |
Curitiba (Brazil) |
Corporate Author |
Thesis |
Publisher |
Place of Publication |
Editor |
J. Llados, W. Liu, J.M. Ogier |
Language |
Summary Language |
Original Title |
Series Editor |
Series Title |
Abbreviated Series Title |
Series Volume |
Series Issue |
Edition |
Medium |
Area |
Expedition |
Conference |
Notes |
Approved |
no |
Call Number |
CAT @ cat @ RVL2007 |
Serial |
836 |
Permanent link to this record |
Author |
Joan Mas; J.A. Jorge; Gemma Sanchez; Josep Llados |
Title |
Describing and Parsing Hand-Drawn Sketches using a Syntactic Approach |
Type |
Conference Article |
Year |
2007 |
Publication |
Seventh IAPR International Workshop on Graphics Recognition |
Abbreviated Journal |
Volume |
Issue |
Pages |
61–62 |
Keywords |
Abstract |
Address |
Curitiba (Brazil) |
Corporate Author |
Thesis |
Publisher |
Place of Publication |
Editor |
J. Llados, W. Liu, J.M. Ogier |
Language |
Summary Language |
Original Title |
Series Editor |
Series Title |
Abbreviated Series Title |
Series Volume |
Series Issue |
Edition |
Medium |
Area |
Expedition |
Conference |
Notes |
Approved |
no |
Call Number |
DAG @ dag @ MJS2007 |
Serial |
845 |
Permanent link to this record |
Author |
Marçal Rusiñol; Josep Llados |
Title |
A Region-Based Hashing Approach for Symbol Spotting in Technical Documents |
Type |
Conference Article |
Year |
2007 |
Publication |
Seventh IAPR International Workshop on Graphics Recognition |
Abbreviated Journal |
Volume |
Issue |
Pages |
41–42 |
Keywords |
Abstract |
Address |
Curitiba (Brazil) |
Corporate Author |
Thesis |
Publisher |
Place of Publication |
Editor |
J. Llados, W. Liu, J.M. Ogier |
Language |
Summary Language |
Original Title |
Series Editor |
Series Title |
Abbreviated Series Title |
Series Volume |
Series Issue |
Edition |
Medium |
Area |
Expedition |
Conference |
Notes |
Approved |
no |
Call Number |
DAG @ dag @ RuL2007a |
Serial |
846 |
Permanent link to this record |
Author |
Josep Llados |
Title |
Computer Vision: Progress of Research and Development |
Type |
Book Whole |
Year |
2006 |
Publication |
1st CVC Internal Workshop Computer Vision: Progress of Research and Development |
Abbreviated Journal |
Volume |
Issue |
Pages |
Keywords |
Abstract |
Address |
Corporate Author |
Thesis |
Publisher |
Place of Publication |
Editor |
J. Llados (ed.) |
Language |
Summary Language |
Original Title |
Series Editor |
Series Title |
Abbreviated Series Title |
Series Volume |
Series Issue |
Edition |
84-933652-8-9 |
Medium |
Area |
Expedition |
Conference |
Notes |
Approved |
no |
Call Number |
DAG @ dag @ Lla2006b |
Serial |
766 |
Permanent link to this record |
Author |
Jaume Gibert; Ernest Valveny |
Title |
Graph Embedding based on Nodes Attributes Representatives and a Graph of Words Representation. |
Type |
Conference Article |
Year |
2010 |
Publication |
13th International Workshop on Structural and Syntactic Pattern Recognition and 8th International Workshop on Statistical Pattern Recognition |
Abbreviated Journal |
Volume |
6218 |
Issue |
Pages |
223–232 |
Keywords |
Abstract |
Although graph embedding has recently been used to extend statistical pattern recognition techniques to the graph domain, some existing embeddings are computationally expensive as they rely on classical graph-based operations. In this paper we present a new way to embed graphs into vector spaces by first encapsulating the information stored in the original graph under another graph representation, obtained by clustering the attributes of the graphs to be processed. This new representation makes the association of graphs to vectors an easy step by just arranging both node attributes and the adjacency matrix in the form of vectors. To test our method, we use two different databases of graphs whose node attributes are of different nature. A comparison with a reference method shows that this new embedding is better in terms of classification rates, while being much faster. |
Address |
Corporate Author |
Thesis |
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
Editor |
E.R. Hancock; R.C. Wilson; T. Windeatt; I. Ulusoy; F. Escolano |
Language |
Summary Language |
Original Title |
Series Editor |
Series Title |
Abbreviated Series Title |
Series Volume |
Series Issue |
Edition |
0302-9743 |
978-3-642-14979-5 |
Medium |
Area |
Expedition |
Conference |
Notes |
Approved |
no |
Call Number |
DAG @ dag @ GiV2010 |
Serial |
1416 |
Permanent link to this record |
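The abstract of this record describes embedding a graph into a vector by clustering node attributes and then arranging node and adjacency information as a vector. The sketch below illustrates one plausible reading of that idea under stated assumptions: the attribute representatives are given rather than learned by clustering, attributes are numeric tuples, and the function names are hypothetical. It is not the authors' code.

```python
from itertools import combinations_with_replacement


def nearest(representatives, attribute):
    """Index of the representative closest to the attribute (squared Euclidean)."""
    return min(
        range(len(representatives)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(representatives[i], attribute)),
    )


def embed(node_attributes, edges, representatives):
    """Vector = node counts per representative + edge counts per representative pair."""
    k = len(representatives)
    labels = [nearest(representatives, attr) for attr in node_attributes]

    node_counts = [0] * k
    for label in labels:
        node_counts[label] += 1

    # Unordered pairs (i, j) with i <= j, because the graph is treated as undirected.
    pairs = list(combinations_with_replacement(range(k), 2))
    edge_counts = dict.fromkeys(pairs, 0)
    for u, v in edges:
        edge_counts[tuple(sorted((labels[u], labels[v])))] += 1

    return node_counts + [edge_counts[p] for p in pairs]


# Hypothetical example: two representatives, three nodes, two edges.
representatives = [(0, 0), (10, 10)]
nodes = [(0, 1), (1, 0), (9, 10)]
edges = [(0, 1), (1, 2)]
vector = embed(nodes, edges, representatives)
```

Every graph maps to a vector of the same length, k + k(k+1)/2 for k representatives, so standard vector-based classifiers can be applied afterwards.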
Author |
Joan Mas |
Title |
A Syntactic Pattern Recognition Approach based on a Distribution Tolerant Adjacency Grammar and a Spatial Indexed Parser. Application to Sketched Document Recognition |
Type |
Book Whole |
Year |
2010 |
Publication |
PhD Thesis, Universitat Autonoma de Barcelona-CVC |
Abbreviated Journal |
Volume |
Issue |
Pages |
Keywords |
Abstract |
Sketch recognition is a discipline which has gained an increasing interest in the last 20 years. This is due to the appearance of new devices such as PDAs, Tablet PCs or digital pen & paper protocols. From the wide range of sketched documents we focus on those that represent structured documents such as architectural floor plans, engineering drawings, UML diagrams, etc. To recognize and understand these kinds of documents, first we have to recognize the different component symbols and then we have to identify the relations between these elements. Depending on the way a sketch is captured, there are two categories: on-line and off-line. On-line input modes refer to drawing directly on a PDA or a Tablet PC, while off-line input modes refer to scanning a previously drawn sketch.
This thesis lies at the intersection of three different areas of Computer Science: Pattern Recognition, Document Analysis and Human-Computer Interaction. The aim of this thesis is to interpret sketched documents independently of whether they are captured on-line or off-line. For this reason, the proposed approach should have the following features. First, as we are working with sketches, the elements present in our input contain distortions. Second, as we work in on-line or off-line input modes, the order in which the primitives are input is irrelevant. Finally, as the proposed method should be applicable in real scenarios, its response time must be low.
To interpret a sketched document we propose a syntactic approach. A syntactic approach is composed of two correlated components: a grammar and a parser. The grammar allows describing the different elements of the document as well as their relations. The parser, given a document, checks whether it belongs to the language generated by the grammar or not. Thus, the grammar should be able to cope with the distortions appearing in the instances of the elements. Moreover, it is necessary to define a symbol independently of the order of its primitives. Concerning the parser, when analyzing 2D sentences it does not assume an order in the primitives. Then, at each new primitive in the input, the parser searches among the previously analyzed symbols for candidates to produce a valid reduction.
Taking into account these features, we have proposed a grammar based on Adjacency Grammars. This kind of grammar defines its productions as a multiset of symbols rather than a list. This allows describing a symbol without an order in its components. To cope with distortion we have proposed a distortion model. This distortion model is an attribute estimated over the constraints of the grammar and passed through the productions. This measure gives an idea of how far the symbol is from its ideal model. In addition to the distortion on the constraints, other distortions appear when working with sketches. These distortions are: overtracing, overlapping, gaps or spurious strokes. Some grammatical productions have been defined to cope with these errors. Concerning the recognition, we have proposed an incremental parser with an indexation mechanism. Incremental parsers analyze the input symbol by symbol, giving a response to the user when a primitive is analyzed. This makes incremental parsers suitable to work in on-line as well as off-line input modes. The parser has been adapted with an indexation mechanism based on a spatial division. This indexation mechanism allows setting the primitives in the space and reducing the search to a neighbourhood.
A third contribution is a grammatical inference algorithm. Given a set of symbols, this method captures the production describing it. In the field of formal languages, different approaches have been proposed, but in the graphical domain not much work has been done in this field. The proposed method is able to capture the production from a set of symbols although they are drawn in different orders. A matching step based on the Hausdorff distance and the Hungarian method has been proposed to match the primitives of the different symbols. In addition, the proposed approach is able to capture the variability in the parameters of the constraints.
From the experimental results, we may conclude that we have proposed a robust approach to describe and recognize sketches. Moreover, the addition of new symbols to the alphabet is not restricted to an expert. Finally, the proposed approach has been used in two real scenarios obtaining a good performance. |
Address |
Corporate Author |
Thesis |
Ph.D. thesis |
Publisher |
Ediciones Graficas Rey |
Place of Publication |
Editor |
Gemma Sanchez; Josep Llados |
Language |
Summary Language |
Original Title |
Series Editor |
Series Title |
Abbreviated Series Title |
Series Volume |
Series Issue |
Edition |
978-84-937261-4-0 |
Medium |
Area |
Expedition |
Conference |
Notes |
Approved |
no |
Call Number |
DAG @ dag @ Mas2010 |
Serial |
1334 |
Permanent link to this record |
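The abstract of this record notes that adjacency grammars define productions as multisets of symbols rather than lists, so a symbol can be recognized regardless of the order in which its primitives were drawn. The sketch below illustrates only that order-independence property with a `Counter`; the production names and primitive labels are hypothetical, and the geometric constraints and distortion model described in the thesis are deliberately left out.

```python
from collections import Counter

# Hypothetical productions: each right-hand side is a multiset of primitives.
PRODUCTIONS = {
    "triangle": Counter({"segment": 3}),
    "arrow": Counter({"segment": 1, "triangle_head": 1}),
}


def match(primitives):
    """Return the names of all productions whose multiset equals the input bag."""
    bag = Counter(primitives)
    return sorted(name for name, rhs in PRODUCTIONS.items() if rhs == bag)


# The same primitives in a different drawing order reduce to the same symbol.
first_order = match(["segment", "triangle_head"])
second_order = match(["triangle_head", "segment"])
```

A real parser would additionally check the constraints attached to each production (for example adjacency between primitives) before accepting a reduction.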
Author |
Lluis Pere de las Heras |
Title |
Relational Models for Visual Understanding of Graphical Documents. Application to Architectural Drawings. |
Type |
Book Whole |
Year |
2014 |
Publication |
PhD Thesis, Universitat Autonoma de Barcelona-CVC |
Abbreviated Journal |
Volume |
Issue |
Pages |
Keywords |
Abstract |
Graphical documents express complex concepts using a visual language. This language consists of a vocabulary (symbols) and a syntax (structural relations between symbols) that articulate a semantic meaning in a certain context. Therefore, the automatic interpretation by computers of this sort of document entails three main steps: the detection of the symbols, the extraction of the structural relations between these symbols, and the modeling of the knowledge that permits the extraction of the semantics. Different domains in graphical documents include architectural and engineering drawings, maps, flowcharts, etc.
Graphics Recognition in particular and Document Image Analysis in general are born from the industrial need of interpreting a massive amount of digitalized documents after the emergence of the scanner. Although many years have passed, the graphical document understanding problem still seems to be far from being solved. The main reason is that the vast majority of the systems in the literature focus on very specific problems, where the domain of the document dictates the implementation of the interpretation. As a result, it is difficult to reuse these strategies on different data and in different contexts, thus hindering the natural progress in the field.
In this thesis, we face the graphical document understanding problem by proposing several relational models at different levels that are designed from a generic perspective. Firstly, we introduce three different strategies for the detection of symbols. The first method tackles the problem structurally, wherein general knowledge of the domain guides the detection. The second is a statistical method that learns the graphical appearance of the symbols and easily adapts to the big variability of the problem. The third method is a combination of the previous two methods that inherits their respective strengths, i.e. it copes with the big variability and does not need annotated data. Secondly, we present two relational strategies that tackle the problem of the visual context extraction. The first one is a fully bottom-up method that heuristically searches a graph representation for the contextual relations between symbols. In contrast, the second is a syntactic method that models the structure of the documents probabilistically. It automatically learns the model, which guides the inference algorithm to find the best structural representation for a given input. Finally, we construct a knowledge-based model consisting of an ontological definition of the domain and real data. This model makes it possible to perform contextual reasoning and to detect semantic inconsistencies within the data. We evaluate the suitability of the proposed contributions in the framework of floor plan interpretation. Since there is no standard in the modeling of these documents, there exists an enormous notation variability from plan to plan in terms of vocabulary and syntax. Therefore, floor plan interpretation is a relevant task in the graphical document understanding problem.
It is also worth mentioning that we make freely available all the resources used in this thesis (the data, the tool used to generate the data, and the evaluation scripts) with the aim of fostering research in the graphical document understanding task. |
Address |
Corporate Author |
Thesis |
Ph.D. thesis |
Publisher |
Ediciones Graficas Rey |
Place of Publication |
Editor |
Gemma Sanchez |
Language |
Summary Language |
Original Title |
Series Editor |
Series Title |
Abbreviated Series Title |
Series Volume |
Series Issue |
Edition |
978-84-940902-8-8 |
Medium |
Area |
Expedition |
Conference |
Notes |
DAG; 600.077 |
Approved |
no |
Call Number |
Admin @ si @ Her2014 |
Serial |
2574 |
Permanent link to this record |
Author |
Albert Gordo |
Title |
Document Image Representation, Classification and Retrieval in Large-Scale Domains |
Type |
Book Whole |
Year |
2013 |
Publication |
PhD Thesis, Universitat Autonoma de Barcelona-CVC |
Abbreviated Journal |
Volume |
Issue |
Pages |
Keywords |
Abstract |
Despite the “paperless office” ideal that started in the decade of the seventies, businesses still strive against an increasing amount of paper documentation. Companies still receive huge amounts of paper documentation that need to be analyzed and processed, mostly in a manual way. A solution for this task consists in, first, automatically scanning the incoming documents. Then, document images can be analyzed and information can be extracted from the data. Documents can also be automatically dispatched to the appropriate workflows, used to retrieve similar documents in the dataset to transfer information, etc.
Due to the nature of this “digital mailroom”, we need document representation methods to be general, i.e., able to cope with very different types of documents. We need the methods to be sound, i.e., able to cope with unexpected types of documents, noise, etc. And we need the methods to be scalable, i.e., able to cope with thousands or millions of documents that need to be processed, stored, and consulted. Unfortunately, current techniques of document representation, classification and retrieval are not apt for this digital mailroom framework, since they do not fulfill some or all of these requirements.
Through this thesis we focus on the problem of document representation aimed at classification and retrieval tasks under this digital mailroom framework. We first propose a novel document representation based on runlength histograms, and extend it to cope with more complex documents such as multiple-page documents, or documents that contain more sources of information such as extracted OCR text. Then we focus on the scalability requirements and propose a novel binarization method which we dubbed PCAE, as well as two general asymmetric distances between binary embeddings that can significantly improve the retrieval results at a minimal extra computational cost. Finally, we note the importance of supervised learning when performing large-scale retrieval, and study several approaches that can significantly boost the results at no extra cost at query time. |
Address |
Barcelona |
Corporate Author |
Thesis |
Ph.D. thesis |
Publisher |
Ediciones Graficas Rey |
Place of Publication |
Editor |
Ernest Valveny; Florent Perronnin |
Language |
Summary Language |
Original Title |
Series Editor |
Series Title |
Abbreviated Series Title |
Series Volume |
Series Issue |
Edition |
Medium |
Area |
Expedition |
Conference |
Notes |
Approved |
no |
Call Number |
Admin @ si @ Gor2013 |
Serial |
2277 |
Permanent link to this record |
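The abstract of this record mentions a document representation based on runlength histograms. As a rough illustration only, the sketch below counts horizontal runs of foreground pixels in a binary image and bins their lengths; the bin edges, the restriction to horizontal runs, and the function names are assumptions made for this example and are not taken from the thesis.

```python
def run_lengths(row, value=1):
    """Lengths of maximal consecutive runs of `value` in one image row."""
    runs, current = [], 0
    for pixel in row:
        if pixel == value:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def runlength_histogram(image, bin_edges=(1, 2, 4, 8)):
    """Count runs per bin; bin i holds lengths >= bin_edges[i] and below the next edge.

    The last bin is open-ended. Counts are left unnormalized for simplicity.
    """
    hist = [0] * len(bin_edges)
    for row in image:
        for length in run_lengths(row):
            idx = max(i for i, edge in enumerate(bin_edges) if length >= edge)
            hist[idx] += 1
    return hist


# Hypothetical 3x4 binary image: 1 is foreground (ink), 0 is background.
image = [
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]
hist = runlength_histogram(image)
```

Because the result has a fixed length regardless of image size, such histograms can be compared directly or fed to a classifier.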