Author |
David Fernandez; Josep Llados; Alicia Fornes; R. Manmatha |

Title |
On Influence of Line Segmentation in Efficient Word Segmentation in Old Manuscripts |
Type |
Conference Article |
Year |
2012 |
Publication |
13th International Conference on Frontiers in Handwriting Recognition |
Abbreviated Journal |
Volume |
Issue |
Pages |
763-768 |
Keywords  |
document image processing;handwritten character recognition;history;image segmentation;Spanish document;historical document;line segmentation;old handwritten document;old manuscript;word segmentation;Bifurcation;Dynamic programming;Handwriting recognition;Image segmentation;Measurement;Noise;Skeleton;Segmentation;document analysis;document and text processing;handwriting analysis;heuristics;path-finding |
Abstract |
The objective of this work is to show the importance of good line segmentation for obtaining better results in the segmentation of words in historical documents. We have used the approach developed by Manmatha and Rothfeder [1] to segment words in old handwritten documents. In their work, the lines of the documents are extracted using projections. In this work, we have developed an approach to segment lines more efficiently. The new line segmentation algorithm copes with skewed, touching and noisy lines, so it significantly improves word segmentation. Experiments using Spanish documents from the Marriages Database of the Barcelona Cathedral show that this approach reduces the error rate by more than 20%. |
Address |
Corporate Author |
Thesis |
Publisher |
Place of Publication |
Editor |
Language |
Summary Language |
Original Title |
Series Editor |
Series Title |
Abbreviated Series Title |
Series Volume |
Series Issue |
Edition |
ISBN |
978-1-4673-2262-1 |
Medium |
Area |
Expedition |
Conference |
Notes |
Approved |
no |
Call Number |
Admin @ si @ FLF2012 |
Serial |
2200 |
Author |
Hongxing Gao; Marçal Rusiñol; Dimosthenis Karatzas; Josep Llados |

Title |
Fast Structural Matching for Document Image Retrieval through Spatial Databases |
Type |
Conference Article |
Year |
2014 |
Publication |
Document Recognition and Retrieval XXI |
Abbreviated Journal |
Volume |
9021 |
Issue |
Pages |
Keywords  |
Document image retrieval; distance transform; MSER; spatial database |
Abstract |
The structure of document images plays a significant role in document analysis thus considerable efforts have been made towards extracting and understanding document structure, usually in the form of layout analysis approaches. In this paper, we first employ Distance Transform based MSER (DTMSER) to efficiently extract stable document structural elements in terms of a dendrogram of key-regions. Then a fast structural matching method is proposed to query the structure of document (dendrogram) based on a spatial database which facilitates the formulation of advanced spatial queries. The experiments demonstrate a significant improvement in a document retrieval scenario when compared to the use of typical Bag of Words (BoW) and pyramidal BoW descriptors. |
Address |
Amsterdam; September 2014 |
Corporate Author |
Thesis |
Publisher |
Place of Publication |
Editor |
Language |
Summary Language |
Original Title |
Series Editor |
Series Title |
Abbreviated Series Title |
Series Volume |
Series Issue |
Edition |
Medium |
Area |
Expedition |
Conference |
Notes |
DAG; 600.056; 600.061; 600.077 |
Approved |
no |
Call Number |
Admin @ si @ GRK2014a |
Serial |
2496 |
Author |
Marçal Rusiñol; Dimosthenis Karatzas; Josep Llados |

Title |
Automatic Verification of Properly Signed Multi-page Document Images |
Type |
Conference Article |
Year |
2015 |
Publication |
Proceedings of the Eleventh International Symposium on Visual Computing |
Abbreviated Journal |
Volume |
9475 |
Issue |
Pages |
327-336 |
Keywords  |
Document Image; Manual Inspection; Signature Verification; Rejection Criterion; Document Flow |
Abstract |
In this paper we present an industrial application for the automatic screening of incoming multi-page documents in a banking workflow, aimed at determining whether these documents are properly signed or not. The proposed method is divided into three main steps. First, individual pages are classified in order to identify the pages that should contain a signature. In a second step, we segment within those key pages the location where the signatures should appear. The last step checks whether the signatures are present or not. Our method is tested in a real large-scale environment and we report the results when checking two different types of real multi-page contracts, totalling more than 14,500 pages. |
Address |
Las Vegas, Nevada, USA; December 2015 |
Corporate Author |
Thesis |
Publisher |
Place of Publication |
Editor |
Language |
Summary Language |
Original Title |
Series Editor |
Series Title |
Abbreviated Series Title |
Series Volume |
9475 |
Series Issue |
Edition |
Medium |
Area |
Expedition |
Conference |
Notes |
DAG; 600.077 |
Approved |
no |
Call Number |
Admin @ si @ |
Serial |
3189 |
Author |
Ayan Banerjee; Sanket Biswas; Josep Llados; Umapada Pal |

Title |
SemiDocSeg: Harnessing Semi-Supervised Learning for Document Layout Analysis |
Type |
Journal Article |
Year |
2024 |
Publication |
International Journal on Document Analysis and Recognition |
Abbreviated Journal |
Volume |
Issue |
Pages |
Keywords  |
Document layout analysis; Semi-supervised learning; Co-Occurrence matrix; Instance segmentation; Swin transformer |
Abstract |
Document Layout Analysis (DLA) is the process of automatically identifying and categorizing the structural components (e.g. Text, Figure, Table, etc.) within a document to extract meaningful content and establish the page's layout structure. It is a crucial stage in document parsing, contributing to document comprehension. However, traditional DLA approaches often demand a significant volume of labeled training data, and the labor-intensive task of generating high-quality annotated training data poses a substantial challenge. In order to address this challenge, we propose a semi-supervised setting that aims to perform learning on limited annotated categories by eliminating exhaustive and expensive mask annotations. The proposed setting is expected to be generalizable to novel categories as it learns the underlying positional information through a support set and class information through Co-Occurrence that can be generalized from annotated categories to novel categories. Here, we first extract features from the input image and support set with a shared multi-scale feature acquisition backbone. Then, the extracted feature representation is fed to the transformer encoder as a query. Later on, we utilize a semantic embedding network before the decoder to capture the underlying semantic relationships and similarities between different instances, enabling the model to make accurate predictions or classifications with only a limited amount of labeled data. Extensive experimentation on competitive benchmarks like PRIMA, DocLayNet, and Historical Japanese (HJ) demonstrates that this generalized setup obtains significant performance compared to the conventional supervised approach. |
Address |
June 2024 |
Corporate Author |
Thesis |
Publisher |
Place of Publication |
Editor |
Language |
Summary Language |
Original Title |
Series Editor |
Series Title |
Abbreviated Series Title |
Series Volume |
Series Issue |
Edition |
Medium |
Area |
Expedition |
Conference |
Notes |
Approved |
no |
Call Number |
Admin @ si @ BBL2024a |
Serial |
4001 |
Author |
Volkmar Frinken; Andreas Fischer; Markus Baumgartner; Horst Bunke |

Title |
Keyword spotting for self-training of BLSTM NN based handwriting recognition systems |
Type |
Journal Article |
Year |
2014 |
Publication |
Pattern Recognition |
Abbreviated Journal |
PR |
Volume |
47 |
Issue |
3 |
Pages |
1073-1082 |
Keywords  |
Document retrieval; Keyword spotting; Handwriting recognition; Neural networks; Semi-supervised learning |
Abstract |
The automatic transcription of unconstrained continuous handwritten text requires well-trained recognition systems. The semi-supervised paradigm introduces the concept of using not only labeled data but also unlabeled data in the learning process. Unlabeled data can be gathered at little or no cost. Hence it has the potential to reduce the need for labeling training data, a tedious and costly process. Given a weak initial recognizer trained on labeled data, self-training can be used to recognize unlabeled data and add words that were recognized with high confidence to the training set for re-training. This process is not trivial and requires great care in selecting the elements that are to be added to the training set. In this paper, we propose to use a bidirectional long short-term memory neural network handwriting recognition system for keyword spotting in order to select new elements. A set of experiments shows the high potential of self-training for bootstrapping handwriting recognition systems, both for modern and historical handwriting, and demonstrates the benefits of using keyword spotting over previously published self-training schemes. |
Address |
Corporate Author |
Thesis |
Publisher |
Place of Publication |
Editor |
Language |
Summary Language |
Original Title |
Series Editor |
Series Title |
Abbreviated Series Title |
Series Volume |
Series Issue |
Edition |
Medium |
Area |
Expedition |
Conference |
Notes |
DAG; 600.077; 602.101 |
Approved |
no |
Call Number |
Admin @ si @ FFB2014 |
Serial |
2297 |
Author |
Christophe Rigaud; Clement Guerin; Dimosthenis Karatzas; Jean-Christophe Burie; Jean-Marc Ogier |

Title |
Knowledge-driven understanding of images in comic books |
Type |
Journal Article |
Year |
2015 |
Publication |
International Journal on Document Analysis and Recognition |
Abbreviated Journal |
Volume |
18 |
Issue |
3 |
Pages |
199-221 |
Keywords  |
Document Understanding; comics analysis; expert system |
Abstract |
Document analysis is an active field of research, which can attain a complete understanding of the semantics of a given document. One example of the document understanding process is enabling a computer to identify the key elements of a comic book story and arrange them according to predefined domain knowledge. In this study, we propose a knowledge-driven system that can interact with bottom-up and top-down information to progressively understand the content of a document. We model the knowledge of the comic book and image processing domains for information consistency analysis. In addition, different image processing methods are improved or developed to extract panels, balloons, tails, texts, comic characters and their semantic relations in an unsupervised way. |
Address |
Corporate Author |
Thesis |
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
Editor |
Language |
Summary Language |
Original Title |
Series Editor |
Series Title |
Abbreviated Series Title |
Series Volume |
Series Issue |
Edition |
ISSN |
1433-2833 |
Medium |
Area |
Expedition |
Conference |
Notes |
DAG; 600.056; 600.077 |
Approved |
no |
Call Number |
RGK2015 |
Serial |
2595 |
Author |
Marçal Rusiñol; Lluis Pere de las Heras; Oriol Ramos Terrades |

Title |
Flowchart Recognition for Non-Textual Information Retrieval in Patent Search |
Type |
Journal Article |
Year |
2014 |
Publication |
Information Retrieval |
Abbreviated Journal |
IR |
Volume |
17 |
Issue |
5-6 |
Pages |
545-562 |
Keywords  |
Flowchart recognition; Patent documents; Text/graphics separation; Raster-to-vector conversion; Symbol recognition |
Abstract |
Relatively little research has been done on patent image retrieval, and in most approaches the retrieval is performed in terms of a similarity measure between the query image and the images in the corpus. However, systems aimed at overcoming the semantic gap between the visual description of patent images and their conveyed concepts would be very helpful for patent professionals. In this paper we present a flowchart recognition method aimed at achieving a structured representation of flowchart images that can be further queried semantically. The proposed method was submitted to the CLEF-IP 2012 flowchart recognition task. We report the results obtained on this dataset. |
Address |
Corporate Author |
Thesis |
Publisher |
Place of Publication |
Editor |
Language |
Summary Language |
Original Title |
Series Editor |
Series Title |
Abbreviated Series Title |
Series Volume |
Series Issue |
Edition |
ISSN |
1386-4564 |
Medium |
Area |
Expedition |
Conference |
Notes |
DAG; 600.077 |
Approved |
no |
Call Number |
Admin @ si @ RHR2013 |
Serial |
2342 |
Author |
Marçal Rusiñol; Josep Llados |

Title |
Symbol Spotting in Digital Libraries: Focused Retrieval over Graphic-rich Document Collections |
Type |
Book Whole |
Year |
2010 |
Publication |
Symbol Spotting in Digital Libraries: Focused Retrieval over Graphic-rich Document Collections |
Abbreviated Journal |
Volume |
Issue |
Pages |
Keywords  |
Focused Retrieval; Graphical Pattern Indexation; Graphics Recognition; Pattern Recognition; Performance Evaluation; Symbol Description; Symbol Spotting |
Abstract |
The specific problem of symbol recognition in graphical documents requires additional techniques to those developed for character recognition. The most well-known obstacle is the so-called Sayre paradox: Correct recognition requires good segmentation, yet improvement in segmentation is achieved using information provided by the recognition process. This dilemma can be avoided by techniques that identify sets of regions containing useful information. Such symbol-spotting methods allow the detection of symbols in maps or technical drawings without having to fully segment or fully recognize the entire content.
This unique text/reference provides a complete, integrated and large-scale solution to the challenge of designing a robust symbol-spotting method for collections of graphic-rich documents. The book examines a number of features and descriptors, from basic photometric descriptors commonly used in computer vision techniques to those specific to graphical shapes, presenting a methodology which can be used in a wide variety of applications. Additionally, readers are supplied with an insight into the problem of performance evaluation of spotting methods. Some very basic knowledge of pattern recognition, document image analysis and graphics recognition is assumed. |
Address |
Corporate Author |
Thesis |
Publisher |
Springer |
Place of Publication |
Editor |
Language |
Summary Language |
Original Title |
Series Editor |
Series Title |
Abbreviated Series Title |
Series Volume |
Series Issue |
Edition |
ISBN |
978-1-84996-208-7 |
Medium |
Area |
Expedition |
Conference |
Notes |
Approved |
no |
Call Number |
DAG @ dag @ RuL2010a |
Serial |
1292 |
Author |
Muhammad Muzzamil Luqman; Thierry Brouard; Jean-Yves Ramel; Josep Llados |

Title |
Vers une approche floue d'encapsulation de graphes: application à la reconnaissance de symboles |
Type |
Conference Article |
Year |
2010 |
Publication |
Colloque International Francophone sur l'Écrit et le Document |
Abbreviated Journal |
Volume |
Issue |
Pages |
169-184 |
Keywords  |
Fuzzy interval; Graph embedding; Bayesian network; Symbol recognition |
Abstract |
We present a new methodology for symbol recognition, employing a structural approach for representing visual associations in symbols and a statistical classifier for recognition. A graphic symbol is vectorized, its topological and geometrical details are encoded by an attributed relational graph, and a signature is computed for it. Data-adapted fuzzy intervals are introduced to address the sensitivity of structural representations to noise. The joint probability distribution of signatures is encoded by a Bayesian network, which serves as a mechanism for pruning irrelevant features and choosing a subset of interesting features from the structural signatures of the underlying symbol set, and is deployed in a supervised learning scenario for recognizing query symbols. Experimental results on pre-segmented 2D linear architectural and electronic symbols from GREC databases are presented. |
Address |
Sousse, Tunisia |
Corporate Author |
Thesis |
Publisher |
Place of Publication |
Editor |
Language |
Summary Language |
Original Title |
Series Editor |
Series Title |
Abbreviated Series Title |
Series Volume |
Series Issue |
Edition |
Medium |
Area |
Expedition |
Conference |
Notes |
Approved |
no |
Call Number |
DAG @ dag @ LBR2010a |
Serial |
1293 |
Author |
Pau Riba; Josep Llados; Alicia Fornes |

Title |
Error-tolerant coarse-to-fine matching model for hierarchical graphs |
Type |
Conference Article |
Year |
2017 |
Publication |
11th IAPR-TC-15 International Workshop on Graph-Based Representations in Pattern Recognition |
Abbreviated Journal |
Volume |
10310 |
Issue |
Pages |
107-117 |
Keywords  |
Graph matching; Hierarchical graph; Graph-based representation; Coarse-to-fine matching |
Abstract |
Graph-based representations are effective tools to capture structural information from visual elements. However, retrieving a query graph from a large database of graphs implies a high computational complexity. Moreover, these representations are very sensitive to noise or small changes. In this work, a novel hierarchical graph representation is designed. Using graph clustering techniques adapted from graph-based social media analysis, we propose to generate a hierarchy able to deal with different levels of abstraction while keeping information about the topology. For the proposed representations, a coarse-to-fine matching method is defined. These approaches are validated using real scenarios such as classification of colour images and handwritten word spotting. |
Address |
Anacapri; Italy; May 2017 |
Corporate Author |
Thesis |
Publisher |
Springer International Publishing |
Place of Publication |
Editor |
Pasquale Foggia; Cheng-Lin Liu; Mario Vento |
Language |
Summary Language |
Original Title |
Series Editor |
Series Title |
Abbreviated Series Title |
Series Volume |
Series Issue |
Edition |
Medium |
Area |
Expedition |
Conference |
Notes |
DAG; 600.097; 601.302; 600.121 |
Approved |
no |
Call Number |
Admin @ si @ RLF2017a |
Serial |
2951 |