toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
  Records Links (down)
Author Pau Riba; Josep Llados; Alicia Fornes; Anjan Dutta edit  url
openurl 
  Title Large-scale graph indexing using binary embeddings of node contexts for information spotting in document image databases Type Journal Article
  Year 2017 Publication Pattern Recognition Letters Abbreviated Journal PRL  
  Volume 87 Issue Pages 203-211  
  Keywords  
  Abstract Graph-based representations are experiencing a growing usage in visual recognition and retrieval due to their representational power in front of classical appearance-based representations. However, retrieving a query graph from a large dataset of graphs implies a high computational complexity. The most important property for a large-scale retrieval is the search time complexity to be sub-linear in the number of database examples. With this aim, in this paper we propose a graph indexation formalism applied to visual retrieval. A binary embedding is defined as hashing keys for graph nodes. Given a database of labeled graphs, graph nodes are complemented with vectors of attributes representing their local context. Then, each attribute vector is converted to a binary code applying a binary-valued hash function. Therefore, graph retrieval is formulated in terms of finding target graphs in the database whose nodes have a small Hamming distance from the query nodes, easily computed with bitwise logical operators. As an application example, we validate the performance of the proposed methods in different real scenarios such as handwritten word spotting in images of historical documents or symbol spotting in architectural floor plans.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.097; 602.006; 603.053; 600.121 Approved no  
  Call Number RLF2017b Serial 2873  
Permanent link to this record
 

 
Author S.K. Jemni; Mohamed Ali Souibgui; Yousri Kessentini; Alicia Fornes edit  url
openurl 
  Title Enhance to Read Better: A Multi-Task Adversarial Network for Handwritten Document Image Enhancement Type Journal Article
  Year 2022 Publication Pattern Recognition Abbreviated Journal PR  
  Volume 123 Issue Pages 108370  
  Keywords  
  Abstract Handwritten document images can be highly affected by degradation for different reasons: Paper ageing, daily-life scenarios (wrinkles, dust, etc.), bad scanning process and so on. These artifacts raise many readability issues for current Handwritten Text Recognition (HTR) algorithms and severely devalue their efficiency. In this paper, we propose an end to end architecture based on Generative Adversarial Networks (GANs) to recover the degraded documents into a and form. Unlike the most well-known document binarization methods, which try to improve the visual quality of the degraded document, the proposed architecture integrates a handwritten text recognizer that promotes the generated document image to be more readable. To the best of our knowledge, this is the first work to use the text information while binarizing handwritten documents. Extensive experiments conducted on degraded Arabic and Latin handwritten documents demonstrate the usefulness of integrating the recognizer within the GAN architecture, which improves both the visual quality and the readability of the degraded document images. Moreover, we outperform the state of the art in H-DIBCO challenges, after fine tuning our pre-trained model with synthetically degraded Latin handwritten images, on this task.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.124; 600.121; 602.230 Approved no  
  Call Number Admin @ si @ JSK2022 Serial 3613  
Permanent link to this record
 

 
Author Pau Riba; Andreas Fischer; Josep Llados; Alicia Fornes edit   pdf
url  openurl
  Title Learning graph edit distance by graph neural networks Type Journal Article
  Year 2021 Publication Pattern Recognition Abbreviated Journal PR  
  Volume 120 Issue Pages 108132  
  Keywords  
  Abstract The emergence of geometric deep learning as a novel framework to deal with graph-based representations has faded away traditional approaches in favor of completely new methodologies. In this paper, we propose a new framework able to combine the advances on deep metric learning with traditional approximations of the graph edit distance. Hence, we propose an efficient graph distance based on the novel field of geometric deep learning. Our method employs a message passing neural network to capture the graph structure, and thus, leveraging this information for its use on a distance computation. The performance of the proposed graph distance is validated on two different scenarios. On the one hand, in a graph retrieval of handwritten words i.e. keyword spotting, showing its superior performance when compared with (approximate) graph edit distance benchmarks. On the other hand, demonstrating competitive results for graph similarity learning when compared with the current state-of-the-art on a recent benchmark dataset.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.140; 600.121 Approved no  
  Call Number Admin @ si @ RFL2021 Serial 3611  
Permanent link to this record
 

 
Author Lei Kang; Pau Riba; Mauricio Villegas; Alicia Fornes; Marçal Rusiñol edit   pdf
url  openurl
  Title Candidate Fusion: Integrating Language Modelling into a Sequence-to-Sequence Handwritten Word Recognition Architecture Type Journal Article
  Year 2021 Publication Pattern Recognition Abbreviated Journal PR  
  Volume 112 Issue Pages 107790  
  Keywords  
  Abstract Sequence-to-sequence models have recently become very popular for tackling
handwritten word recognition problems. However, how to effectively integrate an external language model into such recognizer is still a challenging
problem. The main challenge faced when training a language model is to
deal with the language model corpus which is usually different to the one
used for training the handwritten word recognition system. Thus, the bias
between both word corpora leads to incorrectness on the transcriptions, providing similar or even worse performances on the recognition task. In this
work, we introduce Candidate Fusion, a novel way to integrate an external
language model to a sequence-to-sequence architecture. Moreover, it provides suggestions from an external language knowledge, as a new input to
the sequence-to-sequence recognizer. Hence, Candidate Fusion provides two
improvements. On the one hand, the sequence-to-sequence recognizer has
the flexibility not only to combine the information from itself and the language model, but also to choose the importance of the information provided
by the language model. On the other hand, the external language model
has the ability to adapt itself to the training corpus and even learn the
most commonly errors produced from the recognizer. Finally, by conducting
comprehensive experiments, the Candidate Fusion proves to outperform the
state-of-the-art language models for handwritten word recognition tasks.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.140; 601.302; 601.312; 600.121 Approved no  
  Call Number Admin @ si @ KRV2021 Serial 3343  
Permanent link to this record
 

 
Author Juan Ignacio Toledo; Manuel Carbonell; Alicia Fornes; Josep Llados edit  url
openurl 
  Title Information Extraction from Historical Handwritten Document Images with a Context-aware Neural Model Type Journal Article
  Year 2019 Publication Pattern Recognition Abbreviated Journal PR  
  Volume 86 Issue Pages 27-36  
  Keywords Document image analysis; Handwritten documents; Named entity recognition; Deep neural networks  
  Abstract Many historical manuscripts that hold trustworthy memories of the past societies contain information organized in a structured layout (e.g. census, birth or marriage records). The precious information stored in these documents cannot be effectively used nor accessed without costly annotation efforts. The transcription driven by the semantic categories of words is crucial for the subsequent access. In this paper we describe an approach to extract information from structured historical handwritten text images and build a knowledge representation for the extraction of meaning out of historical data. The method extracts information, such as named entities, without the need of an intermediate transcription step, thanks to the incorporation of context information through language models. Our system has two variants, the first one is based on bigrams, whereas the second one is based on recurrent neural networks. Concretely, our second architecture integrates a Convolutional Neural Network to model visual information from word images together with a Bidirecitonal Long Short Term Memory network to model the relation among the words. This integrated sequential approach is able to extract more information than just the semantic category (e.g. a semantic category can be associated to a person in a record). Our system is generic, it deals with out-of-vocabulary words by design, and it can be applied to structured handwritten texts from different domains. The method has been validated with the ICDAR IEHHR competition protocol, outperforming the existing approaches.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.097; 601.311; 603.057; 600.084; 600.140; 600.121 Approved no  
  Call Number Admin @ si @ TCF2019 Serial 3166  
Permanent link to this record
 

 
Author Anjan Dutta; Josep Llados; Horst Bunke; Umapada Pal edit   pdf
url  openurl
  Title Product graph-based higher order contextual similarities for inexact subgraph matching Type Journal Article
  Year 2018 Publication Pattern Recognition Abbreviated Journal PR  
  Volume 76 Issue Pages 596-611  
  Keywords  
  Abstract Many algorithms formulate graph matching as an optimization of an objective function of pairwise quantification of nodes and edges of two graphs to be matched. Pairwise measurements usually consider local attributes but disregard contextual information involved in graph structures. We address this issue by proposing contextual similarities between pairs of nodes. This is done by considering the tensor product graph (TPG) of two graphs to be matched, where each node is an ordered pair of nodes of the operand graphs. Contextual similarities between a pair of nodes are computed by accumulating weighted walks (normalized pairwise similarities) terminating at the corresponding paired node in TPG. Once the contextual similarities are obtained, we formulate subgraph matching as a node and edge selection problem in TPG. We use contextual similarities to construct an objective function and optimize it with a linear programming approach. Since random walk formulation through TPG takes into account higher order information, it is not a surprise that we obtain more reliable similarities and better discrimination among the nodes and edges. Experimental results shown on synthetic as well as real benchmarks illustrate that higher order contextual similarities increase discriminating power and allow one to find approximate solutions to the subgraph matching problem.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 602.167; 600.097; 600.121 Approved no  
  Call Number Admin @ si @ DLB2018 Serial 3083  
Permanent link to this record
 

 
Author Lluis Gomez; Dimosthenis Karatzas edit   pdf
url  openurl
  Title TextProposals: a Text‐specific Selective Search Algorithm for Word Spotting in the Wild Type Journal Article
  Year 2017 Publication Pattern Recognition Abbreviated Journal PR  
  Volume 70 Issue Pages 60-74  
  Keywords  
  Abstract Motivated by the success of powerful while expensive techniques to recognize words in a holistic way (Goel et al., 2013; Almazán et al., 2014; Jaderberg et al., 2016) object proposals techniques emerge as an alternative to the traditional text detectors. In this paper we introduce a novel object proposals method that is specifically designed for text. We rely on a similarity based region grouping algorithm that generates a hierarchy of word hypotheses. Over the nodes of this hierarchy it is possible to apply a holistic word recognition method in an efficient way.

Our experiments demonstrate that the presented method is superior in its ability of producing good quality word proposals when compared with class-independent algorithms. We show impressive recall rates with a few thousand proposals in different standard benchmarks, including focused or incidental text datasets, and multi-language scenarios. Moreover, the combination of our object proposals with existing whole-word recognizers (Almazán et al., 2014; Jaderberg et al., 2016) shows competitive performance in end-to-end word spotting, and, in some benchmarks, outperforms previously published results. Concretely, in the challenging ICDAR2015 Incidental Text dataset, we overcome in more than 10% F-score the best-performing method in the last ICDAR Robust Reading Competition (Karatzas, 2015). Source code of the complete end-to-end system is available at https://github.com/lluisgomez/TextProposals.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.084; 601.197; 600.121; 600.129 Approved no  
  Call Number Admin @ si @ GoK2017 Serial 2886  
Permanent link to this record
 

 
Author Sounak Dey; Palaiahnakote Shivakumara; K.S. Raghunanda; Umapada Pal; Tong Lu; G. Hemantha Kumar; Chee Seng Chan edit  url
openurl 
  Title Script independent approach for multi-oriented text detection in scene image Type Journal Article
  Year 2017 Publication Neurocomputing Abbreviated Journal NEUCOM  
  Volume 242 Issue Pages 96-112  
  Keywords  
  Abstract Developing a text detection method which is invariant to scripts in natural scene images is a challeng- ing task due to different geometrical structures of various scripts. Besides, multi-oriented of text lines in natural scene images make the problem more challenging. This paper proposes to explore ring radius transform (RRT) for text detection in multi-oriented and multi-script environments. The method finds component regions based on convex hull to generate radius matrices using RRT. It is a fact that RRT pro- vides low radius values for the pixels that are near to edges, constant radius values for the pixels that represent stroke width, and high radius values that represent holes created in background and convex hull because of the regular structures of text components. We apply k -means clustering on the radius matrices to group such spatially coherent regions into individual clusters. Then the proposed method studies the radius values of such cluster components that are close to the centroid and far from the cen- troid to detect text components. Furthermore, we have developed a Bangla dataset (named as ISI-UM dataset) and propose a semi-automatic system for generating its ground truth for text detection of arbi- trary orientations, which can be used by the researchers for text detection and recognition in the future. The ground truth will be released to public. Experimental results on our ISI-UM data and other standard datasets, namely, ICDAR 2013 scene, SVT and MSRA data, show that the proposed method outperforms the existing methods in terms of multi-lingual and multi-oriented text detection ability.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ DSR2017 Serial 3260  
Permanent link to this record
 

 
Author Marçal Rusiñol; Josep Llados edit  url
openurl 
  Title A Performance Evaluation Protocol for Symbol Spotting Systems in Terms of Recognition and Location Indices Type Journal Article
  Year 2009 Publication International Journal on Document Analysis and Recognition Abbreviated Journal IJDAR  
  Volume 12 Issue 2 Pages 83-96  
  Keywords Performance evaluation; Symbol Spotting; Graphics Recognition  
  Abstract Symbol spotting systems are intended to retrieve regions of interest from a document image database where the queried symbol is likely to be found. They shall have the ability to recognize and locate graphical symbols in a single step. In this paper, we present a set of measures to evaluate the performance of a symbol spotting system in terms of recognition abilities, location accuracy and scalability. We show that the proposed measures allow to determine the weaknesses and strengths of different methods. In particular we have tested a symbol spotting method based on a set of four different off-the-shelf shape descriptors.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1433-2833 ISBN Medium  
  Area Expedition Conference  
  Notes DAG Approved no  
  Call Number DAG @ dag @ RuL2009a Serial 1166  
Permanent link to this record
 

 
Author Antonio Clavelli; Dimosthenis Karatzas; Josep Llados; Mario Ferraro; Giuseppe Boccignone edit   pdf
url  doi
isbn  openurl
  Title Towards Modelling an Attention-Based Text Localization Process Type Conference Article
  Year 2013 Publication 6th Iberian Conference on Pattern Recognition and Image Analysis Abbreviated Journal  
  Volume 7887 Issue Pages 296-303  
  Keywords text localization; visual attention; eye guidance  
  Abstract This note introduces a visual attention model of text localization in real-world scenes. The core of the model built upon the proto-object concept is discussed. It is shown how such dynamic mid-level representation of the scene can be derived in the framework of an action-perception loop engaging salience, text information value computation, and eye guidance mechanisms.
Preliminary results that compare model generated scanpaths with those eye-tracked from human subjects are presented.
 
  Address Madeira; Portugal; June 2013  
  Corporate Author Thesis  
  Publisher Springer Berlin Heidelberg Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN 0302-9743 ISBN 978-3-642-38627-5 Medium  
  Area Expedition Conference IbPRIA  
  Notes DAG Approved no  
  Call Number Admin @ si @ CKL2013 Serial 2291  
Permanent link to this record
Select All    Deselect All
 |   | 
Details

Save Citations:
Export Records: