Author Francisco Cruz
Title Probabilistic Graphical Models for Document Analysis Type Book Whole
Year 2016 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Recent advances in digitization techniques have fostered interest in creating digital copies of document collections. Digitized documents allow easy maintenance, lossless storage, and efficient transmission and information retrieval. This situation has opened a new market niche for systems able to automatically extract and analyze the information contained in these collections, especially in business settings.

Due to the great variety of document types, this is not a trivial task. For instance, the automatic extraction of numerical data from invoices differs substantially from text recognition in historical documents. However, in order to extract the information of interest, it is always necessary to identify the area of the document where it is located. In the area of Document Analysis we refer to this process as layout analysis, which aims at identifying and categorizing the different entities that compose the document, such as text regions, pictures, text lines, or tables, among others. To perform this task it is usually necessary to incorporate prior knowledge about the task into the analysis process, which can be modeled by defining a set of contextual relations between the different entities of the document. The use of context has proven useful to reinforce the recognition process and improve the results on many computer vision tasks. It raises two fundamental questions: what kind of contextual information is appropriate for a given task, and how to incorporate this information into the models.

In this thesis we study several ways to incorporate contextual information into the task of document layout analysis, and into the particular case of handwritten text line segmentation. We focus on the study of Probabilistic Graphical Models and other mechanisms for this purpose, and propose several solutions to these problems. First, we present a method for layout analysis based on Conditional Random Fields. With this model we encode local contextual relations between variables, such as pairwise constraints. In addition, we encode a set of structural relations between different classes of regions at the feature level. Second, we present a method based on 2D Probabilistic Context-free Grammars to encode structural and hierarchical relations. We perform a comparative study between Probabilistic Graphical Models and this syntactic approach. Third, we propose a method for structured documents based on Bayesian Networks to represent the document structure, and an algorithm based on Expectation-Maximization to find the best configuration of the page. We perform a thorough evaluation of the proposed methods on two particular collections of documents: a historical collection composed of ancient structured documents, and a collection of contemporary documents. In addition, we present a general method for the task of handwritten text line segmentation. We define a probabilistic framework where we combine the EM algorithm with variational approaches for computing inference and parameter learning on a Markov Random Field. We evaluate our method on several collections of documents, including a general dataset of annotated administrative documents. Results demonstrate the applicability of our method to real problems, and the contribution of contextual information to these kinds of problems.
Address
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Oriol Ramos Terrades
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-945373-2-5 Medium
Area Expedition Conference
Notes DAG Approved no
Call Number Admin @ si @ Cru2016 Serial 2861
 

 
Author Lluis Gomez; Dimosthenis Karatzas
Title A fast hierarchical method for multi‐script and arbitrary oriented scene text extraction Type Journal Article
Year 2016 Publication International Journal on Document Analysis and Recognition Abbreviated Journal IJDAR
Volume 19 Issue 4 Pages 335-349
Keywords scene text; segmentation; detection; hierarchical grouping; perceptual organisation
Abstract Typography and layout lead to the hierarchical organisation of text in words, text lines, paragraphs. This inherent structure is a key property of text in any script and language, which has nonetheless been minimally leveraged by existing text detection methods. This paper addresses the problem of text segmentation in natural scenes from a hierarchical perspective. Contrary to existing methods, we make explicit use of text structure, aiming directly at the detection of region groupings corresponding to text within a hierarchy produced by an agglomerative similarity clustering process over individual regions. We propose an optimal way to construct such a hierarchy, introducing a feature space designed to produce text group hypotheses with high recall and a novel stopping rule combining a discriminative classifier and a probabilistic measure of group meaningfulness based on perceptual organization. Results obtained over four standard datasets, covering text in variable orientations and different languages, demonstrate that our algorithm, while being trained on a single mixed dataset, outperforms state-of-the-art methods in unconstrained scenarios.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.056; 601.197 Approved no
Call Number Admin @ si @ GoK2016a Serial 2862
 

 
Author Lluis Gomez; Dimosthenis Karatzas
Title A fine-grained approach to scene text script identification Type Conference Article
Year 2016 Publication 12th IAPR Workshop on Document Analysis Systems Abbreviated Journal
Volume Issue Pages 192-197
Keywords
Abstract This paper focuses on the problem of script identification in unconstrained scenarios. Script identification is an important prerequisite to recognition, and an indispensable condition for automatic text understanding systems designed for multi-language environments. Although widely studied for document images and handwritten documents, it remains an almost unexplored territory for scene text images. We detail a novel method for script identification in natural images that combines convolutional features and the Naive-Bayes Nearest Neighbor classifier. The proposed framework efficiently exploits the discriminative power of small stroke-parts, in a fine-grained classification framework. In addition, we propose a new public benchmark dataset for the evaluation of joint text detection and script identification in natural scenes. Experiments on this new dataset demonstrate that the proposed method yields state-of-the-art results, while generalizing well to different datasets and a variable number of scripts. The evidence provided shows that multi-lingual scene text recognition in the wild is a viable proposition. Source code of the proposed method is made available online.
Address Santorini; Greece; April 2016
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference DAS
Notes DAG; 601.197; 600.084 Approved no
Call Number Admin @ si @ GoK2016b Serial 2863
 

 
Author Aura Hernandez-Sabate; Lluis Albarracin; Daniel Calvo; Nuria Gorgorio
Title EyeMath: Identifying Mathematics Problem Solving Processes in a RTS Video Game Type Conference Article
Year 2016 Publication 5th International Conference Games and Learning Alliance Abbreviated Journal
Volume 10056 Issue Pages 50-59
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference GALA
Notes ADAS;IAM; Approved no
Call Number HAC2016 Serial 2864
 

 
Author Saad Minhas; Aura Hernandez-Sabate; Shoaib Ehsan; Katerine Diaz; Ales Leonardis; Antonio Lopez; Klaus McDonald Maier
Title LEE: A photorealistic Virtual Environment for Assessing Driver-Vehicle Interactions in Self-Driving Mode Type Conference Article
Year 2016 Publication 14th European Conference on Computer Vision Workshops Abbreviated Journal
Volume 9915 Issue Pages 894-900
Keywords Simulation environment; Automated Driving; Driver-Vehicle interaction
Abstract Photorealistic virtual environments are crucial for developing and testing automated driving systems in a safe way during trials. As commercially available simulators are expensive and bulky, this paper presents a low-cost, extendable, and easy-to-use (LEE) virtual environment with the aim to highlight its utility for level 3 driving automation. In particular, an experiment is performed using the presented simulator to explore the influence of different variables regarding control transfer of the car after the system was driving autonomously in a highway scenario. The results show that the speed of the car at the time when the system needs to transfer the control to the human driver is critical.
Address Amsterdam; The Netherlands; October 2016
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ECCVW
Notes ADAS;IAM; 600.085; 600.076 Approved no
Call Number MHE2016 Serial 2865
 

 
Author Marta Diez-Ferrer; Debora Gil; Elena Carreño; Susana Padrones; Samantha Aso
Title Positive Airway Pressure-Enhanced CT to Improve Virtual Bronchoscopic Navigation Type Journal Article
Year 2017 Publication Journal of Thoracic Oncology Abbreviated Journal JTO
Volume 12 Issue 1S Pages S596-S597
Keywords Thorax CT; diagnosis; Peripheral Pulmonary Nodule
Abstract A main weakness of virtual bronchoscopic navigation (VBN) is unsuccessful segmentation of distal branches approaching peripheral pulmonary nodules (PPN). CT scan acquisition protocol is pivotal for segmentation covering the utmost periphery. We hypothesize that application of continuous positive airway pressure (CPAP) during CT acquisition could improve visualization and segmentation of peripheral bronchi. The purpose of the present pilot study is to compare quality of segmentations under 4 CT acquisition modes: inspiration (INSP), expiration (EXP) and both with CPAP (INSP-CPAP and EXP-CPAP).
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes IAM; 600.096; 600.075; 600.145 Approved no
Call Number Admin @ si @ DGC2017a Serial 2883
 

 
Author Simon Jégou; Michal Drozdzal; David Vazquez; Adriana Romero; Yoshua Bengio
Title The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation Type Conference Article
Year 2017 Publication IEEE Conference on Computer Vision and Pattern Recognition Workshops Abbreviated Journal
Volume Issue Pages
Keywords Semantic Segmentation
Abstract State-of-the-art approaches for semantic image segmentation are built on Convolutional Neural Networks (CNNs). The typical segmentation architecture is composed of (a) a downsampling path responsible for extracting coarse semantic features, followed by (b) an upsampling path trained to recover the input image resolution at the output of the model and, optionally, (c) a post-processing module (e.g. Conditional Random Fields) to refine the model predictions.

Recently, a new CNN architecture, Densely Connected Convolutional Networks (DenseNets), has shown excellent results on image classification tasks. The idea of DenseNets is based on the observation that if each layer is directly connected to every other layer in a feed-forward fashion then the network will be more accurate and easier to train.

In this paper, we extend DenseNets to deal with the problem of semantic segmentation. We achieve state-of-the-art results on urban scene benchmark datasets such as CamVid and Gatech, without any further post-processing module or pretraining. Moreover, due to the smart construction of the model, our approach has far fewer parameters than the currently published best entries for these datasets.
Address Honolulu; USA; July 2017
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVPRW
Notes MILAB; ADAS; 600.076; 600.085; 601.281 Approved no
Call Number ADAS @ adas @ JDV2016 Serial 2866
 

 
Author Arash Akbarinia; C. Alejandro Parraga
Title Biologically plausible boundary detection Type Conference Article
Year 2016 Publication 27th British Machine Vision Conference Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Edges are key components of any visual scene to the extent that we can recognise objects merely by their silhouettes. The human visual system captures edge information through neurons in the visual cortex that are sensitive to both intensity discontinuities and particular orientations. The “classical approach” assumes that these cells are only responsive to the stimulus present within their receptive fields; however, recent studies demonstrate that surrounding regions and inter-areal feedback connections influence their responses significantly. In this work we propose a biologically-inspired edge detection model in which orientation selective neurons are represented through the first derivative of a Gaussian function resembling double-opponent cells in the primary visual cortex (V1). In our model we account for four kinds of surround, i.e. full, far, iso- and orthogonal-orientation, whose contributions are contrast-dependent. The output signal from V1 is pooled in its perpendicular direction by larger V2 neurons employing a contrast-variant centre-surround kernel. We further introduce a feedback connection from higher-level visual areas to the lower ones. The results of our model on two benchmark datasets show a large improvement over the current non-learning and biologically-inspired state-of-the-art algorithms while being competitive with the learning-based methods.
Address York; UK; September 2016
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference BMVC
Notes NEUROBIT; 600.068; 600.072 Approved no
Call Number Admin @ si @ AkP2016a Serial 2867
 

 
Author Azadeh S. Mozafari; David Vazquez; Mansour Jamzad; Antonio Lopez
Title Node-Adapt, Path-Adapt and Tree-Adapt: Model-Transfer Domain Adaptation for Random Forest Type Miscellaneous
Year 2016 Publication Arxiv Abbreviated Journal
Volume Issue Pages
Keywords Domain Adaptation; Pedestrian detection; Random Forest
Abstract Random Forest (RF) is a successful paradigm for learning classifiers due to its ability to learn from large feature spaces and seamlessly integrate multi-class classification, as well as the achieved accuracy and processing efficiency. However, like many other classifiers, RF requires domain adaptation (DA) when there is a mismatch between the training (source) and testing (target) domains that degrades classification. Consequently, different RF-DA methods have been proposed, which require not only target-domain samples but also revisiting the source-domain ones. As a novelty, we propose three inherently different methods (Node-Adapt, Path-Adapt and Tree-Adapt) that only require the learned source-domain RF and relatively few target-domain samples for DA, i.e. source-domain samples do not need to be available. To assess the performance of our proposals we focus on image-based object detection, using the pedestrian detection problem as a challenging proof-of-concept. Moreover, we use the RF with expert nodes because it is a competitive patch-based pedestrian model. We test our Node-, Path- and Tree-Adapt methods on standard benchmarks, showing that DA is largely achieved.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS Approved no
Call Number ADAS @ adas @ MVJ2016 Serial 2868
 

 
Author Pau Riba; Josep Llados; Alicia Fornes
Title Error-tolerant coarse-to-fine matching model for hierarchical graphs Type Conference Article
Year 2017 Publication 11th IAPR-TC-15 International Workshop on Graph-Based Representations in Pattern Recognition Abbreviated Journal
Volume 10310 Issue Pages 107-117
Keywords Graph matching; Hierarchical graph; Graph-based representation; Coarse-to-fine matching
Abstract Graph-based representations are effective tools to capture structural information from visual elements. However, retrieving a query graph from a large database of graphs implies a high computational complexity. Moreover, these representations are very sensitive to noise or small changes. In this work, a novel hierarchical graph representation is designed. Using graph clustering techniques adapted from graph-based social media analysis, we propose to generate a hierarchy able to deal with different levels of abstraction while keeping information about the topology. For the proposed representations, a coarse-to-fine matching method is defined. These approaches are validated using real scenarios such as classification of colour images and handwritten word spotting.
Address Anacapri; Italy; May 2017
Corporate Author Thesis
Publisher Springer International Publishing Place of Publication Editor Pasquale Foggia; Cheng-Lin Liu; Mario Vento
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference GbRPR
Notes DAG; 600.097; 601.302; 600.121 Approved no
Call Number Admin @ si @ RLF2017a Serial 2951
 

 
Author Veronica Romero; Alicia Fornes; Enrique Vidal; Joan Andreu Sanchez
Title Information Extraction in Handwritten Marriage Licenses Books Using the MGGI Methodology Type Conference Article
Year 2017 Publication 8th Iberian Conference on Pattern Recognition and Image Analysis Abbreviated Journal
Volume 10255 Issue Pages 287-294
Keywords Handwritten Text Recognition; Information extraction; Language modeling; MGGI; Categories-based language model
Abstract Historical records of daily activities provide intriguing insights into the life of our ancestors, useful for demographic and genealogical research. For example, marriage license books have been used for centuries by ecclesiastical and secular institutions to register marriages. The records in these books follow a simple text structure with an evolving vocabulary, mainly composed of proper names that change over time. This distinct vocabulary makes automatic transcription and semantic information extraction difficult tasks. In previous works we studied the use of category-based language models and how a Grammatical Inference technique known as MGGI could improve the accuracy of these tasks. In this work we analyze the main causes of the semantic errors observed in previous results and apply a better implementation of the MGGI technique to solve these problems. Using the resulting language model, transcription and information extraction experiments have been carried out, and the results support our proposed approach.
Address Faro; Portugal; June 2017
Corporate Author Thesis
Publisher Place of Publication Editor L.A. Alexandre; J. Salvador Sanchez; Joao M. F. Rodriguez
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN 978-3-319-58837-7 Medium
Area Expedition Conference IbPRIA
Notes DAG; 602.006; 600.097; 600.121 Approved no
Call Number Admin @ si @ RFV2017 Serial 2952
 

 
Author Ariel Amato
Title Moving cast shadow detection Type Journal Article
Year 2014 Publication Electronic letters on computer vision and image analysis Abbreviated Journal ELCVIA
Volume 13 Issue 2 Pages 70-71
Keywords
Abstract Motion perception is an amazing innate ability of the creatures on the planet. This adroitness entails a functional advantage that enables species to compete better in the wild. The motion perception ability is usually employed at different levels, ranging from the simplest interaction with the 'physis' up to the most transcendental survival tasks. Among the five classical perception systems, vision is the most widely used in the motion perception field. Millions of years of evolution have led to a highly specialized visual system in humans, which is characterized by tremendous accuracy as well as extraordinary robustness. Although humans and an immense diversity of species can distinguish moving objects with seeming simplicity, it has proven to be a difficult and non-trivial problem from a computational perspective. In the field of Computer Vision, the detection of moving objects is a challenging and fundamental research area. This can be referred to as the 'origin' of vast and numerous vision-based research sub-areas. Nevertheless, from the bottom to the top of this hierarchical analysis, the foundations still rely on when and where motion has occurred in an image. Pixels corresponding to moving objects in image sequences can be identified by measuring changes in their values. However, a pixel's value (representing a combination of color and brightness) could also vary due to other factors such as variation in scene illumination, camera noise and nonlinear sensor responses, among others. The challenge lies in detecting whether the changes in pixel values are caused by genuine object movement or not. An additional challenging aspect in motion detection is represented by moving cast shadows. The paradox arises because a moving object and its cast shadow share similar motion patterns. However, a moving cast shadow is not a moving object.
In fact, a shadow represents a photometric illumination effect caused by the relative position of the object with respect to the light sources. Shadow detection methods are mainly divided into two domains depending on the application field. The first normally consists of static images where shadows are cast by static objects, whereas the second refers to image sequences where shadows are cast by moving objects. In the first case, shadows can provide additional geometric and semantic cues about the shape and position of the casting object as well as the location of the light source. Although this information can be extracted from static images as well as video sequences, the main focus in the second area is usually change detection, scene matching or surveillance. In this context, a shadow can severely interfere with the analysis and interpretation of the scene. The work done in the thesis is focused on the second case; thus it addresses the problem of detection and removal of moving cast shadows in video sequences in order to enhance the detection of moving objects.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ISE Approved no
Call Number Admin @ si @ Ama2014 Serial 2870
 

 
Author Youssef El Rhabi; Simon Loic; Brun Luc; Josep Llados; Felipe Lumbreras
Title Information Theoretic Rotationwise Robust Binary Descriptor Learning Type Conference Article
Year 2016 Publication Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR) Abbreviated Journal
Volume Issue Pages 368-378
Keywords
Abstract In this paper, we propose a new data-driven approach for binary descriptor selection. In order to draw a clear analysis of common designs, we present a general information-theoretic selection paradigm. It encompasses several standard binary descriptor construction schemes, including a recent state-of-the-art one named BOLD. We pursue the same endeavor to increase the stability of the produced descriptors with respect to rotations. To achieve this goal, we have designed a novel offline selection criterion which is better adapted to the online matching procedure. The effectiveness of our approach is demonstrated on two standard datasets, where our descriptor is compared to BOLD and to several classical descriptors. In particular, it emerges that our approach can achieve performance equivalent to, if not better than, that of BOLD while relying on descriptors half as long. Such an improvement can be influential for real-time applications.
Address Mérida; Mexico; November 2016
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference S+SSPR
Notes DAG; ADAS; 600.097; 600.086 Approved no
Call Number Admin @ si @ RLL2016 Serial 2871
 

 
Author Antonio Lopez; Jiaolong Xu; Jose Luis Gomez; David Vazquez; German Ros
Title From Virtual to Real World Visual Perception using Domain Adaptation -- The DPM as Example Type Book Chapter
Year 2017 Publication Domain Adaptation in Computer Vision Applications Abbreviated Journal
Volume Issue 13 Pages 243-258
Keywords Domain Adaptation
Abstract Supervised learning tends to produce more accurate classifiers than unsupervised learning in general. This implies that annotated training data is preferred. When addressing visual perception challenges, such as localizing certain object classes within an image, the learning of the involved classifiers turns out to be a practical bottleneck. The reason is that, at the least, we have to frame object examples with bounding boxes in thousands of images. A priori, the more complex the model is regarding its number of parameters, the more annotated examples are required. This annotation task is performed by human oracles, which results in inaccuracies and errors in the annotations (aka ground truth) since the task is inherently very cumbersome and sometimes ambiguous. As an alternative we have pioneered the use of virtual worlds for collecting such annotations automatically and with high precision. However, since the models learned with virtual data must operate in the real world, we still need to perform domain adaptation (DA). In this chapter we revisit the DA of a deformable part-based model (DPM) as an exemplifying case of virtual-to-real-world DA. As a use case, we address the challenge of vehicle detection for driver assistance, using different publicly available virtual-world data. While doing so, we investigate questions such as how the domain gap between virtual and real data behaves with respect to the dominant object appearance per domain, as well as the role of photo-realism in the virtual world.
Address
Corporate Author Thesis
Publisher Springer Place of Publication Editor Gabriela Csurka
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS; 600.085; 601.223; 600.076; 600.118 Approved no
Call Number ADAS @ adas @ LXG2017 Serial 2872
 

 
Author Pau Riba; Josep Llados; Alicia Fornes; Anjan Dutta
Title Large-scale graph indexing using binary embeddings of node contexts for information spotting in document image databases Type Journal Article
Year 2017 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 87 Issue Pages 203-211
Keywords
Abstract Graph-based representations are experiencing growing usage in visual recognition and retrieval due to their representational power compared with classical appearance-based representations. However, retrieving a query graph from a large dataset of graphs implies a high computational complexity. The most important property for large-scale retrieval is that the search time complexity be sub-linear in the number of database examples. With this aim, in this paper we propose a graph indexing formalism applied to visual retrieval. A binary embedding is defined as hashing keys for graph nodes. Given a database of labeled graphs, graph nodes are complemented with vectors of attributes representing their local context. Then, each attribute vector is converted to a binary code by applying a binary-valued hash function. Graph retrieval is thus formulated in terms of finding target graphs in the database whose nodes have a small Hamming distance from the query nodes, easily computed with bitwise logical operators. As an application example, we validate the performance of the proposed methods in different real scenarios such as handwritten word spotting in images of historical documents or symbol spotting in architectural floor plans.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.097; 602.006; 603.053; 600.121 Approved no
Call Number RLF2017b Serial 2873