Publicacions CVC -- Query Results

	Record						Links
	Author	David Aldavert
	Title	Efficient and Scalable Handwritten Word Spotting on Historical Documents using Bag of Visual Words			Type	Book Whole
	Year	2021	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Word spotting can be defined as the pattern recognition task aimed at locating and retrieving a specific keyword within a document image collection without explicitly transcribing the whole corpus. Its use is particularly interesting in scenarios where Optical Character Recognition performs poorly or cannot be used at all. This thesis focuses on such a scenario: word spotting on historical handwritten documents that have been written by a single author or by multiple authors with similar calligraphy. This problem requires a visual signature that is robust to image artifacts, flexible enough to accommodate script variations, and efficient enough to retrieve information rapidly. For this, we have developed a set of word spotting methods built on the well-known Bag-of-Visual-Words (BoVW) representation. This representation has gained popularity in the document image analysis community as a way to characterize handwritten words in an unsupervised manner. However, most approaches in this field rely on a basic BoVW configuration and disregard complex encoding and spatial representations. We determine which BoVW configurations provide the best performance boost to a spotting system. Then, we extend segmentation-based word spotting, where word candidates are given a priori, to segmentation-free spotting. The proposed approach seeds the document images with overlapping word location candidates and characterizes them with a BoVW signature. Retrieval is achieved by comparing the query and candidate signatures and returning the locations with the highest consensus. This is a simple but powerful approach that requires a more compact signature than in a segmentation-based scenario. We first project the BoVW signature into a reduced semantic topic space and then compress it further using Product Quantizers. The resulting signature requires only a few dozen bytes, allowing us to index thousands of pages on a common desktop computer.
The final system still yields performance comparable to the state of the art despite the information lost during the compression phases. Afterwards, we also study how to combine different information modalities in order to create a query-by-X spotting system where words are indexed using one modality and queries are made using another. We consider three different information modalities: visual, textual and audio. Our proposal is to create a latent feature space where semantically related features are projected onto the same topics. This creates a new feature space in which information from different modalities can be compared. Later, we consider the codebook generation and descriptor encoding problem. The codebooks used to encode the BoVW signatures are usually created using an unsupervised clustering algorithm, and they require testing multiple parameters to determine which configuration is best for a certain document collection. We propose a semantic clustering algorithm that allows the best parameter to be estimated from data. Since gathering annotated data is costly, we use synthetically generated word images. The resulting codebook is database agnostic, i.e., a codebook that yields good performance on document collections that use the same script. We also propose the use of an additional codebook to approximate descriptors and reduce the descriptor encoding complexity to sub-linear. Finally, we focus on the problem of signature dimensionality. We propose a new symbol probability signature where each bin represents the probability that a certain symbol is present at a certain location of the word image. This signature is extremely compact and, combined with compression techniques, can represent word images with just a few bytes per signature.
	Address	April 2021
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Marçal Rusiñol;Josep Llados
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-84-122714-5-4	Medium
	Area		Expedition		Conference
	Notes	DAG; 600.121;ADAS			Approved	no
	Call Number	Admin @ si @ Ald2021			Serial	3601