<?xml version="1.0" encoding="UTF-8" ?>

<modsCollection xmlns="http://www.loc.gov/mods/v3">
	<mods version="3.2">
		<titleInfo>
			<title>Multi-oriented scene text detection in video based on wavelet and angle projection boundary growing</title>
		</titleInfo>
		<name type="personal">
			<namePart type="family">Shivakumara</namePart>
			<namePart type="given">Palaiahnakote</namePart>
			<role>
				<roleTerm authority="marcrelator" type="text">author</roleTerm>
			</role>
		</name>
		<name type="personal">
			<namePart type="family">Dutta</namePart>
			<namePart type="given">Anjan</namePart>
			<role>
				<roleTerm authority="marcrelator" type="text">author</roleTerm>
			</role>
		</name>
		<name type="personal">
			<namePart type="family">Tan</namePart>
			<namePart type="given">Chew Lim</namePart>
			<role>
				<roleTerm authority="marcrelator" type="text">author</roleTerm>
			</role>
		</name>
		<name type="personal">
			<namePart type="family">Pal</namePart>
			<namePart type="given">Umapada</namePart>
			<role>
				<roleTerm authority="marcrelator" type="text">author</roleTerm>
			</role>
		</name>
		<originInfo>
			<dateIssued>2014</dateIssued>
		</originInfo>
		<abstract>In this paper, we address two complex issues: 1) text frame classification and 2) multi-oriented text detection in video text frames. We first divide a video frame into 16 blocks and propose a combination of wavelet and median-moments with k-means clustering at the block level to identify probable text blocks. For each probable text block, the method applies the same combination of features with k-means clustering over a sliding window running through the block to identify potential text candidates. We introduce a new idea of symmetry on text candidates in each block, based on the observation that the pixel distribution in text exhibits a symmetric pattern. The method integrates all blocks containing text candidates in the frame, and all text candidates are then mapped onto a Sobel edge map of the original frame to obtain text representatives. To tackle the multi-orientation problem, we present a new method called Angle Projection Boundary Growing (APBG), an iterative algorithm based on a nearest neighbor concept. APBG is then applied to the text representatives to fix the bounding boxes for multi-oriented text lines in the video frame. Directional information is used to eliminate false positives. Experimental results on a variety of datasets, including non-horizontal data, horizontal data, publicly available data (Hua’s data), and ICDAR-03 competition data (camera images), show that the proposed method outperforms existing methods proposed for video as well as state-of-the-art methods for scene text.</abstract>
		<note>DAG; 600.077</note>
		<note>exported from refbase (http://refbase.cvc.uab.es/show.php?record=2357), last updated on Thu, 26 Feb 2015 17:02:07 +0100</note>
		<typeOfResource>text</typeOfResource>
		<identifier type="doi">10.1007/s11042-013-1385-0</identifier>
		<identifier type="local">Admin @ si @ SDT2014</identifier>
		<relatedItem type="host">
			<titleInfo>
				<title>Multimedia Tools and Applications</title>
			</titleInfo>
			<titleInfo type="abbreviated">
				<title>MTAP</title>
			</titleInfo>
			<originInfo>
				<dateIssued>2014</dateIssued>
				<publisher>Springer US</publisher>
				<issuance>continuing</issuance>
			</originInfo>
			<genre authority="marcgt">periodical</genre>
			<genre>academic journal</genre>
			<part>
				<detail type="volume">
					<number>72</number>
				</detail>
				<detail type="issue">
					<number>1</number>
				</detail>
				<extent unit="page">
					<start>515</start>
					<end>539</end>
				</extent>
			</part>
			<identifier type="issn">1380-7501</identifier>
		</relatedItem>
	</mods>
</modsCollection>