|
Palaiahnakote Shivakumara, Anjan Dutta, Chew Lim Tan and Umapada Pal. 2014. Multi-oriented scene text detection in video based on wavelet and angle projection boundary growing. MTAP, 72(1), 515–539.
Abstract: In this paper, we address two complex issues: 1) Text frame classification and 2) Multi-oriented text detection in video text frame. We first divide a video frame into 16 blocks and propose a combination of wavelet and median-moments with k-means clustering at the block level to identify probable text blocks. For each probable text block, the method applies the same combination of feature with k-means clustering over a sliding window running through the blocks to identify potential text candidates. We introduce a new idea of symmetry on text candidates in each block based on the observation that pixel distribution in text exhibits a symmetric pattern. The method integrates all blocks containing text candidates in the frame and then all text candidates are mapped on to a Sobel edge map of the original frame to obtain text representatives. To tackle the multi-orientation problem, we present a new method called Angle Projection Boundary Growing (APBG) which is an iterative algorithm and works based on a nearest neighbor concept. APBG is then applied on the text representatives to fix the bounding box for multi-oriented text lines in the video frame. Directional information is used to eliminate false positives. Experimental results on a variety of datasets such as non-horizontal, horizontal, publicly available data (Hua’s data) and ICDAR-03 competition data (camera images) show that the proposed method outperforms existing methods proposed for video and the state of the art methods for scene text as well.
|
|
|
Anjan Dutta, Josep Llados, Horst Bunke and Umapada Pal. 2014. A Product Graph Based Method for Dual Subgraph Matching Applied to Symbol Spotting. In Bart Lamiroy and Jean-Marc Ogier, eds. Graphics Recognition. Current Trends and Challenges. Springer Berlin Heidelberg, 7–11. (LNCS.)
Abstract: Product graph has been shown as a way for matching subgraphs. This paper reports the extension of the product graph methodology for subgraph matching applied to symbol spotting in graphical documents. Here we focus on the two major limitations of the previous version of the algorithm: (1) spurious nodes and edges in the graph representation and (2) inefficient node and edge attributes. To deal with noisy information of vectorized graphical documents, we consider a dual edge graph representation on the original graph representing the graphical information and the product graph is computed between the dual edge graphs of the pattern graph and the target graph. The dual edge graph with redundant edges is helpful for efficient and tolerating encoding of the structural information of the graphical documents. The adjacency matrix of the product graph locates the pair of similar edges of two operand graphs and exponentiating the adjacency matrix finds similar random walks of greater lengths. Nodes joining similar random walks between two graphs are found by combining different weighted exponentials of adjacency matrices. An experimental investigation reveals that the recall obtained by this approach is quite encouraging.
Keywords: Product graph; Dual edge graph; Subgraph matching; Random walks; Graph kernel
|
|
|
Marçal Rusiñol, Lluis Pere de las Heras and Oriol Ramos Terrades. 2014. Flowchart Recognition for Non-Textual Information Retrieval in Patent Search. IR, 17(5-6), 545–562.
Abstract: Relatively little research has been done on the topic of patent image retrieval and in general in most of the approaches the retrieval is performed in terms of a similarity measure between the query image and the images in the corpus. However, systems aimed at overcoming the semantic gap between the visual description of patent images and their conveyed concepts would be very helpful for patent professionals. In this paper we present a flowchart recognition method aimed at achieving a structured representation of flowchart images that can be further queried semantically. The proposed method was submitted to the CLEF-IP 2012 flowchart recognition task. We report the obtained results on this dataset.
Keywords: Flowchart recognition; Patent documents; Text/graphics separation; Raster-to-vector conversion; Symbol recognition
|
|
|
T.Chauhan, E.Perales, Kaida Xiao, E.Hird, Dimosthenis Karatzas and Sophie Wuerger. 2014. The achromatic locus: Effect of navigation direction in color space. VSS, 14 (1)(25), 1–11.
Abstract: 5Y Impact Factor: 2.99 / 1st (Ophthalmology)
An achromatic stimulus is defined as a patch of light that is devoid of any hue. This is usually achieved by asking observers to adjust the stimulus such that it looks neither red nor green and at the same time neither yellow nor blue. Despite the theoretical and practical importance of the achromatic locus, little is known about the variability in these settings. The main purpose of the current study was to evaluate whether achromatic settings were dependent on the task of the observers, namely the navigation direction in color space. Observers could either adjust the test patch along the two chromatic axes in the CIE u*v* diagram or, alternatively, navigate along the unique-hue lines. Our main result is that the navigation method affects the reliability of these achromatic settings. Observers are able to make more reliable achromatic settings when adjusting the test patch along the directions defined by the four unique hues as opposed to navigating along the main axes in the commonly used CIE u*v* chromaticity plane. This result holds across different ambient viewing conditions (Dark, Daylight, Cool White Fluorescent) and different test luminance levels (5, 20, and 50 cd/m2). The reduced variability in the achromatic settings is consistent with the idea that internal color representations are more aligned with the unique-hue lines than the u* and v* axes.
Keywords: achromatic; unique hues; color constancy; luminance; color space
|
|
|
A.Kesidis and Dimosthenis Karatzas. 2014. Logo and Trademark Recognition. In D. Doermann and K. Tombre, eds. Handbook of Document Image Processing and Recognition. Springer London, 591–646.
Abstract: The importance of logos and trademarks in nowadays society is indisputable, variably seen under a positive light as a valuable service for consumers or a negative one as a catalyst of ever-increasing consumerism. This chapter discusses the technical approaches for enabling machines to work with logos, looking into the latest methodologies for logo detection, localization, representation, recognition, retrieval, and spotting in a variety of media. This analysis is presented in the context of three different applications covering the complete depth and breadth of state of the art techniques. These are trademark retrieval systems, logo recognition in document images, and logo detection and removal in images and videos. This chapter, due to the very nature of logos and trademarks, brings together various facets of document image analysis spanning graphical and textual content, while it links document image analysis to other computer vision domains, especially when it comes to the analysis of real-scene videos and images.
Keywords: Logo recognition; Logo removal; Logo spotting; Trademark registration; Trademark retrieval systems
|
|
|
Anjan Dutta. 2014. Inexact Subgraph Matching Applied to Symbol Spotting in Graphical Documents. (Ph.D. thesis, Ediciones Graficas Rey.)
Abstract: There is a resurgence in the use of structural approaches in the usual object recognition and retrieval problem. Graph theory, in particular, graph matching plays a relevant role in that. Specifically, the detection of an object (or a part of that) in an image in terms of structural features can be formulated as a subgraph matching. Subgraph matching is a challenging task. Specially due to the presence of outliers most of the graph matching algorithms do not perform well in subgraph matching scenario. Also exact subgraph isomorphism has proven to be an NP-complete problem. So naturally, in graph matching community, there are lot of efforts addressing the problem of subgraph matching within suboptimal bound. Most of them work with approximate algorithms that try to get an inexact solution in estimated way. In addition, usual recognition must cope with distortion. Inexact graph matching consists in finding the best isomorphism under a similarity measure. Theoretically this thesis proposes algorithms for solving subgraph matching in an approximate and inexact way.
We consider the symbol spotting problem on graphical documents or line drawings from application point of view. This is a well known problem in the graphics recognition community. It can be further applied for indexing and classification of documents based on their contents. The structural nature of this kind of documents easily motivates one for giving a graph based representation. So the symbol spotting problem on graphical documents can be considered as a subgraph matching problem. The main challenges in this application domain is the noise and distortions that might come during the usage, digitalization and raster to vector conversion of those documents. Apart from that computer vision nowadays is not any more confined within a limited number of images. So dealing a huge number of images with graph based method is a further challenge.
In this thesis, on one hand, we have worked on efficient and robust graph representation to cope with the noise and distortions coming from documents. On the other hand, we have worked on different graph based methods and framework to solve the subgraph matching problem in a better approximated way, which can also deal with considerable number of images. Firstly, we propose a symbol spotting method by hashing serialized subgraphs. Graph serialization allows to create factorized substructures such as graph paths, which can be organized in hash tables depending on the structural similarities of the serialized subgraphs. The involvement of hashing techniques helps to reduce the search space substantially and speeds up the spotting procedure. Secondly, we introduce contextual similarities based on the walk based propagation on tensor product graph. These contextual similarities involve higher order information and more reliable than pairwise similarities. We use these higher order similarities to formulate subgraph matching as a node and edge selection problem in the tensor product graph. Thirdly, we propose near convex grouping to form near convex region adjacency graph which eliminates the limitations of traditional region adjacency graph representation for graphic recognition. Fourthly, we propose a hierarchical graph representation by simplifying/correcting the structural errors to create a hierarchical graph of the base graph. Later these hierarchical graph structures are matched with some graph matching methods. Apart from that, in this thesis we have provided an overall experimental comparison of all the methods and some of the state-of-the-art methods. Furthermore, some dataset models have also been proposed.
|
|
|
Clement Guerin, Christophe Rigaud, Karell Bertet, Jean-Christophe Burie, Arnaud Revel and Jean-Marc Ogier. 2014. Réduction de l’espace de recherche pour les personnages de bandes dessinées. 19th National Congress Reconnaissance de Formes et l'Intelligence Artificielle.
Abstract: Les bandes dessinées représentent un patrimoine culturel important dans de nombreux pays et leur numérisation massive offre la possibilité d'effectuer des recherches dans le contenu des images. À ce jour, ce sont principalement les structures des pages et leurs contenus textuels qui ont été étudiés, peu de travaux portent sur le contenu graphique. Nous proposons de nous appuyer sur des éléments déjà étudiés tels que la position des cases et des bulles, pour réduire l'espace de recherche et localiser les personnages en fonction de la queue des bulles. L'évaluation de nos différentes contributions à partir de la base eBDtheque montre un taux de détection des queues de bulle de 81.2%, de localisation des personnages allant jusqu'à 85% et un gain d'espace de recherche de plus de 50%.
Keywords: contextual search; document analysis; comics characters
|
|
|
Christophe Rigaud and Clement Guerin. 2014. Localisation contextuelle des personnages de bandes dessinées. Colloque International Francophone sur l'Écrit et le Document.
Abstract: Les auteurs proposent une méthode de localisation des personnages dans des cases de bandes dessinées en s'appuyant sur les caractéristiques des bulles de dialogue. L'évaluation montre un taux de localisation des personnages allant jusqu'à 65%.
|
|
|
Lluis Gomez and Dimosthenis Karatzas. 2014. Scene Text Recognition: No Country for Old Men? 1st International Workshop on Robust Reading.
|
|
|
Thanh Ha Do, Salvatore Tabbone and Oriol Ramos Terrades. 2014. Spotting Symbol Using Sparsity over Learned Dictionary of Local Descriptors. 11th IAPR International Workshop on Document Analysis and Systems.156–160.
Abstract: This paper proposes a new approach to spot symbols into graphical documents using sparse representations. More specifically, a dictionary is learned from a training database of local descriptors defined over the documents. Following their sparse representations, interest points sharing similar properties are used to define interest regions. Using an original adaptation of information retrieval techniques, a vector model for interest regions and for a query symbol is built based on its sparsity in a visual vocabulary where the visual words are columns in the learned dictionary. The matching process is performed comparing the similarity between vector models. Evaluation on SESYD datasets demonstrates that our method is promising.
|
|