|
Ariel Amato, Angel Sappa, Alicia Fornes, Felipe Lumbreras and Josep Llados. 2013. Divide and Conquer: Atomizing and Parallelizing A Task in A Mobile Crowdsourcing Platform. 2nd International ACM Workshop on Crowdsourcing for Multimedia.21–22.
Abstract: In this paper we present some conclusions about the advantages of having an efficient task formulation when a crowdsourcing platform is used. In particular we show how the task atomization and distribution can help to obtain results in an efficient way. Our proposal is based on a recursive splitting of the original task into a set of smaller and simpler tasks. As a result both more accurate and faster solutions are obtained. Our evaluation is performed on a set of ancient documents that need to be digitized.
|
|
|
Partha Pratim Roy, Josep Llados and Umapada Pal. 2009. A Complete System for Detection and Recognition of Text in Graphical Documents using Background Information. 5th International Conference on Computer Vision Theory and Applications.
|
|
|
Partha Pratim Roy, Umapada Pal and Josep Llados. 2009. Seal detection and recognition: An approach for document indexing. 10th International Conference on Document Analysis and Recognition.101–105.
Abstract: Reliable indexing of documents having seal instances can be achieved by recognizing seal information. This paper presents a novel approach for detecting and classifying such multi-oriented seals in these documents. First, Hough Transform based methods are applied to extract the seal regions in documents. Next, isolated text characters within these regions are detected. Rotation and size invariant features and a support vector machine based classifier have been used to recognize these detected text characters. Next, for each pair of character, we encode their relative spatial organization using their distance and angular position with respect to the centre of the seal, and enter this code into a hash table. Given an input seal, we recognize the individual text characters and compute the code for pair-wise character based on the relative spatial organization. The code obtained from the input seal helps to retrieve model hypothesis from the hash table. The seal model to which we get maximum hypothesis is selected for the recognition of the input seal. The methodology is tested to index seal in rotation and size invariant environment and we obtained encouraging results.
|
|
|
Partha Pratim Roy, Umapada Pal, Josep Llados and Mathieu Nicolas Delalandre. 2009. Multi-Oriented and Multi-Sized Touching Character Segmentation using Dynamic Programming. 10th International Conference on Document Analysis and Recognition.11–15.
Abstract: In this paper, we present a scheme towards the segmentation of English multi-oriented touching strings into individual characters. When two or more characters touch, they generate a big cavity region at the background portion. Using Convex Hull information, we use these background information to find some initial points to segment a touching string into possible primitive segments (a primitive segment consists of a single character or a part of a character). Next these primitive segments are merged to get optimum segmentation and dynamic programming is applied using total likelihood of characters as the objective function. SVM classifier is used to find the likelihood of a character. To consider multi-oriented touching strings the features used in the SVM are invariant to character orientation. Circular ring and convex hull ring based approach has been used along with angular information of the contour pixels of the character to make the feature rotation invariant. From the experiment, we obtained encouraging results.
|
|
|
Anjan Dutta and Zeynep Akata. 2019. Semantically Tied Paired Cycle Consistency for Zero-Shot Sketch-based Image Retrieval. 32nd IEEE Conference on Computer Vision and Pattern Recognition.5089–5098.
Abstract: Zero-shot sketch-based image retrieval (SBIR) is an emerging task in computer vision, allowing to retrieve natural images relevant to sketch queries that might not been seen in the training phase. Existing works either require aligned sketch-image pairs or inefficient memory fusion layer for mapping the visual information to a semantic space. In this work, we propose a semantically aligned paired cycle-consistent generative (SEM-PCYC) model for zero-shot SBIR, where each branch maps the visual information to a common semantic space via an adversarial training. Each of these branches maintains a cycle consistency that only requires supervision at category levels, and avoids the need of highly-priced aligned sketch-image pairs. A classification criteria on the generators' outputs ensures the visual to semantic space mapping to be discriminating. Furthermore, we propose to combine textual and hierarchical side information via a feature selection auto-encoder that selects discriminating side information within a same end-to-end model. Our results demonstrate a significant boost in zero-shot SBIR performance over the state-of-the-art on the challenging Sketchy and TU-Berlin datasets.
|
|
|
Partha Pratim Roy, Umapada Pal and Josep Llados. 2010. Seal Object Detection in Document Images using GHT of Local Component Shapes. 10th ACM Symposium On Applied Computing.23–27.
Abstract: Due to noise, overlapped text/signature and multi-oriented nature, seal (stamp) object detection involves a difficult challenge. This paper deals with automatic detection of seal from documents with cluttered background. Here, a seal object is characterized by scale and rotation invariant spatial feature descriptors (distance and angular position) computed from recognition result of individual connected components (characters). Recognition of multi-scale and multi-oriented component is done using Support Vector Machine classifier. Generalized Hough Transform (GHT) is used to detect the seal and a voting is casted for finding possible location of the seal object in a document based on these spatial feature descriptor of components pairs. The peak of votes in GHT accumulator validates the hypothesis to locate the seal object in a document. Experimental results show that, the method is efficient to locate seal instance of arbitrary shape and orientation in documents.
|
|
|
Muhammad Muzzamil Luqman, Thierry Brouard, Jean-Yves Ramel and Josep Llados. 2010. Vers une approche foue of encapsulation de graphes: application a la reconnaissance de symboles. Colloque International Francophone sur l'Écrit et le Document.169–184.
Abstract: We present a new methodology for symbol recognition, by employing a structural approach for representing visual associations in symbols and a statistical classifier for recognition. A graphic symbol is vectorized, its topological and geometrical details are encoded by an attributed relational graph and a signature is computed for it. Data adapted fuzzy intervals have been introduced for addressing the sensitivity of structural representations to noise. The joint probability distribution of signatures is encoded by a Bayesian network, which serves as a mechanism for pruning irrelevant features and choosing a subset of interesting features from structural signatures of underlying symbol set, and is deployed in a supervised learning scenario for recognizing query symbols. Experimental results on pre-segmented 2D linear architectural and electronic symbols from GREC databases are presented.
Keywords: Fuzzy interval; Graph embedding; Bayesian network; Symbol recognition
|
|
|
Albert Gordo, Alicia Fornes, Ernest Valveny and Josep Llados. 2010. A Bag of Notes Approach to Writer Identification in Old Handwritten Music Scores. 9th IAPR International Workshop on Document Analysis Systems.247–254.
Abstract: Determining the authorship of a document, namely writer identification, can be an important source of information for document categorization. Contrary to text documents, the identification of the writer of graphical documents is still a challenge. In this paper we present a robust approach for writer identification in a particular kind of graphical documents, old music scores. This approach adapts the bag of visual terms method for coping with graphic documents. The identification is performed only using the graphical music notation. For this purpose, we generate a graphic vocabulary without recognizing any music symbols, and consequently, avoiding the difficulties in the recognition of hand-drawn symbols in old and degraded documents. The proposed method has been tested on a database of old music scores from the 17th to 19th centuries, achieving very high identification rates.
|
|
|
Alicia Fornes and Josep Llados. 2010. A Symbol-dependent Writer Identifcation Approach in Old Handwritten Music Scores. 12th International Conference on Frontiers in Handwriting Recognition.634–639.
Abstract: Writer identification consists in determining the writer of a piece of handwriting from a set of writers. In this paper we introduce a symbol-dependent approach for identifying the writer of old music scores, which is based on two symbol recognition methods. The main idea is to use the Blurred Shape Model descriptor and a DTW-based method for detecting, recognizing and describing the music clefs and notes. The proposed approach has been evaluated in a database of old music scores, achieving very high writer identification rates.
|
|
|
Jaume Gibert and Ernest Valveny. 2010. Graph Embedding based on Nodes Attributes Representatives and a Graph of Words Representation. In In E.R. Hancock, R.C.W., T. Windeatt, I. Ulusoy and F. Escolano,, ed. 13th International worshop on structural and syntactic pattern recognition and 8th international worshop on statistical pattern recognition. Springer Berlin Heidelberg, 223–232. (LNCS.)
Abstract: Although graph embedding has recently been used to extend statistical pattern recognition techniques to the graph domain, some existing embeddings are usually computationally expensive as they rely on classical graph-based operations. In this paper we present a new way to embed graphs into vector spaces by first encapsulating the information stored in the original graph under another graph representation by clustering the attributes of the graphs to be processed. This new representation makes the association of graphs to vectors an easy step by just arranging both node attributes and the adjacency matrix in the form of vectors. To test our method, we use two different databases of graphs whose nodes attributes are of different nature. A comparison with a reference method permits to show that this new embedding is better in terms of classification rates, while being much more faster.
|
|