|
Marçal Rusiñol, K. Bertet, Jean-Marc Ogier and Josep Llados. 2010. Symbol Recognition Using a Concept Lattice of Graphical Patterns. Graphics Recognition. Achievements, Challenges, and Evolution. 8th International Workshop, GREC 2009. Selected Papers. Springer Berlin Heidelberg, 187–198. (LNCS.)
Abstract: In this paper we propose a new approach to recognize symbols by the use of a concept lattice. We propose to build a concept lattice in terms of graphical patterns. Each model symbol is decomposed in a set of composing graphical patterns taken as primitives. Each one of these primitives is described by boundary moment invariants. The obtained concept lattice relates which symbolic patterns compose a given graphical symbol. A Hasse diagram is derived from the context and is used to recognize symbols affected by noise. We present some preliminary results over a variation of the dataset of symbols from the GREC 2005 symbol recognition contest.
|
|
|
Nuria Cirera. 2012. Recognition of Handwritten Historical Documents. (Master's thesis, .)
|
|
|
Miquel Ferrer, I. Bardaji, Ernest Valveny, Dimosthenis Karatzas and Horst Bunke. 2013. Median Graph Computation by Means of Graph Embedding into Vector Spaces. In Yun Fu and Yungian Ma, eds. Graph Embedding for Pattern Analysis. Springer New York, 45–72.
Abstract: In pattern recognition [8, 14], a key issue to be addressed when designing a system is how to represent input patterns. Feature vectors is a common option. That is, a set of numerical features describing relevant properties of the pattern are computed and arranged in a vector form. The main advantages of this kind of representation are computational simplicity and a well sound mathematical foundation. Thus, a large number of operations are available to work with vectors and a large repository of algorithms for pattern analysis and classification exist. However, the simple structure of feature vectors might not be the best option for complex patterns where nonnumerical features or relations between different parts of the pattern become relevant.
|
|
|
Christophe Rigaud, Dimosthenis Karatzas, Jean-Christophe Burie and Jean-Marc Ogier. 2013. Speech balloon contour classification in comics. 10th IAPR International Workshop on Graphics Recognition.
Abstract: Comic books digitization combined with subsequent comic book understanding create a variety of new applications, including mobile reading and data mining. Document understanding in this domain is challenging as comics are semi-structured documents, combining semantically important graphical and textual parts. In this work we detail a novel approach for classifying speech balloon in scanned comics book pages based on their contour time series.
|
|
|
Carles Sanchez, Oriol Ramos Terrades, Patricia Marquez, Enric Marti, Jaume Rocarias and Debora Gil. 2014. Evaluación automática de prácticas en Moodle para el aprendizaje autónomo en Ingenierías.
|
|
|
David Fernandez, Jon Almazan, Nuria Cirera, Alicia Fornes and Josep Llados. 2014. BH2M: the Barcelona Historical Handwritten Marriages database. 22nd International Conference on Pattern Recognition.256–261.
Abstract: This paper presents an image database of historical handwritten marriages records stored in the archives of Barcelona cathedral, and the corresponding meta-data addressed to evaluate the performance of document analysis algorithms. The contribution of this paper is twofold. First, it presents a complete ground truth which covers the whole pipeline of handwriting
recognition research, from layout analysis to recognition and understanding. Second, it is the first dataset in the emerging area of genealogical document analysis, where documents are manuscripts pseudo-structured with specific lexicons and the interest is beyond pure transcriptions but context dependent.
|
|
|
David Fernandez, R.Manmatha, Josep Llados and Alicia Fornes. 2014. Sequential Word Spotting in Historical Handwritten Documents. 11th IAPR International Workshop on Document Analysis and Systems.101–105.
Abstract: In this work we present a handwritten word spotting approach that takes advantage of the a priori known order of appearance of the query words. Given an ordered sequence of query word instances, the proposed approach performs a
sequence alignment with the words in the target collection. Although the alignment is quite sparse, i.e. the number of words in the database is higher than the query set, the improvement in the overall performance is sensitively higher than isolated word spotting. As application dataset, we use a collection of handwritten marriage licenses taking advantage of the ordered
index pages of family names.
|
|
|
Pau Riba, Jon Almazan, Alicia Fornes, David Fernandez, Ernest Valveny and Josep Llados. 2014. e-Crowds: a mobile platform for browsing and searching in historical demographyrelated manuscripts. 14th International Conference on Frontiers in Handwriting Recognition.228–233.
Abstract: This paper presents a prototype system running on portable devices for browsing and word searching through historical handwritten document collections. The platform adapts the paradigm of eBook reading, where the narrative is not necessarily sequential, but centered on the user actions. The novelty is to replace digitally born books by digitized historical manuscripts of marriage licenses, so document analysis tasks are required in the browser. With an active reading paradigm, the user can cast queries of people names, so he/she can implicitly follow genealogical links. In addition, the system allows combined searches: the user can refine a search by adding more words to search. As a second contribution, the retrieval functionality involves as a core technology a word spotting module with an unified approach, which allows combined query searches, and also two input modalities: query-by-example, and query-by-string.
|
|
|
Anjan Dutta. 2014. Inexact Subgraph Matching Applied to Symbol Spotting in Graphical Documents. (Ph.D. thesis, Ediciones Graficas Rey.)
Abstract: There is a resurgence in the use of structural approaches in the usual object recognition and retrieval problem. Graph theory, in particular, graph matching plays a relevant role in that. Specifically, the detection of an object (or a part of that) in an image in terms of structural features can be formulated as a subgraph matching. Subgraph matching is a challenging task. Specially due to the presence of outliers most of the graph matching algorithms do not perform well in subgraph matching scenario. Also exact subgraph isomorphism has proven to be an NP-complete problem. So naturally, in graph matching community, there are lot of efforts addressing the problem of subgraph matching within suboptimal bound. Most of them work with approximate algorithms that try to get an inexact solution in estimated way. In addition, usual recognition must cope with distortion. Inexact graph matching consists in finding the best isomorphism under a similarity measure. Theoretically this thesis proposes algorithms for solving subgraph matching in an approximate and inexact way.
We consider the symbol spotting problem on graphical documents or line drawings from application point of view. This is a well known problem in the graphics recognition community. It can be further applied for indexing and classification of documents based on their contents. The structural nature of this kind of documents easily motivates one for giving a graph based representation. So the symbol spotting problem on graphical documents can be considered as a subgraph matching problem. The main challenges in this application domain is the noise and distortions that might come during the usage, digitalization and raster to vector conversion of those documents. Apart from that computer vision nowadays is not any more confined within a limited number of images. So dealing a huge number of images with graph based method is a further challenge.
In this thesis, on one hand, we have worked on efficient and robust graph representation to cope with the noise and distortions coming from documents. On the other hand, we have worked on different graph based methods and framework to solve the subgraph matching problem in a better approximated way, which can also deal with considerable number of images. Firstly, we propose a symbol spotting method by hashing serialized subgraphs. Graph serialization allows to create factorized substructures such as graph paths, which can be organized in hash tables depending on the structural similarities of the serialized subgraphs. The involvement of hashing techniques helps to reduce the search space substantially and speeds up the spotting procedure. Secondly, we introduce contextual similarities based on the walk based propagation on tensor product graph. These contextual similarities involve higher order information and more reliable than pairwise similarities. We use these higher order similarities to formulate subgraph matching as a node and edge selection problem in the tensor product graph. Thirdly, we propose near convex grouping to form near convex region adjacency graph which eliminates the limitations of traditional region adjacency graph representation for graphic recognition. Fourthly, we propose a hierarchical graph representation by simplifying/correcting the structural errors to create a hierarchical graph of the base graph. Later these hierarchical graph structures are matched with some graph matching methods. Apart from that, in this thesis we have provided an overall experimental comparison of all the methods and some of the state-of-the-art methods. Furthermore, some dataset models have also been proposed.
|
|
|
Christophe Rigaud, Dimosthenis Karatzas, Jean-Christophe Burie and Jean-Marc Ogier. 2014. Color descriptor for content-based drawing retrieval. 11th IAPR International Workshop on Document Analysis and Systems.267–271.
Abstract: Human detection in computer vision field is an active field of research. Extending this to human-like drawings such as the main characters in comic book stories is not trivial. Comics analysis is a very recent field of research at the intersection of graphics, texts, objects and people recognition. The detection of the main comic characters is an essential step towards a fully automatic comic book understanding. This paper presents a color-based approach for comics character retrieval using content-based drawing retrieval and color palette.
|
|