Eduard Vazquez, Joost Van de Weijer, & Ramon Baldrich. (2008). Image Segmentation in the Presence of Shadows and Highligts. In 10th European Conference on Computer Vision (Vol. 5305, 1–14). LNCS.
|
Albert Gordo, Florent Perronnin, & Ernest Valveny. (2012). Document classification using multiple views. In 10th IAPR International Workshop on Document Analysis Systems (pp. 33–37). IEEE Computer Society Washington.
Abstract: The combination of multiple features or views when representing documents or other kinds of objects usually leads to improved results in classification (and retrieval) tasks. Most systems assume that those views will be available both at training and test time. However, some views may be too `expensive' to be available at test time. In this paper, we consider the use of Canonical Correlation Analysis to leverage `expensive' views that are available only at training time. Experimental results show that this information may significantly improve the results in a classification task.
|
Klaus Broelemann, Anjan Dutta, Xiaoyi Jiang, & Josep Llados. (2013). Plausibility-Graphs for Symbol Spotting in Graphical Documents. In 10th IAPR International Workshop on Graphics Recognition.
Abstract: Graph representation of graphical documents often suffers from noise viz. spurious nodes and spurios edges of graph and their discontinuity etc. In general these errors occur during the low-level image processing viz. binarization, skeletonization, vectorization etc. Hierarchical graph representation is a nice and efficient way to solve this kind of problem by hierarchically merging node-node and node-edge depending on the distance.
But the creation of hierarchical graph representing the graphical information often uses hard thresholds on the distance to create the hierarchical nodes (next state) of the lower nodes (or states) of a graph. As a result the representation often loses useful information. This paper introduces plausibilities to the nodes of hierarchical graph as a function of distance and proposes a modified algorithm for matching subgraphs of the hierarchical
graphs. The plausibility-annotated nodes help to improve the performance of the matching algorithm on two hierarchical structures. To show the potential of this approach, we conduct an experiment with the SESYD dataset.
|
Anjan Dutta, Josep Llados, Horst Bunke, & Umapada Pal. (2013). A Product graph based method for dual subgraph matching applied to symbol spotting. In 10th IAPR International Workshop on Graphics Recognition.
Abstract: Product graph has been shown to be an efficient way for matching subgraphs. This paper reports the extension of the product graph methodology for subgraph matching applied to symbol spotting in graphical documents. This paper focuses on the two major limitations of the previous version of product graph: (1) Spurious nodes and edges in the graph representation and (2) Inefficient node and edge attributes. To deal with noisy information of vectorized graphical documents, we consider a dual graph representation on the original graph representing the graphical information and the product graph is computed between the dual graphs of the query graphs and the input graph.
The dual graph with redundant edges is helpful for efficient and tolerating encoding of the structural information of the graphical documents. The adjacency matrix of the product graph locates similar path information of two graphs and exponentiating the adjacency matrix finds similar paths of greater lengths. Nodes joining similar paths between two graphs are found by combining different exponentials of adjacency matrices. An experimental investigation reveals that the recall obtained by this approach is quite encouraging.
|
V.C.Kieu, Alicia Fornes, M. Visani, N.Journet, & Anjan Dutta. (2013). The ICDAR/GREC 2013 Music Scores Competition on Staff Removal. In 10th IAPR International Workshop on Graphics Recognition.
Abstract: The first competition on music scores that was organized at ICDAR and GREC in 2011 awoke the interest of researchers, who participated both at staff removal and writer identification tasks. In this second edition, we propose a staff removal competition where we simulate old music scores. Thus, we have created a new set of images, which contain noise and 3D distortions. This paper describes the distortion methods, metrics, the participant’s methods and the obtained results.
Keywords: Competition; Music scores; Staff Removal
|
Marçal Rusiñol, V. Poulain d'Andecy, Dimosthenis Karatzas, & Josep Llados. (2013). Classification of Administrative Document Images by Logo Identification. In 10th IAPR International Workshop on Graphics Recognition.
Abstract: This paper is focused on the categorization of administrative document images (such as invoices) based on the recognition of the supplier's graphical logo. Two different methods are proposed, the first one uses a bag-of-visual-words model whereas the second one tries to locate logo images described by the blurred shape model descriptor within documents by a sliding-window technique. Preliminar results are reported with a dataset of real administrative documents.
|
Marçal Rusiñol, Dimosthenis Karatzas, & Josep Llados. (2013). Spotting Graphical Symbols in Camera-Acquired Documents in Real Time. In 10th IAPR International Workshop on Graphics Recognition.
Abstract: In this paper we present a system devoted to spot graphical symbols in camera-acquired document images. The system is based on the extraction and further matching of ORB compact local features computed over interest key-points. Then, the FLANN indexing framework based on approximate nearest neighbor search allows to efficiently match local descriptors between the captured scene and the graphical models. Finally, the RANSAC algorithm is used in order to compute the homography between the spotted symbol and its appearance in the document image. The proposed approach is efficient and is able to work in real time.
|
Lluis Pere de las Heras, David Fernandez, Alicia Fornes, Ernest Valveny, Gemma Sanchez, & Josep Llados. (2013). Perceptual retrieval of architectural floor plans. In 10th IAPR International Workshop on Graphics Recognition.
Abstract: This paper proposes a runlength histogram signature as a percetual descriptor of architectural plans in a retrieval scenario. The style of an architectural drawing is characterized by the perception of lines, shapes and texture. Such visual stimuli are the basis for defining semantic concepts as space properties, symmetry, density, etc. We propose runlength histograms extracted in vertical, horizontal and diagonal directions as a characterization of line and space properties in floorplans, so it can be roughly associated to a description of walls and room structure. A retrieval application illustrates the performance of the proposed approach, where given a plan as a query,
similar ones are obtained from a database. A ground truth based on human observation has been constructed to validate the hypothesis. Preliminary results show the interest of the proposed approach and opens a challenging research line in graphics recognition.
|
Lluis Pere de las Heras, Ernest Valveny, & Gemma Sanchez. (2013). Combining structural and statistical strategies for unsupervised wall detection in floor plans. In 10th IAPR International Workshop on Graphics Recognition.
Abstract: This paper presents an evolution of the first unsupervised wall segmentation method in floor plans, that was presented by the authors in [1]. This first approach, contrarily to the existing ones, is able to segment walls independently to their notation and without the need of any pre-annotated data
to learn their visual appearance. Despite the good performance of the first approach, some specific cases, such as curved shaped walls, were not correctly segmented since they do not agree the strict structural assumptions that guide the whole methodology in order to be able to learn, in an unsupervised way, the structure of a wall. In this paper, we refine this strategy by dividing the
process in two steps. In a first step, potential wall segments are extracted unsupervisedly using a modification of [1], by restricting even more the areas considered as walls in a first moment. In a second step, these segments are used to learn and spot lost instances based on a modified version of [2], also presented by the authors. The presented combined method have been tested on
4 datasets with different notations and compared with the stateof-the-art applyed on the same datasets. The results show its adaptability to different wall notations and shapes, significantly outperforming the original approach.
|
Christophe Rigaud, Dimosthenis Karatzas, Jean-Christophe Burie, & Jean-Marc Ogier. (2013). Speech balloon contour classification in comics. In 10th IAPR International Workshop on Graphics Recognition.
Abstract: Comic books digitization combined with subsequent comic book understanding create a variety of new applications, including mobile reading and data mining. Document understanding in this domain is challenging as comics are semi-structured documents, combining semantically important graphical and textual parts. In this work we detail a novel approach for classifying speech balloon in scanned comics book pages based on their contour time series.
|
Lluis Pere de las Heras, David Fernandez, Alicia Fornes, Ernest Valveny, Gemma Sanchez, & Josep Llados. (2013). Runlength Histogram Image Signature for Perceptual Retrieval of Architectural Floor Plans. In 10th IAPR International Workshop on Graphics Recognition.
|
Lluis Pere de las Heras, Ernest Valveny, & Gemma Sanchez. (2013). Unsupervised and Notation-Independent Wall Segmentation in Floor Plans Using a Combination of Statistical and Structural Strategies. In 10th IAPR International Workshop on Graphics Recognition.
|
Pau Riba, Josep Llados, Alicia Fornes, & Anjan Dutta. (2015). Large-scale Graph Indexing using Binary Embeddings of Node Contexts. In C.-L.Liu, B.Luo, W.G.Kropatsch, & J.Cheng (Eds.), 10th IAPR-TC15 Workshop on Graph-based Representations in Pattern Recognition (Vol. 9069, pp. 208–217). LNCS. Springer International Publishing.
Abstract: Graph-based representations are experiencing a growing usage in visual recognition and retrieval due to their representational power in front of classical appearance-based representations in terms of feature vectors. Retrieving a query graph from a large dataset of graphs has the drawback of the high computational complexity required to compare the query and the target graphs. The most important property for a large-scale retrieval is the search time complexity to be sub-linear in the number of database examples. In this paper we propose a fast indexation formalism for graph retrieval. A binary embedding is defined as hashing keys for graph nodes. Given a database of labeled graphs, graph nodes are complemented with vectors of attributes representing their local context. Hence, each attribute counts the length of a walk of order k originated in a vertex with label l. Each attribute vector is converted to a binary code applying a binary-valued hash function. Therefore, graph retrieval is formulated in terms of finding target graphs in the database whose nodes have a small Hamming distance from the query nodes, easily computed with bitwise logical operators. As an application example, we validate the performance of the proposed methods in a handwritten word spotting scenario in images of historical documents.
Keywords: Graph matching; Graph indexing; Application in document analysis; Word spotting; Binary embedding
|
Nil Ballus, Bhalaji Nagarajan, & Petia Radeva. (2022). Opt-SSL: An Enhanced Self-Supervised Framework for Food Recognition. In 10th Iberian Conference on Pattern Recognition and Image Analysis (Vol. 13256). LNCS.
Abstract: Self-supervised Learning has been showing upbeat performance in several computer vision tasks. The popular contrastive methods make use of a Siamese architecture with different loss functions. In this work, we go deeper into two very recent state of the art frameworks, namely, SimSiam and Barlow Twins. Inspired by them, we propose a new self-supervised learning method we call Opt-SSL that combines both image and feature contrasting. We validate the proposed method on the food recognition task, showing that our proposed framework enables the self-learning networks to learn better visual representations.
Keywords: Self-supervised; Contrastive learning; Food recognition
|
Fadi Dornaika, & Franck Davoine. (2005). Simultaneous Facial Action Tracking and Expression Recognition using a Particle Filter.
|