Mathieu Nicolas Delalandre, Jean-Yves Ramel, Ernest Valveny, & Muhammad Muzzamil Luqman. (2010). A Performance Characterization Algorithm for Symbol Localization. In Graphics Recognition. Achievements, Challenges, and Evolution. 8th International Workshop, GREC 2009. Selected Papers (Vol. 6020, 260–271). LNCS. Springer Berlin Heidelberg.
Abstract: In this paper we present an algorithm for performance characterization of symbol localization systems. This algorithm is aimed to be a more “reliable” and “open” solution to characterize the performance. To achieve that, it exploits only single points as the result of localization and offers the possibility to reconsider the localization results provided by a system. We use the information about context in groundtruth, and overall localization results, to detect the ambiguous localization results. A probability score is computed for each matching between a localization point and a groundtruth region, depending on the spatial distribution of the other regions in the groundtruth. Final characterization is given with detection rate/probability score plots, describing the sets of possible interpretations of the localization results, according to a given confidence rate. We present experimentation details along with the results for the symbol localization system of [1], exploiting a synthetic dataset of architectural floorplans and electrical diagrams (composed of 200 images and 3861 symbols).
|
Mathieu Nicolas Delalandre, Ernest Valveny, Tony Pridmore, & Dimosthenis Karatzas. (2010). Generation of Synthetic Documents for Performance Evaluation of Symbol Recognition & Spotting Systems. IJDAR - International Journal on Document Analysis and Recognition, 13(3), 187–207.
Abstract: This paper deals with the topic of performance evaluation of symbol recognition & spotting systems. We propose here a new approach to the generation of synthetic graphics documents containing non-isolated symbols in a real context. This approach is based on the definition of a set of constraints that permit us to place the symbols on a pre-defined background according to the properties of a particular domain (architecture, electronics, engineering, etc.). In this way, we can obtain a large amount of images resembling real documents by simply defining the set of constraints and providing a few pre-defined backgrounds. As documents are synthetically generated, the groundtruth (the location and the label of every symbol) becomes automatically available. We have applied this approach to the generation of a large database of architectural drawings and electronic diagrams, which shows the flexibility of the system. Performance evaluation experiments of a symbol localization system show that our approach permits to generate documents with different features that are reflected in variation of localization results.
|
Marta Teres, & Eduard Vazquez. (2010). Museums, spaces and museographical resources. Current state and proposals for a multidisciplinary framework to open new perspectives. In Proceedings of The CREATE 2010 Conference (319–323).
Abstract: Two of the main aims of a museum are to communicate its heritage and to make enjoy its visitors. This communication can be done through the pieces itself and the museographical resources but also through the building, the interior design, the light and the colour. Art museums, in opposition with other museums, lack on the application of these additional resources. Such a work necessarily requires a multidisciplinary point of view for a holistic vision of all what a museum implies and to use all its potential as a tool of knowledge and culture for all the visitors.
|
Mario Rojas, David Masip, A. Todorov, & Jordi Vitria. (2010). Automatic Point-based Facial Trait Judgments Evaluation. In 23rd IEEE Conference on Computer Vision and Pattern Recognition (2715–2720).
Abstract: Humans constantly evaluate the personalities of other people using their faces. Facial trait judgments have been studied in the psychological field, and have been determined to influence important social outcomes of our lives, such as elections outcomes and social relationships. Recent work on textual descriptions of faces has shown that trait judgments are highly correlated. Further, behavioral studies suggest that two orthogonal dimensions, valence and dominance, can describe the basis of the human judgments from faces. In this paper, we used a corpus of behavioral data of judgments on different trait dimensions to automatically learn a trait predictor from facial pixel images. We study whether trait evaluations performed by humans can be learned using machine learning classifiers, and used later in automatic evaluations of new facial images. The experiments performed using local point-based descriptors show promising results in the evaluation of the main traits.
|
Marco Pedersoli, Jordi Gonzalez, Andrew Bagdanov, & Juan J. Villanueva. (2010). Recursive Coarse-to-Fine Localization for fast Object Recognition. In 11th European Conference on Computer Vision (Vol. 6313, 280–293). LNCS. Springer Berlin Heidelberg.
Abstract: Cascading techniques are commonly used to speed-up the scan of an image for object detection. However, cascades of detectors are slow to train due to the high number of detectors and corresponding thresholds to learn. Furthermore, they do not use any prior knowledge about the scene structure to decide where to focus the search. To handle these problems, we propose a new way to scan an image, where we couple a recursive coarse-to-fine refinement together with spatial constraints of the object location. For doing that we split an image into a set of uniformly distributed neighborhood regions, and for each of these we apply a local greedy search over feature resolutions. The neighborhood is defined as a scanning region that only one object can occupy. Therefore the best hypothesis is obtained as the location with maximum score and no thresholds are needed. We present an implementation of our method using a pyramid of HOG features and we evaluate it on two standard databases, VOC2007 and INRIA dataset. Results show that the Recursive Coarse-to-Fine Localization (RCFL) achieves a 12x speed-up compared to standard sliding windows. Compared with a cascade of multiple resolutions approach our method has slightly better performance in speed and Average-Precision. Furthermore, in contrast to cascading approach, the speed-up is independent of image conditions, the number of detected objects and clutter.
|
Marc Serra. (2010). Estimating Intrinsic Images from Physical and Categorical Color Cues (Vol. 151). Master's thesis, , .
|
Marçal Rusiñol, K. Bertet, Jean-Marc Ogier, & Josep Llados. (2010). Symbol Recognition Using a Concept Lattice of Graphical Patterns. In Graphics Recognition. Achievements, Challenges, and Evolution. 8th International Workshop, GREC 2009. Selected Papers (Vol. 6020, pp. 187–198). LNCS. Springer Berlin Heidelberg.
Abstract: In this paper we propose a new approach to recognize symbols by the use of a concept lattice. We propose to build a concept lattice in terms of graphical patterns. Each model symbol is decomposed in a set of composing graphical patterns taken as primitives. Each one of these primitives is described by boundary moment invariants. The obtained concept lattice relates which symbolic patterns compose a given graphical symbol. A Hasse diagram is derived from the context and is used to recognize symbols affected by noise. We present some preliminary results over a variation of the dataset of symbols from the GREC 2005 symbol recognition contest.
|
Marçal Rusiñol, Josep Llados, & Gemma Sanchez. (2010). Symbol Spotting in Vectorized Technical Drawings Through a Lookup Table of Region Strings. PAA - Pattern Analysis and Applications, 13(3), 321–331.
Abstract: In this paper, we address the problem of symbol spotting in technical document images applied to scanned and vectorized line drawings. Like any information spotting architecture, our approach has two components. First, symbols are decomposed in primitives which are compactly represented and second a primitive indexing structure aims to efficiently retrieve similar primitives. Primitives are encoded in terms of attributed strings representing closed regions. Similar strings are clustered in a lookup table so that the set median strings act as indexing keys. A voting scheme formulates hypothesis in certain locations of the line drawing image where there is a high presence of regions similar to the queried ones, and therefore, a high probability to find the queried graphical symbol. The proposed approach is illustrated in a framework consisting in spotting furniture symbols in architectural drawings. It has been proved to work even in the presence of noise and distortion introduced by the scanning and raster-to-vector processes.
|
Marçal Rusiñol, & Josep Llados. (2010). Symbol Spotting in Digital Libraries:Focused Retrieval over Graphic-rich Document Collections. Springer.
Abstract: The specific problem of symbol recognition in graphical documents requires additional techniques to those developed for character recognition. The most well-known obstacle is the so-called Sayre paradox: Correct recognition requires good segmentation, yet improvement in segmentation is achieved using information provided by the recognition process. This dilemma can be avoided by techniques that identify sets of regions containing useful information. Such symbol-spotting methods allow the detection of symbols in maps or technical drawings without having to fully segment or fully recognize the entire content.
This unique text/reference provides a complete, integrated and large-scale solution to the challenge of designing a robust symbol-spotting method for collections of graphic-rich documents. The book examines a number of features and descriptors, from basic photometric descriptors commonly used in computer vision techniques to those specific to graphical shapes, presenting a methodology which can be used in a wide variety of applications. Additionally, readers are supplied with an insight into the problem of performance evaluation of spotting methods. Some very basic knowledge of pattern recognition, document image analysis and graphics recognition is assumed.
Keywords: Focused Retrieval , Graphical Pattern Indexation,Graphics Recognition ,Pattern Recognition , Performance Evaluation , Symbol Description ,Symbol Spotting
|
Marçal Rusiñol, & Josep Llados. (2010). Efficient Logo Retrieval Through Hashing Shape Context Descriptors. In 9th IAPR International Workshop on Document Analysis Systems (215–222).
Abstract: In this paper, we present an approach towards the retrieval of words from graphical document images. In graphical documents, due to presence of multi-oriented characters in non-structured layout, word indexing is a challenging task. The proposed approach uses recognition results of individual components to form character pairs with the neighboring components. An indexing scheme is designed to store the spatial description of components and to access them efficiently. Given a query text word (ascii/unicode format), the character pairs present in it are searched in the document. Next the retrieved character pairs are linked sequentially to form character string. Dynamic programming is applied to find different instances of query words. A string edit distance is used here to match the query word as the objective function. Recognition of multi-scale and multi-oriented character component is done using Support Vector Machine classifier. To consider multi-oriented character strings the features used in the SVM are invariant to character orientation. Experimental results show that the method is efficient to locate a query word from multi-oriented text in graphical documents.
|
Marçal Rusiñol, Farshad Nourbakhsh, Dimosthenis Karatzas, Ernest Valveny, & Josep Llados. (2010). Perceptual Image Retrieval by Adding Color Information to the Shape Context Descriptor. In 20th International Conference on Pattern Recognition (1594–1597).
Abstract: In this paper we present a method for the retrieval of images in terms of perceptual similarity. Local color information is added to the shape context descriptor in order to obtain an object description integrating both shape and color as visual cues. We use a color naming algorithm in order to represent the color information from a perceptual point of view. The proposed method has been tested in two different applications, an object retrieval scenario based on color sketch queries and a color trademark retrieval problem. Experimental results show that the addition of the color information significantly outperforms the sole use of the shape context descriptor.
|
Marçal Rusiñol, Agnes Borras, & Josep Llados. (2010). Relational Indexing of Vectorial Primitives for Symbol Spotting in Line-Drawing Images. PRL - Pattern Recognition Letters, 31(3), 188–201.
Abstract: This paper presents a symbol spotting approach for indexing by content a database of line-drawing images. As line-drawings are digital-born documents designed by vectorial softwares, instead of using a pixel-based approach, we present a spotting method based on vector primitives. Graphical symbols are represented by a set of vectorial primitives which are described by an off-the-shelf shape descriptor. A relational indexing strategy aims to retrieve symbol locations into the target documents by using a combined numerical-relational description of 2D structures. The zones which are likely to contain the queried symbol are validated by a Hough-like voting scheme. In addition, a performance evaluation framework for symbol spotting in graphical documents is proposed. The presented methodology has been evaluated with a benchmarking set of architectural documents achieving good performance results.
Keywords: Document image analysis and recognition, Graphics recognition, Symbol spotting ,Vectorial representations, Line-drawings
|
Lluis Pere de las Heras. (2010). Syntactic Model for Semantic Document Analysis (Vol. 158).
|
Koen E.A. van de Sande, Theo Gevers, & C.G.M. Snoek. (2010). Evaluating Color Descriptors for Object and Scene Recognition. TPAMI - IEEE Transaction on Pattern Analysis and Machine Intelligence, 32(9), 1582–1596.
Abstract: Impact factor: 5.308
Image category recognition is important to access visual information on the level of objects and scene types. So far, intensity-based descriptors have been widely used for feature extraction at salient points. To increase illumination invariance and discriminative power, color descriptors have been proposed. Because many different descriptors exist, a structured overview is required of color invariant descriptors in the context of image category recognition. Therefore, this paper studies the invariance properties and the distinctiveness of color descriptors (software to compute the color descriptors from this paper is available from http://www.colordescriptors.com) in a structured way. The analytical invariance properties of color descriptors are explored, using a taxonomy based on invariance properties with respect to photometric transformations, and tested experimentally using a data set with known illumination conditions. In addition, the distinctiveness of color descriptors is assessed experimentally using two benchmarks, one from the image domain and one from the video domain. From the theoretical and experimental results, it can be derived that invariance to light intensity changes and light color changes affects category recognition. The results further reveal that, for light intensity shifts, the usefulness of invariance is category-specific. Overall, when choosing a single descriptor and no prior knowledge about the data set and object and scene categories is available, the OpponentSIFT is recommended. Furthermore, a combined set of color descriptors outperforms intensity-based SIFT and improves category recognition by 8 percent on the PASCAL VOC 2007 and by 7 percent on the Mediamill Challenge.
|
Josep M. Gonfaus, Xavier Boix, Joost Van de Weijer, Andrew Bagdanov, Joan Serrat, & Jordi Gonzalez. (2010). Harmony Potentials for Joint Classification and Segmentation. In 23rd IEEE Conference on Computer Vision and Pattern Recognition (3280–3287).
Abstract: Hierarchical conditional random fields have been successfully applied to object segmentation. One reason is their ability to incorporate contextual information at different scales. However, these models do not allow multiple labels to be assigned to a single node. At higher scales in the image, this yields an oversimplified model, since multiple classes can be reasonable expected to appear within one region. This simplified model especially limits the impact that observations at larger scales may have on the CRF model. Neglecting the information at larger scales is undesirable since class-label estimates based on these scales are more reliable than at smaller, noisier scales. To address this problem, we propose a new potential, called harmony potential, which can encode any possible combination of class labels. We propose an effective sampling strategy that renders tractable the underlying optimization problem. Results show that our approach obtains state-of-the-art results on two challenging datasets: Pascal VOC 2009 and MSRC-21.
|