|
Debora Gil, & Petia Radeva. (2003). Curvature Vector Flow to Assure Convergent Deformable Models for Shape Modelling. In B. Springer (Ed.), Energy Minimization Methods In Computer Vision And Pattern Recognition (Vol. 2683, pp. 357–372). LNCS. Lisbon, PORTUGAL: Springer, Berlin.
Abstract: Poor convergence to concave shapes is a main limitation of snakes as a standard segmentation and shape modelling technique. The gradient of the external energy of the snake represents a force that pushes the snake into concave regions, as its internal energy increases when new inexion points are created. In spite of the improvement of the external energy by the gradient vector ow technique, highly non convex shapes can not be obtained, yet. In the present paper, we develop a new external energy based on the geometry of the curve to be modelled. By tracking back the deformation of a curve that evolves by minimum curvature ow, we construct a distance map that encapsulates the natural way of adapting to non convex shapes. The gradient of this map, which we call curvature vector ow (CVF), is capable of attracting a snake towards any contour, whatever its geometry. Our experiments show that, any initial snake condition converges to the curve to be modelled in optimal time.
Keywords: Initial condition; Convex shape; Non convex analysis; Increase; Segmentation; Gradient; Standard; Standards; Concave shape; Flow models; Tracking; Edge detection; Curvature
|
|
|
Josep Llados, Ernest Valveny, Gemma Sanchez, & Enric Marti. (2002). Symbol recognition: current advances and perspectives. In Dorothea Blostein and Young- Bin Kwon (Ed.), Graphics Recognition Algorithms And Applications (Vol. 2390, pp. 104–128). LNCS. Springer-Verlag.
Abstract: The recognition of symbols in graphic documents is an intensive research activity in the community of pattern recognition and document analysis. A key issue in the interpretation of maps, engineering drawings, diagrams, etc. is the recognition of domain dependent symbols according to a symbol database. In this work we first review the most outstanding symbol recognition methods from two different points of view: application domains and pattern recognition methods. In the second part of the paper, open and unaddressed problems involved in symbol recognition are described, analyzing their current state of art and discussing future research challenges. Thus, issues such as symbol representation, matching, segmentation, learning, scalability of recognition methods and performance evaluation are addressed in this work. Finally, we discuss the perspectives of symbol recognition concerning to new paradigms such as user interfaces in handheld computers or document database and WWW indexing by graphical content.
|
|
|
Jose Manuel Alvarez, & Antonio Lopez. (2012). Photometric Invariance by Machine Learning. In Jan-Mark Geusebroek Joost van de Weijer A. G. Theo Gevers (Ed.), Color in Computer Vision: Fundamentals and Applications (Vol. 7, pp. 113–134). iConcept Press Ltd.
|
|
|
Josep Llados, & Marçal Rusiñol. (2014). Graphics Recognition Techniques. In D. Doermann, & K. Tombre (Eds.), Handbook of Document Image Processing and Recognition (Vol. D, pp. 489–521). Springer London.
Abstract: This chapter describes the most relevant approaches for the analysis of graphical documents. The graphics recognition pipeline can be splitted into three tasks. The low level or lexical task extracts the basic units composing the document. The syntactic level is focused on the structure, i.e., how graphical entities are constructed, and involves the location and classification of the symbols present in the document. The third level is a functional or semantic level, i.e., it models what the graphical symbols do and what they mean in the context where they appear. This chapter covers the lexical level, while the next two chapters are devoted to the syntactic and semantic level, respectively. The main problems reviewed in this chapter are raster-to-vector conversion (vectorization algorithms) and the separation of text and graphics components. The research and industrial communities have provided standard methods achieving reasonable performance levels. Hence, graphics recognition techniques can be considered to be in a mature state from a scientific point of view. Additionally this chapter provides insights on some related problems, namely, the extraction and recognition of dimensions in engineering drawings, and the recognition of hatched and tiled patterns. Both problems are usually associated, even integrated, in the vectorization process.
Keywords: Dimension recognition; Graphics recognition; Graphic-rich documents; Polygonal approximation; Raster-to-vector conversion; Texture-based primitive extraction; Text-graphics separation
|
|
|
Salvatore Tabbone, & Oriol Ramos Terrades. (2014). An Overview of Symbol Recognition. In D. Doermann, & K. Tombre (Eds.), Handbook of Document Image Processing and Recognition (Vol. D, pp. 523–551). Springer London.
Abstract: According to the Cambridge Dictionaries Online, a symbol is a sign, shape, or object that is used to represent something else. Symbol recognition is a subfield of general pattern recognition problems that focuses on identifying, detecting, and recognizing symbols in technical drawings, maps, or miscellaneous documents such as logos and musical scores. This chapter aims at providing the reader an overview of the different existing ways of describing and recognizing symbols and how the field has evolved to attain a certain degree of maturity.
Keywords: Pattern recognition; Shape descriptors; Structural descriptors; Symbolrecognition; Symbol spotting
|
|
|
A.Kesidis, & Dimosthenis Karatzas. (2014). Logo and Trademark Recognition. In D. Doermann, & K. Tombre (Eds.), Handbook of Document Image Processing and Recognition (Vol. D, pp. 591–646). Springer London.
Abstract: The importance of logos and trademarks in nowadays society is indisputable, variably seen under a positive light as a valuable service for consumers or a negative one as a catalyst of ever-increasing consumerism. This chapter discusses the technical approaches for enabling machines to work with logos, looking into the latest methodologies for logo detection, localization, representation, recognition, retrieval, and spotting in a variety of media. This analysis is presented in the context of three different applications covering the complete depth and breadth of state of the art techniques. These are trademark retrieval systems, logo recognition in document images, and logo detection and removal in images and videos. This chapter, due to the very nature of logos and trademarks, brings together various facets of document image analysis spanning graphical and textual content, while it links document image analysis to other computer vision domains, especially when it comes to the analysis of real-scene videos and images.
Keywords: Logo recognition; Logo removal; Logo spotting; Trademark registration; Trademark retrieval systems
|
|
|
Alicia Fornes, & Gemma Sanchez. (2014). Analysis and Recognition of Music Scores. In D. Doermann, & K. Tombre (Eds.), Handbook of Document Image Processing and Recognition (Vol. E, pp. 749–774). Springer London.
Abstract: The analysis and recognition of music scores has attracted the interest of researchers for decades. Optical Music Recognition (OMR) is a classical research field of Document Image Analysis and Recognition (DIAR), whose aim is to extract information from music scores. Music scores contain both graphical and textual information, and for this reason, techniques are closely related to graphics recognition and text recognition. Since music scores use a particular diagrammatic notation that follow the rules of music theory, many approaches make use of context information to guide the recognition and solve ambiguities. This chapter overviews the main Optical Music Recognition (OMR) approaches. Firstly, the different methods are grouped according to the OMR stages, namely, staff removal, music symbol recognition, and syntactical analysis. Secondly, specific approaches for old and handwritten music scores are reviewed. Finally, online approaches and commercial systems are also commented.
|
|
|
Nataliya Shapovalova, Carles Fernandez, Xavier Roca, & Jordi Gonzalez. (2011). Semantics of Human Behavior in Image Sequences. In Albert Ali Salah, & (Ed.), Computer Analysis of Human Behavior (pp. 151–182). Springer London.
Abstract: Human behavior is contextualized and understanding the scene of an action is crucial for giving proper semantics to behavior. In this chapter we present a novel approach for scene understanding. The emphasis of this work is on the particular case of Human Event Understanding. We introduce a new taxonomy to organize the different semantic levels of the Human Event Understanding framework proposed. Such a framework particularly contributes to the scene understanding domain by (i) extracting behavioral patterns from the integrative analysis of spatial, temporal, and contextual evidence and (ii) integrative analysis of bottom-up and top-down approaches in Human Event Understanding. We will explore how the information about interactions between humans and their environment influences the performance of activity recognition, and how this can be extrapolated to the temporal domain in order to extract higher inferences from human events observed in sequences of images.
|
|
|
Murad Al Haj, Carles Fernandez, Zhanwu Xiong, Ivan Huerta, Jordi Gonzalez, & Xavier Roca. (2011). Beyond the Static Camera: Issues and Trends in Active Vision. In Th.B. Moeslund, A. Hilton, V. Krüger, & L. Sigal (Eds.), Visual Analysis of Humans: Looking at People (pp. 11–30). Springer London.
Abstract: Maximizing both the area coverage and the resolution per target is highly desirable in many applications of computer vision. However, with a limited number of cameras viewing a scene, the two objectives are contradictory. This chapter is dedicated to active vision systems, trying to achieve a trade-off between these two aims and examining the use of high-level reasoning in such scenarios. The chapter starts by introducing different approaches to active cameras configurations. Later, a single active camera system to track a moving object is developed, offering the reader first-hand understanding of the issues involved. Another section discusses practical considerations in building an active vision platform, taking as an example a multi-camera system developed for a European project. The last section of the chapter reflects upon the future trends of using semantic factors to drive smartly coordinated active systems.
|
|
|
Svebor Karaman, Giuseppe Lisanti, Andrew Bagdanov, & Alberto del Bimbo. (2014). From re-identification to identity inference: Labeling consistency by local similarity constraints. In Person Re-Identification (Vol. 2, pp. 287–307). Springer London.
Abstract: In this chapter, we introduce the problem of identity inference as a generalization of person re-identification. It is most appropriate to distinguish identity inference from re-identification in situations where a large number of observations must be identified without knowing a priori that groups of test images represent the same individual. The standard single- and multishot person re-identification common in the literature are special cases of our formulation. We present an approach to solving identity inference by modeling it as a labeling problem in a Conditional Random Field (CRF). The CRF model ensures that the final labeling gives similar labels to detections that are similar in feature space. Experimental results are given on the ETHZ, i-LIDS and CAVIAR datasets. Our approach yields state-of-the-art performance for multishot re-identification, and our results on the more general identity inference problem demonstrate that we are able to infer the identity of very many examples even with very few labeled images in the gallery.
Keywords: re-identification; Identity inference; Conditional random fields; Video surveillance
|
|
|
Muhammad Muzzamil Luqman, Jean-Yves Ramel, & Josep Llados. (2013). Multilevel Analysis of Attributed Graphs for Explicit Graph Embedding in Vector Spaces. In Graph Embedding for Pattern Analysis (pp. 1–26). Springer New York.
Abstract: Ability to recognize patterns is among the most crucial capabilities of human beings for their survival, which enables them to employ their sophisticated neural and cognitive systems [1], for processing complex audio, visual, smell, touch, and taste signals. Man is the most complex and the best existing system of pattern recognition. Without any explicit thinking, we continuously compare, classify, and identify huge amount of signal data everyday [2], starting from the time we get up in the morning till the last second we fall asleep. This includes recognizing the face of a friend in a crowd, a spoken word embedded in noise, the proper key to lock the door, smell of coffee, the voice of a favorite singer, the recognition of alphabetic characters, and millions of more tasks that we perform on regular basis.
|
|
|
Miquel Ferrer, I. Bardaji, Ernest Valveny, Dimosthenis Karatzas, & Horst Bunke. (2013). Median Graph Computation by Means of Graph Embedding into Vector Spaces. In Yun Fu, & Yungian Ma (Eds.), Graph Embedding for Pattern Analysis (pp. 45–72). Springer New York.
Abstract: In pattern recognition [8, 14], a key issue to be addressed when designing a system is how to represent input patterns. Feature vectors is a common option. That is, a set of numerical features describing relevant properties of the pattern are computed and arranged in a vector form. The main advantages of this kind of representation are computational simplicity and a well sound mathematical foundation. Thus, a large number of operations are available to work with vectors and a large repository of algorithms for pattern analysis and classification exist. However, the simple structure of feature vectors might not be the best option for complex patterns where nonnumerical features or relations between different parts of the pattern become relevant.
|
|
|
C. Alejandro Parraga. (2014). Color Vision, Computational Methods for. In Dieter Jaeger, & Ranu Jung (Eds.), Encyclopedia of Computational Neuroscience (pp. 1–11). Springer-Verlag Berlin Heidelberg.
Abstract: The study of color vision has been aided by a whole battery of computational methods that attempt to describe the mechanisms that lead to our perception of colors in terms of the information-processing properties of the visual system. Their scope is highly interdisciplinary, linking apparently dissimilar disciplines such as mathematics, physics, computer science, neuroscience, cognitive science, and psychology. Since the sensation of color is a feature of our brains, computational approaches usually include biological features of neural systems in their descriptions, from retinal light-receptor interaction to subcortical color opponency, cortical signal decoding, and color categorization. They produce hypotheses that are usually tested by behavioral or psychophysical experiments.
Keywords: Color computational vision; Computational neuroscience of color
|
|
|
Javier Marin, David Geronimo, David Vazquez, & Antonio Lopez. (2012). Pedestrian Detection: Exploring Virtual Worlds. In Handbook of Pattern Recognition: Methods and Application (Vol. 5, pp. 145–162). iConcept Press.
Abstract: Handbook of pattern recognition will include contributions from university educators and active research experts. This Handbook is intended to serve as a basic reference on methods and applications of pattern recognition. The primary aim of this handbook is providing the community of pattern recognition with a readable, easy to understand resource that covers introductory, intermediate and advanced topics with equal clarity. Therefore, the Handbook of pattern recognition can serve equally well as reference resource and as classroom textbook. Contributions cover all methods, techniques and applications of pattern recognition. A tentative list of relevant topics might include: 1- Statistical, structural, syntactic pattern recognition. 2- Neural networks, machine learning, data mining. 3- Discrete geometry, algebraic, graph-based techniques for pattern recognition. 4- Face recognition, Signal analysis, image coding and processing, shape and texture analysis. 5- Document processing, text and graphics recognition, digital libraries. 6- Speech recognition, music analysis, multimedia systems. 7- Natural language analysis, information retrieval. 8- Biometrics, biomedical pattern analysis and information systems. 9- Other scientific, engineering, social and economical applications of pattern recognition. 10- Special hardware architectures, software packages for pattern recognition.
Keywords: Virtual worlds; Pedestrian Detection; Domain Adaptation
|
|
|
Julie Digne, Mariella Dimiccoli, Neus Sabater, & Philippe Salembier. (2015). Neighborhood Filters and the Recovery of 3D Information. In Handbook of Mathematical Methods in Imaging (pp. 1645–1673). Springer New York.
Abstract: Following their success in image processing (see Chapter Local Smoothing Neighborhood Filters), neighborhood filters have been extended to 3D surface processing. This adaptation is not straightforward. It has led to several variants for surfaces depending on whether the surface is defined as a mesh, or as a raw data point set. The image gray level in the bilateral similarity measure is replaced by a geometric information such as the normal or the curvature. The first section of this chapter reviews the variants of 3D mesh bilateral filters and compares them to the simplest possible isotropic filter, the mean curvature motion.In a second part, this chapter reviews applications of the bilateral filter to a data composed of a sparse depth map (or of depth cues) and of the image on which they have been computed. Such sparse depth cues can be obtained by stereovision or by psychophysical techniques. The underlying assumption to these applications is that pixels with similar intensity around a region are likely to have similar depths. Therefore, when diffusing depth information with a bilateral filter based on locality and color similarity, the discontinuities in depth are assured to be consistent with the color discontinuities, which is generally a desirable property. In the reviewed applications, this ends up with the reconstruction of a dense perceptual depth map from the joint data of an image and of depth cues.
|
|