|
Anjan Dutta, Josep Llados, Horst Bunke, & Umapada Pal. (2013). A Product graph based method for dual subgraph matching applied to symbol spotting. In 10th IAPR International Workshop on Graphics Recognition.
Abstract: Product graph has been shown to be an efficient way for matching subgraphs. This paper reports the extension of the product graph methodology for subgraph matching applied to symbol spotting in graphical documents. This paper focuses on the two major limitations of the previous version of product graph: (1) Spurious nodes and edges in the graph representation and (2) Inefficient node and edge attributes. To deal with noisy information of vectorized graphical documents, we consider a dual graph representation on the original graph representing the graphical information and the product graph is computed between the dual graphs of the query graphs and the input graph.
The dual graph with redundant edges is helpful for efficient and tolerant encoding of the structural information of the graphical documents. The adjacency matrix of the product graph locates similar path information of two graphs and exponentiating the adjacency matrix finds similar paths of greater lengths. Nodes joining similar paths between two graphs are found by combining different exponentials of adjacency matrices. An experimental investigation reveals that the recall obtained by this approach is quite encouraging.
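The role of the adjacency-matrix powers can be illustrated with a minimal sketch. It assumes plain unattributed graphs given as 0/1 adjacency matrices and a tensor (Kronecker) product graph; the function names `kron`, `matmul` and `walk_scores` are hypothetical, and the dual graph construction and node/edge attributes of the paper are not modelled here.

```python
def kron(a, b):
    """Kronecker product of two square adjacency matrices (lists of lists).

    Row i * len(b) + j of the result corresponds to the node pair (i, j).
    """
    n, m = len(a), len(b)
    return [[a[i][k] * b[j][l] for k in range(n) for l in range(m)]
            for i in range(n) for j in range(m)]


def matmul(x, y):
    """Plain matrix product of two square matrices of equal size."""
    size = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def walk_scores(a, b, max_len):
    """Score each node pair by the number of common walks of length 1..max_len.

    Assumption: summing the powers with equal weight is one simple way of
    "combining" them; other weightings are possible.
    """
    product = kron(a, b)
    power = product
    total = [row[:] for row in product]
    for _ in range(max_len - 1):
        power = matmul(power, product)
        total = [[t + p for t, p in zip(t_row, p_row)]
                 for t_row, p_row in zip(total, power)]
    m = len(b)
    return {(idx // m, idx % m): sum(row) for idx, row in enumerate(total)}


# Toy input (illustrative only): a 3-node path 0-1-2 and a single edge 0-1.
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
edge = [[0, 1], [1, 0]]
scores = walk_scores(path, edge, max_len=2)
```

In this toy case the middle node of the path receives the highest score when paired with either end of the edge, because more walks of matching length start from it.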
|
|
|
Ivan Huerta, Ariel Amato, Xavier Roca, & Jordi Gonzalez. (2013). Exploiting Multiple Cues in Motion Segmentation Based on Background Subtraction. NEUCOM - Neurocomputing, 100, 183–196.
Abstract: This paper presents a novel algorithm for mobile-object segmentation from static background scenes, which is both robust and accurate under most of the common problems found in motion segmentation. In our first contribution, a case analysis of motion segmentation errors is presented taking into account the inaccuracies associated with different cues, namely colour, edge and intensity. Our second contribution is a hybrid architecture which copes with the main issues observed in the case analysis by fusing the knowledge from the aforementioned three cues and a temporal difference algorithm. On one hand, we enhance the colour and edge models to solve not only global and local illumination changes (i.e. shadows and highlights) but also the camouflage in intensity. In addition, local information is also exploited to solve the camouflage in chroma. On the other hand, the intensity cue is applied when colour and edge cues are not available because their values are beyond the dynamic range. Additionally, a temporal difference scheme is included to segment motion where those three cues cannot be reliably computed, for example in those background regions not visible during the training period. Lastly, our approach is extended for handling ghost detection. The proposed method obtains very accurate and robust motion segmentation results in multiple indoor and outdoor scenarios, while outperforming the most-referred state-of-the-art approaches.
Keywords: Motion segmentation; Shadow suppression; Colour segmentation; Edge segmentation; Ghost detection; Background subtraction
|
|
|
Bhaskar Chakraborty, Andrew Bagdanov, Jordi Gonzalez, & Xavier Roca. (2013). Human Action Recognition Using an Ensemble of Body-Part Detectors. EXSY - Expert Systems, 30(2), 101–114.
Abstract: This paper describes an approach to human action recognition based on a probabilistic optimization model of body parts using a hidden Markov model (HMM). Our method is able to distinguish between similar actions by only considering the body parts having major contribution to the actions, for example, legs for walking, jogging and running; arms for boxing, waving and clapping. We apply HMMs to model the stochastic movement of the body parts for action recognition. The HMM construction uses an ensemble of body-part detectors, followed by grouping of part detections, to perform human identification. Three example-based body-part detectors are trained to detect three components of the human body: the head, legs and arms. These detectors cope with viewpoint changes and self-occlusions through the use of ten sub-classifiers that detect body parts over a specific range of viewpoints. Each sub-classifier is a support vector machine trained on features selected for their discriminative power for each particular part/viewpoint combination. Grouping of these detections is performed using a simple geometric constraint model that yields a viewpoint-invariant human detector. We test our approach on three publicly available action datasets: the KTH dataset, Weizmann dataset and HumanEva dataset. Our results illustrate that with a simple and compact representation we can achieve robust recognition of human actions comparable to the most complex, state-of-the-art methods.
Keywords: Human action recognition; Body-part detection; Hidden Markov model
|
|
|
Nataliya Shapovalova, Carles Fernandez, Xavier Roca, & Jordi Gonzalez. (2011). Semantics of Human Behavior in Image Sequences. In Albert Ali Salah, & (Ed.), Computer Analysis of Human Behavior (pp. 151–182). Springer London.
Abstract: Human behavior is contextualized and understanding the scene of an action is crucial for giving proper semantics to behavior. In this chapter we present a novel approach for scene understanding. The emphasis of this work is on the particular case of Human Event Understanding. We introduce a new taxonomy to organize the different semantic levels of the Human Event Understanding framework proposed. Such a framework particularly contributes to the scene understanding domain by (i) extracting behavioral patterns from the integrative analysis of spatial, temporal, and contextual evidence and (ii) providing an integrative analysis of bottom-up and top-down approaches in Human Event Understanding. We will explore how the information about interactions between humans and their environment influences the performance of activity recognition, and how this can be extrapolated to the temporal domain in order to extract higher inferences from human events observed in sequences of images.
|
|
|
Bhaskar Chakraborty, Michael Holte, Thomas B. Moeslund, Jordi Gonzalez, & Xavier Roca. (2011). A Selective Spatio-Temporal Interest Point Detector for Human Action Recognition in Complex Scenes. In 13th IEEE International Conference on Computer Vision (pp. 1776–1783).
Abstract: Recent progress in the field of human action recognition points towards the use of Spatio-Temporal Interest Points (STIPs) for local descriptor-based recognition strategies. In this paper we present a new approach for STIP detection by applying surround suppression combined with local and temporal constraints. Our method is significantly different from existing STIP detectors and improves the performance by detecting more repeatable, stable and distinctive STIPs for human actors, while suppressing unwanted background STIPs. For action representation we use a bag-of-visual words (BoV) model of local N-jet features to build a vocabulary of visual-words. To this end, we introduce a novel vocabulary building strategy by combining spatial pyramid and vocabulary compression techniques, resulting in improved performance and efficiency. Action class specific Support Vector Machine (SVM) classifiers are trained for categorization of human actions. A comprehensive set of experiments on existing benchmark datasets, and more challenging datasets of complex scenes, validate our approach and show state-of-the-art performance.
|
|
|
Wenjuan Gong, Jürgen Brauer, Michael Arens, & Jordi Gonzalez. (2011). Modeling vs. Learning Approaches for Monocular 3D Human Pose Estimation. In 1st IEEE International Workshop on Performance Evaluation on Recognition of Human Actions and Pose Estimation Methods.
|
|
|
Jordi Gonzalez, Josep M. Gonfaus, Carles Fernandez, & Xavier Roca. (2011). Exploiting Natural-Language Interaction in Video Surveillance Systems. In V&L Net Workshop on Vision and Language.
|
|
|
Murad Al Haj, Carles Fernandez, Zhanwu Xiong, Ivan Huerta, Jordi Gonzalez, & Xavier Roca. (2011). Beyond the Static Camera: Issues and Trends in Active Vision. In Th.B. Moeslund, A. Hilton, V. Krüger, & L. Sigal (Eds.), Visual Analysis of Humans: Looking at People (pp. 11–30). Springer London.
Abstract: Maximizing both the area coverage and the resolution per target is highly desirable in many applications of computer vision. However, with a limited number of cameras viewing a scene, the two objectives are contradictory. This chapter is dedicated to active vision systems, trying to achieve a trade-off between these two aims and examining the use of high-level reasoning in such scenarios. The chapter starts by introducing different approaches to active camera configurations. Later, a single active camera system to track a moving object is developed, offering the reader first-hand understanding of the issues involved. Another section discusses practical considerations in building an active vision platform, taking as an example a multi-camera system developed for a European project. The last section of the chapter reflects upon the future trends of using semantic factors to drive smartly coordinated active systems.
|
|
|
Kaida Xiao, Chenyang Fu, Dimosthenis Karatzas, & Sophie Wuerger. (2011). Visual Gamma Correction for LCD Displays. DIS - Displays, 32(1), 17–23.
Abstract: An improved method for visual gamma correction is developed for LCD displays to increase the accuracy of digital colour reproduction. Rather than utilising a photometric measurement device, we use observers’ visual luminance judgements for gamma correction. Eight half-tone patterns were designed to generate relative luminances from 1/9 to 8/9 for each colour channel. A psychophysical experiment was conducted on an LCD display to find the digital signals corresponding to each relative luminance by visually matching the half-tone background to a uniform colour patch. Both inter- and intra-observer variability for the eight luminance matches in each channel were assessed and the luminance matches proved to be consistent across observers (ΔE00 < 3.5) and repeatable (ΔE00 < 2.2). Based on the individual observer judgements, the display opto-electronic transfer function (OETF) was estimated by using either a 3rd order polynomial regression or linear interpolation for each colour channel. The performance of the proposed method is evaluated by predicting the CIE tristimulus values of a set of coloured patches (using the observer-based OETFs) and comparing them to the expected CIE tristimulus values (using the OETF obtained from spectro-radiometric luminance measurements). The resulting colour differences range from 2 to 4.6 ΔE00. We conclude that this observer-based method of visual gamma correction is useful to estimate the OETF for LCD displays. Its major advantage is that no particular functional relationship between digital inputs and luminance outputs has to be assumed.
Keywords: Display calibration; Psychophysics; Perceptual; Visual gamma correction; Luminance matching; Observer-based calibration
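The linear-interpolation variant of the OETF estimate can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `make_oetf` is hypothetical, the end points (0, 0) and (255, 1) assume an 8-bit channel, and the eight "matched" digital values are synthetic stand-ins generated from a gamma 2.2 curve rather than observer data.

```python
from bisect import bisect_right


def make_oetf(digital_levels, relative_luminances, max_level=255.0):
    """Return a piecewise-linear map from digital level to relative luminance.

    Assumption: the curve is anchored at (0, 0) and (max_level, 1), and
    digital_levels is strictly increasing.
    """
    xs = [0.0] + list(digital_levels) + [max_level]
    ys = [0.0] + list(relative_luminances) + [1.0]

    def oetf(level):
        if level <= xs[0]:
            return ys[0]
        if level >= xs[-1]:
            return ys[-1]
        i = bisect_right(xs, level) - 1  # segment containing the level
        t = (level - xs[i]) / (xs[i + 1] - xs[i])
        return ys[i] + t * (ys[i + 1] - ys[i])

    return oetf


# Relative luminances of the eight half-tone patterns: 1/9 ... 8/9.
targets = [k / 9 for k in range(1, 9)]
# Synthetic matches (NOT measured data): digital values a gamma-2.2 display
# would need to reach each target luminance.
matches = [255.0 * t ** (1 / 2.2) for t in targets]

oetf = make_oetf(matches, targets)
mid_grey = oetf(128)  # estimated relative luminance at digital level 128
```

Because the curve is built directly from the matched points, no functional form such as a power law has to be assumed; the gamma 2.2 curve is used here only to fabricate plausible input for the sketch.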
|
|
|
Kaida Xiao, Sophie Wuerger, Chenyang Fu, & Dimosthenis Karatzas. (2011). Unique Hue Data for Colour Appearance Models. Part I: Loci of Unique Hues and Hue Uniformity. CRA - Color Research & Application, 36(5), 316–323.
Abstract: Psychophysical experiments were conducted to assess unique hues on a CRT display for a large sample of colour-normal observers (n = 185). These data were then used to evaluate the most commonly used colour appearance model, CIECAM02, by transforming the CIEXYZ tristimulus values of the unique hues to the CIECAM02 colour appearance attributes, lightness, chroma and hue angle. We report two findings: (1) the hue angles derived from our unique hue data are inconsistent with the commonly used Natural Color System hues that are incorporated in the CIECAM02 model. We argue that our predicted unique hue angles (derived from our large dataset) provide a more reliable standard for colour management applications when the precise specification of these salient colours is important. (2) We test hue uniformity for CIECAM02 in all four unique hues and show significant disagreements for all hues, except for unique red which seems to be invariant under lightness changes. Our dataset is useful to improve the CIECAM02 model as it provides reliable data for benchmarking.
Keywords: unique hues; colour appearance models; CIECAM02; hue uniformity
|
|
|
Albert Gordo, & Florent Perronnin. (2011). Asymmetric Distances for Binary Embeddings. In IEEE Conference on Computer Vision and Pattern Recognition (pp. 729–736).
Abstract: In large-scale query-by-example retrieval, embedding image signatures in a binary space offers two benefits: data compression and search efficiency. While most embedding algorithms binarize both query and database signatures, it has been noted that this is not strictly a requirement. Indeed, asymmetric schemes which binarize the database signatures but not the query still enjoy the same two benefits but may provide superior accuracy. In this work, we propose two general asymmetric distances which are applicable to a wide variety of embedding techniques including Locality Sensitive Hashing (LSH), Locality Sensitive Binary Codes (LSBC), Spectral Hashing (SH) and Semi-Supervised Hashing (SSH). We experiment on four public benchmarks containing up to 1M images and show that the proposed asymmetric distances consistently lead to large improvements over the symmetric Hamming distance for all binary embedding techniques. We also propose a novel simple binary embedding technique – PCA Embedding (PCAE) – which is shown to yield competitive results with respect to more complex algorithms such as SH and SSH.
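The difference between a symmetric and an asymmetric comparison can be sketched with sign-based binarization. This is an illustrative sketch only: the function names are hypothetical, the inputs are assumed to be already-projected real-valued signatures, and the asymmetric score shown (squared distance from the unbinarized query to the orthant selected by the database code) is one simple choice, not necessarily either of the two distances proposed in the paper.

```python
def binarize(values):
    """Sign-based binary code: 1 where the projected value is positive."""
    return [1 if v > 0 else 0 for v in values]


def hamming(code_a, code_b):
    """Symmetric distance: number of differing bits between two codes."""
    return sum(x != y for x, y in zip(code_a, code_b))


def asymmetric_distance(query_values, db_code):
    """Asymmetric distance between a real-valued query and a binary code.

    Only dimensions whose sign disagrees with the stored bit contribute,
    weighted by how far the query lies from the decision boundary at 0.
    """
    return sum(q * q for q, bit in zip(query_values, db_code)
               if (q > 0) != (bit == 1))


# Toy values (illustrative only).
query = [0.1, -2.0, 0.5]
db_a = [0, 0, 1]  # disagrees with the query on a near-zero dimension
db_b = [1, 1, 1]  # disagrees with the query on a strongly negative dimension

sym_a = hamming(binarize(query), db_a)
sym_b = hamming(binarize(query), db_b)
asym_a = asymmetric_distance(query, db_a)
asym_b = asymmetric_distance(query, db_b)
```

Both database codes are at Hamming distance 1 from the binarized query, so the symmetric distance cannot rank them, whereas the asymmetric score prefers `db_a` because the query barely crosses the boundary on the disagreeing dimension.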
|
|
|
Chenyang Fu, Kaida Xiao, Dimosthenis Karatzas, & Sophie Wuerger. (2011). Investigation of Unique Hue Setting Changes with Ageing. COL - Chinese Optics Letters, 9(5), 053301-5.
Abstract: Chromatic sensitivity along the protan, deutan, and tritan lines and the loci of the unique hues (red, green, yellow, blue) for a very large sample (n = 185) of colour-normal observers ranging from 18 to 75 years of age are assessed. Visual judgments are obtained under normal viewing conditions using colour patches on a self-luminous display under controlled adaptation conditions. Trivector discrimination thresholds show an increase as a function of age along the protan, deutan, and tritan axes, with the largest increase present along the tritan line; less pronounced shifts in unique hue settings are also observed. Based on the chromatic (protan, deutan, tritan) thresholds and using scaled cone signals, we predict the unique hue changes with ageing. A dependency on age for unique red and unique yellow for predicted hue angle is found. We conclude that the chromatic sensitivity deteriorates significantly with age, whereas the appearance of unique hues is much less affected, remaining almost constant despite the known changes in the ocular media.
|
|
|
Lluis Pere de las Heras, Joan Mas, Gemma Sanchez, & Ernest Valveny. (2011). Descriptor-based SVM Wall Detector. In 9th International Workshop on Graphics Recognition.
Abstract: Architectural floorplans exhibit a large variability in notation. Therefore, segmenting and identifying the elements of any kind of plan becomes a challenging task for approaches based on grouping structural primitives obtained by vectorization. Recently, a patch-based segmentation method working at pixel level and relying on the construction of a visual vocabulary has been proposed showing its adaptability to different notations by automatically learning the visual appearance of the elements in each different notation. In this paper we describe an evolution of this new approach in two directions: firstly we evaluate different features to obtain the description of every patch. Secondly, we train an SVM classifier to obtain the category of every patch instead of constructing a visual vocabulary. These modifications of the method have been tested for wall detection on two datasets of architectural floorplans with different notations and compared with the results obtained with the original approach.
|
|
|
Partha Pratim Roy, Umapada Pal, & Josep Llados. (2011). Document Seal Detection Using GHT and Character Proximity Graphs. PR - Pattern Recognition, 44(6), 1282–1295.
Abstract: This paper deals with the automatic detection of seals (stamps) in documents with cluttered backgrounds. Seal detection is a difficult challenge because of the multi-oriented nature and arbitrary shape of seals, the overlapping of their parts with signatures, noise, etc. Here, a seal object is characterized by scale and rotation invariant spatial feature descriptors computed from the recognition result of individual connected components (characters). Scale and rotation invariant features are used in a Support Vector Machine (SVM) classifier to recognize multi-scale and multi-oriented text characters. The concept of the generalized Hough transform (GHT) is used to detect the seal, and a voting scheme is designed for finding the possible location of the seal in a document based on the spatial feature descriptor of neighboring component pairs. The peak of votes in the GHT accumulator validates the hypothesis to locate the seal in a document. Experiments are performed on an archive of historical documents of handwritten/printed English text. Experimental results show that the method is robust in locating seal instances of arbitrary shape and orientation in documents, and also efficient in indexing a collection of documents for retrieval purposes.
Keywords: Seal recognition; Graphical symbol spotting; Generalized Hough transform; Multi-oriented character recognition
|
|
|
Marçal Rusiñol, V. Poulain d'Andecy, Dimosthenis Karatzas, & Josep Llados. (2011). Classification of Administrative Document Images by Logo Identification. In Proceedings of the 9th IAPR Workshop on Graphics Recognition.
Abstract: This paper is focused on the categorization of administrative document images (such as invoices) based on the recognition of the supplier's graphical logo. Two different methods are proposed: the first one uses a bag-of-visual-words model, whereas the second one tries to locate logo images described by the blurred shape model descriptor within documents by a sliding-window technique. Preliminary results are reported with a dataset of real administrative documents.
|
|