|
Miguel Oliveira, Angel Sappa, & V. Santos. (2012). Color Correction using 3D Gaussian Mixture Models. In 9th International Conference on Image Analysis and Recognition (Vol. 7324, pp. 97–106). LNCS. Springer Berlin Heidelberg.
Abstract: The current paper proposes a novel color correction approach based on a probabilistic segmentation framework by using 3D Gaussian Mixture Models. Regions are used to compute local color correction functions, which are then combined to obtain the final corrected image. The proposed approach is evaluated using both a recently published metric and two large data sets composed of seventy images. The evaluation is performed by comparing our algorithm with eight well known color correction algorithms. Results show that the proposed approach is the highest scoring color correction method. Also, the proposed single step 3D color space probabilistic segmentation reduces processing time over similar approaches.
|
|
|
Theo Gevers, Arjan Gijsenij, Joost Van de Weijer, & J.M. Geusebroek. (2012). Color in Computer Vision: Fundamentals and Applications. The Wiley-IS&T Series in Imaging Science and Technology.
|
|
|
Joost Van de Weijer, Robert Benavente, Maria Vanrell, Cordelia Schmid, Ramon Baldrich, Jacob Verbeek, et al. (2012). Color Naming. In Theo Gevers, Arjan Gijsenij, Joost Van de Weijer, & Jan-Mark Geusebroek (Eds.), Color in Computer Vision: Fundamentals and Applications (pp. 287–317). John Wiley & Sons, Ltd.
|
|
|
Anjan Dutta, Jaume Gibert, Josep Llados, Horst Bunke, & Umapada Pal. (2012). Combination of Product Graph and Random Walk Kernel for Symbol Spotting in Graphical Documents. In 21st International Conference on Pattern Recognition (pp. 1663–1666).
Abstract: This paper explores the utilization of product graph for spotting symbols on graphical documents. Product graph is intended to find the candidate subgraphs or components in the input graph containing the paths similar to the query graph. The acute angle between two edges and their length ratio are considered as the node labels. In a second step, each of the candidate subgraphs in the input graph is assigned with a distance measure computed by a random walk kernel. Actually it is the minimum of the distances of the component to all the components of the model graph. This distance measure is then used to eliminate dissimilar components. The remaining neighboring components are grouped and the grouped zone is considered as a retrieval zone of a symbol similar to the queried one. The entire method works online, i.e., it doesn't need any preprocessing step. The present paper reports the initial results of the method, which are very encouraging.
|
|
|
R. Valenti, & Theo Gevers. (2012). Combining Head Pose and Eye Location Information for Gaze Estimation. TIP - IEEE Transactions on Image Processing, 21(2), 802–815.
Abstract: Impact factor 2010: 2.92
Impact factor 2011/12?: 3.32
Head pose and eye location for gaze estimation have been separately studied in numerous works in the literature. Previous research shows that satisfactory accuracy in head pose and eye location estimation can be achieved in constrained settings. However, in the presence of nonfrontal faces, eye locators are not adequate to accurately locate the center of the eyes. On the other hand, head pose estimation techniques are able to deal with these conditions; hence, they may be suited to enhance the accuracy of eye localization. Therefore, in this paper, a hybrid scheme is proposed to combine head pose and eye location information to obtain enhanced gaze estimation. To this end, the transformation matrix obtained from the head pose is used to normalize the eye regions, and in turn, the transformation matrix generated by the found eye location is used to correct the pose estimation procedure. The scheme is designed to enhance the accuracy of eye location estimations, particularly in low-resolution videos, to extend the operative range of the eye locators, and to improve the accuracy of the head pose tracker. These enhanced estimations are then combined to obtain a novel visual gaze estimation system, which uses both eye location and head information to refine the gaze estimates. From the experimental results, it can be derived that the proposed unified scheme improves the accuracy of eye estimations by 16% to 23%. Furthermore, it considerably extends its operating range by more than 15° by overcoming the problems introduced by extreme head poses. Moreover, the accuracy of the head pose tracker is improved by 12% to 24%. Finally, the experimentation on the proposed combined gaze estimation system shows that it is accurate (with a mean error between 2° and 5°) and that it can be used in cases where classic approaches would fail without imposing restraints on the position of the head.
|
|
|
Noha Elfiky, Jordi Gonzalez, & Xavier Roca. (2012). Compact and Adaptive Spatial Pyramids for Scene Recognition. IMAVIS - Image and Vision Computing, 30(8), 492–500.
Abstract: Most successful approaches on scenerecognition tend to efficiently combine global image features with spatial local appearance and shape cues. On the other hand, less attention has been devoted for studying spatial texture features within scenes. Our method is based on the insight that scenes can be seen as a composition of micro-texture patterns. This paper analyzes the role of texture along with its spatial layout for scenerecognition. However, one main drawback of the resulting spatial representation is its huge dimensionality. Hence, we propose a technique that addresses this problem by presenting a compactSpatialPyramid (SP) representation. The basis of our compact representation, namely, CompactAdaptiveSpatialPyramid (CASP) consists of a two-stages compression strategy. This strategy is based on the Agglomerative Information Bottleneck (AIB) theory for (i) compressing the least informative SP features, and, (ii) automatically learning the most appropriate shape for each category. Our method exceeds the state-of-the-art results on several challenging scenerecognition data sets.
|
|
|
Noha Elfiky. (2012). Compact, Adaptive and Discriminative Spatial Pyramids for Improved Object and Scene Classification (Jordi Gonzalez, & Xavier Roca, Eds.). Ph.D. thesis, Ediciones Graficas Rey, .
Abstract: The release of challenging datasets with a vast number of images, requires the development of efficient image representations and algorithms which are able to manipulate these large-scale datasets efficiently. Nowadays the Bag-of-Words (BoW) is the most successful approach in the context of object and scene classification tasks. However, its main drawback is the absence of the important spatial information. Spatial pyramids (SP) have been successfully applied to incorporate spatial information into BoW-based image representation. Observing the remarkable performance of spatial pyramids, their growing number of applications to a broad range of vision problems, and finally its geometry inclusion, a question can be asked what are the limits of spatial pyramids. Within the SP framework, the optimal way for obtaining an image spatial representation, which is able to cope with it’s most foremost shortcomings, concretely, it’s high dimensionality and the rigidity of the resulting image representation, still remains an active research domain. In summary, the main concern of this thesis is to search for the limits of spatial pyramids and try to figure out solutions for them.
|
|
|
Sergio Vera, Debora Gil, Agnes Borras, F. Javier Sanchez, Frederic Perez, Marius G. Linguraru, et al. (2012). Computation and Evaluation of Medial Surfaces for Shape Representation of Abdominal Organs. In H. Yoshida et al (Ed.), Workshop on Computational and Clinical Applications in Abdominal Imaging (Vol. 7029, 223–230). LNCS. Berlin: Springer Link.
Abstract: Medial representations are powerful tools for describing and parameterizing the volumetric shape of anatomical structures. Existing methods show excellent results when applied to 2D
objects, but their quality drops across dimensions. This paper contributes to the computation of medial manifolds in two aspects. First, we provide a standard scheme for the computation of medial
manifolds that avoid degenerated medial axis segments; second, we introduce an energy based method which performs independently of the dimension. We evaluate quantitatively the performance of our
method with respect to existing approaches, by applying them to synthetic shapes of known medial geometry. Finally, we show results on shape representation of multiple abdominal organs,
exploring the use of medial manifolds for the representation of multi-organ relations.
Keywords: medial manifolds, abdomen.
|
|
|
Angel Sappa, & George A. Triantafyllid. (2012). Computer Graphics and Imaging.
|
|
|
Jordi Roca. (2012). Constancy and inconstancy in categorical colour perception (Maria Vanrell, & C. Alejandro Parraga, Eds.). Ph.D. thesis, , .
Abstract: To recognise objects is perhaps the most important task an autonomous system, either biological or artificial needs to perform. In the context of human vision, this is partly achieved by recognizing the colour of surfaces despite changes in the wavelength distribution of the illumination, a property called colour constancy. Correct surface colour recognition may be adequately accomplished by colour category matching without the need to match colours precisely, therefore categorical colour constancy is likely to play an important role for object identification to be successful. The main aim of this work is to study the relationship between colour constancy and categorical colour perception. Previous studies of colour constancy have shown the influence of factors such the spatio-chromatic properties of the background, individual observer's performance, semantics, etc. However there is very little systematic study of these influences. To this end, we developed a new approach to colour constancy which includes both individual observers' categorical perception, the categorical structure of the background, and their interrelations resulting in a more comprehensive characterization of the phenomenon. In our study, we first developed a new method to analyse the categorical structure of 3D colour space, which allowed us to characterize individual categorical colour perception as well as quantify inter-individual variations in terms of shape and centroid location of 3D categorical regions. Second, we developed a new colour constancy paradigm, termed chromatic setting, which allows measuring the precise location of nine categorically-relevant points in colour space under immersive illumination. Additionally, we derived from these measurements a new colour constancy index which takes into account the magnitude and orientation of the chromatic shift, memory effects and the interrelations among colours and a model of colour naming tuned to each observer/adaptation state. Our results lead to the following conclusions: (1) There exists large inter-individual variations in the categorical structure of colour space, and thus colour naming ability varies significantly but this is not well predicted by low-level chromatic discrimination ability; (2) Analysis of the average colour naming space suggested the need for an additional three basic colour terms (turquoise, lilac and lime) for optimal colour communication; (3) Chromatic setting improved the precision of more complex linear colour constancy models and suggested that mechanisms other than cone gain might be best suited to explain colour constancy; (4) The categorical structure of colour space is broadly stable under illuminant changes for categorically balanced backgrounds; (5) Categorical inconstancy exists for categorically unbalanced backgrounds thus indicating that categorical information perceived in the initial stages of adaptation may constrain further categorical perception.
|
|
|
Pedro Martins, Paulo Carvalho, & Carlo Gatta. (2012). Context Aware Keypoint Extraction for Robust Image Representation. In 23rd British Machine Vision Conference (100.pp. 1–100.12).
|
|
|
Alicia Fornes, Anjan Dutta, Albert Gordo, & Josep Llados. (2012). CVC-MUSCIMA: A Ground-Truth of Handwritten Music Score Images for Writer Identification and Staff Removal. IJDAR - International Journal on Document Analysis and Recognition, 15(3), 243–251.
Abstract: 0,405JCR
The analysis of music scores has been an active research field in the last decades. However, there are no publicly available databases of handwritten music scores for the research community. In this paper we present the CVC-MUSCIMA database and ground-truth of handwritten music score images. The dataset consists of 1,000 music sheets written by 50 different musicians. It has been especially designed for writer identification and staff removal tasks. In addition to the description of the dataset, ground-truth, partitioning and evaluation metrics, we also provide some base-line results for easing the comparison between different approaches.
Keywords: Music scores; Handwritten documents; Writer identification; Staff removal; Performance evaluation; Graphics recognition; Ground truths
|
|
|
Marçal Rusiñol, Lluis Pere de las Heras, Joan Mas, Oriol Ramos Terrades, Dimosthenis Karatzas, Anjan Dutta, et al. (2012). CVC-UAB's participation in the Flowchart Recognition Task of CLEF-IP 2012. In Conference and Labs of the Evaluation Forum.
|
|
|
Ricard Borras, Agata Lapedriza, & Laura Igual. (2012). Depth Information in Human Gait Analysis: An Experimental Study on Gender Recognition. In 9th International Conference on Image Analysis and Recognition (Vol. 7325, pp. 98–105). Springer Berlin Heidelberg.
Abstract: This work presents DGait, a new gait database acquired with a depth camera. This database contains videos from 53 subjects walking in different directions. The intent of this database is to provide a public set to explore whether the depth can be used as an additional information source for gait classification purposes. Each video is labelled according to subject, gender and age. Furthermore, for each subject and view point, we provide initial and final frames of an entire walk cycle. On the other hand, we perform gait-based gender classification experiments with DGait database, in order to illustrate the usefulness of depth information for this purpose. In our experiments, we extract 2D and 3D gait features based on shape descriptors, and compare the performance of these features for gender identification, using a Kernel SVM. The obtained results show that depth can be an information source of great relevance for gait classification problems.
|
|
|
Noha Elfiky, Fahad Shahbaz Khan, Joost Van de Weijer, & Jordi Gonzalez. (2012). Discriminative Compact Pyramids for Object and Scene Recognition. PR - Pattern Recognition, 45(4), 1627–1636.
Abstract: Spatial pyramids have been successfully applied to incorporating spatial information into bag-of-words based image representation. However, a major drawback is that it leads to high dimensional image representations. In this paper, we present a novel framework for obtaining compact pyramid representation. First, we investigate the usage of the divisive information theoretic feature clustering (DITC) algorithm in creating a compact pyramid representation. In many cases this method allows us to reduce the size of a high dimensional pyramid representation up to an order of magnitude with little or no loss in accuracy. Furthermore, comparison to clustering based on agglomerative information bottleneck (AIB) shows that our method obtains superior results at significantly lower computational costs. Moreover, we investigate the optimal combination of multiple features in the context of our compact pyramid representation. Finally, experiments show that the method can obtain state-of-the-art results on several challenging data sets.
|
|