|
Katerine Diaz, Jesus Martinez del Rincon, Aura Hernandez-Sabate, Marçal Rusiñol, & Francesc J. Ferri. (2018). Fast Kernel Generalized Discriminative Common Vectors for Feature Extraction. JMIV - Journal of Mathematical Imaging and Vision, 60(4), 512–524.
Abstract: This paper presents a supervised subspace learning method called Kernel Generalized Discriminative Common Vectors (KGDCV), as a novel extension of the known Discriminative Common Vectors method with Kernels. Our method combines the advantages of kernel methods to model complex data and solve nonlinear
problems with moderate computational complexity, with the better generalization properties of generalized approaches for large dimensional data. These attractive combination makes KGDCV specially suited for feature extraction and classification in computer vision, image processing and pattern recognition applications. Two different approaches to this generalization are proposed, a first one based on the kernel trick (KT) and a second one based on the nonlinear projection trick (NPT) for even higher efficiency. Both methodologies
have been validated on four different image datasets containing faces, objects and handwritten digits, and compared against well known non-linear state-of-art methods. Results show better discriminant properties than other generalized approaches both linear or kernel. In addition, the KGDCV-NPT approach presents a considerable computational gain, without compromising the accuracy of the model.
|
|
|
M. Oliver, G. Haro, Mariella Dimiccoli, B. Mazin, & C. Ballester. (2016). A Computational Model for Amodal Completion. JMIV - Journal of Mathematical Imaging and Vision, 56(3), 511–534.
Abstract: This paper presents a computational model to recover the most likely interpretation
of the 3D scene structure from a planar image, where some objects may occlude others. The estimated scene interpretation is obtained by integrating some global and local cues and provides both the complete disoccluded objects that form the scene and their ordering according to depth.
Our method first computes several distal scenes which are compatible with the proximal planar image. To compute these different hypothesized scenes, we propose a perceptually inspired object disocclusion method, which works by minimizing the Euler's elastica as well as by incorporating the relatability of partially occluded contours and the convexity of the disoccluded objects. Then, to estimate the preferred scene we rely on a Bayesian model and define probabilities taking into account the global complexity of the objects in the hypothesized scenes as well as the effort of bringing these objects in their relative position in the planar image, which is also measured by an Euler's elastica-based quantity. The model is illustrated with numerical experiments on, both, synthetic and real images showing the ability of our model to reconstruct the occluded objects and the preferred perceptual order among them. We also present results on images of the Berkeley dataset with provided figure-ground ground-truth labeling.
Keywords: Perception; visual completion; disocclusion; Bayesian model;relatability; Euler elastica
|
|
|
Naveen Onkarappa, & Angel Sappa. (2013). A Novel Space Variant Image Representation. JMIV - Journal of Mathematical Imaging and Vision, 47(1-2), 48–59.
Abstract: Traditionally, in machine vision images are represented using cartesian coordinates with uniform sampling along the axes. On the contrary, biological vision systems represent images using polar coordinates with non-uniform sampling. For various advantages provided by space-variant representations many researchers are interested in space-variant computer vision. In this direction the current work proposes a novel and simple space variant representation of images. The proposed representation is compared with the classical log-polar mapping. The log-polar representation is motivated by biological vision having the characteristic of higher resolution at the fovea and reduced resolution at the periphery. On the contrary to the log-polar, the proposed new representation has higher resolution at the periphery and lower resolution at the fovea. Our proposal is proved to be a better representation in navigational scenarios such as driver assistance systems and robotics. The experimental results involve analysis of optical flow fields computed on both proposed and log-polar representations. Additionally, an egomotion estimation application is also shown as an illustrative example. The experimental analysis comprises results from synthetic as well as real sequences.
Keywords: Space-variant representation; Log-polar mapping; Onboard vision applications
|
|
|
Carme Julia, Angel Sappa, Felipe Lumbreras, Joan Serrat, & Antonio Lopez. (2011). Rank Estimation in Missing Data Matrix Problems. JMIV - Journal of Mathematical Imaging and Vision, 39(2), 140–160.
Abstract: A novel technique for missing data matrix rank estimation is presented. It is focused on matrices of trajectories, where every element of the matrix corresponds to an image coordinate from a feature point of a rigid moving object at a given frame; missing data are represented as empty entries. The objective of the proposed approach is to estimate the rank of a missing data matrix in order to fill in empty entries with some matrix completion method, without using or assuming neither the number of objects contained in the scene nor the kind of their motion. The key point of the proposed technique consists in studying the frequency behaviour of the individual trajectories, which are seen as 1D signals. The main assumption is that due to the rigidity of the moving objects, the frequency content of the trajectories will be similar after filling in their missing entries. The proposed rank estimation approach can be used in different computer vision problems, where the rank of a missing data matrix needs to be estimated. Experimental results with synthetic and real data are provided in order to empirically show the good performance of the proposed approach.
|
|
|
Arnau Ramisa, David Aldavert, Shrihari Vasudevan, Ricardo Toledo, & Ramon Lopez de Mantaras. (2012). Evaluation of Three Vision Based Object Perception Methods for a Mobile Robot. JIRC - Journal of Intelligent and Robotic Systems, 68(2), 185–208.
Abstract: This paper addresses visual object perception applied to mobile robotics. Being able to perceive household objects in unstructured environments is a key capability in order to make robots suitable to perform complex tasks in home environments. However, finding a solution for this task is daunting: it requires the ability to handle the variability in image formation in a moving camera with tight time constraints. The paper brings to attention some of the issues with applying three state of the art object recognition and detection methods in a mobile robotics scenario, and proposes methods to deal with windowing/segmentation. Thus, this work aims at evaluating the state-of-the-art in object perception in an attempt to develop a lightweight solution for mobile robotics use/research in typical indoor settings.
|
|
|
Arnau Ramisa, Alex Goldhoorn, David Aldavert, Ricardo Toledo, & Ramon Lopez de Mantaras. (2011). Combining Invariant Features and the ALV Homing Method for Autonomous Robot Navigation Based on Panoramas. JIRC - Journal of Intelligent and Robotic Systems, 64(3-4), 625–649.
Abstract: Biologically inspired homing methods, such as the Average Landmark Vector, are an interesting solution for local navigation due to its simplicity. However, usually they require a modification of the environment by placing artificial landmarks in order to work reliably. In this paper we combine the Average Landmark Vector with invariant feature points automatically detected in panoramic images to overcome this limitation. The proposed approach has been evaluated first in simulation and, as promising results are found, also in two data sets of panoramas from real world environments.
|
|
|
Mariella Dimiccoli, Benoît Girard, Alain Berthoz, & Daniel Bennequin. (2013). Striola Magica: a functional explanation of otolith organs. JCN - Journal of Computational Neuroscience, 35(2), 125–154.
Abstract: Otolith end organs of vertebrates sense linear accelerations of the head and gravitation. The hair cells on their epithelia are responsible for transduction. In mammals, the striola, parallel to the line where hair cells reverse their polarization, is a narrow region centered on a curve with curvature and torsion. It has been shown that the striolar region is functionally different from the rest, being involved in a phasic vestibular pathway. We propose a mathematical and computational model that explains the necessity of this amazing geometry for the striola to be able to carry out its function. Our hypothesis, related to the biophysics of the hair cells and to the physiology of their afferent neurons, is that striolar afferents collect information from several type I hair cells to detect the jerk in a large domain of acceleration directions. This predicts a mean number of two calyces for afferent neurons, as measured in rodents. The domain of acceleration directions sensed by our striolar model is compatible with the experimental results obtained on monkeys considering all afferents. Therefore, the main result of our study is that phasic and tonic vestibular afferents cover the same geometrical fields, but at different dynamical and frequency domains.
Keywords: Otolith organs ;Striola; Vestibular pathway
|
|
|
Marçal Rusiñol, Lluis Pere de las Heras, & Oriol Ramos Terrades. (2014). Flowchart Recognition for Non-Textual Information Retrieval in Patent Search. IR - Information Retrieval, 17(5-6), 545–562.
Abstract: Relatively little research has been done on the topic of patent image retrieval and in general in most of the approaches the retrieval is performed in terms of a similarity measure between the query image and the images in the corpus. However, systems aimed at overcoming the semantic gap between the visual description of patent images and their conveyed concepts would be very helpful for patent professionals. In this paper we present a flowchart recognition method aimed at achieving a structured representation of flowchart images that can be further queried semantically. The proposed method was submitted to the CLEF-IP 2012 flowchart recognition task. We report the obtained results on this dataset.
Keywords: Flowchart recognition; Patent documents; Text/graphics separation; Raster-to-vector conversion; Symbol recognition
|
|
|
Debora Gil, David Roche, Agnes Borras, & Jesus Giraldo. (2015). Terminating Evolutionary Algorithms at their Steady State. COA - Computational Optimization and Applications, 61(2), 489–515.
Abstract: Assessing the reliability of termination conditions for evolutionary algorithms (EAs) is of prime importance. An erroneous or weak stop criterion can negatively affect both the computational effort and the final result. We introduce a statistical framework for assessing whether a termination condition is able to stop an EA at its steady state, so that its results can not be improved anymore. We use a regression model in order to determine the requirements ensuring that a measure derived from EA evolving population is related to the distance to the optimum in decision variable space. Our framework is analyzed across 24 benchmark test functions and two standard termination criteria based on function fitness value in objective function space and EA population decision variable space distribution for the differential evolution (DE) paradigm. Results validate our framework as a powerful tool for determining the capability of a measure for terminating EA and the results also identify the decision variable space distribution as the best-suited for accurately terminating DE in real-world applications.
Keywords: Evolutionary algorithms; Termination condition; Steady state; Differential evolution
|
|
|
G.Blasco, Simone Balocco, J.Puig, J.Sanchez-Gonzalez, W.Ricart, J.Daunis-I-Estadella, et al. (2015). Carotid pulse wave velocity by magnetic resonance imaging is increased in middle-aged subjects with the metabolic syndrome. ICJI - International Journal of Cardiovascular Imaging, 31(3), 603–612.
Abstract: Arterial pulse wave velocity (PWV), an independent predictor of cardiovascular disease, physiologically increases with age; however, growing evidence suggests metabolic syndrome (MetS) accelerates this increase. Magnetic resonance imaging (MRI) enables reliable noninvasive assessment of arterial stiffness by measuring arterial PWV in specific vascular segments. We investigated the association between the presence of MetS and its components with carotid PWV (cPWV) in asymptomatic subjects without diabetes. We assessed cPWV by MRI in 61 individuals (mean age, 55.3 ± 14.1 years; median age, 55 years): 30 with MetS and 31 controls with similar age, sex, body mass index, and LDL-cholesterol levels. The study population was dichotomized by the median age. To remove the physiological association between PWV and age, unpaired t tests and multiple regression analyses were performed using the residuals of the regression between PWV and age. cPWV was higher in middle-aged subjects with MetS than in those without (p = 0.001), but no differences were found in elder subjects (p = 0.313). cPWV was associated with diastolic blood pressure (r = 0.276, p = 0.033) and waist circumference (r = 0.268, p = 0.038). The presence of MetS was associated with increased cPWV regardless of age, sex, blood pressure, and waist (p = 0.007). The MetS components contributing independently to an increased cPWV were hypertension (p = 0.018) and hypertriglyceridemia (p = 0.002). The presence of MetS is associated with an increased cPWV in middle-aged subjects. In particular, hypertension and hypertriglyceridemia may contribute to early progression of carotid stiffness.
Keywords: Metabolic syndrome; Arterial stiffness; Pulse wave velocity; Carotid artery; Magnetic resonance
|
|
|
Arnau Ramisa, Adriana Tapus, David Aldavert, Ricardo Toledo, & Ramon Lopez de Mantaras. (2009). Robust Vision-Based Localization using Combinations of Local Feature Regions Detectors. AR - Autonomous Robots, 27(4), 373–385.
Abstract: This paper presents a vision-based approach for mobile robot localization. The model of the environment is topological. The new approach characterizes a place using a signature. This signature consists of a constellation of descriptors computed over different types of local affine covariant regions extracted from an omnidirectional image acquired rotating a standard camera with a pan-tilt unit. This type of representation permits a reliable and distinctive environment modelling. Our objectives were to validate the proposed method in indoor environments and, also, to find out if the combination of complementary local feature region detectors improves the localization versus using a single region detector. Our experimental results show that if false matches are effectively rejected, the combination of different covariant affine region detectors increases notably the performance of the approach by combining the different strengths of the individual detectors. In order to reduce the localization time, two strategies are evaluated: re-ranking the map nodes using a global similarity measure and using standard perspective view field of 45°.
In order to systematically test topological localization methods, another contribution proposed in this work is a novel method to see the degradation in localization performance as the robot moves away from the point where the original signature was acquired. This allows to know the robustness of the proposed signature. In order for this to be effective, it must be done in several, variated, environments that test all the possible situations in which the robot may have to perform localization.
|
|
|
Jaume Amores. (2015). MILDE: multiple instance learning by discriminative embedding. KAIS - Knowledge and Information Systems, 42(2), 381–407.
Abstract: While the objective of the standard supervised learning problem is to classify feature vectors, in the multiple instance learning problem, the objective is to classify bags, where each bag contains multiple feature vectors. This represents a generalization of the standard problem, and this generalization becomes necessary in many real applications such as drug activity prediction, content-based image retrieval, and others. While the existing paradigms are based on learning the discriminant information either at the instance level or at the bag level, we propose to incorporate both levels of information. This is done by defining a discriminative embedding of the original space based on the responses of cluster-adapted instance classifiers. Results clearly show the advantage of the proposed method over the state of the art, where we tested the performance through a variety of well-known databases that come from real problems, and we also included an analysis of the performance using synthetically generated data.
Keywords: Multi-instance learning; Codebook; Bag of words
|
|
|
Mohammad Ali Bagheri, Qigang Gao, & Sergio Escalera. (2015). Combining Local and Global Learners in the Pairwise Multiclass Classification. PAA - Pattern Analysis and Applications, 18(4), 845–860.
Abstract: Pairwise classification is a well-known class binarization technique that converts a multiclass problem into a number of two-class problems, one problem for each pair of classes. However, in the pairwise technique, nuisance votes of many irrelevant classifiers may result in a wrong class prediction. To overcome this problem, a simple, but efficient method is proposed and evaluated in this paper. The proposed method is based on excluding some classes and focusing on the most probable classes in the neighborhood space, named Local Crossing Off (LCO). This procedure is performed by employing a modified version of standard K-nearest neighbor and large margin nearest neighbor algorithms. The LCO method takes advantage of nearest neighbor classification algorithm because of its local learning behavior as well as the global behavior of powerful binary classifiers to discriminate between two classes. Combining these two properties in the proposed LCO technique will avoid the weaknesses of each method and will increase the efficiency of the whole classification system. On several benchmark datasets of varying size and difficulty, we found that the LCO approach leads to significant improvements using different base learners. The experimental results show that the proposed technique not only achieves better classification accuracy in comparison to other standard approaches, but also is computationally more efficient for tackling classification problems which have a relatively large number of target classes.
Keywords: Multiclass classification; Pairwise approach; One-versus-one
|
|
|
Marçal Rusiñol, Josep Llados, & Gemma Sanchez. (2010). Symbol Spotting in Vectorized Technical Drawings Through a Lookup Table of Region Strings. PAA - Pattern Analysis and Applications, 13(3), 321–331.
Abstract: In this paper, we address the problem of symbol spotting in technical document images applied to scanned and vectorized line drawings. Like any information spotting architecture, our approach has two components. First, symbols are decomposed in primitives which are compactly represented and second a primitive indexing structure aims to efficiently retrieve similar primitives. Primitives are encoded in terms of attributed strings representing closed regions. Similar strings are clustered in a lookup table so that the set median strings act as indexing keys. A voting scheme formulates hypothesis in certain locations of the line drawing image where there is a high presence of regions similar to the queried ones, and therefore, a high probability to find the queried graphical symbol. The proposed approach is illustrated in a framework consisting in spotting furniture symbols in architectural drawings. It has been proved to work even in the presence of noise and distortion introduced by the scanning and raster-to-vector processes.
|
|
|
David Aldavert, Marçal Rusiñol, Ricardo Toledo, & Josep Llados. (2015). A Study of Bag-of-Visual-Words Representations for Handwritten Keyword Spotting. IJDAR - International Journal on Document Analysis and Recognition, 18(3), 223–234.
Abstract: The Bag-of-Visual-Words (BoVW) framework has gained popularity among the document image analysis community, specifically as a representation of handwritten words for recognition or spotting purposes. Although in the computer vision field the BoVW method has been greatly improved, most of the approaches in the document image analysis domain still rely on the basic implementation of the BoVW method disregarding such latest refinements. In this paper, we present a review of those improvements and its application to the keyword spotting task. We thoroughly evaluate their impact against a baseline system in the well-known George Washington dataset and compare the obtained results against nine state-of-the-art keyword spotting methods. In addition, we also compare both the baseline and improved systems with the methods presented at the Handwritten Keyword Spotting Competition 2014.
Keywords: Bag-of-Visual-Words; Keyword spotting; Handwritten documents; Performance evaluation
|
|