|
Marc Bolaños, Maite Garolera, & Petia Radeva. (2013). Active labeling application applied to food-related object recognition. In 5th International Workshop on Multimedia for Cooking & Eating Activities (pp. 45–50).
Abstract: Every day, lifelogging devices, available for recording different aspects of our daily life, increase in number, quality and functions, just like the multiple applications that we give to them. Applying wearable devices to analyse the nutritional habits of people is a challenging application based on acquiring and analyzing life records in long periods of time. However, to extract the information of interest related to the eating patterns of people, we need automatic methods to process large amount of life-logging data (e.g. recognition of food-related objects). Creating a rich set of manually labeled samples to train the algorithms is slow, tedious and subjective. To address this problem, we propose a novel method in the framework of Active Labeling for construct- ing a training set of thousands of images. Inspired by the hierarchical sampling method for active learning [6], we propose an Active forest that organizes hierarchically the data for easy and fast labeling. Moreover, introducing a classifier into the hierarchical structures, as well as transforming the feature space for better data clustering, additionally im- prove the algorithm. Our method is successfully tested to label 89.700 food-related objects and achieves significant reduction in expert time labelling.
Active labeling application applied to food-related object recognition ResearchGate. Available from: http://www.researchgate.net/publication/262252017Activelabelingapplicationappliedtofood-relatedobjectrecognition [accessed Jul 14, 2015].
|
|
|
Lluis Pere de las Heras, David Fernandez, Alicia Fornes, Ernest Valveny, Gemma Sanchez, & Josep Llados. (2013). Runlength Histogram Image Signature for Perceptual Retrieval of Architectural Floor Plans. In 10th IAPR International Workshop on Graphics Recognition.
|
|
|
Lluis Pere de las Heras, Ernest Valveny, & Gemma Sanchez. (2013). Unsupervised and Notation-Independent Wall Segmentation in Floor Plans Using a Combination of Statistical and Structural Strategies. In 10th IAPR International Workshop on Graphics Recognition.
|
|
|
A.S. Coquel, Jean-Pascal Jacob, M. Primet, A. Demarez, Mariella Dimiccoli, T. Julou, et al. (2013). Localization of protein aggregation in Escherichia coli is governed by diffusion and nucleoid macromolecular crowding effect. PCB - Plos Computational Biology, 9(4).
Abstract: Aggregates of misfolded proteins are a hallmark of many age-related diseases. Recently, they have been linked to aging of Escherichia coli (E. coli) where protein aggregates accumulate at the old pole region of the aging bacterium. Because of the potential of E. coli as a model organism, elucidating aging and protein aggregation in this bacterium may pave the way to significant advances in our global understanding of aging. A first obstacle along this path is to decipher the mechanisms by which protein aggregates are targeted to specific intercellular locations. Here, using an integrated approach based on individual-based modeling, time-lapse fluorescence microscopy and automated image analysis, we show that the movement of aging-related protein aggregates in E. coli is purely diffusive (Brownian). Using single-particle tracking of protein aggregates in live E. coli cells, we estimated the average size and diffusion constant of the aggregates. Our results provide evidence that the aggregates passively diffuse within the cell, with diffusion constants that depend on their size in agreement with the Stokes-Einstein law. However, the aggregate displacements along the cell long axis are confined to a region that roughly corresponds to the nucleoid-free space in the cell pole, thus confirming the importance of increased macromolecular crowding in the nucleoids. We thus used 3D individual-based modeling to show that these three ingredients (diffusion, aggregation and diffusion hindrance in the nucleoids) are sufficient and necessary to reproduce the available experimental data on aggregate localization in the cells. Taken together, our results strongly support the hypothesis that the localization of aging-related protein aggregates in the poles of E. coli results from the coupling of passive diffusion-aggregation with spatially non-homogeneous macromolecular crowding. They further support the importance of “soft” intracellular structuring (based on macromolecular crowding) in diffusion-based protein localization in E. coli.
|
|
|
Jose Manuel Alvarez, Theo Gevers, & Antonio Lopez. (2013). Evaluating Color Representation for Online Road Detection. In ICCV Workshop on Computer Vision in Vehicle Technology: From Earth to Mars (pp. 594–595).
Abstract: Detecting traversable road areas ahead a moving vehicle is a key process for modern autonomous driving systems. Most existing algorithms use color to classify pixels as road or background. These algorithms reduce the effect of lighting variations and weather conditions by exploiting the discriminant/invariant properties of different color representations. However, up to date, no comparison between these representations have been conducted. Therefore, in this paper, we perform an evaluation of existing color representations for road detection. More specifically, we focus on color planes derived from RGB data and their most com-
mon combinations. The evaluation is done on a set of 7000 road images acquired
using an on-board camera in different real-driving situations.
|
|
|
Jaume Amores. (2013). Multiple Instance Classification: review, taxonomy and comparative study. AI - Artificial Intelligence, 201, 81–105.
Abstract: Multiple Instance Learning (MIL) has become an important topic in the pattern recognition community, and many solutions to this problemhave been proposed until now. Despite this fact, there is a lack of comparative studies that shed light into the characteristics and behavior of the different methods. In this work we provide such an analysis focused on the classification task (i.e.,leaving out other learning tasks such as regression). In order to perform our study, we implemented
fourteen methods grouped into three different families. We analyze the performance of the approaches across a variety of well-known databases, and we also study their behavior in synthetic scenarios in order to highlight their characteristics. As a result of this analysis, we conclude that methods that extract global bag-level information show a clearly superior performance in general. In this sense, the analysis permits us to understand why some types of methods are more successful than others, and it permits us to establish guidelines in the design of new MIL
methods.
Keywords: Multi-instance learning; Codebook; Bag-of-Words
|
|
|
David Geronimo, Joan Serrat, Antonio Lopez, & Ramon Baldrich. (2013). Traffic sign recognition for computer vision project-based learning. T-EDUC - IEEE Transactions on Education, 56(3), 364–371.
Abstract: This paper presents a graduate course project on computer vision. The aim of the project is to detect and recognize traffic signs in video sequences recorded by an on-board vehicle camera. This is a demanding problem, given that traffic sign recognition is one of the most challenging problems for driving assistance systems. Equally, it is motivating for the students given that it is a real-life problem. Furthermore, it gives them the opportunity to appreciate the difficulty of real-world vision problems and to assess the extent to which this problem can be solved by modern computer vision and pattern classification techniques taught in the classroom. The learning objectives of the course are introduced, as are the constraints imposed on its design, such as the diversity of students' background and the amount of time they and their instructors dedicate to the course. The paper also describes the course contents, schedule, and how the project-based learning approach is applied. The outcomes of the course are discussed, including both the students' marks and their personal feedback.
Keywords: traffic signs
|
|
|
Albert Gordo, Florent Perronnin, & Ernest Valveny. (2013). Large-scale document image retrieval and classification with runlength histograms and binary embeddings. PR - Pattern Recognition, 46(7), 1898–1905.
Abstract: We present a new document image descriptor based on multi-scale runlength
histograms. This descriptor does not rely on layout analysis and can be
computed efficiently. We show how this descriptor can achieve state-of-theart
results on two very different public datasets in classification and retrieval
tasks. Moreover, we show how we can compress and binarize these descriptors
to make them suitable for large-scale applications. We can achieve state-ofthe-
art results in classification using binary descriptors of as few as 16 to 64
bits.
Keywords: visual document descriptor; compression; large-scale; retrieval; classification
|
|
|
Albert Gordo, Alicia Fornes, & Ernest Valveny. (2013). Writer identification in handwritten musical scores with bags of notes. PR - Pattern Recognition, 46(5), 1337–1345.
Abstract: Writer Identification is an important task for the automatic processing of documents. However, the identification of the writer in graphical documents is still challenging. In this work, we adapt the Bag of Visual Words framework to the task of writer identification in handwritten musical scores. A vanilla implementation of this method already performs comparably to the state-of-the-art. Furthermore, we analyze the effect of two improvements of the representation: a Bhattacharyya embedding, which improves the results at virtually no extra cost, and a Fisher Vector representation that very significantly improves the results at the cost of a more complex and costly representation. Experimental evaluation shows results more than 20 points above the state-of-the-art in a new, challenging dataset.
|
|
|
Veronica Romero, Alicia Fornes, Nicolas Serrano, Joan Andreu Sanchez, A.H. Toselli, Volkmar Frinken, et al. (2013). The ESPOSALLES database: An ancient marriage license corpus for off-line handwriting recognition. PR - Pattern Recognition, 46(6), 1658–1669.
Abstract: Historical records of daily activities provide intriguing insights into the life of our ancestors, useful for demography studies and genealogical research. Automatic processing of historical documents, however, has mostly been focused on single works of literature and less on social records, which tend to have a distinct layout, structure, and vocabulary. Such information is usually collected by expert demographers that devote a lot of time to manually transcribe them. This paper presents a new database, compiled from a marriage license books collection, to support research in automatic handwriting recognition for historical documents containing social records. Marriage license books are documents that were used for centuries by ecclesiastical institutions to register marriage licenses. Books from this collection are handwritten and span nearly half a millennium until the beginning of the 20th century. In addition, a study is presented about the capability of state-of-the-art handwritten text recognition systems, when applied to the presented database. Baseline results are reported for reference in future studies.
|
|
|
Mohammad Ali Bagheri, Qigang Gao, & Sergio Escalera. (2013). A Genetic-based Subspace Analysis Method for Improving Error-Correcting Output Coding. PR - Pattern Recognition, 46(10), 2830–2839.
Abstract: Two key factors affecting the performance of Error Correcting Output Codes (ECOC) in multiclass classification problems are the independence of binary classifiers and the problem-dependent coding design. In this paper, we propose an evolutionary algorithm-based approach to the design of an application-dependent codematrix in the ECOC framework. The central idea of this work is to design a three-dimensional codematrix, where the third dimension is the feature space of the problem domain. In order to do that, we consider the feature space in the design process of the codematrix with the aim of improving the independence and accuracy of binary classifiers. The proposed method takes advantage of some basic concepts of ensemble classification, such as diversity of classifiers, and also benefits from the evolutionary approach for optimizing the three-dimensional codematrix, taking into account the problem domain. We provide a set of experimental results using a set of benchmark datasets from the UCI Machine Learning Repository, as well as two real multiclass Computer Vision problems. Both sets of experiments are conducted using two different base learners: Neural Networks and Decision Trees. The results show that the proposed method increases the classification accuracy in comparison with the state-of-the-art ECOC coding techniques.
Keywords: Error Correcting Output Codes; Evolutionary computation; Multiclass classification; Feature subspace; Ensemble classification
|
|
|
Muhammad Muzzamil Luqman, Jean-Yves Ramel, Josep Llados, & Thierry Brouard. (2013). Fuzzy Multilevel Graph Embedding. PR - Pattern Recognition, 46(2), 551–565.
Abstract: Structural pattern recognition approaches offer the most expressive, convenient, powerful but computational expensive representations of underlying relational information. To benefit from mature, less expensive and efficient state-of-the-art machine learning models of statistical pattern recognition they must be mapped to a low-dimensional vector space. Our method of explicit graph embedding bridges the gap between structural and statistical pattern recognition. We extract the topological, structural and attribute information from a graph and encode numeric details by fuzzy histograms and symbolic details by crisp histograms. The histograms are concatenated to achieve a simple and straightforward embedding of graph into a low-dimensional numeric feature vector. Experimentation on standard public graph datasets shows that our method outperforms the state-of-the-art methods of graph embedding for richly attributed graphs.
Keywords: Pattern recognition; Graphics recognition; Graph clustering; Graph classification; Explicit graph embedding; Fuzzy logic
|
|
|
Anjan Dutta, Josep Llados, & Umapada Pal. (2013). A symbol spotting approach in graphical documents by hashing serialized graphs. PR - Pattern Recognition, 46(3), 752–768.
Abstract: In this paper we propose a symbol spotting technique in graphical documents. Graphs are used to represent the documents and a (sub)graph matching technique is used to detect the symbols in them. We propose a graph serialization to reduce the usual computational complexity of graph matching. Serialization of graphs is performed by computing acyclic graph paths between each pair of connected nodes. Graph paths are one-dimensional structures of graphs which are less expensive in terms of computation. At the same time they enable robust localization even in the presence of noise and distortion. Indexing in large graph databases involves a computational burden as well. We propose a graph factorization approach to tackle this problem. Factorization is intended to create a unified indexed structure over the database of graphical documents. Once graph paths are extracted, the entire database of graphical documents is indexed in hash tables by locality sensitive hashing (LSH) of shape descriptors of the paths. The hashing data structure aims to execute an approximate k-NN search in a sub-linear time. We have performed detailed experiments with various datasets of line drawings and compared our method with the state-of-the-art works. The results demonstrate the effectiveness and efficiency of our technique.
Keywords: Symbol spotting; Graphics recognition; Graph matching; Graph serialization; Graph factorization; Graph paths; Hashing
|
|
|
Naila Murray, Maria Vanrell, Xavier Otazu, & C. Alejandro Parraga. (2013). Low-level SpatioChromatic Grouping for Saliency Estimation. TPAMI - IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(11), 2810–2816.
Abstract: We propose a saliency model termed SIM (saliency by induction mechanisms), which is based on a low-level spatiochromatic model that has successfully predicted chromatic induction phenomena. In so doing, we hypothesize that the low-level visual mechanisms that enhance or suppress image detail are also responsible for making some image regions more salient. Moreover, SIM adds geometrical grouplets to enhance complex low-level features such as corners, and suppress relatively simpler features such as edges. Since our model has been fitted on psychophysical chromatic induction data, it is largely nonparametric. SIM outperforms state-of-the-art methods in predicting eye fixations on two datasets and using two metrics.
|
|
|
Alex Pardo, Albert Clapes, Sergio Escalera, & Oriol Pujol. (2013). Actions in Context: System for people with Dementia. In 2nd International Workshop on Citizen Sensor Networks (Citisen2013) at the European Conference on Complex Systems (pp. 3–14). Springer International Publishing.
Abstract: In the next forty years, the number of people living with dementia is expected to triple. In the last stages, people affected by this disease become dependent. This hinders the autonomy of the patient and has a huge social impact in time, money and effort. Given this scenario, we propose an ubiquitous system capable of recognizing daily specific actions. The system fuses and synchronizes data obtained from two complementary modalities – ambient and egocentric. The ambient approach consists in a fixed RGB-Depth camera for user and object recognition and user-object interaction, whereas the egocentric point of view is given by a personal area network (PAN) formed by a few wearable sensors and a smartphone, used for gesture recognition. The system processes multi-modal data in real-time, performing paralleled task recognition and modality synchronization, showing high performance recognizing subjects, objects, and interactions, showing its reliability to be applied in real case scenarios.
Keywords: Multi-modal data Fusion; Computer vision; Wearable sensors; Gesture recognition; Dementia
|
|