Fernando Vilariño, Panagiota Spyridonos, Jordi Vitria, Fernando Azpiroz, & Petia Radeva. (2006). Cascade analysis for intestinal contraction detection. In 20th International Congress and exhibition Computer Assisted Radiology and Surgery (pp. 9–10).
Abstract: In this work, we address the study of intestinal contractions in a novel approach based on a machine learning framework to process data from Wireless Capsule Video Endoscopy. Wireless endoscopy represents a unique way to visualize the intestine motility by creating long videos to visualize intestine dynamics. In this paper we argue that to analyze huge amount of wireless endoscopy data and define robust methods for contraction detection we should base our approach on sophisticated machine learning techniques. In particular, we propose a cascade of classifiers in order to remove different physiological phenomenon and obtain the motility pattern of small intestines. Our results show obtaining high specificity and sensitivity rates that highlight the high efficiency of the selected approach and support the feasibility of the proposed methodology in the automatic detection and analysis of intestine contractions.
Keywords: intestine video analysis, anisotropic features, support vector machine, cascade of classifiers
|
Jaume Amores. (2010). Vocabulary-based Approaches for Multiple-Instance Data: a Comparative Study. In 20th International Conference on Pattern Recognition (4246–4250).
Abstract: Multiple Instance Learning (MIL) has become a hot topic and many different algorithms have been proposed in the last years. Despite this fact, there is a lack of comparative studies that shed light into the characteristics of the different methods and their behavior in different scenarios. In this paper we provide such an analysis. We include methods from different families, and pay special attention to vocabulary-based approaches, a new family of methods that has not received much attention in the MIL literature. The empirical comparison includes seven databases from four heterogeneous domains, implementations of eight popular MIL methods, and a study of the behavior under synthetic conditions. Based on this analysis, we show that, with an appropriate implementation, vocabulary-based approaches outperform other MIL methods in most of the cases, showing in general a more consistent performance.
|
Fadi Dornaika, & Bogdan Raducanu. (2010). Person-specific face shape estimation under varying head pose from single snapshots. In 20th International Conference on Pattern Recognition (3496–3499).
Abstract: This paper presents a new method for person-specific face shape estimation under varying head pose of a previously unseen person from a single image. We describe a featureless approach based on a deformable 3D model and a learned face subspace. The proposed approach is based on maximizing a likelihood measure associated with a learned face subspace, which is carried out by a stochastic and genetic optimizer. We conducted the experiments on a subset of Honda Video Database showing the feasibility and robustness of the proposed approach. For this reason, our approach could lend itself nicely to complex frameworks involving 3D face tracking and face gesture recognition in monocular videos.
|
Francesco Ciompi, Oriol Pujol, & Petia Radeva. (2010). A meta-learning approach to Conditional Random Fields using Error-Correcting Output Codes. In 20th International Conference on Pattern Recognition (710–713).
Abstract: We present a meta-learning framework for the design of potential functions for Conditional Random Fields. The design of both node potential and edge potential is formulated as a classification problem where margin classifiers are used. The set of state transitions for the edge potential is treated as a set of different classes, thus defining a multi-class learning problem. The Error-Correcting Output Codes (ECOC) technique is used to deal with the multi-class problem. Furthermore, the point defined by the combination of margin classifiers in the ECOC space is interpreted in a probabilistic manner, and the obtained distance values are then converted into potential values. The proposed model exhibits very promising results when applied to two real detection problems.
|
David Augusto Rojas, Fahad Shahbaz Khan, & Joost Van de Weijer. (2010). The Impact of Color on Bag-of-Words based Object Recognition. In 20th International Conference on Pattern Recognition (1549–1553).
Abstract: In recent years several works have aimed at exploiting color information in order to improve the bag-of-words based image representation. There are two stages in which color information can be applied in the bag-of-words framework. Firstly, feature detection can be improved by choosing highly informative color-based regions. Secondly, feature description, typically focusing on shape, can be improved with a color description of the local patches. Although both approaches have been shown to improve results the combined merits have not yet been analyzed. Therefore, in this paper we investigate the combined contribution of color to both the feature detection and extraction stages. Experiments performed on two challenging data sets, namely Flower and Pascal VOC 2009; clearly demonstrate that incorporating color in both feature detection and extraction significantly improves the overall performance.
|
Murad Al Haj, Andrew Bagdanov, Jordi Gonzalez, & Xavier Roca. (2010). Reactive object tracking with a single PTZ camera. In 20th International Conference on Pattern Recognition (1690–1693).
Abstract: In this paper we describe a novel approach to reactive tracking of moving targets with a pan-tilt-zoom camera. The approach uses an extended Kalman filter to jointly track the object position in the real world, its velocity in 3D and the camera intrinsics, in addition to the rate of change of these parameters. The filter outputs are used as inputs to PID controllers which continuously adjust the camera motion in order to reactively track the object at a constant image velocity while simultaneously maintaining a desirable target scale in the image plane. We provide experimental results on simulated and real tracking sequences to show how our tracker is able to accurately estimate both 3D object position and camera intrinsics with very high precision over a wide range of focal lengths.
|
Anjan Dutta, Umapada Pal, Alicia Fornes, & Josep Llados. (2010). An Efficient Staff Removal Technique from Printed Musical Documents. In 20th International Conference on Pattern Recognition (1965–1968).
Abstract: Staff removal is an important preprocessing step of the Optical Music Recognition (OMR). The process aims to remove the stafflines from a musical document and retain only the musical symbols, later these symbols are used effectively to identify the music information. This paper proposes a simple but robust method to remove stafflines from printed musical scores. In the proposed methodology we have considered a staffline segment as a horizontal linkage of vertical black runs with uniform height. We have used the neighbouring properties of a staffline segment to validate it as a true segment. We have considered the dataset along with the deformations described in for evaluation purpose. From experimentation we have got encouraging results.
|
Alicia Fornes, Sergio Escalera, Josep Llados, & Ernest Valveny. (2010). Symbol Classification using Dynamic Aligned Shape Descriptor. In 20th International Conference on Pattern Recognition (1957–1960).
Abstract: Shape representation is a difficult task because of several symbol distortions, such as occlusions, elastic deformations, gaps or noise. In this paper, we propose a new descriptor and distance computation for coping with the problem of symbol recognition in the domain of Graphical Document Image Analysis. The proposed D-Shape descriptor encodes the arrangement information of object parts in a circular structure, allowing different levels of distortion. The classification is performed using a cyclic Dynamic Time Warping based method, allowing distortions and rotation. The methodology has been validated on different data sets, showing very high recognition rates.
|
Susana Alvarez, Anna Salvatella, Maria Vanrell, & Xavier Otazu. (2010). Perceptual color texture codebooks for retrieving in highly diverse texture datasets. In 20th International Conference on Pattern Recognition (866–869).
Abstract: Color and texture are visual cues of different nature, their integration in a useful visual descriptor is not an obvious step. One way to combine both features is to compute texture descriptors independently on each color channel. A second way is integrate the features at a descriptor level, in this case arises the problem of normalizing both cues. A significant progress in the last years in object recognition has provided the bag-of-words framework that again deals with the problem of feature combination through the definition of vocabularies of visual words. Inspired in this framework, here we present perceptual textons that will allow to fuse color and texture at the level of p-blobs, which is our feature detection step. Feature representation is based on two uniform spaces representing the attributes of the p-blobs. The low-dimensionality of these text on spaces will allow to bypass the usual problems of previous approaches. Firstly, no need for normalization between cues; and secondly, vocabularies are directly obtained from the perceptual properties of text on spaces without any learning step. Our proposal improve current state-of-art of color-texture descriptors in an image retrieval experiment over a highly diverse texture dataset from Corel.
|
Marçal Rusiñol, Farshad Nourbakhsh, Dimosthenis Karatzas, Ernest Valveny, & Josep Llados. (2010). Perceptual Image Retrieval by Adding Color Information to the Shape Context Descriptor. In 20th International Conference on Pattern Recognition (1594–1597).
Abstract: In this paper we present a method for the retrieval of images in terms of perceptual similarity. Local color information is added to the shape context descriptor in order to obtain an object description integrating both shape and color as visual cues. We use a color naming algorithm in order to represent the color information from a perceptual point of view. The proposed method has been tested in two different applications, an object retrieval scenario based on color sketch queries and a color trademark retrieval problem. Experimental results show that the addition of the color information significantly outperforms the sole use of the shape context descriptor.
|
Muhammad Muzzamil Luqman, Josep Llados, Jean-Yves Ramel, & Thierry Brouard. (2010). A Fuzzy-Interval Based Approach For Explicit Graph Embedding, Recognizing Patterns in Signals, Speech, Images and Video. In 20th International Conference on Pattern Recognition (Vol. 6388, 93–98). LNCS. Springer, Heidelberg.
Abstract: We present a new method for explicit graph embedding. Our algorithm extracts a feature vector for an undirected attributed graph. The proposed feature vector encodes details about the number of nodes, number of edges, node degrees, the attributes of nodes and the attributes of edges in the graph. The first two features are for the number of nodes and the number of edges. These are followed by w features for node degrees, m features for k node attributes and n features for l edge attributes — which represent the distribution of node degrees, node attribute values and edge attribute values, and are obtained by defining (in an unsupervised fashion), fuzzy-intervals over the list of node degrees, node attributes and edge attributes. Experimental results are provided for sample data of ICPR2010 contest GEPR.
|
Muhammad Muzzamil Luqman, Thierry Brouard, Jean-Yves Ramel, & Josep Llados. (2010). A Content Spotting System For Line Drawing Graphic Document Images. In 20th International Conference on Pattern Recognition (Vol. 20, 3420–3423).
Abstract: We present a content spotting system for line drawing graphic document images. The proposed system is sufficiently domain independent and takes the keyword based information retrieval for graphic documents, one step forward, to Query By Example (QBE) and focused retrieval. During offline learning mode: we vectorize the documents in the repository, represent them by attributed relational graphs, extract regions of interest (ROIs) from them, convert each ROI to a fuzzy structural signature, cluster similar signatures to form ROI classes and build an index for the repository. During online querying mode: a Bayesian network classifier recognizes the ROIs in the query image and the corresponding documents are fetched by looking up in the repository index. Experimental results are presented for synthetic images of architectural and electronic documents.
|
Albert Gordo, & Florent Perronnin. (2010). A Bag-of-Pages Approach to Unordered Multi-Page Document Classification. In 20th International Conference on Pattern Recognition (1920–1923).
Abstract: We consider the problem of classifying documents containing multiple unordered pages. For this purpose, we propose a novel bag-of-pages document representation. To represent a document, one assigns every page to a prototype in a codebook of pages. This leads to a histogram representation which can then be fed to any discriminative classifier. We also consider several refinements over this initial approach. We show on two challenging datasets that the proposed approach significantly outperforms a baseline system.
|
Katerine Diaz, Francesc J. Ferri, & W. Diaz. (2013). Fast Approximated Discriminative Common Vectors using rank-one SVD updates. In 20th International Conference On Neural Information Processing (Vol. 8228, pp. 368–375). LNCS. Springer Berlin Heidelberg.
Abstract: An efficient incremental approach to the discriminative common vector (DCV) method for dimensionality reduction and classification is presented. The proposal consists of a rank-one update along with an adaptive restriction on the rank of the null space which leads to an approximate but convenient solution. The algorithm can be implemented very efficiently in terms of matrix operations and space complexity, which enables its use in large-scale dynamic application domains. Deep comparative experimentation using publicly available high dimensional image datasets has been carried out in order to properly assess the proposed algorithm against several recent incremental formulations.
K. Diaz-Chito, F.J. Ferri, W. Diaz
|
Rahat Khan, Joost Van de Weijer, Dimosthenis Karatzas, & Damien Muselet. (2013). Towards multispectral data acquisition with hand-held devices. In 20th IEEE International Conference on Image Processing (pp. 2053–2057).
Abstract: We propose a method to acquire multispectral data with handheld devices with front-mounted RGB cameras. We propose to use the display of the device as an illuminant while the camera captures images illuminated by the red, green and
blue primaries of the display. Three illuminants and three response functions of the camera lead to nine response values which are used for reflectance estimation. Results are promising and show that the accuracy of the spectral reconstruction improves in the range from 30-40% over the spectral
reconstruction based on a single illuminant. Furthermore, we propose to compute sensor-illuminant aware linear basis by discarding the part of the reflectances that falls in the sensorilluminant null-space. We show experimentally that optimizing reflectance estimation on these new basis functions decreases
the RMSE significantly over basis functions that are independent to sensor-illuminant. We conclude that, multispectral data acquisition is potentially possible with consumer hand-held devices such as tablets, mobiles, and laptops, opening up applications which are currently considered to be unrealistic.
Keywords: Multispectral; mobile devices; color measurements
|