|
Josep M. Gonfaus, Theo Gevers, Arjan Gijsenij, Xavier Roca, & Jordi Gonzalez. (2012). Edge Classification using Photo-Geometric features. In 21st International Conference on Pattern Recognition (pp. 1497–1500).
Abstract: Edges are caused by several imaging cues such as shadow, material and illumination transitions. Classification methods have been proposed which are solely based on photometric information, ignoring geometry to classify the physical nature of edges in images. In this paper, the aim is to present a novel strategy to handle both photometric and geometric information for edge classification. Photometric information is obtained through the use of quasi-invariants while geometric information is derived from the orientation and contrast of edges. Different combination frameworks are compared with a new principled approach that captures both kinds of information in the same descriptor. From large scale experiments on different datasets, it is shown that, in addition to photometric information, the geometry of edges is an important visual cue to distinguish between different edge types. It is concluded that by combining both cues the performance improves by more than 7% for shadows and highlights.
|
|
|
Francesco Ciompi. (2012). Multi-Class Learning for Vessel Characterization in Intravascular Ultrasound (Petia Radeva, & Oriol Pujol, Eds.). Ph.D. thesis, Ediciones Graficas Rey.
Abstract: In this thesis we tackle the problem of automatic characterization of human coronary vessels in the Intravascular Ultrasound (IVUS) image modality. The basis for the whole characterization process is machine learning applied to multi-class problems. In all the presented approaches, the Error-Correcting Output Codes (ECOC) framework is used as the central element for the design of multi-class classifiers.
Two main topics are tackled in this thesis. First, the automatic detection of the vessel borders is presented. For this purpose, a novel context-aware classifier for multi-class classification of the vessel morphology is presented, namely ECOC-DRF. Based on ECOC-DRF, the lumen border and the media-adventitia border in IVUS are robustly detected by means of a novel holistic approach, achieving an error comparable with inter-observer variability and with state of the art methods.
The two vessel borders define the atheroma area of the vessel. In this area, tissue characterization is required. For this purpose, we present a framework for automatic plaque characterization by processing both texture in IVUS images and spectral information in raw Radio Frequency data. Furthermore, a novel method for fusing in-vivo and in-vitro IVUS data for plaque characterization is presented, namely pSFFS. The method is shown to fuse the data effectively, generating a classifier that improves the tissue characterization in both in-vitro and in-vivo datasets.
A novel method for automatic video summarization in IVUS sequences is also presented. The method aims to detect the key frames of the sequence, i.e., the frames representative of morphological changes. This novel method represents the basis for video summarization in IVUS as well as the markers for the partition of the vessel into morphological and clinically interesting events.
Finally, multi-class learning based on ECOC is applied to lung tissue characterization in Computed Tomography. The novel proposed approach, based on supervised and unsupervised learning, achieves accurate tissue classification on a large and heterogeneous dataset.
|
|
|
Karel Paleček, David Geronimo, & Frederic Lerasle. (2012). Pre-attention cues for person detection. In Cognitive Behavioural Systems, COST 2102 International Training School (pp. 225–235). LNCS. Springer Berlin Heidelberg.
Abstract: Current state-of-the-art person detectors have been proven reliable and achieve very good detection rates. However, the performance is often far from real time, which limits their use to low resolution images only. In this paper, we deal with the candidate window generation problem for person detection, i.e. we want to reduce the computational complexity of a person detector by reducing the number of regions that have to be evaluated. We base our work on Alexe’s paper [1], which introduced several pre-attention cues for generic object detection. We evaluate these cues in the context of person detection and show that their performance degrades rapidly for scenes containing multiple objects of interest, such as pictures from urban environments. We extend this set with new cues, which better suit our class-specific task. The cues are designed to be simple and efficient, so that they can be used in the pre-attention phase of a more complex sliding window based person detector.
|
|
|
Petia Radeva, Michal Drozdzal, Santiago Segui, Laura Igual, Carolina Malagelada, Fernando Azpiroz, et al. (2012). Active labeling: Application to wireless endoscopy analysis. In High Performance Computing and Simulation, International Conference on (pp. 174–181).
Abstract: Today, robust learners trained in a real supervised machine learning application should rely on a rich collection of positive and negative examples. Although in many applications it is not difficult to obtain a huge amount of data, labeling those data can be a very expensive process, especially when dealing with data of high variability and complexity. A good example of such cases is data from medical imaging applications, where annotating anomalies like tumors, polyps, atherosclerotic plaque or informative frames in wireless endoscopy requires highly trained experts. Building a representative set of training data from medical videos (e.g. Wireless Capsule Endoscopy) means that thousands of frames must be labeled by an expert. It is quite common for data in new videos to differ from, and thus not be represented by, the training set. In this paper, we review the main approaches to active learning and illustrate how active learning can help to reduce expert effort in constructing the training sets. We show that by applying active learning criteria, the number of human interventions can be significantly reduced. The proposed system allows the annotation of informative/non-informative frames of Wireless Capsule Endoscopy videos containing more than 30000 frames, each with fewer than 100 expert “clicks”.
|
|
|
Jose Carlos Rubio, Joan Serrat, & Antonio Lopez. (2012). Video Co-segmentation. In 11th Asian Conference on Computer Vision (Vol. 7725, pp. 13–24). LNCS. Springer Berlin Heidelberg.
Abstract: Segmentation of a single image is in general a highly underconstrained problem. A frequent approach to solve it is to somehow provide prior knowledge or constraints on what the objects of interest look like (in terms of their shape, size, color, location or structure). Image co-segmentation trades the need for such knowledge for something much easier to obtain, namely, additional images showing the object from other viewpoints. Now the segmentation problem is posed as one of differentiating the similar object regions in all the images from the more varying background. In this paper, for the first time, we extend this approach to video segmentation: given two or more video sequences showing the same object (or objects belonging to the same class) moving in a similar manner, we aim to outline its region in all the frames. In addition, the method works in an unsupervised manner, by learning to segment at testing time. We compare favorably with two state-of-the-art methods on video segmentation and report results on benchmark videos.
|
|
|
Cristhian Aguilera, M. Ramos, & Angel Sappa. (2012). Simulated Annealing: A Novel Application of Image Processing in the Wood Area. In Marcos de Sales Guerra Tsuzuki (Ed.), Simulated Annealing – Advances, Applications and Hybridizations (pp. 91–104).
|
|
|
Monica Piñol, Angel Sappa, & Ricardo Toledo. (2012). MultiTable Reinforcement for Visual Object Recognition. In 4th International Conference on Signal and Image Processing (Vol. 221, pp. 469–480). LNCS. Springer India.
Abstract: This paper presents a bag-of-features based method for visual object recognition. Our contribution is focussed on the selection of the best feature descriptor. It is implemented by using a novel multi-table reinforcement learning method that selects, among five classical descriptors (i.e., Spin, SIFT, SURF, C-SIFT and PHOW), the one that best describes each image. Experimental results and comparisons are provided showing the improvements achieved with the proposed approach.
|
|
|
Mohammad Rouhani, & Angel Sappa. (2012). Non-Rigid Shape Registration: A Single Linear Least Squares Framework. In 12th European Conference on Computer Vision (Vol. 7578, pp. 264–277). LNCS. Springer Berlin Heidelberg.
Abstract: This paper proposes a non-rigid registration formulation capturing both global and local deformations in a single framework. This formulation is based on a quadratic estimation of the registration distance together with a quadratic regularization term. Hence, the optimal transformation parameters are easily obtained by solving a linear system of equations, which guarantees fast convergence. Experimental results with challenging 2D and 3D shapes are presented to show the validity of the proposed framework. Furthermore, comparisons with the most relevant approaches are provided.
|
|
|
Miguel Oliveira, V. Santos, & Angel Sappa. (2012). Short term path planning using a multiple hypothesis evaluation approach for an autonomous driving competition. In IEEE 4th Workshop on Planning, Perception and Navigation for Intelligent Vehicles.
|
|
|
Marina Alberti, Simone Balocco, Xavier Carrillo, J. Mauri, & Petia Radeva. (2012). Automatic Non-Rigid Temporal Alignment of IVUS Sequences. In 15th International Conference on Medical Image Computing and Computer Assisted Intervention (Vol. 1, pp. 642–650). Springer-Verlag Berlin, Heidelberg.
Abstract: Clinical studies on atherosclerosis regression/progression performed by Intravascular Ultrasound analysis require the alignment of pullbacks of the same patient before and after clinical interventions. In this paper, a methodology for the automatic alignment of IVUS sequences based on the Dynamic Time Warping technique is proposed. The method is adapted to the specific IVUS alignment task by applying the non-rigid alignment technique to multidimensional morphological signals, and by introducing a sliding window approach together with a regularization term. To show the effectiveness of our method, an extensive validation is performed both on synthetic data and in-vivo IVUS sequences. The proposed method is robust to stent deployment and post dilation surgery and reaches an alignment error of approximately 0.7 mm for in-vivo data, which is comparable to the inter-observer variability.
|
|
|
Onur Ferhat. (2012). Eye-Tracking with Webcam-Based Setups: Implementation of a Real-Time System and an Analysis of Factors Affecting Performance (Fernando Vilariño, Ed.) (Vol. 172). Master's thesis.
Abstract: In recent years commercial eye-tracking hardware has become more common, with the introduction of new models from several brands that have better performance and easier setup procedures. A cause and at the same time a result of this phenomenon is the popularity of eye-tracking research directed at marketing, accessibility and usability, among others.
One problem with these hardware components is scalability, because both the price and the expertise necessary to operate them make their use practically impossible on a large scale. In this work, we analyze the feasibility of a software eye-tracking system based on a single, ordinary webcam. Our aim is to discover the limits of such a system and to see whether it provides acceptable performance.
The significance of this setup is that it is the most common setup found in consumer environments, in off-the-shelf electronic devices such as laptops, mobile phones and tablet computers. As no special equipment such as infrared lights, mirrors or zoom lenses is used, setting up and calibrating the system is easier compared to other approaches using these components.
Our work is based on the open source application Opengazer, which provides a good starting point for our contributions. We propose several improvements in order to push the system's performance further and make it feasible as a robust, real-time device. Then we carry out an elaborate experiment involving 18 human subjects and 4 different system setups. Finally, we give an analysis of the results and discuss the effects of setup changes, subject differences and modifications in the software.
Keywords: Computer vision, eye-tracking, gaussian process, feature selection, optical flow
|
|
|
Pedro Martins, Paulo Carvalho, & Carlo Gatta. (2012). Stable Salient Shapes. In International Conference on Digital Image Computing: Techniques and Applications.
|
|
|
Jaume Gibert, Ernest Valveny, Horst Bunke, & Alicia Fornes. (2012). On the Correlation of Graph Edit Distance and L1 Distance in the Attribute Statistics Embedding Space. In Structural, Syntactic, and Statistical Pattern Recognition, Joint IAPR International Workshop (Vol. 7626, pp. 135–143). LNCS. Springer-Verlag, Berlin.
Abstract: Graph embeddings in vector spaces aim at assigning a pattern vector to every graph so that the problems of graph classification and clustering can be solved by using data processing algorithms originally developed for statistical feature vectors. An important requirement graph features should fulfil is that they reproduce as much as possible the properties among objects in the graph domain. In particular, it is usually desired that distances between pairs of graphs in the graph domain closely resemble those between their corresponding vectorial representations. In this work, we analyse relations between the edit distance in the graph domain and the L1 distance of the attribute statistics based embedding, for which good classification performance has been reported on various datasets. We show that there is actually a high correlation between the two kinds of distances provided that the corresponding parameter values that account for balancing the weight between node and edge based features are properly selected.
|
|
|
Rui Hua, Oriol Pujol, Francesco Ciompi, Marina Alberti, Simone Balocco, J. Mauri, et al. (2012). Stent Strut Detection by Classifying a Wide Set of IVUS Features. In Computer Assisted Stenting Workshop.
|
|
|
Adela Barbulescu, Wenjuan Gong, Jordi Gonzalez, Thomas B. Moeslund, & Xavier Roca. (2012). 3D Human Pose Estimation Using 2D Body Part Detectors. In 21st International Conference on Pattern Recognition (pp. 2484–2487).
Abstract: Automatic 3D reconstruction of human poses from monocular images is a challenging and popular topic in the computer vision community, which provides a wide range of applications in multiple areas. Solutions for 3D pose estimation involve various learning approaches, such as support vector machines and Gaussian processes, but many encounter difficulties in cluttered scenarios and require additional input data, such as silhouettes, or controlled camera settings. We present a framework that is capable of estimating the 3D pose of a person from single images or monocular image sequences without requiring background information and which is robust to camera variations. The framework models the non-linearity present in human pose estimation as it benefits from flexible learning approaches, including a highly customizable 2D detector. Results on the HumanEva benchmark show how they perform and influence the quality of the 3D pose estimates.
|
|