Lluis Pere de las Heras, Joan Mas, Gemma Sanchez, & Ernest Valveny. (2011). Descriptor-based SVM Wall Detector. In 9th International Workshop on Graphic Recognition.
Abstract: Architectural floorplans exhibit a large variability in notation. Therefore, segmenting and identifying the elements of any kind of plan becomes a challenging task for approaches based on grouping structural primitives obtained by vectorization. Recently, a patch-based segmentation method working at pixel level and relying on the construction of a visual vocabulary has been proposed showing its adaptability to different notations by automatically learning the visual appearance of the elements in each different notation. In this paper we describe an evolution of this new approach in two directions: firstly we evaluate different features to obtain the description of every patch. Secondly, we train an SVM classifier to obtain the category of every patch instead of constructing a visual vocabulary. These modifications of the method have been tested for wall detection on two datasets of architectural floorplans with different notations and compared with the results obtained with the original approach.
|
Marçal Rusiñol, V. Poulain d'Andecy, Dimosthenis Karatzas, & Josep Llados. (2011). Classification of Administrative Document Images by Logo Identification. In Proceedings of 9th IAPR Workshop on Graphic Recognition.
Abstract: This paper is focused on the categorization of administrative document images (such as invoices) based on the recognition of the supplier's graphical logo. Two different methods are proposed: the first uses a bag-of-visual-words model, whereas the second tries to locate logo images described by the blurred shape model descriptor within documents by a sliding-window technique. Preliminary results are reported on a dataset of real administrative documents.
|
Anjan Dutta, Josep Llados, & Umapada Pal. (2011). Bag-of-GraphPaths Descriptors for Symbol Recognition and Spotting in Line Drawings. In Proceedings of 9th IAPR Workshop on Graphic Recognition. LNCS. Springer Berlin Heidelberg.
Abstract: Graphical symbol recognition and spotting have recently become an important research activity. In this work we present a descriptor for symbols, especially for line drawings. The descriptor is based on the graph representation of graphical objects. We construct graphs from the vectorized information of the binarized images, where the critical points detected by the vectorization algorithm are considered as nodes and the lines joining them are considered as edges. Graph paths between two nodes in a graph are the finite sequences of nodes following the order from the starting to the final node. The occurrences of different graph paths in a given graph are an important feature, as they capture the geometrical and structural attributes of a graph. The graph representing a symbol can therefore be represented efficiently by the occurrences of its different paths. These occurrences can be obtained as a histogram counting the number of some fixed prototype paths; we call this histogram the Bag-of-GraphPaths (BOGP). The BOGP histograms are used as a descriptor to measure the distance among symbols in vector space. We use the descriptor for three applications: (1) classification of graphical symbols, (2) spotting of architectural symbols on floorplans, and (3) classification of historical handwritten words.
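The histogram idea in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`simple_paths`, `bogp_histogram`, `l1_distance`) are hypothetical, nodes are assumed to be integers, and binning a path by its sequence of node labels is a stand-in for the paper's assignment of paths to fixed prototype paths, which the abstract does not detail.

```python
from collections import Counter


def simple_paths(adj, max_len):
    """Yield every simple path (tuple of nodes) that has 1..max_len edges."""
    def extend(path):
        if len(path) > 1:
            yield tuple(path)
        if len(path) - 1 == max_len:
            return
        for nxt in adj[path[-1]]:
            if nxt not in path:
                yield from extend(path + [nxt])

    for start in adj:
        yield from extend([start])


def bogp_histogram(adj, labels, max_len=2):
    """Count paths per bin. ASSUMPTION: a bin is the path's node-label sequence,
    used here in place of the prototype paths mentioned in the abstract."""
    hist = Counter()
    for path in simple_paths(adj, max_len):
        # Each undirected path is enumerated from both ends; keep one copy.
        if path[0] > path[-1]:
            continue
        seq = tuple(labels[n] for n in path)
        # A path and its reverse fall into the same bin.
        hist[min(seq, tuple(reversed(seq)))] += 1
    return hist


def l1_distance(h1, h2):
    """L1 distance between two histograms (missing bins count as zero)."""
    return sum(abs(h1[k] - h2[k]) for k in set(h1) | set(h2))


# Toy graph: a junction (node 1) joined to two corners (nodes 0 and 2).
adj = {0: [1], 1: [0, 2], 2: [1]}
labels = {0: "corner", 1: "junction", 2: "corner"}
print(bogp_histogram(adj, labels, max_len=2))
```

Two symbols can then be compared by calling `l1_distance` on their histograms; any other vector-space distance could be substituted.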
|
Eduard Vazquez. (2011). Unsupervised image segmentation based on material reflectance description and saliency (Ramon Baldrich, Ed.). Ph.D. thesis.
Abstract: Image segmentation aims to partition an image into a set of non-overlapping regions, called segments. Despite the simplicity of the definition, image segmentation is a very complex problem in all its stages. The definition of a segment is still unclear. When asked to perform a segmentation, a human segments at different levels of abstraction. Some segments might be a single, well-defined texture, whereas others correspond to an object in the scene that might include multiple textures and colors. For this reason, segmentation is divided into bottom-up segmentation and top-down segmentation. Bottom-up segmentation is problem-independent, that is, focused on general properties of the images such as textures or illumination. Top-down segmentation is a problem-dependent approach which looks for specific entities in the scene, such as known objects. This work is focused on bottom-up segmentation. Beginning from an analysis of the shortcomings of current methods, we propose an approach called RAD. Our approach overcomes the main shortcomings of those methods which use the physics of light to perform the segmentation. RAD is a topological approach which describes a single-material reflectance. Afterwards, we address one of the main problems in image segmentation: unsupervised adaptability to image content. To yield an unsupervised method, we use a model of saliency also presented in this thesis. It computes the saliency of the chromatic transitions of an image by means of a statistical analysis of the image's derivatives. This saliency method is used to build our final segmentation approach, spRAD, which is unsupervised. Our saliency approach has been validated with a psychophysical experiment as well as computationally, outperforming a state-of-the-art saliency method. spRAD also outperforms state-of-the-art segmentation techniques, as shown by results obtained on a widely used segmentation dataset.
|
Santiago Segui. (2011). Contributions to the Diagnosis of Intestinal Motility by Automatic Image Analysis (Jordi Vitria, Ed.). Ph.D. thesis, Ediciones Graficas Rey.
Abstract: In the early twenty-first century, Given Imaging Ltd. presented wireless capsule endoscopy (WCE) as a new technological breakthrough that allowed the visualization of the intestine by using a small, swallowed camera. This small device was received with great enthusiasm within the medical community, and it is still one of the medical devices with the highest growth rate in use. WCE can be used as a novel diagnostic tool that presents several clinical advantages, since it is non-invasive and at the same time provides, for the first time, a full picture of the small bowel morphology, contents and dynamics. Since its appearance, WCE has been used to detect several intestinal dysfunctions such as polyps, ulcers and bleeding. However, the visual analysis of WCE videos presents an important drawback: the long time physicians need to review the video properly. Given this limitation, the development of computer-aided systems is required for the extensive use of WCE in the medical community. The work presented in this thesis is a set of contributions for the automatic image analysis and computer-aided diagnosis of intestinal motility disorders using WCE. Until now, the diagnosis of small bowel motility dysfunctions was basically performed by invasive techniques such as the manometry test, which can only be conducted at some referral centers around the world owing to the complexity of the procedure and the medical expertise required in the interpretation of the results. Our contributions are divided into three main blocks: 1. Image analysis by computer vision techniques to detect events in the endoluminal WCE scene. Several methods have been proposed to detect visual events such as intestinal contractions, intestinal content, tunnel and wrinkles; 2. Machine learning techniques for the analysis and manipulation of the data from WCE. These methods have been proposed in order to overcome the problems that the analysis of WCE presents, such as video acquisition cost, unlabeled data and large amounts of data; 3. Two different systems for the computer-aided diagnosis of intestinal motility disorders using WCE. The first system presents a fully automatic method that aids in discriminating healthy subjects from patients with severe intestinal motor disorders like pseudo-obstruction or food intolerance. The second system presents another automatic method that models healthy subjects and discriminates them from patients with mild intestinal motility disorders. |
Pierluigi Casale. (2011). Approximate Ensemble Methods for Physical Activity Recognition Applications (Oriol Pujol, & Petia Radeva, Eds.). Ph.D. thesis, Ediciones Graficas Rey.
Abstract: The main interest of this thesis focuses on computational methodologies able to reduce the degree of complexity of learning algorithms and their application to physical activity recognition. Random projections are used to reduce the computational complexity in Multiple Classifier Systems. A new boosting algorithm and a new one-class classification methodology have been developed. In both cases, random projections are used for reducing the dimensionality of the problem and for generating diversity, exploiting in this way the benefits that ensembles of classifiers provide in terms of performance and stability. Moreover, the new one-class classification methodology, based on an ensemble strategy able to approximate a multidimensional convex hull, has been shown to outperform state-of-the-art one-class classification methodologies. The practical focus of the thesis is on physical activity recognition. A new hardware platform for wearable computing applications has been developed and used for collecting data on activities of daily living, making it possible to study the optimal feature set for successfully classifying activities. Based on the classification methodologies developed and the study conducted on physical activity classification, a machine learning architecture capable of providing a continuous authentication mechanism for mobile-device users has been worked out as the last part of the thesis. The system, based on a personalized classifier, relies on the analysis of the characteristic gait patterns typical of each individual, ensuring an unobtrusive and continuous authentication mechanism. |
Fahad Shahbaz Khan. (2011). Coloring bag-of-words based image representations (Joost Van de Weijer, & Maria Vanrell, Eds.). Ph.D. thesis.
Abstract: Put succinctly, the bag-of-words based image representation is the most successful approach for object and scene recognition. Within the bag-of-words framework, the optimal fusion of multiple cues, such as shape, texture and color, still remains an active research domain. There exist two main approaches to combining color and shape information within the bag-of-words framework. The first approach, called early fusion, fuses color and shape at the feature level, as a result of which a joint color-shape vocabulary is produced. The second approach, called late fusion, concatenates histogram representations of color and shape obtained independently. In the first part of this thesis, we analyze the theoretical implications of both early and late feature fusion. We demonstrate that both these approaches are suboptimal for a subset of object categories. Consequently, we propose a novel method for recognizing object categories when using multiple cues by separately processing the shape and color cues and combining them by modulating the shape features by category-specific color attention. Color is used to compute bottom-up and top-down attention maps. Subsequently, the color attention maps are used to modulate the weights of the shape features. Shape features are given more weight in regions with higher attention and vice versa. The approach is tested on several benchmark object recognition data sets and the results clearly demonstrate the effectiveness of our proposed method. In the second part of the thesis, we investigate the problem of obtaining compact spatial pyramid representations for object and scene recognition. Spatial pyramids have been successfully applied to incorporate spatial information into bag-of-words based image representations. However, a major drawback of spatial pyramids is that they lead to high-dimensional image representations. We present a novel framework for obtaining compact pyramid representations. The approach reduces the size of a high-dimensional pyramid representation by up to an order of magnitude without any significant reduction in accuracy. Moreover, we also investigate the optimal combination of multiple features such as color and shape within the context of our compact pyramid representation. Finally, we describe a novel technique to build discriminative visual words from multiple cues learned independently from training images. To this end, we use an information-theoretic vocabulary compression technique to find discriminative combinations of visual cues; the resulting visual vocabulary is compact, has the cue binding property, and supports individual weighting of cues in the final image representation. The approach is tested on standard object recognition data sets. The results obtained clearly demonstrate the effectiveness of our approach.
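To make the difference between late fusion and attention-modulated shape features concrete, here is a minimal sketch. It is an illustration only, not the thesis implementation: the function names are hypothetical, visual words are assumed to be already assigned as integer indices, and the per-feature attention values are assumed to be given (the thesis derives them from color, which is not reproduced here).

```python
def histogram(words, vocab_size, weights=None):
    """Normalized bag-of-words histogram with optional per-feature weights."""
    if weights is None:
        weights = [1.0] * len(words)
    hist = [0.0] * vocab_size
    for word, weight in zip(words, weights):
        hist[word] += weight
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist


def late_fusion(shape_words, color_words, n_shape, n_color):
    """Late fusion: concatenate shape and color histograms built independently."""
    return histogram(shape_words, n_shape) + histogram(color_words, n_color)


def attention_weighted_shape(shape_words, attention, n_shape):
    """Shape histogram in which each local feature counts in proportion to the
    (ASSUMED, precomputed) color attention at its location."""
    return histogram(shape_words, n_shape, attention)


# Three local features: shape word index and color word index per feature.
shape_words = [0, 1, 1]
color_words = [2, 2, 0]
print(late_fusion(shape_words, color_words, n_shape=2, n_color=3))
# Attention is high only at the first feature, so shape word 0 dominates.
print(attention_weighted_shape(shape_words, [0.9, 0.1, 0.1], n_shape=2))
```

In the late-fusion vector the two cues never interact, whereas the attention-weighted histogram lets color decide how much each shape feature contributes.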
|
Sergio Vera, Debora Gil, Antonio Lopez, & Miguel Angel Gonzalez Ballester. (2012). Multilocal Creaseness Measure. IJ - The Insight Journal.
Abstract: This document describes the implementation using the Insight Toolkit of an algorithm for detecting creases (ridges and valleys) in N-dimensional images, based on the Local Structure Tensor of the image. In addition to the filter used to calculate the creaseness image, a filter for the computation of the structure tensor is also included in this submission.
Keywords: Ridges, Valley, Creaseness, Structure Tensor, Skeleton
|
Michal Drozdzal, Petia Radeva, Santiago Segui, Laura Igual, Carolina Malagelada, Fernando Azpiroz, et al. (2012). System and Method for Improving a Discriminative Model. |
Carles Sanchez. (2011). Tracheal ring detection in bronchoscopy (F. J. S. Debora Gil, Ed.) (Vol. 168). Master's thesis.
Abstract: Endoscopy is the process in which a camera is introduced inside a human. Given that endoscopy provides realistic images (in contrast to other modalities) and allows non-invasive, minimal-intervention procedures (which can aid in diagnosis and surgical interventions), its use has spread during the last decades. In this project we focus on bronchoscopic procedures, during which the camera is introduced through the trachea in order to diagnose the patient. The diagnostic interventions are focused on: degree of stenosis (reduction in tracheal area), prosthesis or early diagnosis of tumors. In the first case, assessment of the luminal area and calculation of the diameters of the tracheal rings are required. A main limitation is that the whole process is done by hand, which means that the doctor takes all the measurements and decisions just by looking at the screen. As far as we know, there is no computational framework for helping doctors in the diagnosis. This project consists of analysing bronchoscopic videos in order to extract useful information for the diagnosis of the degree of stenosis. In particular, we focus on segmentation of the tracheal rings. As a result of this project, several strategies for detecting tracheal rings have been implemented in order to compare their performance.
Keywords: Bronchoscopy, tracheal ring, segmentation
|
Francesc Tanarro Marquez, Pau Gratacos Marti, F. Javier Sanchez, Joan Ramon Jimenez Minguell, Coen Antens, & Enric Sala i Esteva. (2012). A device for monitoring condition of a railway supply. European Patent Office.
Abstract: The device is intended to monitor the condition of a railway supply line when the supply line is in contact with a head of a pantograph of a vehicle in order to power said vehicle. The device includes a camera for monitoring parameters indicative of the operating capability of said supply line. The device includes a reflective element, comprising a pattern, intended to be arranged on the pantograph head. The camera is intended to be arranged on the vehicle so as to register the pattern position with respect to a vertical direction. |
G.D. Evangelidis, Ferran Diego, Joan Serrat, & Antonio Lopez. (2011). Slice Matching for Accurate Spatio-Temporal Alignment. In ICCV Workshop on Visual Surveillance.
Abstract: Video synchronization and alignment is a rather recent topic in computer vision. It usually deals with the problem of aligning sequences recorded simultaneously by static, jointly- or independently-moving cameras. In this paper, we investigate the more difficult problem of matching videos captured at different times from independently-moving cameras, whose trajectories are approximately coincident or parallel. To this end, we propose a novel method that aligns videos pixel-wise and thus allows their differences to be highlighted automatically. This primarily aims at visual surveillance, but the method can be adopted as is by other related video applications, like object transfer (augmented reality) or high dynamic range video. We build upon a slice matching scheme to first synchronize the sequences, and we develop a spatio-temporal alignment scheme to spatially register corresponding frames and refine the temporal mapping. We investigate the performance of the proposed method on videos recorded from vehicles driven along different types of roads and compare with related previous works.
Keywords: video alignment
|
G. Roig, Xavier Boix, F. de la Torre, Joan Serrat, & C. Vilella. (2011). Hierarchical CRF with product label spaces for parts-based Models. In IEEE Conference on Automatic Face and Gesture Recognition.
Abstract: Non-rigid object detection is a challenging and open research problem in computer vision. It is a critical part of many applications such as image search, surveillance, human-computer interaction or image auto-annotation. Most successful approaches to non-rigid object detection make use of part-based models. In particular, Conditional Random Fields (CRF) have been successfully embedded into a discriminative parts-based model framework due to their effectiveness for learning and inference (usually based on a tree structure). However, CRF-based approaches do not incorporate global constraints and only model pairwise interactions. This is especially important when modeling object classes that may have complex part interactions (e.g. facial features or body articulations), because neglecting them yields an oversimplified model with suboptimal performance. To overcome this limitation, this paper proposes a novel hierarchical CRF (HCRF). The main contribution is to build a hierarchy of part combinations by extending the label set to a hierarchy of product label spaces. In order to keep the inference computation tractable, we propose an effective method to reduce the new label set. We test our method on two applications: facial feature detection on the Multi-PIE database and human pose estimation on the Buffy dataset.
|
Albert Andaluz. (2012). Harmonic Phase Flow: User's guide. Barcelona: CVC.
Abstract: HPF is a plugin for the computation of clinical scores under Osirix. This manual provides a basic guide for experienced clinical staff. Chapter 1 provides the theoretical background on which this plugin is based. Next, in chapter 2 we provide basic instructions for installing and uninstalling this plugin. Chapter 3 shows a step-by-step scenario to compute clinical scores from tagged-MRI images with HPF. Finally, in chapter 4 we provide a quick guide for plugin developers. |
Fahad Shahbaz Khan, Joost Van de Weijer, Andrew Bagdanov, & Maria Vanrell. (2011). Portmanteau Vocabularies for Multi-Cue Image Representation. In 25th Annual Conference on Neural Information Processing Systems.
Abstract: We describe a novel technique for feature combination in the bag-of-words model of image classification. Our approach builds discriminative compound words from primitive cues learned independently from training images. Our main observation is that modeling joint-cue distributions independently is more statistically robust for typical classification problems than attempting to empirically estimate the dependent, joint-cue distribution directly. We use information-theoretic vocabulary compression to find discriminative combinations of cues; the resulting vocabulary of portmanteau words is compact, has the cue binding property, and supports individual weighting of cues in the final image representation. State-of-the-art results on both the Oxford Flower-102 and Caltech-UCSD Bird-200 datasets demonstrate the effectiveness of our technique compared to other, significantly more complex approaches to multi-cue image representation.
|