|   | 
Details
   web
Records
Author L. Rothacker; Marçal Rusiñol; Josep Llados; G.A. Fink
Title A Two-stage Approach to Segmentation-Free Query-by-example Word Spotting Type Journal
Year 2014 Publication Manuscript Cultures Abbreviated Journal
Volume 7 Issue Pages 47-58
Keywords
Abstract With the ongoing progress in digitization, huge document collections and archives have become available to a broad audience. Scanned document images can be transmitted electronically and studied simultaneously throughout the world. While this is very beneficial, it is often impossible to perform automated searches on these document collections. Optical character recognition usually fails when it comes to handwritten or historic documents. In order to address the need for exploring document collections rapidly, researchers are working on word spotting. In query-by-example word spotting scenarios, the user selects an exemplary occurrence of the query word in a document image. The word spotting system then retrieves all regions in the collection that are visually similar to the given example of the query word. The best matching regions are presented to the user and no actual transcription is required.
An important property of a word spotting system is the computational speed with which queries can be executed. In our previous work, we presented a relatively slow but high-precision method. In the present work, we will extend this baseline system to an integrated two-stage approach. In a coarse-grained first stage, we will filter document images efficiently in order to identify regions that are likely to contain the query word. In the fine-grained second stage, these regions will be analyzed with our previously presented high-precision method. Finally, we will report recognition results and query times for the well-known George Washington
benchmark in our evaluation. We achieve state-of-the-art recognition results while the query times can be reduced to 50% in comparison with our baseline.
Address
Corporate Author Thesis (up)
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.061; 600.077 Approved no
Call Number Admin @ si @ Serial 3190
Permanent link to this record
 

 
Author Cristhian A. Aguilera-Carrasco
Title Evaluation of feature detectors and descriptors in VISIBLE-LWIR cross-spectral imaging Type Report
Year 2014 Publication CVC Technical Report Abbreviated Journal
Volume 177 Issue Pages
Keywords Multi-spectral; Cross-spectral; Visible-LWIR imaging; Multimodal.
Abstract This thesis evaluates the performance of different state-of-art feature detectors and descriptors algorithms in the Visible-LWIR cross-spectral scenario. The focus is to determine if current detector and descriptor algorithms can be used to match features between the LWIR spectrum and the visible spectrum in applications such as, visual odometry, object recognition, image registration and stereo vision. An outdoor cross-spectral dataset was created to evaluate the suitability of the different algorithms. The results
show that the tested algorithms are not suitable to the task of matching features across different spectra. The repeatability ratio was smaller than the 30 percent in the best case and in general matched features were not accurate located. Additionally, these results also suggest that is necessary to create new algorithms that take into account the nature of the different spectra, describing characteristics that exist in both spectra such as discontinuities.
Address
Corporate Author Thesis (up) Master's thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS; 600.076 Approved no
Call Number Admin @ si @Agu2014 Serial 2526
Permanent link to this record
 

 
Author Antonio Esteban Lansaque
Title 3D reconstruction and recognition using structured ligth Type Report
Year 2014 Publication CVC Technical Report Abbreviated Journal
Volume 179 Issue Pages
Keywords
Abstract This work covers the problem of 3D reconstruction, recognition and 6DOF pose estimation. The goal of this project is to reconstruct a 3D scene and to align an object model of the industrial pieces onto the reconstructed scene. The reconstruction algorithm is based on stereo techniques and the recognition algorithm is based on SHOT descriptors computed on a set of uniform keypoints. Correspondences are used to estimate a first 6DOF transformation that maps the model onto the scene and then ICP algorithm is used to refine the transformation. In order to check the effectiveness of the proposed algorithm, several experiments were performed. These experiments were conducted on a lab environment in order to get results under the same conditions in all of them. Although obtained results are not real time results, the proposed algorithm ends up with high rates of object recognition.
Address UAB; September 2014
Corporate Author Thesis (up) Master's thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes IAM; 600.075 Approved no
Call Number Admin @ si @ Est2014 Serial 2578
Permanent link to this record
 

 
Author Ricard Balague
Title Exploring the combination of color cues for intrinsic image decomposition Type Report
Year 2014 Publication CVC Technical Report Abbreviated Journal
Volume 178 Issue Pages
Keywords
Abstract Intrinsic image decomposition is a challenging problem that consists in separating an image into its physical characteristics: reflectance and shading. This problem can be solved in different ways, but most methods have combined information from several visual cues. In this work we describe an extension of an existing method proposed by Serra et al. which considers two color descriptors and combines them by means of a Markov Random Field. We analyze in depth the weak points of the method and we explore more possibilities to use in both descriptors. The proposed extension depends on the combination of the cues considered to overcome some of the limitations of the original method. Our approach is tested on the MIT dataset and Beigpour et al. dataset, which contain images of real objects acquired under controlled conditions and synthetic images respectively, with their corresponding ground truth.
Address UAB; September 2014
Corporate Author Thesis (up) Master's thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes CIC; 600.074 Approved no
Call Number Admin @ si @ Bal2014 Serial 2579
Permanent link to this record
 

 
Author Sebastian Ramos
Title Vision-based Detection of Road Hazards for Autonomous Driving Type Report
Year 2014 Publication CVC Technical Report Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address UAB; September 2014
Corporate Author Thesis (up) Master's thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS; 600.076 Approved no
Call Number Admin @ si @ Ram2014 Serial 2580
Permanent link to this record
 

 
Author Monica Piñol
Title Reinforcement Learning of Visual Descriptors for Object Recognition Type Book Whole
Year 2014 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract The human visual system is able to recognize the object in an image even if the object is partially occluded, from various points of view, in different colors, or with independence of the distance to the object. To do this, the eye obtains an image and extracts features that are sent to the brain, and then, in the brain the object is recognized. In computer vision, the object recognition branch tries to learns from the human visual system behaviour to achieve its goal. Hence, an algorithm is used to identify representative features of the scene (detection), then another algorithm is used to describe these points (descriptor) and finally the extracted information is used for classifying the object in the scene. The selection of this set of algorithms is a very complicated task and thus, a very active research field. In this thesis we are focused on the selection/learning of the best descriptor for a given image. In the state of the art there are several descriptors but we do not know how to choose the best descriptor because depends on scenes that we will use (dataset) and the algorithm chosen to do the classification. We propose a framework based on reinforcement learning and bag of features to choose the best descriptor according to the given image. The system can analyse the behaviour of different learning algorithms and descriptor sets. Furthermore the proposed framework for improving the classification/recognition ratio can be used with minor changes in other computer vision fields, such as video retrieval.
Address
Corporate Author Thesis (up) Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Ricardo Toledo;Angel Sappa
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-940902-5-7 Medium
Area Expedition Conference
Notes ADAS; 600.076 Approved no
Call Number Admin @ si @ Piñ2014 Serial 2464
Permanent link to this record
 

 
Author Anjan Dutta
Title Inexact Subgraph Matching Applied to Symbol Spotting in Graphical Documents Type Book Whole
Year 2014 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract There is a resurgence in the use of structural approaches in the usual object recognition and retrieval problem. Graph theory, in particular, graph matching plays a relevant role in that. Specifically, the detection of an object (or a part of that) in an image in terms of structural features can be formulated as a subgraph matching. Subgraph matching is a challenging task. Specially due to the presence of outliers most of the graph matching algorithms do not perform well in subgraph matching scenario. Also exact subgraph isomorphism has proven to be an NP-complete problem. So naturally, in graph matching community, there are lot of efforts addressing the problem of subgraph matching within suboptimal bound. Most of them work with approximate algorithms that try to get an inexact solution in estimated way. In addition, usual recognition must cope with distortion. Inexact graph matching consists in finding the best isomorphism under a similarity measure. Theoretically this thesis proposes algorithms for solving subgraph matching in an approximate and inexact way.
We consider the symbol spotting problem on graphical documents or line drawings from application point of view. This is a well known problem in the graphics recognition community. It can be further applied for indexing and classification of documents based on their contents. The structural nature of this kind of documents easily motivates one for giving a graph based representation. So the symbol spotting problem on graphical documents can be considered as a subgraph matching problem. The main challenges in this application domain is the noise and distortions that might come during the usage, digitalization and raster to vector conversion of those documents. Apart from that computer vision nowadays is not any more confined within a limited number of images. So dealing a huge number of images with graph based method is a further challenge.
In this thesis, on one hand, we have worked on efficient and robust graph representation to cope with the noise and distortions coming from documents. On the other hand, we have worked on different graph based methods and framework to solve the subgraph matching problem in a better approximated way, which can also deal with considerable number of images. Firstly, we propose a symbol spotting method by hashing serialized subgraphs. Graph serialization allows to create factorized substructures such as graph paths, which can be organized in hash tables depending on the structural similarities of the serialized subgraphs. The involvement of hashing techniques helps to reduce the search space substantially and speeds up the spotting procedure. Secondly, we introduce contextual similarities based on the walk based propagation on tensor product graph. These contextual similarities involve higher order information and more reliable than pairwise similarities. We use these higher order similarities to formulate subgraph matching as a node and edge selection problem in the tensor product graph. Thirdly, we propose near convex grouping to form near convex region adjacency graph which eliminates the limitations of traditional region adjacency graph representation for graphic recognition. Fourthly, we propose a hierarchical graph representation by simplifying/correcting the structural errors to create a hierarchical graph of the base graph. Later these hierarchical graph structures are matched with some graph matching methods. Apart from that, in this thesis we have provided an overall experimental comparison of all the methods and some of the state-of-the-art methods. Furthermore, some dataset models have also been proposed.
Address
Corporate Author Thesis (up) Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Josep Llados;Umapada Pal
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-940902-4-0 Medium
Area Expedition Conference
Notes DAG; 600.077 Approved no
Call Number Admin @ si @ Dut2014 Serial 2465
Permanent link to this record
 

 
Author Michal Drozdzal
Title Sequential image analysis for computer-aided wireless endoscopy Type Book Whole
Year 2014 Publication PhD Thesis, Universitat de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Wireless Capsule Endoscopy (WCE) is a technique for inner-visualization of the entire small intestine and, thus, offers an interesting perspective on intestinal motility. The two major drawbacks of this technique are: 1) huge amount of data acquired by WCE makes the motility analysis tedious and 2) since the capsule is the first tool that offers complete inner-visualization of the small intestine,the exact importance of the observed events is still an open issue. Therefore, in this thesis, a novel computer-aided system for intestinal motility analysis is presented. The goal of the system is to provide an easily-comprehensible visual description of motility-related intestinal events to a physician. In order to do so, several tools based either on computer vision concepts or on machine learning techniques are presented. A method for transforming 3D video signal to a holistic image of intestinal motility, called motility bar, is proposed. The method calculates the optimal mapping from video into image from the intestinal motility point of view.
To characterize intestinal motility, methods for automatic extraction of motility information from WCE are presented. Two of them are based on the motility bar and two of them are based on frame-per-frame analysis. In particular, four algorithms dealing with the problems of intestinal contraction detection, lumen size estimation, intestinal content characterization and wrinkle frame detection are proposed and validated. The results of the algorithms are converted into sequential features using an online statistical test. This test is designed to work with multivariate data streams. To this end, we propose a novel formulation of concentration inequality that is introduced into a robust adaptive windowing algorithm for multivariate data streams. The algorithm is used to obtain robust representation of segments with constant intestinal motility activity. The obtained sequential features are shown to be discriminative in the problem of abnormal motility characterization.
Finally, we tackle the problem of efficient labeling. To this end, we incorporate active learning concepts to the problems present in WCE data and propose two approaches. The first one is based the concepts of sequential learning and the second one adapts the partition-based active learning to an error-free labeling scheme. All these steps are sufficient to provide an extensive visual description of intestinal motility that can be used by an expert as decision support system.
Address
Corporate Author Thesis (up) Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Petia Radeva
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-940902-3-3 Medium
Area Expedition Conference
Notes MILAB Approved no
Call Number Admin @ si @ Dro2014 Serial 2486
Permanent link to this record
 

 
Author Antonio Clavelli
Title A computational model of eye guidance, searching for text in real scene images Type Book Whole
Year 2014 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Searching for text objects in real scene images is an open problem and a very active computer vision research area. A large number of methods have been proposed tackling the text search as extension of the ones from the document analysis field or inspired by general purpose object detection methods. However the general problem of object search in real scene images remains an extremely challenging problem due to the huge variability in object appearance. This thesis builds on top of the most recent findings in the visual attention literature presenting a novel computational model of eye guidance aiming to better describe text object search in real scene images.
First are presented the relevant state-of-the-art results from the visual attention literature regarding eye movements and visual search. Relevant models of attention are discussed and integrated with recent observations on the role of top-down constraints and the emerging need for a layered model of attention in which saliency is not the only factor guiding attention. Visual attention is then explained by the interaction of several modulating factors, such as objects, value, plans and saliency. Then we introduce our probabilistic formulation of attention deployment in real scene. The model is based on the rationale that oculomotor control depends on two interacting but distinct processes: an attentional process that assigns value to the sources of information and motor process that flexibly links information with action.
In such framework, the choice of where to look next is task-dependent and oriented to classes of objects embedded within pictures of complex scenes. The dependence on task is taken into account by exploiting the value and the reward of gazing at certain image patches or proto-objects that provide a sparse representation of the scene objects.
In the experimental section the model is tested in laboratory condition, comparing model simulations with data from eye tracking experiments. The comparison is qualitative in terms of observable scan paths and quantitative in terms of statistical similarity of gaze shift amplitude. Experiments are performed using eye tracking data from both a publicly available dataset of face and text and from newly performed eye-tracking experiments on a dataset of street view pictures containing text. The last part of this thesis is dedicated to study the extent to which the proposed model can account for human eye movements in a low constrained setting. We used a mobile eye tracking device and an ad-hoc developed methodology to compare model simulated eye data with the human eye data from mobile eye tracking recordings. Such setting allow to test the model in an incomplete visual information condition, reproducing a close to real-life search task.
Address
Corporate Author Thesis (up) Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Dimosthenis Karatzas;Giuseppe Boccignone;Josep Llados
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-940902-6-4 Medium
Area Expedition Conference
Notes DAG; 600.077 Approved no
Call Number Admin @ si @ Cla2014 Serial 2571
Permanent link to this record
 

 
Author Jon Almazan
Title Learning to Represent Handwritten Shapes and Words for Matching and Recognition Type Book Whole
Year 2014 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Writing is one of the most important forms of communication and for centuries, handwriting had been the most reliable way to preserve knowledge. However, despite the recent development of printing houses and electronic devices, handwriting is still broadly used for taking notes, doing annotations, or sketching ideas.
Transferring the ability of understanding handwritten text or recognizing handwritten shapes to computers has been the goal of many researches due to its huge importance for many different fields. However, designing good representations to deal with handwritten shapes, e.g. symbols or words, is a very challenging problem due to the large variability of these kinds of shapes. One of the consequences of working with handwritten shapes is that we need representations to be robust, i.e., able to adapt to large intra-class variability. We need representations to be discriminative, i.e., able to learn what are the differences between classes. And, we need representations to be efficient, i.e., able to be rapidly computed and compared. Unfortunately, current techniques of handwritten shape representation for matching and recognition do not fulfill some or all of these requirements.
Through this thesis we focus on the problem of learning to represent handwritten shapes aimed at retrieval and recognition tasks. Concretely, on the first part of the thesis, we focus on the general problem of representing any kind of handwritten shape. We first present a novel shape descriptor based on a deformable grid that deals with large deformations by adapting to the shape and where the cells of the grid can be used to extract different features. Then, we propose to use this descriptor to learn statistical models, based on the Active Appearance Model, that jointly learns the variability in structure and texture of a given class. Then, on the second part, we focus on a concrete application, the problem of representing handwritten words, for the tasks of word spotting, where the goal is to find all instances of a query word in a dataset of images, and recognition. First, we address the segmentation-free problem and propose an unsupervised, sliding-window-based approach that achieves state-of- the-art results in two public datasets. Second, we address the more challenging multi-writer problem, where the variability in words exponentially increases. We describe an approach in which both word images and text strings are embedded in a common vectorial subspace, and where those that represent the same word are close together. This is achieved by a combination of label embedding and attributes learning, and a common subspace regression. This leads to a low-dimensional, unified representation of word images and strings, resulting in a method that allows one to perform either image and text searches, as well as image transcription, in a unified framework. We evaluate our methods on different public datasets of both handwritten documents and natural images showing results comparable or better than the state-of-the-art on spotting and recognition tasks.
Address
Corporate Author Thesis (up) Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Ernest Valveny;Alicia Fornes
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.077 Approved no
Call Number Admin @ si @ Alm2014 Serial 2572
Permanent link to this record
 

 
Author David Fernandez
Title Contextual Word Spotting in Historical Handwritten Documents Type Book Whole
Year 2014 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract There are countless collections of historical documents in archives and libraries that contain plenty of valuable information for historians and researchers. The extraction of this information has become a central task among the Document Analysis researches and practitioners.
There is an increasing interest to digital preserve and provide access to these kind of documents. But only the digitalization is not enough for the researchers. The extraction and/or indexation of information of this documents has had an increased interest among researchers. In many cases, and in particular in historical manuscripts, the full transcription of these documents is extremely dicult due the inherent de ciencies: poor physical preservation, di erent writing styles, obsolete languages, etc. Word spotting has become a popular an ecient alternative to full transcription. It inherently involves a high level of degradation in the images. The search of words is holistically
formulated as a visual search of a given query shape in a larger image, instead of recognising the input text and searching the query word with an ascii string comparison. But the performance of classical word spotting approaches depend on the degradation level of the images being unacceptable in many cases . In this thesis we have proposed a novel paradigm called contextual word spotting method that uses the contextual/semantic information to achieve acceptable results whereas classical word spotting does not reach. The contextual word spotting framework proposed in this thesis is a segmentation-based word spotting approach, so an ecient word segmentation is needed. Historical handwritten
documents present some common diculties that can increase the diculties the extraction of the words. We have proposed a line segmentation approach that formulates the problem as nding the central part path in the area between two consecutive lines. This is solved as a graph traversal problem. A path nding algorithm is used to nd the optimal path in a graph, previously computed, between the text lines. Once the text lines are extracted, words are localized inside the text lines using a word segmentation technique from the state of the
art. Classical word spotting approaches can be improved using the contextual information of the documents. We have introduced a new framework, oriented to handwritten documents that present a highly structure, to extract information making use of context. The framework is an ecient tool for semi-automatic transcription that uses the contextual information to achieve better results than classical word spotting approaches. The contextual information is
automatically discovered by recognizing repetitive structures and categorizing all the words according to semantic classes. The most frequent words in each semantic cluster are extracted and the same text is used to transcribe all them. The experimental results achieved in this thesis outperform classical word spotting approaches demonstrating the suitability of the proposed ensemble architecture for spotting words in historical handwritten documents using contextual information.
Address
Corporate Author Thesis (up) Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Josep Llados;Alicia Fornes
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-940902-7-1 Medium
Area Expedition Conference
Notes DAG; 600.077 Approved no
Call Number Admin @ si @ Fer2014 Serial 2573
Permanent link to this record
 

 
Author Lluis Pere de las Heras
Title Relational Models for Visual Understanding of Graphical Documents. Application to Architectural Drawings. Type Book Whole
Year 2014 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Graphical documents express complex concepts using a visual language. This language consists of a vocabulary (symbols) and a syntax (structural relations between symbols) that articulate a semantic meaning in a certain context. Therefore, the automatic interpretation by computers of these sort of documents entails three main steps: the detection of the symbols, the extraction of the structural relations between these symbols, and the modeling of the knowledge that permits the extraction of the semantics. Di erent domains in graphical documents include: architectural and engineering drawings, maps, owcharts, etc.
Graphics Recognition in particular and Document Image Analysis in general are
born from the industrial need of interpreting a massive amount of digitalized documents after the emergence of the scanner. Although many years have passed, the graphical document understanding problem still seems to be far from being solved. The main reason is that the vast majority of the systems in the literature focus on very speci c problems, where the domain of the document dictates the implementation of the interpretation. As a result, it is dicult to reuse these strategies on di erent data and on di erent contexts, hindering thus the natural progress in the eld.
In this thesis, we face the graphical document understanding problem by proposing several relational models at di erent levels that are designed from a generic perspective. Firstly, we introduce three di erent strategies for the detection of symbols. The fi rst method tackles the problem structurally, wherein general knowledge of the domain guides the detection. The second is a statistical method that learns the graphical appearance of the symbols and easily adapts to the big variability of the problem. The third method is a combination of the previous two methods that inherits their respective strengths, i.e. copes the big variability and does not need annotated data. Secondly, we present two relational strategies that tackle the problem of the visual context extraction. The fi rst one is a full bottom up method that heuristically searches in a graph representation the contextual relations between symbols. Contrarily, the second is syntactic method that models probabilistically the structure of the documents. It automatically learns the model, which guides the inference algorithm to encounter the best structural representation for a given input. Finally, we construct a knowledge-based model consisting of an ontological de nition of the domain and real data. This model permits to perform contextual reasoning and to detect semantic inconsistencies within the data. We evaluate the suitability of the proposed contributions in the framework of floor plan interpretation. Since there is no standard in the modeling of these documents there exists an enormous notation variability from plan to plan in terms of vocabulary and syntax. Therefore, floor plan interpretation is a relevant task in the graphical document understanding problem. It is also worth to mention that we make freely available all the resources used in this thesis {the data, the tool used to generate the data, and the evaluation scripts{ with the aim of fostering research in the graphical document understanding task.
Address
Corporate Author Thesis (up) Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Gemma Sanchez
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-940902-8-8 Medium
Area Expedition Conference
Notes DAG; 600.077 Approved no
Call Number Admin @ si @ Her2014 Serial 2574
Permanent link to this record
 

 
Author Carles Sanchez
Title Tracheal Structure Characterization using Geometric and Appearance Models for Efficient Assessment of Stenosis in Videobronchoscopy Type Book Whole
Year 2014 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Recent advances in endoscopic devices have increased their use for minimal invasive diagnostic and intervention procedures. Among all endoscopic modalities, bronchoscopy is one of the most frequent with around 261 millions of procedures per year. Although the use of bronchoscopy is spread among clinical facilities it presents some drawbacks, being the visual inspection for the assessment of anatomical measurements the most prevalent of them. In
particular, inaccuracies in the estimation of the degree of stenosis (the percentage of obstructed airway) decreases its diagnostic yield and might lead to erroneous treatments. An objective computation of tracheal stenosis in bronchoscopy videos would constitute a breakthrough for this non-invasive technique and a reduction in treatment cost.
This thesis settles the first steps towards on-line reliable extraction of anatomical information from videobronchoscopy for computation of objective measures. In particular, we focus on the computation of the degree of stenosis, which is obtained by comparing the area delimited by a healthy tracheal ring and the stenosed lumen. Reliable extraction of airway structures in interventional videobronchoscopy is a challenging task. This is mainly due to the large variety of acquisition conditions (positions and illumination), devices (different digitalizations) and in videos acquired at the operating room the unpredicted presence of surgical devices (such as probe ends). This thesis contributes to on-line stenosis assessment in several ways. We
propose a parametric strategy for the extraction of lumen and tracheal rings regions based on the characterization of their geometry and appearance that guide a deformable model. The geometric and appearance characterization is based on a physical model describing the way bronchoscopy images are obtained and includes local and global descriptions. In order to ensure a systematic applicability we present a statistical framework to select the optimal
parameters of our method. Experiments perform on the first public annotated database, show that the performance of our method is comparable to the one provided by clinicians and its computation time allows for a on-line implementation in the operating room.
Address
Corporate Author Thesis (up) Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor F. Javier Sanchez;Debora Gil;Jorge Bernal
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-940902-9-5 Medium
Area Expedition Conference
Notes IAM; 600.075 Approved no
Call Number Admin @ si @ San2014 Serial 2575
Permanent link to this record