Author Jaume Gibert
  Title Vector Space Embedding of Graphs via Statistics of Labelling Information Type Book Whole
  Year 2012 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Pattern recognition is the task of distinguishing objects among different classes. When such a task is to be solved automatically, a crucial step is how to formally represent the patterns for the computer. Based on the different representational formalisms, we may distinguish between statistical and structural pattern recognition. The former describes objects as a set of measurements arranged in the form of what is called a feature vector. The latter assumes that relations between parts of the underlying objects need to be explicitly represented, and thus it uses relational structures such as graphs to encode their inherent information. Vector spaces are a very flexible mathematical structure that has enabled several efficient ways of analysing patterns in the form of feature vectors. Nevertheless, such a representation cannot explicitly cope with binary relations between parts of the objects, and it is restricted to measuring the exact same number of features for each pattern under study regardless of its complexity. Graph-based representations present the opposite situation. They can easily adapt to the inherent complexity of the patterns, but they introduce a problem of high computational complexity, hindering the design of efficient tools to process and analyse patterns.

Solving this paradox is the main goal of this thesis. The ideal situation for solving pattern recognition problems would be to represent the patterns using relational structures such as graphs, and to be able to use the wealth of data processing tools from the statistical pattern recognition domain. An elegant solution to this problem is to transform the graph domain into a vector domain where any processing algorithm can be applied. In other words, by mapping each graph to a point in a vector space we automatically gain access to the rich set of algorithms from the statistical domain to be applied in the graph domain. Such a methodology is called graph embedding.

In this thesis we propose to associate feature vectors to graphs in a simple and very efficient way by focusing on the labelling information that graphs store. In particular, we count frequencies of node labels and of edges between labels. Despite their locality, these features are able to robustly represent structurally global properties of graphs when considered together in the form of a vector. We initially deal with the case of discretely attributed graphs, where the features are easy to compute. The continuous case is tackled as a natural generalization of the discrete one, where rather than counting node and edge labelling instances, we count statistics of some representatives of them. We observe that the proposed vectorial representations of graphs suffer from high dimensionality and correlation among components, and we address these problems with feature selection algorithms. We also explore how the diversity of different embedding representations can be exploited in order to boost the performance of base classifiers in a multiple classifier system framework. An extensive experimental evaluation finally shows that the methodology we propose can be efficiently computed and competes with other graph matching and embedding methodologies.
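As an illustration of the embedding idea described in the abstract, the sketch below maps a node-labelled graph to a vector of node-label frequencies concatenated with label-pair edge frequencies. It is a toy example under stated assumptions (a networkx graph with a "label" node attribute and a small fixed label set), not the thesis implementation.

    import itertools
    import networkx as nx

    def label_frequency_embedding(graph, labels):
        # Node-label counts: one dimension per label.
        node_counts = {lab: 0 for lab in labels}
        for _, data in graph.nodes(data=True):
            node_counts[data["label"]] += 1
        # Edge counts between (unordered) label pairs, including equal pairs.
        pairs = list(itertools.combinations_with_replacement(sorted(labels), 2))
        edge_counts = {p: 0 for p in pairs}
        for u, v in graph.edges():
            a, b = sorted((graph.nodes[u]["label"], graph.nodes[v]["label"]))
            edge_counts[(a, b)] += 1
        return [node_counts[lab] for lab in sorted(labels)] + [edge_counts[p] for p in pairs]

    g = nx.Graph()
    g.add_nodes_from([(0, {"label": "A"}), (1, {"label": "B"}), (2, {"label": "A"})])
    g.add_edges_from([(0, 1), (1, 2), (0, 2)])
    print(label_frequency_embedding(g, {"A", "B"}))  # [2, 1, 1, 2, 0]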
 
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Ernest Valveny  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG Approved no  
  Call Number Admin @ si @ Gib2012 Serial 2204  
 

 
Author Marco Pedersoli
  Title Hierarchical Multiresolution Models for fast Object Detection Type Book Whole
  Year 2012 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract The ability to automatically detect and recognize objects in unconstrained images is becoming more and more critical: from security systems and autonomous robots to smart phones and augmented reality, intelligent devices need to understand the meaning of images as a composition of semantic objects. This thesis tackles the problem of fast object detection based on template models. Detection consists of searching for an object in an image by evaluating the similarity between a template model and an image region at each possible location and scale. In this work, we show that using a template model representation based on a multiple-resolution hierarchy is an optimal choice that can lead to excellent detection accuracy and fast computation. We implement two different approaches that make use of a hierarchy of multiresolution models: a multiresolution cascade and a coarse-to-fine search. We also extend the coarse-to-fine search by introducing a deformable part-based model that achieves state-of-the-art results at a very reduced computational cost. Finally, we specialize our approach to the challenging task of pedestrian detection from moving vehicles and show that the overall quality of the system outperforms previous works in terms of speed and accuracy.  
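To make the coarse-to-fine idea concrete, here is a minimal, hypothetical sketch of the strategy: a coarse template is evaluated exhaustively on a coarse image, and only the best coarse candidates are refined at full resolution. The two-level pyramid, 2x2 average pooling and sum-of-squared-differences score are placeholder assumptions, not the detector developed in the thesis.

    import numpy as np

    def downsample(img):
        # 2x2 average pooling (assumes even dimensions).
        return img.reshape(img.shape[0] // 2, 2, img.shape[1] // 2, 2).mean(axis=(1, 3))

    def score_map(img, tmpl):
        # Negative sum of squared differences at every valid location.
        H, W = img.shape
        h, w = tmpl.shape
        scores = np.full((H - h + 1, W - w + 1), -np.inf)
        for y in range(H - h + 1):
            for x in range(W - w + 1):
                scores[y, x] = -np.sum((img[y:y + h, x:x + w] - tmpl) ** 2)
        return scores

    def coarse_to_fine_search(img, tmpl, top_k=5):
        # 1) Evaluate a coarse template exhaustively on a coarse image.
        img_c, tmpl_c = downsample(img), downsample(tmpl)
        coarse = score_map(img_c, tmpl_c)
        cand = np.argsort(coarse.ravel())[-top_k:]
        # 2) Refine only the best coarse candidates at full resolution.
        best, best_score = None, -np.inf
        for idx in cand:
            yc, xc = np.unravel_index(idx, coarse.shape)
            for dy in range(2):
                for dx in range(2):
                    y, x = 2 * yc + dy, 2 * xc + dx
                    if y + tmpl.shape[0] <= img.shape[0] and x + tmpl.shape[1] <= img.shape[1]:
                        s = -np.sum((img[y:y + tmpl.shape[0], x:x + tmpl.shape[1]] - tmpl) ** 2)
                        if s > best_score:
                            best, best_score = (y, x), s
        return best, best_score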
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Jordi Gonzalez;Xavier Roca  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ISE Approved no  
  Call Number Admin @ si @ Ped2012 Serial 2203  
 

 
Author Noha Elfiky
  Title Compact, Adaptive and Discriminative Spatial Pyramids for Improved Object and Scene Classification Type Book Whole
  Year 2012 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract The release of challenging datasets with a vast number of images requires the development of efficient image representations and algorithms which are able to manipulate these large-scale datasets efficiently. Nowadays the Bag-of-Words (BoW) model is the most successful approach in the context of object and scene classification tasks. However, its main drawback is the absence of the important spatial information. Spatial pyramids (SP) have been successfully applied to incorporate spatial information into BoW-based image representations. Observing the remarkable performance of spatial pyramids, their growing number of applications to a broad range of vision problems, and their inclusion of geometry, one can ask what the limits of spatial pyramids are. Within the SP framework, the optimal way of obtaining an image spatial representation that is able to cope with its most prominent shortcomings, namely its high dimensionality and the rigidity of the resulting image representation, still remains an active research domain. In summary, the main concern of this thesis is to search for the limits of spatial pyramids and to propose solutions for them.  
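For reference, the baseline spatial pyramid representation that the thesis builds on can be sketched as follows. This is a simplified toy version assuming pre-computed visual-word assignments per local feature; the grid levels, vocabulary size and normalisation are arbitrary choices, and the compact/adaptive variants proposed in the thesis are not shown.

    import numpy as np

    def spatial_pyramid(points, words, vocab_size, img_w, img_h, levels=(1, 2, 4)):
        """points: (N, 2) feature coordinates; words: (N,) visual-word indices."""
        blocks = []
        for g in levels:
            cell_w, cell_h = img_w / g, img_h / g
            for cy in range(g):
                for cx in range(g):
                    in_cell = ((points[:, 0] // cell_w).astype(int) == cx) & \
                              ((points[:, 1] // cell_h).astype(int) == cy)
                    hist = np.bincount(words[in_cell], minlength=vocab_size)
                    blocks.append(hist)
        vec = np.concatenate(blocks).astype(float)
        return vec / (vec.sum() + 1e-9)   # L1-normalised pyramid vector

    # Toy usage: 5 features, vocabulary of 3 words, 100x100 image.
    pts = np.array([[10, 10], [90, 15], [50, 50], [20, 80], [70, 70]], dtype=float)
    w = np.array([0, 1, 2, 1, 0])
    print(spatial_pyramid(pts, w, 3, 100, 100).shape)  # (3 * (1 + 4 + 16),) = (63,)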
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Jordi Gonzalez;Xavier Roca  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ISE Approved no  
  Call Number Admin @ si @ Elf2012 Serial 2202  
 

 
Author Ariel Amato
  Title Environment-Independent Moving Cast Shadow Suppression in Video Surveillance Type Book Whole
  Year 2012 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract This thesis is devoted to moving shadow detection and suppression. Shadows can be defined as the parts of the scene that are not directly illuminated by a light source because of one or more obstructing objects. Moving shadows in image sequences are often undesirable, since they can degrade the expected results when images are processed for object detection, segmentation, scene surveillance or similar purposes. In this thesis, moving shadow detection methods are first exhaustively reviewed. To compensate for the limitations of the methods in the literature, a new moving shadow detection method is then proposed. It requires no prior knowledge about the scene, nor is it restricted to assumptions about specific scene structures. Furthermore, the technique can detect both achromatic and chromatic shadows even in the presence of camouflage, which occurs when foreground regions are very similar in color to shadowed regions. The method exploits local color constancy properties due to reflectance suppression over shadowed regions. To detect shadowed regions in a scene, the values of the background image are divided by the values of the current frame in the RGB color space. The thesis shows how this luminance ratio can be used to identify segments with low gradient constancy, which in turn distinguishes shadows from foreground. Experimental results on a collection of publicly available datasets illustrate the superior performance of the proposed method compared with the most sophisticated state-of-the-art shadow detection algorithms. These results show that the proposed approach is robust and accurate over a broad range of shadow types and challenging video conditions.  
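The core cue described above (background divided by current frame, followed by a gradient-constancy test on the ratio) can be roughly sketched as below. The thresholds, the gradient operator and the final decision rule are placeholder assumptions, not the thesis's actual parameters or post-processing.

    import numpy as np

    def shadow_candidate_mask(background, frame, ratio_lo=1.0, ratio_hi=4.0, grad_thr=0.15):
        """background, frame: float RGB arrays in [0, 1], shape (H, W, 3)."""
        eps = 1e-6
        ratio = background / (frame + eps)          # >1 where the frame is darker than the background
        darker = np.all((ratio > ratio_lo) & (ratio < ratio_hi), axis=2)
        # Gradient of the (log) ratio: shadows keep reflectance, so the ratio is locally smooth.
        log_ratio = np.log(ratio + eps).mean(axis=2)
        gy, gx = np.gradient(log_ratio)
        smooth = np.hypot(gx, gy) < grad_thr
        return darker & smooth                      # candidate shadow pixels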
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Mikhail Mozerov;Jordi Gonzalez  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ISE Approved no  
  Call Number Admin @ si @ Ama2012 Serial 2201  
 

 
Author Fahad Shahbaz Khan
  Title Coloring bag-of-words based image representations Type Book Whole
  Year 2011 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Put succinctly, the bag-of-words based image representation is the most successful approach for object and scene recognition. Within the bag-of-words framework, the optimal fusion of multiple cues, such as shape, texture and color, still remains an active research domain. There exist two main approaches to combining color and shape information within the bag-of-words framework. The first approach, called early fusion, fuses color and shape at the feature level, as a result of which a joint color-shape vocabulary is produced. The second approach, called late fusion, concatenates the histogram representations of color and shape, obtained independently. In the first part of this thesis, we analyze the theoretical implications of both early and late feature fusion. We demonstrate that both approaches are suboptimal for a subset of object categories. Consequently, we propose a novel method for recognizing object categories when using multiple cues: the shape and color cues are processed separately and combined by modulating the shape features with category-specific color attention. Color is used to compute bottom-up and top-down attention maps. Subsequently, the color attention maps are used to modulate the weights of the shape features: shape features are given more weight in regions with higher attention and vice versa. The approach is tested on several benchmark object recognition data sets, and the results clearly demonstrate the effectiveness of the proposed method. In the second part of the thesis, we investigate the problem of obtaining compact spatial pyramid representations for object and scene recognition. Spatial pyramids have been successfully applied to incorporate spatial information into bag-of-words based image representations. However, a major drawback of spatial pyramids is that they lead to high-dimensional image representations. We present a novel framework for obtaining a compact pyramid representation. The approach reduces the size of a high-dimensional pyramid representation by up to an order of magnitude without any significant reduction in accuracy. Moreover, we also investigate the optimal combination of multiple features, such as color and shape, within the context of our compact pyramid representation. Finally, we describe a novel technique to build discriminative visual words from multiple cues learned independently from training images. To this end, we use an information-theoretic vocabulary compression technique to find discriminative combinations of visual cues; the resulting visual vocabulary is compact, has the cue-binding property, and supports individual weighting of cues in the final image representation. The approach is tested on standard object recognition data sets, and the results obtained clearly demonstrate the effectiveness of our approach.  
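The attention-modulated histogram described above can be illustrated with the following toy sketch, in which each local shape feature's vote into the bag-of-words histogram is weighted by a per-feature color-attention value. The attention values are passed in directly here as an assumption; the bottom-up and top-down color attention maps they would come from are not reproduced.

    import numpy as np

    def attention_weighted_bow(shape_words, attention, vocab_size):
        """shape_words: (N,) visual-word index of each local shape feature.
        attention: (N,) color-attention weight in [0, 1] at each feature location."""
        hist = np.zeros(vocab_size)
        for w, a in zip(shape_words, attention):
            hist[w] += a                      # strong attention -> larger vote
        return hist / (hist.sum() + 1e-9)

    # Toy usage: 4 features; the last two lie in a highly attended (category-colored) region.
    print(attention_weighted_bow(np.array([0, 2, 1, 1]), np.array([0.1, 0.2, 0.9, 0.8]), 3))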
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Place of Publication Editor Joost Van de Weijer;Maria Vanrell  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes CIC Approved no  
  Call Number Admin @ si @ Kha2011 Serial 1838  
 

 
Author Eduard Vazquez
  Title Unsupervised image segmentation based on material reflectance description and saliency Type Book Whole
  Year 2011 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Image segmentation aims to partition an image into a set of non-overlapping regions, called segments. Despite the simplicity of the definition, image segmentation arises as a very complex problem in all its stages. The definition of a segment is still unclear. When a human is asked to perform a segmentation, he or she segments at different levels of abstraction. Some segments might be a single, well-defined texture, whereas others correspond to an object in the scene which might include multiple textures and colors. For this reason, segmentation is divided into bottom-up segmentation and top-down segmentation. Bottom-up segmentation is problem independent, that is, focused on general properties of the images such as textures or illumination. Top-down segmentation is a problem-dependent approach which looks for specific entities in the scene, such as known objects. This work is focused on bottom-up segmentation. Starting from an analysis of the shortcomings of current methods, we propose an approach called RAD. Our approach overcomes the main shortcomings of those methods which use the physics of light to perform the segmentation. RAD is a topological approach which describes a single-material reflectance. Afterwards, we cope with one of the main problems in image segmentation: unsupervised adaptability to image content. To yield an unsupervised method, we use a saliency model also presented in this thesis. It computes the saliency of the chromatic transitions of an image by means of a statistical analysis of the image derivatives. This saliency method is used to build our final segmentation approach: spRAD, an unsupervised segmentation method. Our saliency approach has been validated with a psychophysical experiment as well as computationally, outperforming a state-of-the-art saliency method. spRAD also outperforms state-of-the-art segmentation techniques, as results obtained with a widely-used segmentation dataset show.  
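As a loose illustration of "saliency of chromatic transitions via a statistical analysis of the image derivatives", the sketch below scores each pixel by how rare its chromatic derivative magnitude is with respect to the empirical distribution over the whole image. Both the rarity measure (percentile rank) and the opponent-like chromatic channels are arbitrary assumptions, not the model developed in the thesis.

    import numpy as np

    def chromatic_transition_saliency(rgb):
        """rgb: float array (H, W, 3) in [0, 1]. Returns a saliency map in [0, 1]."""
        r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
        chroma = np.stack([r - g, 0.5 * (r + g) - b], axis=-1)   # simple opponent channels
        grad_mag = np.zeros(rgb.shape[:2])
        for c in range(chroma.shape[-1]):
            gy, gx = np.gradient(chroma[..., c])
            grad_mag += np.hypot(gx, gy)
        # Rarity: percentile rank of each pixel's chromatic gradient magnitude.
        order = grad_mag.ravel().argsort().argsort()
        return (order / float(order.size - 1)).reshape(grad_mag.shape)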
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Place of Publication Editor Ramon Baldrich  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes CIC Approved no  
  Call Number Admin @ si @ Vaz2011b Serial 1835  
 

 
Author Ferran Diego
  Title Probabilistic Alignment of Video Sequences Recorded by Moving Cameras Type Book Whole
  Year 2011 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Video alignment consists of integrating multiple video sequences recorded independently into a single video sequence. This means registering them both in time (synchronizing frames) and in space (image registration) so that the two video sequences can be fused or compared pixel-wise. In spite of being relatively unknown, many applications today may benefit from the availability of robust and efficient video alignment methods. For instance, video surveillance requires integrating video sequences recorded of the same scene at different times in order to detect changes. The problem of aligning videos has been addressed before, but only in the relatively simple cases of fixed or rigidly attached cameras and simultaneous acquisition. In addition, most works rely on restrictive assumptions which reduce the difficulty of the problem, such as linear time correspondence or knowledge of the complete trajectories of corresponding scene points in the images; to some extent, these assumptions limit the practical applicability of the solutions developed until now. In this thesis, we focus on the challenging problem of aligning sequences recorded at different times from independent moving cameras following similar but not coincident trajectories. More precisely, this thesis covers four studies that advance the state of the art in video alignment. First, we focus on analyzing and developing a probabilistic framework for video alignment, that is, a principled way to integrate multiple observations and prior information. In this way, two different approaches are presented to exploit the combination of several purely visual features (image intensities, visual words and a dense motion field descriptor) and global positioning system (GPS) information. Second, we focus on reformulating the problem into a single alignment framework, since previous works on video alignment adopt a divide-and-conquer strategy, i.e., they first solve the synchronization and then register corresponding frames. This also generalizes the 'classic' case of a fixed geometric transform and linear time mapping. Third, we focus on exploiting the time domain of the video sequences directly in order to avoid exhaustive cross-frame search. This provides relevant information used for learning the temporal mapping between pairs of video sequences. Finally, we focus on adapting these methods to the on-line setting for road detection and vehicle geolocation. The qualitative and quantitative results presented in this thesis on a variety of real-world pairs of video sequences show that the proposed method is robust to varying imaging conditions, different image content (e.g., incoming and outgoing vehicles), variations in camera velocity, and different scenarios (indoor and outdoor), going beyond the state of the art. Moreover, the on-line video alignment has been successfully applied to road detection and vehicle geolocation, achieving promising results.
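As a simple point of reference for the synchronization sub-problem (not the probabilistic framework proposed in the thesis), the following sketch aligns two sequences of per-frame descriptors with classic dynamic time warping. The descriptors are assumed to be plain vectors and the cost is Euclidean distance; both are arbitrary choices.

    import numpy as np

    def dtw_frame_correspondences(desc_a, desc_b):
        """desc_a: (Na, D) and desc_b: (Nb, D) per-frame descriptors.
        Returns a monotonic list of (i, j) frame correspondences."""
        desc_a, desc_b = np.asarray(desc_a, float), np.asarray(desc_b, float)
        na, nb = len(desc_a), len(desc_b)
        cost = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
        acc = np.full((na + 1, nb + 1), np.inf)
        acc[0, 0] = 0.0
        for i in range(1, na + 1):
            for j in range(1, nb + 1):
                acc[i, j] = cost[i - 1, j - 1] + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
        # Backtrack the optimal warping path.
        path, i, j = [], na, nb
        while i > 0 and j > 0:
            path.append((i - 1, j - 1))
            step = np.argmin([acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]])
            if step == 0:
                i, j = i - 1, j - 1
            elif step == 1:
                i -= 1
            else:
                j -= 1
        return path[::-1]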
 
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Joan Serrat  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS Approved no  
  Call Number Admin @ si @ Die2011 Serial 1787  
 

 
Author Jaime Moreno
  Title Perceptual Criteria on Image Compression Type Book Whole
  Year 2011 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Nowadays, digital images are used in many areas of everyday life, but they tend to be big. This increased amount of information leads us to the problem of image data storage. For example, it is common to represent a color pixel as a 24-bit number, where the red, green and blue channels use 8 bits each. Consequently, such a color pixel can specify one of 2^24 ≈ 16.78 million colors, and an image at a resolution of 512 × 512 that allocates 24 bits per pixel occupies 786,432 bytes. That is why image compression is important. An important feature of image compression is that it can be lossy or lossless. A compressed image is acceptable provided the losses of image information are not perceived by the eye; it is possible to assume that a portion of this information is redundant. Lossless image compression is defined as mathematically decoding exactly the same image that was encoded. Lossy image compression needs to identify two features inside the image: the redundancy and the irrelevancy of the information. Thus, lossy compression modifies the image data in such a way that, when they are encoded and decoded, the recovered image is similar enough to the original one. How similar the recovered image must be to the original image is defined prior to the compression process, and it depends on the implementation to be performed. In lossy compression, current image compression schemes remove information considered irrelevant by using mathematical criteria. One of the problems of these schemes is that, although the numerical quality of the compressed image is low, it can show a high visual image quality, e.g. it does not show many visible artifacts. This is because the mathematical criteria used to remove information do not take into account whether the removed information is perceived by the Human Visual System. Therefore, the aim of an image compression scheme designed to obtain images that do not show artifacts, although their numerical quality may be low, is to eliminate the information that is not visible to the Human Visual System. Hence, this Ph.D. thesis proposes to exploit the visual redundancy existing in an image by reducing those features that are unperceivable by the Human Visual System. First, we define an image quality assessment which is highly correlated with psychophysical experiments performed by human observers. The proposed CwPSNR metric weights the well-known PSNR by using a particular perceptual low-level model of the Human Visual System, namely the Chromatic Induction Wavelet Model (CIWaM). Second, we propose an image compression algorithm (called Hi-SET) which exploits the high correlation and self-similarity of pixels in a given area or neighborhood by means of a fractal function. Hi-SET possesses the main features of modern image compressors, that is, it is an embedded coder, which allows progressive transmission. Third, we propose a perceptual quantizer (½SQ), which is a modification of the uniform scalar quantizer. The ½SQ is applied to the set of pixels in a certain wavelet sub-band, that is, a global quantization. Unlike this, the proposed modification allows a local pixel-by-pixel forward and inverse quantization, introducing into this process a perceptual distortion which depends on the spatial information surrounding the pixel. Combining the ½SQ method with the Hi-SET image compressor, we define a perceptual image compressor, called ©SET.
Finally, a coding method for Region of Interest areas is presented, ½GBbBShift, which perceptually weights pixels within these areas and maintains only the more important perceivable features in the rest of the image. Results presented in this report show that CwPSNR is the best-ranked image quality method when it is applied to the most common image compression distortions such as JPEG and JPEG2000. CwPSNR shows the best correlation with the judgement of human observers, which is based on the results of psychophysical experiments obtained for relevant image quality databases such as TID2008, LIVE, CSIQ and IVC. Furthermore, the Hi-SET coder obtains better results, both in compression ratio and in perceptual image quality, than the JPEG2000 coder and other coders that use a Hilbert fractal for image compression. Hence, when the proposed perceptual quantization is introduced into the Hi-SET coder, our compressor improves its numerical and perceptual efficiency. When the ½GBbBShift method applied to Hi-SET is compared against the MaxShift method applied to the JPEG2000 standard and to Hi-SET, the images coded by our ROI method obtain the best results when the overall image quality is estimated. Both the proposed perceptual quantization and the ½GBbBShift method are generalized algorithms that can be applied to other wavelet-based image compression algorithms such as JPEG2000, SPIHT or SPECK.  
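For context, the conventional PSNR that CwPSNR reweights is recalled below, together with a generic perceptually-weighted variant using a weight map w(x, y); the weighted form is only an illustration of the idea of perceptual weighting, and the actual CwPSNR weighting derived from CIWaM is not reproduced here.

    \mathrm{MSE} = \frac{1}{MN}\sum_{x=1}^{M}\sum_{y=1}^{N}\bigl(I(x,y)-\hat{I}(x,y)\bigr)^{2},
    \qquad
    \mathrm{PSNR} = 10\,\log_{10}\frac{L_{\max}^{2}}{\mathrm{MSE}},

    \mathrm{MSE}_{w} = \frac{\sum_{x,y} w(x,y)\,\bigl(I(x,y)-\hat{I}(x,y)\bigr)^{2}}{\sum_{x,y} w(x,y)},
    \qquad
    \mathrm{PSNR}_{w} = 10\,\log_{10}\frac{L_{\max}^{2}}{\mathrm{MSE}_{w}},

where I is the original image, \hat{I} the decoded image, and L_max the maximum pixel value (255 for 8-bit images).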
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Xavier Otazu  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-938351-3-2 Medium  
  Area Expedition Conference  
  Notes CIC Approved no  
  Call Number Admin @ si @ Mor2011 Serial 1786  
 

 
Author Javier Vazquez
  Title Colour Constancy in Natural Images Through Colour Naming and Sensor Sharpening Type Book Whole
  Year 2011 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Colour is derived from three physical properties: the incident light, the object reflectance and the sensor sensitivities. Incident light varies under natural conditions; hence, recovering the scene illuminant is an important issue in computational colour. One way to deal with this problem under calibrated conditions is to follow three steps: 1) building a narrow-band sensor basis to fulfil the diagonal model, 2) building a feasible set of illuminants, and 3) defining a criterion to select the best illuminant. In this work we focus on colour constancy for natural images by introducing perceptual criteria in the first and third stages. To deal with the illuminant selection step, we hypothesise that basic colour categories can be used as anchor categories to recover the best illuminant. These colour names are related to the way the human visual system has evolved to encode relevant natural colour statistics. Therefore the recovered image provides the best representation of the scene labelled with the basic colour terms. We demonstrate with several experiments how this selection criterion achieves current state-of-the-art results in computational colour constancy. In addition to this result, we psychophysically prove that the usual angular error used in colour constancy does not correlate with human preferences, and we propose a new perceptual colour constancy evaluation. The implementation of this selection criterion strongly relies on the use of a diagonal model for illuminant change. Consequently, the second contribution focuses on building an appropriate narrow-band sensor basis to represent natural images. We propose to use the spectral sharpening technique to compute a unique narrow-band basis optimised to represent a large set of natural reflectances under natural illuminants, given in the basis of the human cones. The proposed sensors allow predicting unique hues and the World Color Survey data independently of the illuminant by using a compact singularity function. Additionally, we studied different families of sharp sensors to minimise different perceptual measures. This study led us to extend the spherical sampling procedure from 3D to 6D. Several research lines still remain open. One natural extension would be to measure the effects of using the computed sharp sensors on the category hypothesis, while another might be to insert spatial contextual information to improve the category hypothesis. Finally, much work still needs to be done to explore how individual sensors can be adjusted to the colours in a scene.
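The diagonal model and the role of sensor sharpening mentioned above can be written compactly in standard von Kries / spectral sharpening notation; this is a reminder of the textbook formulation, not the thesis's derivation.

    \rho^{(2)} \approx \mathcal{D}\,\rho^{(1)}, \qquad \mathcal{D} = \operatorname{diag}(d_{1}, d_{2}, d_{3}),

and, with a sharpening transform T applied to the sensor responses, the diagonal model is imposed in the transformed space,

    T\rho^{(2)} \approx \mathcal{D}_{T}\,T\rho^{(1)} \quad\Longleftrightarrow\quad \rho^{(2)} \approx T^{-1}\mathcal{D}_{T}\,T\,\rho^{(1)},

where \rho^{(k)} denotes the sensor response of the same surface under illuminant k.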
 
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Maria Vanrell;Graham D. Finlayson  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes CIC Approved no  
  Call Number Admin @ si @ Vaz2011a Serial 1785  
 

 
Author Aura Hernandez-Sabate
  Title Exploring Arterial Dynamics and Structures in IntraVascular Ultrasound Sequences Type Book Whole
  Year 2009 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Cardiovascular diseases are a leading cause of death in developed countries. Most of them are caused by arterial (especially coronary) diseases, mainly caused by plaque accumulation. Such pathology narrows blood flow (stenosis) and affects the artery's bio-mechanical elastic properties (atherosclerosis). In the last decades, IntraVascular UltraSound (IVUS) has become a usual imaging technique for the diagnosis and follow-up of arterial diseases. IVUS is a catheter-based imaging technique which shows a sequence of cross-sections of the artery under study. Inspection of a single image gives information about the percentage of stenosis, while inspection of longitudinal views provides information about the artery's bio-mechanical properties, which can prevent a fatal outcome of the cardiovascular disease. On the one hand, the dynamics of arteries (due to heart pumping, among others) is a major artifact for exploring tissue bio-mechanical properties. On the other hand, manual stenosis measurements require a manual tracing of vessel borders, which is a time-consuming task and might suffer from inter-observer variations. This PhD thesis proposes several image processing tools for exploring vessel dynamics and structures. We present a physics-based model to extract, analyze and correct vessel in-plane rigid dynamics and to retrieve the cardiac phase. Furthermore, we introduce a deterministic-statistical method for automatic vessel border detection; in particular, we address adventitia layer segmentation. An accurate validation protocol to ensure reliable clinical applicability of the methods is a crucial step in any proposal of an algorithm. In this thesis we take special care in designing a validation protocol for each approach proposed, and we contribute to the in vivo dynamics validation with a quantitative and objective score to measure the amount of motion suppressed.  
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Debora Gil  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-937261-6-4 Medium  
  Area Expedition Conference  
  Notes IAM; Approved no  
  Call Number IAM @ iam @ Her2009 Serial 1543  
 

 
Author Debora Gil
  Title Geometric Differential Operators for Shape Modelling Type Book Whole
  Year 2004 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Medical imaging feeds research in many computer vision and image processing fields: image filtering, segmentation, shape recovery, registration, retrieval and pattern matching. Because of the low contrast changes and the large variety of artifacts and noise in medical images, processing techniques relying on an analysis of the geometry of image level sets, rather than on intensity values, result in more robust treatment. From the starting point of the treatment of intravascular images, this PhD thesis addresses the design of differential image operators based on geometric principles for robust shape modelling and restoration. Among all fields applying shape recovery, we approach filtering and segmentation of image objects. For a successful use in real images, the segmentation process should go through three stages: noise removal, shape modelling and shape recovery. This PhD thesis addresses all three topics, but, for the sake of algorithms as automated as possible, the image processing techniques are designed to satisfy three main principles: a) convergence of the iterative schemes to non-trivial states, avoiding image degeneration to a constant image and representing smooth models of the originals; b) smooth asymptotic behavior, ensuring stabilization of the iterative process; c) fixed parameter values, ensuring equal (domain-free) performance of the algorithms whatever the initial images/shapes. Our geometric approach to the generic equations that model the different processes enables defining techniques satisfying all the former requirements. First, we introduce a new curvature-based geometric flow for image filtering that achieves a good compromise between noise removal and resemblance to the original images. Second, we describe a new family of diffusion operators that restrict their scope to image level curves and serve to restore smooth closed models from unconnected sets of points. Finally, we design a regularization of snake (distance) maps that ensures their smooth convergence towards any closed shape. Experiments show that the performance of the proposed techniques surpasses that of state-of-the-art algorithms.  
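As a point of reference for "curvature-based geometric flow", the classic level-set formulation of mean curvature motion is recalled below; it is a standard example of the family of flows discussed, not necessarily the specific operator introduced in the thesis.

    \frac{\partial u}{\partial t} = |\nabla u|\,\operatorname{div}\!\left(\frac{\nabla u}{|\nabla u|}\right) = \kappa\,|\nabla u|,

where u is the evolving image (or level-set embedding function) and \kappa is the curvature of its level curves.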
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Barcelona (Spain) Editor Jordi Saludes i Closa;Petia Radeva  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 84-933652-0-3 Medium print  
  Area Expedition Conference  
  Notes IAM; Approved no  
  Call Number IAM @ iam @ GIL2004 Serial 1517  
 

 
Author Jaume Garcia
  Title Statistical Models of the Architecture and Function of the Left Ventricle Type Book Whole
  Year 2009 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Cardiovascular diseases, especially those affecting the Left Ventricle (LV), are the leading cause of death in developed countries, accounting for approximately 30% of all global deaths. In order to address this public health concern, physicians focus on diagnosis and therapy planning. On the one hand, early and accurate detection of Regional Wall Motion Abnormalities (RWMA) significantly contributes to a quick diagnosis and prevents the patient from reaching more severe stages. On the other hand, a thorough knowledge of the normal gross anatomy of the LV, as well as of the distribution of its muscular fibers, is crucial for designing specific interventions and therapies (such as pacemaker implantation). Statistical models obtained from the analysis of different imaging modalities allow the computation of the normal ranges of variation within a given population. Normality models are a valuable tool for the definition of objective criteria quantifying the degree of (anomalous) deviation of the LV function and anatomy for a given subject. The creation of statistical models involves addressing three main issues: extraction of data from images, definition of a common domain for comparison of data across patients, and design of appropriate statistical analysis schemes. In this PhD thesis we present generic image processing tools for the creation of statistical models of the LV anatomy and function. On the one hand, we use differential geometry concepts to define a computational framework (the Normalized Parametric Domain, NPD) suitable for the comparison and fusion of several clinical scores obtained over the LV. On the other hand, we present a variational approach (the Harmonic Phase Flow, HPF) for the estimation of myocardial motion that provides dense and continuous vector fields without overestimating motion at injured areas. These tools are used for the creation of statistical models. Regarding anatomy, we obtain an atlas jointly modelling both the LV gross anatomy and its fiber architecture. Regarding function, we compute normality patterns of scores characterizing the (global and local) LV function and explore, for the first time, the configuration of local scores best suited for RWMA detection.  
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Debora Gil  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes IAM Approved no  
  Call Number IAM @ iam @ Gar2009a Serial 1499  
 

 
Author Partha Pratim Roy
  Title Multi-Oriented and Multi-Scaled Text Character Analysis and Recognition in Graphical Documents and their Applications to Document Image Retrieval Type Book Whole
  Year 2010 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract With the advent of research in Document Image Analysis and Recognition (DIAR), an important line of research has been explored on the indexing and retrieval of graphics-rich documents. It aims at finding relevant documents by relying on the segmentation and recognition of the text and graphics components underlying non-standard layouts, where commercial OCRs cannot be applied due to their complexity. This thesis is focused on text information extraction approaches in graphical documents and on the retrieval of such documents using text information. Automatic text recognition in graphical documents (maps, engineering drawings, etc.) involves many challenges because text characters are usually printed in a multi-oriented and multi-scale way along with different graphical objects. Text characters are used to annotate graphical curve lines and hence they often follow curvi-linear paths too. For OCR of such documents, individual text lines and their corresponding words/characters need to be extracted. For the recognition of multi-font, multi-scale and multi-oriented characters, we have proposed a feature descriptor for character shape that uses angular information from contour pixels to ensure invariance. To improve the efficiency of OCR, an approach for the segmentation of multi-oriented touching strings into individual characters is also discussed: convex-hull-based background information is used to segment a touching string into possible primitive segments, and later these primitive segments are merged to obtain the optimum segmentation using dynamic programming. To overcome the touching/overlapping problem of text with graphical lines, a character spotting approach using SIFT and skeleton information is included. Afterwards, we propose a novel method to extract individual curvi-linear text lines using the foreground and background information of the text characters, where a water reservoir concept is used to exploit the background information. We have also formulated methodologies for graphical document retrieval applications using query words and seals. The retrieval approaches are performed using the recognition results of individual components in the document. Given a query text, the system extracts positional knowledge from the query word and uses it to generate hypothetical locations in the document. Indexing of documents is also performed based on automatic detection of seals from documents containing cluttered background. A seal is characterized by scale- and rotation-invariant spatial feature descriptors computed from labelled text characters, and a concept based on the Generalized Hough Transform is used to locate the seal in documents.
 
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Josep Llados;Umapada Pal  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-937261-7-1 Medium  
  Area Expedition Conference  
  Notes Approved no  
  Call Number Admin @ si @ Roy2010 Serial 1455  
 

 
Author Jose Manuel Alvarez
  Title Combining Context and Appearance for Road Detection Type Book Whole
  Year 2010 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Road traffic crashes have become a major cause of death and injury throughout the world. Hence, in order to improve road safety, automobile manufacturers are moving towards the development of vehicles with autonomous functionalities such as keeping to the right lane, keeping a safe distance between vehicles, or regulating the speed of the vehicle according to the traffic conditions. A key component of these systems is vision-based road detection, which aims to detect the free road surface ahead of the moving vehicle. Detecting the road using a monocular vision system is very challenging, since the road is an outdoor scenario imaged from a mobile platform. Hence, the detection algorithm must be able to deal with continuously changing imaging conditions such as the presence of different objects (vehicles, pedestrians), different environments (urban, highways, off-road), different road types (shape, color), and different imaging conditions (varying illumination, different viewpoints and changing weather conditions). Therefore, in this thesis, we focus on vision-based road detection using a single color camera. More precisely, we first focus on analyzing and grouping pixels according to their low-level properties. In this way, two different approaches are presented to exploit color and photometric invariance. Then, we focus the research of the thesis on exploiting context information. This information provides relevant knowledge about the road, not from pixel features of road regions but from semantic information derived from the analysis of the scene. In this way, we present two different approaches to infer the geometry of the road ahead of the moving vehicle. Finally, we focus on combining these context and appearance (color) approaches to improve the overall performance of road detection algorithms. The qualitative and quantitative results presented in this thesis on real-world driving sequences show that the proposed method is robust to varying imaging conditions, road types and scenarios, going beyond the state of the art.
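A very reduced sketch of appearance-based road detection is given below: a color model is learned from a seed region assumed to be road (e.g., a patch just in front of the vehicle) and every pixel is classified by its Mahalanobis distance to that model. The seed region, color space and threshold are placeholder assumptions; the thesis's actual contributions (photometric invariance and scene context) are not reproduced.

    import numpy as np

    def road_likelihood(img, seed_mask, dist_thr=3.0):
        """img: float array (H, W, 3); seed_mask: bool (H, W) region assumed to be road."""
        seed = img[seed_mask].reshape(-1, 3)
        mu = seed.mean(axis=0)
        cov = np.cov(seed, rowvar=False) + 1e-6 * np.eye(3)
        inv_cov = np.linalg.inv(cov)
        diff = img.reshape(-1, 3) - mu
        d2 = np.einsum('ij,jk,ik->i', diff, inv_cov, diff)   # squared Mahalanobis distance
        return (d2 < dist_thr ** 2).reshape(img.shape[:2])   # True where the pixel looks like road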
 
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Antonio Lopez;Theo Gevers  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-937261-8-8 Medium  
  Area Expedition Conference  
  Notes ADAS Approved no  
  Call Number Admin @ si @ Alv2010 Serial 1454  
 

 
Author Francisco Javier Orozco
  Title Human Emotion Evaluation on Facial Image Sequences Type Book Whole
  Year 2010 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Psychological evidence has emphasized the importance of affective behaviour understanding due to its high impact on today's interaction between humans and computers. All types of affective and behavioural patterns, such as gestures, emotions and mental states, are largely displayed through the face, head and body. Therefore, this thesis is focused on analysing affective behaviours on the head and face. To this end, head and facial movements are encoded by using appearance-based tracking methods. Specifically, a wise combination of deformable models captures rigid and non-rigid movements of different kinematics; 3D head pose, eyebrows, mouth, eyelids and irises are taken into account as the basis for extracting features from databases of video sequences. This approach combines the strengths of adaptive appearance models, optimization methods and backtracking techniques. For about thirty years, computer science has addressed the investigation of human emotions through the automatic recognition of the six prototypic emotions suggested by Darwin and systematized by Paul Ekman in the seventies. The Facial Action Coding System (FACS) uses discrete movements of the face (called Action Units or AUs) to code the six facial emotions named anger, disgust, fear, happiness/joy, sadness and surprise. However, human emotions are much more complex patterns that have not received the same attention from computer scientists. Simon Baron-Cohen proposed a new taxonomy of emotions and mental states without a coding system for the facial actions. These 426 affective behaviours are more challenging for the understanding of human emotions. Beyond classically classifying the six basic facial expressions, more subtle gestures, facial actions and spontaneous emotions are considered here. By assessing confidence on the recognition results and exploring spatial and temporal relationships of the features, several methods are combined and enhanced to develop a new taxonomy of expressions and emotions. The objective of this dissertation is to develop a computer vision system, including facial feature extraction, expression recognition and emotion understanding, by building a bottom-up reasoning process. Building a detailed taxonomy of human affective behaviours is an interesting challenge for head-face-based image analysis methods. In this thesis, we exploit the strengths of Canonical Correlation Analysis (CCA) to enhance an on-line head-face tracker. A relationship between head pose and local facial movements is studied according to their cognitive interpretation on affective expressions and emotions. Active Shape Models are synthesized for AAMs based on CCA regression. Head pose and facial actions are fused into a maximally correlated space in order to assess expressiveness, confidence and classification in a CBR system. The CBR solutions are also correlated to the cognitive features, which allows avoiding exhaustive search when recognizing new head-face features. Subsequently, Support Vector Machines (SVMs) and Bayesian Networks are applied for learning the spatial relationships of facial expressions. Similarly, the temporal evolution of facial expressions, emotions and mental states is analysed based on Factorized Dynamic Bayesian Networks (FaDBN). As a result, the bottom-up system recognizes six facial expressions, six basic emotions and six mental states, enhancing this categorization with confidence assessment at each level, intensity of expressions and a complete taxonomy.
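The CCA coupling of head pose and local facial movements mentioned above can be illustrated with scikit-learn's CCA estimator. The feature dimensions and the synthetic data below are placeholders for the tracker outputs, which are not reproduced; the point is only how two feature sets are projected into a maximally correlated space.

    import numpy as np
    from sklearn.cross_decomposition import CCA

    rng = np.random.default_rng(0)
    n_frames = 200
    head_pose = rng.normal(size=(n_frames, 3))                        # e.g. yaw, pitch, roll per frame
    facial_actions = np.hstack([head_pose @ rng.normal(size=(3, 5)),  # part correlated with pose
                                rng.normal(size=(n_frames, 2))])      # independent part

    cca = CCA(n_components=2)
    cca.fit(head_pose, facial_actions)
    U, V = cca.transform(head_pose, facial_actions)   # maximally correlated projections
    corr = [np.corrcoef(U[:, k], V[:, k])[0, 1] for k in range(2)]
    print(corr)   # high canonical correlations indicate a strong pose/action relationship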
 
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Jordi Gonzalez;Xavier Roca  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-936529-3-7 Medium  
  Area Expedition Conference  
  Notes Approved no  
  Call Number Admin @ si @ Oro2010 Serial 1335  