Author Akhil Gurram
  Title Monocular Depth Estimation for Autonomous Driving Type Book Whole
  Year 2022 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract 3D geometric information is essential for on-board perception in autonomous driving and driver assistance. Autonomous vehicles (AVs) are equipped with calibrated sensor suites. These suites include LiDARs, which are expensive active sensors in charge of providing the 3D geometric information. Depending on the operational conditions for the AV, calibrated stereo rigs may also be sufficient for obtaining 3D geometric information, and these rigs are less expensive and easier to install than LiDARs. However, ensuring proper maintenance and calibration of these types of sensors is not trivial. Accordingly, there is increasing interest in performing monocular depth estimation (MDE) to obtain 3D geometric information on-board. MDE is very appealing since it places appearance and depth in direct pixelwise correspondence without further calibration. Moreover, a set of single cameras with MDE capabilities would still be a cheap solution for on-board perception, relatively easy to integrate and maintain in an AV.
The best MDE models are based on Convolutional Neural Networks (CNNs) trained in a supervised manner, i.e., assuming pixelwise ground truth (GT). Accordingly, the overall goal of this PhD is to study methods for improving CNN-based MDE accuracy under different training settings. More specifically, this PhD addresses different research questions that are described below. When we started to work on this PhD, state-of-the-art methods for MDE were already based on CNNs. In fact, a promising line of work consisted of using image-based semantic supervision (i.e., pixel-level class labels) while training CNNs for MDE using LiDAR-based supervision (i.e., depth). It was common practice to assume that the same raw training data are complemented by both types of supervision, i.e., with depth and semantic labels. However, in practice, it was more common to find heterogeneous datasets with either only depth supervision or only semantic supervision. Therefore, our first work was to research whether we could train CNNs for MDE by leveraging depth and semantic information from heterogeneous datasets. We show that this is indeed possible, and we surpassed the state-of-the-art results on MDE at the time we did this research. To achieve our results, we proposed a particular CNN architecture and a new training protocol.
After this research, it was clear that the upper-bound setting to train CNN-based MDE models consists of using LiDAR data as supervision. However, it would be cheaper and more scalable if we were able to train such models from monocular sequences. Obviously, this is far more challenging, but worth researching. Training MDE models using monocular sequences is possible by relying on structure-from-motion (SfM) principles to generate self-supervision. Nevertheless, problems of camouflaged objects, visibility changes, static-camera intervals, textureless areas, and scale ambiguity diminish the usefulness of such self-supervision. To alleviate these problems, we perform MDE by virtual-world supervision and real-world SfM self-supervision. We call our proposal MonoDEVSNet. We compensate for the SfM self-supervision limitations by leveraging virtual-world images with accurate semantic and depth supervision, as well as addressing the virtual-to-real domain gap. MonoDEVSNet outperformed previous MDE CNNs trained on monocular and even stereo sequences. We have publicly released MonoDEVSNet at <https://github.com/HMRC-AEL/MonoDEVSNet>.
Finally, since MDE is performed to produce 3D information for use in downstream tasks related to on-board perception, we also address the question of whether the standard metrics for MDE assessment are a good indicator for future MDE-based driving-related perception tasks. By using 3D object detection on point clouds as a proxy for on-board perception, we conclude that, indeed, MDE evaluation metrics give rise to a ranking of methods which reflects relatively well the 3D object detection results we may expect.
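The abstract above does not name the standard MDE metrics it refers to. As a minimal sketch, assuming the metrics commonly reported in the MDE literature (absolute relative error, RMSE, and the δ < 1.25 threshold accuracy), the following Python illustrates how such metrics can produce a ranking of methods. The function names `depth_metrics` and `rank_methods`, and the choice to skip pixels with non-positive depth, are illustrative assumptions and are not taken from the thesis.

```python
import math


def depth_metrics(pred, gt, delta=1.25):
    """Compute abs-rel error, RMSE and threshold accuracy over valid pixels.

    pred, gt: flat sequences of depths in metres. Pixels whose ground truth
    or prediction is not positive are skipped (an assumption of this sketch,
    which also avoids division by zero).
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0 and p > 0]
    if not pairs:
        raise ValueError("no valid pixels")
    n = len(pairs)
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / n
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)
    # Fraction of pixels whose ratio to ground truth is within the threshold.
    acc = sum(1 for p, g in pairs if max(p / g, g / p) < delta) / n
    return {"abs_rel": abs_rel, "rmse": rmse, "delta_acc": acc}


def rank_methods(predictions, gt):
    """Rank method names from best to worst by abs_rel (lower is better)."""
    scores = {name: depth_metrics(pred, gt)["abs_rel"]
              for name, pred in predictions.items()}
    return sorted(scores, key=scores.get)
```

A ranking obtained this way could then be compared with the ranking given by a downstream task, such as the 3D object detection proxy mentioned in the abstract.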
 
  Address March, 2022  
  Corporate Author Thesis Ph.D. thesis  
  Publisher IMPRIMA Place of Publication Editor Antonio Lopez;Onay Urfalioglu  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-124793-0-0 Medium  
  Area Expedition Conference  
  Notes ADAS Approved no  
  Call Number Admin @ si @ Gur2022 Serial 3712  
 

 
Author Zhijie Fang
  Title Behavior understanding of vulnerable road users by 2D pose estimation Type Book Whole
  Year 2019 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Anticipating the intentions of vulnerable road users (VRUs) such as pedestrians and cyclists can be critical for performing safe and comfortable driving maneuvers. This is the case for human driving and, therefore, should be taken into account by systems providing any level of driving assistance, i.e., from advanced driver assistance systems (ADAS) to fully autonomous vehicles (AVs). In this PhD work, we show how the latest advances in monocular vision-based human pose estimation, i.e., those relying on deep Convolutional Neural Networks (CNNs), make it possible to recognize the intentions of such VRUs. In the case of cyclists, we assume that they follow the established traffic codes to indicate future left/right turns and stop maneuvers with arm signals. In the case of pedestrians, no indications can be assumed a priori. Instead, we hypothesize that the walking pattern of a pedestrian can allow us to determine whether he/she intends to cross the road in the path of the ego-vehicle, so that the ego-vehicle must maneuver accordingly (e.g., slowing down or stopping). In this PhD work, we show how the same methodology can be used for recognizing pedestrians' and cyclists' intentions. For pedestrians, we perform experiments on the publicly available Daimler and JAAD datasets. For cyclists, we did not find an analogous dataset; therefore, we created our own by acquiring and annotating corresponding video sequences, which we aim to share with the research community. Overall, the proposed pipeline provides new state-of-the-art results on the intention recognition of VRUs.
 
  Address May 2019  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Antonio Lopez;David Vazquez  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-948531-6-6 Medium  
  Area Expedition Conference  
  Notes ADAS; 600.118 Approved no  
  Call Number Admin @ si @ Fan2019 Serial 3388  
 

 
Author Vacit Oguz Yazici
  Title Towards Smart Fashion: Visual Recognition of Products and Attributes Type Book Whole
  Year 2022 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Artificial intelligence is innovating the fashion industry by proposing new applications and solutions to the problems encountered by researchers and engineers working in the industry. In this thesis, we address three of these problems. In the first part of the thesis, we tackle the problem of multi-label image classification, which is closely related to fashion attribute recognition. In the second part of the thesis, we address two problems that are specific to fashion. Firstly, we address the problem of main product detection, which is the task of associating correct image parts (e.g., bounding boxes) with the fashion product being sold. Secondly, we address the problem of color naming for multicolored fashion items. The task of multi-label image classification consists of assigning various concepts such as objects or attributes to images. Usually, there are dependencies that can be learned between the concepts to capture label correlations (chair and table classes are more likely to co-exist than chair and giraffe).
If we treat the multi-label image classification problem as an orderless set prediction problem, we can exploit recurrent neural networks (RNNs) to capture label correlations. However, RNNs are trained to predict ordered sequences of tokens, so if the order of the predicted sequence differs from the order of the ground truth sequence, the model is penalized even though the predictions are correct. Therefore, in the first part of the thesis, we propose an orderless loss function which orders the labels in the ground truth sequence dynamically so that the minimum loss is achieved. This results in a significant improvement of RNN models on multi-label image classification over the previous methods.
However, RNNs suffer from long-term dependency problems when the cardinality of the set grows. The decoding process might stop early if the current hidden state cannot find any object and outputs the termination token. This would cause the remaining classes not to be predicted and would lower the recall. Transformers can be used to avoid the long-term dependency problem by exploiting their self-attention modules, which process sequential data simultaneously. Consequently, we propose a novel transformer model for multi-label image classification which surpasses the state-of-the-art results by a large margin.
In the second part of the thesis, we focus on two fashion-specific problems. Main product detection is the task of associating image parts with the fashion product that is being sold, generally using associated textual metadata (product title or description). Normally, in fashion e-commerce, products are represented by multiple images where a person wears the product along with other fashion items. If all the fashion items in the images are marked with bounding boxes, we can use the textual metadata to decide which item is the main product. The initial work treated each of these images independently, discarding the fact that they all belong to the same product. In this thesis, we represent the bounding boxes from all the images as nodes in a fully connected graph. This allows the algorithm to learn relations between the nodes during training and take the entire context into account for the final decision. Our algorithm results in a significant improvement over the state of the art.
Moreover, we address the problem of color naming for multicolored fashion items, which is a challenging task due to external factors such as illumination changes or objects that act as clutter. In the context of multi-label classification, the vaguely defined boundaries between the classes in the color space cause ambiguity. For example, a shade of blue which is very close to green might cause the model to incorrectly predict the colors blue and green at the same time. Based on this, models trained for color naming are expected to recognize the colors and their quantities in both single-colored and multicolored fashion items. Therefore, in this thesis, we propose a novel architecture with an additional head that explicitly estimates the number of colors in fashion items. This removes the ambiguity problem and results in better color naming performance.
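The orderless loss idea from the first part of the abstract above (reordering the ground truth labels so that the sequence loss is minimal) can be illustrated with a minimal sketch. It assumes per-step label probabilities are already available, ignores the termination token, and searches orderings by brute force, which is only practical for small label sets. The names `sequence_loss` and `orderless_loss` are hypothetical, and this is not the thesis implementation; for larger sets, an assignment solver could replace the brute-force search.

```python
import math
from itertools import permutations


def sequence_loss(step_probs, labels):
    """Sum of per-step cross-entropy terms for one ordering of the labels."""
    return -sum(math.log(step_probs[t][label])
                for t, label in enumerate(labels))


def orderless_loss(step_probs, gt_labels):
    """Return (loss, order) for the ground-truth ordering with minimum loss.

    step_probs[t] maps each label to the probability predicted at step t and
    must have at least as many steps as there are ground-truth labels.
    """
    best = min(permutations(gt_labels),
               key=lambda order: sequence_loss(step_probs, order))
    return sequence_loss(step_probs, best), list(best)
```

With a fixed ground truth order, a model that predicts the correct labels in a different order is penalized; taking the minimum over orderings removes that penalty.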
 
  Address January 2022  
  Corporate Author Thesis Ph.D. thesis  
  Publisher IMPRIMA Place of Publication Editor Joost Van de Weijer;Arnau Ramisa  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-122714-6-1 Medium  
  Area Expedition Conference  
  Notes LAMP Approved no  
  Call Number Admin @ si @ Ogu2022 Serial 3631  
 

 
Author David Roche
  Title A Statistical Framework for Terminating Evolutionary Algorithms at their Steady State Type Book Whole
  Year 2015 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract As with any iterative technique, a stop criterion is a necessary condition for terminating Evolutionary Algorithms (EAs). In the case of optimization methods, the algorithm should stop once it has reached a steady state and cannot improve results anymore. Assessing the reliability of termination conditions for EAs is of prime importance. A wrong or weak stop criterion can negatively affect both the computational effort and the final result.
In this thesis, we introduce a statistical framework for assessing whether a termination condition is able to stop an EA at its steady state. On the one hand, we present a numeric approximation to steady states that detects the point at which the EA population has lost its diversity, for use in EA termination. This approximation has been applied to different EA paradigms based on diversity and to a selection of functions covering the properties most relevant for EA convergence. Experiments show that our condition works regardless of the search space dimension and function landscape, and Differential Evolution (DE) arises as the best paradigm. On the other hand, we use a regression model to determine the requirements ensuring that a measure derived from the evolving EA population is related to the distance to the optimum in x-space.
Our theoretical framework is analyzed across several benchmark test functions and two standard termination criteria, based on function improvement in f-space and on the EA population x-space distribution, for the DE paradigm. Results validate our statistical framework as a powerful tool for determining the capability of a measure for terminating an EA, and select the x-space distribution as the best suited for accurately stopping DE in real-world applications.
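The two kinds of standard criteria mentioned in the abstract above, one based on f-space improvement and one based on the x-space distribution of the population, can be illustrated as follows. This is a minimal sketch under assumed definitions: the spread measure (largest distance to the centroid), the window length, and the tolerances are hypothetical choices for illustration, not the measures or the statistical framework proposed in the thesis.

```python
import math


def x_space_spread(population):
    """Largest Euclidean distance from any individual to the centroid."""
    dim = len(population[0])
    centroid = [sum(ind[d] for ind in population) / len(population)
                for d in range(dim)]
    return max(math.dist(ind, centroid) for ind in population)


def x_space_converged(population, tol=1e-6):
    """True when the population has collapsed in x-space (diversity lost)."""
    return x_space_spread(population) < tol


def f_space_stalled(best_history, window=10, tol=1e-8):
    """True when the best objective value changed by less than tol
    over the last `window` generations."""
    if len(best_history) <= window:
        return False  # not enough generations observed yet
    return abs(best_history[-window - 1] - best_history[-1]) < tol
```

Either predicate could be checked once per generation inside an EA loop; which of them stops closer to the true steady state is the kind of question the framework in the abstract is designed to answer.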
 
  Address July 2015  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Debora Gil;Jesus Giraldo  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes IAM; 600.075 Approved no  
  Call Number Admin @ si @ Roc2015 Serial 2686  
 

 
Author David Geronimo
  Title A Global Approach to Vision-Based Pedestrian Detection for Advanced Driver Assistance Systems Type Book Whole
  Year 2010 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract At the beginning of the 21st century, traffic accidents have become a major problem not only for developed countries but also for emerging ones. As in other scientific areas in which Artificial Intelligence is becoming a key actor, advanced driver assistance systems, and specifically pedestrian protection systems based on Computer Vision, are becoming a strong topic of research aimed at improving the safety of pedestrians. However, the challenge is of considerable complexity due to the varying appearance of humans (e.g., clothes, size, aspect ratio, shape), the dynamic nature of on-board systems and the unstructured moving environments that urban scenarios represent. In addition, the required performance is demanding both in terms of computational time and detection rates. In this thesis, instead of focusing on improving specific tasks as is frequent in the literature, we present a global approach to the problem. Such a global overview starts with the proposal of a generic architecture to be used as a framework both to review the literature and to organize the studied techniques throughout the thesis. We then focus the research on tasks such as foreground segmentation, object classification and refinement, following a general viewpoint and exploring aspects that are not usually analyzed. In order to perform the experiments, we also present a novel pedestrian dataset that consists of three subsets, each one addressed to the evaluation of a different specific task in the system. The results presented in this thesis not only end with a proposal of a pedestrian detection system but also go one step beyond by pointing out new insights, formalizing existing and proposed algorithms, introducing new techniques and evaluating their performance, which we hope will provide new foundations for future research in the area.
  Address Antonio Lopez;Krystian Mikolajczyk;Jaume Amores;Dariu M. Gavrila;Oriol Pujol;Felipe Lumbreras  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Antonio Lopez;Krystian Mikolajczyk;Jaume Amores;Dariu M. Gavrila;Oriol Pujol;Felipe Lumbreras  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-936529-5-1 Medium  
  Area Expedition Conference  
  Notes ADAS Approved no  
  Call Number ADAS @ adas @ Ger2010 Serial 1279  
 

 
Author Meysam Madadi
  Title Human Segmentation, Pose Estimation and Applications Type Book Whole
  Year 2017 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Automatically analyzing humans in photographs or videos has great potential applications in computer vision, including medical diagnosis, sports, entertainment, movie editing and surveillance, just to name a few. The body, face and hand are the most studied components of humans. The body has many variabilities in shape and clothing along with high degrees of freedom in pose. The face has many muscles causing many visible deformations, besides variable shape and hairstyle. The hand is a small object that moves fast and has high degrees of freedom. Adding human characteristics to all the aforementioned variabilities makes human analysis quite a challenging task.
In this thesis, we developed human segmentation in different modalities. In a first scenario, we segmented the human body and hand in depth images using example-based shape warping. We developed a shape descriptor based on shape context and class probabilities of shape regions to extract nearest neighbors. We then considered rigid affine alignment vs. nonrigid iterative shape warping. In a second scenario, we segmented the face in RGB images using convolutional neural networks (CNNs). We modeled a conditional random field with recurrent neural networks. In our model, pair-wise kernels are not fixed but learned during training. We trained the network end-to-end using adversarial networks, which improved hair segmentation by a high margin.
We also worked on 3D hand pose estimation in depth images. In a generative approach, we fitted a finger model separately for each finger based on our example-based rigid hand segmentation. We minimized an energy function based on overlapping area, depth discrepancy and finger collisions. We also applied linear models in joint trajectory space to refine occluded joints based on the error of visible joints and the trajectory smoothness of invisible joints. In a CNN-based approach, we developed a tree-structured network to train specific features for each finger and fused them for global pose consistency. We also formulated physical and appearance constraints as loss functions.
Finally, we developed a number of applications consisting of human soft biometrics measurement and garment retexturing. We also generated several datasets in this thesis, covering human segmentation, synthetic hand pose, garment retexturing and Italian gestures.
 
  Address October 2017  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Sergio Escalera;Jordi Gonzalez  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-945373-3-2 Medium  
  Area Expedition Conference  
  Notes HUPBA Approved no  
  Call Number Admin @ si @ Mad2017 Serial 3017  
 

 
Author Agata Lapedriza
  Title Multitask Learning Techniques for Automatic Face Classification Type Book Whole
  Year 2009 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Automatic face classification is currently a popular research area in Computer Vision. It involves several subproblems, such as subject recognition, gender classification or subject verification.

Current systems of automatic face classification need a large amount of training data to robustly learn a task. However, the collection of labeled data is usually difficult. For this reason, research on methods that are able to learn from a small training set is essential.

The dependency on the abundance of training data is not so evident in human learning processes. We are able to learn from a very small number of examples, given that we use, additionally, some prior knowledge to learn a new task. For example, we frequently find patterns and analogies from other domains to reuse them in new situations, or exploit training data from other experiences.

In computer science, Multitask Learning is a new Machine Learning approach that studies this idea of knowledge transfer among different tasks, to overcome the effects of the small sample size problem.

This thesis explores, proposes and tests some Multitask Learning methods specifically developed for face classification purposes. Moreover, it presents two more contributions dealing with the small sample size problem, outside the Multitask Learning context. The first one is a method to extract external face features, to be used as an additional information source in automatic face classification problems. The second one is an empirical study on the most suitable face image resolution to perform automatic subject recognition.
 
  Address Barcelona (Spain)  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Jordi Vitria;David Masip  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes OR;MV Approved no  
  Call Number BCNPCL @ bcnpcl @ Lap2009 Serial 1263  
 

 
Author Bhaskar Chakraborty
  Title Model free approach to human action recognition Type Book Whole
  Year 2012 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Automatic understanding of human activity and action is a very important and challenging research area of Computer Vision with wide applications in video surveillance, motion analysis, virtual reality interfaces, video indexing, content-based video retrieval, HCI and health care. This thesis presents a series of techniques to solve the problem of human action recognition in video. The first approach towards this goal is based on a probabilistic optimization model of body parts using a Hidden Markov Model. This strong model-based approach is able to distinguish between similar actions by only considering the body parts having major contributions to the actions. In the next approach, we apply a weak model-based human detector, and actions are represented by a Bag-of-key-poses model to capture the human pose changes during the actions. To tackle the problem of human action recognition in complex scenes, a selective spatio-temporal interest point (STIP) detector is proposed by using a mechanism similar to that of the non-classical receptive field inhibition that is exhibited by most orientation-selective neurons in the primary visual cortex. An extension of the selective STIP detector is applied to a multi-view action recognition system by introducing novel 4D STIPs (3D space + time). Finally, we use our STIP detector on a large-scale continuous visual event recognition problem and propose a novel generalized max-margin Hough transformation framework for activity detection.
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Jordi Gonzalez;Xavier Roca  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ISE Approved no  
  Call Number Admin @ si @ Cha2012 Serial 2207  
 

 
Author Felipe Codevilla
  Title On Building End-to-End Driving Models Through Imitation Learning Type Book Whole
  Year 2019 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Autonomous vehicles are now considered an assured asset in the future. Practically all the relevant car-makers are now in a race to produce fully autonomous vehicles. These car-makers usually make use of modular pipelines for designing autonomous vehicles. This strategy decomposes the problem into a variety of tasks such as object detection and recognition, semantic and instance segmentation, depth estimation, SLAM and place recognition, as well as planning and control. Each module requires a separate set of expert algorithms, which are costly, especially in the amount of human labor and the need for data labelling. An alternative that has recently attracted considerable interest is end-to-end driving. In the end-to-end driving paradigm, perception and control are learned simultaneously using a deep network. These sensorimotor models are typically obtained by imitation learning from human demonstrations. The main advantage is that this approach can directly learn from large fleets of human-driven vehicles without requiring a fixed ontology and extensive amounts of labeling. However, scaling end-to-end driving methods to behaviors more complex than simple lane keeping or lead vehicle following remains an open problem. In this thesis, in order to achieve more complex behaviours, we address some issues that arise when creating end-to-end driving systems through imitation learning. The first of them is the need for an environment for algorithm evaluation and collection of driving demonstrations. On this matter, we participated in the creation of the CARLA simulator, an open source platform built from the ground up for autonomous driving validation and prototyping. Since the end-to-end approach is purely reactive, there is also the need to provide an interface with a global planning system. To this end, we propose conditional imitation learning, which conditions the actions produced on a high-level command. Evaluation is also a concern and is commonly performed by comparing the end-to-end network output to some pre-collected driving dataset. We show that this is surprisingly weakly correlated with actual driving, and we propose strategies for better data acquisition and a better comparison strategy. Finally, we confirm well-known generalization issues (due to dataset bias and overfitting), new ones (due to dynamic objects and the lack of a causal model), and training instability; these problems require further research before end-to-end driving through imitation can scale to real-world driving.
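The idea of conditioning the produced actions on a high-level command, as described in the abstract above, can be illustrated with a toy sketch. It assumes one linear output head per command, selected at run time, and a squared-error imitation loss against a demonstrated action. The names `make_conditional_policy` and `imitation_loss`, the command strings, and the linear heads are all hypothetical choices for illustration; the thesis uses deep networks and this sketch does not reproduce its architecture.

```python
def make_conditional_policy(heads):
    """Build a policy whose output head is selected by the high-level command.

    heads maps a command name to (steer_weights, throttle_weights). In a real
    system these would be learned from demonstrations, not set by hand.
    """
    def policy(features, command):
        steer_w, throttle_w = heads[command]  # KeyError on unknown command
        steer = sum(w * f for w, f in zip(steer_w, features))
        throttle = sum(w * f for w, f in zip(throttle_w, features))
        return steer, throttle
    return policy


def imitation_loss(policy, features, command, expert_action):
    """Squared error between the policy's action and the demonstrated one."""
    steer, throttle = policy(features, command)
    return ((steer - expert_action[0]) ** 2
            + (throttle - expert_action[1]) ** 2)
```

The same perceptual features can thus yield different actions depending on the command supplied by a global planner, for example at an intersection where both turning and going straight are valid.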
 
  Address May 2019  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Antonio Lopez  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS; 600.118 Approved no  
  Call Number Admin @ si @ Cod2019 Serial 3387  
 

 
Author Aura Hernandez-Sabate
  Title Exploring Arterial Dynamics and Structures in IntraVascular Ultrasound Sequences Type Book Whole
  Year 2009 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Cardiovascular diseases are a leading cause of death in developed countries. Most of them are caused by arterial (especially coronary) diseases, mainly caused by plaque accumulation. Such pathology narrows blood flow (stenosis) and affects artery bio-mechanical elastic properties (atherosclerosis). In the last decades, IntraVascular UltraSound (IVUS) has become a common imaging technique for the diagnosis and follow-up of arterial diseases. IVUS is a catheter-based imaging technique which shows a sequence of cross sections of the artery under study. Inspection of a single image gives information about the percentage of stenosis. Meanwhile, inspection of longitudinal views provides information about artery bio-mechanical properties, which can prevent a fatal outcome of the cardiovascular disease. On the one hand, the dynamics of arteries (due to heart pumping, among others) is a major artifact for exploring tissue bio-mechanical properties. On the other hand, manual stenosis measurements require manual tracing of vessel borders, which is a time-consuming task and might suffer from inter-observer variations. This PhD thesis proposes several image processing tools for exploring vessel dynamics and structures. We present a physics-based model to extract, analyze and correct vessel in-plane rigid dynamics and to retrieve cardiac phase. Furthermore, we introduce a deterministic-statistical method for automatic vessel border detection. In particular, we address adventitia layer segmentation. An accurate validation protocol to ensure reliable clinical applicability of the methods is a crucial step in any proposal of an algorithm. In this thesis we take special care in designing a validation protocol for each approach proposed, and we contribute to the in vivo dynamics validation with a quantitative and objective score to measure the amount of motion suppressed.
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Debora Gil  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-937261-6-4 Medium  
  Area Expedition Conference  
  Notes IAM; Approved no  
  Call Number IAM @ iam @ Her2009 Serial 1543  
 

 
Author Jaume Garcia
  Title Statistical Models of the Architecture and Function of the Left Ventricle Type Book Whole
  Year 2009 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Cardiovascular diseases, especially those affecting the Left Ventricle (LV), are the leading cause of death in developed countries, accounting for approximately 30% of all global deaths. In order to address this public health concern, physicians focus on diagnosis and therapy planning. On the one hand, early and accurate detection of Regional Wall Motion Abnormalities (RWMA) significantly contributes to a quick diagnosis and prevents the patient from reaching more severe stages. On the other hand, a thorough knowledge of the normal gross anatomy of the LV, as well as the distribution of its muscular fibers, is crucial for designing specific interventions and therapies (such as pacemaker implantation). Statistical models obtained from the analysis of different imaging modalities allow the computation of the normal ranges of variation within a given population. Normality models are a valuable tool for the definition of objective criteria quantifying the degree of (anomalous) deviation of the LV function and anatomy for a given subject. The creation of statistical models involves addressing three main issues: extraction of data from images, definition of a common domain for comparison of data across patients, and design of appropriate statistical analysis schemes. In this PhD thesis we present generic image processing tools for the creation of statistical models of the LV anatomy and function. On the one hand, we use differential geometry concepts to define a computational framework (the Normalized Parametric Domain, NPD) suitable for the comparison and fusion of several clinical scores obtained over the LV. On the other hand, we present a variational approach (the Harmonic Phase Flow, HPF) for the estimation of myocardial motion that provides dense and continuous vector fields without overestimating motion at injured areas. These tools are used for the creation of statistical models. Regarding anatomy, we obtain an atlas jointly modelling both LV gross anatomy and fiber architecture. Regarding function, we compute normality patterns of scores characterizing the (global and local) LV function and explore, for the first time, the configuration of local scores better suited for RWMA detection.
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Debora Gil  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes IAM Approved no  
  Call Number IAM @ iam @ Gar2009a Serial 1499  
Permanent link to this record
 

 
Author Jorge Bernal
  Title Polyp Localization and Segmentation in Colonoscopy Images by Means of a Model of Appearance for Polyps Type Book Whole
  Year 2012 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract (up) Colorectal cancer is the fourth most common cause of cancer death worldwide, and its survival rate depends on the stage at which it is detected, hence the necessity for early colon screening. There are several screening techniques, but colonoscopy is still the gold standard, although it has some drawbacks such as the miss rate. Our contribution, in the field of intelligent systems for colonoscopy, aims at providing a polyp localization system and a polyp segmentation system based on a model of appearance for polyps. To develop both methods we define a model of appearance for polyps, which describes a polyp as enclosed by intensity valleys. The novelty of our contribution resides in the fact that we include in our model aspects of the image formation and we also consider the presence of other elements of the endoluminal scene, such as specular highlights and blood vessels, which have an impact on the performance of our methods. To develop our polyp localization method we accumulate valley information to generate energy maps, which are also used to guide the polyp segmentation. Our methods achieve promising results in polyp localization and segmentation. As we want to explore the usability of our methods, we present a comparative analysis between physicians' fixations, obtained via an eye tracking device, and our polyp localization method. The results show that our method is indistinguishable from novice physicians, although it remains far from expert physicians.
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor F. Javier Sanchez;Fernando Vilariño  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area 800 Expedition Conference  
  Notes MV Approved no  
  Call Number Admin @ si @ Ber2012 Serial 2211  
Permanent link to this record
 

 
Author Joan M. Nuñez
  Title Vascular Pattern Characterization in Colonoscopy Images Type Book Whole
  Year 2015 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract (up) Colorectal cancer is the third most common cancer worldwide and the second most common malignant tumor in Europe. Screening tests have been shown to be very effective in increasing survival rates since they allow early detection of polyps. Among the different screening techniques, colonoscopy is considered the gold standard, although clinical studies mention several problems that have an impact on the quality of the procedure. Navigation through the rectum and colon tract can be challenging for physicians, which can increase polyp miss rates. Thorough visualization of the colon tract must be ensured so that the chances of missing lesions are minimized. The visual analysis of colonoscopy images can provide important information to physicians and support their navigation during the procedure.
Blood vessels and their branching patterns can provide descriptive power to potentially develop biometric markers. Anatomical markers based on blood vessel patterns could be used to identify a particular scene in colonoscopy videos and to support endoscope navigation by generating a sequence of ordered scenes through the different colon sections. By verifying the presence of vascular content in the endoluminal scene it is also possible to certify a proper inspection of the colon mucosa and to improve polyp localization. Considering the potential uses of blood vessel description, this contribution studies the characterization of the vascular content and the analysis of the descriptive power of its branching patterns.
Blood vessel characterization in colonoscopy images is shown to be a challenging task. The endoluminal scene is composed of several elements whose similar characteristics hinder the development of particular models for each of them. To overcome such difficulties we propose the use of blood vessel branching characteristics as key features for pattern description. We present a model to characterize junctions in binary patterns. The implementation of the junction model allows us to develop a junction localization method. We created two data sets including manually labeled vessel information as well as manual ground truths of two types of keypoint landmarks: junctions and endpoints. The proposed method outperforms the available algorithms in the literature in experiments on both our newly created colon vessel data set and the DRIVE retinal fundus image data set. In the latter case, we created a manual ground truth of junction coordinates. Since we want to explore the descriptive potential of junctions and vessels, we propose a graph-based approach to create anatomical markers. In the context of polyp localization, we present a new method to inhibit the influence of blood vessels in the extraction of valley-profile information. The results show that our methodology decreases vessel influence, increases polyp information and leads to an improvement in state-of-the-art polyp localization performance. We also propose a polyp-specific segmentation method that outperforms other general and specific approaches.
 
  Address November 2015  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Fernando Vilariño  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-943427-6-9 Medium  
  Area Expedition Conference  
  Notes MV Approved no  
  Call Number Admin @ si @ Nuñ2015 Serial 2709  
Permanent link to this record
 

 
Author Javier Vazquez
  Title Colour Constancy in Natural Images Through Colour Naming and Sensor Sharpening Type Book Whole
  Year 2011 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract (up) Colour is derived from three physical properties: incident light, object reflectance and sensor sensitivities. Incident light varies under natural conditions; hence, recovering the scene illuminant is an important issue in computational colour. One way to deal with this problem under calibrated conditions is by following three steps: 1) building a narrow-band sensor basis to accomplish the diagonal model, 2) building a feasible set of illuminants, and 3) defining criteria to select the best illuminant. In this work we focus on colour constancy for natural images by introducing perceptual criteria in the first and third stages.
To deal with the illuminant selection step, we hypothesise that basic colour categories can be used as anchor categories to recover the best illuminant. These colour names are related to the way that the human visual system has evolved to encode relevant natural colour statistics. Therefore the recovered image provides the best representation of the scene labelled with the basic colour terms. We demonstrate with several experiments how this selection criterion achieves current state-of-the-art results in computational colour constancy. In addition to this result, we psychophysically prove that the usual angular error used in colour constancy does not correlate with human preferences, and we propose a new perceptual colour constancy evaluation.
The implementation of this selection criterion strongly relies on the use of a diagonal model for illuminant change. Consequently, the second contribution focuses on building an appropriate narrow-band sensor basis to represent natural images. We propose to use the spectral sharpening technique to compute a unique narrow-band basis optimised to represent a large set of natural reflectances under natural illuminants and given in the basis of human cones. The proposed sensors allow predicting unique hues and the World Colour Survey data independently of the illuminant by using a compact singularity function. Additionally, we studied different families of sharp sensors to minimise different perceptual measures. This study brought us to extend the spherical sampling procedure from 3D to 6D.
Several research lines still remain open. One natural extension would be to measure the effects of using the computed sharp sensors on the category hypothesis, while another might be to insert spatial contextual information to improve the category hypothesis. Finally, much work still needs to be done to explore how individual sensors can be adjusted to the colours in a scene.
 
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Maria Vanrell;Graham D. Finlayson  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes CIC Approved no  
  Call Number Admin @ si @ Vaz2011a Serial 1785  
Permanent link to this record
 

 
Author Marc Masana
  Title Lifelong Learning of Neural Networks: Detecting Novelty and Adapting to New Domains without Forgetting Type Book Whole
  Year 2020 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract (up) Computer vision has gone through considerable changes in the last decade as neural networks have come into common use. As available computational capabilities have grown, neural networks have achieved breakthroughs in many computer vision tasks, and have even surpassed human performance in others. With accuracy being so high, focus has shifted to other issues and challenges. One research direction that has seen a notable increase in interest is lifelong learning systems. Such systems should be capable of efficiently performing tasks, identifying and learning new ones, and should moreover be able to deploy smaller versions of themselves which are experts on specific tasks. In this thesis, we contribute to research on lifelong learning and address the compression and adaptation of networks to small target domains, the incremental learning of networks faced with a variety of tasks, and finally the detection of out-of-distribution samples at inference time.

We explore how knowledge can be transferred from large pretrained models to more task-specific networks capable of running on smaller devices by extracting the most relevant information. Using a pretrained model provides more robust representations and a more stable initialization when learning a smaller task, which leads to higher performance and is known as domain adaptation. However, those models are too large for certain applications that need to be deployed on devices with limited memory and computational capacity. In this thesis we show that, after performing domain adaptation, some learned activations barely contribute to the predictions of the model. Therefore, we propose to apply network compression based on low-rank matrix decomposition using the activation statistics. This results in a significant reduction of the model size and the computational cost.

Like human intelligence, machine intelligence aims to have the ability to learn and remember knowledge. However, when a trained neural network is presented with a new task to learn, it ends up forgetting previous ones. This is known as catastrophic forgetting and its avoidance is studied in continual learning. The work presented in this thesis extensively surveys continual learning techniques and presents an approach to avoid catastrophic forgetting in sequential task learning scenarios. Our technique is based on using ternary masks to update a network for new tasks, reusing the knowledge of previous ones while not forgetting anything about them. In contrast to earlier work, our masks are applied to the activations of each layer instead of the weights. This considerably reduces the number of parameters to be added for each new task. Furthermore, the analysis of a wide range of work on incremental learning without access to the task-ID provides insight into current state-of-the-art approaches that focus on avoiding catastrophic forgetting by using regularization, rehearsal of previous tasks from a small memory, or compensating for the task-recency bias.

Neural networks trained with a cross-entropy loss force the outputs of the model to tend toward a one-hot encoded vector. This leads to models being overly confident when presented with images or classes that were not present in the training distribution. The capacity of a system to be aware of the boundaries of the learned tasks and identify anomalies or classes which have not been learned yet is key to lifelong learning and autonomous systems. In this thesis, we present a metric learning approach to out-of-distribution detection that learns the task at hand on an embedding space.
 
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Joost Van de Weijer;Andrew Bagdanov  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-121011-9-5 Medium  
  Area Expedition Conference  
  Notes LAMP; 600.120 Approved no  
  Call Number Admin @ si @ Mas20 Serial 3481  
Permanent link to this record