Author: Angel Sappa; George A. Triantafyllidis
Title: Computer Graphics and Imaging
Type: Book Whole
Year: 2012
Publication: Computer Graphics and Imaging
Address: Crete, Greece
ISBN: 978-0-88986-921-9
Notes: ADAS
Approved: no
Call Number: Admin @ si @ Sap2012
Serial: 2067
 

 
Author: Josep Llados; W. Liu; Jean-Marc Ogier
Title: Seventh IAPR International Workshop on Graphics Recognition, GREC 2007
Type: Book Whole
Year: 2007
Address: Curitiba (Brazil)
Notes: DAG
Approved: no
Call Number: DAG @ dag @ LLO2007
Serial: 835
 

 
Author: Liu Wenyin; Josep Llados; Jean-Marc Ogier
Title: Graphics Recognition. Recent Advances and New Opportunities
Type: Book Whole
Year: 2008
Publication: 7th International Workshop, Selected Papers
Volume: 5046
Series Title: LNCS
Address: Curitiba (Brazil)
ISBN: 978-3-540-88184-1
Conference: GREC
Notes: DAG
Approved: no
Call Number: DAG @ dag @ WLO2008
Serial: 1012
 

 
Author: David Masip
Title: Face Classification Using Discriminative Features and Classifier Combination
Type: Book Whole
Year: 2005
Publication: PhD Thesis, Universitat Autonoma de Barcelona-CVC
Thesis: Ph.D. thesis
Editor: Jordi Vitria
Address: CVC (UAB)
ISBN: 84-933652-3-8
Notes: OR; MV
Approved: no
Call Number: Admin @ si @ Mas2005b
Serial: 602
 

 
Author: Misael Rosales
Title: A Physics-Based Image Modelling of IVUS as a Geometric and Kinematic System
Type: Book Whole
Year: 2005
Publication: PhD Thesis, Universitat Autonoma de Barcelona-CVC
Thesis: Ph.D. thesis
Editor: Petia Radeva
Address: CVC (UAB)
ISBN: 978-84-922529-8-7
Approved: no
Call Number: Admin @ si @ Ros2005
Serial: 603
 

 
Author: Fernando Vilariño
Title: A Machine Learning Approach for Intestinal Motility Assessment with Capsule Endoscopy
Type: Book Whole
Year: 2006
Publication: PhD Thesis, Universitat Autonoma de Barcelona-CVC
Abstract: Intestinal motility assessment with video capsule endoscopy arises as a novel and challenging clinical fieldwork. This technique is based on the analysis of the patterns of intestinal contractions obtained by labelling all the motility events present in a video provided by a capsule with a wireless micro-camera, which is ingested by the patient. However, the visual analysis of these video sequences presents several important drawbacks, mainly related to the large amount of time needed for the visualization process and the low prevalence of intestinal contractions in video.
In this work we propose a machine learning system to automatically detect intestinal contractions in video capsule endoscopy, turning a clinically valuable but otherwise infeasible routine into a feasible clinical procedure. Our proposal is divided into two parts: The first part tackles the automatic detection of phasic contractions in capsule endoscopy videos. Phasic contractions are dynamic events spanning about 4-5 seconds, which show visual patterns with a high variability. Our proposal is based on a sequential design which involves the analysis of textural, color and blob features with powerful classifiers such as SVM. This approach addresses two basic aims: reducing the imbalance rate of the data set, and building the system modularly, which adds the capability of including domain knowledge as new stages in the cascade. The second part tackles the automatic detection of tonic contractions, which manifest in capsule endoscopy as a sustained pattern of the folds and wrinkles of the intestine that may be prolonged for an undetermined span of time. Our proposal is based on the analysis of the wrinkle patterns, presenting a comparative study of diverse features and classification methods, and providing a set of appropriate descriptors for their characterization. We provide a detailed analysis of the performance achieved by our system in both a qualitative and a quantitative way.
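The cascade design described in this abstract reduces class imbalance stage by stage: each stage cheaply discards easy negatives so that later, more specialized classifiers see a far less skewed sample. A minimal sketch of that idea, assuming scikit-learn, binary labels in {0, 1} and placeholder thresholds (illustrative only, not the thesis implementation):

```python
# Two-stage cascade for imbalanced event detection (sketch, not thesis code).
# Stage 1 is a permissive, high-recall filter; stage 2 refines the survivors.
import numpy as np
from sklearn.svm import SVC

class TwoStageCascade:
    def __init__(self, pass_threshold=0.05):
        self.stage1 = SVC(kernel="linear", probability=True, class_weight="balanced")
        self.stage2 = SVC(kernel="rbf", probability=True, class_weight="balanced")
        self.pass_threshold = pass_threshold  # deliberately permissive

    def fit(self, X, y):
        self.stage1.fit(X, y)
        # Train stage 2 only on samples stage 1 lets through: this shrinks
        # the negative class and reduces the imbalance rate it must handle.
        keep = self.stage1.predict_proba(X)[:, 1] >= self.pass_threshold
        self.stage2.fit(X[keep], y[keep])
        return self

    def predict(self, X):
        out = np.zeros(len(X), dtype=int)  # default: rejected by stage 1
        survivors = self.stage1.predict_proba(X)[:, 1] >= self.pass_threshold
        if survivors.any():
            out[survivors] = self.stage2.predict(X[survivors])
        return out
```

Domain knowledge enters naturally as extra stages appended before the final classifier, which is the modularity the abstract refers to.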
Address: CVC (UAB)
Thesis: Ph.D. thesis
Editor: Petia Radeva
ISBN: 84-933652-7-0
Area: 800
Notes: MV; SIAI
Approved: no
Call Number: Admin @ si @ Vil2006; IAM @ iam @ Vil2006
Serial: 738
 

 
Author: Oriol Pujol
Title: A Semi-Supervised Statistical Framework and Generative Snakes for IVUS Analysis
Type: Book Whole
Year: 2004
Publication: PhD Thesis, Universitat Autonoma de Barcelona-CVC
Thesis: Ph.D. thesis
Editor: Petia Radeva
Address: CVC (UAB), Bellaterra
Notes: HuPBA; MILAB
Approved: no
Call Number: BCNPCL @ bcnpcl @ Puj2004
Serial: 512
 

 
Author: Xialei Liu
Title: Visual Recognition in the Wild: Learning from Rankings in Small Domains and Continual Learning in New Domains
Type: Book Whole
Year: 2019
Publication: PhD Thesis, Universitat Autonoma de Barcelona-CVC
Abstract: Deep convolutional neural networks (CNNs) have achieved superior performance in many visual recognition applications, such as image classification, detection and segmentation. In this thesis we address two limitations of CNNs. Training deep CNNs requires huge amounts of labeled data, which is expensive and labor-intensive to collect. Another limitation is that training CNNs in a continual learning setting is still an open research question: catastrophic forgetting is very likely when adapting trained models to new environments or new tasks. Therefore, in this thesis we aim to improve CNNs for applications with limited data and to adapt CNNs continually to new tasks.
Self-supervised learning leverages unlabelled data by introducing an auxiliary task for which data is abundantly available. In the first part of the thesis, we show how rankings can be used as a proxy self-supervised task for regression problems. Then we propose an efficient backpropagation technique for Siamese networks which prevents the redundant computation introduced by the multi-branch network architecture. In addition, we show that measuring network uncertainty on the self-supervised proxy task is a good measure of the informativeness of unlabeled data, and that this can be used to drive an algorithm for active learning. We then apply our framework to two regression problems: Image Quality Assessment (IQA) and Crowd Counting. For both, we show how to automatically generate ranked image sets from unlabeled data. Our results show that networks trained to regress to the ground-truth targets for labeled data and to simultaneously learn to rank unlabeled data obtain significantly better, state-of-the-art results. We further show that active learning using rankings can reduce labeling effort by up to 50% for both IQA and crowd counting.
In the second part of the thesis, we propose two approaches to avoiding catastrophic forgetting in sequential task learning scenarios. The first approach is derived from Elastic Weight Consolidation, which uses a diagonal Fisher Information Matrix (FIM) to measure the importance of the parameters of the network. However, the diagonal assumption is unrealistic, so we approximately diagonalize the FIM using a set of factorized rotation parameters. This leads to significantly better performance on continual learning of sequential tasks. For the second approach, we show that forgetting manifests differently at different layers in the network and propose a hybrid approach where distillation is used in the feature extractor and replay in the classifier via feature generation. Our method addresses the limitations of generative image replay and probability distillation (i.e. learning without forgetting) and can naturally aggregate new tasks in a single, well-calibrated classifier. Experiments confirm that our proposed approach outperforms the baselines and some state-of-the-art methods.
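The ranking proxy task can be illustrated with a standard margin ranking loss: the network only has to score an image above an artificially degraded copy of itself, so no human labels are needed. A minimal PyTorch sketch under that assumption (the scoring network and names are ours; the thesis' efficient Siamese backpropagation trick is not reproduced here):

```python
# Self-supervised ranking step: score(img) should exceed score(degraded img).
import torch
import torch.nn as nn

scorer = nn.Sequential(               # stand-in for the real backbone
    nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 1),
)
rank_loss = nn.MarginRankingLoss(margin=0.5)
opt = torch.optim.Adam(scorer.parameters(), lr=1e-4)

def ranking_step(img, img_degraded):
    s_hi = scorer(img).squeeze(1)           # should be the larger score
    s_lo = scorer(img_degraded).squeeze(1)
    target = torch.ones_like(s_hi)          # +1 encodes "s_hi > s_lo"
    loss = rank_loss(s_hi, s_lo, target)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# Usage: any distortion (blur, noise, downscaling) yields a free ranked pair,
# e.g. ranking_step(batch, some_blur_fn(batch)).
```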
Address: December 2019
Thesis: Ph.D. thesis
Publisher: Ediciones Graficas Rey
Editor: Joost Van de Weijer; Andrew Bagdanov
ISBN: 978-84-121011-4-0
Notes: LAMP; 600.120
Approved: no
Call Number: Admin @ si @ Liu2019
Serial: 3396
 

 
Author: Fei Yang
Title: Towards Practical Neural Image Compression
Type: Book Whole
Year: 2021
Publication: PhD Thesis, Universitat Autonoma de Barcelona-CVC
Abstract: Images and videos are pervasive in our life and communication. With advances in smart and portable devices, high-capacity communication networks and high-definition cinema, image and video compression are more relevant than ever. Traditional block-based linear transform codecs such as JPEG, H.264/AVC or the recent H.266/VVC are carefully designed to meet not only rate-distortion criteria, but also the practical requirements of applications.
Recently, a new paradigm based on deep neural networks (i.e., neural image/video compression) has become increasingly popular due to its ability to learn powerful nonlinear transforms and other coding tools directly from data, instead of having them crafted by humans as was usual in previous coding formats. While achieving excellent rate-distortion performance, these approaches are still mostly limited to research environments due to heavy models and other practical limitations, such as operating at a single fixed rate and incurring high memory and computational cost. In this thesis, we study these practical limitations and design more practical neural image compression approaches.
After analyzing the differences between traditional and neural image compression, our first contribution is the modulated autoencoder (MAE), a framework that provides multiple rate-distortion options within a single model, with performance comparable to independent models. In a second contribution, we propose the slimmable compressive autoencoder (SlimCAE), which in addition to variable rate can optimize the complexity of the model and thus significantly reduce the memory and computational burden.
Modern generative models can learn custom image transformations directly from suitable datasets following encoder-decoder architectures, a task known as image-to-image (I2I) translation. Building on our previous work, we study the problem of distributed I2I translation, where the latent representation is transmitted through a binary channel and decoded on a remote receiving side. We also propose a variant that can perform both translation and the usual autoencoding functionality.
Finally, we consider neural video compression, where the autoencoder is typically augmented with temporal prediction via motion compensation. One of the main bottlenecks of that framework is the optical flow module that estimates the displacement needed to predict the next frame. Focusing on this module, we propose a method that improves the accuracy of the optical flow estimation and a simplified variant that reduces the computational cost.
Keywords: neural image compression, neural video compression, optical flow, practical neural image compression, compressive autoencoders, image-to-image translation, deep learning.
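The modulated autoencoder idea of serving several rate-distortion trade-offs from a single model can be sketched as a trade-off-conditioned, channel-wise scaling of the latent tensor, demodulated again before decoding. A minimal PyTorch sketch with illustrative shapes and an additive-noise proxy for quantization (a sketch of the idea, not the thesis architecture):

```python
# One autoencoder, several operating points: the latent is multiplied by a
# learned per-channel modulation vector selected by the trade-off index.
import torch
import torch.nn as nn

class ModulatedAE(nn.Module):
    def __init__(self, n_latent=64, n_tradeoffs=4):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(3, n_latent, 5, 2, 2), nn.ReLU(),
                                 nn.Conv2d(n_latent, n_latent, 5, 2, 2))
        self.dec = nn.Sequential(nn.ConvTranspose2d(n_latent, n_latent, 5, 2, 2, 1), nn.ReLU(),
                                 nn.ConvTranspose2d(n_latent, 3, 5, 2, 2, 1))
        # One learned modulation vector per target rate-distortion trade-off.
        self.mod = nn.Embedding(n_tradeoffs, n_latent)

    def forward(self, x, tradeoff_idx):
        z = self.enc(x)
        m = self.mod(tradeoff_idx).unsqueeze(-1).unsqueeze(-1)  # [B, C, 1, 1]
        z_hat = z * m + (torch.rand_like(z) - 0.5)  # noise as quantization proxy
        return self.dec(z_hat / m)                  # demodulate, then decode

# x_rec = ModulatedAE()(x, torch.zeros(x.size(0), dtype=torch.long))
```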
Address: December 2021
Thesis: Ph.D. thesis
Publisher: IMPRIMA
Editor: Luis Herranz; Mikhail Mozerov; Yongmei Cheng
ISBN: 978-84-122714-7-8
Notes: LAMP
Approved: no
Call Number: Admin @ si @ Yan2021
Serial: 3608
 

 
Author: Javad Zolfaghari Bengar
Title: Reducing Label Effort with Deep Active Learning
Type: Book Whole
Year: 2021
Publication: PhD Thesis, Universitat Autonoma de Barcelona-CVC
Abstract: Deep convolutional neural networks (CNNs) have achieved superior performance in many visual recognition applications, such as image classification, detection and segmentation. Training deep CNNs requires huge amounts of labeled data, which is expensive and labor-intensive to collect. Active learning is a paradigm aimed at reducing the annotation effort by training the model on actively selected informative and/or representative samples. In this thesis we study several aspects of active learning, including video object detection for autonomous driving systems, image classification on balanced and imbalanced datasets, and the incorporation of self-supervised learning in active learning. We briefly describe our approach in each of these areas to reduce the labeling effort.
In chapter two we introduce a novel active learning approach for object detection in videos by exploiting temporal coherence. Our criterion is based on the estimated number of errors in terms of false positives and false negatives. Additionally, we introduce a synthetic video dataset, called SYNTHIA-AL, specially designed to evaluate active learning for video object detection in road scenes. Finally, we show that our approach outperforms active learning baselines tested on two outdoor datasets.
In the next chapter we address the well-known problem of overconfidence in neural networks. As an alternative to network confidence, we propose a new informativeness-based active learning method that captures the learning dynamics of the network with a metric called label dispersion. This metric is low when the network consistently assigns the same label to a sample during the course of training, and high when the assigned label changes frequently. We show that label dispersion is a promising predictor of the uncertainty of the network, and show on two benchmark datasets that an active learning algorithm based on label dispersion obtains excellent results.
In chapter four, we tackle the problem of sampling bias in active learning methods on imbalanced datasets. Active learning is generally studied on balanced datasets where an equal amount of images per class is available. However, real-world datasets suffer from severely imbalanced classes, the so-called long-tail distribution. We argue that this further complicates the active learning process, since the imbalanced data pool can result in suboptimal classifiers. To address this problem in the context of active learning, we propose a general optimization framework that explicitly takes class balancing into account. Results on three datasets show that the method is general (it can be combined with most existing active learning algorithms) and can be effectively applied to boost the performance of both informative- and representative-based active learning methods. In addition, we show that our method generally yields a performance gain on balanced datasets as well.
Another paradigm to reduce the annotation effort is self-training, which learns from a large amount of unlabeled data in an unsupervised way and fine-tunes on few labeled samples. Recent advancements in self-training have achieved very impressive results rivaling supervised learning on some datasets. In the last chapter we focus on whether active learning and self-supervised learning can benefit from each other. We study object recognition datasets with several labeling budgets for the evaluations. Our experiments reveal that self-training is remarkably more efficient than active learning at reducing the labeling effort, that for a low labeling budget active learning offers no benefit to self-training, and finally that the combination of active learning and self-training is fruitful when the labeling budget is high.
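The label-dispersion metric itself is simple to state in code: track the class a network assigns to each unlabeled sample across training epochs, and measure how far those predictions are from being constant. A small NumPy sketch following the description above (function and variable names are ours):

```python
import numpy as np

def label_dispersion(pred_history):
    """pred_history: int array [n_epochs, n_samples] of predicted class ids.

    Returns one score per sample: 0 when the predicted label never changed,
    approaching 1 when it changed at almost every epoch."""
    n_epochs, n_samples = pred_history.shape
    disp = np.empty(n_samples)
    for j in range(n_samples):
        _, counts = np.unique(pred_history[:, j], return_counts=True)
        disp[j] = 1.0 - counts.max() / n_epochs  # 1 - freq. of the modal label
    return disp

# Active-learning selection: annotate the most dispersed (least stable) samples,
# e.g. query = np.argsort(-label_dispersion(history))[:budget]
```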
Address: December 2021
Thesis: Ph.D. thesis
Publisher: IMPRIMA
Editor: Joost Van de Weijer; Bogdan Raducanu
ISBN: 978-84-122714-9-2
Notes: LAMP
Approved: no
Call Number: Admin @ si @ Zol2021
Serial: 3609
 

 
Author: Gabriel Villalonga
Title: Leveraging Synthetic Data to Create Autonomous Driving Perception Systems
Type: Book Whole
Year: 2021
Publication: PhD Thesis, Universitat Autonoma de Barcelona-CVC
Abstract: Manually annotating images to develop vision models has been a major bottleneck since computer vision and machine learning started to walk together, and it has become even more evident since computer vision came to rely on data-hungry deep learning techniques. When addressing on-board perception for autonomous driving, the curse of data annotation is exacerbated by the use of additional sensors such as LiDAR. Therefore, any approach that reduces such time-consuming and costly work is of high interest for autonomous driving and, in fact, for any application requiring some sort of artificial perception. In the last decade, it has been shown that leveraging synthetic data is a paradigm worth pursuing in order to minimize manual data annotation, because the automatic process of generating synthetic data can also produce the associated annotations (e.g. object bounding boxes for synthetic images and LiDAR pointclouds, pixel/point-wise semantic information, etc.). However, directly using synthetic data to train deep perception models may not be the definitive solution in all circumstances, since a synth-to-real domain shift can appear. In this context, this work focuses on leveraging synthetic data to alleviate manual annotation for three perception tasks related to driving assistance and autonomous driving. In all cases, we assume the use of deep convolutional neural networks (CNNs) to develop our perception models.
The first task addresses traffic sign recognition (TSR), a kind of multi-class classification problem. We assume that the number of sign classes to be recognized must be suddenly increased without having annotated samples with which to re-train the corresponding TSR CNN. We show that, by leveraging synthetic samples of the new classes and transforming them with a generative adversarial network (GAN) trained on the known classes (i.e. without using samples from the new classes), it is possible to re-train the TSR CNN to properly classify all the signs for a ∼1/4 ratio of new to known sign classes.
The second task addresses on-board 2D object detection, focusing on vehicles and pedestrians. In this case, we assume that we receive a set of images without the annotations required to train an object detector, i.e. without object bounding boxes. Our goal is therefore to self-annotate these images so that they can later be used to train the desired object detector. To reach this goal, we leverage synthetic data and propose a semi-supervised learning approach based on the co-training idea; in fact, we use a GAN to reduce the synth-to-real domain shift before applying co-training. Our quantitative results show that co-training and GAN-based image-to-image translation complement each other, allowing the training of object detectors without manual annotation while almost reaching the upper-bound performance of detectors trained on human annotations.
While the previous tasks focus on vision-based perception, the third task addresses LiDAR pointclouds. Our initial goal was to develop a 3D object detector trained on synthetic LiDAR-style pointclouds. While for images we may expect a synth/real-to-real domain shift due to differences in appearance (e.g. when source and target images come from different camera sensors), we did not expect this for LiDAR pointclouds, since these active sensors factor out appearance and provide sampled shapes. In practice, however, we have seen that there can be a domain shift even among real-world LiDAR pointclouds: factors such as the sampling parameters of the LiDARs, the sensor suite configuration on-board the ego-vehicle, and the human annotation of 3D bounding boxes all induce a domain shift. We show this through comprehensive experiments with different publicly available datasets and 3D detectors. This finding redirected our goal towards the design of a GAN for pointcloud-to-pointcloud translation, a relatively unexplored topic.
Finally, it is worth mentioning that all the synthetic datasets used for these three tasks have been designed and generated in the context of this PhD work and will be publicly released. Overall, we think this PhD presents several steps forward to encourage leveraging synthetic data for developing deep perception models in the field of driving assistance and autonomous driving.
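The co-training loop for self-annotation can be summarized as two detectors that alternately teach each other with their confident detections on the unlabeled real images, once GAN translation has narrowed the synth-to-real gap. A schematic Python sketch, where the model objects and their `train_on`/`detect` methods are an assumed interface rather than a real API:

```python
# Co-training for self-labelling (schematic; detector internals omitted).
def co_train(model_a, model_b, labeled_synth, unlabeled_real,
             cycles=5, conf_thresh=0.8):
    pool_a, pool_b = list(labeled_synth), list(labeled_synth)
    for _ in range(cycles):
        model_a.train_on(pool_a)
        model_b.train_on(pool_b)
        for image in unlabeled_real:
            det_a = model_a.detect(image)   # list of boxes with a .score field
            det_b = model_b.detect(image)
            # Each model teaches the *other* with its confident detections,
            # so the two views can correct each other's mistakes.
            if det_a and min(d.score for d in det_a) >= conf_thresh:
                pool_b.append((image, det_a))
            if det_b and min(d.score for d in det_b) >= conf_thresh:
                pool_a.append((image, det_b))
    return model_a, model_b
```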
Address: February 2021
Thesis: Ph.D. thesis
Publisher: Ediciones Graficas Rey
Editor: Antonio Lopez; German Ros
ISBN: 978-84-122714-2-3
Notes: ADAS; 600.118
Approved: no
Call Number: Admin @ si @ Vil2021
Serial: 3599
 

 
Author: Edgar Riba
Title: Geometric Computer Vision Techniques for Scene Reconstruction
Type: Book Whole
Year: 2021
Publication: PhD Thesis, Universitat Autonoma de Barcelona-CVC
Abstract: From the early stages of computer vision, scene reconstruction has been one of its most studied topics, leading to a wide variety of discoveries and applications. Object grasping and manipulation, localization and mapping, and even visual effect generation are examples of applications in which scene reconstruction plays an important role for industries such as robotics, factory automation and audiovisual production. However, scene reconstruction is an extensive topic that can be approached in many different ways, with existing solutions that work effectively in controlled environments. Formally, the problem of scene reconstruction can be formulated as a sequence of independent processes composing a pipeline. In this thesis, we analyse some parts of the reconstruction pipeline and contribute novel methods using Convolutional Neural Networks (CNNs), proposing innovative solutions that consider the optimisation of the methods in an end-to-end fashion. First, we review the state of the art of classical local feature detectors and descriptors and contribute two novel methods that inherently improve pre-existing solutions in the scene reconstruction pipeline.

It is a fact that computer science and software engineering usually go hand in hand and evolve according to mutual needs, which makes the design of complex and efficient algorithms easier. For this reason, we contribute Kornia, a library specifically designed for working with classical computer vision techniques alongside deep neural networks. In essence, we created a framework that eases the design of complex pipelines for computer vision algorithms, so that they can be included within neural networks and used to backpropagate gradients through a common optimisation framework. Finally, in the last chapter of this thesis we develop the aforementioned concept of designing end-to-end systems with classical projective geometry. Thus, we contribute a solution to the problem of synthetic view generation by hallucinating novel views of highly deformable cloth objects using a geometry-aware end-to-end system. To summarize, in this thesis we demonstrate that a proper design combining classical geometric computer vision methods with deep learning techniques can improve pre-existing solutions for the problem of scene reconstruction.
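Kornia's central design choice is that classical image operators are ordinary differentiable PyTorch functions, so they can sit inside a network and pass gradients through. A minimal example, assuming a recent kornia release (`gaussian_blur2d` is part of its public `kornia.filters` module):

```python
import torch
import kornia

img = torch.rand(1, 3, 64, 64, requires_grad=True)   # B, C, H, W in [0, 1]

# A classical operator used inside an autograd graph.
blurred = kornia.filters.gaussian_blur2d(img, kernel_size=(5, 5), sigma=(1.5, 1.5))
loss = (blurred - img).abs().mean()
loss.backward()                 # gradients flow through the blur to the input
print(img.grad.shape)           # torch.Size([1, 3, 64, 64])
```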
Address: February 2021
Thesis: Ph.D. thesis
Editor: Daniel Ponsa
Notes: MSIAU
Approved: no
Call Number: Admin @ si @ Rib2021
Serial: 3610
 

 
Author: Joan Marti; Jose Miguel Benedi; Ana Maria Mendonça; Joan Serrat
Title: Pattern Recognition and Image Analysis
Type: Book Whole
Year: 2007
Publication: 3rd Iberian Conference
Volume: 4477-4478
Series Title: LNCS
Address: Girona (Spain)
Conference: IbPRIA
Notes: ADAS
Approved: no
Call Number: ADAS @ adas @ MBM2007
Serial: 994
 

 
Author: W. Liu; Josep Llados
Title: Graphics Recognition. Ten Years Review and Future Perspectives
Type: Book Whole
Year: 2006
Publication: 6th International Workshop
Volume: 3926
Series Title: LNCS
Address: Hong Kong (China)
Conference: GREC
Notes: DAG
Approved: no
Call Number: DAG @ dag @ LiL2006
Serial: 800
 

 
Author: Utkarsh Porwal; Alicia Fornes; Faisal Shafait (eds)
Title: Frontiers in Handwriting Recognition: 18th International Conference, ICFHR 2022
Type: Book Whole
Year: 2022
Publication: Frontiers in Handwriting Recognition
Volume: 13639
Series Title: LNCS
Address: Hyderabad, India, December 4–7, 2022
Publisher: Springer
Editor: Utkarsh Porwal; Alicia Fornes; Faisal Shafait
ISBN: 978-3-031-21648-0
Conference: ICFHR
Notes: DAG
Approved: no
Call Number: Admin @ si @ PFS2022
Serial: 3809