Author Idoia Ruiz
  Title Deep Metric Learning for re-identification, tracking and hierarchical novelty detection Type Book Whole
  Year 2022 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Metric learning refers to the problem in machine learning of learning a distance or similarity measure to compare data. In particular, deep metric learning involves learning a representation, also referred to as an embedding, such that data samples can be compared in the embedding space based on their distance, which directly provides a similarity measure. This step is necessary to perform several tasks in computer vision. It enables the classification of images, regions or pixels, re-identification, out-of-distribution detection, object tracking in image sequences and any other task whose solution requires computing a similarity score. This thesis addresses three specific problems that share this common requirement. The first one is person re-identification. Essentially, it is an image retrieval task that aims at finding instances of the same person according to a similarity measure. We first compare classical metric learning with basic deep learning based methods for this problem in terms of accuracy and efficiency. In this context, we also study network distillation as a strategy to optimize the trade-off between accuracy and speed at inference time. The second problem we contribute to is novelty detection in image classification. It consists in detecting samples of novel classes, i.e. classes never seen during training. However, standard novelty detection does not provide any information about the novel samples beyond the fact that they are unknown. Aiming at more informative outputs, we take advantage of the hierarchical taxonomies that are intrinsic to the classes. We propose a metric learning based approach that leverages the hierarchical relationships among classes during training and is able to predict the parent class of a novel sample in such a hierarchical taxonomy. Our third contribution is in multi-object tracking and segmentation. This joint task comprises classification, detection, instance segmentation and tracking. Tracking can be formulated as a retrieval problem to be addressed with metric learning approaches. We tackle an existing difficulty in academic research, namely the lack of annotated benchmarks for this task. To this end, we introduce the problem of weakly supervised multi-object tracking and segmentation, facing the challenge of not having ground truth available for instance segmentation. We propose a synergistic training strategy that benefits from the knowledge of the supervised tasks that are being learnt simultaneously.  
  Address July, 2022  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Place of Publication Editor Joan Serrat  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-124793-4-8 Medium  
  Area Expedition Conference  
  Notes ADAS Approved no  
  Call Number Admin @ si @ Rui2022 Serial 3717  
 

 
Author Fadi Dornaika; Angel Sappa
  Title 3D Face Tracking using Appearance Registration and Robust Iterative Closest Point Algorithm Type Book Chapter
  Year 2006 Publication 21st International Symposium on Computer and Information Sciences (ISCIS'06), LNCS 4263: 532–541 Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address Istanbul (Turkey)  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS Approved no  
  Call Number ADAS @ adas @ DoS2006d Serial 688  
 

 
Author Felipe Lumbreras; Ramon Baldrich; Maria Vanrell; Joan Serrat; Juan J. Villanueva
  Title Multiresolution texture classification of ceramic tiles. Type Book Chapter
  Year 1999 Publication Recent Research developments in optical engineering, Research Signpost, 2: 213–228 Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address India  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS;CIC Approved no  
  Call Number ADAS @ adas @ LBV1999b Serial 45  
 

 
Author Joan Marti; Jose Miguel Benedi; Ana Maria Mendonça; Joan Serrat
  Title Pattern Recognition and Image Analysis Type Book Whole
  Year 2007 Publication 3rd Iberian Conference Abbreviated Journal  
  Volume 6669 Issue Pages 4477-4478  
  Keywords  
  Abstract  
  Address Girona (Spain)  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference IbPRIA  
  Notes ADAS Approved no  
  Call Number ADAS @ adas @ MBM2007 Serial 994  
 

 
Author Gabriel Villalonga
  Title Leveraging Synthetic Data to Create Autonomous Driving Perception Systems Type Book Whole
  Year 2021 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Manually annotating images to develop vision models has been a major bottleneck
since computer vision and machine learning started to walk together. This has
been more evident since computer vision falls on the shoulders of data-hungry
deep learning techniques. When addressing on-board perception for autonomous
driving, the curse of data annotation is exacerbated due to the use of additional
sensors such as LiDAR. Therefore, any approach aiming at reducing such time-consuming and costly work is of high interest for addressing autonomous driving
and, in fact, for any application requiring some sort of artificial perception. In the
last decade, it has been shown that leveraging synthetic data is a paradigm
worth pursuing in order to minimize manual data annotation. The reason is
that the automatic process of generating synthetic data can also produce different
types of associated annotations (e.g. object bounding boxes for synthetic images
and LiDAR pointclouds, pixel/point-wise semantic information, etc.). Directly
using synthetic data for training deep perception models may not be the definitive
solution in all circumstances, since a synth-to-real domain shift can appear. In
this context, this work focuses on leveraging synthetic data to alleviate manual
annotation for three perception tasks related to driving assistance and autonomous
driving. In all cases, we assume the use of deep convolutional neural networks
(CNNs) to develop our perception models.
The first task addresses traffic sign recognition (TSR), a kind of multi-class
classification problem. We assume that the number of sign classes to be recognized
must be suddenly increased without having annotated samples to perform the
corresponding TSR CNN re-training. We show that leveraging synthetic samples of
such new classes and transforming them by a generative adversarial network (GAN)
trained on the known classes (i.e. without using samples from the new classes), it is
possible to re-train the TSR CNN to properly classify all the signs for a ∼ 1/4 ratio of
new/known sign classes. The second task addresses on-board 2D object detection,
focusing on vehicles and pedestrians. In this case, we assume that we receive a set
of images without the annotations required to train an object detector, i.e. without
object bounding boxes. Therefore, our goal is to self-annotate these images so
that they can later be used to train the desired object detector. In order to reach
this goal, we leverage synthetic data and propose a semi-supervised learning
approach based on the co-training idea. In fact, we use a GAN to reduce the synth-to-real domain shift before applying co-training. Our quantitative results show
that co-training and GAN-based image-to-image translation complement each
other to the point of allowing the training of object detectors without manual annotation, while still almost reaching the upper-bound performance of the detectors trained from
human annotations. While in previous tasks we focus on vision-based perception,
the third task we address focuses on LiDAR pointclouds. Our initial goal was to
develop a 3D object detector trained on synthetic LiDAR-style pointclouds. While
for images we may expect synth/real-to-real domain shift due to differences in
their appearance (e.g. when source and target images come from different camera
sensors), we did not expect so for LiDAR pointclouds since these active sensors
factor out appearance and provide sampled shapes. However, in practice, we have
seen that there can be domain shift even among real-world LiDAR pointclouds. Factors
such as the sampling parameters of the LiDARs, the sensor suite configuration on-board the ego-vehicle, and the human annotation of 3D bounding boxes, do induce
a domain shift. We show it through comprehensive experiments with different
publicly available datasets and 3D detectors. This redirected our goal towards the
design of a GAN for pointcloud-to-pointcloud translation, a relatively unexplored
topic.
Finally, it is worth mentioning that all the synthetic datasets used for these three
tasks have been designed and generated in the context of this PhD work and will
be publicly released. Overall, we think this PhD presents several steps forward to
encourage leveraging synthetic data for developing deep perception models in the
field of driving assistance and autonomous driving.
 
  Address February 2021  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Antonio Lopez;German Ros  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-122714-2-3 Medium  
  Area Expedition Conference  
  Notes ADAS; 600.118 Approved no  
  Call Number Admin @ si @ Vil2021 Serial 3599  
 

 
Author Angel Sappa; George A. Triantafyllid
  Title Computer Graphics and Imaging Type Book Whole
  Year 2012 Publication Computer Graphics and Imaging Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address Crete, Greece  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-0-88986-921-9 Medium  
  Area Expedition Conference  
  Notes ADAS Approved no  
  Call Number Admin @ si @ Sap2012 Serial 2067  
 

 
Author Naveen Onkarappa
  Title Optical Flow in Driver Assistance Systems Type Book Whole
  Year 2013 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Motion perception is one of the most important attributes of the human brain. Visual motion perception consists in inferring the speed and direction of elements in a scene based on visual inputs. Analogously, computer vision is assisted by motion cues in the scene. Motion detection in computer vision is useful in solving problems such as segmentation, depth from motion, structure from motion, compression, navigation and many others. These problems are common in several applications, for instance, video surveillance, robot navigation and advanced driver assistance systems (ADAS). One of the most widely used techniques for motion detection is optical flow estimation. The work in this thesis attempts to make optical flow suitable for the requirements and conditions of driving scenarios. In this context, a novel space-variant representation called reverse log-polar representation is proposed, which is shown to be better than the traditional log-polar space-variant representation for ADAS. Space-variant representations reduce the amount of data to be processed. Another major contribution of this research is the analysis of the influence of specific characteristics of driving scenarios on optical flow accuracy. Characteristics such as vehicle speed and road texture are considered in this analysis. From this study, it is inferred that the regularization weight has to be adapted according to the required error measure and for different speeds and road textures. It is also shown that polar-represented optical flow suits driving scenarios where the predominant motion is translation. Due to the requirements of such a study and the lack of suitable datasets, a new synthetic dataset is presented; it contains: i) sequences with different speeds and road textures in an urban scenario; ii) sequences with complex motion of an on-board camera; and iii) sequences with additional moving vehicles in the scene. The ground-truth optical flow is generated by ray tracing. Further, a few applications of optical flow in ADAS are shown. Firstly, a robust RANSAC-based technique to estimate the horizon line is proposed. Then, an egomotion estimation is presented to compare the proposed space-variant representation with the classical one. As a final contribution, a modification of the regularization term is proposed that notably improves the results in the ADAS applications. This adaptation is evaluated using a state-of-the-art optical flow technique. The experiments on a public dataset (KITTI) validate the advantages of using the proposed modification.
 
  Address Bellaterra  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Angel Sappa  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-940902-1-9 Medium  
  Area Expedition Conference  
  Notes ADAS Approved no  
  Call Number Admin @ si @ Nav2013 Serial 2447  
 

 
Author Muhammad Anwer Rao
  Title Color for Object Detection and Action Recognition Type Book Whole
  Year 2013 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Recognizing object categories in real-world images is a challenging problem in computer vision. The deformable part-based framework is currently the most successful approach for object detection. Generally, HOG features are used for image representation within the part-based framework. For action recognition, the bag-of-words framework has been shown to provide promising results. Within the bag-of-words framework, local image patches are described by the SIFT descriptor. In contrast to object detection and action recognition, combining color and shape has been shown to provide the best performance for object and scene recognition.

In the first part of this thesis, we analyze the problem of person detection in still images. Standard person detection approaches rely on intensity-based features for image representation while ignoring color. Channel-based descriptors are one of the most commonly used approaches in object recognition. This inspires us to evaluate the incorporation of color information using the channel-based fusion approach for the task of person detection.

In the second part of the thesis, we investigate the problem of object detection in still images. Due to high dimensionality, channel-based fusion increases the computational cost. Moreover, channel-based fusion has been found to obtain inferior results for object categories where one of the visual cues varies significantly. On the other hand, late fusion is known to provide improved results for a wide range of object categories. A consequence of the late fusion strategy is the need for a pure color descriptor. Therefore, we propose to use color attributes as an explicit color representation for object detection. Color attributes are compact and computationally efficient. Consequently, color attributes are combined with traditional shape features, providing excellent results for the object detection task.

Finally, we focus on the problem of action detection and classification in still images. We investigate the potential of color for action classification and detection in still images. We also evaluate different fusion approaches for combining color and shape information for action recognition. Additionally, an analysis is performed to validate the contribution of color for action recognition. Our results clearly demonstrate that combining color and shape information significantly improves the performance of both action classification and detection in still images.
 
  Address Barcelona  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Antonio Lopez;Joost Van de Weijer  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS Approved no  
  Call Number Admin @ si @ Rao2013 Serial 2281  
 

 
Author Javier Marin
  Title Pedestrian Detection Based on Local Experts Type Book Whole
  Year 2013 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract During the last decade, vision-based human detection systems have started to play a key role in multiple applications linked to driver assistance, surveillance, robot sensing and home automation. Detecting humans is by far one of the most challenging tasks in Computer Vision. This is mainly due to the high degree of variability in human appearance associated with clothing, pose, shape and size. Besides, other factors such as cluttered scenarios, partial occlusions, or environmental conditions can make the detection task even harder. The most promising state-of-the-art methods rely on discriminative learning paradigms which are fed with positive and negative examples. The training data is one of the most relevant elements for building a robust detector, as it has to cope with the large variability of the target. Human supervision is required to create this dataset. The drawback at this point is the arduous effort of annotating as well as of looking for such variability.
In this PhD thesis we address two recurrent problems in the literature. In the first stage, we aim to reduce the time-consuming task of annotating by using computer graphics. More concretely, we develop a virtual urban scenario for later generating a pedestrian dataset. Then, we train a detector using this dataset, and finally we assess whether this detector can be successfully applied in a real scenario.
In the second stage, we focus on increasing the robustness of our pedestrian detectors under partial occlusions. In particular, we present a novel occlusion handling approach to increase the performance of block-based holistic methods under partial occlusions. For this purpose, we make use of local experts via a Random Subspace Method (RSM) to handle these cases. If the method infers a possible partial occlusion, then the RSM, based on performance statistics obtained from partially occluded data, is applied. The last objective of this thesis is to propose a robust pedestrian detector based on an ensemble of local experts. To achieve this goal, we use the random forest paradigm, where the trees act as ensembles and their nodes are the local experts. In particular, each expert focuses on performing a robust classification of a pedestrian body patch. This approach offers computational efficiency and far less design complexity when compared to other state-of-the-art methods, while reaching better accuracy.
 
  Address Barcelona  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Antonio Lopez;Jaume Amores  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS Approved no  
  Call Number Admin @ si @ Mar2013 Serial 2280  
 

 
Author David Vazquez
  Title Domain Adaptation of Virtual and Real Worlds for Pedestrian Detection Type Book Whole
  Year 2013 Publication PhD Thesis, Universitat de Barcelona-CVC Abbreviated Journal  
  Volume 1 Issue 1 Pages 1-105  
  Keywords Pedestrian Detection; Domain Adaptation  
  Abstract Pedestrian detection is of paramount interest for many applications, e.g. Advanced Driver Assistance Systems, Intelligent Video Surveillance and Multimedia systems. Most promising pedestrian detectors rely on appearance-based classifiers trained with annotated data. However, the required annotation step represents an intensive and subjective task for humans, which makes it worthwhile to minimize their intervention in this process by using computational tools like realistic virtual worlds. The reason to use this kind of tool lies in the fact that it allows the automatic generation of precise and rich annotations of visual information. Nevertheless, the use of this kind of data raises the following question: can a pedestrian appearance model learnt with virtual-world data work successfully for pedestrian detection in real-world scenarios? To answer this question, we conduct different experiments that suggest a positive answer. However, pedestrian classifiers trained with virtual-world data can suffer from the so-called dataset shift problem, as real-world based classifiers do. Accordingly, we have designed different domain adaptation techniques to face this problem, all of them integrated in the same framework (V-AYLA). We have explored different methods to train domain-adapted pedestrian classifiers by collecting a few pedestrian samples from the target domain (real world) and combining them with many samples of the source domain (virtual world). The extensive experiments we present show that pedestrian detectors developed within the V-AYLA framework do achieve domain adaptation. Ideally, we would like to adapt our system without any human intervention. Therefore, as a first proof of concept we also propose an unsupervised domain adaptation technique that avoids human intervention during the adaptation process. To the best of our knowledge, this thesis is the first work demonstrating adaptation of virtual and real worlds for developing an object detector. Last but not least, we also assessed a different strategy to avoid the dataset shift, which consists in collecting real-world samples and retraining with them in such a way that no bounding boxes of real-world pedestrians have to be provided. We show that the generated classifier is competitive with respect to the counterpart trained with samples collected by manually annotating pedestrian bounding boxes. The results presented in this thesis not only end with a proposal for adapting a virtual-world pedestrian detector to the real world, but also go further by pointing out a new methodology that would allow the system to adapt to different situations, which we hope will provide the foundations for future research in this unexplored area.  
  Address Barcelona  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Barcelona Editor Antonio Lopez;Daniel Ponsa  
  Language English Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-940530-1-6 Medium  
  Area Expedition Conference  
  Notes adas Approved yes  
  Call Number ADAS @ adas @ Vaz2013 Serial 2276  