toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
  Records Links
Author Ferran Diego edit  openurl
  Title Probabilistic Alignment of Video Sequences Recorded by Moving Cameras Type (up) Book Whole
  Year 2011 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Video alignment consists of integrating multiple video sequences recorded independently into a single video sequence. This means to register both in time (synchronize
frames) and space (image registration) so that the two videos sequences can be fused
or compared pixel–wise. In spite of being relatively unknown, many applications today may benefit from the availability of robust and efficient video alignment methods.
For instance, video surveillance requires to integrate video sequences that are recorded
of the same scene at different times in order to detect changes. The problem of aligning videos has been addressed before, but in the relatively simple cases of fixed or rigidly attached cameras and simultaneous acquisition. In addition, most works rely
on restrictive assumptions which reduce its difficulty such as linear time correspondence or the knowledge of the complete trajectories of corresponding scene points on the images; to some extent, these assumptions limit the practical applicability of the solutions developed until now. In this thesis, we focus on the challenging problem of aligning sequences recorded at different times from independent moving cameras following similar but not coincident trajectories. More precisely, this thesis covers four studies that advance the state-of-the-art in video alignment. First, we focus on analyzing and developing a probabilistic framework for video alignment, that is, a principled way to integrate multiple observations and prior information. In this way, two different approaches are presented to exploit the combination of several purely visual features (image–intensities, visual words and dense motion field descriptor), and
global positioning system (GPS) information. Second, we focus on reformulating the
problem into a single alignment framework since previous works on video alignment
adopt a divide–and–conquer strategy, i.e., first solve the synchronization, and then
register corresponding frames. This also generalizes the ’classic’ case of fixed geometric transform and linear time mapping. Third, we focus on exploiting directly the
time domain of the video sequences in order to avoid exhaustive cross–frame search.
This provides relevant information used for learning the temporal mapping between
pairs of video sequences. Finally, we focus on adapting these methods to the on–line
setting for road detection and vehicle geolocation. The qualitative and quantitative
results presented in this thesis on a variety of real–world pairs of video sequences show that the proposed method is: robust to varying imaging conditions, different image
content (e.g., incoming and outgoing vehicles), variations on camera velocity, and
different scenarios (indoor and outdoor) going beyond the state–of–the–art. Moreover, the on–line video alignment has been successfully applied for road detection and
vehicle geolocation achieving promising results.
 
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Joan Serrat  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS Approved no  
  Call Number Admin @ si @ Die2011 Serial 1787  
Permanent link to this record
 

 
Author David Geronimo; Antonio Lopez edit  doi
isbn  openurl
  Title Vision-based Pedestrian Protection Systems for Intelligent Vehicles Type (up) Book Whole
  Year 2014 Publication SpringerBriefs in Computer Science Abbreviated Journal  
  Volume Issue Pages 1-114  
  Keywords Computer Vision; Driver Assistance Systems; Intelligent Vehicles; Pedestrian Detection; Vulnerable Road Users  
  Abstract Pedestrian Protection Systems (PPSs) are on-board systems aimed at detecting and tracking people in the surroundings of a vehicle in order to avoid potentially dangerous situations. These systems, together with other Advanced Driver Assistance Systems (ADAS) such as lane departure warning or adaptive cruise control, are one of the most promising ways to improve traffic safety. By the use of computer vision, cameras working either in the visible or infra-red spectra have been demonstrated as a reliable sensor to perform this task. Nevertheless, the variability of human’s appearance, not only in terms of clothing and sizes but also as a result of their dynamic shape, makes pedestrians one of the most complex classes even for computer vision. Moreover, the unstructured changing and unpredictable environment in which such on-board systems must work makes detection a difficult task to be carried out with the demanded robustness. In this brief, the state of the art in PPSs is introduced through the review of the most relevant papers of the last decade. A common computational architecture is presented as a framework to organize each method according to its main contribution. More than 300 papers are referenced, most of them addressing pedestrian detection and others corresponding to the descriptors (features), pedestrian models, and learning machines used. In addition, an overview of topics such as real-time aspects, systems benchmarking and future challenges of this research area are presented.  
  Address  
  Corporate Author Thesis  
  Publisher Springer Briefs in Computer Vision Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-1-4614-7986-4 Medium  
  Area Expedition Conference  
  Notes ADAS; 600.076 Approved no  
  Call Number GeL2014 Serial 2325  
Permanent link to this record
 

 
Author Muhammad Anwer Rao edit  openurl
  Title Color for Object Detection and Action Recognition Type (up) Book Whole
  Year 2013 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Recognizing object categories in real world images is a challenging problem in computer vision. The deformable part based framework is currently the most successful approach for object detection. Generally, HOG are used for image representation within the part-based framework. For action recognition, the bag-of-word framework has shown to provide promising results. Within the bag-of-words framework, local image patches are described by SIFT descriptor. Contrary to object detection and action recognition, combining color and shape has shown to provide the best performance for object and scene recognition.

In the first part of this thesis, we analyze the problem of person detection in still images. Standard person detection approaches rely on intensity based features for image representation while ignoring the color. Channel based descriptors is one of the most commonly used approaches in object recognition. This inspires us to evaluate incorporating color information using the channel based fusion approach for the task of person detection.

In the second part of the thesis, we investigate the problem of object detection in still images. Due to high dimensionality, channel based fusion increases the computational cost. Moreover, channel based fusion has been found to obtain inferior results for object category where one of the visual varies significantly. On the other hand, late fusion is known to provide improved results for a wide range of object categories. A consequence of late fusion strategy is the need of a pure color descriptor. Therefore, we propose to use Color attributes as an explicit color representation for object detection. Color attributes are compact and computationally efficient. Consequently color attributes are combined with traditional shape features providing excellent results for object detection task.

Finally, we focus on the problem of action detection and classification in still images. We investigate the potential of color for action classification and detection in still images. We also evaluate different fusion approaches for combining color and shape information for action recognition. Additionally, an analysis is performed to validate the contribution of color for action recognition. Our results clearly demonstrate that combining color and shape information significantly improve the performance of both action classification and detection in still images.
 
  Address Barcelona  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Antonio Lopez;Joost Van de Weijer  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS Approved no  
  Call Number Admin @ si @ Rao2013 Serial 2281  
Permanent link to this record
 

 
Author Javier Marin edit  openurl
  Title Pedestrian Detection Based on Local Experts Type (up) Book Whole
  Year 2013 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract During the last decade vision-based human detection systems have started to play a key rolein multiple applications linked to driver assistance, surveillance, robot sensing and home automation.
Detecting humans is by far one of the most challenging tasks in Computer Vision.
This is mainly due to the high degree of variability in the human appearanceassociated to
the clothing, pose, shape and size. Besides, other factors such as cluttered scenarios, partial occlusions, or environmental conditions can make the detection task even harder.
Most promising methods of the state-of-the-art rely on discriminative learning paradigms which are fed with positive and negative examples. The training data is one of the most
relevant elements in order to build a robust detector as it has to cope the large variability of the target. In order to create this dataset human supervision is required. The drawback at this point is the arduous effort of annotating as well as looking for such claimed variability.
In this PhD thesis we address two recurrent problems in the literature. In the first stage,we aim to reduce the consuming task of annotating, namely, by using computer graphics.
More concretely, we develop a virtual urban scenario for later generating a pedestrian dataset.
Then, we train a detector using this dataset, and finally we assess if this detector can be successfully applied in a real scenario.
In the second stage, we focus on increasing the robustness of our pedestrian detectors
under partial occlusions. In particular, we present a novel occlusion handling approach to increase the performance of block-based holistic methods under partial occlusions. For this purpose, we make use of local experts via a RandomSubspaceMethod (RSM) to handle these cases. If the method infers a possible partial occlusion, then the RSM, based on performance statistics obtained from partially occluded data, is applied. The last objective of this thesis
is to propose a robust pedestrian detector based on an ensemble of local experts. To achieve this goal, we use the random forest paradigm, where the trees act as ensembles an their nodesare the local experts. In particular, each expert focus on performing a robust classification ofa pedestrian body patch. This approach offers computational efficiency and far less design complexity when compared to other state-of-the-artmethods, while reaching better accuracy
 
  Address Barcelona  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Antonio Lopez;Jaume Amores  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS Approved no  
  Call Number Admin @ si @ Mar2013 Serial 2280  
Permanent link to this record
 

 
Author David Vazquez edit   pdf
isbn  openurl
  Title Domain Adaptation of Virtual and Real Worlds for Pedestrian Detection Type (up) Book Whole
  Year 2013 Publication PhD Thesis, Universitat de Barcelona-CVC Abbreviated Journal  
  Volume 1 Issue 1 Pages 1-105  
  Keywords Pedestrian Detection; Domain Adaptation  
  Abstract Pedestrian detection is of paramount interest for many applications, e.g. Advanced Driver Assistance Systems, Intelligent Video Surveillance and Multimedia systems. Most promising pedestrian detectors rely on appearance-based classifiers trained with annotated data. However, the required annotation step represents an intensive and subjective task for humans, what makes worth to minimize their intervention in this process by using computational tools like realistic virtual worlds. The reason to use these kind of tools relies in the fact that they allow the automatic generation of precise and rich annotations of visual information. Nevertheless, the use of this kind of data comes with the following question: can a pedestrian appearance model learnt with virtual-world data work successfully for pedestrian detection in real-world scenarios?. To answer this question, we conduct different experiments that suggest a positive answer. However, the pedestrian classifiers trained with virtual-world data can suffer the so called dataset shift problem as real-world based classifiers does. Accordingly, we have designed different domain adaptation techniques to face this problem, all of them integrated in a same framework (V-AYLA). We have explored different methods to train a domain adapted pedestrian classifiers by collecting a few pedestrian samples from the target domain (real world) and combining them with many samples of the source domain (virtual world). The extensive experiments we present show that pedestrian detectors developed within the V-AYLA framework do achieve domain adaptation. Ideally, we would like to adapt our system without any human intervention. Therefore, as a first proof of concept we also propose an unsupervised domain adaptation technique that avoids human intervention during the adaptation process. To the best of our knowledge, this Thesis work is the first demonstrating adaptation of virtual and real worlds for developing an object detector. Last but not least, we also assessed a different strategy to avoid the dataset shift that consists in collecting real-world samples and retrain with them in such a way that no bounding boxes of real-world pedestrians have to be provided. We show that the generated classifier is competitive with respect to the counterpart trained with samples collected by manually annotating pedestrian bounding boxes. The results presented on this Thesis not only end with a proposal for adapting a virtual-world pedestrian detector to the real world, but also it goes further by pointing out a new methodology that would allow the system to adapt to different situations, which we hope will provide the foundations for future research in this unexplored area.  
  Address Barcelona  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Barcelona Editor Antonio Lopez;Daniel Ponsa  
  Language English Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-940530-1-6 Medium  
  Area Expedition Conference  
  Notes adas Approved yes  
  Call Number ADAS @ adas @ Vaz2013 Serial 2276  
Permanent link to this record
 

 
Author Angel Sappa; George A. Triantafyllid edit  isbn
openurl 
  Title Computer Graphics and Imaging Type (up) Book Whole
  Year 2012 Publication Computer Graphics and Imaging Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address Crete, Greece  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-0-88986-921-9 Medium  
  Area Expedition Conference  
  Notes ADAS Approved no  
  Call Number Admin @ si @ Sap2012 Serial 2067  
Permanent link to this record
 

 
Author Angel Sappa; Jordi Vitria edit  doi
isbn  openurl
  Title Multimodal Interaction in Image and Video Applications Type (up) Book Whole
  Year 2013 Publication Multimodal Interaction in Image and Video Applications Abbreviated Journal  
  Volume 48 Issue Pages  
  Keywords  
  Abstract Book Series Intelligent Systems Reference Library  
  Address  
  Corporate Author Thesis  
  Publisher Springer Berlin Heidelberg Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1868-4394 ISBN 978-3-642-35931-6 Medium  
  Area Expedition Conference  
  Notes ADAS; OR;MV Approved no  
  Call Number Admin @ si @ SaV2013 Serial 2199  
Permanent link to this record
 

 
Author Mohammad Rouhani edit  openurl
  Title Shape Representation and Registration using Implicit Functions Type (up) Book Whole
  Year 2012 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Shape representation and registration are two important problems in computer vision and graphics. Representing the given cloud of points through an implicit function provides a higher level information describing the data. This representation can be more compact more robust to noise and outliers, hence it can be exploited in different computer vision application. In the first part of this thesis implicit shape representations, including both implicit B-spline and polynomial, are tackled. First, an approximation of a geometric distance is proposed to measure the closeness of the given cloud of points and the implicit surface. The analysis of the proposed distance shows an accurate estimation with smooth behavior. The distance by itself is used in a RANSAC based quadratic fitting method. Moreover, since the gradient information of the distance with respect to the surface parameters can be analytically computed, it is used in Levenberg-Marquadt algorithm to refine the surface parameters. In a different approach, an algebraic fitting method is used to represent an object through implicit B-splines. The outcome is a smooth flexible surface and can be represented in different levels from coarse to fine. This property has been exploited to solve the registration problem in the second part of the thesis. In the proposed registration technique the model set is replaced with an implicit representation provided in the first part; then, the point-to-point registration is converted to a point-to-model one in a higher level. This registration error can benefit from different distance estimations to speed up the registration process even without need of correspondence search. Finally, the non-rigid registration problem is tackled through a quadratic distance approximation that is based on the curvature information of the model set. This approximation is used in a free form deformation model to update its control lattice. Then it is shown how an accurate distance approximation can benefit non-rigid registration problems.  
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Angel Sappa  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS Approved no  
  Call Number Admin @ si @ Rou2012 Serial 2205  
Permanent link to this record
 

 
Author Jose Carlos Rubio edit  openurl
  Title Many-to-Many High Order Matching. Applications to Tracking and Object Segmentation Type (up) Book Whole
  Year 2012 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Feature matching is a fundamental problem in Computer Vision, having multiple applications such as tracking, image classification and retrieval, shape recognition and stereo fusion. In numerous domains, it is useful to represent the local structure of the matching features to increase the matching accuracy or to make the correspondence invariant to certain transformations (affine, homography, etc. . . ). However, encoding this knowledge requires complicating the model by establishing high-order relationships between the model elements, and therefore increasing the complexity of the optimization problem.

The importance of many-to-many matching is sometimes dismissed in the literature. Most methods are restricted to perform one-to-one matching, and are usually validated on synthetic, or non-realistic datasets. In a real challenging environment, with scale, pose and illumination variations of the object of interest, as well as the presence of occlusions, clutter, and noisy observations, many-to-many matching is necessary to achieve satisfactory results. As a consequence, finding the most likely many-to-many correspondence often involves a challenging combinatorial optimization process.

In this work, we design and demonstrate matching algorithms that compute many-to-many correspondences, applied to several challenging problems. Our goal is to make use of high-order representations to improve the expressive power of the matching, at the same time that we make feasible the process of inference or optimization of such models. We effectively use graphical models as our preferred representation because they provide an elegant probabilistic framework to tackle structured prediction problems.

We introduce a matching-based tracking algorithm which performs matching between frames of a video sequence in order to solve the difficult problem of headlight tracking at night-time. We also generalise this algorithm to solve the problem of data association applied to various tracking scenarios. We demonstrate the effectiveness of such approach in real video sequences and we show that our tracking algorithm can be used to improve the accuracy of a headlight classification system.

In the second part of this work, we move from single (point) matching to dense (region) matching and we introduce a new hierarchical image representation. We make use of such model to develop a high-order many-to-many matching between pairs of images. We show that the use of high-order models in comparison to simpler models improves not only the accuracy of the results, but also the convergence speed of the inference algorithm.

Finally, we keep exploiting the idea of region matching to design a fully unsupervised image co-segmentation algorithm that is able to perform competitively with state-of-the-art supervised methods. Our method also overcomes the typical drawbacks of some of the past works, such as avoiding the necessity of variate appearances on the image backgrounds. The region matching in this case is applied to effectively exploit inter-image information. We also extend this work to perform co-segmentation of videos, being the first time that such problem is addressed, as a way to perform video object segmentation
 
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Joan Serrat  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS Approved no  
  Call Number Admin @ si @ Rub2012 Serial 2206  
Permanent link to this record
 

 
Author Fernando Barrera edit  openurl
  Title Multimodal Stereo from Thermal Infrared and Visible Spectrum Type (up) Book Whole
  Year 2012 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Recent advances in thermal infrared imaging (LWIR) has allowed its use in applications beyond of the military domain. Nowadays, this new family of sensors is included in different technical and scientific applications. They offer features that facilitate tasks, such as detection of pedestrians, hot spots, differences in temperature, among others, which can significantly improve the performance of a system where the persons are expected to play the principal role. For instance, video surveillance applications, monitoring, and pedestrian detection.
During this dissertation the next question is stated: Could a couple of sensors measuring different bands of the electromagnetic spectrum, as the visible and thermal infrared, be used to extract depth information? Although it is a complex question, we shows that a system of these characteristics is possible as well as their advantages, drawbacks, and potential opportunities.
The matching and fusion of data coming from different sensors, as the emissions registered at visible and infrared bands, represents a special challenge, because it has been showed that theses signals are weak correlated. Therefore, many traditional techniques of image processing and computer vision are not helpful, requiring adjustments for their correct performance in every modality.
In this research an experimental study that compares different cost functions and matching approaches is performed, in order to build a multimodal stereovision system. Furthermore, the common problems in infrared/visible stereo, specially in the outdoor scenes are identified. Our framework summarizes the architecture of a generic stereo algorithm, at different levels: computational, functional, and structural, which can be extended toward high-level fusion (semantic) and high-order (prior).The proposed framework is intended to explore novel multimodal stereo matching approaches, going from sparse to dense representations (both disparity and depth maps). Moreover, context information is added in form of priors and assumptions. Finally, this dissertation shows a promissory way toward the integration of multiple sensors for recovering three-dimensional information.
 
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Felipe Lumbreras;Angel Sappa  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS Approved no  
  Call Number Admin @ si @ Bar2012 Serial 2209  
Permanent link to this record
Select All    Deselect All
 |   | 
Details

Save Citations:
Export Records: