Publicacions CVC -- Query Results

<< 1 2 3 4 5 6 7 >>

Details

	Records
	Author	Angel Sappa; Jordi Vitria
	Title	Multimodal Interaction in Image and Video Applications			Type	Book Whole
	Year	2013	Publication	Multimodal Interaction in Image and Video Applications	Abbreviated Journal
	Volume	48	Issue		Pages
	Keywords
	Abstract	Book Series Intelligent Systems Reference Library
	Address
	Corporate Author				Thesis
	Publisher	Springer Berlin Heidelberg	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	1868-4394	ISBN	978-3-642-35931-6	Medium
	Area		Expedition		Conference
	Notes	ADAS; OR;MV			Approved	no
	Call Number	Admin @ si @ SaV2013			Serial	2199
Permanent link to this record



	Author	Antonio Lopez
	Title	Multilocal Methods for Ridge and Valley Delineation in Image Analysis.			Type	Book Whole
	Year	2000	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract
	Address
	Corporate Author				Thesis	Ph.D. thesis
	Publisher		Place of Publication		Editor	Joan Serrat
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	ADAS			Approved	no
	Call Number	ADAS @ adas @ Lop2000			Serial	174
Permanent link to this record



	Author	Alejandro Gonzalez Alzate
	Title	Multi-modal Pedestrian Detection			Type	Book Whole
	Year	2015	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Pedestrian detection continues to be an extremely challenging problem in real scenarios, in which situations like illumination changes, noisy images, unexpected objects, uncontrolled scenarios and variant appearance of objects occur constantly. All these problems force the development of more robust detectors for relevant applications like vision-based autonomous vehicles, intelligent surveillance, and pedestrian tracking for behavior analysis. Most reliable vision-based pedestrian detectors base their decision on features extracted using a single sensor capturing complementary features, e.g., appearance, and texture. These features usually are extracted from the current frame, ignoring temporal information, or including it in a post process step e.g., tracking or temporal coherence. Taking into account these issues we formulate the following question: can we generate more robust pedestrian detectors by introducing new information sources in the feature extraction step? In order to answer this question we develop different approaches for introducing new information sources to well-known pedestrian detectors. We start by the inclusion of temporal information following the Stacked Sequential Learning (SSL) paradigm which suggests that information extracted from the neighboring samples in a sequence can improve the accuracy of a base classifier. We then focus on the inclusion of complementary information from different sensors like 3D point clouds (LIDAR – depth), far infrared images (FIR), or disparity maps (stereo pair cameras). For this end we develop a multi-modal framework in which information from different sensors is used for increasing detection accuracy (by increasing information redundancy). Finally we propose a multi-view pedestrian detector, this multi-view approach splits the detection problem in n sub-problems. Each sub-problem will detect objects in a given specific view reducing in that way the variability problem faced when a single detectors is used for the whole problem. We show that these approaches obtain competitive results with other state-of-the-art methods but instead of design new features, we reuse existing ones boosting their performance.
	Address	November 2015
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	David Vazquez;Antonio Lopez;
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-84-943427-7-6	Medium
	Area		Expedition		Conference
	Notes	ADAS; 600.076			Approved	no
	Call Number	Admin @ si @ Gon2015			Serial	2706
Permanent link to this record



	Author	Angel Sappa; David Geronimo; Fadi Dornaika; Mohammad Rouhani; Antonio Lopez
	Title	Moving object detection from mobile platforms using stereo data registration			Type	Book Chapter
	Year	2012	Publication	Computational Intelligence paradigms in advanced pattern classification	Abbreviated Journal
	Volume	386	Issue		Pages	25-37
	Keywords	pedestrian detection
	Abstract	This chapter describes a robust approach for detecting moving objects from on-board stereo vision systems. It relies on a feature point quaternion-based registration, which avoids common problems that appear when computationally expensive iterative-based algorithms are used on dynamic environments. The proposed approach consists of three main stages. Initially, feature points are extracted and tracked through consecutive 2D frames. Then, a RANSAC based approach is used for registering two point sets, with known correspondences in the 3D space. The computed 3D rigid displacement is used to map two consecutive 3D point clouds into the same coordinate system by means of the quaternion method. Finally, moving objects correspond to those areas with large 3D registration errors. Experimental results show the viability of the proposed approach to detect moving objects like vehicles or pedestrians in different urban scenarios.
	Address
	Corporate Author				Thesis
	Publisher	Springer Berlin Heidelberg	Place of Publication		Editor	Marek R. Ogiela; Lakhmi C. Jain
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	1860-949X	ISBN	978-3-642-24048-5	Medium
	Area		Expedition		Conference
	Notes	ADAS			Approved	no
	Call Number	Admin @ si @ SGD2012			Serial	2061
Permanent link to this record



	Author	Akhil Gurram
	Title	Monocular Depth Estimation for Autonomous Driving			Type	Book Whole
	Year	2022	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	3D geometric information is essential for on-board perception in autonomous driving and driver assistance. Autonomous vehicles (AVs) are equipped with calibrated sensor suites. As part of these suites, we can find LiDARs, which are expensive active sensors in charge of providing the 3D geometric information. Depending on the operational conditions for the AV, calibrated stereo rigs may be also sufficient for obtaining 3D geometric information, being these rigs less expensive and easier to install than LiDARs. However, ensuring a proper maintenance and calibration of these types of sensors is not trivial. Accordingly, there is an increasing interest on performing monocular depth estimation (MDE) to obtain 3D geometric information on-board. MDE is very appealing since it allows for appearance and depth being on direct pixelwise correspondence without further calibration. Moreover, a set of single cameras with MDE capabilities would still be a cheap solution for on-board perception, relatively easy to integrate and maintain in an AV. Best MDE models are based on Convolutional Neural Networks (CNNs) trained in a supervised manner, i.e., assuming pixelwise ground truth (GT). Accordingly, the overall goal of this PhD is to study methods for improving CNN-based MDE accuracy under different training settings. More specifically, this PhD addresses different research questions that are described below. When we started to work in this PhD, state-of-theart methods for MDE were already based on CNNs. In fact, a promising line of work consisted in using image-based semantic supervision (i.e., pixel-level class labels) while training CNNs for MDE using LiDAR-based supervision (i.e., depth). It was common practice to assume that the same raw training data are complemented by both types of supervision, i.e., with depth and semantic labels. However, in practice, it was more common to find heterogeneous datasets with either only depth supervision or only semantic supervision. Therefore, our first work was to research if we could train CNNs for MDE by leveraging depth and semantic information from heterogeneous datasets. We show that this is indeed possible, and we surpassed the state-of-the-art results on MDE at the time we did this research. To achieve our results, we proposed a particular CNN architecture and a new training protocol. After this research, it was clear that the upper-bound setting to train CNN-based MDE models consists in using LiDAR data as supervision. However, it would be cheaper and more scalable if we would be able to train such models from monocular sequences. Obviously, this is far more challenging, but worth to research. Training MDE models using monocular sequences is possible by relying on structure-from-motion (SfM) principles to generate self-supervision. Nevertheless, problems of camouflaged objects, visibility changes, static-camera intervals, textureless areas, and scale ambiguity, diminish the usefulness of such self-supervision. To alleviate these problems, we perform MDE by virtual-world supervision and real-world SfM self-supervision. We call our proposalMonoDEVSNet. We compensate the SfM self-supervision limitations by leveraging virtual-world images with accurate semantic and depth supervision, as well as addressing the virtual-to-real domain gap. MonoDEVSNet outperformed previous MDE CNNs trained on monocular and even stereo sequences. We have publicly released MonoDEVSNet at <https://github.com/HMRC-AEL/MonoDEVSNet>. Finally, since MDE is performed to produce 3D information for being used in downstream tasks related to on-board perception. We also address the question of whether the standard metrics for MDE assessment are a good indicator for future MDE-based driving-related perception tasks. By using 3D object detection on point clouds as proxy of on-board perception, we conclude that, indeed, MDE evaluation metrics give rise to a ranking of methods which reflects relatively well the 3D object detection results we may expect.
	Address	March, 2022
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	IMPRIMA	Place of Publication		Editor	Antonio Lopez;Onay Urfalioglu
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-84-124793-0-0	Medium
	Area		Expedition		Conference
	Notes	ADAS			Approved	no
	Call Number	Admin @ si @ Gur2022			Serial	3712
Permanent link to this record



	Author	Diego Cheda
	Title	Monocular Depth Cues in Computer Vision Applications			Type	Book Whole
	Year	2012	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Depth perception is a key aspect of human vision. It is a routine and essential visual task that the human do effortlessly in many daily activities. This has often been associated with stereo vision, but humans have an amazing ability to perceive depth relations even from a single image by using several monocular cues. In the computer vision field, if image depth information were available, many tasks could be posed from a different perspective for the sake of higher performance and robustness. Nevertheless, given a single image, this possibility is usually discarded, since obtaining depth information has frequently been performed by three-dimensional reconstruction techniques, requiring two or more images of the same scene taken from different viewpoints. Recently, some proposals have shown the feasibility of computing depth information from single images. In essence, the idea is to take advantage of a priori knowledge of the acquisition conditions and the observed scene to estimate depth from monocular pictorial cues. These approaches try to precisely estimate the scene depth maps by employing computationally demanding techniques. However, to assist many computer vision algorithms, it is not really necessary computing a costly and detailed depth map of the image. Indeed, just a rough depth description can be very valuable in many problems. In this thesis, we have demonstrated how coarse depth information can be integrated in different tasks following alternative strategies to obtain more precise and robust results. In that sense, we have proposed a simple, but reliable enough technique, whereby image scene regions are categorized into discrete depth ranges to build a coarse depth map. Based on this representation, we have explored the potential usefulness of our method in three application domains from novel viewpoints: camera rotation parameters estimation, background estimation and pedestrian candidate generation. In the first case, we have computed camera rotation mounted in a moving vehicle applying two novels methods based on distant elements in the image, where the translation component of the image flow vectors is negligible. In background estimation, we have proposed a novel method to reconstruct the background by penalizing close regions in a cost function, which integrates color, motion, and depth terms. Finally, we have benefited of geometric and depth information available on single images for pedestrian candidate generation to significantly reduce the number of generated windows to be further processed by a pedestrian classifier. In all cases, results have shown that our approaches contribute to better performances.
	Address
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Daniel Ponsa;Antonio Lopez
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	ADAS			Approved	no
	Call Number	Admin @ si @ Che2012			Serial	2210
Permanent link to this record



	Author	Daniel Ponsa
	Title	Model-Based Visual Localisation of Contours and Vehicles			Type	Book Whole
	Year	2007	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords	Phd Thesis
	Abstract
	Address
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Antonio Lopez;Xavier Roca
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-84-935251-3-2	Medium
	Area		Expedition		Conference
	Notes	ADAS			Approved	no
	Call Number	ADAS @ adas @ Pon2007			Serial	1107
Permanent link to this record



	Author	Jose Carlos Rubio
	Title	Many-to-Many High Order Matching. Applications to Tracking and Object Segmentation			Type	Book Whole
	Year	2012	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Feature matching is a fundamental problem in Computer Vision, having multiple applications such as tracking, image classification and retrieval, shape recognition and stereo fusion. In numerous domains, it is useful to represent the local structure of the matching features to increase the matching accuracy or to make the correspondence invariant to certain transformations (affine, homography, etc. . . ). However, encoding this knowledge requires complicating the model by establishing high-order relationships between the model elements, and therefore increasing the complexity of the optimization problem. The importance of many-to-many matching is sometimes dismissed in the literature. Most methods are restricted to perform one-to-one matching, and are usually validated on synthetic, or non-realistic datasets. In a real challenging environment, with scale, pose and illumination variations of the object of interest, as well as the presence of occlusions, clutter, and noisy observations, many-to-many matching is necessary to achieve satisfactory results. As a consequence, finding the most likely many-to-many correspondence often involves a challenging combinatorial optimization process. In this work, we design and demonstrate matching algorithms that compute many-to-many correspondences, applied to several challenging problems. Our goal is to make use of high-order representations to improve the expressive power of the matching, at the same time that we make feasible the process of inference or optimization of such models. We effectively use graphical models as our preferred representation because they provide an elegant probabilistic framework to tackle structured prediction problems. We introduce a matching-based tracking algorithm which performs matching between frames of a video sequence in order to solve the difficult problem of headlight tracking at night-time. We also generalise this algorithm to solve the problem of data association applied to various tracking scenarios. We demonstrate the effectiveness of such approach in real video sequences and we show that our tracking algorithm can be used to improve the accuracy of a headlight classification system. In the second part of this work, we move from single (point) matching to dense (region) matching and we introduce a new hierarchical image representation. We make use of such model to develop a high-order many-to-many matching between pairs of images. We show that the use of high-order models in comparison to simpler models improves not only the accuracy of the results, but also the convergence speed of the inference algorithm. Finally, we keep exploiting the idea of region matching to design a fully unsupervised image co-segmentation algorithm that is able to perform competitively with state-of-the-art supervised methods. Our method also overcomes the typical drawbacks of some of the past works, such as avoiding the necessity of variate appearances on the image backgrounds. The region matching in this case is applied to effectively exploit inter-image information. We also extend this work to perform co-segmentation of videos, being the first time that such problem is addressed, as a way to perform video object segmentation
	Address
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Joan Serrat
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	ADAS			Approved	no
	Call Number	Admin @ si @ Rub2012			Serial	2206
Permanent link to this record



	Author	Cristhian Aguilera
	Title	Local feature description in cross-spectral imagery			Type	Book Whole
	Year	2017	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Over the last few years, the number of consumer computer vision applications has increased dramatically. Today, computer vision solutions can be found in video game consoles, smartphone applications, driving assistance – just to name a few. Ideally, we require the performance of those applications, particularly those that are safety critical to remain constant under any external environment factors, such as changes in illumination or weather conditions. However, this is not always possible or very difficult to obtain by only using visible imagery, due to the inherent limitations of the images from that spectral band. For that reason, the use of images from different or multiple spectral bands is becoming more appealing. The aforementioned possible advantages of using images from multiples spectral bands on various vision applications make multi-spectral image processing a relevant topic for research and development. Like in visible image processing, multi-spectral image processing needs tools and algorithms to handle information from various spectral bands. Furthermore, traditional tools such as local feature detection, which is the basis of many vision tasks such as visual odometry, image registration, or structure from motion, must be adjusted or reformulated to operate under new conditions. Traditional feature detection, description, and matching methods tend to underperform in multi-spectral settings, in comparison to mono-spectral settings, due to the natural differences between each spectral band. The work in this thesis is focused on the local feature description problem when cross-spectral images are considered. In this context, this dissertation has three main contributions. Firstly, the work starts by proposing the usage of a combination of frequency and spatial information, in a multi-scale scheme, as feature description. Evaluations of this proposal, based on classical hand-made feature descriptors, and comparisons with state of the art cross-spectral approaches help to find and understand limitations of such strategy. Secondly, different convolutional neural network (CNN) based architectures are evaluated when used to describe cross-spectral image patches. Results showed that CNN-based methods, designed to work with visible monocular images, could be successfully applied to the description of images from two different spectral bands, with just minor modifications. In this framework, a novel CNN-based network model, specifically intended to describe image patches from two different spectral bands, is proposed. This network, referred to as Q-Net, outperforms state of the art in the cross-spectral domain, including both previous hand-made solutions as well as L2 CNN-based architectures. The third contribution of this dissertation is in the cross-spectral feature description application domain. The multispectral odometry problem is tackled showing a real application of cross-spectral descriptors In addition to the three main contributions mentioned above, in this dissertation, two different multi-spectral datasets are generated and shared with the community to be used as benchmarks for further studies.
	Address	October 2017
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Angel Sappa
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-84-945373-6-3	Medium
	Area		Expedition		Conference
	Notes	ADAS; 600.118			Approved	no
	Call Number	Admin @ si @ Agu2017			Serial	3020
Permanent link to this record



	Author	Gemma Rotger
	Title	Lifelike Humans: Detailed Reconstruction of Expressive Human Faces			Type	Book Whole
	Year	2021	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Developing human-like digital characters is a challenging task since humans are used to recognizing our fellows, and find the computed generated characters inadequately humanized. To fulfill the standards of the videogame and digital film productions it is necessary to model and animate these characters the most closely to human beings. However, it is an arduous and expensive task, since many artists and specialists are required to work on a single character. Therefore, to fulfill these requirements we found an interesting option to study the automatic creation of detailed characters through inexpensive setups. In this work, we develop novel techniques to bring detailed characters by combining different aspects that stand out when developing realistic characters, skin detail, facial hairs, expressions, and microexpressions. We examine each of the mentioned areas with the aim of automatically recover each of the parts without user interaction nor training data. We study the problems for their robustness but also for the simplicity of the setup, preferring single-image with uncontrolled illumination and methods that can be easily computed with the commodity of a standard laptop. A detailed face with wrinkles and skin details is vital to develop a realistic character. In this work, we introduce our method to automatically describe facial wrinkles on the image and transfer to the recovered base face. Then we advance to facial hair recovery by resolving a fitting problem with a novel parametrization model. As of last, we develop a mapping function that allows transfer expressions and microexpressions between different meshes, which provides realistic animations to our detailed mesh. We cover all the mentioned points with the focus on key aspects as (i) how to describe skin wrinkles in a simple and straightforward manner, (ii) how to recover 3D from 2D detections, (iii) how to recover and model facial hair from 2D to 3D, (iv) how to transfer expressions between models holding both skin detail and facial hair, (v) how to perform all the described actions without training data nor user interaction. In this work, we present our proposals to solve these aspects with an efficient and simple setup. We validate our work with several datasets both synthetic and real data, prooving remarkable results even in challenging cases as occlusions as glasses, thick beards, and indeed working with different face topologies like single-eyed cyclops.
	Address
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Felipe Lumbreras;Antonio Agudo
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-84-122714-3-0	Medium
	Area		Expedition		Conference
	Notes	ADAS			Approved	no
	Call Number	Admin @ si @ Rot2021			Serial	3513
Permanent link to this record

Select All Deselect All

<< 1 2 3 4 5 6 7 >>

List View

Citations

Details

All Found Records Selected Records:

Save Citations: Format:

Export Records: Format: