Publicacions CVC -- Query Results

Records
Author	Zhijie Fang; David Vazquez; Antonio Lopez
Title	On-Board Detection of Pedestrian Intentions			Type	Journal Article
Year	2017	Publication	Sensors	Abbreviated Journal	SENS
Volume	17	Issue	10	Pages	2193
Keywords	pedestrian intention; ADAS; self-driving
Abstract	Avoiding vehicle-to-pedestrian crashes is a critical requirement for nowadays advanced driver assistant systems (ADAS) and future self-driving vehicles. Accordingly, detecting pedestrians from raw sensor data has a history of more than 15 years of research, with vision playing a central role. During the last years, deep learning has boosted the accuracy of image-based pedestrian detectors. However, detection is just the first step towards answering the core question, namely is the vehicle going to crash with a pedestrian provided preventive actions are not taken? Therefore, knowing as soon as possible if a detected pedestrian has the intention of crossing the road ahead of the vehicle is essential for performing safe and comfortable maneuvers that prevent a crash. However, compared to pedestrian detection, there is relatively little literature on detecting pedestrian intentions. This paper aims to contribute along this line by presenting a new vision-based approach which analyzes the pose of a pedestrian along several frames to determine if he or she is going to enter the road or not. We present experiments showing 750 ms of anticipation for pedestrians crossing the road, which at a typical urban driving speed of 50 km/h can provide 15 additional meters (compared to a pure pedestrian detector) for vehicle automatic reactions or to warn the driver. Moreover, in contrast with state-of-the-art methods, our approach is monocular, neither requiring stereo nor optical flow information.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ADAS; 600.085; 600.076; 601.223; 600.116; 600.118			Approved	no
Call Number	Admin @ si @ FVL2017			Serial	2983



Author	Xavier Soria; Angel Sappa; Riad I. Hammoud
Title	Wide-Band Color Imagery Restoration for RGB-NIR Single Sensor Images			Type	Journal Article
Year	2018	Publication	Sensors	Abbreviated Journal	SENS
Volume	18	Issue	7	Pages	2059
Keywords	RGB-NIR sensor; multispectral imaging; deep learning; CNNs
Abstract	Multi-spectral RGB-NIR sensors have become ubiquitous in recent years. These sensors allow the visible and near-infrared spectral bands of a given scene to be captured at the same time. With such cameras, the acquired imagery has a compromised RGB color representation due to near-infrared bands (700–1100 nm) cross-talking with the visible bands (400–700 nm). This paper proposes two deep learning-based architectures to recover the full RGB color images, thus removing the NIR information from the visible bands. The proposed approaches directly restore the high-resolution RGB image by means of convolutional neural networks. They are evaluated with several outdoor images; both architectures reach a similar performance when evaluated in different scenarios and using different similarity metrics. Both of them improve the state of the art approaches.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ADAS; MSIAU; 600.086; 600.130; 600.122; 600.118			Approved	no
Call Number	Admin @ si @ SSH2018			Serial	3145



Author	Xavier Perez Sala; Sergio Escalera; Cecilio Angulo; Jordi Gonzalez
Title	A survey on model based approaches for 2D and 3D visual human pose recovery			Type	Journal Article
Year	2014	Publication	Sensors	Abbreviated Journal	SENS
Volume	14	Issue	3	Pages	4189-4210
Keywords	human pose recovery; human body modelling; behavior analysis; computer vision
Abstract	Human Pose Recovery has been studied in the field of Computer Vision for the last 40 years. Several approaches have been reported, and significant improvements have been obtained in both data representation and model design. However, the problem of Human Pose Recovery in uncontrolled environments is far from being solved. In this paper, we define a general taxonomy to group model based approaches for Human Pose Recovery, which is composed of five main modules: appearance, viewpoint, spatial relations, temporal consistence, and behavior. Subsequently, a methodological comparison is performed following the proposed taxonomy, evaluating current SoA approaches in the aforementioned five group categories. As a result of this comparison, we discuss the main advantages and drawbacks of the reviewed literature.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	HuPBA; ISE; 600.046; 600.063; 600.078; MILAB			Approved	no
Call Number	Admin @ si @ PEA2014			Serial	2443



Author	Wenjuan Gong; Xuena Zhang; Jordi Gonzalez; Andrews Sobral; Thierry Bouwmans; Changhe Tu; El-hadi Zahzah
Title	Human Pose Estimation from Monocular Images: A Comprehensive Survey			Type	Journal Article
Year	2016	Publication	Sensors	Abbreviated Journal	SENS
Volume	16	Issue	12	Pages	1966
Keywords	human pose estimation; human body models; generative methods; discriminative methods; top-down methods; bottom-up methods
Abstract	Human pose estimation refers to the estimation of the location of body parts and how they are connected in an image. Human pose estimation from monocular images has wide applications (e.g., image indexing). Several surveys on human pose estimation can be found in the literature, but they focus on a certain category; for example, model-based approaches or human motion analysis, etc. As far as we know, an overall review of this problem domain has yet to be provided. Furthermore, recent advancements based on deep learning have brought novel algorithms for this problem. In this paper, a comprehensive survey of human pose estimation from monocular images is carried out including milestone works and recent advancements. Based on one standard pipeline for the solution of computer vision problems, this survey splits the problem into several modules: feature extraction and description, human body models, and modeling methods. Problem modeling methods are approached based on two means of categorization in this survey. One way to categorize includes top-down and bottom-up methods, and another way includes generative and discriminative methods. Considering the fact that one direct application of human pose estimation is to provide initialization for automatic video surveillance, there are additional sections for motion-related methods in all modules: motion features, motion models, and motion-based methods. Finally, the paper also collects 26 publicly available data sets for validation and provides error measurement methods that are frequently used.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ISE; 600.098; 600.119			Approved	no
Call Number	Admin @ si @ GZG2016			Serial	2933



Author	Sergio Escalera; Xavier Baro; Jordi Vitria; Petia Radeva; Bogdan Raducanu
Title	Social Network Extraction and Analysis Based on Multimodal Dyadic Interaction			Type	Journal Article
Year	2012	Publication	Sensors	Abbreviated Journal	SENS
Volume	12	Issue	2	Pages	1702-1719
Keywords
Abstract	IF=1.77 (2010) Social interactions are a very important component in people's lives. Social network analysis has become a common technique used to model and quantify the properties of social interactions. In this paper, we propose an integrated framework to explore the characteristics of a social network extracted from multimodal dyadic interactions. For our study, we used a set of videos belonging to New York Times' Blogging Heads opinion blog. The Social Network is represented as an oriented graph, whose directed links are determined by the Influence Model. The links' weights are a measure of the "influence" a person has over the other. The states of the Influence Model encode automatically extracted audio/visual features from our videos using state-of-the-art algorithms. Our results are reported in terms of accuracy of audio/visual data fusion for speaker segmentation and centrality measures used to characterize the extracted social network.
Address
Corporate Author				Thesis
Publisher	Molecular Diversity Preservation International	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	MILAB; OR; HuPBA; MV			Approved	no
Call Number	Admin @ si @ EBV2012			Serial	1885



Author	Saad Minhas; Zeba Khanam; Shoaib Ehsan; Klaus McDonald Maier; Aura Hernandez-Sabate
Title	Weather Classification by Utilizing Synthetic Data			Type	Journal Article
Year	2022	Publication	Sensors	Abbreviated Journal	SENS
Volume	22	Issue	9	Pages	3193
Keywords	Weather classification; synthetic data; dataset; autonomous car; computer vision; advanced driver assistance systems; deep learning; intelligent transportation systems
Abstract	Weather prediction from real-world images can be termed a complex task when targeting classification using neural networks. Moreover, the number of images throughout the available datasets can contain a huge amount of variance when comparing locations with the weather those images are representing. In this article, the capabilities of a custom built driver simulator are explored specifically to simulate a wide range of weather conditions. Moreover, the performance of a new synthetic dataset generated by the above simulator is also assessed. The results indicate that the use of synthetic datasets in conjunction with real-world datasets can increase the training efficiency of the CNNs by as much as 74%. The article paves a way forward to tackle the persistent problem of bias in vision-based datasets.
Address	21 April 2022
Corporate Author				Thesis
Publisher	MDPI	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	IAM; 600.139; 600.159; 600.166; 600.145			Approved	no
Call Number	Admin @ si @ MKE2022			Serial	3761



Author	Rafael E. Rivadeneira; Angel Sappa; Boris X. Vintimilla; Riad I. Hammoud
Title	A Novel Domain Transfer-Based Approach for Unsupervised Thermal Image Super-Resolution			Type	Journal Article
Year	2022	Publication	Sensors	Abbreviated Journal	SENS
Volume	22	Issue	6	Pages	2254
Keywords	Thermal image super-resolution; unsupervised super-resolution; thermal images; attention module; semiregistered thermal images
Abstract	This paper presents a transfer domain strategy to tackle the limitations of low-resolution thermal sensors and generate higher-resolution images of reasonable quality. The proposed technique employs a CycleGAN architecture and uses a ResNet as an encoder in the generator along with an attention module and a novel loss function. The network is trained on a multi-resolution thermal image dataset acquired with three different thermal sensors. The results show better benchmark performance on the 2nd CVPR-PBVS-2021 thermal image super-resolution challenge than state-of-the-art methods. The code of this work is available online.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	MSIAU			Approved	no
Call Number	Admin @ si @ RSV2022b			Serial	3688



Author	P. Ricaurte; C. Chilan; Cristhian A. Aguilera-Carrasco; Boris X. Vintimilla; Angel Sappa
Title	Feature Point Descriptors: Infrared and Visible Spectra			Type	Journal Article
Year	2014	Publication	Sensors	Abbreviated Journal	SENS
Volume	14	Issue	2	Pages	3690-3701
Keywords
Abstract	This manuscript evaluates the behavior of classical feature point descriptors when they are used in images from long-wave infrared spectral band and compare them with the results obtained in the visible spectrum. Robustness to changes in rotation, scaling, blur, and additive noise are analyzed using a state of the art framework. Experimental results using a cross-spectral outdoor image data set are presented and conclusions from these experiments are given.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ADAS; 600.055; 600.076			Approved	no
Call Number	Admin @ si @ RCA2014a			Serial	2474



Author	O. Fors; J. Nuñez; Xavier Otazu; A. Prades; Robert D. Cardinal
Title	Improving the Ability of Image Sensors to Detect Faint Stars and Moving Objects Using Image Deconvolution Techniques			Type	Journal Article
Year	2010	Publication	Sensors	Abbreviated Journal	SENS
Volume	10	Issue	3	Pages	1743-1752
Keywords	image processing; image deconvolution; faint stars; space debris; wavelet transform
Abstract	In this paper we show how the techniques of image deconvolution can increase the ability of image sensors as, for example, CCD imagers, to detect faint stars or faint orbital objects (small satellites and space debris). In the case of faint stars, we show that this benefit is equivalent to double the quantum efficiency of the used image sensor or to increase the effective telescope aperture by more than 30% without decreasing the astrometric precision or introducing artificial bias. In the case of orbital objects, the deconvolution technique can double the signal-to-noise ratio of the image, which helps to discover and control dangerous objects as space debris or lost satellites. The benefits obtained using CCD detectors can be extrapolated to any kind of image sensors.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	CIC			Approved	no
Call Number	CAT @ cat @ FNO2010			Serial	1285



Author	Mark Philip Philipsen; Jacob Velling Dueholm; Anders Jorgensen; Sergio Escalera; Thomas B. Moeslund
Title	Organ Segmentation in Poultry Viscera Using RGB-D			Type	Journal Article
Year	2018	Publication	Sensors	Abbreviated Journal	SENS
Volume	18	Issue	1	Pages	117
Keywords	semantic segmentation; RGB-D; random forest; conditional random field; 2D; 3D; CNN
Abstract	We present a pattern recognition framework for semantic segmentation of visual structures, that is, multi-class labelling at pixel level, and apply it to the task of segmenting organs in the eviscerated viscera from slaughtered poultry in RGB-D images. This is a step towards replacing the current strenuous manual inspection at poultry processing plants. Features are extracted from feature maps such as activation maps from a convolutional neural network (CNN). A random forest classifier assigns class probabilities, which are further refined by utilizing context in a conditional random field. The presented method is compatible with both 2D and 3D features, which allows us to explore the value of adding 3D and CNN-derived features. The dataset consists of 604 RGB-D images showing 151 unique sets of eviscerated viscera from four different perspectives. A mean Jaccard index of 78.11% is achieved across the four classes of organs by using features derived from 2D, 3D and a CNN, compared to 74.28% using only basic 2D image features.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	HUPBA; no proj			Approved	no
Call Number	Admin @ si @ PVJ2018			Serial	3072



Author	Jose Luis Gomez; Gabriel Villalonga; Antonio Lopez
Title	Co-Training for Deep Object Detection: Comparing Single-Modal and Multi-Modal Approaches			Type	Journal Article
Year	2021	Publication	Sensors	Abbreviated Journal	SENS
Volume	21	Issue	9	Pages	3185
Keywords	co-training; multi-modality; vision-based object detection; ADAS; self-driving
Abstract	Top-performing computer vision models are powered by convolutional neural networks (CNNs). Training an accurate CNN highly depends on both the raw sensor data and their associated ground truth (GT). Collecting such GT is usually done through human labeling, which is time-consuming and does not scale as we wish. This data-labeling bottleneck may be intensified due to domain shifts among image sensors, which could force per-sensor data labeling. In this paper, we focus on the use of co-training, a semi-supervised learning (SSL) method, for obtaining self-labeled object bounding boxes (BBs), i.e., the GT to train deep object detectors. In particular, we assess the goodness of multi-modal co-training by relying on two different views of an image, namely, appearance (RGB) and estimated depth (D). Moreover, we compare appearance-based single-modal co-training with multi-modal. Our results suggest that in a standard SSL setting (no domain shift, a few human-labeled data) and under virtual-to-real domain shift (many virtual-world labeled data, no human-labeled data) multi-modal co-training outperforms single-modal. In the latter case, by performing GAN-based domain translation both co-training modalities are on par, at least when using an off-the-shelf depth estimation model not specifically trained on the translated images.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ADAS; 600.118			Approved	no
Call Number	Admin @ si @ GVL2021			Serial	3562



Author	Idoia Ruiz; Joan Serrat
Title	Hierarchical Novelty Detection for Traffic Sign Recognition			Type	Journal Article
Year	2022	Publication	Sensors	Abbreviated Journal	SENS
Volume	22	Issue	12	Pages	4389
Keywords	Novelty detection; hierarchical classification; deep learning; traffic sign recognition; autonomous driving; computer vision
Abstract	Recent works have made significant progress in novelty detection, i.e., the problem of detecting samples of novel classes, never seen during training, while classifying those that belong to known classes. However, the only information this task provides about novel samples is that they are unknown. In this work, we leverage hierarchical taxonomies of classes to provide informative outputs for samples of novel classes. We predict their closest class in the taxonomy, i.e., its parent class. We address this problem, known as hierarchical novelty detection, by proposing a novel loss, namely Hierarchical Cosine Loss, that is designed to learn class prototypes along with an embedding of discriminative features consistent with the taxonomy. We apply it to traffic sign recognition, where we predict the parent class semantics for new types of traffic signs. Our model beats state-of-the-art approaches on two large scale traffic sign benchmarks, Mapillary Traffic Sign Dataset (MTSD) and Tsinghua-Tencent 100K (TT100K), and performs similarly on natural images benchmarks (AWA2, CUB). For TT100K and MTSD, our approach is able to detect novel samples at the correct nodes of the hierarchy with 81% and 36% accuracy, respectively, at 80% known class accuracy.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ADAS; 600.154			Approved	no
Call Number	Admin @ si @ RuS2022			Serial	3684



Author	Gabriel Villalonga; Joost Van de Weijer; Antonio Lopez
Title	Recognizing new classes with synthetic data in the loop: application to traffic sign recognition			Type	Journal Article
Year	2020	Publication	Sensors	Abbreviated Journal	SENS
Volume	20	Issue	3	Pages	583
Keywords
Abstract	On-board vision systems may need to increase the number of classes that can be recognized in a relatively short period. For instance, a traffic sign recognition system may suddenly be required to recognize new signs. Since collecting and annotating samples of such new classes may need more time than we wish, especially for uncommon signs, we propose a method to generate these samples by combining synthetic images and Generative Adversarial Network (GAN) technology. In particular, the GAN is trained on synthetic and real-world samples from known classes to perform synthetic-to-real domain adaptation, but applied to synthetic samples of the new classes. Using the Tsinghua dataset with a synthetic counterpart, SYNTHIA-TS, we have run an extensive set of experiments. The results show that the proposed method is indeed effective, provided that we use a proper Convolutional Neural Network (CNN) to perform the traffic sign recognition (classification) task as well as a proper GAN to transform the synthetic images. Here, a ResNet101-based classifier and domain adaptation based on CycleGAN performed extremely well for a ratio of ∼1/4 for new/known classes; even for more challenging ratios such as ∼4/1, the results are also very positive.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	LAMP; ADAS; 600.118; 600.120			Approved	no
Call Number	Admin @ si @ VWL2020			Serial	3405



Author	Cristhian Aguilera; Fernando Barrera; Felipe Lumbreras; Angel Sappa; Ricardo Toledo
Title	Multispectral Image Feature Points			Type	Journal Article
Year	2012	Publication	Sensors	Abbreviated Journal	SENS
Volume	12	Issue	9	Pages	12661-12672
Keywords	multispectral image descriptor; color and infrared images; feature point descriptor
Abstract	Far-Infrared and Visible Spectrum images. It allows matching interest points on images of the same scene but acquired in different spectral bands. Initially, points of interest are detected on both images through a SIFT-like based scale space representation. Then, these points are characterized using an Edge Oriented Histogram (EOH) descriptor. Finally, points of interest from multispectral images are matched by finding nearest couples using the information from the descriptor. The provided experimental results and comparisons with similar methods show both the validity of the proposed approach as well as the improvements it offers with respect to the current state-of-the-art.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ADAS			Approved	no
Call Number	Admin @ si @ ABL2012			Serial	2154



Author	Cristhian A. Aguilera-Carrasco; Cristhian Aguilera; Cristobal A. Navarro; Angel Sappa
Title	Fast CNN Stereo Depth Estimation through Embedded GPU Devices			Type	Journal Article
Year	2020	Publication	Sensors	Abbreviated Journal	SENS
Volume	20	Issue	11	Pages	3249
Keywords	stereo matching; deep learning; embedded GPU
Abstract	Current CNN-based stereo depth estimation models can barely run under real-time constraints on embedded graphic processing unit (GPU) devices. Moreover, state-of-the-art evaluations usually do not consider model optimization techniques, being that it is unknown what is the current potential on embedded GPU devices. In this work, we evaluate two state-of-the-art models on three different embedded GPU devices, with and without optimization methods, presenting performance results that illustrate the actual capabilities of embedded GPU devices for stereo depth estimation. More importantly, based on our evaluation, we propose the use of a U-Net like architecture for postprocessing the cost-volume, instead of a typical sequence of 3D convolutions, drastically augmenting the runtime speed of current models. In our experiments, we achieve real-time inference speed, in the range of 5–32 ms, for 1216 × 368 input stereo images on the Jetson TX2, Jetson Xavier, and Jetson Nano embedded devices.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	MSIAU; 600.122			Approved	no
Call Number	Admin @ si @ AAN2020			Serial	3428