Publicacions CVC -- Query Results

	Author	Marçal Rusiñol; J. Chazalon; Katerine Diaz
	Title	Augmented Songbook: an Augmented Reality Educational Application for Raising Music Awareness			Type	Journal Article
	Year	2018	Publication	Multimedia Tools and Applications	Abbreviated Journal	MTAP
	Volume	77	Issue	11	Pages	13773-13798
	Keywords	Augmented reality; Document image matching; Educational applications
	Abstract	This paper presents the development of an Augmented Reality mobile application which aims at sensitizing young children to abstract concepts of music. Such concepts are, for instance, the musical notation or the idea of rhythm. Recent studies in Augmented Reality for education suggest that such technologies have multiple benefits for students, including younger ones. As mobile document image acquisition and processing gains maturity on mobile platforms, we explore how it is possible to build a markerless and real-time application to augment the physical documents with didactic animations and interactive virtual content. Given a standard image processing pipeline, we compare the performance of different local descriptors at two key stages of the process. Results suggest alternatives to the SIFT local descriptors, regarding result quality and computational efficiency, both for document model identification and perspective transform estimation. All experiments are performed on an original and public dataset we introduce here.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; ADAS; 600.084; 600.121; 600.118; 600.129			Approved	no
	Call Number	Admin @ si @ RCD2018			Serial	2996



	Author	David Aldavert; Marçal Rusiñol; Ricardo Toledo; Josep Llados
	Title	A Study of Bag-of-Visual-Words Representations for Handwritten Keyword Spotting			Type	Journal Article
	Year	2015	Publication	International Journal on Document Analysis and Recognition	Abbreviated Journal	IJDAR
	Volume	18	Issue	3	Pages	223-234
	Keywords	Bag-of-Visual-Words; Keyword spotting; Handwritten documents; Performance evaluation
	Abstract	The Bag-of-Visual-Words (BoVW) framework has gained popularity among the document image analysis community, specifically as a representation of handwritten words for recognition or spotting purposes. Although in the computer vision field the BoVW method has been greatly improved, most of the approaches in the document image analysis domain still rely on the basic implementation of the BoVW method disregarding such latest refinements. In this paper, we present a review of those improvements and their application to the keyword spotting task. We thoroughly evaluate their impact against a baseline system in the well-known George Washington dataset and compare the obtained results against nine state-of-the-art keyword spotting methods. In addition, we also compare both the baseline and improved systems with the methods presented at the Handwritten Keyword Spotting Competition 2014.
	Address
	Corporate Author				Thesis
	Publisher	Springer Berlin Heidelberg	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	1433-2833	ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; ADAS; 600.055; 600.061; 601.223; 600.077; 600.097			Approved	no
	Call Number	Admin @ si @ ART2015			Serial	2679



	Author	Juan Jose Rubio; Takahiro Kashiwa; Teera Laiteerapong; Wenlong Deng; Kohei Nagai; Sergio Escalera; Kotaro Nakayama; Yutaka Matsuo; Helmut Prendinger
	Title	Multi-class structural damage segmentation using fully convolutional networks			Type	Journal Article
	Year	2019	Publication	Computers in Industry	Abbreviated Journal	COMPUTIND
	Volume	112	Issue		Pages	103121
	Keywords	Bridge damage detection; Deep learning; Semantic segmentation
	Abstract	Structural Health Monitoring (SHM) has benefited from computer vision and, more recently, Deep Learning approaches to accurately estimate the state of deterioration of infrastructure. In our work, we test Fully Convolutional Networks (FCNs) with a dataset of deck areas of bridges for damage segmentation. We create a dataset for delamination and rebar exposure that has been collected from inspection records of bridges in Niigata Prefecture, Japan. The dataset consists of 734 images with three labels per image, which makes it the largest dataset of images of bridge deck damage. This data allows us to estimate the performance of our method based on regions of agreement, which emulates the uncertainty of in-field inspections. We demonstrate the practicality of FCNs to perform automated semantic segmentation of surface damages. Our model achieves a mean accuracy of 89.7% for delamination and 78.4% for rebar exposure, and a weighted F1 score of 81.9%.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	HuPBA; no proj;MILAB;ADAS			Approved	no
	Call Number	Admin @ si @ RKL2019			Serial	3315



	Author	Angel Sappa; Fadi Dornaika; Daniel Ponsa; David Geronimo; Antonio Lopez
	Title	An Efficient Approach to Onboard Stereo Vision System Pose Estimation			Type	Journal Article
	Year	2008	Publication	IEEE Transactions on Intelligent Transportation Systems	Abbreviated Journal	TITS
	Volume	9	Issue	3	Pages	476–490
	Keywords	Camera extrinsic parameter estimation; ground plane estimation; onboard stereo vision system
	Abstract	This paper presents an efficient technique for estimating the pose of an onboard stereo vision system relative to the environment’s dominant surface area, which is supposed to be the road surface. Unlike previous approaches, it can be used either for urban or highway scenarios since it is not based on a specific visual traffic feature extraction but on 3-D raw data points. The whole process is performed in the Euclidean space and consists of two stages. Initially, a compact 2-D representation of the original 3-D data points is computed. Then, a RANdom SAmple Consensus (RANSAC) based least-squares approach is used to fit a plane to the road. Fast RANSAC fitting is obtained by selecting points according to a probability function that takes into account the density of points at a given depth. Finally, stereo camera height and pitch angle are computed relative to the fitted road plane. The proposed technique is intended to be used in driver-assistance systems for applications such as vehicle or pedestrian detection. Experimental results on urban environments, which are the most challenging scenarios (i.e., flat/uphill/downhill driving, speed bumps, and car’s accelerations), are presented. These results are validated with manually annotated ground truth. Additionally, comparisons with previous works are presented to show the improvements in the central processing unit processing time, as well as in the accuracy of the obtained results.
	Address
	Corporate Author				Thesis
	Publisher	IEEE	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	ADAS			Approved	no
	Call Number	ADAS @ adas @ SDP2008			Serial	1000



	Author	Jose Luis Gomez; Gabriel Villalonga; Antonio Lopez
	Title	Co-Training for Deep Object Detection: Comparing Single-Modal and Multi-Modal Approaches			Type	Journal Article
	Year	2021	Publication	Sensors	Abbreviated Journal	SENS
	Volume	21	Issue	9	Pages	3185
	Keywords	co-training; multi-modality; vision-based object detection; ADAS; self-driving
	Abstract	Top-performing computer vision models are powered by convolutional neural networks (CNNs). Training an accurate CNN highly depends on both the raw sensor data and their associated ground truth (GT). Collecting such GT is usually done through human labeling, which is time-consuming and does not scale as we wish. This data-labeling bottleneck may be intensified due to domain shifts among image sensors, which could force per-sensor data labeling. In this paper, we focus on the use of co-training, a semi-supervised learning (SSL) method, for obtaining self-labeled object bounding boxes (BBs), i.e., the GT to train deep object detectors. In particular, we assess the goodness of multi-modal co-training by relying on two different views of an image, namely, appearance (RGB) and estimated depth (D). Moreover, we compare appearance-based single-modal co-training with multi-modal. Our results suggest that in a standard SSL setting (no domain shift, a few human-labeled data) and under virtual-to-real domain shift (many virtual-world labeled data, no human-labeled data) multi-modal co-training outperforms single-modal. In the latter case, by performing GAN-based domain translation both co-training modalities are on par, at least when using an off-the-shelf depth estimation model not specifically trained on the translated images.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	ADAS; 600.118			Approved	no
	Call Number	Admin @ si @ GVL2021			Serial	3562



	Author	Aura Hernandez-Sabate; Jose Elias Yauri; Pau Folch; Miquel Angel Piera; Debora Gil
	Title	Recognition of the Mental Workloads of Pilots in the Cockpit Using EEG Signals			Type	Journal Article
	Year	2022	Publication	Applied Sciences	Abbreviated Journal	APPLSCI
	Volume	12	Issue	5	Pages	2298
	Keywords	Cognitive states; Mental workload; EEG analysis; Neural networks; Multimodal data fusion
	Abstract	The commercial flightdeck is a naturally multi-tasking work environment, one in which interruptions are frequent and come in various forms, contributing in many cases to aviation incident reports. Automatic characterization of pilots’ workloads is essential to preventing these kinds of incidents. In addition, minimizing the physiological sensor network as much as possible remains both a challenge and a requirement. Electroencephalogram (EEG) signals have shown high correlations with specific cognitive and mental states, such as workload. However, there is not enough evidence in the literature to validate how well models generalize in cases of new subjects performing tasks with workloads similar to the ones included during the model’s training. In this paper, we propose a convolutional neural network to classify EEG features across different mental workloads in a continuous performance task test that partly measures working memory and working memory capacity. Our model is valid at the general population level and it is able to transfer task learning to pilot mental workload recognition in a simulated operational environment.
	Address	February 2022
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	IAM; ADAS; 600.139; 600.145; 600.118			Approved	no
	Call Number	Admin @ si @ HYF2022			Serial	3720



	Author	David Vazquez; Jorge Bernal; F. Javier Sanchez; Gloria Fernandez Esparrach; Antonio Lopez; Adriana Romero; Michal Drozdzal; Aaron Courville
	Title	A Benchmark for Endoluminal Scene Segmentation of Colonoscopy Images			Type	Journal Article
	Year	2017	Publication	Journal of Healthcare Engineering	Abbreviated Journal	JHCE
	Volume		Issue		Pages	2040-2295
	Keywords	Colonoscopy images; Deep Learning; Semantic Segmentation
	Abstract	Colorectal cancer (CRC) is the third cause of cancer death worldwide. Currently, the standard approach to reduce CRC-related mortality is to perform regular screening in search for polyps and colonoscopy is the screening tool of choice. The main limitations of this screening procedure are polyp miss-rate and inability to perform visual assessment of polyp malignancy. These drawbacks can be reduced by designing Decision Support Systems (DSS) aiming to help clinicians in the different stages of the procedure by providing endoluminal scene segmentation. Thus, in this paper, we introduce an extended benchmark of colonoscopy image segmentation, with the hope of establishing a new strong benchmark for colonoscopy image analysis research. The proposed dataset consists of 4 relevant classes to inspect the endoluminal scene, targeting different clinical needs. Together with the dataset and taking advantage of advances in semantic segmentation literature, we provide new baselines by training standard fully convolutional networks (FCN). We perform a comparative study to show that FCNs significantly outperform, without any further post-processing, prior results in endoluminal scene segmentation, especially with respect to polyp segmentation and localization.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	ADAS; MV; 600.075; 600.085; 600.076; 601.281; 600.118;MILAB			Approved	no
	Call Number	VBS2017b			Serial	2940



	Author	Miguel Oliveira; Angel Sappa; Victor Santos
	Title	A probabilistic approach for color correction in image mosaicking applications			Type	Journal Article
	Year	2015	Publication	IEEE Transactions on Image Processing	Abbreviated Journal	TIP
	Volume	24	Issue	2	Pages	508-523
	Keywords	Color correction; image mosaicking; color transfer; color palette mapping functions
	Abstract	Image mosaicking applications require both geometrical and photometrical registrations between the images that compose the mosaic. This paper proposes a probabilistic color correction algorithm for correcting the photometrical disparities. First, the image to be color corrected is segmented into several regions using mean shift. Then, connected regions are extracted using a region fusion algorithm. Local joint image histograms of each region are modeled as collections of truncated Gaussians using a maximum likelihood estimation procedure. Then, local color palette mapping functions are computed using these sets of Gaussians. The color correction is performed by applying those functions to all the regions of the image. An extensive comparison with ten other state of the art color correction algorithms is presented, using two different image pair data sets. Results show that the proposed approach obtains the best average scores in both data sets and evaluation metrics and is also the most robust to failures.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	1057-7149	ISBN		Medium
	Area		Expedition		Conference
	Notes	ADAS; 600.076			Approved	no
	Call Number	Admin @ si @ OSS2015b			Serial	2554



	Author	Daniel Ponsa; Antonio Lopez
	Title	Variance reduction techniques in particle-based visual contour tracking			Type	Journal Article
	Year	2009	Publication	Pattern Recognition	Abbreviated Journal	PR
	Volume	42	Issue	11	Pages	2372–2391
	Keywords	Contour tracking; Active shape models; Kalman filter; Particle filter; Importance sampling; Unscented particle filter; Rao-Blackwellization; Partitioned sampling
	Abstract	This paper presents a comparative study of three different strategies to improve the performance of particle filters, in the context of visual contour tracking: the unscented particle filter, the Rao-Blackwellized particle filter, and the partitioned sampling technique. The tracking problem analyzed is the joint estimation of the global and local transformation of the outline of a given target, represented following the active shape model approach. The main contributions of the paper are the novel adaptations of the considered techniques to this generic problem, and the quantitative assessment of their performance through extensive experimental work.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	ADAS			Approved	no
	Call Number	ADAS @ adas @ PoL2009a			Serial	1168



	Author	Daniel Hernandez; Antonio Espinosa; David Vazquez; Antonio Lopez; Juan C. Moure
	Title	3D Perception With Slanted Stixels on GPU			Type	Journal Article
	Year	2021	Publication	IEEE Transactions on Parallel and Distributed Systems	Abbreviated Journal	TPDS
	Volume	32	Issue	10	Pages	2434-2447
	Keywords	Daniel Hernandez-Juarez; Antonio Espinosa; David Vazquez; Antonio M. Lopez; Juan C. Moure
	Abstract	This article presents a GPU-accelerated software design of the recently proposed model of Slanted Stixels, which represents the geometric and semantic information of a scene in a compact and accurate way. We reformulate the measurement depth model to reduce the computational complexity of the algorithm, relying on the confidence of the depth estimation and the identification of invalid values to handle outliers. The proposed massively parallel scheme and data layout for the irregular computation pattern that corresponds to a Dynamic Programming paradigm is described and carefully analyzed in performance terms. Performance is shown to scale gracefully on current generation embedded GPUs. We assess the proposed methods in terms of semantic and geometric accuracy as well as run-time performance on three publicly available benchmark datasets. Our approach achieves real-time performance with high accuracy for 2048 × 1024 image sizes and 4 × 4 Stixel resolution on the low-power embedded GPU of an NVIDIA Tegra Xavier.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	ADAS; 600.124; 600.118			Approved	no
	Call Number	Admin @ si @ HEV2021			Serial	3561