Publicacions CVC -- Query Results

[11–20] << 21 22 23 24 25 26 27 28 29 30 >> [31–40]

Details

Records
Author	Mohammad Rouhani; Angel Sappa
Title	A Fast accurate Implicit Polynomial Fitting Approach			Type	Conference Article
Year	2010	Publication	17th IEEE International Conference on Image Processing	Abbreviated Journal
Volume		Issue		Pages	1429–1432
Keywords
Abstract	This paper presents a novel hybrid approach that combines state of the art fitting algorithms: algebraic-based and geometric-based. It consists of two steps; first, the 3L algorithm is used as an initialization and then, the obtained result, is improved through a geometric approach. The adopted geometric approach is based on a distance estimation that avoids costly search for the real orthogonal distance. Experimental results are presented as well as quantitative comparisons.
Address	Hong-Kong
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	1522-4880	ISBN	978-1-4244-7992-4	Medium
Area		Expedition		Conference	ICIP
Notes	ADAS			Approved	no
Call Number	ADAS @ adas @ RoS2010b			Serial	1359
Permanent link to this record



Author	Sergio Escalera; Alicia Fornes; O. Pujol; Petia Radeva; Gemma Sanchez; Josep Llados
Title	Blurred Shape Model for Binary and Grey-level Symbol Recognition			Type	Journal Article
Year	2009	Publication	Pattern Recognition Letters	Abbreviated Journal	PRL
Volume	30	Issue	15	Pages	1424–1433
Keywords
Abstract	Many symbol recognition problems require the use of robust descriptors in order to obtain rich information of the data. However, the research of a good descriptor is still an open issue due to the high variability of symbols appearance. Rotation, partial occlusions, elastic deformations, intra-class and inter-class variations, or high variability among symbols due to different writing styles, are just a few problems. In this paper, we introduce a symbol shape description to deal with the changes in appearance that these types of symbols suffer. The shape of the symbol is aligned based on principal components to make the recognition invariant to rotation and reflection. Then, we present the Blurred Shape Model descriptor (BSM), where new features encode the probability of appearance of each pixel that outlines the symbols shape. Moreover, we include the new descriptor in a system to deal with multi-class symbol categorization problems. Adaboost is used to train the binary classifiers, learning the BSM features that better split symbol classes. Then, the binary problems are embedded in an Error-Correcting Output Codes framework (ECOC) to deal with the multi-class case. The methodology is evaluated on different synthetic and real data sets. State-of-the-art descriptors and classifiers are compared, showing the robustness and better performance of the present scheme to classify symbols with high variability of appearance.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	HuPBA; DAG; MILAB			Approved	no
Call Number	BCNPCL @ bcnpcl @ EFP2009a			Serial	1180
Permanent link to this record



Author	Andres Traumann; Gholamreza Anbarjafari; Sergio Escalera
Title	Accurate 3D Measurement Using Optical Depth Information			Type	Journal Article
Year	2015	Publication	Electronic Letters	Abbreviated Journal	EL
Volume	51	Issue	18	Pages	1420-1422
Keywords
Abstract	A novel three-dimensional measurement technique is proposed. The methodology consists in mapping from the screen coordinates reported by the optical camera to the real world, and integrating distance gradients from the beginning to the end point, while also minimising the error through fitting pixel locations to a smooth curve. The results demonstrate accuracy of less than half a centimetre using Microsoft Kinect II.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	HuPBA;MILAB			Approved	no
Call Number	Admin @ si @ TAE2015			Serial	2647
Permanent link to this record



Author	Diego Alejandro Cheda; Daniel Ponsa; Antonio Lopez
Title	Camera Egomotion Estimation in the ADAS Context			Type	Conference Article
Year	2010	Publication	13th International IEEE Annual Conference on Intelligent Transportation Systems	Abbreviated Journal
Volume		Issue		Pages	1415–1420
Keywords
Abstract	Camera-based Advanced Driver Assistance Systems (ADAS) have concentrated many research efforts in the last decades. Proposals based on monocular cameras require the knowledge of the camera pose with respect to the environment, in order to reach an efficient and robust performance. A common assumption in such systems is considering the road as planar, and the camera pose with respect to it as approximately known. However, in real situations, the camera pose varies along time due to the vehicle movement, the road slope, and irregularities on the road surface. Thus, the changes in the camera position and orientation (i.e., the egomotion) are critical information that must be estimated at every frame to avoid poor performances. This work focuses on egomotion estimation from a monocular camera under the ADAS context. We review and compare egomotion methods with simulated and real ADAS-like sequences. Basing on the results of our experiments, we show which of the considered nonlinear and linear algorithms have the best performance in this domain.
Address	Madeira Island (Portugal)
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	2153-0009	ISBN	978-1-4244-7657-2	Medium
Area		Expedition		Conference	ITSC
Notes	ADAS			Approved	no
Call Number	ADAS @ adas @ CPL2010			Serial	1425
Permanent link to this record



Author	Razieh Rastgoo; Kourosh Kiani; Sergio Escalera
Title	A deep co-attentive hand-based video question answering framework using multi-view skeleton			Type	Journal Article
Year	2023	Publication	Multimedia Tools and Applications	Abbreviated Journal	MTAP
Volume	82	Issue		Pages	1401–1429
Keywords
Abstract	In this paper, we present a novel hand –based Video Question Answering framework, entitled Multi-View Video Question Answering (MV-VQA), employing the Single Shot Detector (SSD), Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), Bidirectional Encoder Representations from Transformers (BERT), and Co-Attention mechanism with RGB videos as the inputs. Our model includes three main blocks: vision, language, and attention. In the vision block, we employ a novel representation to obtain some efficient multiview features from the hand object using the combination of five 3DCNNs and one LSTM network. To obtain the question embedding, we use the BERT model in language block. Finally, we employ a co-attention mechanism on vision and language features to recognize the final answer. For the first time, we propose such a hand-based Video-QA framework including the multi-view hand skeleton features combined with the question embedding and co-attention mechanism. Our framework is capable of processing the arbitrary numbers of questions in the dataset annotations. There are different application domains for this framework. Here, as an application domain, we applied our framework to dynamic hand gesture recognition for the first time. Since the main object in dynamic hand gesture recognition is the human hand, we performed a step-by-step analysis of the hand detection and multi-view hand skeleton impact on the model performance. Evaluation results on five datasets, including two datasets in VideoQA, two datasets in dynamic hand gesture, and one dataset in hand action recognition show that MV-VQA outperforms state-of-the-art alternatives.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	HUPBA			Approved	no
Call Number	Admin @ si @ RKE2023b			Serial	3881
Permanent link to this record



Author	Saad Minhas; Aura Hernandez-Sabate; Shoaib Ehsan; Klaus McDonald Maier
Title	Effects of Non-Driving Related Tasks during Self-Driving mode			Type	Journal Article
Year	2022	Publication	IEEE Transactions on Intelligent Transportation Systems	Abbreviated Journal	TITS
Volume	23	Issue	2	Pages	1391-1399
Keywords
Abstract	Perception reaction time and mental workload have proven to be crucial in manual driving. Moreover, in highly automated cars, where most of the research is focusing on Level 4 Autonomous driving, take-over performance is also a key factor when taking road safety into account. This study aims to investigate how the immersion in non-driving related tasks affects the take-over performance of drivers in given scenarios. The paper also highlights the use of virtual simulators to gather efficient data that can be crucial in easing the transition between manual and autonomous driving scenarios. The use of Computer Aided Simulations is of absolute importance in this day and age since the automotive industry is rapidly moving towards Autonomous technology. An experiment comprising of 40 subjects was performed to examine the reaction times of driver and the influence of other variables in the success of take-over performance in highly automated driving under different circumstances within a highway virtual environment. The results reflect the relationship between reaction times under different scenarios that the drivers might face under the circumstances stated above as well as the importance of variables such as velocity in the success on regaining car control after automated driving. The implications of the results acquired are important for understanding the criteria needed for designing Human Machine Interfaces specifically aimed towards automated driving conditions. Understanding the need to keep drivers in the loop during automation, whilst allowing drivers to safely engage in other non-driving related tasks is an important research area which can be aided by the proposed study.
Address	Feb. 2022
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	IAM; 600.139; 600.145			Approved	no
Call Number	Admin @ si @ MHE2022			Serial	3468
Permanent link to this record



Author	Ali Furkan Biten; Andres Mafla; Lluis Gomez; Dimosthenis Karatzas
Title	Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching			Type	Conference Article
Year	2022	Publication	Winter Conference on Applications of Computer Vision	Abbreviated Journal
Volume		Issue		Pages	1391-1400
Keywords	Measurement; Training; Integrated circuits; Annotations; Semantics; Training data; Semisupervised learning
Abstract	The task of image-text matching aims to map representations from different modalities into a common joint visual-textual embedding. However, the most widely used datasets for this task, MSCOCO and Flickr30K, are actually image captioning datasets that offer a very limited set of relationships between images and sentences in their ground-truth annotations. This limited ground truth information forces us to use evaluation metrics based on binary relevance: given a sentence query we consider only one image as relevant. However, many other relevant images or captions may be present in the dataset. In this work, we propose two metrics that evaluate the degree of semantic relevance of retrieved items, independently of their annotated binary relevance. Additionally, we incorporate a novel strategy that uses an image captioning metric, CIDEr, to define a Semantic Adaptive Margin (SAM) to be optimized in a standard triplet loss. By incorporating our formulation to existing models, a large improvement is obtained in scenarios where available training data is limited. We also demonstrate that the performance on the annotated image-caption pairs is maintained while improving on other non-annotated relevant items when employing the full training set. The code for our new metric can be found at github. com/furkanbiten/ncsmetric and the model implementation at github. com/andrespmd/semanticadaptive_margin.
Address	Virtual; Waikoloa; Hawai; USA; January 2022
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	WACV
Notes	DAG; 600.155; 302.105;			Approved	no
Call Number	Admin @ si @ BMG2022			Serial	3663
Permanent link to this record



Author	Alicia Fornes; Veronica Romero; Arnau Baro; Juan Ignacio Toledo; Joan Andreu Sanchez; Enrique Vidal; Josep Llados
Title	ICDAR2017 Competition on Information Extraction in Historical Handwritten Records			Type	Conference Article
Year	2017	Publication	14th International Conference on Document Analysis and Recognition	Abbreviated Journal
Volume		Issue		Pages	1389-1394
Keywords
Abstract	The extraction of relevant information from historical handwritten document collections is one of the key steps in order to make these manuscripts available for access and searches. In this competition, the goal is to detect the named entities and assign each of them a semantic category, and therefore, to simulate the filling in of a knowledge database. This paper describes the dataset, the tasks, the evaluation metrics, the participants methods and the results.
Address	Kyoto; Japan; November 2017
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICDAR
Notes	DAG; 600.097; 601.225; 600.121			Approved	no
Call Number	Admin @ si @ FRB2017			Serial	3052
Permanent link to this record



Author	Fahad Shahbaz Khan; Shida Beigpour; Joost Van de Weijer; Michael Felsberg
Title	Painting-91: A Large Scale Database for Computational Painting Categorization			Type	Journal Article
Year	2014	Publication	Machine Vision and Applications	Abbreviated Journal	MVAP
Volume	25	Issue	6	Pages	1385-1397
Keywords
Abstract	Computer analysis of visual art, especially paintings, is an interesting cross-disciplinary research domain. Most of the research in the analysis of paintings involve medium to small range datasets with own specific settings. Interestingly, significant progress has been made in the field of object and scene recognition lately. A key factor in this success is the introduction and availability of benchmark datasets for evaluation. Surprisingly, such a benchmark setup is still missing in the area of computational painting categorization. In this work, we propose a novel large scale dataset of digital paintings. The dataset consists of paintings from 91 different painters. We further show three applications of our dataset namely: artist categorization, style classification and saliency detection. We investigate how local and global features popular in image classification perform for the tasks of artist and style categorization. For both categorization tasks, our experimental results suggest that combining multiple features significantly improves the final performance. We show that state-of-the-art computer vision methods can correctly classify 50 % of unseen paintings to its painter in a large dataset and correctly attribute its artistic style in over 60 % of the cases. Additionally, we explore the task of saliency detection on paintings and show experimental findings using state-of-the-art saliency estimation algorithms.
Address
Corporate Author				Thesis
Publisher	Springer Berlin Heidelberg	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	0932-8092	ISBN		Medium
Area		Expedition		Conference
Notes	CIC; LAMP; 600.074; 600.079			Approved	no
Call Number	Admin @ si @ KBW2014			Serial	2510
Permanent link to this record



Author	Fadi Dornaika; Angel Sappa
Title	A Featureless and Stochastic Approach to On-board Stereo Vision System Pose			Type	Journal Article
Year	2009	Publication	Image and Vision Computing	Abbreviated Journal	IMAVIS
Volume	27	Issue	9	Pages	1382–1393
Keywords	On-board stereo vision system; Pose estimation; Featureless approach; Particle filtering; Image warping
Abstract	This paper presents a direct and stochastic technique for real-time estimation of on-board stereo head’s position and orientation. Unlike existing works which rely on feature extraction either in the image domain or in 3D space, our proposed approach directly estimates the unknown parameters from the stream of stereo pairs’ brightness. The pose parameters are tracked using the particle filtering framework which implicitly enforces the smoothness constraints on the estimated parameters. The proposed technique can be used with a driver assistance applications as well as with augmented reality applications. Extended experiments on urban environments with different road geometries are presented. Comparisons with a 3D data-based approach are presented. Moreover, we provide a performance study aiming at evaluating the accuracy of the proposed approach.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ADAS			Approved	no
Call Number	ADAS @ adas @ DoS2009b			Serial	1152
Permanent link to this record



Author	Ali Furkan Biten; Lluis Gomez; Dimosthenis Karatzas
Title	Let there be a clock on the beach: Reducing Object Hallucination in Image Captioning			Type	Conference Article
Year	2022	Publication	Winter Conference on Applications of Computer Vision	Abbreviated Journal
Volume		Issue		Pages	1381-1390
Keywords	Measurement; Training; Visualization; Analytical models; Computer vision; Computational modeling; Training data
Abstract	Explaining an image with missing or non-existent objects is known as object bias (hallucination) in image captioning. This behaviour is quite common in the state-of-the-art captioning models which is not desirable by humans. To decrease the object hallucination in captioning, we propose three simple yet efficient training augmentation method for sentences which requires no new training data or increase in the model size. By extensive analysis, we show that the proposed methods can significantly diminish our models’ object bias on hallucination metrics. Moreover, we experimentally demonstrate that our methods decrease the dependency on the visual features. All of our code, configuration files and model weights are available online.
Address	Virtual; Waikoloa; Hawai; USA; January 2022
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	WACV
Notes	DAG; 600.155; 302.105			Approved	no
Call Number	Admin @ si @ BGK2022			Serial	3662
Permanent link to this record



Author	Ferran Diego; Joan Serrat; Antonio Lopez
Title	Joint spatio-temporal alignment of sequences			Type	Journal Article
Year	2013	Publication	IEEE Transactions on Multimedia	Abbreviated Journal	TMM
Volume	15	Issue	6	Pages	1377-1387
Keywords	video alignment
Abstract	Video alignment is important in different areas of computer vision such as wide baseline matching, action recognition, change detection, video copy detection and frame dropping prevention. Current video alignment methods usually deal with a relatively simple case of fixed or rigidly attached cameras or simultaneous acquisition. Therefore, in this paper we propose a joint video alignment for bringing two video sequences into a spatio-temporal alignment. Specifically, the novelty of the paper is to formulate the video alignment to fold the spatial and temporal alignment into a single alignment framework. This simultaneously satisfies a frame-correspondence and frame-alignment similarity; exploiting the knowledge among neighbor frames by a standard pairwise Markov random field (MRF). This new formulation is able to handle the alignment of sequences recorded at different times by independent moving cameras that follows a similar trajectory, and also generalizes the particular cases that of fixed geometric transformation and/or linear temporal mapping. We conduct experiments on different scenarios such as sequences recorded simultaneously or by moving cameras to validate the robustness of the proposed approach. The proposed method provides the highest video alignment accuracy compared to the state-of-the-art methods on sequences recorded from vehicles driving along the same track at different times.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	1520-9210	ISBN		Medium
Area		Expedition		Conference
Notes	ADAS			Approved	no
Call Number	Admin @ si @ DSL2013; ADAS @ adas @			Serial	2228
Permanent link to this record



Author	Arash Akbarinia; C. Alejandro Parraga
Title	Feedback and Surround Modulated Boundary Detection			Type	Journal Article
Year	2018	Publication	International Journal of Computer Vision	Abbreviated Journal	IJCV
Volume	126	Issue	12	Pages	1367–1380
Keywords	Boundary detection; Surround modulation; Biologically-inspired vision
Abstract	Edges are key components of any visual scene to the extent that we can recognise objects merely by their silhouettes. The human visual system captures edge information through neurons in the visual cortex that are sensitive to both intensity discontinuities and particular orientations. The “classical approach” assumes that these cells are only responsive to the stimulus present within their receptive fields, however, recent studies demonstrate that surrounding regions and inter-areal feedback connections influence their responses significantly. In this work we propose a biologically-inspired edge detection model in which orientation selective neurons are represented through the first derivative of a Gaussian function resembling double-opponent cells in the primary visual cortex (V1). In our model we account for four kinds of receptive field surround, i.e. full, far, iso- and orthogonal-orientation, whose contributions are contrast-dependant. The output signal from V1 is pooled in its perpendicular direction by larger V2 neurons employing a contrast-variant centre-surround kernel. We further introduce a feedback connection from higher-level visual areas to the lower ones. The results of our model on three benchmark datasets show a big improvement compared to the current non-learning and biologically-inspired state-of-the-art algorithms while being competitive to the learning-based methods.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	NEUROBIT; 600.068; 600.072			Approved	no
Call Number	Admin @ si @ AkP2018b			Serial	2991
Permanent link to this record



Author	F. Javier Sanchez; Jorge Bernal
Title	Use of Software Tools for Real-time Monitoring of Learning Processes: Application to Compilers subject			Type	Conference Article
Year	2018	Publication	4th International Conference of Higher Education Advances	Abbreviated Journal
Volume		Issue		Pages	1359-1366
Keywords	Monitoring; Evaluation tool; Gamification; Student motivation
Abstract	The effective implementation of the Higher European Education Area has meant a change regarding the focus of the learning process, being now the student at its very center. This shift of focus requires a strong involvement and fluent communication between teachers and students to succeed. Considering the difficulties associated to motivate students to take a more active role in the learning process, we explore how the use of a software tool can help both actors to improve the learning experience. We present a tool that can help students to obtain instantaneous feedback with respect to their progress in the subject as well as providing teachers with useful information about the evolution of knowledge acquisition with respect to each of the subject areas. We compare the performance achieved by students in two academic years: results show an improvement in overall performance which, after observing graphs provided by our tool, can be associated to an increase in students interest in the subject.
Address	Valencia; June 2018
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	HEAD
Notes	MV; no proj			Approved	no
Call Number	Admin @ si @ SaB2018			Serial	3165
Permanent link to this record



Author	Bhaskar Chakraborty; Jordi Gonzalez; Xavier Roca
Title	Large scale continuous visual event recognition using max-margin Hough transformation framework			Type	Journal Article
Year	2013	Publication	Computer Vision and Image Understanding	Abbreviated Journal	CVIU
Volume	117	Issue	10	Pages	1356–1368
Keywords
Abstract	In this paper we propose a novel method for continuous visual event recognition (CVER) on a large scale video dataset using max-margin Hough transformation framework. Due to high scalability, diverse real environmental state and wide scene variability direct application of action recognition/detection methods such as spatio-temporal interest point (STIP)-local feature based technique, on the whole dataset is practically infeasible. To address this problem, we apply a motion region extraction technique which is based on motion segmentation and region clustering to identify possible candidate “event of interest” as a preprocessing step. On these candidate regions a STIP detector is applied and local motion features are computed. For activity representation we use generalized Hough transform framework where each feature point casts a weighted vote for possible activity class centre. A max-margin frame work is applied to learn the feature codebook weight. For activity detection, peaks in the Hough voting space are taken into account and initial event hypothesis is generated using the spatio-temporal information of the participating STIPs. For event recognition a verification Support Vector Machine is used. An extensive evaluation on benchmark large scale video surveillance dataset (VIRAT) and as well on a small scale benchmark dataset (MSR) shows that the proposed method is applicable on a wide range of continuous visual event recognition applications having extremely challenging conditions.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	1077-3142	ISBN		Medium
Area		Expedition		Conference
Notes	ISE			Approved	no
Call Number	Admin @ si @ CGR2013			Serial	2413
Permanent link to this record