Author Katerine Diaz; Jesus Martinez del Rincon; Aura Hernandez-Sabate; Debora Gil
Title (up) Continuous head pose estimation using manifold subspace embedding and multivariate regression Type Journal Article
Year 2018 Publication IEEE Access Abbreviated Journal ACCESS
Volume 6 Issue Pages 18325 - 18334
Keywords Head Pose estimation; HOG features; Generalized Discriminative Common Vectors; B-splines; Multiple linear regression
Abstract In this paper, a continuous head pose estimation system is proposed to estimate yaw and pitch head angles from raw facial images. Our approach is based on manifold learning-based methods, due to their promising generalization properties shown for face modelling from images. The method combines histograms of oriented gradients, generalized discriminative common vectors and continuous local regression to achieve successful performance. Our proposal was tested on multiple standard face datasets, as well as in a realistic scenario. Results show a considerable performance improvement and a higher consistency of our model in comparison with other state-of-the-art methods, with angular errors varying between 9 and 17 degrees.
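The pipeline described in the abstract ends in a multivariate regression from an embedded feature space to the two angles. A minimal sketch of that final step, with random synthetic features standing in for the HOG descriptors and a random projection standing in for the learned GDCV embedding (all names and dimensions here are hypothetical, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)

n_samples, n_feats, n_dims = 200, 64, 8
X = rng.normal(size=(n_samples, n_feats))   # stand-in for HOG features
W_true = rng.normal(size=(n_feats, 2))
Y = X @ W_true                              # ground-truth (yaw, pitch) angles

# Stand-in for the learned subspace embedding (GDCV in the paper).
P = rng.normal(size=(n_feats, n_dims))
Z = X @ P                                   # embedded features

# Multivariate linear regression: solve Z @ B ~ Y in the least-squares sense.
B, *_ = np.linalg.lstsq(Z, Y, rcond=None)
Y_hat = Z @ B                               # predicted yaw/pitch per sample
```

The paper's continuous local regression is richer than this single global fit (it uses B-splines and local models), but the input/output shape of the problem is the same: one feature vector in, one (yaw, pitch) pair out.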
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 2169-3536 ISBN Medium
Area Expedition Conference
Notes ADAS; 600.118 Approved no
Call Number Admin @ si @ DMH2018b Serial 3091
 

 
Author Lei Kang; Juan Ignacio Toledo; Pau Riba; Mauricio Villegas; Alicia Fornes; Marçal Rusiñol
Title (up) Convolve, Attend and Spell: An Attention-based Sequence-to-Sequence Model for Handwritten Word Recognition Type Conference Article
Year 2018 Publication 40th German Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages 459-472
Keywords
Abstract This paper proposes Convolve, Attend and Spell, an attention-based sequence-to-sequence model for handwritten word recognition. The proposed architecture has three main parts: an encoder, consisting of a CNN and a bi-directional GRU; an attention mechanism devoted to focusing on the pertinent features; and a decoder formed by a one-directional GRU, able to spell the corresponding word, character by character. Compared with the recent state-of-the-art, our model achieves competitive results on the IAM dataset without needing any pre-processing step, predefined lexicon or language model. Code and additional results are available at https://github.com/omni-us/research-seq2seq-HTR.
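The attention mechanism named in the abstract scores each encoder time step against the current decoder state and forms a context vector from the weighted features. A minimal dot-product-attention sketch (the paper's exact scoring function may differ; the shapes and names below are illustrative assumptions):

```python
import numpy as np

def attention(dec_state, enc_feats):
    """Dot-product attention: score each encoder column against the
    decoder state, softmax the scores, return the context vector."""
    scores = enc_feats @ dec_state            # one score per time step, (T,)
    scores -= scores.max()                    # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    context = weights @ enc_feats             # weighted sum of features, (D,)
    return context, weights

rng = np.random.default_rng(1)
enc = rng.normal(size=(12, 32))   # 12 time steps of CNN + bi-GRU features
state = rng.normal(size=32)       # current decoder (one-directional GRU) state
ctx, w = attention(state, enc)
```

At each decoding step the context vector is combined with the decoder state to emit one character, which is how the model "spells" the word without a lexicon.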
Address Stuttgart; Germany; October 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference GCPR
Notes DAG; 600.097; 603.057; 302.065; 601.302; 600.084; 600.121; 600.129 Approved no
Call Number Admin @ si @ KTR2018 Serial 3167
 

 
Author Mohamed Ilyes Lakhal; Hakan Cevikalp; Sergio Escalera
Title (up) CRN: End-to-end Convolutional Recurrent Network Structure Applied to Vehicle Classification Type Conference Article
Year 2018 Publication 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications Abbreviated Journal
Volume 5 Issue Pages 137-144
Keywords Vehicle Classification; Deep Learning; End-to-end Learning
Abstract Vehicle type classification is considered to be a central part of Intelligent Traffic Systems. In recent years, deep learning methods have emerged as the state-of-the-art in many computer vision tasks. In this paper, we present a novel yet simple deep learning framework for the vehicle type classification problem. We propose an end-to-end trainable system that combines a convolutional neural network for feature extraction with a recurrent neural network as a classifier. The recurrent network structure is used to handle various types of feature inputs, and at the same time allows producing either a single class prediction or a set of them. In order to assess the effectiveness of our solution, we conducted a set of experiments on two public datasets, obtaining state-of-the-art results. In addition, we also report results on the newly released MIO-TCD dataset.
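The CNN-features-into-RNN-classifier idea can be sketched with a plain tanh RNN that consumes a variable-length sequence of feature vectors and classifies from its final hidden state (the paper uses a trained deep network; everything below is a hypothetical minimal stand-in):

```python
import numpy as np

def rnn_classify(feats, Wx, Wh, Wo):
    """Run a simple tanh RNN over a sequence of feature vectors and
    classify from the final hidden state (a single prediction)."""
    h = np.zeros(Wh.shape[0])
    for x in feats:                          # handles any sequence length
        h = np.tanh(Wx @ x + Wh @ h)
    logits = Wo @ h
    e = np.exp(logits - logits.max())
    return e / e.sum()                       # class probabilities

rng = np.random.default_rng(2)
D, H, C = 16, 8, 5                           # feature, hidden, class sizes
Wx, Wh, Wo = (rng.normal(size=s) * 0.1 for s in [(H, D), (H, H), (C, H)])
probs = rnn_classify(rng.normal(size=(7, D)), Wx, Wh, Wo)
```

Because the loop runs for however many feature vectors arrive, the same classifier head accommodates "various types of feature inputs", as the abstract puts it.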
Address Funchal; Madeira; Portugal; January 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference VISAPP
Notes HUPBA Approved no
Call Number Admin @ si @ LCE2018a Serial 3094
 

 
Author Bojana Gajic; Ramon Baldrich
Title (up) Cross-domain fashion image retrieval Type Conference Article
Year 2018 Publication CVPR 2018 Workshop on Women in Computer Vision (WiCV 2018, 4th Edition) Abbreviated Journal
Volume Issue Pages 19500-19502
Keywords
Abstract Cross-domain image retrieval is a challenging task that implies matching images from one domain to their pairs from another domain. In this paper we focus on fashion image retrieval, which involves matching an image of a fashion item taken by users to images of the same item taken under controlled conditions, usually by a professional photographer. When facing this problem, we have different products at training and test time, and we use a triplet loss to train the network. We stress the importance of properly training a simple architecture, as well as adapting general models to the specific task.
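The triplet loss mentioned above has a standard form: pull the user photo (anchor) towards the catalogue photo of the same item (positive) and push it away from a different item (negative) by at least a margin. A minimal sketch (the margin value and 2-D embeddings are illustrative only):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge on squared-distance difference: loss is zero once the
    negative is at least `margin` farther than the positive."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)

a = np.array([1.0, 0.0])    # embedding of the user photo
p = np.array([0.9, 0.1])    # same item, studio photo
n = np.array([-1.0, 0.5])   # a different item
loss = triplet_loss(a, p, n)
```

Because the loss only compares relative distances, the embedding generalizes to products never seen during training, which matters here since the train and test products differ.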
Address Salt Lake City, USA; 22 June 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVPRW
Notes CIC; 600.087 Approved no
Call Number Admin @ si @ Serial 3709
 

 
Author Hugo Prol; Vincent Dumoulin; Luis Herranz
Title (up) Cross-Modulation Networks for Few-Shot Learning Type Miscellaneous
Year 2018 Publication Arxiv Abbreviated Journal
Volume Issue Pages
Keywords
Abstract A family of recent successful approaches to few-shot learning relies on learning an embedding space in which predictions are made by computing similarities between examples. This corresponds to combining information between support and query examples at a very late stage of the prediction pipeline. Inspired by this observation, we hypothesize that there may be benefits to combining the information at various levels of abstraction along the pipeline. We present an architecture called Cross-Modulation Networks which allows support and query examples to interact throughout the feature extraction process via a feature-wise modulation mechanism. We adapt the Matching Networks architecture to take advantage of these interactions and show encouraging initial results on miniImageNet in the 5-way, 1-shot setting, where we close the gap with the state-of-the-art.
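Feature-wise modulation generally means predicting a per-feature scale and shift from one input and applying them to another, FiLM-style. A minimal sketch of support-conditioned modulation of query features (the projection matrices and pooling here are hypothetical simplifications, not the paper's architecture):

```python
import numpy as np

def cross_modulate(query_feats, support_feats, W_gamma, W_beta):
    """Predict scale/shift from pooled support features and apply them
    feature-wise to the query, letting the two interact mid-pipeline."""
    pooled = support_feats.mean(axis=0)      # summarize the support set
    gamma = 1.0 + W_gamma @ pooled           # multiplicative modulation
    beta = W_beta @ pooled                   # additive modulation
    return query_feats * gamma + beta

rng = np.random.default_rng(3)
D = 16
support = rng.normal(size=(5, D))            # e.g. 5-way support embeddings
query = rng.normal(size=D)
out = cross_modulate(query, support, rng.normal(size=(D, D)) * 0.01,
                     rng.normal(size=(D, D)) * 0.01)
```

The `1.0 +` offset makes the untrained modulation start near identity, a common choice so that early training behaves like the unmodulated baseline.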
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP; 600.120 Approved no
Call Number Admin @ si @ PDH2018 Serial 3248
 

 
Author Patricia Suarez; Angel Sappa; Boris X. Vintimilla
Title (up) Cross-spectral image dehaze through a dense stacked conditional GAN based approach Type Conference Article
Year 2018 Publication 14th IEEE International Conference on Signal Image Technology & Internet Based System Abbreviated Journal
Volume Issue Pages
Keywords Infrared imaging; Dense; Stacked CGAN; Crossspectral; Convolutional networks
Abstract This paper proposes a novel approach to remove haze from RGB images using near-infrared images, based on a dense stacked conditional Generative Adversarial Network (CGAN). Besides the hazy image, the implemented deep network receives its corresponding image in the near-infrared spectrum, which serves to accelerate the learning of the details and characteristics of the images. The model uses a triplet layer that allows the independent learning of each channel of the visible spectrum image, removing the haze on each color channel separately. A multiple loss function scheme is proposed, which ensures balanced learning between the colors and the structure of the images. Experimental results have shown that the proposed method effectively removes the haze from the images. Additionally, the proposed approach is compared with a state-of-the-art approach, showing better results.
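A "multiple loss function scheme balancing colors and structure" can be illustrated with a weighted sum of a per-channel L1 color term and a gradient-based structure term. This is a hypothetical reconstruction of the idea, not the paper's actual loss (the paper's scheme also involves adversarial terms):

```python
import numpy as np

def dehaze_loss(pred, target, w_color=1.0, w_struct=0.5):
    """Color term: L1 per RGB channel, summed so each channel is
    penalized separately. Structure term: L1 on image gradients."""
    color = sum(np.abs(pred[c] - target[c]).mean() for c in range(3))
    gx = np.diff(pred, axis=2) - np.diff(target, axis=2)   # horizontal grads
    gy = np.diff(pred, axis=1) - np.diff(target, axis=1)   # vertical grads
    struct = np.abs(gx).mean() + np.abs(gy).mean()
    return w_color * color + w_struct * struct

rng = np.random.default_rng(4)
target = rng.uniform(size=(3, 32, 32))     # clear image, (C, H, W)
pred = target + 0.05 * rng.normal(size=target.shape)  # imperfect dehazing
loss = dehaze_loss(pred, target)
```

Keeping the channel terms separate matches the triplet-layer idea of learning each visible channel independently; the structure term keeps the per-channel optimizations from degrading shared edges.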
Address Las Palmas de Gran Canaria; November 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-1-5386-9385-8 Medium
Area Expedition Conference SITIS
Notes MSIAU; 600.086; 600.130; 600.122 Approved no
Call Number Admin @ si @ SSV2018a Serial 3193
 

 
Author Md. Mostafa Kamal Sarker; Mohammed Jabreel; Hatem A. Rashwan; Syeda Furruka Banu; Petia Radeva; Domenec Puig
Title (up) CuisineNet: Food Attributes Classification using Multi-scale Convolution Network Type Conference Article
Year 2018 Publication 21st International Conference of the Catalan Association for Artificial Intelligence Abbreviated Journal
Volume Issue Pages 365-372
Keywords
Abstract The diversity of food and its attributes represents the culinary habits of people from different countries. Thus, this paper addresses the problem of identifying the food culture of people around the world and its flavor by classifying two main food attributes, cuisine and flavor. A deep learning model based on multi-scale convolutional networks is proposed for extracting more accurate features from input images. The aggregation of multi-scale convolution layers with different kernel sizes is also used for weighting the feature results from different scales. In addition, a joint loss function based on Negative Log Likelihood (NLL) is used to fit the model probability to multi-labeled classes for the multi-modal classification task. Furthermore, this work provides a new dataset for food attributes, called Yummly48K, extracted from the popular food website Yummly. Our model is assessed on the constructed Yummly48K dataset. The experimental results show that our proposed method yields 65% and 62% average F1 score on the validation and test sets, outperforming the state-of-the-art models.
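A joint NLL loss over two attribute heads is simply the sum of each head's negative log-probability for its true label. A minimal sketch (the class counts and probabilities below are made up for illustration):

```python
import numpy as np

def joint_nll(cuisine_probs, flavor_probs, cuisine_label, flavor_label):
    """Joint loss: sum of Negative Log Likelihoods of the two attribute
    heads (cuisine and flavor), trained together."""
    return (-np.log(cuisine_probs[cuisine_label])
            - np.log(flavor_probs[flavor_label]))

cuisine = np.array([0.7, 0.2, 0.1])   # softmax output of the cuisine head
flavor = np.array([0.1, 0.8, 0.1])    # softmax output of the flavor head
loss = joint_nll(cuisine, flavor, 0, 1)
```

Summing the two terms lets one shared backbone receive gradients from both attributes at once, which is what makes the task multi-labeled rather than two independent classifiers.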
Address Roses; Catalonia; October 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CCIA
Notes MILAB; no menciona Approved no
Call Number Admin @ si @ SJR2018 Serial 3113
 

 
Author Md. Mostafa Kamal Sarker; Mohammed Jabreel; Hatem A. Rashwan; Syeda Furruka Banu; Antonio Moreno; Petia Radeva; Domenec Puig
Title (up) CuisineNet: Food Attributes Classification using Multi-scale Convolution Network. Type Miscellaneous
Year 2018 Publication Arxiv Abbreviated Journal
Volume Issue Pages
Keywords
Abstract The diversity of food and its attributes represents the culinary habits of people from different countries. Thus, this paper addresses the problem of identifying the food culture of people around the world and its flavor by classifying two main food attributes, cuisine and flavor. A deep learning model based on multi-scale convolutional networks is proposed for extracting more accurate features from input images. The aggregation of multi-scale convolution layers with different kernel sizes is also used for weighting the feature results from different scales. In addition, a joint loss function based on Negative Log Likelihood (NLL) is used to fit the model probability to multi-labeled classes for the multi-modal classification task. Furthermore, this work provides a new dataset for food attributes, called Yummly48K, extracted from the popular food website Yummly. Our model is assessed on the constructed Yummly48K dataset. The experimental results show that our proposed method yields 65% and 62% average F1 score on the validation and test sets, outperforming the state-of-the-art models.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MILAB; no proj Approved no
Call Number Admin @ si @ KJR2018 Serial 3235
 

 
Author Lluis Gomez; Marçal Rusiñol; Dimosthenis Karatzas
Title (up) Cutting Sayre's Knot: Reading Scene Text without Segmentation. Application to Utility Meters Type Conference Article
Year 2018 Publication 13th IAPR International Workshop on Document Analysis Systems Abbreviated Journal
Volume Issue Pages 97-102
Keywords Robust Reading; End-to-end Systems; CNN; Utility Meters
Abstract In this paper we present a segmentation-free system for reading text in natural scenes. A CNN architecture is trained in an end-to-end manner, and is able to directly output readings without any explicit text localization step. In order to validate our proposal, we focus on the specific case of reading utility meters. We present our results on a large dataset of images acquired by different users and devices, so text appears in any location, with different sizes, fonts and lengths, and the images present several distortions such as dirt, illumination highlights or blur.
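One common way to read a fixed-maximum-length string without segmentation is to give the CNN one softmax head per character position, with an extra "blank" class for shorter readings. This is a hypothetical readout sketch, not necessarily the paper's exact decoding:

```python
import numpy as np

def read_meter(digit_logits):
    """Argmax each position's class scores (0-9 plus a blank class at
    index 10) and drop the blanks -- no localization step involved."""
    classes = digit_logits.argmax(axis=1)
    return "".join(str(c) for c in classes if c != 10)

# Five positions, 11 classes each; the last position predicts blank.
logits = np.full((5, 11), -1.0)
for pos, cls in enumerate([0, 4, 2, 7, 10]):
    logits[pos, cls] = 5.0
reading = read_meter(logits)   # -> "0427"
```

Training such heads end-to-end on whole images is what lets the system cope with text appearing anywhere in the frame.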
Address Vienna; Austria; April 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference DAS
Notes DAG; 600.084; 600.121; 600.129 Approved no
Call Number Admin @ si @ GRK2018 Serial 3102
 

 
Author Antonio Lopez; David Vazquez; Gabriel Villalonga
Title (up) Data for Training Models, Domain Adaptation Type Book Chapter
Year 2018 Publication Intelligent Vehicles. Enabling Technologies and Future Developments Abbreviated Journal
Volume Issue Pages 395–436
Keywords Driving simulator; hardware; software; interface; traffic simulation; macroscopic simulation; microscopic simulation; virtual data; training data
Abstract Simulation can enable several developments in the field of intelligent vehicles. This chapter is divided into three main subsections. The first one deals with driving simulators. The continuous improvement of hardware performance is a well-known fact that is allowing the development of more complex driving simulators. The immersion in the simulation scene is increased by high-fidelity feedback to the driver. In the second subsection, traffic simulation is explained, as well as how it can be used for intelligent transport systems. Finally, it is rather clear that sensor-based perception and action must be based on data-driven algorithms. Simulation can provide data to train and test algorithms that are afterwards implemented in vehicles. These tools are explained in the third subsection.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS; 600.118 Approved no
Call Number Admin @ si @ LVV2018 Serial 3047
 

 
Author Guillem Cucurull; Pau Rodriguez; Vacit Oguz Yazici; Josep M. Gonfaus; Xavier Roca; Jordi Gonzalez
Title (up) Deep Inference of Personality Traits by Integrating Image and Word Use in Social Networks Type Miscellaneous
Year 2018 Publication Arxiv Abbreviated Journal
Volume Issue Pages
Keywords
Abstract arXiv:1802.06757
Social media, as a major platform for communication and information exchange, is a rich repository of the opinions and sentiments of 2.3 billion users about a vast spectrum of topics. To sense the whys of certain social users' demands and cultural-driven interests, however, the knowledge embedded in the 1.8 billion pictures uploaded daily to public profiles has only just started to be exploited, since this process has typically been text-based. Following this trend in visual-based social analysis, we present a novel methodology based on Deep Learning to build a combined image-and-text based personality trait model, trained with images posted together with words found to be highly correlated with specific personality traits. The key contribution here is to explore whether OCEAN personality trait modeling can be addressed based on images, here called MindPics, appearing with certain tags with psychological insights. We found that there is a correlation between those posted images and their accompanying texts, which can be successfully modeled using deep neural networks for personality estimation. The experimental results are consistent with previous cyber-psychology results based on texts or images.
In addition, classification results on some traits show that some patterns emerge in the set of images corresponding to a specific text, in essence those representing an abstract concept. These results open new avenues of research for further refining the proposed personality model under the supervision of psychology experts.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ISE; 600.098; 600.119 Approved no
Call Number Admin @ si @ CRY2018 Serial 3550
 

 
Author Jorge Charco; Boris X. Vintimilla; Angel Sappa
Title (up) Deep learning based camera pose estimation in multi-view environment Type Conference Article
Year 2018 Publication 14th IEEE International Conference on Signal Image Technology & Internet Based System Abbreviated Journal
Volume Issue Pages
Keywords Deep learning; Camera pose estimation; Multiview environment; Siamese architecture
Abstract This paper proposes to use a deep learning network architecture for relative camera pose estimation in a multi-view environment. The proposed network is a variant of the AlexNet architecture, used as a regressor to predict the relative translation and rotation as output. The proposed approach is trained from scratch on a large dataset, taking as input a pair of images from the same scene. This new architecture is compared with a previous approach using standard metrics, obtaining better results on the relative camera pose.
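Regressing relative translation and rotation together typically uses a PoseNet-style combined loss: Euclidean error on the translation plus a weighted error on the normalized rotation quaternion. A sketch under that assumption (the weighting `beta` and the quaternion parameterization are common choices, not confirmed details of this paper):

```python
import numpy as np

def relative_pose_loss(t_pred, q_pred, t_true, q_true, beta=10.0):
    """Translation error plus beta-weighted rotation error; the
    predicted quaternion is normalized before comparison."""
    t_err = np.linalg.norm(t_pred - t_true)
    q_err = np.linalg.norm(q_true - q_pred / np.linalg.norm(q_pred))
    return t_err + beta * q_err

t_hat = np.array([0.1, 0.0, 1.0])            # predicted relative translation
q_hat = np.array([0.99, 0.01, 0.0, 0.0])     # predicted relative rotation
loss = relative_pose_loss(t_hat, q_hat,
                          np.array([0.0, 0.0, 1.0]),
                          np.array([1.0, 0.0, 0.0, 0.0]))
```

The `beta` factor compensates for the different numeric scales of meters and unit quaternions, so neither term dominates training.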
Address Las Palmas de Gran Canaria; November 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference SITIS
Notes MSIAU; 600.086; 600.130; 600.122 Approved no
Call Number Admin @ si @ CVS2018 Serial 3194
 

 
Author Patricia Suarez; Angel Sappa; Boris X. Vintimilla; Riad I. Hammoud
Title (up) Deep Learning based Single Image Dehazing Type Conference Article
Year 2018 Publication 31st IEEE Conference on Computer Vision and Pattern Recognition Workhsop Abbreviated Journal
Volume Issue Pages 1250 - 12507
Keywords Gallium nitride; Atmospheric modeling; Generators; Generative adversarial networks; Convergence; Image color analysis
Abstract This paper proposes a novel approach to remove haze degradations in RGB images using a stacked conditional Generative Adversarial Network (GAN). It employs a triplet of GANs to remove the haze on each color channel independently. A multiple loss function scheme, applied over a conditional probabilistic model, is proposed. The proposed GAN architecture learns to remove the haze using as conditioned input the hazy images from which the clear images will be obtained. Such a formulation ensures fast model training convergence and homogeneous model generalization. Experiments showed that the proposed method generates high-quality clear images.
Address Salt Lake City; USA; June 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVPRW
Notes MSIAU; 600.086; 600.130; 600.122 Approved no
Call Number Admin @ si @ SSV2018d Serial 3197
 

 
Author Laura Lopez-Fuentes; Alessandro Farasin; Harald Skinnemoen; Paolo Garza
Title (up) Deep Learning models for passability detection of flooded roads Type Conference Article
Year 2018 Publication MediaEval 2018 Multimedia Benchmark Workshop Abbreviated Journal
Volume 2283 Issue Pages
Keywords
Abstract In this paper we study and compare several approaches to detect floods, and evidence for the passability of roads by conventional means, on Twitter. We focus on tweets containing both visual information (a picture shared by the user) and metadata, a combination of text and related extra information intrinsic to the Twitter API. This work was done in the context of the MediaEval 2018 Multimedia Satellite Task.
Address Sophia Antipolis; France; October 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference MediaEval
Notes LAMP; 600.084; 600.109; 600.120 Approved no
Call Number Admin @ si @ LFS2018 Serial 3224
 

 
Author Mohammad A. Haque; Ruben B. Bautista; Kamal Nasrollahi; Sergio Escalera; Christian B. Laursen; Ramin Irani; Ole K. Andersen; Erika G. Spaich; Kaustubh Kulkarni; Thomas B. Moeslund; Marco Bellantonio; Golamreza Anbarjafari; Fatemeh Noroozi
Title (up) Deep Multimodal Pain Recognition: A Database and Comparison of Spatio-Temporal Visual Modalities, Faces and Gestures Type Conference Article
Year 2018 Publication 13th IEEE Conference on Automatic Face and Gesture Recognition Abbreviated Journal
Volume Issue Pages 250 - 257
Keywords
Abstract Pain is a symptom of many disorders associated with actual or potential tissue damage in the human body. Managing pain is not only a duty but also highly cost prone. The most primitive state of pain management is the assessment of pain. Traditionally it was accomplished by self-report or visual inspection by experts. However, automatic pain assessment systems from facial videos are also rapidly evolving due to the need to manage pain in a robust and cost-effective way. Among the different challenges of automatic pain assessment from facial video data, two issues are increasingly prevalent: first, exploiting both spatial and temporal information of the face to assess the pain level, and second, incorporating multiple visual modalities to capture complementary face information related to pain. Most works in the literature focus on merely exploiting spatial information on chromatic (RGB) video data in shallow learning scenarios. However, employing deep learning techniques for spatio-temporal analysis considering Depth (D) and Thermal (T) along with RGB has high potential in this area. In this paper, we present the first state-of-the-art, publicly available database, 'Multimodal Intensity Pain (MIntPAIN)', for RGBDT pain level recognition in sequences. We provide first baseline results, including recognition of 5 pain levels, by analyzing independent visual modalities and their fusion with CNN and LSTM models. From the experimental evaluation we observe that fusion of modalities helps to enhance the recognition performance of pain levels in comparison to isolated ones. In particular, the combination of RGB, D, and T in an early fusion fashion achieved the best recognition rate.
Address Xian; China; May 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference FG
Notes HUPBA; no proj Approved no
Call Number Admin @ si @ HBN2018 Serial 3117