Publicacions CVC -- Query Results

	106–120 of 155 records found matching your query:

	Records
	Author	Jorge Charco; Boris X. Vintimilla; Angel Sappa
	Title	Deep learning based camera pose estimation in multi-view environment			Type	Conference Article
	Year	2018	Publication	14th IEEE International Conference on Signal Image Technology & Internet Based System	Abbreviated Journal
	Volume		Issue		Pages
	Keywords	Deep learning; Camera pose estimation; Multiview environment; Siamese architecture
	Abstract	This paper proposes a deep learning network architecture for relative camera pose estimation in a multi-view environment. The proposed network is a variant of the AlexNet architecture used as a regressor to predict the relative translation and rotation. The proposed approach is trained from scratch on a large data set and takes as input a pair of images from the same scene. This new architecture is compared with a previous approach using standard metrics, obtaining better results on the relative camera pose.
	Address	Las Palmas de Gran Canaria; November 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	SITIS
	Notes	MSIAU; 600.086; 600.130; 600.122			Approved	no
	Call Number	Admin @ si @ CVS2018			Serial	3194



	Author	Patricia Suarez; Angel Sappa; Boris X. Vintimilla; Riad I. Hammoud
	Title	Near InfraRed Imagery Colorization			Type	Conference Article
	Year	2018	Publication	25th International Conference on Image Processing	Abbreviated Journal
	Volume		Issue		Pages	2237 - 2241
	Keywords	Convolutional Neural Networks (CNN); Generative Adversarial Network (GAN); Infrared Imagery colorization
	Abstract	This paper proposes a stacked conditional Generative Adversarial Network-based method for Near InfraRed (NIR) imagery colorization. We propose a variant architecture of Generative Adversarial Network (GAN) that uses multiple loss functions over a conditional probabilistic generative model. We show that this new architecture/loss-function yields better generalization and representation of the generated colored IR images. The proposed approach is evaluated on a large test dataset and compared to recent state of the art methods using standard metrics.
	Address	Athens; Greece; October 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICIP
	Notes	MSIAU; 600.086; 600.130; 600.122			Approved	no
	Call Number	Admin @ si @ SSV2018b			Serial	3195



	Author	Patricia Suarez; Angel Sappa; Boris X. Vintimilla
	Title	Vegetation Index Estimation from Monospectral Images			Type	Conference Article
	Year	2018	Publication	15th International Conference on Image Analysis and Recognition	Abbreviated Journal
	Volume	10882	Issue		Pages	353-362
	Keywords
	Abstract	This paper proposes a novel approach to estimate the Normalized Difference Vegetation Index (NDVI) from just the red channel of an RGB image. The NDVI index is defined as the ratio of the difference of the red and infrared radiances over their sum. In other words, information from the red channel of an RGB image and the corresponding infrared spectral band are required for its computation. In the current work the NDVI index is estimated just from the red channel by training a Conditional Generative Adversarial Network (CGAN). The architecture proposed for the generative network consists of a single level structure, which combines at the final layer results from convolutional operations together with the given red channel with Gaussian noise to enhance details, resulting in a sharp NDVI image. Then, the discriminative model estimates the probability that the generated NDVI index came from the training dataset, rather than the index automatically generated. Experimental results with a large set of real images are provided showing that a Conditional GAN single level model represents an acceptable approach to estimate the NDVI index.
	Address	Povoa de Varzim; Portugal; June 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICIAR
	Notes	MSIAU; 600.086; 600.130; 600.122			Approved	no
	Call Number	Admin @ si @ SSV2018c			Serial	3196



	Author	Patricia Suarez; Angel Sappa; Boris X. Vintimilla; Riad I. Hammoud
	Title	Deep Learning based Single Image Dehazing			Type	Conference Article
	Year	2018	Publication	31st IEEE Conference on Computer Vision and Pattern Recognition Workshop	Abbreviated Journal
	Volume		Issue		Pages	1250 - 12507
	Keywords	Gallium nitride; Atmospheric modeling; Generators; Generative adversarial networks; Convergence; Image color analysis
	Abstract	This paper proposes a novel approach to remove haze degradations in RGB images using a stacked conditional Generative Adversarial Network (GAN). It employs a triplet of GANs to remove the haze on each color channel independently. A scheme of multiple loss functions, applied over a conditional probabilistic model, is proposed. The proposed GAN architecture learns to remove the haze, taking as conditional input the hazy images from which the clear images will be obtained. Such a formulation ensures fast convergence of model training and homogeneous model generalization. Experiments showed that the proposed method generates high-quality clear images.
	Address	Salt Lake City; USA; June 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	CVPRW
	Notes	MSIAU; 600.086; 600.130; 600.122			Approved	no
	Call Number	Admin @ si @ SSV2018d			Serial	3197



	Author	Razieh Rastgoo; Kourosh Kiani; Sergio Escalera
	Title	Multi-Modal Deep Hand Sign Language Recognition in Still Images Using Restricted Boltzmann Machine			Type	Journal Article
	Year	2018	Publication	Entropy	Abbreviated Journal	ENTROPY
	Volume	20	Issue	11	Pages	809
	Keywords	hand sign language; deep learning; restricted Boltzmann machine (RBM); multi-modal; profoundly deaf; noisy image
	Abstract	In this paper, a deep learning approach, Restricted Boltzmann Machine (RBM), is used to perform automatic hand sign language recognition from visual data. We evaluate how RBM, as a deep generative model, is capable of generating the distribution of the input data for an enhanced recognition of unseen data. Two modalities, RGB and Depth, are considered in the model input in three forms: original image, cropped image, and noisy cropped image. Five crops of the input image are used and the hands in these cropped images are detected using a Convolutional Neural Network (CNN). After that, three types of the detected hand images are generated for each modality and input to RBMs. The outputs of the RBMs for the two modalities are fused in another RBM in order to recognize the output sign label of the input image. The proposed multi-modal model is trained on all and part of the American alphabet and digits of four publicly available datasets. We also evaluate the robustness of the proposal against noise. Experimental results show that the proposed multi-modal model, using crops and the RBM fusing methodology, achieves state-of-the-art results on the Massey University Gesture Dataset 2012, the American Sign Language (ASL) Fingerspelling Dataset from the University of Surrey’s Center for Vision, Speech and Signal Processing, NYU, and ASL Fingerspelling A datasets.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	HUPBA; no proj			Approved	no
	Call Number	Admin @ si @ RKE2018			Serial	3198



	Author	Meysam Madadi; Sergio Escalera; Alex Carruesco Llorens; Carlos Andujar; Xavier Baro; Jordi Gonzalez
	Title	Top-down model fitting for hand pose recovery in sequences of depth images			Type	Journal Article
	Year	2018	Publication	Image and Vision Computing	Abbreviated Journal	IMAVIS
	Volume	79	Issue		Pages	63-75
	Keywords
	Abstract	State-of-the-art approaches on hand pose estimation from depth images have reported promising results under quite controlled considerations. In this paper we propose a two-step pipeline for recovering the hand pose from a sequence of depth images. The pipeline has been designed to deal with images taken from any viewpoint and exhibiting a high degree of finger occlusion. In a first step we initialize the hand pose using a part-based model, fitting a set of hand components in the depth images. In a second step we consider temporal data and estimate the parameters of a trained bilinear model consisting of shape and trajectory bases. We evaluate our approach on a newly created synthetic hand dataset along with the NYU and MSRA real datasets. Results demonstrate that the proposed method outperforms the most recent pose recovery approaches, including those based on CNNs.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	HUPBA; 600.098			Approved	no
	Call Number	Admin @ si @ MEC2018			Serial	3203



	Author	Marc Oliu; Javier Selva; Sergio Escalera
	Title	Folded Recurrent Neural Networks for Future Video Prediction			Type	Conference Article
	Year	2018	Publication	15th European Conference on Computer Vision	Abbreviated Journal
	Volume	11218	Issue		Pages	745-761
	Keywords
	Abstract	Future video prediction is an ill-posed Computer Vision problem that recently received much attention. Its main challenges are the high variability in video content, the propagation of errors through time, and the non-specificity of the future frames: given a sequence of past frames there is a continuous distribution of possible futures. This work introduces bijective Gated Recurrent Units, a double mapping between the input and output of a GRU layer. This allows for recurrent auto-encoders with state sharing between encoder and decoder, stratifying the sequence representation and helping to prevent capacity problems. We show how with this topology only the encoder or decoder needs to be applied for input encoding and prediction, respectively. This reduces the computational cost and avoids re-encoding the predictions when generating a sequence of frames, mitigating the propagation of errors. Furthermore, it is possible to remove layers from an already trained model, giving an insight to the role performed by each layer and making the model more explainable. We evaluate our approach on three video datasets, outperforming state of the art prediction results on MMNIST and UCF101, and obtaining competitive results on KTH with 2 and 3 times less memory usage and computational cost than the best scored approach.
	Address	Munich; September 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ECCV
	Notes	HUPBA; no menciona			Approved	no
	Call Number	Admin @ si @ OSE2018			Serial	3204



	Author	Ciprian Corneanu; Meysam Madadi; Sergio Escalera
	Title	Deep Structure Inference Network for Facial Action Unit Recognition			Type	Conference Article
	Year	2018	Publication	15th European Conference on Computer Vision	Abbreviated Journal
	Volume	11216	Issue		Pages	309-324
	Keywords	Computer Vision; Machine Learning; Deep Learning; Facial Expression Analysis; Facial Action Units; Structure Inference
	Abstract	Facial expressions are combinations of basic components called Action Units (AU). Recognizing AUs is key for general facial expression analysis. Recently, efforts in automatic AU recognition have been dedicated to learning combinations of local features and to exploiting correlations between AUs. We propose a deep neural architecture that tackles both problems by combining learned local and global features in its initial stages and replicating a message passing algorithm between classes similar to a graphical model inference approach in later stages. We show that by training the model end-to-end with increased supervision we improve on the state of the art by 5.3% and 8.2% on the BP4D and DISFA datasets, respectively.
	Address	Munich; September 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ECCV
	Notes	HUPBA; no proj			Approved	no
	Call Number	Admin @ si @ CME2018			Serial	3205



	Author	Mohamed Ilyes Lakhal; Albert Clapes; Sergio Escalera; Oswald Lanz; Andrea Cavallaro
	Title	Residual Stacked RNNs for Action Recognition			Type	Conference Article
	Year	2018	Publication	9th International Workshop on Human Behavior Understanding	Abbreviated Journal
	Volume		Issue		Pages	534-548
	Keywords	Action recognition; Deep residual learning; Two-stream RNN
	Abstract	Action recognition pipelines that use Recurrent Neural Networks (RNN) are currently 5–10% less accurate than Convolutional Neural Networks (CNN). While most works that use RNNs employ a 2D CNN on each frame to extract descriptors for action recognition, we extract spatiotemporal features from a 3D CNN and then learn the temporal relationship of these descriptors through a stacked residual recurrent neural network (Res-RNN). We introduce for the first time residual learning to counter the degradation problem in multi-layer RNNs, which have been successful for temporal aggregation in two-stream action recognition pipelines. Finally, we use a late fusion strategy to combine RGB and optical flow data of the two-stream Res-RNN. Experimental results show that the proposed pipeline achieves competitive results on UCF-101 and state-of-the-art results for RNN-like architectures on the challenging HMDB-51 dataset.
	Address	Munich; September 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ECCVW
	Notes	HUPBA; no proj			Approved	no
	Call Number	Admin @ si @ LCE2018b			Serial	3206



	Author	Cristina Palmero; Javier Selva; Mohammad Ali Bagheri; Sergio Escalera
	Title	Recurrent CNN for 3D Gaze Estimation using Appearance and Shape Cues			Type	Conference Article
	Year	2018	Publication	29th British Machine Vision Conference	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Gaze behavior is an important non-verbal cue in social signal processing and human-computer interaction. In this paper, we tackle the problem of person- and head pose-independent 3D gaze estimation from remote cameras, using a multi-modal recurrent convolutional neural network (CNN). We propose to combine face, eyes region, and face landmarks as individual streams in a CNN to estimate gaze in still images. Then, we exploit the dynamic nature of gaze by feeding the learned features of all the frames in a sequence to a many-to-one recurrent module that predicts the 3D gaze vector of the last frame. Our multi-modal static solution is evaluated on a wide range of head poses and gaze directions, achieving a significant improvement of 14.6% over the state of the art on the EYEDIAP dataset, further improved by 4% when the temporal modality is included.
	Address	Newcastle; UK; September 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	BMVC
	Notes	HUPBA; no proj			Approved	no
	Call Number	Admin @ si @ PSB2018			Serial	3208



	Author	Yagmur Gucluturk; Umut Guclu; Xavier Baro; Hugo Jair Escalante; Isabelle Guyon; Sergio Escalera; Marcel A. J. van Gerven; Rob van Lier
	Title	Multimodal First Impression Analysis with Deep Residual Networks			Type	Journal Article
	Year	2018	Publication	IEEE Transactions on Affective Computing	Abbreviated Journal	TAC
	Volume	8	Issue	3	Pages	316-329
	Keywords
	Abstract	People form first impressions about the personalities of unfamiliar individuals even after very brief interactions with them. In this study we present and evaluate several models that mimic this automatic social behavior. Specifically, we present several models trained on a large dataset of short YouTube video blog posts for predicting apparent Big Five personality traits of people and whether they seem suitable to be recommended for a job interview. Along with presenting our audiovisual approach and results that won third place in the ChaLearn First Impressions Challenge, we investigate modeling in different modalities including audio only, visual only, language only, audiovisual, and a combination of audiovisual and language. Our results demonstrate that the best performance could be obtained using a fusion of all data modalities. Finally, in order to promote explainability in machine learning and to provide an example for the upcoming ChaLearn challenges, we present a simple approach for explaining the predictions for job interview recommendations.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	HUPBA; no proj			Approved	no
	Call Number	Admin @ si @ GGB2018			Serial	3210



	Author	Gabriela Ramirez; Esau Villatoro; Bogdan Ionescu; Hugo Jair Escalante; Sergio Escalera; Martha Larson; Henning Muller; Isabelle Guyon
	Title	Overview of the Multimedia Information Processing for Personality & Social Networks Analysis Contes			Type	Conference Article
	Year	2018	Publication	Multimedia Information Processing for Personality and Social Networks Analysis (MIPPSNA 2018)	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract
	Address	Beijing; China; August 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICPRW
	Notes	HUPBA			Approved	no
	Call Number	Admin @ si @ RVI2018			Serial	3211



	Author	Ester Fornells; Manuel De Armas; Maria Teresa Anguera; Sergio Escalera; Marcos Antonio Catalán; Josep Moya
	Title	Desarrollo del proyecto del Consell Comarcal del Baix Llobregat “Buen Trato a las personas mayores y aquellas en situación de fragilidad con sufrimiento emocional: Hacia un envejecimiento saludable”			Type	Journal
	Year	2018	Publication	Informaciones Psiquiatricas	Abbreviated Journal
	Volume	232	Issue		Pages	47-59
	Keywords
	Abstract
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	0210-7279	ISBN		Medium
	Area		Expedition		Conference
	Notes	HUPBA; no menciona			Approved	no
	Call Number	Admin @ si @ FAA2018			Serial	3214



	Author	Ilke Demir; Dena Bazazian; Adriana Romero; Viktoriia Sharmanska; Lyne P. Tchapmi
	Title	WiCV 2018: The Fourth Women In Computer Vision Workshop			Type	Conference Article
	Year	2018	Publication	4th Women in Computer Vision Workshop	Abbreviated Journal
	Volume		Issue		Pages	1941-19412
	Keywords	Conferences; Computer vision; Industries; Object recognition; Engineering profession; Collaboration; Machine learning
	Abstract	We present WiCV 2018 – Women in Computer Vision Workshop to increase the visibility and inclusion of women researchers in the computer vision field, organized in conjunction with CVPR 2018. Computer vision and machine learning have made incredible progress over the past years, yet the number of female researchers is still low both in academia and industry. WiCV is organized to raise the visibility of female researchers, to increase collaboration, and to provide mentorship and give opportunities to female-identifying junior researchers in the field. In its fourth year, we are proud to present the changes and improvements over the past years, a summary of statistics for presenters and attendees, followed by expectations from future generations.
	Address	Salt Lake City; USA; June 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	WiCV
	Notes	DAG; 600.121; 600.129			Approved	no
	Call Number	Admin @ si @ DBR2018			Serial	3222



	Author	Arnau Baro; Pau Riba; Alicia Fornes
	Title	A Starting Point for Handwritten Music Recognition			Type	Conference Article
	Year	2018	Publication	1st International Workshop on Reading Music Systems	Abbreviated Journal
	Volume		Issue		Pages	5-6
	Keywords	Optical Music Recognition; Long Short-Term Memory; Convolutional Neural Networks; MUSCIMA++; CVCMUSCIMA
	Abstract	In recent years, interest in Optical Music Recognition (OMR) has reawakened, especially since the appearance of deep learning. However, there are very few works addressing handwritten scores. In this work we describe a full OMR pipeline for handwritten music scores using Convolutional and Recurrent Neural Networks that could serve as a baseline for the research community.
	Address	Paris; France; September 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	WORMS
	Notes	DAG; 600.097; 601.302; 601.330; 600.121			Approved	no
	Call Number	Admin @ si @ BRF2018			Serial	3223