Publicacions CVC -- Query Results

	Publicacions CVC Home \| Show All \| Simple Search \| Advanced Search \| Add Record \| Import	Login Quick Search: Field: contains: ...
	76–90 of 155 records found matching your query (RSS \| history):

Search & Display Options

Select All Deselect All

<< 1 2 3 4 5 6 7 8 9 10 >> [11–11]

List View

Citations

Details

	Records
	Author	Jun Wan; Sergio Escalera; Francisco Perales; Josef Kittler
	Title	Articulated Motion and Deformable Objects			Type	Journal Article
	Year	2018	Publication	Pattern Recognition	Abbreviated Journal	PR
	Volume	79	Issue		Pages	55-64
	Keywords
	Abstract	This guest editorial introduces the twenty two papers accepted for this Special Issue on Articulated Motion and Deformable Objects (AMDO). They are grouped into four main categories within the field of AMDO: human motion analysis (action/gesture), human pose estimation, deformable shape segmentation, and face analysis. For each of the four topics, a survey of the recent developments in the field is presented. The accepted papers are briefly introduced in the context of this survey. They contribute novel methods, algorithms with improved performance as measured on benchmarking datasets, as well as two new datasets for hand action detection and human posture analysis. The special issue should be of high relevance to the reader interested in AMDO recognition and promote future research directions in the field.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	HUPBA; no proj			Approved	no
	Call Number	Admin @ si @ WEP2018			Serial	3126
Permanent link to this record



	Author	Razieh Rastgoo; Kourosh Kiani; Sergio Escalera
	Title	Multi-Modal Deep Hand Sign Language Recognition in Still Images Using Restricted Boltzmann Machine			Type	Journal Article
	Year	2018	Publication	Entropy	Abbreviated Journal	ENTROPY
	Volume	20	Issue	11	Pages	809
	Keywords	hand sign language; deep learning; restricted Boltzmann machine (RBM); multi-modal; profoundly deaf; noisy image
	Abstract	In this paper, a deep learning approach, Restricted Boltzmann Machine (RBM), is used to perform automatic hand sign language recognition from visual data. We evaluate how RBM, as a deep generative model, is capable of generating the distribution of the input data for an enhanced recognition of unseen data. Two modalities, RGB and Depth, are considered in the model input in three forms: original image, cropped image, and noisy cropped image. Five crops of the input image are used and the hand of these cropped images are detected using Convolutional Neural Network (CNN). After that, three types of the detected hand images are generated for each modality and input to RBMs. The outputs of the RBMs for two modalities are fused in another RBM in order to recognize the output sign label of the input image. The proposed multi-modal model is trained on all and part of the American alphabet and digits of four publicly available datasets. We also evaluate the robustness of the proposal against noise. Experimental results show that the proposed multi-modal model, using crops and the RBM fusing methodology, achieves state-of-the-art results on Massey University Gesture Dataset 2012, American Sign Language (ASL). and Fingerspelling Dataset from the University of Surrey’s Center for Vision, Speech and Signal Processing, NYU, and ASL Fingerspelling A datasets.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	HUPBA; no proj			Approved	no
	Call Number	Admin @ si @ RKE2018			Serial	3198
Permanent link to this record



	Author	Sergio Escalera; Markus Weimer; Mikhail Burtsev; Valentin Malykh; Varvara Logacheva; Ryan Lowe; Iulian Vlad Serban; Yoshua Bengio; Alexander Rudnicky; Alan W. Black; Shrimai Prabhumoye; Łukasz Kidzinski; Mohanty Sharada; Carmichael Ong; Jennifer Hicks; Sergey Levine; Marcel Salathe; Scott Delp; Iker Huerga; Alexander Grigorenko; Leifur Thorbergsson; Anasuya Das; Kyla Nemitz; Jenna Sandker; Stephen King; Alexander S. Ecker; Leon A. Gatys; Matthias Bethge; Jordan Boyd Graber; Shi Feng; Pedro Rodriguez; Mohit Iyyer; He He; Hal Daume III; Sean McGregor; Amir Banifatemi; Alexey Kurakin; Ian Goodfellow; Samy Bengio
	Title	Introduction to NIPS 2017 Competition Track			Type	Book Chapter
	Year	2018	Publication	The NIPS ’17 Competition: Building Intelligent Systems	Abbreviated Journal
	Volume		Issue		Pages	1-23
	Keywords
	Abstract	Competitions have become a popular tool in the data science community to solve hard problems, assess the state of the art and spur new research directions. Companies like Kaggle and open source platforms like Codalab connect people with data and a data science problem to those with the skills and means to solve it. Hence, the question arises: What, if anything, could NIPS add to this rich ecosystem? In 2017, we embarked to find out. We attracted 23 potential competitions, of which we selected five to be NIPS 2017 competitions. Our final selection features competitions advancing the state of the art in other sciences such as “Classifying Clinically Actionable Genetic Mutations” and “Learning to Run”. Others, like “The Conversational Intelligence Challenge” and “Adversarial Attacks and Defences” generated new data sets that we expect to impact the progress in their respective communities for years to come. And “Human-Computer Question Answering Competition” showed us just how far we as a field have come in ability and efficiency since the break-through performance of Watson in Jeopardy. Two additional competitions, DeepArt and AI XPRIZE Milestions, were also associated to the NIPS 2017 competition track, whose results are also presented within this chapter.
	Address
	Corporate Author				Thesis
	Publisher	Springer	Place of Publication		Editor	Sergio Escalera; Markus Weimer
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-3-319-94042-7	Medium
	Area		Expedition		Conference
	Notes	HUPBA; no proj			Approved	no
	Call Number	Admin @ si @ EWB2018			Serial	3200
Permanent link to this record



	Author	Ciprian Corneanu; Meysam Madadi; Sergio Escalera
	Title	Deep Structure Inference Network for Facial Action Unit Recognition			Type	Conference Article
	Year	2018	Publication	15th European Conference on Computer Vision	Abbreviated Journal
	Volume	11216	Issue		Pages	309-324
	Keywords	Computer Vision; Machine Learning; Deep Learning; Facial Expression Analysis; Facial Action Units; Structure Inference
	Abstract	Facial expressions are combinations of basic components called Action Units (AU). Recognizing AUs is key for general facial expression analysis. Recently, efforts in automatic AU recognition have been dedicated to learning combinations of local features and to exploiting correlations between AUs. We propose a deep neural architecture that tackles both problems by combining learned local and global features in its initial stages and replicating a message passing algorithm between classes similar to a graphical model inference approach in later stages. We show that by training the model end-to-end with increased supervision we improve state-of-the-art by 5.3% and 8.2% performance on BP4D and DISFA datasets, respectively.
	Address	Munich; September 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ECCV
	Notes	HUPBA; no proj			Approved	no
	Call Number	Admin @ si @ CME2018			Serial	3205
Permanent link to this record



	Author	Mohamed Ilyes Lakhal; Albert Clapes; Sergio Escalera; Oswald Lanz; Andrea Cavallaro
	Title	Residual Stacked RNNs for Action Recognition			Type	Conference Article
	Year	2018	Publication	9th International Workshop on Human Behavior Understanding	Abbreviated Journal
	Volume		Issue		Pages	534-548
	Keywords	Action recognition; Deep residual learning; Two-stream RNN
	Abstract	Action recognition pipelines that use Recurrent Neural Networks (RNN) are currently 5–10% less accurate than Convolutional Neural Networks (CNN). While most works that use RNNs employ a 2D CNN on each frame to extract descriptors for action recognition, we extract spatiotemporal features from a 3D CNN and then learn the temporal relationship of these descriptors through a stacked residual recurrent neural network (Res-RNN). We introduce for the first time residual learning to counter the degradation problem in multi-layer RNNs, which have been successful for temporal aggregation in two-stream action recognition pipelines. Finally, we use a late fusion strategy to combine RGB and optical flow data of the two-stream Res-RNN. Experimental results show that the proposed pipeline achieves competitive results on UCF-101 and state of-the-art results for RNN-like architectures on the challenging HMDB-51 dataset.
	Address	Munich; September 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ECCVW
	Notes	HUPBA; no proj			Approved	no
	Call Number	Admin @ si @ LCE2018b			Serial	3206
Permanent link to this record



	Author	Cristina Palmero; Javier Selva; Mohammad Ali Bagheri; Sergio Escalera
	Title	Recurrent CNN for 3D Gaze Estimation using Appearance and Shape Cues			Type	Conference Article
	Year	2018	Publication	29th British Machine Vision Conference	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Gaze behavior is an important non-verbal cue in social signal processing and humancomputer interaction. In this paper, we tackle the problem of person- and head poseindependent 3D gaze estimation from remote cameras, using a multi-modal recurrent convolutional neural network (CNN). We propose to combine face, eyes region, and face landmarks as individual streams in a CNN to estimate gaze in still images. Then, we exploit the dynamic nature of gaze by feeding the learned features of all the frames in a sequence to a many-to-one recurrent module that predicts the 3D gaze vector of the last frame. Our multi-modal static solution is evaluated on a wide range of head poses and gaze directions, achieving a significant improvement of 14.6% over the state of the art on EYEDIAP dataset, further improved by 4% when the temporal modality is included.
	Address	Newcastle; UK; September 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	BMVC
	Notes	HUPBA; no proj			Approved	no
	Call Number	Admin @ si @ PSB2018			Serial	3208
Permanent link to this record



	Author	Yagmur Gucluturk; Umut Guclu; Xavier Baro; Hugo Jair Escalante; Isabelle Guyon; Sergio Escalera; Marcel A. J. van Gerven; Rob van Lier
	Title	Multimodal First Impression Analysis with Deep Residual Networks			Type	Journal Article
	Year	2018	Publication	IEEE Transactions on Affective Computing	Abbreviated Journal	TAC
	Volume	8	Issue	3	Pages	316-329
	Keywords
	Abstract	People form first impressions about the personalities of unfamiliar individuals even after very brief interactions with them. In this study we present and evaluate several models that mimic this automatic social behavior. Specifically, we present several models trained on a large dataset of short YouTube video blog posts for predicting apparent Big Five personality traits of people and whether they seem suitable to be recommended to a job interview. Along with presenting our audiovisual approach and results that won the third place in the ChaLearn First Impressions Challenge, we investigate modeling in different modalities including audio only, visual only, language only, audiovisual, and combination of audiovisual and language. Our results demonstrate that the best performance could be obtained using a fusion of all data modalities. Finally, in order to promote explainability in machine learning and to provide an example for the upcoming ChaLearn challenges, we present a simple approach for explaining the predictions for job interview recommendations
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	HUPBA; no proj			Approved	no
	Call Number	Admin @ si @ GGB2018			Serial	3210
Permanent link to this record



	Author	Miguel Angel Bautista; Oriol Pujol; Fernando De la Torre; Sergio Escalera
	Title	Error-Correcting Factorization			Type	Journal Article
	Year	2018	Publication	IEEE Transactions on Pattern Analysis and Machine Intelligence	Abbreviated Journal	TPAMI
	Volume	40	Issue		Pages	2388-2401
	Keywords
	Abstract	Error Correcting Output Codes (ECOC) is a successful technique in multi-class classification, which is a core problem in Pattern Recognition and Machine Learning. A major advantage of ECOC over other methods is that the multi- class problem is decoupled into a set of binary problems that are solved independently. However, literature defines a general error-correcting capability for ECOCs without analyzing how it distributes among classes, hindering a deeper analysis of pair-wise error-correction. To address these limitations this paper proposes an Error-Correcting Factorization (ECF) method, our contribution is three fold: (I) We propose a novel representation of the error-correction capability, called the design matrix, that enables us to build an ECOC on the basis of allocating correction to pairs of classes. (II) We derive the optimal code length of an ECOC using rank properties of the design matrix. (III) ECF is formulated as a discrete optimization problem, and a relaxed solution is found using an efficient constrained block coordinate descent approach. (IV) Enabled by the flexibility introduced with the design matrix we propose to allocate the error-correction on classes that are prone to confusion. Experimental results in several databases show that when allocating the error-correction to confusable classes ECF outperforms state-of-the-art approaches.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	0162-8828	ISBN		Medium
	Area		Expedition		Conference
	Notes	HuPBA; no menciona			Approved	no
	Call Number	Admin @ si @ BPT2018			Serial	3015
Permanent link to this record



	Author	Marc Oliu; Javier Selva; Sergio Escalera
	Title	Folded Recurrent Neural Networks for Future Video Prediction			Type	Conference Article
	Year	2018	Publication	15th European Conference on Computer Vision	Abbreviated Journal
	Volume	11218	Issue		Pages	745-761
	Keywords
	Abstract	Future video prediction is an ill-posed Computer Vision problem that recently received much attention. Its main challenges are the high variability in video content, the propagation of errors through time, and the non-specificity of the future frames: given a sequence of past frames there is a continuous distribution of possible futures. This work introduces bijective Gated Recurrent Units, a double mapping between the input and output of a GRU layer. This allows for recurrent auto-encoders with state sharing between encoder and decoder, stratifying the sequence representation and helping to prevent capacity problems. We show how with this topology only the encoder or decoder needs to be applied for input encoding and prediction, respectively. This reduces the computational cost and avoids re-encoding the predictions when generating a sequence of frames, mitigating the propagation of errors. Furthermore, it is possible to remove layers from an already trained model, giving an insight to the role performed by each layer and making the model more explainable. We evaluate our approach on three video datasets, outperforming state of the art prediction results on MMNIST and UCF101, and obtaining competitive results on KTH with 2 and 3 times less memory usage and computational cost than the best scored approach.
	Address	Munich; September 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ECCV
	Notes	HUPBA; no menciona			Approved	no
	Call Number	Admin @ si @ OSE2018			Serial	3204
Permanent link to this record



	Author	Ester Fornells; Manuel De Armas; Maria Teresa Anguera; Sergio Escalera; Marcos Antonio Catalán; Josep Moya
	Title	Desarrollo del proyecto del Consell Comarcal del Baix Llobregat “Buen Trato a las personas mayores y aquellas en situación de fragilidad con sufrimiento emocional: Hacia un envejecimiento saludable”			Type	Journal
	Year	2018	Publication	Informaciones Psiquiatricas	Abbreviated Journal
	Volume	232	Issue		Pages	47-59
	Keywords
	Abstract
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	0210-7279	ISBN		Medium
	Area		Expedition		Conference
	Notes	HUPBA; no menciona			Approved	no
	Call Number	Admin @ si @ FAA2018			Serial	3214
Permanent link to this record



	Author	Hugo Jair Escalante; Sergio Escalera; Isabelle Guyon; Xavier Baro; Yagmur Gucluturk; Umut Guçlu; Marcel van Gerven
	Title	Explainable and Interpretable Models in Computer Vision and Machine Learning			Type	Book Whole
	Year	2018	Publication	The Springer Series on Challenges in Machine Learning	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	This book compiles leading research on the development of explainable and interpretable machine learning methods in the context of computer vision and machine learning. Research progress in computer vision and pattern recognition has led to a variety of modeling techniques with almost human-like performance. Although these models have obtained astounding results, they are limited in their explainability and interpretability: what is the rationale behind the decision made? what in the model structure explains its functioning? Hence, while good performance is a critical required characteristic for learning machines, explainability and interpretability capabilities are needed to take learning machines to the next step to include them in decision support systems involving human supervision. This book, written by leading international researchers, addresses key topics of explainability and interpretability, including the following: ·Evaluation and Generalization in Interpretable Machine Learning ·Explanation Methods in Deep Learning ·Learning Functional Causal Models with Generative Neural Networks ·Learning Interpreatable Rules for Multi-Label Classification ·Structuring Neural Networks for More Explainable Predictions ·Generating Post Hoc Rationales of Deep Visual Classification Decisions ·Ensembling Visual Explanations ·Explainable Deep Driving by Visualizing Causal Attention ·Interdisciplinary Perspective on Algorithmic Job Candidate Search ·Multimodal Personality Trait Analysis for Explainable Modeling of Job Interview Decisions ·Inherent Explainability Pattern Theory-based Video Event Interpretations
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	HuPBA; no menciona			Approved	no
	Call Number	Admin @ si @ EEG2018			Serial	3399
Permanent link to this record



	Author	Sergio Escalera; Jordi Gonzalez; Hugo Jair Escalante; Xavier Baro; Isabelle Guyon
	Title	Looking at People Special Issue			Type	Journal Article
	Year	2018	Publication	International Journal of Computer Vision	Abbreviated Journal	IJCV
	Volume	126	Issue	2-4	Pages	141-143
	Keywords
	Abstract
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	HUPBA; ISE; 600.119			Approved	no
	Call Number	Admin @ si @ EGJ2018			Serial	3093
Permanent link to this record



	Author	Julio C. S. Jacques Junior; Xavier Baro; Sergio Escalera
	Title	Exploiting feature representations through similarity learning, post-ranking and ranking aggregation for person re-identification			Type	Journal Article
	Year	2018	Publication	Image and Vision Computing	Abbreviated Journal	IMAVIS
	Volume	79	Issue		Pages	76-85
	Keywords
	Abstract	Person re-identification has received special attention by the human analysis community in the last few years. To address the challenges in this field, many researchers have proposed different strategies, which basically exploit either cross-view invariant features or cross-view robust metrics. In this work, we propose to exploit a post-ranking approach and combine different feature representations through ranking aggregation. Spatial information, which potentially benefits the person matching, is represented using a 2D body model, from which color and texture information are extracted and combined. We also consider background/foreground information, automatically extracted via Deep Decompositional Network, and the usage of Convolutional Neural Network (CNN) features. To describe the matching between images we use the polynomial feature map, also taking into account local and global information. The Discriminant Context Information Analysis based post-ranking approach is used to improve initial ranking lists. Finally, the Stuart ranking aggregation method is employed to combine complementary ranking lists obtained from different feature representations. Experimental results demonstrated that we improve the state-of-the-art on VIPeR and PRID450s datasets, achieving 67.21% and 75.64% on top-1 rank recognition rate, respectively, as well as obtaining competitive results on CUHK01 dataset.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	HuPBA; 602.143			Approved	no
	Call Number	Admin @ si @ JBE2018			Serial	3138
Permanent link to this record



	Author	Mohammad N. S. Jahromi; Morten Bojesen Bonderup; Maryam Asadi-Aghbolaghi; Egils Avots; Kamal Nasrollahi; Sergio Escalera; Shohreh Kasaei; Thomas B. Moeslund; Gholamreza Anbarjafari
	Title	Automatic Access Control Based on Face and Hand Biometrics in a Non-cooperative Context			Type	Conference Article
	Year	2018	Publication	IEEE Winter Applications of Computer Vision Workshops	Abbreviated Journal
	Volume		Issue		Pages	28-36
	Keywords	IEEE Winter Applications of Computer Vision Workshops
	Abstract	Automatic access control systems (ACS) based on the human biometrics or physical tokens are widely employed in public and private areas. Yet these systems, in their conventional forms, are restricted to active interaction from the users. In scenarios where users are not cooperating with the system, these systems are challenged. Failure in cooperation with the biometric systems might be intentional or because the users are incapable of handling the interaction procedure with the biometric system or simply forget to cooperate with it, due to for example, illness like dementia. This work introduces a challenging bimodal database, including face and hand information of the users when they approach a door to open it by its handle in a noncooperative context. We have defined two (an easy and a challenging) protocols on how to use the database. We have reported results on many baseline methods, including deep learning techniques as well as conventional methods on the database. The obtained results show the merit of the proposed database and the challenging nature of access control with non-cooperative users.
	Address	Lake Tahoe; USA; March 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	WACVW
	Notes	HUPBA; 602.133			Approved	no
	Call Number	Admin @ si @ JBA2018			Serial	3121
Permanent link to this record



	Author	Jelena Gorbova; Egils Avots; Iiris Lusi; Mark Fishel; Sergio Escalera; Gholamreza Anbarjafari
	Title	Integrating Vision and Language for First Impression Personality Analysis			Type	Journal Article
	Year	2018	Publication	IEEE Multimedia	Abbreviated Journal	MULTIMEDIA
	Volume	25	Issue	2	Pages	24 - 33
	Keywords
	Abstract	The authors present a novel methodology for analyzing integrated audiovisual signals and language to assess a persons personality. An evaluation of their proposed multimodal method using a job candidate screening system that predicted five personality traits from a short video demonstrates the methods effectiveness.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	HUPBA; 602.133			Approved	no
	Call Number	Admin @ si @ GAL2018			Serial	3124
Permanent link to this record