|   | 
Details
   web
Records
Author Sergio Escalera; Markus Weimer; Mikhail Burtsev; Valentin Malykh; Varvara Logacheva; Ryan Lowe; Iulian Vlad Serban; Yoshua Bengio; Alexander Rudnicky; Alan W. Black; Shrimai Prabhumoye; Łukasz Kidzinski; Mohanty Sharada; Carmichael Ong; Jennifer Hicks; Sergey Levine; Marcel Salathe; Scott Delp; Iker Huerga; Alexander Grigorenko; Leifur Thorbergsson; Anasuya Das; Kyla Nemitz; Jenna Sandker; Stephen King; Alexander S. Ecker; Leon A. Gatys; Matthias Bethge; Jordan Boyd Graber; Shi Feng; Pedro Rodriguez; Mohit Iyyer; He He; Hal Daume III; Sean McGregor; Amir Banifatemi; Alexey Kurakin; Ian Goodfellow; Samy Bengio
Title Introduction to NIPS 2017 Competition Track Type Book Chapter
Year 2018 Publication The NIPS ’17 Competition: Building Intelligent Systems Abbreviated Journal
Volume Issue Pages 1-23
Keywords
Abstract Competitions have become a popular tool in the data science community to solve hard problems, assess the state of the art and spur new research directions. Companies like Kaggle and open source platforms like Codalab connect people with data and a data science problem to those with the skills and means to solve it. Hence, the question arises: What, if anything, could NIPS add to this rich ecosystem?

In 2017, we embarked to find out. We attracted 23 potential competitions, of which we selected five to be NIPS 2017 competitions. Our final selection features competitions advancing the state of the art in other sciences such as “Classifying Clinically Actionable Genetic Mutations” and “Learning to Run”. Others, like “The Conversational Intelligence Challenge” and “Adversarial Attacks and Defences” generated new data sets that we expect to impact the progress in their respective communities for years to come. And “Human-Computer Question Answering Competition” showed us just how far we as a field have come in ability and efficiency since the break-through performance of Watson in Jeopardy. Two additional competitions, DeepArt and AI XPRIZE Milestions, were also associated to the NIPS 2017 competition track, whose results are also presented within this chapter.
Address
Corporate Author Thesis
Publisher Springer Place of Publication Editor Sergio Escalera; Markus Weimer
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title (up)
Series Volume Series Issue Edition
ISSN ISBN 978-3-319-94042-7 Medium
Area Expedition Conference
Notes HUPBA; no proj Approved no
Call Number Admin @ si @ EWB2018 Serial 3200
Permanent link to this record
 

 
Author Giacomo Magnifico; Beata Megyesi; Mohamed Ali Souibgui; Jialuo Chen; Alicia Fornes
Title Lost in Transcription of Graphic Signs in Ciphers Type Conference Article
Year 2022 Publication International Conference on Historical Cryptology (HistoCrypt 2022) Abbreviated Journal
Volume Issue Pages 153-158
Keywords transcription of ciphers; hand-written text recognition of symbols; graphic signs
Abstract Hand-written Text Recognition techniques with the aim to automatically identify and transcribe hand-written text have been applied to historical sources including ciphers. In this paper, we compare the performance of two machine learning architectures, an unsupervised method based on clustering and a deep learning method with few-shot learning. Both models are tested on seen and unseen data from historical ciphers with different symbol sets consisting of various types of graphic signs. We compare the models and highlight their differences in performance, with their advantages and shortcomings.
Address Amsterdam, Netherlands, June 20-22, 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title (up)
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference HystoCrypt
Notes DAG; 600.121; 600.162; 602.230; 600.140 Approved no
Call Number Admin @ si @ MBS2022 Serial 3731
Permanent link to this record
 

 
Author Pau Riba; Lutz Goldmann; Oriol Ramos Terrades; Diede Rusticus; Alicia Fornes; Josep Llados
Title Table detection in business document images by message passing networks Type Journal Article
Year 2022 Publication Pattern Recognition Abbreviated Journal PR
Volume 127 Issue Pages 108641
Keywords
Abstract Tabular structures in business documents offer a complementary dimension to the raw textual data. For instance, there is information about the relationships among pieces of information. Nowadays, digital mailroom applications have become a key service for workflow automation. Therefore, the detection and interpretation of tables is crucial. With the recent advances in information extraction, table detection and recognition has gained interest in document image analysis, in particular, with the absence of rule lines and unknown information about rows and columns. However, business documents usually contain sensitive contents limiting the amount of public benchmarking datasets. In this paper, we propose a graph-based approach for detecting tables in document images which do not require the raw content of the document. Hence, the sensitive content can be previously removed and, instead of using the raw image or textual content, we propose a purely structural approach to keep sensitive data anonymous. Our framework uses graph neural networks (GNNs) to describe the local repetitive structures that constitute a table. In particular, our main application domain are business documents. We have carefully validated our approach in two invoice datasets and a modern document benchmark. Our experiments demonstrate that tables can be detected by purely structural approaches.
Address July 2022
Corporate Author Thesis
Publisher Elsevier Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title (up)
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.162; 600.121 Approved no
Call Number Admin @ si @ RGR2022 Serial 3729
Permanent link to this record
 

 
Author Meysam Madadi; Sergio Escalera; Alex Carruesco Llorens; Carlos Andujar; Xavier Baro; Jordi Gonzalez
Title Top-down model fitting for hand pose recovery in sequences of depth images Type Journal Article
Year 2018 Publication Image and Vision Computing Abbreviated Journal IMAVIS
Volume 79 Issue Pages 63-75
Keywords
Abstract State-of-the-art approaches on hand pose estimation from depth images have reported promising results under quite controlled considerations. In this paper we propose a two-step pipeline for recovering the hand pose from a sequence of depth images. The pipeline has been designed to deal with images taken from any viewpoint and exhibiting a high degree of finger occlusion. In a first step we initialize the hand pose using a part-based model, fitting a set of hand components in the depth images. In a second step we consider temporal data and estimate the parameters of a trained bilinear model consisting of shape and trajectory bases. We evaluate our approach on a new created synthetic hand dataset along with NYU and MSRA real datasets. Results demonstrate that the proposed method outperforms the most recent pose recovering approaches, including those based on CNNs.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title (up)
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HUPBA; 600.098 Approved no
Call Number Admin @ si @ MEC2018 Serial 3203
Permanent link to this record
 

 
Author Mohamed Ilyes Lakhal; Albert Clapes; Sergio Escalera; Oswald Lanz; Andrea Cavallaro
Title Residual Stacked RNNs for Action Recognition Type Conference Article
Year 2018 Publication 9th International Workshop on Human Behavior Understanding Abbreviated Journal
Volume Issue Pages 534-548
Keywords Action recognition; Deep residual learning; Two-stream RNN
Abstract Action recognition pipelines that use Recurrent Neural Networks (RNN) are currently 5–10% less accurate than Convolutional Neural Networks (CNN). While most works that use RNNs employ a 2D CNN on each frame to extract descriptors for action recognition, we extract spatiotemporal features from a 3D CNN and then learn the temporal relationship of these descriptors through a stacked residual recurrent neural network (Res-RNN). We introduce for the first time residual learning to counter the degradation problem in multi-layer RNNs, which have been successful for temporal aggregation in two-stream action recognition pipelines. Finally, we use a late fusion strategy to combine RGB and optical flow data of the two-stream Res-RNN. Experimental results show that the proposed pipeline achieves competitive results on UCF-101 and state of-the-art results for RNN-like architectures on the challenging HMDB-51 dataset.
Address Munich; September 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title (up)
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ECCVW
Notes HUPBA; no proj Approved no
Call Number Admin @ si @ LCE2018b Serial 3206
Permanent link to this record
 

 
Author Giuseppe Pezzano; Oliver Diaz; Vicent Ribas Ripoll; Petia Radeva
Title CoLe-CNN+: Context learning – Convolutional neural network for COVID-19-Ground-Glass-Opacities detection and segmentation Type Journal Article
Year 2021 Publication Computers in Biology and Medicine Abbreviated Journal CBM
Volume 136 Issue Pages 104689
Keywords
Abstract The most common tool for population-wide COVID-19 identification is the Reverse Transcription-Polymerase Chain Reaction test that detects the presence of the virus in the throat (or sputum) in swab samples. This test has a sensitivity between 59% and 71%. However, this test does not provide precise information regarding the extension of the pulmonary infection. Moreover, it has been proven that through the reading of a computed tomography (CT) scan, a clinician can provide a more complete perspective of the severity of the disease. Therefore, we propose a comprehensive system for fully-automated COVID-19 detection and lesion segmentation from CT scans, powered by deep learning strategies to support decision-making process for the diagnosis of COVID-19.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title (up)
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MILAB; no menciona Approved no
Call Number Admin @ si @ PDR2021 Serial 3635
Permanent link to this record
 

 
Author Cristina Palmero; Javier Selva; Mohammad Ali Bagueri; Sergio Escalera
Title Recurrent CNN for 3D Gaze Estimation using Appearance and Shape Cues Type Conference Article
Year 2018 Publication 29th British Machine Vision Conference Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Gaze behavior is an important non-verbal cue in social signal processing and humancomputer interaction. In this paper, we tackle the problem of person- and head poseindependent 3D gaze estimation from remote cameras, using a multi-modal recurrent convolutional neural network (CNN). We propose to combine face, eyes region, and face landmarks as individual streams in a CNN to estimate gaze in still images. Then, we exploit the dynamic nature of gaze by feeding the learned features of all the frames in a sequence to a many-to-one recurrent module that predicts the 3D gaze vector of the last frame. Our multi-modal static solution is evaluated on a wide range of head poses and gaze directions, achieving a significant improvement of 14.6% over the state of the art on
EYEDIAP dataset, further improved by 4% when the temporal modality is included.
Address Newcastle; UK; September 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title (up)
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference BMVC
Notes HUPBA; no proj Approved no
Call Number Admin @ si @ PSB2018 Serial 3208
Permanent link to this record
 

 
Author Yagmur Gucluturk; Umut Guclu; Xavier Baro; Hugo Jair Escalante; Isabelle Guyon; Sergio Escalera; Marcel A. J. van Gerven; Rob van Lier
Title Multimodal First Impression Analysis with Deep Residual Networks Type Journal Article
Year 2018 Publication IEEE Transactions on Affective Computing Abbreviated Journal TAC
Volume 8 Issue 3 Pages 316-329
Keywords
Abstract People form first impressions about the personalities of unfamiliar individuals even after very brief interactions with them. In this study we present and evaluate several models that mimic this automatic social behavior. Specifically, we present several models trained on a large dataset of short YouTube video blog posts for predicting apparent Big Five personality traits of people and whether they seem suitable to be recommended to a job interview. Along with presenting our audiovisual approach and results that won the third place in the ChaLearn First Impressions Challenge, we investigate modeling in different modalities including audio only, visual only, language only, audiovisual, and combination of audiovisual and language. Our results demonstrate that the best performance could be obtained using a fusion of all data modalities. Finally, in order to promote explainability in machine learning and to provide an example for the upcoming ChaLearn challenges, we present a simple approach for explaining the predictions for job interview recommendations
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title (up)
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HUPBA; no proj Approved no
Call Number Admin @ si @ GGB2018 Serial 3210
Permanent link to this record
 

 
Author Gabriela Ramirez; Esau Villatoro; Bogdan Ionescu; Hugo Jair Escalante; Sergio Escalera; Martha Larson; Henning Muller; Isabelle Guyon
Title Overview of the Multimedia Information Processing for Personality & Social Networks Analysis Contes Type Conference Article
Year 2018 Publication Multimedia Information Processing for Personality and Social Networks Analysis (MIPPSNA 2018) Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address Beijing; China; August 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title (up)
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICPRW
Notes HUPBA Approved no
Call Number Admin @ si @ RVI2018 Serial 3211
Permanent link to this record
 

 
Author Rain Eric Haamer; Eka Rusadze; Iiris Lusi; Tauseef Ahmed; Sergio Escalera; Gholamreza Anbarjafari
Title Review on Emotion Recognition Databases Type Book Chapter
Year 2018 Publication Human-Robot Interaction: Theory and Application Abbreviated Journal
Volume Issue Pages
Keywords emotion; computer vision; databases
Abstract Over the past few decades human-computer interaction has become more important in our daily lives and research has developed in many directions: memory research, depression detection, and behavioural deficiency detection, lie detection, (hidden) emotion recognition etc. Because of that, the number of generic emotion and face databases or those tailored to specific needs have grown immensely large. Thus, a comprehensive yet compact guide is needed to help researchers find the most suitable database and understand what types of databases already exist. In this paper, different elicitation methods are discussed and the databases are primarily organized into neat and informative tables based on the format.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title (up)
Series Volume Series Issue Edition
ISSN ISBN 978-1-78923-316-2 Medium
Area Expedition Conference
Notes HUPBA; 602.133 Approved no
Call Number Admin @ si @ HRL2018 Serial 3212
Permanent link to this record
 

 
Author Reza Azad; Maryam Asadi-Aghbolaghi; Shohreh Kasaei; Sergio Escalera
Title Dynamic 3D Hand Gesture Recognition by Learning Weighted Depth Motion Maps Type Journal Article
Year 2019 Publication IEEE Transactions on Circuits and Systems for Video Technology Abbreviated Journal TCSVT
Volume 29 Issue 6 Pages 1729-1740
Keywords Hand gesture recognition; Multilevel temporal sampling; Weighted depth motion map; Spatio-temporal description; VLAD encoding
Abstract Hand gesture recognition from sequences of depth maps is a challenging computer vision task because of the low inter-class and high intra-class variability, different execution rates of each gesture, and the high articulated nature of human hand. In this paper, a multilevel temporal sampling (MTS) method is first proposed that is based on the motion energy of key-frames of depth sequences. As a result, long, middle, and short sequences are generated that contain the relevant gesture information. The MTS results in increasing the intra-class similarity while raising the inter-class dissimilarities. The weighted depth motion map (WDMM) is then proposed to extract the spatio-temporal information from generated summarized sequences by an accumulated weighted absolute difference of consecutive frames. The histogram of gradient (HOG) and local binary pattern (LBP) are exploited to extract features from WDMM. The obtained results define the current state-of-the-art on three public benchmark datasets of: MSR Gesture 3D, SKIG, and MSR Action 3D, for 3D hand gesture recognition. We also achieve competitive results on NTU action dataset.
Address June 2019,
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title (up)
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HUPBA; no proj Approved no
Call Number Admin @ si @ AAK2018 Serial 3213
Permanent link to this record
 

 
Author Ester Fornells; Manuel De Armas; Maria Teresa Anguera; Sergio Escalera; Marcos Antonio Catalán; Josep Moya
Title Desarrollo del proyecto del Consell Comarcal del Baix Llobregat “Buen Trato a las personas mayores y aquellas en situación de fragilidad con sufrimiento emocional: Hacia un envejecimiento saludable” Type Journal
Year 2018 Publication Informaciones Psiquiatricas Abbreviated Journal
Volume 232 Issue Pages 47-59
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title (up)
Series Volume Series Issue Edition
ISSN 0210-7279 ISBN Medium
Area Expedition Conference
Notes HUPBA; no menciona Approved no
Call Number Admin @ si @ FAA2018 Serial 3214
Permanent link to this record
 

 
Author Suman Ghosh
Title Word Spotting and Recognition in Images from Heterogeneous Sources A Type Book Whole
Year 2018 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Text is the most common way of information sharing from ages. With recent development of personal images databases and handwritten historic manuscripts the demand for algorithms to make these databases accessible for browsing and indexing are in rise. Enabling search or understanding large collection of manuscripts or image databases needs fast and robust methods. Researchers have found different ways to represent cropped words for understanding and matching, which works well when words are already segmented. However there is no trivial way to extend these for non-segmented documents. In this thesis we explore different methods for text retrieval and recognition from unsegmented document and scene images. Two different ways of representation exist in literature, one uses a fixed length representation learned from cropped words and another a sequence of features of variable length. Throughout this thesis, we have studied both these representation for their suitability in segmentation free understanding of text. In the first part we are focused on segmentation free word spotting using a fixed length representation. We extended the use of the successful PHOC (Pyramidal Histogram of Character) representation to segmentation free retrieval. In the second part of the thesis, we explore sequence based features and finally, we propose a unified solution where the same framework can generate both kind of representations.
Address November 2018
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Ernest Valveny
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title (up)
Series Volume Series Issue Edition
ISSN ISBN 978-84-948531-0-4 Medium
Area Expedition Conference
Notes DAG; 600.121 Approved no
Call Number Admin @ si @ Gho2018 Serial 3217
Permanent link to this record
 

 
Author Gholamreza Anbarjafari; Sergio Escalera
Title Human-Robot Interaction: Theory and Application Type Book Whole
Year 2018 Publication Human-Robot Interaction: Theory and Application Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title (up)
Series Volume Series Issue Edition
ISSN ISBN 978-1-78923-316-2 Medium
Area Expedition Conference
Notes HUPBA Approved no
Call Number Admin @ si @ AnE2018 Serial 3216
Permanent link to this record
 

 
Author Mikhail Mozerov; Joost Van de Weijer
Title One-view occlusion detection for stereo matching with a fully connected CRF model Type Journal Article
Year 2019 Publication IEEE Transactions on Image Processing Abbreviated Journal TIP
Volume 28 Issue 6 Pages 2936-2947
Keywords Stereo matching; energy minimization; fully connected MRF model; geodesic distance filter
Abstract In this paper, we extend the standard belief propagation (BP) sequential technique proposed in the tree-reweighted sequential method [15] to the fully connected CRF models with the geodesic distance affinity. The proposed method has been applied to the stereo matching problem. Also a new approach to the BP marginal solution is proposed that we call one-view occlusion detection (OVOD). In contrast to the standard winner takes all (WTA) estimation, the proposed OVOD solution allows to find occluded regions in the disparity map and simultaneously improve the matching result. As a result we can perform only
one energy minimization process and avoid the cost calculation for the second view and the left-right check procedure. We show that the OVOD approach considerably improves results for cost augmentation and energy minimization techniques in comparison with the standard one-view affinity space implementation. We apply our method to the Middlebury data set and reach state-ofthe-art especially for median, average and mean squared error metrics.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title (up)
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP; 600.098; 600.109; 602.133; 600.120 Approved no
Call Number Admin @ si @ MoW2019 Serial 3221
Permanent link to this record