Records
Author Rain Eric Haamer; Eka Rusadze; Iiris Lusi; Tauseef Ahmed; Sergio Escalera; Gholamreza Anbarjafari
Title Review on Emotion Recognition Databases Type Book Chapter
Year 2018 Publication Human-Robot Interaction: Theory and Application Abbreviated Journal
Volume Issue Pages
Keywords emotion; computer vision; databases
Abstract Over the past few decades human-computer interaction has become more important in our daily lives and research has developed in many directions: memory research, depression detection, behavioural deficiency detection, lie detection, (hidden) emotion recognition, etc. Because of that, the number of generic emotion and face databases, and of those tailored to specific needs, has grown immensely. Thus, a comprehensive yet compact guide is needed to help researchers find the most suitable database and understand what types of databases already exist. In this paper, different elicitation methods are discussed and the databases are primarily organized into neat and informative tables based on the format.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-1-78923-316-2 Medium
Area Expedition Conference
Notes HUPBA; 602.133 Approved no
Call Number Admin @ si @ HRL2018 Serial (up) 3212
Permanent link to this record
 

 
Author Reza Azad; Maryam Asadi-Aghbolaghi; Shohreh Kasaei; Sergio Escalera
Title Dynamic 3D Hand Gesture Recognition by Learning Weighted Depth Motion Maps Type Journal Article
Year 2019 Publication IEEE Transactions on Circuits and Systems for Video Technology Abbreviated Journal TCSVT
Volume 29 Issue 6 Pages 1729-1740
Keywords Hand gesture recognition; Multilevel temporal sampling; Weighted depth motion map; Spatio-temporal description; VLAD encoding
Abstract Hand gesture recognition from sequences of depth maps is a challenging computer vision task because of the low inter-class and high intra-class variability, different execution rates of each gesture, and the highly articulated nature of the human hand. In this paper, a multilevel temporal sampling (MTS) method is first proposed that is based on the motion energy of key-frames of depth sequences. As a result, long, middle, and short sequences are generated that contain the relevant gesture information. The MTS increases the intra-class similarity while raising the inter-class dissimilarity. The weighted depth motion map (WDMM) is then proposed to extract the spatio-temporal information from the generated summarized sequences by an accumulated weighted absolute difference of consecutive frames. The histogram of gradient (HOG) and local binary pattern (LBP) are exploited to extract features from the WDMM. The obtained results define the current state-of-the-art on three public benchmark datasets for 3D hand gesture recognition: MSR Gesture 3D, SKIG, and MSR Action 3D. We also achieve competitive results on the NTU action dataset.
Address June 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HUPBA; no proj Approved no
Call Number Admin @ si @ AAK2018 Serial (up) 3213
Permanent link to this record
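
The WDMM step described in the abstract above (an accumulated weighted absolute difference of consecutive frames) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the nested-list frame format, and the linear weighting scheme are assumptions, since the abstract does not say how the weights are chosen.

```python
def weighted_depth_motion_map(frames, weights=None):
    """Accumulate weighted absolute differences of consecutive depth frames.

    frames  : list of 2D depth maps (lists of rows), all of the same size.
    weights : one weight per consecutive-frame difference. If omitted, a
              linear ramp is used so later motion counts more (an assumed
              scheme, chosen only for illustration).
    """
    if len(frames) < 2:
        raise ValueError("need at least two frames")
    n_diffs = len(frames) - 1
    if weights is None:
        weights = [(t + 1) / n_diffs for t in range(n_diffs)]
    if len(weights) != n_diffs:
        raise ValueError("need exactly one weight per frame difference")

    rows, cols = len(frames[0]), len(frames[0][0])
    wdmm = [[0.0] * cols for _ in range(rows)]
    for t in range(n_diffs):
        prev, curr = frames[t], frames[t + 1]
        for r in range(rows):
            for c in range(cols):
                wdmm[r][c] += weights[t] * abs(curr[r][c] - prev[r][c])
    return wdmm


# Tiny hypothetical sequence of three 2x2 depth frames.
frames = [
    [[0, 0], [0, 0]],
    [[1, 0], [0, 0]],
    [[1, 2], [0, 0]],
]
wdmm = weighted_depth_motion_map(frames)
```

In the pipeline the abstract outlines, descriptors such as HOG and LBP would then be computed on the resulting map; that stage is not shown here.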
 

 
Author Ester Fornells; Manuel De Armas; Maria Teresa Anguera; Sergio Escalera; Marcos Antonio Catalán; Josep Moya
Title Desarrollo del proyecto del Consell Comarcal del Baix Llobregat “Buen Trato a las personas mayores y aquellas en situación de fragilidad con sufrimiento emocional: Hacia un envejecimiento saludable” Type Journal
Year 2018 Publication Informaciones Psiquiatricas Abbreviated Journal
Volume 232 Issue Pages 47-59
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 0210-7279 ISBN Medium
Area Expedition Conference
Notes HUPBA; no menciona Approved no
Call Number Admin @ si @ FAA2018 Serial (up) 3214
Permanent link to this record
 

 
Author Gholamreza Anbarjafari; Sergio Escalera
Title Human-Robot Interaction: Theory and Application Type Book Whole
Year 2018 Publication Human-Robot Interaction: Theory and Application Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-1-78923-316-2 Medium
Area Expedition Conference
Notes HUPBA Approved no
Call Number Admin @ si @ AnE2018 Serial (up) 3216
Permanent link to this record
 

 
Author Suman Ghosh
Title Word Spotting and Recognition in Images from Heterogeneous Sources A Type Book Whole
Year 2018 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Text has been the most common way of sharing information for ages. With the recent development of personal image databases and handwritten historic manuscripts, the demand for algorithms that make these databases accessible for browsing and indexing is on the rise. Enabling search or understanding of large collections of manuscripts or image databases needs fast and robust methods. Researchers have found different ways to represent cropped words for understanding and matching, which work well when words are already segmented. However, there is no trivial way to extend these to non-segmented documents. In this thesis we explore different methods for text retrieval and recognition from unsegmented document and scene images. Two different ways of representation exist in the literature: one uses a fixed-length representation learned from cropped words and the other a sequence of features of variable length. Throughout this thesis, we have studied both of these representations for their suitability in segmentation-free understanding of text. In the first part we focus on segmentation-free word spotting using a fixed-length representation. We extended the use of the successful PHOC (Pyramidal Histogram of Character) representation to segmentation-free retrieval. In the second part of the thesis, we explore sequence-based features and finally, we propose a unified solution where the same framework can generate both kinds of representations.
Address November 2018
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Ernest Valveny
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-948531-0-4 Medium
Area Expedition Conference
Notes DAG; 600.121 Approved no
Call Number Admin @ si @ Gho2018 Serial (up) 3217
Permanent link to this record
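
For readers unfamiliar with the fixed-length PHOC representation mentioned in the abstract above, the sketch below builds a simplified PHOC vector for a word. It is an illustration under stated assumptions, not the thesis code: it uses unigrams only, the `levels` and `alphabet` defaults are hypothetical choices, and a character is assigned to a region when at least half of its span falls inside that region, which is the rule commonly described for PHOC.

```python
import string


def phoc(word, levels=(1, 2, 3), alphabet=string.ascii_lowercase):
    """Return a binary unigram PHOC vector for `word` as a list of 0/1 values.

    At pyramid level L the word is split into L equal regions. Character k of
    an n-character word spans [k/n, (k+1)/n]; it is marked in a region when
    at least 50% of that span overlaps the region.
    """
    word = word.lower()
    n = len(word)
    vector = []
    for level in levels:
        for region in range(level):
            bits = [0] * len(alphabet)
            for k, ch in enumerate(word):
                if ch not in alphabet:
                    continue
                # Integer arithmetic in units of 1/(n * level) avoids
                # floating-point trouble exactly at the 50% boundary.
                c_start, c_end = k * level, (k + 1) * level
                r_start, r_end = region * n, (region + 1) * n
                overlap = min(c_end, r_end) - max(c_start, r_start)
                if 2 * overlap >= level:
                    bits[alphabet.index(ch)] = 1
            vector.extend(bits)
    return vector


# With a toy two-letter alphabet the layout is easy to read:
# level 1 -> [a, b]; level 2 -> [a, b] for the left half, [a, b] for the right.
example = phoc("ab", levels=(1, 2), alphabet="ab")
```

Because the vector length depends only on the levels and the alphabet, words of any length map to vectors of the same size, which is what makes the representation convenient for retrieval.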
 

 
Author Aymen Azaza
Title Context, Motion and Semantic Information for Computational Saliency Type Book Whole
Year 2018 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract The main objective of this thesis is to highlight the salient object in an image or in a video sequence. We address three important, but in our opinion insufficiently investigated, aspects of saliency detection. Firstly, we start by extending previous research on saliency which explicitly models the information provided from the context. Then, we show the importance of explicit context modelling for saliency estimation. Several important works in saliency are based on the usage of object proposals. However, these methods focus on the saliency of the object proposal itself and ignore the context. To introduce context in such saliency approaches, we couple every object proposal with its direct context. This allows us to evaluate the importance of the immediate surround (context) for its saliency. We propose several saliency features which are computed from the context proposals including features based on omni-directional and horizontal context continuity. Secondly, we investigate the usage of top-down methods (high-level semantic information) for the task of saliency prediction since most computational methods are bottom-up or only include few semantic classes. We propose to consider a wider group of object classes. These objects represent important semantic information which we will exploit in our saliency prediction approach. Thirdly, we develop a method to detect video saliency by computing saliency from supervoxels and optical flow. In addition, we apply the context features developed in this thesis for video saliency detection. The method combines shape and motion features with our proposed context features. To summarize, we prove that extending object proposals with their direct context improves the task of saliency detection in both image and video data. Also the importance of the semantic information in saliency estimation is evaluated. Finally, we propose a new motion feature to detect saliency in video data. The three proposed novelties are evaluated on standard saliency benchmark datasets and are shown to improve with respect to state-of-the-art.
Address October 2018
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Joost Van de Weijer;Ali Douik
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-945373-9-4 Medium
Area Expedition Conference
Notes LAMP; 600.120 Approved no
Call Number Admin @ si @ Aza2018 Serial (up) 3218
Permanent link to this record
 

 
Author Albert Clapes
Title Learning to recognize human actions: from hand-crafted to deep-learning based visual representations Type Book Whole
Year 2019 Publication PhD Thesis, Universitat de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Action recognition is a very challenging and important problem in computer vision. Researchers working in this field aspire to provide computers with the ability to visually perceive human actions – that is, to observe, interpret, and understand human-related events that occur in the physical environment merely from visual data. The applications of this technology are numerous: human-machine interaction, e-health, monitoring/surveillance, and content-based video retrieval, among others. Hand-crafted methods dominated the field until the appearance of the first successful deep learning-based action recognition works. Although earlier deep-based methods underperformed with respect to hand-crafted approaches, these slowly but steadily improved to become state-of-the-art, eventually achieving better results than hand-crafted ones. Still, hand-crafted approaches can be advantageous in certain scenarios, especially when not enough data is available to train very large deep models, or simply to be combined with deep-based methods to further boost the performance. This shows how hand-crafted features can provide extra knowledge that deep networks are not able to learn easily about human actions.
This Thesis concurs in time with this change of paradigm and, hence, reflects it in two distinct parts. In the first part, we focus on improving current successful hand-crafted approaches for action recognition and we do so from three different perspectives, using the dense trajectories framework as a backbone. First, we explore the use of multi-modal and multi-view input data to enrich the trajectory descriptors. Second, we focus on the classification part of action recognition pipelines and propose an ensemble learning approach, where each classifier learns from a different set of local spatiotemporal features and their outputs are then combined following a strategy based on the Dempster-Shafer Theory. And third, we propose a novel hand-crafted feature extraction method that constructs a mid-level feature description to better model long-term spatiotemporal dynamics within action videos. Moving to the second part of the Thesis, we start with a comprehensive study of the current deep-learning based action recognition methods. We review both fundamental and cutting edge methodologies reported during the last few years and introduce a taxonomy of deep-learning methods dedicated to action recognition. In particular, we analyze and discuss how these handle the temporal dimension of data. Last but not least, we propose a residual recurrent network for action recognition that naturally integrates all our previous findings in a powerful and promising framework.
Address January 2019
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Sergio Escalera
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-948531-2-8 Medium
Area Expedition Conference
Notes HUPBA Approved no
Call Number Admin @ si @ Cla2019 Serial (up) 3219
Permanent link to this record
 

 
Author Dena Bazazian
Title Fully Convolutional Networks for Text Understanding in Scene Images Type Book Whole
Year 2018 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Text understanding in scene images has gained plenty of attention in the computer vision community and it is an important task in many applications as text carries semantically rich information about scene content and context. For instance, reading text in a scene can be applied to autonomous driving, scene understanding or assisting visually impaired people. The general aim of scene text understanding is to localize and recognize text in scene images. Text regions are first localized in the original image by a trained detector model and afterwards fed into a recognition module. The tasks of localization and recognition are highly correlated since an inaccurate localization can affect the recognition task.
The main purpose of this thesis is to devise efficient methods for scene text understanding. We investigate how the latest results on deep learning can advance text understanding pipelines. Recently, Fully Convolutional Networks (FCNs) and derived methods have achieved a significant performance on semantic segmentation and pixel level classification tasks. Therefore, we took benefit of the strengths of FCN approaches in order to detect text in natural scenes. In this thesis we have focused on two challenging tasks of scene text understanding which are Text Detection and Word Spotting. For the task of text detection, we have proposed an efficient text proposal technique in scene images. We have considered the Text Proposals method as the baseline which is an approach to reduce the search space of possible text regions in an image. In order to improve the Text Proposals method we combined it with Fully Convolutional Networks to efficiently reduce the number of proposals while maintaining the same level of accuracy and thus gaining a significant speed up. Our experiments demonstrate that this text proposal approach yields significantly higher recall rates than the line based text localization techniques, while also producing better-quality localization. We have also applied this technique on compressed images such as videos from wearable egocentric cameras. For the task of word spotting, we have introduced a novel mid-level word representation method. We have proposed a technique to create and exploit an intermediate representation of images based on text attributes which roughly correspond to character probability maps. Our representation extends the concept of Pyramidal Histogram Of Characters (PHOC) by exploiting Fully Convolutional Networks to derive a pixel-wise mapping of the character distribution within candidate word regions. We call this representation the Soft-PHOC. 
Furthermore, we show how to use Soft-PHOC descriptors for word spotting tasks through an efficient text line proposal algorithm. To evaluate the detected text, we propose a novel line based evaluation along with the classic bounding box based approach. We test our method on incidental scene text images, which comprise real-life scenarios such as urban scenes. The importance of incidental scene text images is due to the complexity of backgrounds, perspective, variety of script and language, short text and little linguistic context. All of these factors together make incidental scene text images challenging.
Address November 2018
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Dimosthenis Karatzas;Andrew Bagdanov
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-948531-1-1 Medium
Area Expedition Conference
Notes DAG; 600.121 Approved no
Call Number Admin @ si @ Baz2018 Serial (up) 3220
Permanent link to this record
 

 
Author Mikhail Mozerov; Joost Van de Weijer
Title One-view occlusion detection for stereo matching with a fully connected CRF model Type Journal Article
Year 2019 Publication IEEE Transactions on Image Processing Abbreviated Journal TIP
Volume 28 Issue 6 Pages 2936-2947
Keywords Stereo matching; energy minimization; fully connected MRF model; geodesic distance filter
Abstract In this paper, we extend the standard belief propagation (BP) sequential technique proposed in the tree-reweighted sequential method [15] to the fully connected CRF models with the geodesic distance affinity. The proposed method has been applied to the stereo matching problem. Also a new approach to the BP marginal solution is proposed that we call one-view occlusion detection (OVOD). In contrast to the standard winner takes all (WTA) estimation, the proposed OVOD solution allows us to find occluded regions in the disparity map and simultaneously improve the matching result. As a result we can perform only one energy minimization process and avoid the cost calculation for the second view and the left-right check procedure. We show that the OVOD approach considerably improves results for cost augmentation and energy minimization techniques in comparison with the standard one-view affinity space implementation. We apply our method to the Middlebury data set and reach state-of-the-art especially for median, average and mean squared error metrics.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP; 600.098; 600.109; 602.133; 600.120 Approved no
Call Number Admin @ si @ MoW2019 Serial (up) 3221
Permanent link to this record
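
The abstract above contrasts OVOD with the standard winner-takes-all (WTA) estimate. For reference, a minimal sketch of WTA disparity selection is given below. It only illustrates the baseline, not the OVOD method, and the cost-volume layout (one list of matching costs per pixel, indexed by candidate disparity) and the function name are assumptions made for the example.

```python
def winner_takes_all(cost_volume):
    """Pick, for every pixel, the disparity index with the lowest cost.

    cost_volume[y][x] is a list of matching costs, one per candidate
    disparity (an assumed layout). Ties resolve to the smallest disparity.
    """
    return [
        [min(range(len(costs)), key=costs.__getitem__) for costs in row]
        for row in cost_volume
    ]


# One image row with two pixels and three candidate disparities each.
cost_volume = [[[3.0, 1.0, 2.0], [0.5, 0.7, 0.9]]]
disparity_map = winner_takes_all(cost_volume)
```

WTA always returns some disparity for every pixel, including pixels that are occluded in the other view, which is the limitation the abstract says OVOD addresses.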
 

 
Author Ilke Demir; Dena Bazazian; Adriana Romero; Viktoriia Sharmanska; Lyne P. Tchapmi
Title WiCV 2018: The Fourth Women In Computer Vision Workshop Type Conference Article
Year 2018 Publication 4th Women in Computer Vision Workshop Abbreviated Journal
Volume Issue Pages 1941-19412
Keywords Conferences; Computer vision; Industries; Object recognition; Engineering profession; Collaboration; Machine learning
Abstract We present WiCV 2018 – Women in Computer Vision Workshop to increase the visibility and inclusion of women researchers in the computer vision field, organized in conjunction with CVPR 2018. Computer vision and machine learning have made incredible progress over the past years, yet the number of female researchers is still low both in academia and industry. WiCV is organized to raise visibility of female researchers, to increase the collaboration, and to provide mentorship and give opportunities to female-identifying junior researchers in the field. In its fourth year, we are proud to present the changes and improvements over the past years, summary of statistics for presenters and attendees, followed by expectations from future generations.
Address Salt Lake City; USA; June 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference WiCV
Notes DAG; 600.121; 600.129 Approved no
Call Number Admin @ si @ DBR2018 Serial (up) 3222
Permanent link to this record
 

 
Author Arnau Baro; Pau Riba; Alicia Fornes
Title A Starting Point for Handwritten Music Recognition Type Conference Article
Year 2018 Publication 1st International Workshop on Reading Music Systems Abbreviated Journal
Volume Issue Pages 5-6
Keywords Optical Music Recognition; Long Short-Term Memory; Convolutional Neural Networks; MUSCIMA++; CVCMUSCIMA
Abstract In recent years, interest in Optical Music Recognition (OMR) has reawakened, especially since the appearance of deep learning. However, very few works address handwritten scores. In this work we describe a full OMR pipeline for handwritten music scores using Convolutional and Recurrent Neural Networks that could serve as a baseline for the research community.
Address Paris; France; September 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference WORMS
Notes DAG; 600.097; 601.302; 601.330; 600.121 Approved no
Call Number Admin @ si @ BRF2018 Serial (up) 3223
Permanent link to this record
 

 
Author Laura Lopez-Fuentes; Alessandro Farasin; Harald Skinnemoen; Paolo Garza
Title Deep Learning models for passability detection of flooded roads Type Conference Article
Year 2018 Publication MediaEval 2018 Multimedia Benchmark Workshop Abbreviated Journal
Volume 2283 Issue Pages
Keywords
Abstract In this paper we study and compare several approaches to detect floods and evidence for passability of roads by conventional means in Twitter. We focus on tweets containing both visual information (a picture shared by the user) and metadata, a combination of text and related extra information intrinsic to the Twitter API. This work has been done in the context of the MediaEval 2018 Multimedia Satellite Task.
Address Sophia Antipolis; France; October 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference MediaEval
Notes LAMP; 600.084; 600.109; 600.120 Approved no
Call Number Admin @ si @ LFS2018 Serial (up) 3224
Permanent link to this record
 

 
Author Anjan Dutta; Hichem Sahbi
Title Stochastic Graphlet Embedding Type Journal Article
Year 2018 Publication IEEE Transactions on Neural Networks and Learning Systems Abbreviated Journal TNNLS
Volume Issue Pages 1-14
Keywords Stochastic graphlets; Graph embedding; Graph classification; Graph hashing; Betweenness centrality
Abstract Graph-based methods are known to be successful in many machine learning and pattern classification tasks. These methods consider semi-structured data as graphs where nodes correspond to primitives (parts, interest points, segments, etc.) and edges characterize the relationships between these primitives. However, these non-vectorial graph data cannot be straightforwardly plugged into off-the-shelf machine learning algorithms without a preliminary step of – explicit/implicit – graph vectorization and embedding. This embedding process should be resilient to intra-class graph variations while being highly discriminant. In this paper, we propose a novel high-order stochastic graphlet embedding (SGE) that maps graphs into vector spaces. Our main contribution includes a new stochastic search procedure that efficiently parses a given graph and extracts/samples unlimitedly high-order graphlets. We consider these graphlets, with increasing orders, to model local primitives as well as their increasingly complex interactions. In order to build our graph representation, we measure the distribution of these graphlets into a given graph, using particular hash functions that efficiently assign sampled graphlets into isomorphic sets with a very low probability of collision. When combined with maximum margin classifiers, these graphlet-based representations have positive impact on the performance of pattern comparison and recognition as corroborated through extensive experiments using standard benchmark databases.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 602.167; 602.168; 600.097; 600.121 Approved no
Call Number Admin @ si @ DuS2018 Serial (up) 3225
Permanent link to this record
 

 
Author Xim Cerda-Company; Xavier Otazu
Title Color induction in equiluminant flashed stimuli Type Journal Article
Year 2019 Publication Journal of the Optical Society of America A Abbreviated Journal JOSA A
Volume 36 Issue 1 Pages 22-31
Keywords
Abstract Color induction is the influence of the surrounding color (inducer) on the perceived color of a central region. There are two different types of color induction: color contrast (the color of the central region shifts away from that of the inducer) and color assimilation (the color shifts towards the color of the inducer). Several studies on these effects have used uniform and striped surrounds, reporting color contrast and color assimilation, respectively. Other authors [J. Vis. 12(1), 22 (2012)] have studied color induction using flashed uniform surrounds, reporting that the contrast is higher for shorter flash duration. Extending their study, we present new psychophysical results using both flashed and static (i.e., non-flashed) equiluminant stimuli for both striped and uniform surrounds. Similarly to them, for uniform surround stimuli we observed color contrast, but we did not obtain the maximum contrast for the shortest (10 ms) flashed stimuli, but for 40 ms. We only observed this maximum contrast for red, green, and lime inducers, while for a purple inducer we obtained an asymptotic profile along the flash duration. For striped stimuli, we observed color assimilation only for the static (infinite flash duration) red–green surround inducers (red first inducer, green second inducer). For the other inducers’ configurations, we observed color contrast or no induction. Since other studies showed that non-equiluminant striped static stimuli induce color assimilation, our results also suggest that luminance differences could be a key factor to induce it.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes NEUROBIT; 600.120; 600.128 Approved no
Call Number Admin @ si @ CeO2019 Serial (up) 3226
Permanent link to this record
 

 
Author Arnau Baro; Pau Riba; Jorge Calvo-Zaragoza; Alicia Fornes
Title Optical Music Recognition by Long Short-Term Memory Networks Type Book Chapter
Year 2018 Publication Graphics Recognition. Current Trends and Evolutions Abbreviated Journal
Volume 11009 Issue Pages 81-95
Keywords Optical Music Recognition; Recurrent Neural Network; Long Short-Term Memory
Abstract Optical Music Recognition refers to the task of transcribing the image of a music score into a machine-readable format. Many music scores are written on a single staff and can therefore be treated as a sequence. Accordingly, this work explores the use of Long Short-Term Memory (LSTM) Recurrent Neural Networks for reading the music score sequentially, where the LSTM helps in keeping the context. For training, we have used a synthetic dataset of more than 40000 images, labeled at primitive level. The experimental results are promising, showing the benefits of our approach.
Address
Corporate Author Thesis
Publisher Springer Place of Publication Editor A. Fornes, B. Lamiroy
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN 978-3-030-02283-9 Medium
Area Expedition Conference GREC
Notes DAG; 600.097; 601.302; 601.330; 600.121 Approved no
Call Number Admin @ si @ BRC2018 Serial (up) 3227
Permanent link to this record