Publicacions CVC -- Query Results

[161–170] << 171 172 173 174 175 176 177 178 179 180 >> [181–190]

Details

Records
Author	Giacomo Magnifico; Beata Megyesi; Mohamed Ali Souibgui; Jialuo Chen; Alicia Fornes
Title	Lost in Transcription of Graphic Signs in Ciphers			Type	Conference Article
Year	2022	Publication	International Conference on Historical Cryptology (HistoCrypt 2022)	Abbreviated Journal
Volume		Issue		Pages	153-158
Keywords	transcription of ciphers; hand-written text recognition of symbols; graphic signs
Abstract	Hand-written Text Recognition techniques with the aim to automatically identify and transcribe hand-written text have been applied to historical sources including ciphers. In this paper, we compare the performance of two machine learning architectures, an unsupervised method based on clustering and a deep learning method with few-shot learning. Both models are tested on seen and unseen data from historical ciphers with different symbol sets consisting of various types of graphic signs. We compare the models and highlight their differences in performance, with their advantages and shortcomings.
Address	Amsterdam, Netherlands, June 20-22, 2022
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	HystoCrypt
Notes	DAG; 600.121; 600.162; 602.230; 600.140			Approved	no
Call Number	Admin @ si @ MBS2022			Serial	3731
Permanent link to this record



Author	Pau Riba; Lutz Goldmann; Oriol Ramos Terrades; Diede Rusticus; Alicia Fornes; Josep Llados
Title	Table detection in business document images by message passing networks			Type	Journal Article
Year	2022	Publication	Pattern Recognition	Abbreviated Journal	PR
Volume	127	Issue		Pages	108641
Keywords
Abstract	Tabular structures in business documents offer a complementary dimension to the raw textual data. For instance, there is information about the relationships among pieces of information. Nowadays, digital mailroom applications have become a key service for workflow automation. Therefore, the detection and interpretation of tables is crucial. With the recent advances in information extraction, table detection and recognition has gained interest in document image analysis, in particular, with the absence of rule lines and unknown information about rows and columns. However, business documents usually contain sensitive contents limiting the amount of public benchmarking datasets. In this paper, we propose a graph-based approach for detecting tables in document images which do not require the raw content of the document. Hence, the sensitive content can be previously removed and, instead of using the raw image or textual content, we propose a purely structural approach to keep sensitive data anonymous. Our framework uses graph neural networks (GNNs) to describe the local repetitive structures that constitute a table. In particular, our main application domain are business documents. We have carefully validated our approach in two invoice datasets and a modern document benchmark. Our experiments demonstrate that tables can be detected by purely structural approaches.
Address	July 2022
Corporate Author				Thesis
Publisher	Elsevier	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.162; 600.121			Approved	no
Call Number	Admin @ si @ RGR2022			Serial	3729
Permanent link to this record



Author	Meysam Madadi; Sergio Escalera; Alex Carruesco Llorens; Carlos Andujar; Xavier Baro; Jordi Gonzalez
Title	Top-down model fitting for hand pose recovery in sequences of depth images			Type	Journal Article
Year	2018	Publication	Image and Vision Computing	Abbreviated Journal	IMAVIS
Volume	79	Issue		Pages	63-75
Keywords
Abstract	State-of-the-art approaches on hand pose estimation from depth images have reported promising results under quite controlled considerations. In this paper we propose a two-step pipeline for recovering the hand pose from a sequence of depth images. The pipeline has been designed to deal with images taken from any viewpoint and exhibiting a high degree of finger occlusion. In a first step we initialize the hand pose using a part-based model, fitting a set of hand components in the depth images. In a second step we consider temporal data and estimate the parameters of a trained bilinear model consisting of shape and trajectory bases. We evaluate our approach on a new created synthetic hand dataset along with NYU and MSRA real datasets. Results demonstrate that the proposed method outperforms the most recent pose recovering approaches, including those based on CNNs.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	HUPBA; 600.098			Approved	no
Call Number	Admin @ si @ MEC2018			Serial	3203
Permanent link to this record



Author	Marc Oliu; Javier Selva; Sergio Escalera
Title	Folded Recurrent Neural Networks for Future Video Prediction			Type	Conference Article
Year	2018	Publication	15th European Conference on Computer Vision	Abbreviated Journal
Volume	11218	Issue		Pages	745-761
Keywords
Abstract	Future video prediction is an ill-posed Computer Vision problem that recently received much attention. Its main challenges are the high variability in video content, the propagation of errors through time, and the non-specificity of the future frames: given a sequence of past frames there is a continuous distribution of possible futures. This work introduces bijective Gated Recurrent Units, a double mapping between the input and output of a GRU layer. This allows for recurrent auto-encoders with state sharing between encoder and decoder, stratifying the sequence representation and helping to prevent capacity problems. We show how with this topology only the encoder or decoder needs to be applied for input encoding and prediction, respectively. This reduces the computational cost and avoids re-encoding the predictions when generating a sequence of frames, mitigating the propagation of errors. Furthermore, it is possible to remove layers from an already trained model, giving an insight to the role performed by each layer and making the model more explainable. We evaluate our approach on three video datasets, outperforming state of the art prediction results on MMNIST and UCF101, and obtaining competitive results on KTH with 2 and 3 times less memory usage and computational cost than the best scored approach.
Address	Munich; September 2018
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ECCV
Notes	HUPBA; no menciona			Approved	no
Call Number	Admin @ si @ OSE2018			Serial	3204
Permanent link to this record



Author	Ciprian Corneanu; Meysam Madadi; Sergio Escalera
Title	Deep Structure Inference Network for Facial Action Unit Recognition			Type	Conference Article
Year	2018	Publication	15th European Conference on Computer Vision	Abbreviated Journal
Volume	11216	Issue		Pages	309-324
Keywords	Computer Vision; Machine Learning; Deep Learning; Facial Expression Analysis; Facial Action Units; Structure Inference
Abstract	Facial expressions are combinations of basic components called Action Units (AU). Recognizing AUs is key for general facial expression analysis. Recently, efforts in automatic AU recognition have been dedicated to learning combinations of local features and to exploiting correlations between AUs. We propose a deep neural architecture that tackles both problems by combining learned local and global features in its initial stages and replicating a message passing algorithm between classes similar to a graphical model inference approach in later stages. We show that by training the model end-to-end with increased supervision we improve state-of-the-art by 5.3% and 8.2% performance on BP4D and DISFA datasets, respectively.
Address	Munich; September 2018
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ECCV
Notes	HUPBA; no proj			Approved	no
Call Number	Admin @ si @ CME2018			Serial	3205
Permanent link to this record



Author	Mohamed Ilyes Lakhal; Albert Clapes; Sergio Escalera; Oswald Lanz; Andrea Cavallaro
Title	Residual Stacked RNNs for Action Recognition			Type	Conference Article
Year	2018	Publication	9th International Workshop on Human Behavior Understanding	Abbreviated Journal
Volume		Issue		Pages	534-548
Keywords	Action recognition; Deep residual learning; Two-stream RNN
Abstract	Action recognition pipelines that use Recurrent Neural Networks (RNN) are currently 5–10% less accurate than Convolutional Neural Networks (CNN). While most works that use RNNs employ a 2D CNN on each frame to extract descriptors for action recognition, we extract spatiotemporal features from a 3D CNN and then learn the temporal relationship of these descriptors through a stacked residual recurrent neural network (Res-RNN). We introduce for the first time residual learning to counter the degradation problem in multi-layer RNNs, which have been successful for temporal aggregation in two-stream action recognition pipelines. Finally, we use a late fusion strategy to combine RGB and optical flow data of the two-stream Res-RNN. Experimental results show that the proposed pipeline achieves competitive results on UCF-101 and state of-the-art results for RNN-like architectures on the challenging HMDB-51 dataset.
Address	Munich; September 2018
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ECCVW
Notes	HUPBA; no proj			Approved	no
Call Number	Admin @ si @ LCE2018b			Serial	3206
Permanent link to this record



Author	Giuseppe Pezzano; Oliver Diaz; Vicent Ribas Ripoll; Petia Radeva
Title	CoLe-CNN+: Context learning – Convolutional neural network for COVID-19-Ground-Glass-Opacities detection and segmentation			Type	Journal Article
Year	2021	Publication	Computers in Biology and Medicine	Abbreviated Journal	CBM
Volume	136	Issue		Pages	104689
Keywords
Abstract	The most common tool for population-wide COVID-19 identification is the Reverse Transcription-Polymerase Chain Reaction test that detects the presence of the virus in the throat (or sputum) in swab samples. This test has a sensitivity between 59% and 71%. However, this test does not provide precise information regarding the extension of the pulmonary infection. Moreover, it has been proven that through the reading of a computed tomography (CT) scan, a clinician can provide a more complete perspective of the severity of the disease. Therefore, we propose a comprehensive system for fully-automated COVID-19 detection and lesion segmentation from CT scans, powered by deep learning strategies to support decision-making process for the diagnosis of COVID-19.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	MILAB; no menciona			Approved	no
Call Number	Admin @ si @ PDR2021			Serial	3635
Permanent link to this record



Author	Cristina Palmero; Javier Selva; Mohammad Ali Bagueri; Sergio Escalera
Title	Recurrent CNN for 3D Gaze Estimation using Appearance and Shape Cues			Type	Conference Article
Year	2018	Publication	29th British Machine Vision Conference	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Gaze behavior is an important non-verbal cue in social signal processing and humancomputer interaction. In this paper, we tackle the problem of person- and head poseindependent 3D gaze estimation from remote cameras, using a multi-modal recurrent convolutional neural network (CNN). We propose to combine face, eyes region, and face landmarks as individual streams in a CNN to estimate gaze in still images. Then, we exploit the dynamic nature of gaze by feeding the learned features of all the frames in a sequence to a many-to-one recurrent module that predicts the 3D gaze vector of the last frame. Our multi-modal static solution is evaluated on a wide range of head poses and gaze directions, achieving a significant improvement of 14.6% over the state of the art on EYEDIAP dataset, further improved by 4% when the temporal modality is included.
Address	Newcastle; UK; September 2018
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	BMVC
Notes	HUPBA; no proj			Approved	no
Call Number	Admin @ si @ PSB2018			Serial	3208
Permanent link to this record



Author	Yagmur Gucluturk; Umut Guclu; Xavier Baro; Hugo Jair Escalante; Isabelle Guyon; Sergio Escalera; Marcel A. J. van Gerven; Rob van Lier
Title	Multimodal First Impression Analysis with Deep Residual Networks			Type	Journal Article
Year	2018	Publication	IEEE Transactions on Affective Computing	Abbreviated Journal	TAC
Volume	8	Issue	3	Pages	316-329
Keywords
Abstract	People form first impressions about the personalities of unfamiliar individuals even after very brief interactions with them. In this study we present and evaluate several models that mimic this automatic social behavior. Specifically, we present several models trained on a large dataset of short YouTube video blog posts for predicting apparent Big Five personality traits of people and whether they seem suitable to be recommended to a job interview. Along with presenting our audiovisual approach and results that won the third place in the ChaLearn First Impressions Challenge, we investigate modeling in different modalities including audio only, visual only, language only, audiovisual, and combination of audiovisual and language. Our results demonstrate that the best performance could be obtained using a fusion of all data modalities. Finally, in order to promote explainability in machine learning and to provide an example for the upcoming ChaLearn challenges, we present a simple approach for explaining the predictions for job interview recommendations
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	HUPBA; no proj			Approved	no
Call Number	Admin @ si @ GGB2018			Serial	3210
Permanent link to this record



Author	Gabriela Ramirez; Esau Villatoro; Bogdan Ionescu; Hugo Jair Escalante; Sergio Escalera; Martha Larson; Henning Muller; Isabelle Guyon
Title	Overview of the Multimedia Information Processing for Personality & Social Networks Analysis Contes			Type	Conference Article
Year	2018	Publication	Multimedia Information Processing for Personality and Social Networks Analysis (MIPPSNA 2018)	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract
Address	Beijing; China; August 2018
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICPRW
Notes	HUPBA			Approved	no
Call Number	Admin @ si @ RVI2018			Serial	3211
Permanent link to this record



Author	Rain Eric Haamer; Eka Rusadze; Iiris Lusi; Tauseef Ahmed; Sergio Escalera; Gholamreza Anbarjafari
Title	Review on Emotion Recognition Databases			Type	Book Chapter
Year	2018	Publication	Human-Robot Interaction: Theory and Application	Abbreviated Journal
Volume		Issue		Pages
Keywords	emotion; computer vision; databases
Abstract	Over the past few decades human-computer interaction has become more important in our daily lives and research has developed in many directions: memory research, depression detection, and behavioural deficiency detection, lie detection, (hidden) emotion recognition etc. Because of that, the number of generic emotion and face databases or those tailored to specific needs have grown immensely large. Thus, a comprehensive yet compact guide is needed to help researchers find the most suitable database and understand what types of databases already exist. In this paper, different elicitation methods are discussed and the databases are primarily organized into neat and informative tables based on the format.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN	978-1-78923-316-2	Medium
Area		Expedition		Conference
Notes	HUPBA; 602.133			Approved	no
Call Number	Admin @ si @ HRL2018			Serial	3212
Permanent link to this record



Author	Reza Azad; Maryam Asadi-Aghbolaghi; Shohreh Kasaei; Sergio Escalera
Title	Dynamic 3D Hand Gesture Recognition by Learning Weighted Depth Motion Maps			Type	Journal Article
Year	2019	Publication	IEEE Transactions on Circuits and Systems for Video Technology	Abbreviated Journal	TCSVT
Volume	29	Issue	6	Pages	1729-1740
Keywords	Hand gesture recognition; Multilevel temporal sampling; Weighted depth motion map; Spatio-temporal description; VLAD encoding
Abstract	Hand gesture recognition from sequences of depth maps is a challenging computer vision task because of the low inter-class and high intra-class variability, different execution rates of each gesture, and the high articulated nature of human hand. In this paper, a multilevel temporal sampling (MTS) method is first proposed that is based on the motion energy of key-frames of depth sequences. As a result, long, middle, and short sequences are generated that contain the relevant gesture information. The MTS results in increasing the intra-class similarity while raising the inter-class dissimilarities. The weighted depth motion map (WDMM) is then proposed to extract the spatio-temporal information from generated summarized sequences by an accumulated weighted absolute difference of consecutive frames. The histogram of gradient (HOG) and local binary pattern (LBP) are exploited to extract features from WDMM. The obtained results define the current state-of-the-art on three public benchmark datasets of: MSR Gesture 3D, SKIG, and MSR Action 3D, for 3D hand gesture recognition. We also achieve competitive results on NTU action dataset.
Address	June 2019,
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	HUPBA; no proj			Approved	no
Call Number	Admin @ si @ AAK2018			Serial	3213
Permanent link to this record



Author	Ester Fornells; Manuel De Armas; Maria Teresa Anguera; Sergio Escalera; Marcos Antonio Catalán; Josep Moya
Title	Desarrollo del proyecto del Consell Comarcal del Baix Llobregat “Buen Trato a las personas mayores y aquellas en situación de fragilidad con sufrimiento emocional: Hacia un envejecimiento saludable”			Type	Journal
Year	2018	Publication	Informaciones Psiquiatricas	Abbreviated Journal
Volume	232	Issue		Pages	47-59
Keywords
Abstract
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	0210-7279	ISBN		Medium
Area		Expedition		Conference
Notes	HUPBA; no menciona			Approved	no
Call Number	Admin @ si @ FAA2018			Serial	3214
Permanent link to this record



Author	Suman Ghosh
Title	Word Spotting and Recognition in Images from Heterogeneous Sources A			Type	Book Whole
Year	2018	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Text is the most common way of information sharing from ages. With recent development of personal images databases and handwritten historic manuscripts the demand for algorithms to make these databases accessible for browsing and indexing are in rise. Enabling search or understanding large collection of manuscripts or image databases needs fast and robust methods. Researchers have found different ways to represent cropped words for understanding and matching, which works well when words are already segmented. However there is no trivial way to extend these for non-segmented documents. In this thesis we explore different methods for text retrieval and recognition from unsegmented document and scene images. Two different ways of representation exist in literature, one uses a fixed length representation learned from cropped words and another a sequence of features of variable length. Throughout this thesis, we have studied both these representation for their suitability in segmentation free understanding of text. In the first part we are focused on segmentation free word spotting using a fixed length representation. We extended the use of the successful PHOC (Pyramidal Histogram of Character) representation to segmentation free retrieval. In the second part of the thesis, we explore sequence based features and finally, we propose a unified solution where the same framework can generate both kind of representations.
Address	November 2018
Corporate Author				Thesis	Ph.D. thesis
Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Ernest Valveny
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN	978-84-948531-0-4	Medium
Area		Expedition		Conference
Notes	DAG; 600.121			Approved	no
Call Number	Admin @ si @ Gho2018			Serial	3217
Permanent link to this record



Author	Gholamreza Anbarjafari; Sergio Escalera
Title	Human-Robot Interaction: Theory and Application			Type	Book Whole
Year	2018	Publication	Human-Robot Interaction: Theory and Application	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN	978-1-78923-316-2	Medium
Area		Expedition		Conference
Notes	HUPBA			Approved	no
Call Number	Admin @ si @ AnE2018			Serial	3216
Permanent link to this record