Record
Author Marc Bolaños; Alvaro Peris; Francisco Casacuberta; Sergi Solera; Petia Radeva
Title Egocentric video description based on temporally-linked sequences
Type Journal Article
Year 2018
Publication Journal of Visual Communication and Image Representation
Abbreviated Journal JVCIR
Volume 50
Pages 205-216
Keywords egocentric vision; video description; deep learning; multi-modal learning
Abstract Egocentric vision consists of acquiring images throughout the day from a first-person point of view using wearable cameras. The automatic analysis of this information makes it possible to discover daily patterns for improving the quality of life of the user. A natural topic that arises in egocentric vision is storytelling, that is, how to understand and tell the story lying behind the pictures.
In this paper, we tackle storytelling as an egocentric sequence description problem. We propose a novel methodology that exploits information from temporally neighboring events, precisely matching the nature of egocentric sequences. Furthermore, we present a new method for multimodal data fusion consisting of a multi-input attention recurrent network. We also release the EDUB-SegDesc dataset. This is the first dataset for egocentric image sequence description, consisting of 1,339 events with 3,991 descriptions, from 55 days acquired by 11 people. Finally, we show that our proposal outperforms classical attentional encoder-decoder methods for video description.
Notes MILAB; no proj
Approved no
Call Number Admin @ si @ BPC2018
Serial 3109