Publicacions CVC -- Query Results

Record
Author	Alejandro Cartas; Jordi Luque; Petia Radeva; Carlos Segura; Mariella Dimiccoli
Title	Seeing and Hearing Egocentric Actions: How Much Can We Learn?			Type	Conference Article
Year	2019	Publication	IEEE International Conference on Computer Vision Workshops	Abbreviated Journal
Volume		Issue		Pages	4470-4480
Keywords
Abstract	Our interaction with the world is an inherently multimodal experience. However, the understanding of human-to-object interactions has historically been addressed focusing on a single modality. In particular, a limited number of works have considered integrating the visual and audio modalities for this purpose. In this work, we propose a multimodal approach for egocentric action recognition in a kitchen environment that relies on audio and visual information. Our model combines a sparse temporal sampling strategy with a late fusion of audio, spatial, and temporal streams. Experimental results on the EPIC-Kitchens dataset show that multimodal integration leads to better performance than unimodal approaches. In particular, we achieved a 5.18% improvement over the state of the art on verb classification.
Address	Seoul; Korea; October 2019
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICCVW
Notes	MILAB; no proj			Approved	no
Call Number	Admin @ si @ CLR2019b			Serial	3385