Records | |||||
---|---|---|---|---|---|
Author | Cesar de Souza; Adrien Gaidon; Eleonora Vig; Antonio Lopez | ||||
Title | System and method for video classification using a hybrid unsupervised and supervised multi-layer architecture | Type | Patent | ||
Year | 2018 | Publication | US9946933B2 | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | US9946933B2 | ||||
Abstract | A computer-implemented video classification method and system are disclosed. The method includes receiving an input video including a sequence of frames. At least one transformation of the input video is generated, each transformation including a sequence of frames. For the input video and each transformation, local descriptors are extracted from the respective sequence of frames. The local descriptors of the input video and each transformation are aggregated to form an aggregated feature vector with a first set of processing layers learned using unsupervised learning. An output classification value is generated for the input video, based on the aggregated feature vector with a second set of processing layers learned using supervised learning. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ADAS; 600.118 | Approved | no | ||
Call Number | Admin @ si @ SGV2018 | Serial | 3255 | ||
Author | Cesar de Souza; Adrien Gaidon; Eleonora Vig; Antonio Lopez | ||||
Title | Sympathy for the Details: Dense Trajectories and Hybrid Classification Architectures for Action Recognition | Type | Conference Article | ||
Year | 2016 | Publication | 14th European Conference on Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | 697-716 | ||
Keywords | |||||
Abstract | Action recognition in videos is a challenging task due to the complexity of the spatio-temporal patterns to model and the difficulty to acquire and learn on large quantities of video data. Deep learning, although a breakthrough for image classification and showing promise for videos, has still not clearly superseded action recognition methods using hand-crafted features, even when training on massive datasets. In this paper, we introduce hybrid video classification architectures based on carefully designed unsupervised representations of hand-crafted spatio-temporal features classified by supervised deep networks. As we show in our experiments on five popular benchmarks for action recognition, our hybrid model combines the best of both worlds: it is data efficient (trained on 150 to 10000 short clips) and yet improves significantly on the state of the art, including recent deep models trained on millions of manually labelled images and videos. | ||||
Address | Amsterdam; The Netherlands; October 2016 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ECCV | ||
Notes | ADAS; 600.076; 600.085 | Approved | no | ||
Call Number | Admin @ si @ SGV2016 | Serial | 2824 | ||