|
Author |
Swathikiran Sudhakaran; Sergio Escalera; Oswald Lanz |
|
|
Title |
Learning to Recognize Actions on Objects in Egocentric Video with Attention Dictionaries |
Type |
Journal Article |
|
Year |
2021 |
Publication |
IEEE Transactions on Pattern Analysis and Machine Intelligence |
Abbreviated Journal |
TPAMI |
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
We present EgoACO, a deep neural architecture for video action recognition that learns to pool action-context-object descriptors from frame level features by leveraging the verb-noun structure of action labels in egocentric video datasets. The core component of EgoACO is class activation pooling (CAP), a differentiable pooling operation that combines ideas from bilinear pooling for fine-grained recognition and from feature learning for discriminative localization. CAP uses self-attention with a dictionary of learnable weights to pool from the most relevant feature regions. Through CAP, EgoACO learns to decode object and scene context descriptors from video frame features. For temporal modeling in EgoACO, we design a recurrent version of class activation pooling termed Long Short-Term Attention (LSTA). LSTA extends convolutional gated LSTM with built-in spatial attention and a re-designed output gate. Action, object and context descriptors are fused by a multi-head prediction that accounts for the inter-dependencies between noun-verb-action structured labels in egocentric video datasets. EgoACO features built-in visual explanations, helping learning and interpretation. Results on the two largest egocentric action recognition datasets currently available, EPIC-KITCHENS and EGTEA, show that by explicitly decoding action-context-object descriptors, EgoACO achieves state-of-the-art recognition performance. |
|
|
Notes |
HUPBA; no proj; MILAB |
Approved |
no |
|
|
Call Number |
Admin @ si @ SEL2021 |
Serial |
3656 |
|
|
|
|
|
Author |
Francesco Ciompi; Oriol Pujol; Petia Radeva |
|
|
Title |
ECOC-DRF: Discriminative random fields based on error correcting output codes |
Type |
Journal Article |
|
Year |
2014 |
Publication |
Pattern Recognition |
Abbreviated Journal |
PR |
|
|
Volume |
47 |
Issue |
6 |
Pages |
2193-2204 |
|
|
Keywords |
Discriminative random fields; Error-correcting output codes; Multi-class classification; Graphical models |
|
|
Abstract |
We present ECOC-DRF, a framework where potential functions for Discriminative Random Fields are formulated as an ensemble of classifiers. We introduce the label trick, a technique to express transitions in the pairwise potential as meta-classes. This allows any possible transition between labels to be learned independently, without assuming any pre-defined model. The Error Correcting Output Codes matrix is used as the ensemble framework for combining margin classifiers. We apply ECOC-DRF to a large set of classification problems, covering synthetic, natural and medical images for binary and multi-class cases, outperforming the state of the art in almost all the experiments. |
|
|
Notes |
LAMP; HuPBA; MILAB; 605.203; 600.046; 601.043; 600.079 |
Approved |
no |
|
|
Call Number |
Admin @ si @ CPR2014b |
Serial |
2470 |
|
|
|
|
|
Author |
Miguel Angel Bautista; Antonio Hernandez; Sergio Escalera; Laura Igual; Oriol Pujol; Josep Moya; Veronica Violant; Maria Teresa Anguera |
|
|
Title |
A Gesture Recognition System for Detecting Behavioral Patterns of ADHD |
Type |
Journal Article |
|
Year |
2016 |
Publication |
IEEE Transactions on Systems, Man, and Cybernetics, Part B |
Abbreviated Journal |
TSMCB |
|
|
Volume |
46 |
Issue |
1 |
Pages |
136-147 |
|
|
Keywords |
Gesture Recognition; ADHD; Gaussian Mixture Models; Convex Hulls; Dynamic Time Warping; Multi-modal RGB-Depth data |
|
|
Abstract |
We present an application of gesture recognition using an extension of Dynamic Time Warping (DTW) to recognize behavioural patterns of Attention Deficit Hyperactivity Disorder (ADHD). We propose an extension of DTW using one-class classifiers in order to encode the variability of a gesture category and thus perform an alignment between a gesture sample and a gesture class. We model the set of gesture samples of a given gesture category using either GMMs or an approximation of Convex Hulls. Thus, we add a theoretical contribution to the classical warping path in DTW by including local modeling of intra-class gesture variability. This methodology is applied in a clinical context, detecting a group of ADHD behavioural patterns defined by experts in psychology/psychiatry, to support clinicians in the diagnostic procedure. The proposed methodology is tested on a novel multi-modal dataset (RGB plus Depth) of recordings of children with ADHD showing these behavioural patterns. We obtain satisfactory results when compared to standard state-of-the-art approaches in the DTW context. |
|
|
Notes |
HuPBA; MILAB |
Approved |
no |
|
|
Call Number |
Admin @ si @ BHE2016 |
Serial |
2566 |
|
|
|
|
|
Author |
Albert Clapes; Alex Pardo; Oriol Pujol; Sergio Escalera |
|
|
Title |
Action detection fusing multiple Kinects and a WIMU: an application to in-home assistive technology for the elderly |
Type |
Journal Article |
|
Year |
2018 |
Publication |
Machine Vision and Applications |
Abbreviated Journal |
MVAP |
|
|
Volume |
29 |
Issue |
5 |
Pages |
765–788 |
|
|
Keywords |
Multimodal activity detection; Computer vision; Inertial sensors; Dense trajectories; Dynamic time warping; Assistive technology |
|
|
Abstract |
We present a vision-inertial system which combines two RGB-Depth devices with a wearable inertial movement unit in order to detect activities of daily living. From multi-view videos, we extract dense trajectories enriched with a histogram-of-normals description computed from the depth cue and bag them into multi-view codebooks. During the later classification step, a multi-class support vector machine with an RBF-χ² kernel combines the descriptions at kernel level. To perform action detection from the videos, a sliding window approach is used. In parallel, we extract accelerations, rotation angles, and jerk features from the inertial data collected by the wearable placed on the user’s dominant wrist. During gesture spotting, dynamic time warping is applied and the alignment costs to a set of pre-selected gesture sub-classes are thresholded to determine possible detections. The outputs of the two modules are combined in a late-fusion fashion. The system is validated in a real-case scenario with elderly people from an elder home. Learning-based fusion results improve on those of the single modalities, demonstrating the success of such a multimodal approach. |
|
|
Notes |
HUPBA; no proj; MILAB |
Approved |
no |
|
|
Call Number |
Admin @ si @ CPP2018 |
Serial |
3125 |
|
|
|
|
|
Author |
Mark Philip Philipsen; Jacob Velling Dueholm; Anders Jorgensen; Sergio Escalera; Thomas B. Moeslund |
|
|
Title |
Organ Segmentation in Poultry Viscera Using RGB-D |
Type |
Journal Article |
|
Year |
2018 |
Publication |
Sensors |
Abbreviated Journal |
SENS |
|
|
Volume |
18 |
Issue |
1 |
Pages |
117 |
|
|
Keywords |
semantic segmentation; RGB-D; random forest; conditional random field; 2D; 3D; CNN |
|
|
Abstract |
We present a pattern recognition framework for semantic segmentation of visual structures, that is, multi-class labelling at pixel level, and apply it to the task of segmenting organs in the eviscerated viscera from slaughtered poultry in RGB-D images. This is a step towards replacing the current strenuous manual inspection at poultry processing plants. Features are extracted from feature maps such as activation maps from a convolutional neural network (CNN). A random forest classifier assigns class probabilities, which are further refined by utilizing context in a conditional random field. The presented method is compatible with both 2D and 3D features, which allows us to explore the value of adding 3D and CNN-derived features. The dataset consists of 604 RGB-D images showing 151 unique sets of eviscerated viscera from four different perspectives. A mean Jaccard index of 78.11% is achieved across the four classes of organs by using features derived from 2D, 3D and a CNN, compared to 74.28% using only basic 2D image features. |
|
|
Notes |
HUPBA; no proj; MILAB |
Approved |
no |
|
|
Call Number |
Admin @ si @ PVJ2018 |
Serial |
3072 |
|