toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
  Records Links
Author Javier Selva; Anders S. Johansen; Sergio Escalera; Kamal Nasrollahi; Thomas B. Moeslund; Albert Clapes edit  doi
openurl 
  Title Video transformers: A survey Type Journal Article
  Year 2023 Publication (up) IEEE Transactions on Pattern Analysis and Machine Intelligence Abbreviated Journal TPAMI  
  Volume 45 Issue 11 Pages 12922-12943  
  Keywords Artificial Intelligence; Computer Vision; Self-Attention; Transformers; Video Representations  
  Abstract Transformer models have shown great success handling long-range interactions, making them a promising tool for modeling video. However, they lack inductive biases and scale quadratically with input length. These limitations are further exacerbated when dealing with the high dimensionality introduced by the temporal dimension. While there are surveys analyzing the advances of Transformers for vision, none focus on an in-depth analysis of video-specific designs. In this survey, we analyze the main contributions and trends of works leveraging Transformers to model video. Specifically, we delve into how videos are handled at the input level first. Then, we study the architectural changes made to deal with video more efficiently, reduce redundancy, re-introduce useful inductive biases, and capture long-term temporal dynamics. In addition, we provide an overview of different training regimes and explore effective self-supervised learning strategies for video. Finally, we conduct a performance comparison on the most common benchmark for Video Transformers (i.e., action classification), finding them to outperform 3D ConvNets even with less computational complexity.  
  Address 1 Nov. 2023  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HUPBA; no menciona Approved no  
  Call Number Admin @ si @ SJE2023 Serial 3823  
Permanent link to this record
 

 
Author Oriol Pujol; Petia Radeva; Jordi Vitria edit  openurl
  Title Discriminant ECOC: A Heuristic Method for Application Dependent Design of Error Correcting Output Codes Type Journal
  Year 2006 Publication (up) IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(6): 1007–1012 Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes OR;MILAB;HuPBA;MV Approved no  
  Call Number BCNPCL @ bcnpcl @ PRV2006a Serial 646  
Permanent link to this record
 

 
Author Miguel Angel Bautista; Antonio Hernandez; Sergio Escalera; Laura Igual; Oriol Pujol; Josep Moya; Veronica Violant; Maria Teresa Anguera edit   pdf
doi  openurl
  Title A Gesture Recognition System for Detecting Behavioral Patterns of ADHD Type Journal Article
  Year 2016 Publication (up) IEEE Transactions on System, Man and Cybernetics, Part B Abbreviated Journal TSMCB  
  Volume 46 Issue 1 Pages 136-147  
  Keywords Gesture Recognition; ADHD; Gaussian Mixture Models; Convex Hulls; Dynamic Time Warping; Multi-modal RGB-Depth data  
  Abstract We present an application of gesture recognition using an extension of Dynamic Time Warping (DTW) to recognize behavioural patterns of Attention Deficit Hyperactivity Disorder (ADHD). We propose an extension of DTW using one-class classifiers in order to be able to encode the variability of a gesture category, and thus, perform an alignment between a gesture sample and a gesture class. We model the set of gesture samples of a certain gesture category using either GMMs or an approximation of Convex Hulls. Thus, we add a theoretical contribution to classical warping path in DTW by including local modeling of intra-class gesture variability. This methodology is applied in a clinical context, detecting a group of ADHD behavioural patterns defined by experts in psychology/psychiatry, to provide support to clinicians in the diagnose procedure. The proposed methodology is tested on a novel multi-modal dataset (RGB plus Depth) of ADHD children recordings with behavioural patterns. We obtain satisfying results when compared to standard state-of-the-art approaches in the DTW context.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HuPBA; MILAB; Approved no  
  Call Number Admin @ si @ BHE2016 Serial 2566  
Permanent link to this record
 

 
Author Oscar Lopes; Miguel Reyes; Sergio Escalera; Jordi Gonzalez edit  doi
openurl 
  Title Spherical Blurred Shape Model for 3-D Object and Pose Recognition: Quantitative Analysis and HCI Applications in Smart Environments Type Journal Article
  Year 2014 Publication (up) IEEE Transactions on Systems, Man and Cybernetics (Part B) Abbreviated Journal TSMCB  
  Volume 44 Issue 12 Pages 2379-2390  
  Keywords  
  Abstract The use of depth maps is of increasing interest after the advent of cheap multisensor devices based on structured light, such as Kinect. In this context, there is a strong need of powerful 3-D shape descriptors able to generate rich object representations. Although several 3-D descriptors have been already proposed in the literature, the research of discriminative and computationally efficient descriptors is still an open issue. In this paper, we propose a novel point cloud descriptor called spherical blurred shape model (SBSM) that successfully encodes the structure density and local variabilities of an object based on shape voxel distances and a neighborhood propagation strategy. The proposed SBSM is proven to be rotation and scale invariant, robust to noise and occlusions, highly discriminative for multiple categories of complex objects like the human hand, and computationally efficient since the SBSM complexity is linear to the number of object voxels. Experimental evaluation in public depth multiclass object data, 3-D facial expressions data, and a novel hand poses data sets show significant performance improvements in relation to state-of-the-art approaches. Moreover, the effectiveness of the proposal is also proved for object spotting in 3-D scenes and for real-time automatic hand pose recognition in human computer interaction scenarios.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 2168-2267 ISBN Medium  
  Area Expedition Conference  
  Notes HuPBA; ISE; 600.078;MILAB Approved no  
  Call Number Admin @ si @ LRE2014 Serial 2442  
Permanent link to this record
 

 
Author Sergio Escalera; Alicia Fornes; Oriol Pujol; Josep Llados; Petia Radeva edit  doi
openurl 
  Title Circular Blurred Shape Model for Multiclass Symbol Recognition Type Journal Article
  Year 2011 Publication (up) IEEE Transactions on Systems, Man and Cybernetics (Part B) (IEEE) Abbreviated Journal TSMCB  
  Volume 41 Issue 2 Pages 497-506  
  Keywords  
  Abstract In this paper, we propose a circular blurred shape model descriptor to deal with the problem of symbol detection and classification as a particular case of object recognition. The feature extraction is performed by capturing the spatial arrangement of significant object characteristics in a correlogram structure. The shape information from objects is shared among correlogram regions, where a prior blurring degree defines the level of distortion allowed in the symbol, making the descriptor tolerant to irregular deformations. Moreover, the descriptor is rotation invariant by definition. We validate the effectiveness of the proposed descriptor in both the multiclass symbol recognition and symbol detection domains. In order to perform the symbol detection, the descriptors are learned using a cascade of classifiers. In the case of multiclass categorization, the new feature space is learned using a set of binary classifiers which are embedded in an error-correcting output code design. The results over four symbol data sets show the significant improvements of the proposed descriptor compared to the state-of-the-art descriptors. In particular, the results are even more significant in those cases where the symbols suffer from elastic deformations.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1083-4419 ISBN Medium  
  Area Expedition Conference  
  Notes MILAB; DAG;HuPBA Approved no  
  Call Number Admin @ si @ EFP2011 Serial 1784  
Permanent link to this record
Select All    Deselect All
 |   | 
Details

Save Citations:
Export Records: