Author Cristina Palmero; Albert Clapes; Chris Bahnsen; Andreas Møgelmose; Thomas B. Moeslund; Sergio Escalera
  Title Multi-modal RGB-Depth-Thermal Human Body Segmentation Type Journal Article
  Year 2016 Publication International Journal of Computer Vision Abbreviated Journal IJCV  
  Volume 118 Issue 2 Pages 217-239  
  Keywords Human body segmentation; RGB; Depth; Thermal  
  Abstract This work addresses the problem of human body segmentation from multi-modal visual cues as a first stage of automatic human behavior analysis. We propose a novel RGB–depth–thermal dataset along with a multi-modal segmentation baseline. The modalities are registered using a calibration device and a registration algorithm. Our baseline extracts regions of interest using background subtraction, defines a partitioning of the foreground regions into cells, computes a set of image features on those cells using different state-of-the-art feature extractors, and models the distribution of the descriptors per cell using probabilistic models. A supervised learning algorithm then fuses the output likelihoods over cells in a stacked feature vector representation. The baseline, using Gaussian mixture models for the probabilistic modeling and Random Forest for the stacked learning, is superior to other state-of-the-art methods, obtaining an overlap above 75% on the novel dataset when compared to the manually annotated ground-truth of human segmentations.  
  Publisher Springer US Place of Publication Editor  
  Notes HuPBA;MILAB; Approved no  
  Call Number Admin @ si @ PCB2016 Serial 2767  

 
Author Albert Clapes; Miguel Reyes; Sergio Escalera
  Title Multi-modal User Identification and Object Recognition Surveillance System Type Journal Article
  Year 2013 Publication Pattern Recognition Letters Abbreviated Journal PRL  
  Volume 34 Issue 7 Pages 799-808  
  Keywords Multi-modal RGB-Depth data analysis; User identification; Object recognition; Intelligent surveillance; Visual features; Statistical learning  
  Abstract We propose an automatic surveillance system for user identification and object recognition based on multi-modal RGB-Depth data analysis. We model an RGBD environment by learning a pixel-based background Gaussian distribution. Then, user and object candidate regions are detected and recognized using robust statistical approaches. The system robustly recognizes users and updates itself online, identifying and detecting new actors in the scene. Moreover, segmented objects are described, matched, recognized, and updated online using view-point 3D descriptions, which are robust to partial occlusions and local 3D viewpoint rotations. Finally, the system saves the history of user–object assignments, which is especially useful for surveillance scenarios. The system has been evaluated on a novel data set containing different indoor/outdoor scenarios, objects, and users, showing accurate recognition and better performance than standard state-of-the-art approaches.  
  Publisher Elsevier Place of Publication Editor  
  Notes HUPBA; 600.046; 605.203;MILAB Approved no  
  Call Number Admin @ si @ CRE2013 Serial 2248  

 
Author Meysam Madadi; Sergio Escalera; Jordi Gonzalez; Xavier Roca; Felipe Lumbreras
  Title Multi-part body segmentation based on depth maps for soft biometry analysis Type Journal Article
  Year 2015 Publication Pattern Recognition Letters Abbreviated Journal PRL  
  Volume 56 Issue Pages 14-21  
  Keywords 3D shape context; 3D point cloud alignment; Depth maps; Human body segmentation; Soft biometry analysis  
  Abstract This paper presents a novel method for extracting biometric measures using depth sensors. Given multi-part labeled training data, a new subject is aligned to the best model of the dataset, and soft biometrics such as lengths or circumference sizes of limbs and body are computed. The process is performed by training relevant pose clusters, defining a representative model, and fitting a 3D shape context descriptor within an iterative matching procedure. We show robust measures by applying orthogonal plates to the body hull. We test our approach on a novel full-body RGB-Depth data set, showing accurate estimation of soft biometrics and better segmentation accuracy than a random forest approach, without requiring large training data.  
  Notes HuPBA; ISE; ADAS; 600.076;600.049; 600.063; 600.054; 302.018;MILAB Approved no  
  Call Number Admin @ si @ MEG2015 Serial 2588  

 
Author Joakim Bruslund Haurum; Meysam Madadi; Sergio Escalera; Thomas B. Moeslund
  Title Multi-scale hybrid vision transformer and Sinkhorn tokenizer for sewer defect classification Type Journal Article
  Year 2022 Publication Automation in Construction Abbreviated Journal AC  
  Volume 144 Issue Pages 104614  
  Keywords Sewer Defect Classification; Vision Transformers; Sinkhorn-Knopp; Convolutional Neural Networks; Closed-Circuit Television; Sewer Inspection  
  Abstract A crucial part of image classification consists of capturing non-local spatial semantics of image content. This paper describes the multi-scale hybrid vision transformer (MSHViT), an extension of the classical convolutional neural network (CNN) backbone, for multi-label sewer defect classification. To better model spatial semantics in the images, features are aggregated at different scales non-locally through the use of a lightweight vision transformer, and a smaller set of tokens is produced through a novel Sinkhorn clustering-based tokenizer using distinct cluster centers. The proposed MSHViT and Sinkhorn tokenizer were evaluated on the Sewer-ML multi-label sewer defect classification dataset, showing consistent performance improvements of up to 2.53 percentage points.  
  Address Dec 2022  
  Notes HuPBA Approved no  
  Call Number Admin @ si @ BME2022c Serial 3780  

 
Author Carlo Gatta; Eloi Puertas; Oriol Pujol
  Title Multi-Scale Stacked Sequential Learning Type Journal Article
  Year 2011 Publication Pattern Recognition Abbreviated Journal PR  
  Volume 44 Issue 10-11 Pages 2414-2416  
  Keywords Stacked sequential learning; Multiscale; Multiresolution; Contextual classification  
  Abstract One of the most widely used assumptions in supervised learning is that data is independent and identically distributed. This assumption does not hold true in many real cases. Sequential learning is the discipline of machine learning that deals with dependent data such that neighboring examples exhibit some kind of relationship. In the literature, there are different approaches that try to capture and exploit this correlation by means of different methodologies. In this paper we focus on meta-learning strategies and, in particular, the stacked sequential learning approach. The main contribution of this work is two-fold: first, we generalize stacked sequential learning. This generalization reflects the key role of modeling neighboring interactions. Second, we propose an effective and efficient way of capturing and exploiting sequential correlations that takes into account long-range interactions by means of a multi-scale pyramidal decomposition of the predicted labels. Additionally, this new method subsumes the standard stacked sequential learning approach. We tested the proposed method on two different classification tasks: text line classification in a FAQ data set and image classification. Results on these tasks clearly show that our approach outperforms standard stacked sequential learning. Moreover, we show that the proposed method allows controlling the trade-off between the detail and the desired range of the interactions.  
  Publisher Elsevier Place of Publication Editor  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Notes MILAB;HuPBA Approved no  
  Call Number Admin @ si @ GPP2011 Serial 1802  