|   | 
Details
   web
Records
Author Hugo Bertiche; Niloy J Mitra; Kuldeep Kulkarni; Chun Hao Paul Huang; Tuanfeng Y Wang; Meysam Madadi; Sergio Escalera; Duygu Ceylan
Title Blowing in the Wind: CycleNet for Human Cinemagraphs from Still Images Type Conference Article
Year 2023 Publication 36th IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal
Volume Issue Pages 459-468
Keywords
Abstract Cinemagraphs are short looping videos created by adding subtle motions to a static image. This kind of media is popular and engaging. However, automatic generation of cinemagraphs is an underexplored area and current solutions require tedious low-level manual authoring by artists. In this paper, we present an automatic method that allows generating human cinemagraphs from single RGB images. We investigate the problem in the context of dressed humans under the wind. At the core of our method is a novel cyclic neural network that produces looping cinemagraphs for the target loop duration. To circumvent the problem of collecting real data, we demonstrate that it is possible, by working in the image normal space, to learn garment motion dynamics on synthetic data and generalize to real data. We evaluate our method on both synthetic and real data and demonstrate that it is possible to create compelling and plausible cinemagraphs from single RGB images.
Address Vancouver; Canada; June 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVPR
Notes HUPBA Approved no
Call Number (down) Admin @ si @ BMK2023 Serial 3921
Permanent link to this record
 

 
Author Ali Furkan Biten; Andres Mafla; Lluis Gomez; Dimosthenis Karatzas
Title Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching Type Conference Article
Year 2022 Publication Winter Conference on Applications of Computer Vision Abbreviated Journal
Volume Issue Pages 1391-1400
Keywords Measurement; Training; Integrated circuits; Annotations; Semantics; Training data; Semisupervised learning
Abstract The task of image-text matching aims to map representations from different modalities into a common joint visual-textual embedding. However, the most widely used datasets for this task, MSCOCO and Flickr30K, are actually image captioning datasets that offer a very limited set of relationships between images and sentences in their ground-truth annotations. This limited ground truth information forces us to use evaluation metrics based on binary relevance: given a sentence query we consider only one image as relevant. However, many other relevant images or captions may be present in the dataset. In this work, we propose two metrics that evaluate the degree of semantic relevance of retrieved items, independently of their annotated binary relevance. Additionally, we incorporate a novel strategy that uses an image captioning metric, CIDEr, to define a Semantic Adaptive Margin (SAM) to be optimized in a standard triplet loss. By incorporating our formulation to existing models, a large improvement is obtained in scenarios where available training data is limited. We also demonstrate that the performance on the annotated image-caption pairs is maintained while improving on other non-annotated relevant items when employing the full training set. The code for our new metric can be found at github. com/furkanbiten/ncsmetric and the model implementation at github. com/andrespmd/semanticadaptive_margin.
Address Virtual; Waikoloa; Hawai; USA; January 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference WACV
Notes DAG; 600.155; 302.105; Approved no
Call Number (down) Admin @ si @ BMG2022 Serial 3663
Permanent link to this record
 

 
Author Joakim Bruslund Haurum; Meysam Madadi; Sergio Escalera; Thomas B. Moeslund
Title Multi-scale hybrid vision transformer and Sinkhorn tokenizer for sewer defect classification Type Journal Article
Year 2022 Publication Automation in Construction Abbreviated Journal AC
Volume 144 Issue Pages 104614
Keywords Sewer Defect Classification; Vision Transformers; Sinkhorn-Knopp; Convolutional Neural Networks; Closed-Circuit Television; Sewer Inspection
Abstract A crucial part of image classification consists of capturing non-local spatial semantics of image content. This paper describes the multi-scale hybrid vision transformer (MSHViT), an extension of the classical convolutional neural network (CNN) backbone, for multi-label sewer defect classification. To better model spatial semantics in the images, features are aggregated at different scales non-locally through the use of a lightweight vision transformer, and a smaller set of tokens was produced through a novel Sinkhorn clustering-based tokenizer using distinct cluster centers. The proposed MSHViT and Sinkhorn tokenizer were evaluated on the Sewer-ML multi-label sewer defect classification dataset, showing consistent performance improvements of up to 2.53 percentage points.
Address Dec 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA Approved no
Call Number (down) Admin @ si @ BME2022c Serial 3780
Permanent link to this record
 

 
Author Hugo Bertiche; Meysam Madadi; Sergio Escalera
Title Neural Cloth Simulation Type Journal Article
Year 2022 Publication ACM Transactions on Graphics Abbreviated Journal ACMTGraph
Volume 41 Issue 6 Pages 1-14
Keywords
Abstract We present a general framework for the garment animation problem through unsupervised deep learning inspired in physically based simulation. Existing trends in the literature already explore this possibility. Nonetheless, these approaches do not handle cloth dynamics. Here, we propose the first methodology able to learn realistic cloth dynamics unsupervisedly, and henceforth, a general formulation for neural cloth simulation. The key to achieve this is to adapt an existing optimization scheme for motion from simulation based methodologies to deep learning. Then, analyzing the nature of the problem, we devise an architecture able to automatically disentangle static and dynamic cloth subspaces by design. We will show how this improves model performance. Additionally, this opens the possibility of a novel motion augmentation technique that greatly improves generalization. Finally, we show it also allows to control the level of motion in the predictions. This is a useful, never seen before, tool for artists. We provide of detailed analysis of the problem to establish the bases of neural cloth simulation and guide future research into the specifics of this domain.



ACM Transactions on GraphicsVolume 41Issue 6December 2022 Article No.: 220pp 1–
Address Dec 2022
Corporate Author Thesis
Publisher ACM Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes Approved no
Call Number (down) Admin @ si @ BME2022b Serial 3779
Permanent link to this record
 

 
Author Joakim Bruslund Haurum; Meysam Madadi; Sergio Escalera; Thomas B. Moeslund
Title Multi-Task Classification of Sewer Pipe Defects and Properties Using a Cross-Task Graph Neural Network Decoder Type Conference Article
Year 2022 Publication Winter Conference on Applications of Computer Vision Abbreviated Journal
Volume Issue Pages 2806-2817
Keywords Vision Systems; Applications Multi-Task Classification
Abstract The sewerage infrastructure is one of the most important and expensive infrastructures in modern society. In order to efficiently manage the sewerage infrastructure, automated sewer inspection has to be utilized. However, while sewer
defect classification has been investigated for decades, little attention has been given to classifying sewer pipe properties such as water level, pipe material, and pipe shape, which are needed to evaluate the level of sewer pipe deterioration.
In this work we classify sewer pipe defects and properties concurrently and present a novel decoder-focused multi-task classification architecture Cross-Task Graph Neural Network (CT-GNN), which refines the disjointed per-task predictions using cross-task information. The CT-GNN architecture extends the traditional disjointed task-heads decoder, by utilizing a cross-task graph and unique class node embeddings. The cross-task graph can either be determined a priori based on the conditional probability between the task classes or determined dynamically using self-attention.
CT-GNN can be added to any backbone and trained end-toend at a small increase in the parameter count. We achieve state-of-the-art performance on all four classification tasks in the Sewer-ML dataset, improving defect classification and
water level classification by 5.3 and 8.0 percentage points, respectively. We also outperform the single task methods as well as other multi-task classification approaches while introducing 50 times fewer parameters than previous modelfocused approaches.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference WACV
Notes HUPBA; no proj Approved no
Call Number (down) Admin @ si @ BME2022 Serial 3638
Permanent link to this record
 

 
Author Hugo Bertiche; Meysam Madadi; Sergio Escalera
Title PBNS: Physically Based Neural Simulation for Unsupervised Garment Pose Space Deformation Type Journal Article
Year 2021 Publication ACM Transactions on Graphics Abbreviated Journal
Volume 40 Issue 6 Pages 1-14
Keywords
Abstract We present a methodology to automatically obtain Pose Space Deformation (PSD) basis for rigged garments through deep learning. Classical approaches rely on Physically Based Simulations (PBS) to animate clothes. These are general solutions that, given a sufficiently fine-grained discretization of space and time, can achieve highly realistic results. However, they are computationally expensive and any scene modification prompts the need of re-simulation. Linear Blend Skinning (LBS) with PSD offers a lightweight alternative to PBS, though, it needs huge volumes of data to learn proper PSD. We propose using deep learning, formulated as an implicit PBS, to unsupervisedly learn realistic cloth Pose Space Deformations in a constrained scenario: dressed humans. Furthermore, we show it is possible to train these models in an amount of time comparable to a PBS of a few sequences. To the best of our knowledge, we are the first to propose a neural simulator for cloth.
While deep-based approaches in the domain are becoming a trend, these are data-hungry models. Moreover, authors often propose complex formulations to better learn wrinkles from PBS data. Supervised learning leads to physically inconsistent predictions that require collision solving to be used. Also, dependency on PBS data limits the scalability of these solutions, while their formulation hinders its applicability and compatibility. By proposing an unsupervised methodology to learn PSD for LBS models (3D animation standard), we overcome both of these drawbacks. Results obtained show cloth-consistency in the animated garments and meaningful pose-dependant folds and wrinkles. Our solution is extremely efficient, handles multiple layers of cloth, allows unsupervised outfit resizing and can be easily applied to any custom 3D avatar.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HUPBA; no proj Approved no
Call Number (down) Admin @ si @ BME2021c Serial 3643
Permanent link to this record
 

 
Author Hugo Bertiche; Meysam Madadi; Sergio Escalera
Title PBNS: Physically Based Neural Simulation for Unsupervised Garment Pose Space Deformation Type Conference Article
Year 2021 Publication 14th ACM Siggraph Conference and exhibition on Computer Graphics and Interactive Techniques in Asia Abbreviated Journal
Volume Issue Pages
Keywords
Abstract We present a methodology to automatically obtain Pose Space Deformation (PSD) basis for rigged garments through deep learning. Classical approaches rely on Physically Based Simulations (PBS) to animate clothes. These are general solutions that, given a sufficiently fine-grained discretization of space and time, can achieve highly realistic results. However, they are computationally expensive and any scene modification prompts the need of re-simulation. Linear Blend Skinning (LBS) with PSD offers a lightweight alternative to PBS, though, it needs huge volumes of data to learn proper PSD. We propose using deep learning, formulated as an implicit PBS, to unsupervisedly learn realistic cloth Pose Space Deformations in a constrained scenario: dressed humans. Furthermore, we show it is possible to train these models in an amount of time comparable to a PBS of a few sequences. To the best of our knowledge, we are the first to propose a neural simulator for cloth.
While deep-based approaches in the domain are becoming a trend, these are data-hungry models. Moreover, authors often propose complex formulations to better learn wrinkles from PBS data. Supervised learning leads to physically inconsistent predictions that require collision solving to be used. Also, dependency on PBS data limits the scalability of these solutions, while their formulation hinders its applicability and compatibility. By proposing an unsupervised methodology to learn PSD for LBS models (3D animation standard), we overcome both of these drawbacks. Results obtained show cloth-consistency in the animated garments and meaningful pose-dependant folds and wrinkles. Our solution is extremely efficient, handles multiple layers of cloth, allows unsupervised outfit resizing and can be easily applied to any custom 3D avatar.
Address Virtual; December 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference SIGGRAPH
Notes HUPBA; no proj Approved no
Call Number (down) Admin @ si @ BME2021b Serial 3641
Permanent link to this record
 

 
Author Hugo Bertiche; Meysam Madadi; Sergio Escalera
Title Deep Parametric Surfaces for 3D Outfit Reconstruction from Single View Image Type Conference Article
Year 2021 Publication 16th IEEE International Conference on Automatic Face and Gesture Recognition Abbreviated Journal
Volume Issue Pages 1-8
Keywords
Abstract We present a methodology to retrieve analytical surfaces parametrized as a neural network. Previous works on 3D reconstruction yield point clouds, voxelized objects or meshes. Instead, our approach yields 2-manifolds in the euclidean space through deep learning. To this end, we implement a novel formulation for fully connected layers as parametrized manifolds that allows continuous predictions with differential geometry. Based on this property we propose a novel smoothness loss. Results on CLOTH3D++ dataset show the possibility to infer different topologies and the benefits of the smoothness term based on differential geometry.
Address Virtual; December 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference FG
Notes HUPBA; no proj Approved no
Call Number (down) Admin @ si @ BME2021 Serial 3640
Permanent link to this record
 

 
Author Hugo Bertiche; Meysam Madadi; Sergio Escalera
Title CLOTH3D: Clothed 3D Humans Type Conference Article
Year 2020 Publication 16th European Conference on Computer Vision Abbreviated Journal
Volume Issue Pages
Keywords
Abstract This work presents CLOTH3D, the first big scale synthetic dataset of 3D clothed human sequences. CLOTH3D contains a large variability on garment type, topology, shape, size, tightness and fabric. Clothes are simulated on top of thousands of different pose sequences and body shapes, generating realistic cloth dynamics. We provide the dataset with a generative model for cloth generation. We propose a Conditional Variational Auto-Encoder (CVAE) based on graph convolutions (GCVAE) to learn garment latent spaces. This allows for realistic generation of 3D garments on top of SMPL model for any pose and shape.
Address Virtual; August 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ECCV
Notes HUPBA Approved no
Call Number (down) Admin @ si @ BME2020 Serial 3519
Permanent link to this record
 

 
Author Souhail Bakkali; Zuheng Ming; Mickael Coustaty; Marçal Rusiñol; Oriol Ramos Terrades
Title VLCDoC: Vision-Language Contrastive Pre-Training Model for Cross-Modal Document Classification Type Journal Article
Year 2023 Publication Pattern Recognition Abbreviated Journal PR
Volume 139 Issue Pages 109419
Keywords
Abstract Multimodal learning from document data has achieved great success lately as it allows to pre-train semantically meaningful features as a prior into a learnable downstream approach. In this paper, we approach the document classification problem by learning cross-modal representations through language and vision cues, considering intra- and inter-modality relationships. Instead of merging features from different modalities into a common representation space, the proposed method exploits high-level interactions and learns relevant semantic information from effective attention flows within and across modalities. The proposed learning objective is devised between intra- and inter-modality alignment tasks, where the similarity distribution per task is computed by contracting positive sample pairs while simultaneously contrasting negative ones in the common feature representation space}. Extensive experiments on public document classification datasets demonstrate the effectiveness and the generalization capacity of our model on both low-scale and large-scale datasets.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISSN 0031-3203 ISBN Medium
Area Expedition Conference
Notes DAG; 600.140; 600.121 Approved no
Call Number (down) Admin @ si @ BMC2023 Serial 3826
Permanent link to this record
 

 
Author Francisco Blanco; Felipe Lumbreras; Joan Serrat; Roswitha Siener; Silvia Serranti; Giuseppe Bonifazi; Montserrat Lopez Mesas; Manuel Valiente
Title Taking advantage of Hyperspectral Imaging classification of urinary stones against conventional IR Spectroscopy Type Journal Article
Year 2014 Publication Journal of Biomedical Optics Abbreviated Journal JBiO
Volume 19 Issue 12 Pages 126004-1 - 126004-9
Keywords
Abstract The analysis of urinary stones is mandatory for the best management of the disease after the stone passage in order to prevent further stone episodes. Thus the use of an appropriate methodology for an individualized stone analysis becomes a key factor for giving the patient the most suitable treatment. A recently developed hyperspectral imaging methodology, based on pixel-to-pixel analysis of near-infrared spectral images, is compared to the reference technique in stone analysis, infrared (IR) spectroscopy. The developed classification model yields >90% correct classification rate when compared to IR and is able to precisely locate stone components within the structure of the stone with a 15 µm resolution. Due to the little sample pretreatment, low analysis time, good performance of the model, and the automation of the measurements, they become analyst independent; this methodology can be considered to become a routine analysis for clinical laboratories.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS; 600.076 Approved no
Call Number (down) Admin @ si @ BLS2014 Serial 2563
Permanent link to this record
 

 
Author Fernando Barrera; Felipe Lumbreras; Angel Sappa
Title Multispectral Piecewise Planar Stereo using Manhattan-World Assumption Type Journal Article
Year 2013 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 34 Issue 1 Pages 52-61
Keywords Multispectral stereo rig; Dense disparity maps from multispectral stereo; Color and infrared images
Abstract This paper proposes a new framework for extracting dense disparity maps from a multispectral stereo rig. The system is constructed with an infrared and a color camera. It is intended to explore novel multispectral stereo matching approaches that will allow further extraction of semantic information. The proposed framework consists of three stages. Firstly, an initial sparse disparity map is generated by using a cost function based on feature matching in a multiresolution scheme. Then, by looking at the color image, a set of planar hypotheses is defined to describe the surfaces on the scene. Finally, the previous stages are combined by reformulating the disparity computation as a global minimization problem. The paper has two main contributions. The first contribution combines mutual information with a shape descriptor based on gradient in a multiresolution scheme. The second contribution, which is based on the Manhattan-world assumption, extracts a dense disparity representation using the graph cut algorithm. Experimental results in outdoor scenarios are provided showing the validity of the proposed framework.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS; 600.054; 600.055; 605.203 Approved no
Call Number (down) Admin @ si @ BLS2013 Serial 2245
Permanent link to this record
 

 
Author Fernando Barrera; Felipe Lumbreras; Angel Sappa
Title Multimodal Stereo Vision System: 3D Data Extraction and Algorithm Evaluation Type Journal Article
Year 2012 Publication IEEE Journal of Selected Topics in Signal Processing Abbreviated Journal J-STSP
Volume 6 Issue 5 Pages 437-446
Keywords
Abstract This paper proposes an imaging system for computing sparse depth maps from multispectral images. A special stereo head consisting of an infrared and a color camera defines the proposed multimodal acquisition system. The cameras are rigidly attached so that their image planes are parallel. Details about the calibration and image rectification procedure are provided. Sparse disparity maps are obtained by the combined use of mutual information enriched with gradient information. The proposed approach is evaluated using a Receiver Operating Characteristics curve. Furthermore, a multispectral dataset, color and infrared images, together with their corresponding ground truth disparity maps, is generated and used as a test bed. Experimental results in real outdoor scenarios are provided showing its viability and that the proposed approach is not restricted to a specific domain.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1932-4553 ISBN Medium
Area Expedition Conference
Notes ADAS Approved no
Call Number (down) Admin @ si @ BLS2012b Serial 2155
Permanent link to this record
 

 
Author Federico Bartoli; Giuseppe Lisanti; Svebor Karaman; Andrew Bagdanov; Alberto del Bimbo
Title Unsupervised scene adaptation for faster multi- scale pedestrian detection Type Conference Article
Year 2014 Publication 22nd International Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages 3534 - 3539
Keywords
Abstract
Address Stockholm; Sweden; August 2014
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICPR
Notes LAMP; 600.079 Approved no
Call Number (down) Admin @ si @ BLK2014 Serial 2519
Permanent link to this record
 

 
Author Ricard Borras; Agata Lapedriza; Laura Igual
Title Depth Information in Human Gait Analysis: An Experimental Study on Gender Recognition Type Conference Article
Year 2012 Publication 9th International Conference on Image Analysis and Recognition Abbreviated Journal
Volume 7325 Issue II Pages 98-105
Keywords
Abstract This work presents DGait, a new gait database acquired with a depth camera. This database contains videos from 53 subjects walking in different directions. The intent of this database is to provide a public set to explore whether the depth can be used as an additional information source for gait classification purposes. Each video is labelled according to subject, gender and age. Furthermore, for each subject and view point, we provide initial and final frames of an entire walk cycle. On the other hand, we perform gait-based gender classification experiments with DGait database, in order to illustrate the usefulness of depth information for this purpose. In our experiments, we extract 2D and 3D gait features based on shape descriptors, and compare the performance of these features for gender identification, using a Kernel SVM. The obtained results show that depth can be an information source of great relevance for gait classification problems.
Address Aveiro, Portugal
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 0302-9743 ISBN 978-3-642-31297-7 Medium
Area Expedition Conference ICIAR
Notes OR; MILAB;MV Approved no
Call Number (down) Admin @ si @ BLI2012 Serial 2009
Permanent link to this record