|
Records |
Links |
|
Author |
Josep Llados; Daniel Lopresti; Seiichi Uchida (eds) |
|
|
Title |
16th International Conference, 2021, Proceedings, Part III |
Type |
Book Whole |
|
Year |
2021 |
Publication |
Document Analysis and Recognition – ICDAR 2021 |
Abbreviated Journal |
|
|
|
Volume |
12823 |
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
This four-volume set of LNCS 12821, LNCS 12822, LNCS 12823 and LNCS 12824, constitutes the refereed proceedings of the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, held in Lausanne, Switzerland in September 2021. The 182 full papers were carefully reviewed and selected from 340 submissions, and are presented with 13 competition reports.
The papers are organized into the following topical sections: document analysis for literature search, document summarization and translation, multimedia document analysis, mobile text recognition, document analysis for social good, indexing and retrieval of documents, physical and logical layout analysis, recognition of tables and formulas, and natural language processing (NLP) for document understanding. |
|
|
Address |
Lausanne, Switzerland, September 5-10, 2021 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Cham |
Place of Publication |
|
Editor |
Josep Llados; Daniel Lopresti; Seiichi Uchida |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-3-030-86333-3 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ |
Serial |
3727 |
|
Permanent link to this record |
|
|
|
|
Author |
Josep Llados; Daniel Lopresti; Seiichi Uchida (eds) |
|
|
Title |
16th International Conference, 2021, Proceedings, Part IV |
Type |
Book Whole |
|
Year |
2021 |
Publication |
Document Analysis and Recognition – ICDAR 2021 |
Abbreviated Journal |
|
|
|
Volume |
12824 |
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
This four-volume set of LNCS 12821, LNCS 12822, LNCS 12823 and LNCS 12824, constitutes the refereed proceedings of the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, held in Lausanne, Switzerland in September 2021. The 182 full papers were carefully reviewed and selected from 340 submissions, and are presented with 13 competition reports.
The papers are organized into the following topical sections: document analysis for literature search, document summarization and translation, multimedia document analysis, mobile text recognition, document analysis for social good, indexing and retrieval of documents, physical and logical layout analysis, recognition of tables and formulas, and natural language processing (NLP) for document understanding. |
|
|
Address |
Lausanne, Switzerland, September 5-10, 2021 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Cham |
Place of Publication |
|
Editor |
Josep Llados; Daniel Lopresti; Seiichi Uchida |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-3-030-86336-4 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ |
Serial |
3728 |
|
Permanent link to this record |
|
|
|
|
Author |
Josep Llados; Daniel Lopresti; Seiichi Uchida (eds) |
|
|
Title |
16th International Conference, 2021, Proceedings, Part I |
Type |
Book Whole |
|
Year |
2021 |
Publication |
Document Analysis and Recognition – ICDAR 2021 |
Abbreviated Journal |
|
|
|
Volume |
12821 |
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
This four-volume set of LNCS 12821, LNCS 12822, LNCS 12823 and LNCS 12824, constitutes the refereed proceedings of the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, held in Lausanne, Switzerland in September 2021. The 182 full papers were carefully reviewed and selected from 340 submissions, and are presented with 13 competition reports.
The papers are organized into the following topical sections: historical document analysis, document analysis systems, handwriting recognition, scene text detection and recognition, document image processing, natural language processing (NLP) for document understanding, and graphics, diagram and math recognition. |
|
|
Address |
Lausanne, Switzerland, September 5-10, 2021 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Cham |
Place of Publication |
|
Editor |
Josep Llados; Daniel Lopresti; Seiichi Uchida |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-3-030-86548-1 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ |
Serial |
3725 |
|
Permanent link to this record |
|
|
|
|
Author |
Josep Llados; Daniel Lopresti; Seiichi Uchida (eds) |
|
|
Title |
16th International Conference, 2021, Proceedings, Part II |
Type |
Book Whole |
|
Year |
2021 |
Publication |
Document Analysis and Recognition – ICDAR 2021 |
Abbreviated Journal |
|
|
|
Volume |
12822 |
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
This four-volume set of LNCS 12821, LNCS 12822, LNCS 12823 and LNCS 12824, constitutes the refereed proceedings of the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, held in Lausanne, Switzerland in September 2021. The 182 full papers were carefully reviewed and selected from 340 submissions, and are presented with 13 competition reports.
The papers are organized into the following topical sections: document analysis for literature search, document summarization and translation, multimedia document analysis, mobile text recognition, document analysis for social good, indexing and retrieval of documents, physical and logical layout analysis, recognition of tables and formulas, and natural language processing (NLP) for document understanding. |
|
|
Address |
Lausanne, Switzerland, September 5-10, 2021 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Cham |
Place of Publication |
|
Editor |
Josep Llados; Daniel Lopresti; Seiichi Uchida |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-3-030-86330-2 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ |
Serial |
3726 |
|
Permanent link to this record |
|
|
|
|
Author |
Josep Llados |
|
|
Title |
The 5G of Document Intelligence |
Type |
Conference Article |
|
Year |
2021 |
Publication |
3rd Workshop on Future of Document Analysis and Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Lausanne; Suissa; September 2021 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
FDAR |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ |
Serial |
3677 |
|
Permanent link to this record |
|
|
|
|
Author |
Shiqi Yang; Yaxing Wang; Joost Van de Weijer; Luis Herranz; Shangling Jui |
|
|
Title |
Exploiting the Intrinsic Neighborhood Structure for Source-free Domain Adaptation |
Type |
Conference Article |
|
Year |
2021 |
Publication |
Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS 2021) |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
Domain adaptation (DA) aims to alleviate the domain shift between source domain and target domain. Most DA methods require access to the source data, but often that is not possible (e.g. due to data privacy or intellectual property). In this paper, we address the challenging source-free domain adaptation (SFDA) problem, where the source pretrained model is adapted to the target domain in the absence of source data. Our method is based on the observation that target data, which might no longer align with the source domain classifier, still forms clear clusters. We capture this intrinsic structure by defining local affinity of the target data, and encourage label consistency among data with high local affinity. We observe that higher affinity should be assigned to reciprocal neighbors, and propose a self regularization loss to decrease the negative impact of noisy neighbors. Furthermore, to aggregate information with more context, we consider expanded neighborhoods with small affinity values. In the experimental results we verify that the inherent structure of the target features is an important source of information for domain adaptation. We demonstrate that this local structure can be efficiently captured by considering the local neighbors, the reciprocal neighbors, and the expanded neighborhood. Finally, we achieve state-of-the-art performance on several 2D image and 3D point cloud recognition datasets. Code is available in https://github.com/Albert0147/SFDA_neighbors. |
|
|
Address |
Online; December 7-10, 2021 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
NIPS |
|
|
Notes |
LAMP; 600.147; 600.141 |
Approved |
no |
|
|
Call Number |
Admin @ si @ |
Serial |
3691 |
|
Permanent link to this record |
|
|
|
|
Author |
Jose Elias Yauri; Aura Hernandez-Sabate; Pau Folch; Debora Gil |
|
|
Title |
Mental Workload Detection Based on EEG Analysis |
Type |
Conference Article |
|
Year |
2021 |
Publication |
Artificial Intelligent Research and Development. Proceedings 23rd International Conference of the Catalan Association for Artificial Intelligence. |
Abbreviated Journal |
|
|
|
Volume |
339 |
Issue |
|
Pages |
268-277 |
|
|
Keywords |
Cognitive states; Mental workload; EEG analysis; Neural Networks. |
|
|
Abstract |
The study of mental workload becomes essential for human work efficiency, health conditions and to avoid accidents, since workload compromises both performance and awareness. Although workload has been widely studied using several physiological measures, minimising the sensor network as much as possible remains both a challenge and a requirement.
Electroencephalogram (EEG) signals have shown a high correlation to specific cognitive and mental states like workload. However, there is not enough evidence in the literature to validate how well models generalize in case of new subjects performing tasks of a workload similar to the ones included during model’s training.
In this paper we propose a binary neural network to classify EEG features across different mental workloads. Two workloads, low and medium, are induced using two variants of the N-Back Test. The proposed model was validated in a dataset collected from 16 subjects and shown a high level of generalization capability: model reported an average recall of 81.81% in a leave-one-out subject evaluation. |
|
|
Address |
Virtual; October 20-22 2021 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CCIA |
|
|
Notes |
IAM; 600.139; 600.118; 600.145 |
Approved |
no |
|
|
Call Number |
Admin @ si @ |
Serial |
3723 |
|
Permanent link to this record |
|
|
|
|
Author |
Reza Azad; Afshin Bozorgpour; Maryam Asadi-Aghbolaghi; Dorit Merhof; Sergio Escalera |
|
|
Title |
Deep Frequency Re-Calibration U-Net for Medical Image Segmentation |
Type |
Conference Article |
|
Year |
2021 |
Publication |
IEEE/CVF International Conference on Computer Vision Workshops |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
3274-3283 |
|
|
Keywords |
|
|
|
Abstract |
We present a novel solution to the garment animation problem through deep learning. Our contribution allows animating any template outfit with arbitrary topology and geometric complexity. Recent works develop models for garment edition, resizing and animation at the same time by leveraging the support body model (encoding garments as body homotopies). This leads to complex engineering solutions that suffer from scalability, applicability and compatibility. By limiting our scope to garment animation only, we are able to propose a simple model that can animate any outfit, independently of its topology, vertex order or connectivity. Our proposed architecture maps outfits to animated 3D models into the standard format for 3D animation (blend weights and blend shapes matrices), automatically providing of compatibility with any graphics engine. We also propose a methodology to complement supervised learning with an unsupervised physically based learning that implicitly solves collisions and enhances cloth quality. |
|
|
Address |
VIRTUAL; October 2021 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICCVW |
|
|
Notes |
HUPBA; no proj |
Approved |
no |
|
|
Call Number |
Admin @ si @ ABA2021 |
Serial |
3645 |
|
Permanent link to this record |
|
|
|
|
Author |
David Aldavert |
|
|
Title |
Efficient and Scalable Handwritten Word Spotting on Historical Documents using Bag of Visual Words |
Type |
Book Whole |
|
Year |
2021 |
Publication |
PhD Thesis, Universitat Autonoma de Barcelona-CVC |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
Word spotting can be defined as the pattern recognition tasked aimed at locating and retrieving a specific keyword within a document image collection without explicitly transcribing the whole corpus. Its use is particularly interesting when applied in scenarios where Optical Character Recognition performs poorly or can not be used at all. This thesis focuses on such a scenario, word spotting on historical handwritten documents that have been written by a single author or by multiple authors with a similar calligraphy.
This problem requires a visual signature that is robust to image artifacts, flexible to accommodate script variations and efficient to retrieve information in a rapid manner. For this, we have developed a set of word spotting methods that on their foundation use the well known Bag-of-Visual-Words (BoVW) representation. This representation has gained popularity among the document image analysis community to characterize handwritten words
in an unsupervised manner. However, most approaches on this field rely on a basic BoVW configuration and disregard complex encoding and spatial representations. We determine which BoVW configurations provide the best performance boost to a spotting system.
Then, we extend the segmentation-based word spotting, where word candidates are given a priori, to segmentation-free spotting. The proposed approach seeds the document images with overlapping word location candidates and characterizes them with a BoVW signature. Retrieval is achieved comparing the query and candidate signatures and returning the locations that provide a higher consensus. This is a simple but powerful approach that requires a more compact signature than in a segmentation-based scenario. We first
project the BoVW signature into a reduced semantic topics space and then compress it further using Product Quantizers. The resulting signature only requires a few dozen bytes, allowing us to index thousands of pages on a common desktop computer. The final system still yields a performance comparable to the state-of-the-art despite all the information loss during the compression phases.
Afterwards, we also study how to combine different modalities of information in order to create a query-by-X spotting system where, words are indexed using an information modality and queries are retrieved using another. We consider three different information modalities: visual, textual and audio. Our proposal is to create a latent feature space where features which are semantically related are projected onto the same topics. Creating thus a new feature space where information from different modalities can be compared. Later, we consider the codebook generation and descriptor encoding problem. The codebooks used to encode the BoVW signatures are usually created using an unsupervised clustering algorithm and, they require to test multiple parameters to determine which configuration is best for a certain document collection. We propose a semantic clustering algorithm which allows to estimate the best parameter from data. Since gather annotated data is costly, we use synthetically generated word images. The resulting codebook is database agnostic, i. e. a codebook that yields a good performance on document collections that use the same script. We also propose the use of an additional codebook to approximate descriptors and reduce the descriptor encoding
complexity to sub-linear.
Finally, we focus on the problem of signatures dimensionality. We propose a new symbol probability signature where each bin represents the probability that a certain symbol is present a certain location of the word image. This signature is extremely compact and combined with compression techniques can represent word images with just a few bytes per signature. |
|
|
Address |
April 2021 |
|
|
Corporate Author |
|
Thesis |
Ph.D. thesis |
|
|
Publisher |
Ediciones Graficas Rey |
Place of Publication |
|
Editor |
Marçal Rusiñol;Josep Llados |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-84-122714-5-4 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ Ald2021 |
Serial |
3601 |
|
Permanent link to this record |
|
|
|
|
Author |
O.F.Ahmad; Y.Mori; M.Misawa; S.Kudo; J.T.Anderson; Jorge Bernal |
|
|
Title |
Establishing key research questions for the implementation of artificial intelligence in colonoscopy: a modified Delphi method |
Type |
Journal Article |
|
Year |
2021 |
Publication |
Endoscopy |
Abbreviated Journal |
END |
|
|
Volume |
53 |
Issue |
9 |
Pages |
893-901 |
|
|
Keywords |
|
|
|
Abstract |
BACKGROUND : Artificial intelligence (AI) research in colonoscopy is progressing rapidly but widespread clinical implementation is not yet a reality. We aimed to identify the top implementation research priorities. METHODS : An established modified Delphi approach for research priority setting was used. Fifteen international experts, including endoscopists and translational computer scientists/engineers, from nine countries participated in an online survey over 9 months. Questions related to AI implementation in colonoscopy were generated as a long-list in the first round, and then scored in two subsequent rounds to identify the top 10 research questions. RESULTS : The top 10 ranked questions were categorized into five themes. Theme 1: clinical trial design/end points (4 questions), related to optimum trial designs for polyp detection and characterization, determining the optimal end points for evaluation of AI, and demonstrating impact on interval cancer rates. Theme 2: technological developments (3 questions), including improving detection of more challenging and advanced lesions, reduction of false-positive rates, and minimizing latency. Theme 3: clinical adoption/integration (1 question), concerning the effective combination of detection and characterization into one workflow. Theme 4: data access/annotation (1 question), concerning more efficient or automated data annotation methods to reduce the burden on human experts. Theme 5: regulatory approval (1 question), related to making regulatory approval processes more efficient. CONCLUSIONS : This is the first reported international research priority setting exercise for AI in colonoscopy. The study findings should be used as a framework to guide future research with key stakeholders to accelerate the clinical implementation of AI in endoscopy. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ISE |
Approved |
no |
|
|
Call Number |
Admin @ si @ AMM2021 |
Serial |
3670 |
|
Permanent link to this record |
|
|
|
|
Author |
Sonia Baeza; R.Domingo; M.Salcedo; G.Moragas; J.Deportos; I.Garcia Olive; Carles Sanchez; Debora Gil; Antoni Rosell |
|
|
Title |
Artificial Intelligence to Optimize Pulmonary Embolism Diagnosis During Covid-19 Pandemic by Perfusion SPECT/CT, a Pilot Study |
Type |
Journal Article |
|
Year |
2021 |
Publication |
American Journal of Respiratory and Critical Care Medicine |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
IAM; 600.145 |
Approved |
no |
|
|
Call Number |
Admin @ si @ BDS2021 |
Serial |
3591 |
|
Permanent link to this record |
|
|
|
|
Author |
Hugo Bertiche; Meysam Madadi; Sergio Escalera |
|
|
Title |
Deep Parametric Surfaces for 3D Outfit Reconstruction from Single View Image |
Type |
Conference Article |
|
Year |
2021 |
Publication |
16th IEEE International Conference on Automatic Face and Gesture Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
1-8 |
|
|
Keywords |
|
|
|
Abstract |
We present a methodology to retrieve analytical surfaces parametrized as a neural network. Previous works on 3D reconstruction yield point clouds, voxelized objects or meshes. Instead, our approach yields 2-manifolds in the euclidean space through deep learning. To this end, we implement a novel formulation for fully connected layers as parametrized manifolds that allows continuous predictions with differential geometry. Based on this property we propose a novel smoothness loss. Results on CLOTH3D++ dataset show the possibility to infer different topologies and the benefits of the smoothness term based on differential geometry. |
|
|
Address |
Virtual; December 2021 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
FG |
|
|
Notes |
HUPBA; no proj |
Approved |
no |
|
|
Call Number |
Admin @ si @ BME2021 |
Serial |
3640 |
|
Permanent link to this record |
|
|
|
|
Author |
Hugo Bertiche; Meysam Madadi; Sergio Escalera |
|
|
Title |
PBNS: Physically Based Neural Simulation for Unsupervised Garment Pose Space Deformation |
Type |
Conference Article |
|
Year |
2021 |
Publication |
14th ACM Siggraph Conference and exhibition on Computer Graphics and Interactive Techniques in Asia |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
We present a methodology to automatically obtain Pose Space Deformation (PSD) basis for rigged garments through deep learning. Classical approaches rely on Physically Based Simulations (PBS) to animate clothes. These are general solutions that, given a sufficiently fine-grained discretization of space and time, can achieve highly realistic results. However, they are computationally expensive and any scene modification prompts the need of re-simulation. Linear Blend Skinning (LBS) with PSD offers a lightweight alternative to PBS, though, it needs huge volumes of data to learn proper PSD. We propose using deep learning, formulated as an implicit PBS, to unsupervisedly learn realistic cloth Pose Space Deformations in a constrained scenario: dressed humans. Furthermore, we show it is possible to train these models in an amount of time comparable to a PBS of a few sequences. To the best of our knowledge, we are the first to propose a neural simulator for cloth.
While deep-based approaches in the domain are becoming a trend, these are data-hungry models. Moreover, authors often propose complex formulations to better learn wrinkles from PBS data. Supervised learning leads to physically inconsistent predictions that require collision solving to be used. Also, dependency on PBS data limits the scalability of these solutions, while their formulation hinders its applicability and compatibility. By proposing an unsupervised methodology to learn PSD for LBS models (3D animation standard), we overcome both of these drawbacks. Results obtained show cloth-consistency in the animated garments and meaningful pose-dependant folds and wrinkles. Our solution is extremely efficient, handles multiple layers of cloth, allows unsupervised outfit resizing and can be easily applied to any custom 3D avatar. |
|
|
Address |
Virtual; December 2020 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
SIGGRAPH |
|
|
Notes |
HUPBA; no proj |
Approved |
no |
|
|
Call Number |
Admin @ si @ BME2021b |
Serial |
3641 |
|
Permanent link to this record |
|
|
|
|
Author |
Hugo Bertiche; Meysam Madadi; Sergio Escalera |
|
|
Title |
PBNS: Physically Based Neural Simulation for Unsupervised Garment Pose Space Deformation |
Type |
Journal Article |
|
Year |
2021 |
Publication |
ACM Transactions on Graphics |
Abbreviated Journal |
|
|
|
Volume |
40 |
Issue |
6 |
Pages |
1-14 |
|
|
Keywords |
|
|
|
Abstract |
We present a methodology to automatically obtain Pose Space Deformation (PSD) basis for rigged garments through deep learning. Classical approaches rely on Physically Based Simulations (PBS) to animate clothes. These are general solutions that, given a sufficiently fine-grained discretization of space and time, can achieve highly realistic results. However, they are computationally expensive and any scene modification prompts the need of re-simulation. Linear Blend Skinning (LBS) with PSD offers a lightweight alternative to PBS, though, it needs huge volumes of data to learn proper PSD. We propose using deep learning, formulated as an implicit PBS, to unsupervisedly learn realistic cloth Pose Space Deformations in a constrained scenario: dressed humans. Furthermore, we show it is possible to train these models in an amount of time comparable to a PBS of a few sequences. To the best of our knowledge, we are the first to propose a neural simulator for cloth.
While deep-based approaches in the domain are becoming a trend, these are data-hungry models. Moreover, authors often propose complex formulations to better learn wrinkles from PBS data. Supervised learning leads to physically inconsistent predictions that require collision solving to be used. Also, dependency on PBS data limits the scalability of these solutions, while their formulation hinders its applicability and compatibility. By proposing an unsupervised methodology to learn PSD for LBS models (3D animation standard), we overcome both of these drawbacks. Results obtained show cloth-consistency in the animated garments and meaningful pose-dependant folds and wrinkles. Our solution is extremely efficient, handles multiple layers of cloth, allows unsupervised outfit resizing and can be easily applied to any custom 3D avatar. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
HUPBA; no proj |
Approved |
no |
|
|
Call Number |
Admin @ si @ BME2021c |
Serial |
3643 |
|
Permanent link to this record |
|
|
|
|
Author |
Hugo Bertiche; Meysam Madadi; Emilio Tylson; Sergio Escalera |
|
|
Title |
DeePSD: Automatic Deep Skinning And Pose Space Deformation For 3D Garment Animation |
Type |
Conference Article |
|
Year |
2021 |
Publication |
19th IEEE International Conference on Computer Vision |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
5471-5480 |
|
|
Keywords |
|
|
|
Abstract |
We present a novel solution to the garment animation problem through deep learning. Our contribution allows animating any template outfit with arbitrary topology and geometric complexity. Recent works develop models for garment edition, resizing and animation at the same time by leveraging the support body model (encoding garments as body homotopies). This leads to complex engineering solutions that suffer from scalability, applicability and compatibility. By limiting our scope to garment animation only, we are able to propose a simple model that can animate any outfit, independently of its topology, vertex order or connectivity. Our proposed architecture maps outfits to animated 3D models into the standard format for 3D animation (blend weights and blend shapes matrices), automatically providing of compatibility with any graphics engine. We also propose a methodology to complement supervised learning with an unsupervised physically based learning that implicitly solves collisions and enhances cloth quality. |
|
|
Address |
Virtual; October 2021 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICCV |
|
|
Notes |
HUPBA; no menciona |
Approved |
no |
|
|
Call Number |
Admin @ si @ BMT2021 |
Serial |
3606 |
|
Permanent link to this record |