Author Oriol Ramos Terrades; N. Serrano; Albert Gordo; Ernest Valveny; Alfons Juan-Ciscar
  Title Interactive-predictive detection of handwritten text blocks Type Conference Article
  Year 2010 Publication 17th Document Recognition and Retrieval Conference, part of the IS&T-SPIE Electronic Imaging Symposium Abbreviated Journal  
  Volume 7534 Issue Pages 75340Q–75340Q–10
  Abstract A method for text block detection is introduced for old handwritten documents. The proposed method takes advantage of sequential book structure, taking into account layout information from pages previously transcribed. This glance at the past is used to predict the position of text blocks in the current page with the help of conventional layout analysis methods. The method is integrated into the GIDOC prototype: a first attempt to provide integrated support for interactive-predictive page layout analysis, text line detection and handwritten text transcription. Results are given in a transcription task on a 764-page Spanish manuscript from 1891.  
  Area Expedition Conference DRR  
  Notes DAG Approved no  
  Call Number DAG @ dag @ TSG2010 Serial 1479  
 
Author Matthias Eisenmann; Annika Reinke; Vivienn Weru; Minu D. Tizabi; Fabian Isensee; Tim J. Adler; Sharib Ali; Vincent Andrearczyk; Marc Aubreville; Ujjwal Baid; Spyridon Bakas; Niranjan Balu; Sophia Bano; Jorge Bernal; Sebastian Bodenstedt; Alessandro Casella; Veronika Cheplygina; Marie Daum; Marleen de Bruijne
  Title Why Is the Winner the Best? Type Conference Article
  Year 2023 Publication Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Abbreviated Journal  
  Volume Issue Pages 19955-19966
  Abstract International benchmarking competitions have become fundamental for the comparative performance assessment of image analysis methods. However, little attention has been given to investigating what can be learnt from these competitions. Do they really generate scientific progress? What are common and successful participation strategies? What makes a solution superior to a competing method? To address this gap in the literature, we performed a multi-center study with all 80 competitions that were conducted in the scope of IEEE ISBI 2021 and MICCAI 2021. Statistical analyses performed based on comprehensive descriptions of the submitted algorithms linked to their rank as well as the underlying participation strategies revealed common characteristics of winning solutions. These typically include the use of multi-task learning (63%) and/or multi-stage pipelines (61%), and a focus on augmentation (100%), image preprocessing (97%), data curation (79%), and postprocessing (66%). The “typical” lead of a winning team is a computer scientist with a doctoral degree, five years of experience in biomedical image analysis, and four years of experience in deep learning. Two core general development strategies stood out for highly-ranked teams: the reflection of the metrics in the method design and the focus on analyzing and handling failure cases. According to the organizers, 43% of the winning algorithms exceeded the state of the art but only 11% completely solved the respective domain problem. The insights of our study could help researchers (1) improve algorithm development strategies when approaching new problems, and (2) focus on open research questions revealed by this work.  
  Address Vancouver; Canada; June 2023  
  Area Expedition Conference CVPR  
  Notes ISE Approved no  
  Call Number Admin @ si @ ERW2023 Serial 3842  
 
Author Jordy Van Landeghem; Ruben Tito; Lukasz Borchmann; Michal Pietruszka; Pawel Joziak; Rafal Powalski; Dawid Jurkiewicz; Mickael Coustaty; Bertrand Anckaert; Ernest Valveny; Matthew Blaschko; Sien Moens; Tomasz Stanislawek
  Title Document Understanding Dataset and Evaluation (DUDE) Type Conference Article
  Year 2023 Publication 20th IEEE International Conference on Computer Vision Abbreviated Journal  
  Volume Issue Pages 19528-19540
  Abstract We call on the Document AI (DocAI) community to re-evaluate current methodologies and embrace the challenge of creating more practically-oriented benchmarks. Document Understanding Dataset and Evaluation (DUDE) seeks to remediate the halted research progress in understanding visually-rich documents (VRDs). We present a new dataset with novelties related to types of questions, answers, and document layouts based on multi-industry, multi-domain, and multi-page VRDs of various origins and dates. Moreover, we are pushing the boundaries of current methods by creating multi-task and multi-domain evaluation setups that more accurately simulate real-world situations where powerful generalization and adaptation under low-resource settings are desired. DUDE aims to set a new standard as a more practical, long-standing benchmark for the community, and we hope that it will lead to future extensions and contributions that address real-world challenges. Finally, our work illustrates the importance of finding more efficient ways to model language, images, and layout in DocAI.  
  Address Paris; France; October 2023  
  Area Expedition Conference ICCV  
  Notes DAG Approved no  
  Call Number Admin @ si @ LTB2023 Serial 3948  
 
Author Bojana Gajic; Ramon Baldrich
  Title Cross-domain fashion image retrieval Type Conference Article
  Year 2018 Publication CVPR 2018 Workshop on Women in Computer Vision (WiCV 2018, 4th Edition) Abbreviated Journal  
  Volume Issue Pages 19500-19502
  Abstract Cross domain image retrieval is a challenging task that implies matching images from one domain to their pairs from another domain. In this paper we focus on fashion image retrieval, which involves matching an image of a fashion item taken by users to the images of the same item taken in controlled conditions, usually by a professional photographer. When facing this problem, we have different products in train and test time, and we use triplet loss to train the network. We stress the importance of proper training of a simple architecture, as well as adapting general models to the specific task.
  Address Salt Lake City, USA; 22 June 2018  
  Area Expedition Conference CVPRW  
  Notes CIC; 600.087 Approved no  
  Call Number Admin @ si @ Serial 3709  
 
Author Yaxing Wang; Hector Laria Mantecon; Joost Van de Weijer; Laura Lopez-Fuentes; Bogdan Raducanu
  Title TransferI2I: Transfer Learning for Image-to-Image Translation from Small Datasets Type Conference Article
  Year 2021 Publication 19th IEEE International Conference on Computer Vision Abbreviated Journal  
  Volume Issue Pages 13990-13999
  Abstract Image-to-image (I2I) translation has matured in recent years and is able to generate high-quality realistic images. However, despite current success, it still faces important challenges when applied to small domains. Existing methods use transfer learning for I2I translation, but they still require the learning of millions of parameters from scratch. This drawback severely limits its application on small domains. In this paper, we propose a new transfer learning for I2I translation (TransferI2I). We decouple our learning process into the image generation step and the I2I translation step. In the first step we propose two novel techniques: source-target initialization and self-initialization of the adaptor layer. The former finetunes the pretrained generative model (e.g., StyleGAN) on source and target data. The latter allows initializing all non-pretrained network parameters without the need of any data. These techniques provide a better initialization for the I2I translation step. In addition, we introduce an auxiliary GAN that further facilitates the training of deep I2I systems even from small datasets. In extensive experiments on three datasets (Animal faces, Birds, and Foods), we show that we outperform existing methods and that mFID improves on several datasets by over 25 points.
  Address Virtual; October 2021  
  Area Expedition Conference ICCV  
  Notes LAMP; 600.147; 602.200; 600.120 Approved no  
  Call Number Admin @ si @ WLW2021 Serial 3604  
 
Author Senmao Li; Joost Van de Weijer; Yaxing Wang; Fahad Shahbaz Khan; Meiqin Liu; Jian Yang
  Title 3D-Aware Multi-Class Image-to-Image Translation with NeRFs Type Conference Article
  Year 2023 Publication 36th IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal  
  Volume Issue Pages 12652-12662
  Abstract Recent advances in 3D-aware generative models (3D-aware GANs) combined with Neural Radiance Fields (NeRF) have achieved impressive results. However, no prior works investigate 3D-aware GANs for 3D consistent multiclass image-to-image (3D-aware I2I) translation. Naively using 2D-I2I translation methods suffers from unrealistic shape/identity change. To perform 3D-aware multiclass I2I translation, we decouple this learning process into a multiclass 3D-aware GAN step and a 3D-aware I2I translation step. In the first step, we propose two novel techniques: a new conditional architecture and an effective training strategy. In the second step, based on the well-trained multiclass 3D-aware GAN architecture that preserves view-consistency, we construct a 3D-aware I2I translation system. To further reduce the view-consistency problems, we propose several new techniques, including a U-net-like adaptor network design, a hierarchical representation constraint and a relative regularization loss. In extensive experiments on two datasets, quantitative and qualitative results demonstrate that we successfully perform 3D-aware I2I translation with multi-view consistency. Code is available in 3DI2I.
  Address Vancouver; Canada; June 2023  
  Area Expedition Conference CVPR  
  Notes LAMP Approved no  
  Call Number Admin @ si @ LWW2023b Serial 3920  
 
Author Ali Furkan Biten; Lluis Gomez; Marçal Rusiñol; Dimosthenis Karatzas
  Title Good News, Everyone! Context driven entity-aware captioning for news images Type Conference Article
  Year 2019 Publication 32nd IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal  
  Volume Issue Pages 12458-12467
  Abstract Current image captioning systems perform at a merely descriptive level, essentially enumerating the objects in the scene and their relations. Humans, on the contrary, interpret images by integrating several sources of prior knowledge of the world. In this work, we aim to take a step closer to producing captions that offer a plausible interpretation of the scene, by integrating such contextual information into the captioning pipeline. For this we focus on the captioning of images used to illustrate news articles. We propose a novel captioning method that is able to leverage contextual information provided by the text of news articles associated with an image. Our model is able to selectively draw information from the article guided by visual cues, and to dynamically extend the output dictionary to out-of-vocabulary named entities that appear in the context source. Furthermore, we introduce “GoodNews”, the largest news image captioning dataset in the literature and demonstrate state-of-the-art results.
  Address Long Beach; California; USA; June 2019
  Area Expedition Conference CVPR  
  Notes DAG; 600.129; 600.135; 601.338; 600.121 Approved no  
  Call Number Admin @ si @ BGR2019 Serial 3289  
 
Author Yuyang Liu; Yang Cong; Dipam Goswami; Xialei Liu; Joost Van de Weijer
  Title Augmented Box Replay: Overcoming Foreground Shift for Incremental Object Detection Type Conference Article
  Year 2023 Publication 20th IEEE International Conference on Computer Vision Abbreviated Journal  
  Volume Issue Pages 11367-11377
  Abstract In incremental learning, replaying stored samples from previous tasks together with current task samples is one of the most efficient approaches to address catastrophic forgetting. However, unlike incremental classification, image replay has not been successfully applied to incremental object detection (IOD). In this paper, we identify the overlooked problem of foreground shift as the main reason for this. Foreground shift only occurs when replaying images of previous tasks and refers to the fact that their background might contain foreground objects of the current task. To overcome this problem, a novel and efficient Augmented Box Replay (ABR) method is developed that only stores and replays foreground objects and thereby circumvents the foreground shift problem. In addition, we propose an innovative Attentive RoI Distillation loss that uses spatial attention from region-of-interest (RoI) features to constrain the current model to focus on the most important information from the old model. ABR significantly reduces forgetting of previous classes while maintaining high plasticity in current classes. Moreover, it considerably reduces the storage requirements when compared to standard image replay. Comprehensive experiments on Pascal-VOC and COCO datasets support the state-of-the-art performance of our model.
  Address Paris; France; October 2023  
  Area Expedition Conference ICCV  
  Notes LAMP Approved no  
  Call Number Admin @ si @ LCG2023 Serial 3949  
 
Author Alejandro Cartas; Petia Radeva; Mariella Dimiccoli
  Title Modeling long-term interactions to enhance action recognition Type Conference Article
  Year 2021 Publication 25th International Conference on Pattern Recognition Abbreviated Journal  
  Volume Issue Pages 10351-10358
  Abstract In this paper, we propose a new approach to understand actions in egocentric videos that exploits the semantics of object interactions at both frame and temporal levels. At the frame level, we use a region-based approach that takes as input a primary region roughly corresponding to the user's hands and a set of secondary regions potentially corresponding to the interacting objects and calculates the action score through a CNN formulation. This information is then fed to a Hierarchical Long Short-Term Memory Network (HLSTM) that captures temporal dependencies between actions within and across shots. Ablation studies thoroughly validate the proposed approach, showing in particular that both levels of the HLSTM architecture contribute to performance improvement. Furthermore, quantitative comparisons show that the proposed approach outperforms the state-of-the-art in terms of action recognition on standard benchmarks, without relying on motion information.
  Address January 2021  
  Area Expedition Conference ICPR  
  Notes MILAB; Approved no  
  Call Number Admin @ si @ CRD2021 Serial 3626  
 
Author Swathikiran Sudhakaran; Sergio Escalera; Oswald Lanz
  Title LSTA: Long Short-Term Attention for Egocentric Action Recognition Type Conference Article
  Year 2019 Publication 32nd IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal  
  Volume Issue Pages 9946-9955
  Abstract Egocentric activity recognition is one of the most challenging tasks in video analysis. It requires a fine-grained discrimination of small objects and their manipulation. While some methods are based on strong supervision and attention mechanisms, they are either annotation consuming or do not take spatio-temporal patterns into account. In this paper we propose LSTA as a mechanism to focus on features from spatially relevant parts while attention is being tracked smoothly across the video sequence. We demonstrate the effectiveness of LSTA on egocentric activity recognition with an end-to-end trainable two-stream architecture, achieving state-of-the-art performance on four standard benchmarks.
  Address California; June 2019  
  Area Expedition Conference CVPR  
  Notes HuPBA; no proj Approved no  
  Call Number Admin @ si @ SEL2019 Serial 3333  
 
Author Siyang Song; Micol Spitale; Cheng Luo; German Barquero; Cristina Palmero; Sergio Escalera; Michel Valstar; Tobias Baur; Fabien Ringeval; Elisabeth Andre; Hatice Gunes
  Title REACT2023: The First Multiple Appropriate Facial Reaction Generation Challenge Type Conference Article
  Year 2023 Publication Proceedings of the 31st ACM International Conference on Multimedia Abbreviated Journal  
  Volume Issue Pages 9620–9624
  Abstract The Multiple Appropriate Facial Reaction Generation Challenge (REACT2023) is the first competition event focused on evaluating multimedia processing and machine learning techniques for generating human-appropriate facial reactions in various dyadic interaction scenarios, with all participants competing strictly under the same conditions. The goal of the challenge is to provide the first benchmark test set for multi-modal information processing and to foster collaboration among the audio, visual, and audio-visual behaviour analysis and behaviour generation (a.k.a. generative AI) communities, to compare the relative merits of the approaches to automatic appropriate facial reaction generation under different spontaneous dyadic interaction conditions. This paper presents: (i) the novelties, contributions and guidelines of the REACT2023 challenge; (ii) the dataset utilized in the challenge; and (iii) the performance of the baseline systems on the two proposed sub-challenges: Offline Multiple Appropriate Facial Reaction Generation and Online Multiple Appropriate Facial Reaction Generation, respectively. The challenge baseline code is publicly available at https://github.com/reactmultimodalchallenge/baseline_react2023.
  Address Ottawa; Canada; October 2023
  Area Expedition Conference MM  
  Notes HUPBA Approved no  
  Call Number Admin @ si @ SSL2023 Serial 3931  
 
Author Felipe Codevilla; Eder Santana; Antonio Lopez; Adrien Gaidon
  Title Exploring the Limitations of Behavior Cloning for Autonomous Driving Type Conference Article
  Year 2019 Publication 18th IEEE International Conference on Computer Vision Abbreviated Journal  
  Volume Issue Pages 9328-9337
  Abstract Driving requires reacting to a wide variety of complex environment conditions and agent behaviors. Explicitly modeling each possible scenario is unrealistic. In contrast, imitation learning can, in theory, leverage data from large fleets of human-driven cars. Behavior cloning in particular has been successfully used to learn simple visuomotor policies end-to-end, but scaling to the full spectrum of driving behaviors remains an unsolved problem. In this paper, we propose a new benchmark to experimentally investigate the scalability and limitations of behavior cloning. We show that behavior cloning leads to state-of-the-art results, executing complex lateral and longitudinal maneuvers, even in unseen environments, without being explicitly programmed to do so. However, we confirm some limitations of the behavior cloning approach: some well-known limitations (e.g., dataset bias and overfitting), new generalization issues (e.g., dynamic objects and the lack of causal modeling), and training instabilities, all requiring further research before behavior cloning can graduate to real-world driving. The code, dataset, benchmark, and agent studied in this paper can be found at GitHub.
  Address Seoul; Korea; October 2019
  Area Expedition Conference ICCV  
  Notes ADAS; 600.124; 600.118 Approved no  
  Call Number Admin @ si @ CSL2019 Serial 3322  
 
Author Shiqi Yang; Yaxing Wang; Joost Van de Weijer; Luis Herranz; Shangling Jui
  Title Generalized Source-free Domain Adaptation Type Conference Article
  Year 2021 Publication 19th IEEE International Conference on Computer Vision Abbreviated Journal  
  Volume Issue Pages 8958-8967
  Abstract Domain adaptation (DA) aims to transfer the knowledge learned from a source domain to an unlabeled target domain. Some recent works tackle source-free domain adaptation (SFDA) where only a source pre-trained model is available for adaptation to the target domain. However, those methods do not consider keeping source performance, which is of high practical value in real-world applications. In this paper, we propose a new domain adaptation paradigm called Generalized Source-free Domain Adaptation (G-SFDA), where the learned model needs to perform well on both the target and source domains, with only access to current unlabeled target data during adaptation. First, we propose local structure clustering (LSC), aiming to cluster the target features with their semantically similar neighbors, which successfully adapts the model to the target domain in the absence of source data. Second, we propose sparse domain attention (SDA), which produces a binary domain-specific attention to activate different feature channels for different domains; meanwhile, the domain attention is utilized to regularize the gradient during adaptation to keep source information. In the experiments, for target performance our method is on par with or better than existing DA and SFDA methods; specifically, it achieves state-of-the-art performance (85.4%) on VisDA, and our method works well for all domains after adapting to single or multiple target domains.
  Address Virtual; October 2021  
  Area Expedition Conference ICCV
  Notes LAMP; 600.120; 600.147 Approved no  
  Call Number Admin @ si @ YWW2021 Serial 3605  
 
Author David Berga; Xose R. Fernandez-Vidal; Xavier Otazu; Xose M. Pardo
  Title SID4VAM: A Benchmark Dataset with Synthetic Images for Visual Attention Modeling Type Conference Article
  Year 2019 Publication 18th IEEE International Conference on Computer Vision Abbreviated Journal  
  Volume Issue Pages 8788-8797
  Abstract A benchmark of saliency model performance with a synthetic image dataset is provided. Model performance is evaluated through saliency metrics as well as the influence of model inspiration and consistency with human psychophysics. SID4VAM is composed of 230 synthetic images, with known salient regions. Images were generated with 15 distinct types of low-level features (e.g. orientation, brightness, color, size...) with a target-distractor popout type of synthetic patterns. We have used Free-Viewing and Visual Search task instructions and 7 feature contrasts for each feature category. Our study reveals that state-of-the-art Deep Learning saliency models do not perform well with synthetic pattern images; instead, models with Spectral/Fourier inspiration outperform others in saliency metrics and are more consistent with human psychophysical experimentation. This study proposes a new way to evaluate saliency models in the forthcoming literature, accounting for synthetic images with uniquely low-level feature contexts, distinct from previous eye tracking image datasets.
  Address Seoul; Korea; October 2019
  Area Expedition Conference ICCV  
  Notes NEUROBIT; 600.128 Approved no  
  Call Number Admin @ si @ BFO2019b Serial 3372  
 
Author Hunor Laczko; Meysam Madadi; Sergio Escalera; Jordi Gonzalez
  Title A Generative Multi-Resolution Pyramid and Normal-Conditioning 3D Cloth Draping Type Conference Article
  Year 2024 Publication Winter Conference on Applications of Computer Vision Abbreviated Journal  
  Volume Issue Pages 8709-8718
  Abstract RGB cloth generation has been deeply studied in the related literature; however, 3D garment generation remains an open problem. In this paper, we build a conditional variational autoencoder for 3D garment generation and draping. We propose a pyramid network to add garment details progressively in a canonical space, i.e. unposing and unshaping the garments w.r.t. the body. We study conditioning the network on surface normal UV maps, as an intermediate representation, which is an easier problem to optimize than 3D coordinates. Our results on two public datasets, CLOTH3D and CAPE, show that our model is robust, controllable in terms of detail generation by the use of multi-resolution pyramids, and achieves state-of-the-art results that can highly generalize to unseen garments, poses, and shapes even when training with small amounts of data.
  Address Waikoloa; Hawaii; USA; January 2024
  Area Expedition Conference WACV  
  Notes ISE; HUPBA Approved no  
  Call Number Admin @ si @ LME2024 Serial 3996  