FLDM-VTON: Faithful Latent Diffusion Model for Virtual Try-on
arXiv (2024)
Abstract
Despite their impressive generative performance, latent diffusion model-based
virtual try-on (VTON) methods lack faithfulness to crucial details of the
clothes, such as style, pattern, and text. To alleviate these issues, which stem
from the stochastic nature of diffusion and from latent-space supervision, we
propose a novel Faithful Latent Diffusion Model for VTON, termed FLDM-VTON.
FLDM-VTON improves
the conventional latent diffusion process in three major aspects. First, we
propose incorporating warped clothes as both the starting point and local
condition, supplying the model with faithful clothes priors. Second, we
introduce a novel clothes flattening network to constrain generated try-on
images, providing clothes-consistent faithful supervision. Third, we devise a
clothes-posterior sampling for faithful inference, further enhancing the model
performance over conventional clothes-agnostic Gaussian sampling. Extensive
experimental results on the benchmark VITON-HD and Dress Code datasets
demonstrate that our FLDM-VTON outperforms state-of-the-art baselines and is
able to generate photo-realistic try-on images with faithful clothing details.
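The idea of using warped clothes as the starting point, rather than a clothes-agnostic Gaussian draw, can be sketched as below. This is a minimal illustration under standard DDPM notation, not the paper's implementation; the function name and signature are hypothetical.

```python
import numpy as np

def clothes_posterior_start(z_warped, alpha_bar_T, rng):
    """Hedged sketch: instead of sampling the initial latent from a
    clothes-agnostic standard Gaussian, noise the warped-clothes latent
    z_warped to the final diffusion step T using the forward-process
    marginal x_T = sqrt(alpha_bar_T) * z_0 + sqrt(1 - alpha_bar_T) * eps.
    All names here are illustrative, not taken from the paper."""
    eps = rng.standard_normal(z_warped.shape)  # fresh Gaussian noise
    return np.sqrt(alpha_bar_T) * z_warped + np.sqrt(1.0 - alpha_bar_T) * eps
```

Starting the reverse process from this clothes-informed latent (rather than pure noise) is what lets the sampler retain faithful clothes priors such as pattern and text.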