You Only Sample Once: Taming One-Step Text-To-Image Synthesis by Self-Cooperative Diffusion GANs
CoRR (2024)
Abstract
We introduce YOSO, a novel generative model designed for rapid, scalable, and
high-fidelity one-step image synthesis. This is achieved by integrating the
diffusion process with GANs. Specifically, we smooth the distribution by the
denoising generator itself, performing self-cooperative learning. We show that
our method can serve as a one-step generation model trained from scratch with
competitive performance. Moreover, we show that our method can be extended to
fine-tune pre-trained text-to-image diffusion models for high-quality one-step
text-to-image synthesis even with LoRA fine-tuning. In particular, we provide
the first diffusion transformer that can generate images in one step, trained at
512 resolution, with the capability of adapting to 1024 resolution without
explicit training. Our code is provided at https://github.com/Luo-Yihong/YOSO.