Distilling Diffusion Models into Conditional GANs
arXiv (2024)
Abstract
We propose a method to distill a complex multistep diffusion model into a
single-step conditional GAN student model, dramatically accelerating inference,
while preserving image quality. Our approach interprets diffusion distillation
as a paired image-to-image translation task, using noise-to-image pairs of the
diffusion model's ODE trajectory. For efficient regression loss computation, we
propose E-LatentLPIPS, a perceptual loss operating directly in the diffusion
model's latent space, utilizing an ensemble of augmentations. Furthermore, we
adapt a diffusion model to construct a multi-scale discriminator with a text
alignment loss to build an effective conditional GAN-based formulation.
E-LatentLPIPS converges more efficiently than many existing distillation
methods, even accounting for dataset construction costs. We demonstrate that
our one-step generator outperforms cutting-edge one-step diffusion distillation
models - DMD, SDXL-Turbo, and SDXL-Lightning - on the zero-shot COCO benchmark.
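The abstract's E-LatentLPIPS idea — an ensembled perceptual-style loss computed directly on latents, where each sample applies the same random augmentation to both inputs before measuring feature distance — can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the augmentations (flip, shift), the latent shape, and the fixed random projection standing in for learned LPIPS features are all hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def shared_augment(a, b, seed):
    """Apply the SAME random augmentation to both latents (flip/shift here;
    placeholders for the paper's ensemble of augmentations)."""
    r = np.random.default_rng(seed)
    if r.random() < 0.5:           # random horizontal flip
        a, b = a[..., ::-1], b[..., ::-1]
    shift = int(r.integers(-2, 3)) # small random translation along width
    return np.roll(a, shift, axis=-1), np.roll(b, shift, axis=-1)

def feature_distance(a, b, proj):
    """Stand-in for learned perceptual features: fixed channel projection + MSE.
    A real LatentLPIPS would use a network trained in latent space."""
    fa = np.tensordot(proj, a, axes=([1], [0]))
    fb = np.tensordot(proj, b, axes=([1], [0]))
    return float(np.mean((fa - fb) ** 2))

def e_latent_lpips(x, y, n_aug=8):
    """Average the latent-space distance over an ensemble of shared augmentations."""
    proj = np.random.default_rng(42).normal(size=(16, x.shape[0]))
    return float(np.mean([feature_distance(*shared_augment(x, y, s), proj)
                          for s in range(n_aug)]))

# Hypothetical SD-style 4-channel latents
x = rng.normal(size=(4, 64, 64))
y_close = x + 0.1 * rng.normal(size=x.shape)   # near the target latent
y_far = rng.normal(size=x.shape)               # unrelated latent
print(e_latent_lpips(x, y_close) < e_latent_lpips(x, y_far))  # → True
```

The key property the sketch preserves is that the augmentation is sampled once per ensemble member and applied identically to both latents, so the loss compares corresponding content; averaging over augmentations is what makes the single-sample latent loss a better-behaved regression target.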