AnimateLCM: Accelerating the Animation of Personalized Diffusion Models and Adapters with Decoupled Consistency Learning
CoRR(2024)
Abstract
Video diffusion models has been gaining increasing attention for its ability
to produce videos that are both coherent and of high fidelity. However, the
iterative denoising process makes it computationally intensive and
time-consuming, thus limiting its applications. Inspired by the Consistency
Model (CM) that distills pretrained image diffusion models to accelerate the
sampling with minimal steps and its successful extension Latent Consistency
Model (LCM) on conditional image generation, we propose AnimateLCM, allowing
for high-fidelity video generation within minimal steps. Instead of directly
conducting consistency learning on the raw video dataset, we propose a
decoupled consistency learning strategy that decouples the distillation of
image generation priors and motion generation priors, which improves the
training efficiency and enhance the generation visual quality. Additionally, to
enable the combination of plug-and-play adapters in stable diffusion community
to achieve various functions (e.g., ControlNet for controllable generation). we
propose an efficient strategy to adapt existing adapters to our distilled
text-conditioned video consistency model or train adapters from scratch without
harming the sampling speed. We validate the proposed strategy in
image-conditioned video generation and layout-conditioned video generation, all
achieving top-performing results. Experimental results validate the
effectiveness of our proposed method. Code and weights will be made public.
More details are available at https://github.com/G-U-N/AnimateLCM.
MoreTranslated text
AI Read Science
Must-Reading Tree
Example
Generate MRT to find the research sequence of this paper
Chat Paper
Summary is being generated by the instructions you defined