SC4D: Sparse-Controlled Video-to-4D Generation and Motion Transfer
CoRR (2024)
Abstract
Recent advances in 2D/3D generative models enable the generation of dynamic
3D objects from a single-view video. Existing approaches utilize score
distillation sampling to represent the dynamic scene as a dynamic NeRF or as
dense 3D Gaussians. However, these methods struggle to strike a balance among
reference-view alignment, spatio-temporal consistency, and motion fidelity
under single-view conditions, owing to the implicit nature of NeRF or the
intricacy of predicting motion for dense Gaussians. To address these issues,
this paper proposes an efficient, sparse-controlled video-to-4D framework
named SC4D that decouples motion and appearance to achieve superior
video-to-4D generation. Moreover, we introduce Adaptive Gaussian (AG)
initialization and a Gaussian Alignment (GA) loss to mitigate the shape
degeneration issue, ensuring the fidelity
of the learned motion and shape. Comprehensive experimental results demonstrate
that our method surpasses existing methods in both quality and efficiency. In
addition, facilitated by SC4D's disentangled modeling of motion and appearance,
we devise a novel application that seamlessly transfers the learned
motion onto a diverse array of 4D entities according to textual descriptions.
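To make the sparse-control idea concrete, below is a minimal sketch, not the authors' implementation: a small set of control points carries the motion, and each dense Gaussian is displaced by blending the displacements of its nearest control points, so motion lives in the sparse controls while appearance stays with the dense Gaussians. The function names, the k-nearest-neighbor scheme, and the RBF weighting are all illustrative assumptions; the abstract does not specify these details.

```python
# Hypothetical sketch of sparse-controlled Gaussian deformation (assumed details).
import torch

def interpolation_weights(gaussians, controls, k=4, sigma=0.1):
    # Squared distances from every dense Gaussian to every control point.
    d2 = torch.cdist(gaussians, controls) ** 2           # (N, M)
    knn_d2, knn_idx = d2.topk(k, dim=1, largest=False)   # k nearest controls
    w = torch.exp(-knn_d2 / (2 * sigma ** 2))            # Gaussian RBF weights
    return w / w.sum(dim=1, keepdim=True), knn_idx       # normalized, (N, k)

def deform(gaussians, control_disp, w, knn_idx):
    # Blend each Gaussian's nearby control displacements and apply them.
    blended = (w.unsqueeze(-1) * control_disp[knn_idx]).sum(dim=1)  # (N, 3)
    return gaussians + blended

# Toy usage: 10k dense Gaussians driven by 512 control points at one frame.
g = torch.rand(10_000, 3)             # dense Gaussian centers (appearance side)
c = torch.rand(512, 3)                # sparse control points (motion side)
disp = 0.05 * torch.randn(512, 3)     # per-control displacement for this frame
w, idx = interpolation_weights(g, c)
print(deform(g, disp, w, idx).shape)  # torch.Size([10000, 3])
```

Under this kind of decoupling, motion transfer amounts to keeping the learned control-point trajectories while swapping in a different set of dense Gaussians for appearance, which is consistent with the application the abstract describes.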