Triplane Meets Gaussian Splatting: Fast and Generalizable Single-View 3D Reconstruction with Transformers
CoRR(2023)
摘要
Recent advancements in 3D reconstruction from single images have been driven
by the evolution of generative models. Prominent among these are methods based
on Score Distillation Sampling (SDS) and the adaptation of diffusion models in
the 3D domain. Despite their progress, these techniques often face limitations
due to slow optimization or rendering processes, leading to extensive training
and optimization times. In this paper, we introduce a novel approach for
single-view reconstruction that efficiently generates a 3D model from a single
image via feed-forward inference. Our method utilizes two transformer-based
networks, namely a point decoder and a triplane decoder, to reconstruct 3D
objects using a hybrid Triplane-Gaussian intermediate representation. This
hybrid representation strikes a balance, achieving a faster rendering speed
compared to implicit representations while simultaneously delivering superior
rendering quality than explicit representations. The point decoder is designed
for generating point clouds from single images, offering an explicit
representation which is then utilized by the triplane decoder to query Gaussian
features for each point. This design choice addresses the challenges associated
with directly regressing explicit 3D Gaussian attributes characterized by their
non-structural nature. Subsequently, the 3D Gaussians are decoded by an MLP to
enable rapid rendering through splatting. Both decoders are built upon a
scalable, transformer-based architecture and have been efficiently trained on
large-scale 3D datasets. The evaluations conducted on both synthetic datasets
and real-world images demonstrate that our method not only achieves higher
quality but also ensures a faster runtime in comparison to previous
state-of-the-art techniques. Please see our project page at
https://zouzx.github.io/TriplaneGaussian/.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要