LRM: Large Reconstruction Model for Single Image to 3D

ICLR 2024

Abstract
We propose the first Large Reconstruction Model (LRM) that predicts the 3D model of an object from a single input image within just 5 seconds. In contrast to many previous methods that are trained on small-scale datasets such as ShapeNet in a category-specific fashion, LRM adopts a highly scalable transformer-based architecture with 500 million learnable parameters to directly predict a neural radiance field (NeRF) from the input image. We train our model in an end-to-end manner on massive multi-view data containing around 1 million objects, including both synthetic renderings from Objaverse and real captures from MVImgNet. This combination of a high-capacity model and large-scale training data empowers our model to be highly generalizable and produce high-quality 3D reconstructions from various testing inputs, including real-world in-the-wild captures and images from generative models. Video demos and interactable 3D meshes can be found on this website: https://yiconghong.me/LRM/.
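The abstract describes a transformer that maps a single input image directly to a NeRF. Below is a minimal, hypothetical sketch of such an image-conditioned transformer in PyTorch; the ImageToNeRF name, the patch-embed encoder, the learned query tokens, and all sizes and layer counts are illustrative assumptions, since the abstract does not detail the actual 500M-parameter architecture.

import torch
import torch.nn as nn

# Hypothetical sketch: single image -> NeRF-style features via a transformer.
# All names, sizes, and layer counts are illustrative assumptions; they are
# not the paper's architecture.
class ImageToNeRF(nn.Module):
    def __init__(self, dim=512, n_tokens=3 * 32 * 32, n_layers=4):
        super().__init__()
        # Patchify the image into tokens (stand-in for a ViT-style encoder).
        self.patch_embed = nn.Conv2d(3, dim, kernel_size=16, stride=16)
        # Learnable queries that become the 3D scene representation.
        self.queries = nn.Parameter(torch.randn(n_tokens, dim))
        layer = nn.TransformerDecoderLayer(d_model=dim, nhead=8,
                                           batch_first=True)
        # Queries cross-attend to image tokens (image-conditioned decoding).
        self.decoder = nn.TransformerDecoder(layer, num_layers=n_layers)
        # Tiny MLP mapping a feature to (density, R, G, B); in a full NeRF
        # pipeline this would run on features sampled at 3D query points.
        self.nerf_head = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(),
                                       nn.Linear(dim, 4))

    def forward(self, image):                     # image: (B, 3, H, W)
        feats = self.patch_embed(image)           # (B, dim, H/16, W/16)
        feats = feats.flatten(2).transpose(1, 2)  # (B, N_img, dim)
        q = self.queries.expand(image.size(0), -1, -1)
        tokens = self.decoder(q, feats)           # (B, n_tokens, dim)
        return self.nerf_head(tokens)             # (B, n_tokens, 4)

model = ImageToNeRF()
out = model(torch.randn(1, 3, 256, 256))          # torch.Size([1, 3072, 4])

Consistent with the abstract's end-to-end training on multi-view data, such a model would be supervised by volume-rendering the predicted NeRF from known camera poses and comparing against the ground-truth views.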
Keywords
3D Reconstruction, Large-Scale, Transformers