Language-driven Object Fusion into Neural Radiance Fields with Pose-Conditioned Dataset Updates
CVPR 2024
Abstract
A neural radiance field is an emerging rendering method that generates high-quality, multi-view-consistent images from a neural scene representation and volume rendering. Although neural radiance field-based techniques are robust for scene reconstruction, their ability to add or remove objects remains limited. This paper proposes a new language-driven approach for object manipulation with neural radiance fields through dataset updates. Specifically, to insert a new foreground object represented by a set of multi-view images into a background radiance field, we use a text-to-image diffusion model to learn and generate combined images that fuse the object of interest into the given background across views. These combined images are then used to refine the background radiance field so that we can render view-consistent images containing both the object and the background. To ensure view consistency, we propose a dataset update strategy that prioritizes radiance field training on camera views close to the already-trained views before propagating the training to the remaining views. We show that under the same dataset update strategy, we can easily adapt our method to object insertion using data from text-to-3D models, as well as to object removal. Experimental results show that our method generates photorealistic images of the edited scenes and outperforms state-of-the-art methods in 3D reconstruction and neural radiance field blending.
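The abstract specifies only that training prioritizes camera views close to already-trained views before propagating outward, without giving the exact scheduling rule. Below is a minimal sketch of one plausible greedy ordering under that description, assuming 4x4 camera-to-world pose matrices; the function names (`pose_distance`, `schedule_view_updates`) and the combined translation-plus-rotation distance are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def pose_distance(p1, p2):
    # Distance between two 4x4 camera-to-world poses: translation gap
    # plus the relative rotation angle (an assumed, illustrative metric).
    t_gap = np.linalg.norm(p1[:3, 3] - p2[:3, 3])
    cos_angle = (np.trace(p1[:3, :3].T @ p2[:3, :3]) - 1.0) / 2.0
    r_gap = np.arccos(np.clip(cos_angle, -1.0, 1.0))
    return t_gap + r_gap

def schedule_view_updates(poses, seed_idx):
    """Greedy ordering: at each step, pick the not-yet-updated view whose
    camera pose is closest to any already-scheduled (trained) view."""
    scheduled = [seed_idx]
    remaining = set(range(len(poses))) - {seed_idx}
    while remaining:
        nxt = min(remaining,
                  key=lambda i: min(pose_distance(poses[i], poses[j])
                                    for j in scheduled))
        scheduled.append(nxt)
        remaining.remove(nxt)
    return scheduled

# Usage sketch: walk the training set in this order, replacing each image
# with a diffusion-fused rendering before the next round of radiance-field
# fine-tuning, so each new view stays consistent with its nearest
# already-updated neighbors.
```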