
Feature 3DGS: Supercharging 3D Gaussian Splatting to Enable Distilled Feature Fields

2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024)

3D scene representations have gained immense popularity in recent years. Methods that use Neural Radiance Fields are versatile for traditional tasks such as novel view synthesis. In recent times, some work has emerged that aims to extend the functionality of NeRF beyond view synthesis, for semantically aware tasks such as editing and segmentation using 3D feature field distillation from 2D foundation models. However, these methods have two major limitations: (a) they are limited by the rendering speed of NeRF pipelines, and (b) implicitly represented feature fields suffer from continuity artifacts reducing feature quality. Recently, 3D Gaussian Splatting has shown state-of-the-art performance on real-time radiance field rendering. In this work, we go one step further: in addition to radiance field rendering, we enable 3D Gaussian Splatting on arbitrary-dimension semantic features via 2D foundation model distillation. This translation is not straightforward: naively incorporating feature fields in the 3DGS framework encounters significant challenges, notably the disparities in spatial resolution and channel consistency between RGB images and feature maps. We propose architectural and training changes to efficiently avert this problem. Our proposed method is general, and our experiments showcase novel view semantic segmentation, language-guided editing and segment anything through learning feature fields from state-of-the-art 2D foundation models such as SAM and CLIP-LSeg. Across experiments, our distillation method is able to provide comparable or better results, while being significantly faster to both train and render. Additionally, to the best of our knowledge, we are the first method to enable point and bounding-box prompting for radiance field manipulation, by leveraging the SAM model. Project website at:
