Instruct-GS2GS

Editing 3D Gaussian Splatting Scenes with Instructions

UC Berkeley

TL;DR: Instruct-GS2GS enables instruction-based editing of Gaussian Splatting scenes via a 2D diffusion model

Overview

We propose a method for editing 3D Gaussian Splatting (3DGS) scenes with text instructions. Given a 3DGS reconstruction of a scene and the collection of images used to create it, our method uses an image-conditioned diffusion model (InstructPix2Pix) to iteratively edit the input images while optimizing the underlying scene, producing an edited 3D scene that respects the text instruction. We demonstrate that our method can edit large-scale, real-world scenes and accomplish realistic, targeted edits.

How it works

Our method gradually updates a reconstructed 3DGS scene by iteratively editing the dataset images while continuing to train the 3DGS (a minimal sketch of this loop follows the list):

  1. Images are rendered from the scene at training viewpoints.
  2. Each rendered image is edited by InstructPix2Pix, conditioned on a global text instruction.
  3. The training dataset images are replaced with the edited images.
  4. The 3DGS continues training as usual for 2.5k iterations.
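The sketch below illustrates this dataset-update loop; it is not the released implementation. The InstructPix2Pix call uses the publicly available diffusers pipeline, while the `trainer` object and its `train_view_indices`, `render_at_view`, `replace_image`, and `train` members are hypothetical placeholders for whatever 3DGS trainer (e.g., Nerfstudio/gsplat) is used.

```python
# Minimal, hypothetical sketch of the Instruct-GS2GS dataset-update loop.
# The InstructPix2Pix pipeline below is the real `diffusers` API; the
# `trainer` interface is an assumed stand-in for a 3DGS trainer.
import torch
from diffusers import StableDiffusionInstructPix2PixPipeline

pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    "timbrooks/instruct-pix2pix", torch_dtype=torch.float16
).to("cuda")

instruction = "Turn him into a firefighter"  # global edit instruction


def edit_dataset_and_train(trainer, num_rounds=10, iters_per_round=2500):
    """One Instruct-GS2GS-style cycle: render, edit, swap, keep training."""
    for _ in range(num_rounds):
        for view_idx in trainer.train_view_indices:  # hypothetical attribute
            # 1. Render the current 3DGS scene at a training viewpoint
            #    (assumed to return a PIL image).
            rendered = trainer.render_at_view(view_idx)

            # 2. Edit the render with InstructPix2Pix using the global instruction.
            edited = pipe(
                instruction,
                image=rendered,
                num_inference_steps=20,
                image_guidance_scale=1.5,
                guidance_scale=7.5,
            ).images[0]

            # 3. Replace the training image for this viewpoint with the edit.
            trainer.replace_image(view_idx, edited)  # hypothetical

        # 4. Continue standard 3DGS optimization on the edited dataset.
        trainer.train(num_iterations=iters_per_round)  # hypothetical
```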

Results

Training Progression

Timelapse of training with Instruct-GS2GS and Instruct-NeRF2NeRF. We note that Instruct-GS2GS performs edits much faster.

Citation

If you use this work or find it helpful, please consider citing:

@misc{igs2gs,
  author = {Vachha, Cyrus and Haque, Ayaan},
  title = {Instruct-GS2GS: Editing 3D Gaussian Splats with Instructions},
  year = {2024},
  url = {https://instruct-gs2gs.github.io/}
}

Acknowledgments

We thank our instructors Alexei A. Efros and Angjoo Kanazawa for their support. This work was completed as a course project for CS180/280A. We would also like to thank the Nerfstudio and gsplat teams for providing the 3D Gaussian Splatting implementation. We thank the authors of Instruct-NeRF2NeRF for their paper and website format.