
EditableNeRF: Editing Topologically Varying Neural Radiance Fields by Key Points

This is the code for "EditableNeRF: Editing Topologically Varying Neural Radiance Fields by Key Points", and the project page is here.

This code is built upon HyperNeRF.

Overview

  • Train a HyperNeRF and obtain rendering results.
  • Run data_process/k_points_detect.py to detect key points.
  • Run data_process/k_points_init_RAFT.py based on RAFT to initialize key points in 2d.
  • Run data_process/k_points_init_3d.py to initialize key points in 3d.
  • Run train.py for EditableNeRF training.
  • Run eval.py for EditableNeRF evaluation and rendering.

1. Train a HyperNeRF

Before running our method, you need to run HyperNeRF first, as many parts of our code rely on HyperNeRF's Python environments and rendering results.

Please follow HyperNeRF to set up Python environments, prepare your dataset (or use the dataset we provide), and train a HyperNeRF.

After training, you need to use this HyperNeRF model to render a set of results, which will be used as inputs to EditableNeRF.

You can also train a HyperNeRF using our EditableNeRF code by setting the config file to hypernerf_vrig_ds_2d.gin. The code for saving the following images is already included (lines 147-154 in our eval.py); a combined sketch is also given after the list below.

  1. Depth images: direct outputs of HyperNeRF. You can save these images using: np.save(f'your_path/depth_median_{item_id}.npy', model_out['med_depth']) where model_out is the returned dictionary of HyperNeRF (e.g., line 126, line 264 in HyperNeRF's eval.py). You should render them in both the original input camera views and a fixed view (e.g., the view of the first frame).

  2. Rendered maps for canonical points (hyperspace): for each pixel, the value in this map is the canonical coordinate of the corresponding surface point. Note that this coordinate is in hyperspace, and these maps only need to be rendered in a fixed view. You can save these maps using np.save(f'your_path/med_warped_points_{item_id}.npy', model_out['med_warped_points'].squeeze()), after replacing HyperNeRF's models.py with the one we provide at data_process/models.py (the differences are in lines 544-549).

  3. Rendered maps for 3d points: save these maps similarly using np.save(f'your_path/med_points_{item_id}.npy', model_out['med_points'].squeeze()), rendered in both the original input camera views and a fixed view.
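
Putting these together, a minimal sketch of the saving code inside HyperNeRF's evaluation loop might look like the following (model_out and item_id are as in HyperNeRF's eval.py; the output path is a placeholder you should adapt):

import numpy as np

# Inside HyperNeRF's evaluation loop, after model_out has been computed for
# the frame identified by item_id (see HyperNeRF's eval.py).
out_dir = 'your_path'  # placeholder; point this to your output folder

# 1. Median depth map.
np.save(f'{out_dir}/depth_median_{item_id}.npy', model_out['med_depth'])

# 2. Canonical (hyperspace) point map; requires the modified data_process/models.py.
np.save(f'{out_dir}/med_warped_points_{item_id}.npy',
        model_out['med_warped_points'].squeeze())

# 3. 3D point map.
np.save(f'{out_dir}/med_points_{item_id}.npy',
        model_out['med_points'].squeeze())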

We provide an example of these results in our dataset (in capture_dice_cup.zip); please refer to it and make sure your data is in the correct format and consistent with ours.

2. Key Point Detection and Initialization

Based on these rendering results, the next stage is to detect and initialize key points.

2.1 Key Point Detection

First, run data_process/k_points_detect.py using python k_points_detect.py (or using Jupyter) in the same Python environment used for the dataset processing stage of HyperNeRF. You need to change the parameters in k_points_detect.py (lines 16-20) to be consistent with those you used in HyperNeRF.

After that, you can find the detected key points and the visualization results in a new folder, data_process/kp_init. Please check these results before moving on to the next stage.

2.2 Key Point Initialization in 2D and Optical Flow Computing

Then, use RAFT to initialize key points in 2D and compute the optical flow that will be used in EditableNeRF training.

The data_process/k_points_init_RAFT.py file is an extension of RAFT's demo.py, and it can be run in the same way as demo.py, using the Python environment set up for RAFT:

python k_points_init_RAFT.py \
    --model=models/raft-things.pth \
    --path=input/rgb/1x \
    --kp_file=kp_init \
    --skip_prop=50 \
    --colmap_scale=4

where skip_prop is the frame number $M$ used for skipping propagation, as described in Sec. 3.3 of our paper. Set it to zero if you do not want to use skipping propagation. Note that here you should use the original input images without downsampling.

When it finishes, you will find a new file kp_2d_init.txt in data_process/kp_init, which contains the initialized 2D key points, and a new folder data_process/flow, which contains the optical flow used in EditableNeRF training. Visualization results are also provided in data_process/out and data_process/kp_init/kp_visual; check them if anything looks wrong.
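
If you want to quickly sanity-check the generated flow files before training, a small snippet like the one below can help (the (H, W, 2) per-pixel displacement layout is an assumption based on standard RAFT output, not something the repository guarantees):

import glob
import numpy as np

# List the optical flow files produced by k_points_init_RAFT.py.
flow_files = sorted(glob.glob('data_process/flow/flow_*.npy'))
print(f'{len(flow_files)} flow files found')

# Inspect one file; RAFT-style flow is typically a per-pixel (dx, dy) field.
flow = np.load(flow_files[0])
print('shape:', flow.shape, 'min/max:', flow.min(), flow.max())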

2.3 Key Point Initialization in 3D

Before starting training, the last step is initializing the key points in 3D based on their 2D positions.

This can be done by running data_process/k_points_init_3d.py using python k_points_init_3d.py (or using Jupyter), again in the dataset processing environment of HyperNeRF. Also, please remember to change the parameters in it (lines 20-22).
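
Conceptually, this step lifts each 2D key point to 3D by unprojecting it with the rendered depth and the camera parameters. The following is only an illustrative sketch of that idea (a hypothetical helper assuming a pinhole camera and depth measured along the ray; it is not the actual implementation in k_points_init_3d.py):

import numpy as np

def unproject_keypoint(uv, depth, K, R_c2w, t_c2w):
    """Lift a 2D key point (u, v) to a 3D world-space point.

    Assumes a pinhole intrinsic matrix K, a camera-to-world rotation R_c2w and
    translation t_c2w, and depth given as distance along the camera ray.
    """
    u, v = uv
    # Ray direction of this pixel in camera coordinates.
    dir_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])
    dir_cam /= np.linalg.norm(dir_cam)
    # Point in camera coordinates, then transform to world coordinates.
    p_cam = depth * dir_cam
    return R_c2w @ p_cam + t_c2w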

3. Training and Rendering

The input folder should be structured as follows:

capture_demo
  ├─camera           (same as HyperNeRF)
  ├─camera-paths     (same as HyperNeRF)
  ├─data             [output of k_points_init_3d.py]
  │   ├─k_points_auto.txt
  │   └─k_points_auto_coor_*.txt
  ├─depth            [output of HyperNeRF]
  │   └─depth_median_*.npy
  ├─flow             [output of k_points_init_RAFT.py]
  │   └─flow_*.npy
  ├─rgb              (same as HyperNeRF)
  ├─dataset.json     (same as HyperNeRF)
  ├─metadata.json    (same as HyperNeRF)
  └─scene.json       (same as HyperNeRF)
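
As an optional convenience (not part of the repository), you can sanity-check that a capture folder matches this layout before launching training:

from pathlib import Path

root = Path('../in/capture_demo')  # adjust to your capture folder

required_dirs = ['camera', 'camera-paths', 'data', 'depth', 'flow', 'rgb']
required_files = ['dataset.json', 'metadata.json', 'scene.json']

missing = [d for d in required_dirs if not (root / d).is_dir()]
missing += [f for f in required_files if not (root / f).is_file()]

if missing:
    print('Missing entries:', ', '.join(missing))
else:
    n_depth = len(list((root / 'depth').glob('depth_median_*.npy')))
    n_flow = len(list((root / 'flow').glob('flow_*.npy')))
    print(f'Layout looks complete: {n_depth} depth maps, {n_flow} flow files.')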

Our training and rendering commands are similar to those of HyperNeRF:

python train.py \
    --base_folder ../out/save_demo \
    --gin_bindings="data_dir='../in/capture_demo'" \
    --gin_configs configs/editablenerf_2p.gin

python eval.py \
    --base_folder ../out/save_demo \
    --gin_bindings="data_dir='../in/capture_demo'" \
    --gin_configs configs/editablenerf_2p.gin

Dataset

Download at Google Drive.

(Optional) GUI

Running GUI_qt.py additionally requires Qt5 support; install the following packages:

pip install pyvista
pip install pyvistaqt
pip install pyqt5   

Then run the GUI with:

python GUI_qt.py

The current version of the GUI only supports scenes containing one key point. To edit a scene containing multiple key points, please follow the comments in encode_metadata() in evaluation.py.

Citing

@misc{zheng2023editablenerf,
      title={EditableNeRF: Editing Topologically Varying Neural Radiance Fields by Key Points}, 
      author={Chengwei Zheng and Wenbin Lin and Feng Xu},
      year={2023},
      eprint={2212.04247},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}