Transforming NeRF: Fusion, ScanNet++, Depth and Distortion Loss for Large-Scale Scene Reconstruction
This project builds on the NeRFusion paper and extends NeRF with a GRUFusion block, depth and distortion loss components, real-time visualizations, and training on ScanNet++. The GRUFusion block enhances multi-view information fusion for local volume reconstruction, allowing the system to better synthesize and interpret data from different perspectives and leading to faster per-scene optimization and the ability to render large-scale scenes. Additionally, depth and distortion loss components have been added to improve learning and model accuracy. The fusion process is also visualized by comparing input views with the updated radiance field and extracting the mesh, providing a clear understanding of the process. These enhancements aim to make the NeRFusion framework more robust and versatile, paving the way for advanced 3D reconstruction tasks.
This is a re-development of the original NeRFusion code, based heavily on nerf_pl, NeuralRecon, and MVSNeRF. We thank the authors for sharing their code.
For logging to work, please export your WANDB_API_KEY.
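For example, in a bash shell (the key value below is a placeholder for your own Weights & Biases API key):
export WANDB_API_KEY=<your_wandb_api_key>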
All code has been tested in the following environment:
- Linux (Ubuntu 20.04 or above)
- 32GB RAM (in order to load full size images)
- NVIDIA GPU with Compute Capability >= 75, VRAM >= 6GB, and CUDA >= 11.3
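As an optional sanity check of the driver, CUDA toolkit, and GPU before installing anything else, you can run:
nvidia-smi
nvcc --version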
# Ubuntu 18.04 and above is recommended.
sudo apt install libsparsehash-dev # you can try to install sparsehash with conda if you don't have sudo privileges.
conda env create -f environment.yaml
conda activate nerfusion
- Python libraries
  - Install pytorch>=1.11.0 by `pip install torch torchvision --extra-index-url https://download.pytorch.org/whl/cu113`
  - Install torch-scatter following their instructions
  - Install tinycudann following their instructions (compilation and PyTorch extension)
  - Install apex following their instructions
  - Install torchsparse following their instructions
  - Install core requirements by `pip install -r requirements.txt`
- Cuda extension: upgrade `pip` to >= 22.1 and run `pip install models/csrc/` (please run this each time you `pull` the code)
We follow the same data organization as the original NeRF, which expects camera parameters to be provided in a transforms.json file. We also support data from NSVF, NeRF++, COLMAP, and ScanNet.
Download the pretrained NeuralRecon weights and place them under `PROJECT_PATH/checkpoints/release`.
You can also use gdown to download them from the command line:
mkdir checkpoints && cd checkpoints
gdown --id 1zKuWqm9weHSm98SZKld1PbEddgLOQkQV
To run training on a given dataset, the data should be organized in the original NeRF-style:
data
├── transforms.json
├── images
│   ├── 0000.jpg
│   ├── 0001.jpg
│   ├── ...
The following script trains models from scratch and automatically uploads metrics and artifacts to Weights & Biases.
python train.py --dataset_name DATASET_NAME --root_dir DIR_TO_SCANNET_SCENE --exp_name EXP_NAME
Please download and organize the datasets in the following manner:
├── data/
│   ├── DTU/
│   ├── google_scanned_objects/
│   ├── ScanNet/
│   ├── ScanNetPP/
For Google Scanned Objects, we used the renderings from IBRNet. Download them with:
gdown https://drive.google.com/uc?id=1w1Cs0yztH6kE3JIz7mdggvPGCwIKkVi2
unzip google_scanned_objects_renderings.zip
For DTU and ScanNet, please use the official toolkits to download and process the data, and unpack the root directory into the data folder mentioned above. Train with:
python train.py --train_root_dir DIR_TO_DATA --exp_name EXP_NAME
See `opt.py` for more options.
The following command will generate and extract the global feature volumes created by the GRUFusion module, leveraging the pre-trained weights of NeuralRecon.
python train_fusion.py --cfg ./config/test.yaml
Once the global feature volume is available, you can run fusion-based scene reconstruction on any ScanNet scene by including the --use_gru_fusion flag.
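For example, a sketch of such a run reusing the training command shown above (the scene directory and experiment name are placeholders):
python train.py --dataset_name scannet --root_dir DIR_TO_SCANNET_SCENE --exp_name EXP_NAME --use_gru_fusion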
Depth loss is added by default, but can be deactivated using --skip_depth_loading.
Use the --distortion_loss_w flag and specify the weight (0 by default). Good values are 1e-3 for real scenes and 1e-2 for synthetic scenes.
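For example, a sketch of a training run on a real scene with the suggested weight (paths and experiment name are placeholders):
python train.py --dataset_name scannet --root_dir DIR_TO_SCANNET_SCENE --exp_name EXP_NAME --distortion_loss_w 1e-3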
Follow the procedures outlined above and specify the dataset name with --dataset_name scannetpp. Note that training is done on the DSLR images, which first need to be undistorted using the scannetpp-toolkit.
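A possible invocation, assuming the DSLR images have already been undistorted and organized in the NeRF-style layout described above (the scene directory is a placeholder):
python train.py --dataset_name scannetpp --root_dir DIR_TO_SCANNETPP_SCENE --exp_name EXP_NAME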
All experiments are automatically tracked in Weights & Biases. To deactivate this, use the --debug flag.
Use the --use_sweep flag to leverage wandb sweep agents for hyperparameter tuning (default: False).
Make sure you have the following library installed:
conda install -c conda-forge libstdcxx-ng
Then execute the following command:
python show_gui.py --dataset_name scannet --root_dir data/scannet_official/scans/scene0000_00 --ckpt_path ckpts/scannet/test_scannet_8frames/epoch=29.ckpt
- w and s can be used to move forward and backward instead of using the mouse scroll.
- q and e can be used to move up and down, and a and d can be used to move left and right.
- Use right-click instead of left-click to control rotation.