
[CVPR 2024 Highlight] GP-NeRF: Generalized Perception NeRF for Context-Aware 3D Scene Understanding


GP-NeRF: Generalized Perception NeRF for Context-Aware 3D Scene Understanding [Official, CVPR 2024, Highlight]

[Project] [Paper]

Overview

Hao Li¹, Dingwen Zhang¹,⁶, Yalun Dai⁴, Nian Liu², Lechao Cheng³, Jingfeng Li¹, Jingdong Wang⁵, Junwei Han¹,⁶

(¹Brain and Artificial Intelligence Lab, Northwestern Polytechnical University; ²MBZUAI; ³Hefei University of Technology; ⁴Nanyang Technological University; ⁵Baidu, Inc.; ⁶Institute of Artificial Intelligence, Hefei Comprehensive National Science Center; * Corresponding Author)

IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2024, Highlight

Dataset Preparation

ScanNet

Please follow Semantic-Ray for ScanNet data preparation. The data should be organized as follows:

├── data
│   ├── scannet
│   │   ├── scene0000_00
│   │   │   ├── color
│   │   │   │   ├── 0.jpg
│   │   │   │   ├── ...
│   │   │   ├── depth
│   │   │   │   ├── 0.png
│   │   │   │   ├── ...
│   │   │   ├── label-filt
│   │   │   │   ├── 0.png
│   │   │   │   ├── ...
│   │   │   ├── pose
│   │   │   │   ├── 0.txt
│   │   │   │   ├── ...
│   │   │   ├── intrinsic
│   │   │   │   ├── extrinsic_color.txt
│   │   │   │   ├── intrinsic_color.txt
│   │   │   │   ├── ...
│   │   │   ├── ...
│   │   ├── ...
│   │   ├── scannetv2-labels.combined.tsv
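Before training, it can help to confirm that each scene folder matches the layout above. The following is a minimal sketch and not part of this repository: the helper name `check_scannet_scene` and the `data/scannet` root are assumptions for illustration, and it only checks that the listed subfolders and the two intrinsic files exist.

```python
from pathlib import Path

# Subfolders and files taken from the directory tree above.
REQUIRED_DIRS = ("color", "depth", "label-filt", "pose", "intrinsic")
REQUIRED_INTRINSICS = ("extrinsic_color.txt", "intrinsic_color.txt")


def check_scannet_scene(scene_dir):
    """Return the expected entries that are missing from one scene folder."""
    scene_dir = Path(scene_dir)
    missing = [d for d in REQUIRED_DIRS if not (scene_dir / d).is_dir()]
    missing += [
        f"intrinsic/{name}"
        for name in REQUIRED_INTRINSICS
        if not (scene_dir / "intrinsic" / name).is_file()
    ]
    return missing


if __name__ == "__main__":
    root = Path("data/scannet")  # assumed location, matching the tree above
    for scene in sorted(root.glob("scene*")):
        problems = check_scannet_scene(scene)
        print(scene.name, "OK" if not problems else f"missing: {problems}")
```

An empty list means the scene folder has every entry shown in the tree; the check does not inspect image contents or frame counts.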

Replica

Please follow Semantic-NeRF for Replica data preparation. The data should be organized as follows:

├── data
│   ├── Replica
│   │   ├── office_0
│   │   │   ├── Sequence_1
│   │   │   │   ├── depth
│   │   │   │   │   ├── depth_0.png
│   │   │   │   │   ├── ...
│   │   │   │   ├── rgb
│   │   │   │   │   ├── rgb_0.png
│   │   │   │   │   ├── ...
│   │   │   │   ├── semantic_class
│   │   │   │   │   ├── semantic_class_0.png
│   │   │   │   │   ├── ...
│   │   │   │   ├── traj_w_c.txt
│   │   │   ├── Sequence_2
│   │   │   │   ├── depth
│   │   │   │   │   ├── depth_0.png
│   │   │   │   │   ├── ...
│   │   │   │   ├── rgb
│   │   │   │   │   ├── rgb_0.png
│   │   │   │   │   ├── ...
│   │   │   │   ├── semantic_class
│   │   │   │   │   ├── semantic_class_0.png
│   │   │   │   │   ├── ...
│   │   │   │   ├── traj_w_c.txt
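Each sequence carries its camera trajectory in `traj_w_c.txt`. As a rough sketch of how that file might be read, the helper below parses it into per-frame 4x4 matrices. It is not taken from this repository: the function name `load_replica_poses` is hypothetical, and the file format (16 whitespace-separated floats per line, one row-major camera-to-world matrix per frame) is an assumption based on how Semantic-NeRF style data is commonly stored, so verify it against your own files.

```python
from pathlib import Path


def load_replica_poses(traj_path):
    """Parse traj_w_c.txt into a list of 4x4 matrices (nested lists).

    Assumption: each non-empty line holds 16 whitespace-separated floats,
    i.e. one row-major 4x4 camera-to-world matrix per frame.
    """
    poses = []
    lines = Path(traj_path).read_text().splitlines()
    for line_no, line in enumerate(lines, start=1):
        values = [float(v) for v in line.split()]
        if not values:
            continue  # skip blank lines
        if len(values) != 16:
            raise ValueError(
                f"line {line_no}: expected 16 values, got {len(values)}"
            )
        # Split the flat list into four rows of four values.
        poses.append([values[r * 4:(r + 1) * 4] for r in range(4)])
    return poses
```

Comparing `len(load_replica_poses(...))` with the number of images in `rgb` is a quick way to spot a truncated or mismatched sequence.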

Checkpoints

Our checkpoints for evaluation are available at this link.

Loading Generalized NeRF Pre-trained Model

For better and faster reconstruction, you can initialize from the pretrained GNT model, which can be downloaded from here. Place the checkpoint in the out directory, name it model_pretrain.pth, and load it by passing --ckpt_path out/model_pretrain.pth.
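The placement step can be scripted. This is a small sketch under stated assumptions: the helper name `install_pretrained` is hypothetical, and the path of the downloaded file is whatever you saved it as; only the `out` directory and the `model_pretrain.pth` file name come from the instructions above.

```python
import shutil
from pathlib import Path


def install_pretrained(downloaded, out_dir="out"):
    """Copy a downloaded checkpoint to <out_dir>/model_pretrain.pth."""
    target = Path(out_dir) / "model_pretrain.pth"
    target.parent.mkdir(parents=True, exist_ok=True)  # create out/ if needed
    shutil.copyfile(downloaded, target)
    return target
```

After calling `install_pretrained("path-to-downloaded-file")` from the repository root, the checkpoint can be passed as `--ckpt_path out/model_pretrain.pth`.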

Command

Note that we train our model on 8 NVIDIA A100 GPUs.

Evaluation

We provide VSCode launch.json configurations for running evaluation.

For ScanNet Evaluation:

{
    "env":{
        "CUDA_VISIBLE_DEVICES": "2"
    },
    "name": "eval: semantic-scannet",
    "type": "python",
    "request": "launch",
    "program": "${workspaceFolder}/eval_gpnerf.py",
    "console": "integratedTerminal",
    "justMyCode": false,
    "args": ["--config", "configs/gnt_scannet.txt", "--expname", "debug", "--no_load_opt", "--ckpt_path", "path-to-ckpt", "--val_set_list","configs/scannetv2_test_split.txt"]
},

For Replica Evaluation:

{
    "name": "eval: replica",
    "type": "python",
    "request": "launch",
    "program": "${workspaceFolder}/eval_gpnerf.py",
    "console": "integratedTerminal",
    "justMyCode": false,
    "args": ["--config", "configs/gnt_replica.txt", "--expname", "debug", "--no_load_opt", "--ckpt_path", "path-to-ckpt"]
},
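If you are not using VSCode, each launch.json entry maps onto a plain command line: the `env` block becomes environment-variable prefixes, `program` becomes the script path, and `args` are passed through unchanged. The sketch below shows that mapping. The helper name `launch_entry_to_command` is hypothetical, and it assumes `${workspaceFolder}` is the repository root and that `python` is the interpreter on your PATH.

```python
import shlex


def launch_entry_to_command(entry, workspace="."):
    """Render a launch.json configuration entry as a shell command string."""
    # Environment variables become NAME=value prefixes.
    env = " ".join(
        f"{key}={shlex.quote(str(value))}"
        for key, value in entry.get("env", {}).items()
    )
    # VSCode expands ${workspaceFolder}; substitute the repository root here.
    program = entry["program"].replace("${workspaceFolder}", workspace)
    command = shlex.join(["python", program, *entry.get("args", [])])
    return f"{env} {command}".strip()


if __name__ == "__main__":
    scannet_eval = {
        "env": {"CUDA_VISIBLE_DEVICES": "2"},
        "program": "${workspaceFolder}/eval_gpnerf.py",
        "args": [
            "--config", "configs/gnt_scannet.txt",
            "--expname", "debug",
            "--no_load_opt",
            "--ckpt_path", "path-to-ckpt",
            "--val_set_list", "configs/scannetv2_test_split.txt",
        ],
    }
    print(launch_entry_to_command(scannet_eval))
```

Replace `path-to-ckpt` with the location of a downloaded checkpoint before running the printed command.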

Training in Finetuning Setting

We recommend multi-GPU training to reproduce our results:

python -m torch.distributed.launch --nproc_per_node=8 \
       --master_port=$(( RANDOM % 1000 + 50000 )) \
       ft_gpnerf.py --config configs/gnt_scannet_ft.txt \
       --ckpt_path path-to-ckpt --expname finetune_training_scannet --val_set_list configs/scannetv2_test_split.txt --no_load_opt --no_load_scheduler

We also provide a launch.json configuration:

{
    "name": "ft: scannet",
    "type": "python",
    "request": "launch",
    "program": "${workspaceFolder}/ft_gpnerf.py",
    "console": "integratedTerminal",
    "justMyCode": false,
    "args": ["--config", "configs/gnt_scannet_ft.txt", "--expname", "debug", "--no_load_opt", "--ckpt_path", "path-to-ckpt"]
},

Training in Generalized Setting

We recommend multi-GPU training to reproduce our results:

python -m torch.distributed.launch --nproc_per_node=8 \
       --master_port=$(( RANDOM % 1000 + 50000 )) \
       train_gpnerf.py --config configs/gnt_scannet.txt \
       --ckpt_path path-to-ckpt --expname generalized_training_scannet --val_set_list configs/scannetv2_test_split.txt --no_load_opt --no_load_scheduler

We also provide a launch.json configuration:

{
    "env":{
        "CUDA_VISIBLE_DEVICES": "1"
    },
    "name": "train: scannet",
    "type": "python",
    "request": "launch",
    "program": "${workspaceFolder}/train_gpnerf.py",
    "console": "integratedTerminal",
    "justMyCode": false,
    "args": ["--config", "configs/gnt_scannet.txt", "--expname", "debug","--ckpt_path", "path-to-ckpt", "--no_load_opt", "--no_load_scheduler"]
},

Acknowledgement

This repo benefits from GNT. Thanks to the authors for their wonderful work.