/PointGPT

[NeurIPS 2023] PointGPT: Auto-regressively Generative Pre-training from Point Clouds

Primary LanguagePythonMIT LicenseMIT

PointGPT

PointGPT: Auto-regressively Generative Pre-training from Point Clouds ArXiv

In this work, we present PointGPT, a novel approach that extends the concept of GPT to point clouds, utilizing a point cloud auto-regressive generation task for pre-training transformer models. In object classification tasks, our PointGPT achieves 94.9% accuracy on the ModelNet40 dataset and 93.4% accuracy on the ScanObjectNN dataset, outperforming all other transformer models. In few-shot learning tasks, our method also attains new SOTA performance on all four benchmarks.

News

[2023.09.22] PointGPT has been accepted by NeurIPS 2023!

[2023.09.08] Unlabeled hybrid dataset and labeled hybrid dataset have been released!

[2023.08.19] Code has been updated; PointGPT-B and PointGPT-L models have been released!

[2023.06.20] Code and the PointGPT-S models have been released!

1. Requirements

PyTorch >= 1.7.0; python >= 3.7; CUDA >= 9.0; GCC >= 4.9; torchvision;

pip install -r requirements.txt
# Chamfer Distance & emd
cd ./extensions/chamfer_dist
python setup.py install --user
cd ./extensions/emd
python setup.py install --user
# PointNet++
pip install "git+https://github.com/erikwijmans/Pointnet2_PyTorch.git#egg=pointnet2_ops&subdirectory=pointnet2_ops_lib"
# GPU kNN
pip install --upgrade https://github.com/unlimblue/KNN_CUDA/releases/download/0.2/KNN_CUDA-0.2-py3-none-any.whl

2. Datasets

Our training data for the PointGPT-S model encompasses ShapeNet, ScanObjectNN, ModelNet40, and ShapeNetPart datasets. For detailed information, please refer to DATASET.md.

To pretrain the PointGPT-B and PointGPT-L models, we employ both unlabeled hybrid dataset and labeled hybrid dataset, available for download here.

3. PointGPT Models

PointGPT-S Models

Task Dataset Config Acc. Download
Pre-training ShapeNet pretrain.yaml N.A. here
Classification ScanObjectNN finetune_scan_hardest.yaml 86.9% here
Classification ScanObjectNN finetune_scan_objbg.yaml 91.6% here
Classification ScanObjectNN finetune_scan_objonly.yaml 90.0% here
Classification ModelNet40(1k) finetune_modelnet.yaml 94.0% here
Classification ModelNet40(8k) finetune_modelnet_8k.yaml 94.2% here
Part segmentation ShapeNetPart segmentation 86.2% mIoU here
Task Dataset Config 5w10s Acc. (%) 5w20s Acc. (%) 10w10s Acc. (%) 10w20s Acc. (%)
Few-shot learning ModelNet40 fewshot.yaml 96.8 ± 2.0 98.6 ± 1.1 92.6 ± 4.6 95.2 ± 3.4

PointGPT-B Models

Task Dataset Config Acc. Download
Pre-training UnlabeledHybrid pretrain.yaml N.A. here
Post-pre-training LabeledHybrid post_pretrain.yaml N.A. here
Classification ScanObjectNN finetune_scan_hardest.yaml 91.9% here
Classification ScanObjectNN finetune_scan_objbg.yaml 95.8% here
Classification ScanObjectNN finetune_scan_objonly.yaml 95.2% here
Classification ModelNet40(1k) finetune_modelnet.yaml 94.4% here
Classification ModelNet40(8k) finetune_modelnet_8k.yaml 94.6% here
Part segmentation ShapeNetPart segmentation 86.5% mIoU here
Task Dataset Config 5w10s Acc. (%) 5w20s Acc. (%) 10w10s Acc. (%) 10w20s Acc. (%)
Few-shot learning ModelNet40 fewshot.yaml 97.5 ± 2.0 98.8 ± 1.0 93.5 ± 4.0 95.8 ± 3.0

PointGPT-L Models

Task Dataset Config Acc. Download
Pre-training UnlabeledHybrid pretrain.yaml N.A. here
Post-pre-training LabeledHybrid post_pretrain.yaml N.A. here
Classification ScanObjectNN finetune_scan_hardest.yaml 93.4% here
Classification ScanObjectNN finetune_scan_objbg.yaml 97.2% here
Classification ScanObjectNN finetune_scan_objonly.yaml 96.6% here
Classification ModelNet40(1k) finetune_modelnet.yaml 94.7% here
Classification ModelNet40(8k) finetune_modelnet_8k.yaml 94.9% here
Part segmentation ShapeNetPart segmentation 86.6% mIoU here
Task Dataset Config 5w10s Acc. (%) 5w20s Acc. (%) 10w10s Acc. (%) 10w20s Acc. (%)
Few-shot learning ModelNet40 fewshot.yaml 98.0 ± 1.9 99.0 ± 1.0 94.1 ± 3.3 96.1 ± 2.8

4. PointGPT Pre-training

To pretrain PointGPT, run the following command.

CUDA_VISIBLE_DEVICES=<GPU> python main.py --config cfgs/<MODEL_NAME>/pretrain.yaml --exp_name <output_file_name>

To post-pretrain PointGPT, run the following command.

CUDA_VISIBLE_DEVICES=<GPU> python main.py --config cfgs/<MODEL_NAME>/post_pretrain.yaml --exp_name <output_file_name> --finetune_model

5. PointGPT Fine-tuning

Fine-tuning on ScanObjectNN, run the following command:

CUDA_VISIBLE_DEVICES=<GPUs> python main.py --config cfgs/<MODEL_NAME>/finetune_scan_hardest.yaml \
--finetune_model --exp_name <output_file_name> --ckpts <path/to/pre-trained/model>

Fine-tuning on ModelNet40, run the following command:

CUDA_VISIBLE_DEVICES=<GPUs> python main.py --config cfgs/<MODEL_NAME>/finetune_modelnet.yaml \
--finetune_model --exp_name <output_file_name> --ckpts <path/to/pre-trained/model>

Voting on ModelNet40, run the following command:

CUDA_VISIBLE_DEVICES=<GPUs> python main.py --test --config cfgs/<MODEL_NAME>/finetune_modelnet.yaml \
--exp_name <output_file_name> --ckpts <path/to/best/fine-tuned/model>

Few-shot learning, run the following command:

CUDA_VISIBLE_DEVICES=<GPUs> python main.py --config cfgs/<MODEL_NAME>/fewshot.yaml --finetune_model \
--ckpts <path/to/pre-trained/model> --exp_name <output_file_name> --way <5 or 10> --shot <10 or 20> --fold <0-9>

Part segmentation on ShapeNetPart, run the following command:

cd segmentation
python main.py --ckpts <path/to/pre-trained/model> --root path/to/data --learning_rate 0.0002 --epoch 300 --model_name <MODEL_NAME>

6. Visualization

Visulization of pre-trained model on validation set, run:

python main_vis.py --test --ckpts <path/to/pre-trained/model> --config cfgs/<MODEL_NAME>/pretrain.yaml --exp_name <name>

7. Ablation studies on post-pre-training stage

Methods ScanObjectNN ModelNet40 ShapeNetPart
OBJ_BG OBJ_ONLY PB_T50_RS 1k P 8k P Cls.mIoU Inst.mIoU
without post-pre-training
PointGPT-B 93.6 92.5 89.6 94.2 94.4 84.5 86.4
PointGPT-L 95.7 94.1 91.1 94.5 94.7 84.7 86.5
with post-pre-training
PointGPT-B 95.8 (+2.2) 95.2 (+2.7) 91.9 (+2.3) 94.4 (+0.2) 94.6 (+0.2) 84.5 (+0.0) 86.5 (+0.1)
PointGPT-L 97.2 (+1.5) 96.6 (+2.5) 93.4 (+2.3) 94.7 (+0.2) 94.9 (+0.2) 84.8 (+0.1) 86.6 (+0.1)

Acknowledgements

Our codes are built upon Point-MAE, Point-BERT, Pointnet2_PyTorch and Pointnet_Pointnet2_pytorch

The unlabeled hybrid dataset and labeled hybrid dataset are built upon ModelNet40, PartNet, ShapeNet, S3DIS, ScanObjectNN, SUN RGB-D, and Semantic3D

Reference

@article{chen2024pointgpt,
  title={Pointgpt: Auto-regressively generative pre-training from point clouds},
  author={Chen, Guangyan and Wang, Meiling and Yang, Yi and Yu, Kai and Yuan, Li and Yue, Yufeng},
  journal={Advances in Neural Information Processing Systems},
  volume={36},
  year={2024}
}

For unlabeled hybrid dataset or labeled hybrid dataset, please also cite the following work.

@inproceedings{wu20153d,
  title={3d shapenets: A deep representation for volumetric shapes},
  author={Wu, Zhirong and Song, Shuran and Khosla, Aditya and Yu, Fisher and Zhang, Linguang and Tang, Xiaoou and Xiao, Jianxiong},
  booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
  pages={1912--1920},
  year={2015}
}

@inproceedings{mo2019partnet,
  title={Partnet: A large-scale benchmark for fine-grained and hierarchical part-level 3d object understanding},
  author={Mo, Kaichun and Zhu, Shilin and Chang, Angel X and Yi, Li and Tripathi, Subarna and Guibas, Leonidas J and Su, Hao},
  booktitle={Proceedings of the IEEE/CVF conference on computer vision and pattern recognition},
  pages={909--918},
  year={2019}
}

@article{chang2015shapenet,
  title={Shapenet: An information-rich 3d model repository},
  author={Chang, Angel X and Funkhouser, Thomas and Guibas, Leonidas and Hanrahan, Pat and Huang, Qixing and Li, Zimo and Savarese, Silvio and Savva, Manolis and Song, Shuran and Su, Hao and others},
  journal={arXiv preprint arXiv:1512.03012},
  year={2015}
}

@inproceedings{armeni20163d,
  title={3d semantic parsing of large-scale indoor spaces},
  author={Armeni, Iro and Sener, Ozan and Zamir, Amir R and Jiang, Helen and Brilakis, Ioannis and Fischer, Martin and Savarese, Silvio},
  booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
  pages={1534--1543},
  year={2016}
}

@inproceedings{uy-scanobjectnn-iccv19,
  title = {Revisiting Point Cloud Classification: A New Benchmark Dataset and Classification Model on Real-World Data},
  author = {Mikaela Angelina Uy and Quang-Hieu Pham and Binh-Son Hua and Duc Thanh Nguyen and Sai-Kit Yeung},
  booktitle = {International Conference on Computer Vision (ICCV)},
  year = {2019}
}

@inproceedings{song2015sun,
  title={Sun rgb-d: A rgb-d scene understanding benchmark suite},
  author={Song, Shuran and Lichtenberg, Samuel P and Xiao, Jianxiong},
  booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
  pages={567--576},
  year={2015}
}

@article{hackel2017semantic3d,
  title={Semantic3d. net: A new large-scale point cloud classification benchmark},
  author={Hackel, Timo and Savinov, Nikolay and Ladicky, Lubor and Wegner, Jan D and Schindler, Konrad and Pollefeys, Marc},
  journal={arXiv preprint arXiv:1704.03847},
  year={2017}
}