Commonsense Prototype for Outdoor Unsupervised 3D Object Detection (CVPR 2024)


This is the codebase of our CVPR 2024 paper.



CPD (Commonsense Prototype-based Detector) is a high-performance unsupervised 3D object detection framework. CPD first constructs Commonsense Prototypes (CProto), characterized by high-quality bounding boxes and dense points, based on commonsense intuition. It then refines low-quality pseudo-labels by leveraging the size prior from CProto, and improves detection accuracy on sparsely scanned objects using the geometric knowledge from CProto. CPD outperforms state-of-the-art unsupervised 3D detectors on the Waymo Open Dataset (WOD) and KITTI datasets by a large margin.
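To make the idea of a size prior concrete, here is a toy sketch. It is not the refinement procedure from the paper or this codebase: the `Box3D` class, the `apply_size_prior` function, and the `min_ratio` rule are all hypothetical, and the prototype size is a made-up value. It only illustrates how a pseudo-label that is implausibly small along one dimension could borrow that dimension from a prototype.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Box3D:
    """A 3D box: center (x, y, z), size (length, width, height), heading yaw."""
    x: float
    y: float
    z: float
    length: float
    width: float
    height: float
    yaw: float


def apply_size_prior(box, proto_size, min_ratio=0.8):
    """Replace any dimension below min_ratio of the prototype with the prototype value.

    Hypothetical rule for illustration only; min_ratio is an assumed parameter.
    The center and heading are left untouched, which a real method would
    likely also need to adjust.
    """
    proto_l, proto_w, proto_h = proto_size
    return replace(
        box,
        length=box.length if box.length >= min_ratio * proto_l else proto_l,
        width=box.width if box.width >= min_ratio * proto_w else proto_w,
        height=box.height if box.height >= min_ratio * proto_h else proto_h,
    )


# A partially scanned vehicle whose fitted box is far too short.
pseudo_label = Box3D(10.0, 2.0, 0.8, length=2.0, width=1.8, height=1.5, yaw=0.1)
refined = apply_size_prior(pseudo_label, proto_size=(4.5, 1.9, 1.6))
print(refined)
```

Only the length is replaced in this example, because the width and height are already within the assumed tolerance of the prototype.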


conda create -n spconv2 python=3.9
conda activate spconv2
pip install torch==1.8.1+cu111 torchvision==0.9.1+cu111 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
pip install numpy==1.19.5 protobuf==3.19.4 scikit-image==0.19.2 waymo-open-dataset-tf-2-5-0 nuscenes-devkit==1.0.5 spconv-cu111 numba scipy pyyaml easydict fire tqdm shapely matplotlib opencv-python addict pyquaternion awscli open3d pandas future pybind11 tensorboardX tensorboard Cython prefetch-generator

Environment we tested:

Ubuntu 18.04
Python 3.9.13
PyTorch 1.8.1
Numba 0.53.1
Spconv 2.1.22 # pip install spconv-cu111
4x 3090 GPUs
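
A quick way to compare a local environment with the tested one is to query the installed package versions. The sketch below uses only the standard library; it assumes the distribution names match the pip package names used above (`torch`, `numba`, `spconv-cu111`), and the helper names are illustrative rather than part of this codebase.

```python
from importlib import metadata

# Versions listed in the tested environment; distribution names are assumed
# to match the pip package names.
TESTED = {"torch": "1.8.1", "numba": "0.53.1", "spconv-cu111": "2.1.22"}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def compare_with_tested(tested=TESTED):
    """Map each package name to (installed, tested, matches)."""
    report = {}
    for name, wanted in tested.items():
        found = installed_version(name)
        # Wheels may carry a local suffix such as "+cu111"; compare the public part.
        matches = found is not None and found.split("+")[0] == wanted
        report[name] = (found, wanted, matches)
    return report


if __name__ == "__main__":
    for name, (found, wanted, ok) in compare_with_tested().items():
        status = "OK" if ok else "DIFFERS"
        print(f"{name}: installed={found} tested={wanted} {status}")
```

A differing version does not necessarily mean the code will fail; it only flags a deviation from the configuration that was tested.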

Prepare Dataset

Waymo Dataset

  • Please download the official Waymo Open Dataset, including the training data training_0000.tar~training_0031.tar and the validation data validation_0000.tar~validation_0007.tar.
  • Extract all of the above xxxx.tar files into data/waymo/raw_data as follows (you should get 798 training tfrecords and 202 validation tfrecords):
├── data
│   ├── waymo
│   │   ├── ImageSets
│   │   ├── raw_data
│   │   │   ├── segment-xxxxxxxx.tfrecord
│   │   │   ├── ...
│   │   ├── waymo_processed_data_train_val_test
│   │   │   ├── segment-xxxxxxxx/
│   │   │   ├── ...
│   │   ├── pcdet_waymo_track_dbinfos_train_cp.pkl
│   │   ├── waymo_infos_test.pkl
│   │   ├── waymo_infos_train.pkl
│   │   ├── waymo_infos_val.pkl
├── cpd
├── tools

Then, generate dataset information:

python3 -m cpd.datasets.waymo_unsupervised.waymo_unsupervised_dataset --cfg_file tools/cfgs/dataset_configs/waymo_unsupervised/waymo_unsupervised_cproto.yaml
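
Before generating the dataset information, it can help to confirm that the extraction produced the expected number of files. The sketch below counts `segment-*.tfrecord` files in `data/waymo/raw_data` and compares the total with 798 + 202. The function names are illustrative and not part of this codebase, and the check assumes training and validation tfrecords sit together in one folder, as the tree above suggests.

```python
from pathlib import Path


def count_tfrecords(raw_dir):
    """Count segment-*.tfrecord files directly inside raw_dir."""
    return sum(1 for _ in Path(raw_dir).glob("segment-*.tfrecord"))


def check_waymo_raw(raw_dir="data/waymo/raw_data", expected=798 + 202):
    """Return (found, ok), where ok is True if the count matches expected."""
    found = count_tfrecords(raw_dir)
    return found, found == expected


if __name__ == "__main__":
    found, ok = check_waymo_raw()
    print(f"found {found} tfrecord files, expected 1000: {'OK' if ok else 'MISMATCH'}")
```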

KITTI Dataset

  • Please download the official KITTI 3D object detection dataset and organize the downloaded files as follows (the road planes can be downloaded from [road plane]; they are optional and used for data augmentation during training):
├── data
│   ├── kitti
│   │   ├── ImageSets
│   │   ├── training
│   │   │   ├── calib & velodyne & label_2 & image_2 & (optional: planes)
│   │   ├── testing
│   │   │   ├── calib & velodyne & image_2
├── cpd
├── tools

Run the following command to create dataset infos:

python3 -m cpd.datasets.kitti.kitti2waymo_dataset create_kitti_infos tools/cfgs/dataset_configs/waymo_unsupervised/kitti2waymo_dataset.yaml
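
A common source of errors at this step is an incomplete download, where some frames lack a calibration, label, or image file. The sketch below lists training frames that have a LiDAR scan but are missing a counterpart in the other folders. It assumes the standard KITTI naming, where files in `velodyne`, `calib`, `label_2`, and `image_2` share a frame id (for example `000000.bin`, `000000.txt`, `000000.png`); the function names are illustrative and not part of this codebase.

```python
from pathlib import Path


def frame_ids(folder, suffix):
    """Return the set of file stems in folder that end with suffix."""
    return {p.stem for p in Path(folder).glob(f"*{suffix}")}


def missing_training_files(root="data/kitti/training"):
    """Map each subfolder to the frame ids present in velodyne but absent there."""
    root = Path(root)
    scans = frame_ids(root / "velodyne", ".bin")
    missing = {}
    for sub, suffix in (("calib", ".txt"), ("label_2", ".txt"), ("image_2", ".png")):
        gap = sorted(scans - frame_ids(root / sub, suffix))
        if gap:
            missing[sub] = gap
    return missing


if __name__ == "__main__":
    problems = missing_training_files()
    print(problems if problems else "all training frames are complete")
```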


Train using scripts

cd tools
sh dist_train.sh {cfg_file}

The training log is saved to log.txt. You can run cat log.txt to view it.

Or run directly:

cd tools
python train.py 


Test using scripts

cd tools
sh dist_test.sh {cfg_file}

The test log is saved to log-test.txt. You can run cat log-test.txt to view the test results.

Model Zoo

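| Model | Vehicle 3D AP (L1) | Vehicle 3D AP (L2) | Pedestrian 3D AP (L1) | Pedestrian 3D AP (L2) | Cyclist 3D AP (L1) | Cyclist 3D AP (L2) | Download |
|---|---|---|---|---|---|---|---|
| DBSCAN-single-train | 2.65 | 2.29 | 0 | 0 | 0.25 | 0.20 | --- |
| OYSTER-single-train | 7.91 | 6.78 | 0.03 | 0.02 | 4.65 | 4.05 | oyster_pretrained |
| CPD | 38.74 | 33.37 | 16.53 | 13.72 | 4.28 | 4.13 | cpd_pretrained |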

The three categories (Vehicle, Pedestrian, Cyclist) are evaluated at IoU thresholds of 0.7, 0.5, and 0.5, respectively.
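
To illustrate how per-category thresholds decide whether a detection counts as a match, the sketch below computes IoU for axis-aligned 3D boxes and applies the thresholds above. This is a deliberate simplification: the actual evaluation uses boxes with a heading angle, which this example ignores. The box format `(x_min, y_min, z_min, x_max, y_max, z_max)` and the function names are assumptions made for illustration.

```python
# Per-category IoU thresholds, as stated above.
IOU_THRESHOLDS = {"Vehicle": 0.7, "Pedestrian": 0.5, "Cyclist": 0.5}


def axis_aligned_iou_3d(a, b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, z_min, x_max, y_max, z_max)."""
    inter = 1.0
    for i in range(3):
        overlap = min(a[i + 3], b[i + 3]) - max(a[i], b[i])
        if overlap <= 0:
            return 0.0  # the boxes do not intersect along this axis
        inter *= overlap
    vol_a = (a[3] - a[0]) * (a[4] - a[1]) * (a[5] - a[2])
    vol_b = (b[3] - b[0]) * (b[4] - b[1]) * (b[5] - b[2])
    return inter / (vol_a + vol_b - inter)


def is_match(pred, gt, category):
    """True if the predicted box overlaps the ground truth enough for its category."""
    return axis_aligned_iou_3d(pred, gt) >= IOU_THRESHOLDS[category]


pred = (0.0, 0.0, 0.0, 2.0, 2.0, 1.0)
gt = (0.0, 0.0, 0.0, 2.0, 2.0, 2.0)
print(axis_aligned_iou_3d(pred, gt))   # 0.5
print(is_match(pred, gt, "Pedestrian"))  # True
print(is_match(pred, gt, "Vehicle"))     # False
```

The same overlap passes for a pedestrian but fails for a vehicle, which is why the stricter vehicle threshold makes the Vehicle columns harder to score well on.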


    title={Commonsense Prototype for Outdoor Unsupervised 3D Object Detection},
    author={Wu, Hai and Zhao, Shijia and Huang, Xun and Wen, Chenglu and Li, Xin and Wang, Cheng},