
[CVPR 2024 Oral - Best paper award candidate] Official repository of "PaSCo: Urban 3D Panoptic Scene Completion with Uncertainty Awareness"


PaSCo: Urban 3D Panoptic Scene Completion with Uncertainty Awareness

CVPR 2024 Oral - Best paper award candidate

Anh-Quan Cao (1), Angela Dai (2), Raoul de Charette (1)

(1) Inria, (2) Technical University of Munich


If you find this work or code useful, please cite our paper and give this repo a star:

@InProceedings{cao2024pasco,
      title={PaSCo: Urban 3D Panoptic Scene Completion with Uncertainty Awareness}, 
      author={Anh-Quan Cao and Angela Dai and Raoul de Charette},
      year={2024},
      booktitle = {CVPR}
}

Teaser


News

  • 25/06/2024: Added visualization code.
  • 10/06/2024: Training and evaluation code for PaSCo w/o MIMO has been released.
  • 06/04/2024: Dataset download instructions and label generation code for SemanticKITTI are now available.
  • 04/04/2024: PaSCo has been accepted as an Oral paper at CVPR 2024 (0.8% = 90/11,532).
  • 05/12/2023: Paper released on arXiv! Code will be released soon! Please watch this repo for updates.

1. Installation

  1. Download the source code with git

    git clone https://github.com/astra-vision/PaSCo.git
    
  2. Create conda environment:

    conda create -y -n pasco python=3.9
    conda activate pasco
    
  3. Install pytorch 1.13.0

    pip install torch==1.13.0+cu117 torchvision==0.14.0+cu117 torchaudio==0.13.0 --extra-index-url https://download.pytorch.org/whl/cu117
    
  4. Install Minkowski Engine v0.5.4

  5. Install pytorch_lightning 1.9.0 with torchmetrics 1.4.0.post0

  6. Install the additional dependencies:

    cd PaSCo/
    pip install -r requirements.txt
    
  7. Install PaSCo

    pip install -e ./
    

2. Data

2.1. Semantic KITTI

Please download the following data into a folder, e.g. /gpfsdswork/dataset/SemanticKITTI, and unzip it:

  • The Semantic Scene Completion dataset v1.1 (SemanticKITTI voxel data (700 MB)) from the SemanticKITTI website

  • The KITTI Odometry Benchmark calibration data (Download odometry data set (calibration files, 1 MB)).

  • The KITTI Odometry Benchmark Velodyne data (Download odometry data set (velodyne laser data, 80 GB)).

  • The dataset folder at /gpfsdswork/dataset/SemanticKITTI should have the following structure:

    └── /gpfsdswork/dataset/SemanticKITTI
      └── dataset
        └── sequences
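The layout above can be checked before running the later steps. The following is a minimal sketch; `find_sequences` is a hypothetical helper, not part of the PaSCo code base, and it only assumes the `dataset/sequences` nesting shown above.

```python
from pathlib import Path


def find_sequences(kitti_root):
    """Return the sorted sequence folder names under <kitti_root>/dataset/sequences.

    Hypothetical helper for illustration; PaSCo does not ship this function.
    """
    seq_dir = Path(kitti_root) / "dataset" / "sequences"
    if not seq_dir.is_dir():
        # The README expects this exact nesting after unzipping.
        raise FileNotFoundError(f"Expected folder not found: {seq_dir}")
    return sorted(p.name for p in seq_dir.iterdir() if p.is_dir())
```

Calling `find_sequences("/gpfsdswork/dataset/SemanticKITTI")` lists the sequence folders it finds, or raises an error if the archives were unzipped to a different location.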
    

3. Panoptic labels generation

3.1. Semantic KITTI

  1. Create a folder to store the preprocessed data for the Semantic KITTI dataset, e.g. /gpfsscratch/rech/kvd/uyl37fq/pasco_preprocess/kitti.
  2. Execute the command below to generate panoptic labels, or move to the next step to directly download the pre-generated labels:
    cd PaSCo/
    python label_gen/gen_instance_labels.py \
        --kitti_config=pasco/data/semantic_kitti/semantic-kitti.yaml \
        --kitti_root=/gpfsdswork/dataset/SemanticKITTI \
        --kitti_preprocess_root=/gpfsscratch/rech/kvd/uyl37fq/pasco_preprocess/kitti \
        --n_process=10
    

Note

This command does not need a GPU. Processing 4649 files took approximately 10 hours using 10 processes. The number of processes can be adjusted with the n_process parameter.
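As a rough guide for choosing n_process, the figures in this note imply about 77 seconds of work per file for a single process. The sketch below extrapolates from that, under the assumption (not verified here) that files are split evenly and that the run time scales linearly with the number of processes; `estimate_hours` is a hypothetical helper.

```python
def estimate_hours(n_files, n_process, sec_per_file):
    """Wall-clock hours, assuming an even split and linear scaling (an assumption)."""
    return n_files * sec_per_file / n_process / 3600.0


# Figures quoted in the note: 4649 files, 10 processes, about 10 hours.
REF_FILES, REF_PROCS, REF_HOURS = 4649, 10, 10.0

# Implied cost of one file for one process, in seconds (about 77 s).
sec_per_file = REF_HOURS * 3600.0 * REF_PROCS / REF_FILES

for n in (5, 10, 20):
    print(f"n_process={n}: ~{estimate_hours(REF_FILES, n, sec_per_file):.1f} h")
```

Actual timings depend on the CPU and disk, so treat the numbers as an order-of-magnitude estimate only.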

  3. You can download the pre-generated panoptic labels for Semantic KITTI:
    1. Go to the preprocess folder for KITTI:
      cd /gpfsscratch/rech/kvd/uyl37fq/pasco_preprocess/kitti
      
    2. Download the compressed file:
      wget https://github.com/astra-vision/PaSCo/releases/download/v0.0.1/kitti_instance_label_v2.tar.gz
      
    3. Extract the file:
      tar xvf kitti_instance_label_v2.tar.gz
      
  4. Your folder structure should look as follows:
    /gpfsscratch/rech/kvd/uyl37fq/pasco_preprocess/kitti
    └── instance_labels_v2
        ├── 00
        ├── 01
        ├── 02
        ├── 03
        ├── 04
        ├── 05
        ├── 06
        ├── 07
        ├── 08
        ├── 09
        └── 10
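Whether the labels were generated or downloaded, the folder structure above can be verified with a short script. This is a minimal sketch: `missing_label_sequences` is a hypothetical helper, and it only checks that the eleven sequence folders listed above exist under `instance_labels_v2`, not that their contents are complete.

```python
import os

# Sequence folders "00" ... "10", as listed in the structure above.
EXPECTED_SEQUENCES = [f"{i:02d}" for i in range(11)]


def missing_label_sequences(dataset_preprocess_root):
    """Return the expected sequence folders that are absent (hypothetical helper)."""
    label_root = os.path.join(dataset_preprocess_root, "instance_labels_v2")
    return [
        seq
        for seq in EXPECTED_SEQUENCES
        if not os.path.isdir(os.path.join(label_root, seq))
    ]
```

An empty list returned for `/gpfsscratch/rech/kvd/uyl37fq/pasco_preprocess/kitti` means all eleven folders are in place.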
    

3.2. KITTI-360

WORK IN PROGRESS

4. Training and evaluation

4.1. PaSCo w/o MIMO

4.1.1 Extract point features

Note

This step is only needed when training on SemanticKITTI, since a pretrained WaffleIron model is available for that dataset.

Tip

A better approach could be to explore the features of pretrained models available at https://github.com/valeoai/ScaLR.

  1. Install WaffleIron in a separate conda environment:
    conda create -y -n waffleiron 
    conda activate waffleiron
    pip install pyaml==6.0 tqdm==4.63.0 scipy==1.8.0 torch==1.11.0 tensorboard==2.8.0
    cd PaSCo/WaffleIron_mod
    pip install -e ./
    

Caution

This repository uses an older version of WaffleIron, which requires PyTorch 1.11.0.

  2. Run the following command to extract point features with the WaffleIron model pretrained on SemanticKITTI (requires 10883 MB of GPU memory). The extracted features are stored in result_folder:
    cd PaSCo/WaffleIron_mod
    python extract_point_features.py \
    --path_dataset /gpfsdswork/dataset/SemanticKITTI \
    --ckpt pretrained_models/WaffleIron-48-256__kitti/ckpt_last.pth \
    --config configs/WaffleIron-48-256__kitti.yaml \
    --result_folder /gpfsscratch/rech/kvd/uyl37fq/pasco_preprocess/kitti/waffleiron_v2 \
    --phase val \
    --num_workers 3 \
    --num_votes 10 \
    --batch_size 2
    

4.1.2 Training

Note

The generated instance labels are expected to be stored in os.path.join(dataset_preprocess_root, "instance_labels_v2").

  1. Change the dataset_preprocess_root and dataset_root of the training command below to the preprocess and raw data folders, respectively.

  2. The log_dir is the folder to store the training logs and checkpoints.

  3. Run the following command to train PaSCo w/o MIMO with a batch size of 2 on 2 V100-32G GPUs (1 item per GPU):

    cd PaSCo/
    python scripts/train.py --bs=2 --n_gpus=2 \
          --dataset_preprocess_root=/gpfsscratch/rech/kvd/uyl37fq/pasco_preprocess/kitti \
          --dataset_root=/gpfsdswork/dataset/SemanticKITTI \
          --log_dir=logs \
          --exp_prefix=pasco_single --lr=1e-4 --seed=0 \
          --data_aug=True --max_angle=30.0 --translate_distance=0.2 \
          --enable_log=True \
          --sample_query_class=True --n_infers=1
    
    

Important

During training, the reported metrics are lower than the final metrics because we limit the number of generated voxels to avoid running out of memory. The training metrics serve only to monitor training progress. The final metrics are computed during evaluation.

4.1.3 Evaluation

  1. Download the pretrained checkpoint and put it into the ckpt folder, or use your own trained checkpoint.

  2. Run the following command to evaluate PaSCo w/o MIMO on 1 V100-32G GPU (1 item per GPU), where ckpt/pasco_single.ckpt is the path to the downloaded checkpoint:

    python scripts/eval.py --n_infers=1 --model_path=ckpt/pasco_single.ckpt
    
  3. The output looks like the following:

    =====================================
    method, P, R, IoU, mIoU, All PQ dagger, All PQ, All SQ, All RQ, Thing PQ, Thing SQ, Thing RQ, Stuff PQ, Stuff SQ, Stuff RQ
    subnet 0, 86.41, 57.98, 53.13, 29.15, 26.33, 15.71, 53.82, 24.27, 12.27, 47.18, 18.86, 18.21, 58.65, 28.20
    ensemble, 86.41, 57.98, 53.13, 29.15, 26.33, 15.71, 53.82, 24.27, 12.27, 47.18, 18.86, 18.21, 58.65, 28.20
    =====================================
    ==> pq
    method, car, bicycle, motorcycle, truck, other-vehicle, person, bicyclist, motorcyclist, road, parking, sidewalk, other-ground, building, fence, vegetation, trunk, terrain, pole, traffic-sign
    subnet 0, 27.53, 6.21, 16.86, 34.27, 9.77, 3.53, 0.00, 0.00, 74.51, 26.63, 39.70, 0.54, 4.10, 4.64, 6.87, 3.80, 29.58, 7.68, 2.28
    ensemble, 27.53, 6.21, 16.86, 34.27, 9.77, 3.53, 0.00, 0.00, 74.51, 26.63, 39.70, 0.54, 4.10, 4.64, 6.87, 3.80, 29.58, 7.68, 2.28
    ==> sq
    method, car, bicycle, motorcycle, truck, other-vehicle, person, bicyclist, motorcyclist, road, parking, sidewalk, other-ground, building, fence, vegetation, trunk, terrain, pole, traffic-sign
    subnet 0, 69.83, 57.87, 64.56, 65.42, 59.95, 59.80, 0.00, 0.00, 75.74, 63.11, 58.65, 52.57, 56.08, 55.89, 52.51, 58.07, 62.12, 55.01, 55.42
    ensemble, 69.83, 57.87, 64.56, 65.42, 59.95, 59.80, 0.00, 0.00, 75.74, 63.11, 58.65, 52.57, 56.08, 55.89, 52.51, 58.07, 62.12, 55.01, 55.42
    ==> rq
    method, car, bicycle, motorcycle, truck, other-vehicle, person, bicyclist, motorcyclist, road, parking, sidewalk, other-ground, building, fence, vegetation, trunk, terrain, pole, traffic-sign
    subnet 0, 39.43, 10.73, 26.11, 52.38, 16.29, 5.91, 0.00, 0.00, 98.38, 42.19, 67.69, 1.04, 7.31, 8.31, 13.07, 6.54, 47.61, 13.96, 4.11
    ensemble, 39.43, 10.73, 26.11, 52.38, 16.29, 5.91, 0.00, 0.00, 98.38, 42.19, 67.69, 1.04, 7.31, 8.31, 13.07, 6.54, 47.61, 13.96, 4.11
    [2.621915578842163, 0.8142204284667969, 0.8685343265533447, 0.7775185108184814, 0.9801337718963623, 0.6943247318267822]
    inference time:  0.7034459143364459
    [0.004038333892822266, 0.003854036331176758, 0.005398988723754883, 0.003660440444946289, 0.004451274871826172, 0.003663778305053711]
    ensemble time:  0.004062994399293342
    Uncertainty threshold:  0.5
    =====================================
    method, ins ece, ins nll, ssc nonempty ece, ssc empty ece, ssc nonempty nll, ssc empty nll,  count, inference time
    subnet 0,  0.6235, 4.6463, 0.0911, 0.0357, 0.7075, 0.9657, 11702, 0.00
    ensemble,  0.6235, 4.6463, 0.0911, 0.0357, 0.7075, 0.9657, 11702, 0.00
    allocated 8895.119325153375
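The metric tables in this output are plain comma-separated lines, so they can be turned into a dictionary for logging or comparison. The snippet below is a minimal sketch using the first header and its ensemble row copied from the output above; `parse_metric_row` is a hypothetical helper, not part of scripts/eval.py.

```python
def parse_metric_row(header_line, row_line):
    """Map each metric name in the header to the float value in the row.

    Hypothetical helper; the first column is the method name.
    """
    names = [cell.strip() for cell in header_line.split(",")]
    cells = [cell.strip() for cell in row_line.split(",")]
    if len(names) != len(cells):
        raise ValueError("Header and row have different numbers of columns")
    return cells[0], {n: float(v) for n, v in zip(names[1:], cells[1:])}


# Header and ensemble row copied from the evaluation output above.
HEADER = ("method, P, R, IoU, mIoU, All PQ dagger, All PQ, All SQ, All RQ, "
          "Thing PQ, Thing SQ, Thing RQ, Stuff PQ, Stuff SQ, Stuff RQ")
ROW = ("ensemble, 86.41, 57.98, 53.13, 29.15, 26.33, 15.71, 53.82, 24.27, "
       "12.27, 47.18, 18.86, 18.21, 58.65, 28.20")

method, metrics = parse_metric_row(HEADER, ROW)
print(method, metrics["mIoU"], metrics["All PQ"])
```

The same function applies to the per-class pq, sq and rq tables, since they follow the same header-then-rows layout.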
    

Important

Note that voxel ece = (ssc empty ece + ssc nonempty ece)/2 and voxel nll = (ssc empty nll + ssc nonempty nll)/2.

The inference time reported in the paper was measured on an A100 GPU, making it faster than on a V100. For SemanticKITTI, the time also includes the WaffleIron feature extraction duration.
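As a worked example of the averaging in the note above, the following sketch applies the two formulas to the ensemble row of the sample evaluation output. The function name `voxel_average` is hypothetical; the input values are copied from that output.

```python
def voxel_average(empty_value, nonempty_value):
    """Mean of the empty and non-empty voxel scores, as stated in the note."""
    return (empty_value + nonempty_value) / 2.0


# Values from the "ensemble" row of the sample evaluation output.
ssc_nonempty_ece, ssc_empty_ece = 0.0911, 0.0357
ssc_nonempty_nll, ssc_empty_nll = 0.7075, 0.9657

voxel_ece = voxel_average(ssc_empty_ece, ssc_nonempty_ece)
voxel_nll = voxel_average(ssc_empty_nll, ssc_nonempty_nll)

print(f"voxel ece = {voxel_ece:.4f}")  # 0.0634
print(f"voxel nll = {voxel_nll:.4f}")  # 0.8366
```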

4.2. PaSCo w/ MIMO

WORK IN PROGRESS

5. Visualization

  1. Install Mayavi following the official instructions.
  2. Run the following command to generate the prediction using the downloaded checkpoint ckpt/pasco_single.ckpt:
    cd PaSCo/
    python scripts/save_outputs_panoptic.py --model_path=ckpt/pasco_single.ckpt \
          --dataset_preprocess_root=/gpfsscratch/rech/kvd/uyl37fq/pasco_preprocess/kitti \
          --dataset_root=/gpfsdswork/dataset/SemanticKITTI
    
  3. Draw the generated output:
    cd PaSCo/
    python scripts/visualize.py
    

Acknowledgment

We thank the authors of the following repositories for making their code and models publicly available:

The research was supported by the French project SIGHT (ANR-20-CE23-0016), the ERC Starting Grant SpatialSem (101076253), and the SAMBA collaborative project co-funded by BpiFrance in the Investissement d’Avenir Program. Computation was performed using HPC resources from GENCI–IDRIS (2023-AD011014102, AD011012808R2). We thank all Astra-Vision members for their valuable feedback, including Andrei Bursuc and Gilles Puy for excellent suggestions and Tetiana Martyniuk for her kind proofreading.