OpenCOOD

[ICRA 2022] An opensource framework for cooperative detection. Official implementation for OPV2V.


paper Documentation Status License: MIT

OpenCOOD is an Open COOperative Detection framework for autonomous driving. It is also the official implementation of the ICRA 2022 paper OPV2V.

News:

  • 12/28/2022: OpenCOOD now supports multi-GPU training.
  • 12/21/2022: V2XSet (ECCV 2022) is now supported by OpenCOOD!
  • 12/16/2022: Both spconv 1.2.1 and spconv 2.x are supported!
  • 12/04/2022: The log replay tool for OPV2V is now online! With this toolbox, you can replay 100% of the events in the offline dataset and add or change any sensors or ground truth to explore tasks that the original dataset does not support. Check here to see more details.
  • 09/15/2022: So far, OpenCOOD has supported several top conference papers, including ECCV, ICRA, CoRL, NeurIPS, and WACV! The bottom of this project page lists the detailed information.

Features

Data Downloading

All the data can be downloaded from Google Drive. If you have a fast internet connection, you can directly download the complete large zip file, such as train.zip. In case you have trouble downloading large files, we also split each dataset into small chunks, which can be found in the directories ending with _chunks, such as train_chunks. After downloading, run the following commands for each set to merge the chunks and extract the data:

cat train.zip.part* > train.zip
unzip train.zip
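
On systems without cat and unzip, the same merge-and-extract step can be approximated in Python. This is a minimal sketch using only the standard library; the function names merge_chunks and extract_zip are hypothetical helpers, not part of OpenCOOD, and the sketch assumes the chunk names sort lexicographically in the right order, as the shell glob above also does.

```python
import glob
import shutil
import zipfile


def merge_chunks(zip_path):
    """Concatenate <zip_path>.part* in sorted order into <zip_path>."""
    parts = sorted(glob.glob(f"{zip_path}.part*"))
    if not parts:
        raise FileNotFoundError(f"no chunks matching {zip_path}.part*")
    with open(zip_path, "wb") as merged:
        for part in parts:
            with open(part, "rb") as chunk:
                # Stream each chunk so large files never sit fully in memory.
                shutil.copyfileobj(chunk, merged)
    return zip_path


def extract_zip(zip_path, dest_dir="."):
    """Extract the merged archive into dest_dir."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
```

For example, extract_zip(merge_chunks("train.zip")) mirrors the two shell commands above when run from the directory that holds the chunks.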

Installation

Please refer to the data introduction and the installation guide to prepare the data and install OpenCOOD. For more details on the OPV2V data, please check our website.

Quick Start

Data sequence visualization

To quickly visualize the LiDAR stream in the OPV2V dataset, first set validate_dir in opencood/hypes_yaml/visualization.yaml to the OPV2V data path on your local machine, e.g. opv2v/validate, and then run the following command:

cd ~/OpenCOOD
python opencood/visualization/vis_data_sequence.py [--color_mode ${COLOR_RENDERING_MODE}]

Arguments Explanation:

  • color_mode: str, the LiDAR color rendering mode. You can choose from 'constant', 'intensity', or 'z-value'.

Train your model

OpenCOOD uses YAML files to configure all training parameters. To train your own model from scratch or continue from a checkpoint, run the following command:

python opencood/tools/train.py --hypes_yaml ${CONFIG_FILE} [--model_dir ${CHECKPOINT_FOLDER}] [--half]

Arguments Explanation:

  • hypes_yaml: the path of the training configuration file, e.g. opencood/hypes_yaml/second_early_fusion.yaml, meaning you want to train an early fusion model which utilizes SECOND as the backbone. See Tutorial 1: Config System to learn more about the rules of the yaml files.
  • model_dir (optional): the path of the checkpoint folder. This is used to fine-tune trained models. When model_dir is given, the trainer ignores hypes_yaml and loads the config.yaml in the checkpoint folder.
  • half (optional): if set, the model is trained with half precision. It cannot be combined with multi-GPU training.

To train on multiple GPUs, run the following command:

CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 --use_env opencood/tools/train.py --hypes_yaml ${CONFIG_FILE} [--model_dir ${CHECKPOINT_FOLDER}]
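
When training is launched from a script, the two command forms above can be assembled programmatically. The following is a minimal sketch, not part of OpenCOOD: build_train_command is a hypothetical helper that only uses the flags documented above and rejects the combination of --half with multi-GPU training.

```python
def build_train_command(hypes_yaml, model_dir=None, half=False, gpu_ids=None):
    """Return (extra_env, argv) for a single-GPU or multi-GPU training run."""
    multi_gpu = gpu_ids is not None and len(gpu_ids) > 1
    if half and multi_gpu:
        # Documented restriction: half precision and multi-GPU are exclusive.
        raise ValueError("--half cannot be combined with multi-GPU training")

    extra_env = {}
    if multi_gpu:
        extra_env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
        argv = [
            "python", "-m", "torch.distributed.launch",
            f"--nproc_per_node={len(gpu_ids)}", "--use_env",
            "opencood/tools/train.py",
        ]
    else:
        argv = ["python", "opencood/tools/train.py"]

    argv += ["--hypes_yaml", hypes_yaml]
    if model_dir is not None:
        argv += ["--model_dir", model_dir]
    if half:
        argv.append("--half")
    return extra_env, argv
```

The result can be passed to subprocess.run(argv, env={**os.environ, **extra_env}, check=True) from the repository root.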

Test the model

Before you run the following command, first make sure the validation_dir in config.yaml under your checkpoint folder refers to the testing dataset path, e.g. opv2v_data_dumping/test.

python opencood/tools/inference.py --model_dir ${CHECKPOINT_FOLDER} --fusion_method ${FUSION_STRATEGY} [--show_vis] [--show_sequence]

Arguments Explanation:

  • model_dir: the path to your saved model.
  • fusion_method: the fusion strategy. Currently 'early', 'late', and 'intermediate' are supported.
  • show_vis: whether to visualize the detection overlay with the point cloud.
  • show_sequence: the detection results will be visualized as a video stream. It can NOT be set together with show_vis.
  • global_sort_detections: whether to globally sort detections by confidence score. Setting it to True gives the mainstream AP computing method, but increases the tolerance for FP (False Positives). The OPV2V paper does not perform the global sort. Please use a consistent AP calculation method in your paper for a fair comparison.

The evaluation results will be dumped in the model directory.
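
To make the effect of global_sort_detections concrete, the sketch below compares the two orderings on toy data. It is an illustration under stated assumptions, not the OpenCOOD evaluation code: AP is computed with all-point interpolation, each detection is a hypothetical (score, is_true_positive) pair, and "no global sort" is taken to mean that detections are sorted within each frame and then concatenated in frame order.

```python
from itertools import accumulate


def average_precision(tp_flags, num_gt):
    """All-point interpolated AP from an ordered list of TP/FP flags."""
    tp_cum = list(accumulate(1 if flag else 0 for flag in tp_flags))
    fp_cum = list(accumulate(0 if flag else 1 for flag in tp_flags))
    recall = [tp / num_gt for tp in tp_cum]
    precision = [tp / (tp + fp) for tp, fp in zip(tp_cum, fp_cum)]
    mrec = [0.0] + recall + [1.0]
    mpre = [0.0] + precision + [0.0]
    # Build the precision envelope (non-increasing from left to right).
    for i in range(len(mpre) - 2, -1, -1):
        mpre[i] = max(mpre[i], mpre[i + 1])
    return sum((mrec[i] - mrec[i - 1]) * mpre[i] for i in range(1, len(mrec)))


def ap_per_frame_order(frames, num_gt):
    """Sort by score inside each frame, then concatenate frame by frame."""
    flags = []
    for detections in frames:
        ordered = sorted(detections, key=lambda d: d[0], reverse=True)
        flags.extend(is_tp for _, is_tp in ordered)
    return average_precision(flags, num_gt)


def ap_global_sort(frames, num_gt):
    """Sort all detections from all frames by score before accumulating."""
    merged = [d for detections in frames for d in detections]
    ordered = sorted(merged, key=lambda d: d[0], reverse=True)
    return average_precision([is_tp for _, is_tp in ordered], num_gt)


# Toy data: two frames, three ground-truth boxes, one low-score false positive.
FRAMES = [[(0.9, True), (0.3, False)], [(0.8, True), (0.6, True)]]
NUM_GT = 3
```

On this toy input the global sort pushes the low-confidence false positive to the end of the ranking, so ap_global_sort(FRAMES, NUM_GT) is higher than ap_per_frame_order(FRAMES, NUM_GT). This is why numbers computed with the two methods should not be compared directly.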

Benchmark and model zoo

Results on OPV2V LiDAR-track (AP@0.7 for no-compression/compression)

| Method | Spconv Version | Backbone | Fusion Strategy | Bandwidth (Megabit), before/after compression | Default Towns | Culver City | Download |
|---|---|---|---|---|---|---|---|
| Naive Late | 1.2.1 | PointPillar | Late | 0.024/0.024 | 0.781/0.781 | 0.668/0.668 | url |
| Cooper | 1.2.1 | PointPillar | Early | 7.68/7.68 | 0.800/x | 0.696/x | url |
| Attentive Fusion | 1.2.1 | PointPillar | Intermediate | 126.8/1.98 | 0.815/0.810 | 0.735/0.731 | url |
| F-Cooper | 1.2.1 | PointPillar | Intermediate | 72.08/1.12 | 0.790/0.788 | 0.728/0.726 | url |
| V2VNet | 1.2.1 | PointPillar | Intermediate | 72.08/1.12 | 0.822/0.814 | 0.734/0.729 | url |
| CoAlign | 1.2.1 | PointPillar | Intermediate | 72.08/2.24 | 0.833/0.806 | 0.760/0.750 | url |
| FPV-RCNN | 1.2.1 | PV-RCNN | Intermediate (2 stage) | 0.24/0.24 | 0.820/0.820 | 0.763/0.763 | url |
| V2VAM | 1.2.1 | PointPillar | Intermediate | x/x | 0.860/0.860 | 0.813/0.791 | url |
| CoBEVT | 2.0 | PointPillar | Intermediate | 72.08/1.12 | 0.861/0.836 | 0.773/0.730 | url |
| Naive Late | 1.2.1 | VoxelNet | Late | 0.024/0.024 | 0.738/0.738 | 0.588/0.588 | url |
| Cooper | 1.2.1 | VoxelNet | Early | 7.68/7.68 | 0.758/x | 0.677/x | url |
| Attentive Fusion | 1.2.1 | VoxelNet | Intermediate | 576.71/1.12 | 0.864/0.852 | 0.775/0.746 | url |
| Naive Late | 1.2.1 | SECOND | Late | 0.024/0.024 | 0.775/0.775 | 0.682/0.682 | url |
| Cooper | 1.2.1 | SECOND | Early | 7.68/7.68 | 0.813/x | 0.738/x | url |
| Attentive | 1.2.1 | SECOND | Intermediate | 63.4/0.99 | 0.826/0.783 | 0.760/0.760 | url |
| Naive Late | 1.2.1 | PIXOR | Late | 0.024/0.024 | 0.578/0.578 | 0.360/0.360 | url |
| Cooper | 1.2.1 | PIXOR | Early | 7.68/7.68 | 0.678/x | 0.558/x | url |
| Attentive | 1.2.1 | PIXOR | Intermediate | 313.75/1.22 | 0.687/0.612 | 0.546/0.492 | url |

Note:

  • We suggest using PointPillar as the backbone when you develop your own method and compare it with our benchmark, as most of the SOTA methods are implemented with this backbone only.
  • We assume the transmission rate is 27 Mbps. Since the LiDAR runs at 10 Hz, the bandwidth requirement per frame should be less than 2.7 Mb to avoid severe delay.
  • An 'x' in the benchmark table means the bandwidth requirement is too large for the method to be considered practical.
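
The per-frame limit in the note above follows from dividing the assumed link rate by the LiDAR frequency. A small sketch of that arithmetic, with hypothetical helper names, applied to two bandwidth values taken from the table:

```python
TRANSMISSION_RATE_MBPS = 27.0  # assumed link rate from the note above
LIDAR_FREQUENCY_HZ = 10.0      # LiDAR frames per second


def per_frame_budget_mb(rate_mbps=TRANSMISSION_RATE_MBPS,
                        frequency_hz=LIDAR_FREQUENCY_HZ):
    """Megabits that can be sent within one LiDAR frame interval."""
    return rate_mbps / frequency_hz


def transmission_delay_s(message_mb, rate_mbps=TRANSMISSION_RATE_MBPS):
    """Seconds needed to transmit one message of message_mb megabits."""
    return message_mb / rate_mbps


def fits_budget(message_mb):
    """True if one message fits within a single frame interval."""
    return message_mb <= per_frame_budget_mb()


# Attentive Fusion (PointPillar) after compression: 1.98 Mb fits the budget.
# Cooper (early fusion): 7.68 Mb does not, which matches its 'x' entries.
for name, size_mb in [("Attentive Fusion, compressed", 1.98), ("Cooper", 7.68)]:
    print(name, fits_budget(size_mb), round(transmission_delay_s(size_mb), 3))
```

Under these assumptions the budget is 27 / 10 = 2.7 Mb per frame, so any message above that size takes longer than one 0.1 s frame interval to transmit.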

Results of BEV semantic segmentation on OPV2V camera-track (IoU)

| Method | Backbone | Fusion Strategy | Vehicles | Road Surface | Lane | Download |
|---|---|---|---|---|---|---|
| No Fusion | CVT | No Fusion | 37.7 | 57.8 | 43.7 | None |
| Map Fusion | CVT | Late | 45.1 | 60.0 | 44.1 | None |
| Attentive Fusion | CVT | Intermediate | 51.9 | 60.5 | 46.2 | None |
| F-Cooper | CVT | Intermediate | 52.5 | 60.4 | 46.5 | None |
| V2VNet | CVT | Intermediate | 53.5 | 60.2 | 47.5 | None |
| DiscoNet | CVT | Intermediate | 52.9 | 60.7 | 45.8 | None |
| FuseBEVT | CVT | Intermediate | 59.0 | 62.1 | 49.2 | url |
| CoBEVT | SinBEVT | Intermediate | 60.4 | 63.0 | 53.0 | url |

Note: To play with OPV2V camera data, please check here: https://github.com/DerrickXuNu/CoBEVT

Results of 3D Detection on V2XSet LiDAR-Track

| Method | Spconv Version | Backbone | Perfect AP@0.5 | Perfect AP@0.7 | Noisy AP@0.5 | Noisy AP@0.7 | Download Link |
|---|---|---|---|---|---|---|---|
| No Fusion | 2.0 | PointPillar | 60.6 | 40.2 | 60.6 | 40.2 | |
| Late Fusion | 2.0 | PointPillar | 72.7 | 62.0 | 54.9 | 30.7 | |
| Early Fusion | 2.0 | PointPillar | 81.9 | 71.0 | 72.0 | 38.4 | |
| F-Cooper | 2.0 | PointPillar | 84.0 | 68.0 | 71.5 | 46.9 | |
| Attentive Fusion | 2.0 | PointPillar | 80.7 | 66.4 | 70.9 | 48.7 | |
| V2VNet | 2.0 | PointPillar | 84.5 | 67.7 | 79.1 | 49.3 | |
| DiscoNet | 2.0 | PointPillar | 84.4 | 69.5 | 79.8 | 54.1 | |
| CoBEVT | 2.0 | PointPillar | 84.9 | 66.0 | 81.1 | 54.3 | url |
| Where2Comm | 2.0 | PointPillar | 85.5 | 65.4 | 82.0 | 53.4 | url |
| V2X-ViT | 2.0 | PointPillar | 88.2 | 71.2 | 83.6 | 61.4 | url |

Important Notes for Training in V2XSet:

  1. When you train from scratch, first set async and loc_err to false to train on the perfect setting. Also, set compression to 0 at the beginning.
  2. After the model has converged on the perfect setting, set compression to 32 (change the config yaml in your trained model directory) and continue training on the perfect setting for another 1-2 epochs.
  3. Next, set async to true, async_mode to 'real', async_overhead to 200 or 300, loc_err to true, xyz_std to 0.2, and rpy_std to 0.2, and then continue training your model on this noisy setting. Note that you are free to change these noise settings during training to obtain better performance.
  4. Finally, use the model fine-tuned on the noisy setting as the test model for both the perfect and noisy settings.
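
The staged schedule above can be summarized as a sequence of config overrides. The sketch below is only a bookkeeping aid under stated assumptions: the keys are written flat exactly as named in the notes (their actual nesting inside the YAML file is not shown here), async_overhead uses 200 as one of the two suggested values, and settings from earlier stages are assumed to carry over unless overridden. STAGE_OVERRIDES and settings_for_stage are hypothetical names.

```python
# One entry per training stage: (description, overrides applied at that stage).
STAGE_OVERRIDES = [
    ("perfect setting, no compression",
     {"async": False, "loc_err": False, "compression": 0}),
    ("perfect setting, with compression (another 1-2 epochs)",
     {"compression": 32}),
    ("noisy setting (fine-tuning)",
     {"async": True, "async_mode": "real", "async_overhead": 200,
      "loc_err": True, "xyz_std": 0.2, "rpy_std": 0.2}),
]


def settings_for_stage(stage_index):
    """Accumulate overrides from stage 0 up to and including stage_index."""
    settings = {}
    for _, overrides in STAGE_OVERRIDES[: stage_index + 1]:
        settings.update(overrides)
    return settings
```

For example, settings_for_stage(2) returns the noisy-setting values together with compression 32 carried over from the second stage.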

Tutorials

We provide a series of tutorials to help you understand OpenCOOD better. Please check our tutorial series.

Citation

If you are using our OpenCOOD framework or OPV2V dataset for your research, please cite the following paper:

@inproceedings{xu2022opencood,
 author = {Runsheng Xu and Hao Xiang and Xin Xia and Xu Han and Jinlong Li and Jiaqi Ma},
 title = {OPV2V: An Open Benchmark Dataset and Fusion Pipeline for Perception with Vehicle-to-Vehicle Communication},
 booktitle = {2022 IEEE International Conference on Robotics and Automation (ICRA)},
 year = {2022}}

Supported Projects

OpenCOOD has supported several top conference papers in the cooperative perception field.

V2V4Real: A large-scale real-world dataset for Vehicle-to-Vehicle Cooperative Perception
Runsheng Xu, Xin Xia, Jinlong Li, Hanzhao Li, Shuo Zhang, Zhengzhong Tu, Zonglin Meng, Hao Xiang, Xiaoyu Dong, Rui Song, Hongkai Yu, Bolei Zhou, Jiaqi Ma
CVPR 2023
[Paper] [Code]

Robust Collaborative 3D Object Detection in Presence of Pose Errors
Yifan Lu, Quanhao Li, Baoan Liu, Mehrdad Dianati, Chen Feng, Siheng Chen, Yanfeng Wang
ICRA 2023
[Paper] [Code]

Analyzing Infrastructure LiDAR Placement with Realistic LiDAR Simulation Library
Xinyu Cai, Wentao Jiang, Runsheng Xu, Wenquan Zhao, Jiaqi Ma, Si Liu, Yikang Li
ICRA 2023
[Paper] [Code]

Bridging the Domain Gap for Multi-Agent Perception
Runsheng Xu, Jinlong Li, Xiaoyu Dong, Hongkai Yu, Jiaqi Ma∗
ICRA 2023
[Paper] [Code]

Model Agnostic Multi-agent Perception
Runsheng Xu, Weizhe Chen, Hao Xiang, Xin Xia, Lantao Liu, Jiaqi Ma∗
ICRA 2023
[Paper] [Code]

Learning for Vehicle-to-Vehicle Cooperative Perception under Lossy Communication
Jinlong Li, Runsheng Xu, Xinyu Liu, Jin Ma, Zicheng Chi, Jiaqi Ma, Hongkai Yu
TIV 2023
[Paper] [Code]

Where2comm: Communication-Efficient Collaborative Perception via Spatial Confidence Maps
Yue Hu, Shaoheng Fang, Zixing Lei, Yiqi Zhong, Siheng Chen
NeurIPS 2022
[Paper] [Code]

Adaptive Feature Fusion for Cooperative Perception using LiDAR Point Clouds
Donghao Qiao, Farhana Zulkernine
WACV 2023
[Paper]

CoBEVT: Cooperative Bird's Eye View Semantic Segmentation with Sparse Transformers
Runsheng Xu*, Zhengzhong Tu*, Hao Xiang, Wei Shao, Bolei Zhou, Jiaqi Ma
CoRL 2022
[Paper] [Code]

V2X-ViT: Vehicle-to-Everything Cooperative Perception with Vision Transformer
Runsheng Xu*, Hao Xiang*, Zhengzhong Tu*, Xin Xia, Ming-Hsuan Yang, Jiaqi Ma
ECCV 2022
[Paper] [Code] [Talk]

OPV2V: An Open Benchmark Dataset and Fusion Pipeline for Perception with Vehicle-to-Vehicle Communication
Runsheng Xu*, Hao Xiang*, Xin Xia, Jinlong Li, Jiaqi Ma
ICRA 2022
[Paper] [Website] [Code]