Mix-Teaching: A General Semi-Supervised Learning Framework for Monocular 3D Object Detection
This is the official implementation of our manuscript Mix-Teaching: A General Semi-Supervised Learning Framework for Monocular 3D Object Detection. The KITTI raw data, which consists of 48K temporal images, is used as unlabeled data in all experiments. For more details, please see our paper.
Performance on the KITTI validation set (3D) is as follows:
Models | 10% Easy | 10% Mod | 10% Hard | 30% Easy | 30% Mod | 30% Hard | 100% Easy | 100% Mod | 100% Hard |
MonoFlex | 5.76 | 4.67 | 3.54 | 15.58 | 11.03 | 8.93 | 23.64 | 17.51 | 14.83 |
Ours | 14.43 | 10.65 | 8.41 | 23.81 | 16.94 | 13.80 | 30.82 | 22.18 | 18.61 |
Abs. Imp. | +8.57 | +5.98 | +4.87 | +8.23 | +5.91 | +4.87 | +7.18 | +4.67 | +3.78 |
Getting Started
1. Installation
Please refer to Installation
Then run
pip install mmcv-full==1.2.5 mmdet==2.11.0
git clone https://github.com/open-mmlab/mmdetection3d && cd mmdetection3d && git checkout v0.9.0
cd ../ && pip install mmdetection3d/
2. Dataset
Please first download the KITTI training set and organize it in the following structure:
datasets
├──kitti
│   ├──ImageSets
│   ├──training        <-- 7481 train data
│   │   ├──calib
│   │   ├──label_2
│   │   └──image_2
│   └──testing         <-- empty directory to save raw data in official format
│       ├──calib
│       ├──image_2
│       └──ImageSets
└──raw_data            <-- raw data in zip format
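Before running the conversion scripts, it can help to confirm that the labeled part of the layout is in place. The following is a minimal sketch, not part of this repository: the helper name `missing_dirs` is hypothetical, and it only checks the `ImageSets` and `training` sub-directories listed in the tree, since `testing` and `raw_data` are filled in by later steps.

```python
from pathlib import Path

# Sub-directories taken from the directory tree in this README.
# The helper below is a hypothetical convenience, not a repository script.
REQUIRED_DIRS = (
    "kitti/ImageSets",
    "kitti/training/calib",
    "kitti/training/label_2",
    "kitti/training/image_2",
)


def missing_dirs(datasets_root):
    """Return the required sub-directories that do not exist under datasets_root."""
    root = Path(datasets_root)
    return [d for d in REQUIRED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs("datasets")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("Layout looks complete.")
```

Run it from the repository root so that the relative path `datasets` resolves as in the tree.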
Download the KITTI raw data and convert it to the official detection format:
cd datasets && mkdir raw_data
cd ../raw_data_tools && bash download_raw_data.sh ../datasets/raw_data
python convert_det_format.py --raw_data_root ../datasets/raw_data --kitti_root ../datasets/kitti
cd ../pseudo_labeling_tools && python generate_imageset.py --kitti_root ../datasets/kitti
Then run
python create_data.py --kitti_root ../datasets/kitti
3. Train teacher model
Please refer to Training in supervised mode.
4. Generate pseudo labels for unlabeled data
Please refer to Inference.
Run inference on the unlabeled data and organize the results in the following structure:
pred_folders
├──model_1_preds
│   ├──000000.txt
│   ├──000001.txt
│   └── ...
├──model_2_preds
│   ├──000000.txt
│   ├──000001.txt
│   └── ...
└── ...
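A quick consistency check on this folder can catch incomplete inference runs early. The sketch below assumes that every model folder is expected to contain a prediction file for the same set of frames; the source does not state this requirement, and the function names `frame_ids` and `check_pred_folders` are hypothetical.

```python
from pathlib import Path


def frame_ids(pred_dir):
    """Return the set of frame IDs (file stems) of the .txt files in one folder."""
    return {p.stem for p in Path(pred_dir).glob("*.txt")}


def check_pred_folders(pred_root):
    """Map each model folder name to the frame IDs it lacks.

    "Lacks" is measured against the union of frame IDs over all model folders.
    Assumption: all models should cover the same frames.
    """
    folders = sorted(p for p in Path(pred_root).iterdir() if p.is_dir())
    ids = {f.name: frame_ids(f) for f in folders}
    all_ids = set().union(*ids.values())
    return {name: sorted(all_ids - found) for name, found in ids.items()}
```

Calling `check_pred_folders("pred_folders")` returns a dictionary such as `{"model_1_preds": [], "model_2_preds": ["000001"]}` when the second model is missing frame `000001`.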
5. Pseudo labeling
python uncertainty_estimator.py --kitti_root ../datasets/kitti --pred_folders <path-to-pred_folders>/
python create_data.py --kitti_root ../datasets/kitti --ssl True
python create_background_infos.py --kitti_root ../datasets/kitti
python parse_db_infos.py --old_db_infos ../datasets/kitti/kitti_dbinfos_test.pkl --new_db_infos ../datasets/kitti/kitti_dbinfos_test_filtered.pkl --score_threshold 0.7 --geo_conf_threshold 0.75
or
bash pseudo_labeling.sh
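The two thresholds passed to `parse_db_infos.py` suggest that pseudo-labeled objects are kept only when both their detection score and their geometric confidence are high enough. The sketch below illustrates that idea on plain dictionaries. It is not the repository's implementation: the field names `score` and `geo_conf`, the inclusive comparison, and the requirement that both conditions hold together are all assumptions.

```python
def filter_pseudo_labels(objects, score_threshold=0.7, geo_conf_threshold=0.75):
    """Keep objects whose score and geometric confidence both reach the thresholds.

    The keys "score" and "geo_conf" are hypothetical field names used for
    illustration; the actual database entries may be structured differently.
    """
    return [
        obj
        for obj in objects
        if obj["score"] >= score_threshold and obj["geo_conf"] >= geo_conf_threshold
    ]


# Toy candidates: only the first one passes both thresholds.
candidates = [
    {"name": "Car", "score": 0.92, "geo_conf": 0.81},
    {"name": "Car", "score": 0.65, "geo_conf": 0.90},  # score too low
    {"name": "Car", "score": 0.88, "geo_conf": 0.60},  # geo_conf too low
]
kept = filter_pseudo_labels(candidates)
print(len(kept))  # 1
```

Raising either threshold trades pseudo-label quantity for quality, so the values 0.7 and 0.75 from the command above are a reasonable starting point to tune from.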
6. Train student model with labeled and unlabeled data
Please refer to Training in semi-supervised mode.
7. Continue with step 4.
Citation
If you find our work useful in your research, please consider citing:
@article{Yang2022MixTeachingAS,
title={Mix-Teaching: A Simple, Unified and Effective Semi-Supervised Learning Framework for Monocular 3D Object Detection},
author={Lei Yang and Xinyu Zhang and Li Wang and Minghan Zhu and Chuan-Fang Zhang and Jun Li},
journal={ArXiv},
year={2022},
volume={abs/2207.04448}
}
Acknowledgements
Thanks to the excellent monocular 3D object detection codebase MonoFlex.
Thanks to the excellent perception dataset KITTI.
Contact
If you have any problems with this code, please feel free to contact yanglei20@mails.tsinghua.edu.cn.