The implementation of the paper 'ZebraPose: Coarse to Fine Surface Encoding for 6DoF Object Pose Estimation' (CVPR 2022). arXiv: 2203.09418
- Ubuntu 18.04
- CUDA 11.1
- Python 3.6
- bop_toolkit
- PyTorch 1.9
- torchvision 0.10.0
- opencv-python
- Progressive-X
Clone the repository with git clone --recurse-submodules so that bop_toolkit is also cloned.
- Download the dataset from the BOP benchmark.
- Download the required ZebraPose ground truth folders from owncloud. The folders are models_GT_color, XX_GT (e.g. train_real_GT and test_GT) and models (models is optional; it is only needed if you want to generate the GT from scratch).
- The expected data structure:

      .
      └── BOP ROOT PATH/
          ├── lmo
          ├── ycbv/
          │   ├── models
          │   ├── models_eval
          │   ├── models_fine
          │   ├── test
          │   ├── train_pbr
          │   ├── train_real
          │   ├── ...               #(other files from BOP page)
          │   ├── models_GT_color   #(from last step)
          │   ├── train_pbr_GT      #(from last step)
          │   ├── train_real_GT     #(from last step)
          │   └── test_GT           #(from last step)
          └── tless
- Download the 3 pretrained resnet backbones and save them under zebrapose/pretrained_backbone/resnet.
- (Optional) Instead of downloading the ground truth, you can also generate it from scratch; see Generate_GT.md for details.
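Before training, it can help to confirm that a dataset folder matches the expected data structure. The following is a minimal sketch, not part of the repository: the helper name missing_folders is hypothetical, and the folder list is copied from the ycbv example above, so other datasets may legitimately lack some entries (pass your own list via expected in that case).

```python
from pathlib import Path

# Sub-folders listed in the expected data structure for ycbv.
BOP_FOLDERS = ["models", "models_eval", "models_fine", "test", "train_pbr", "train_real"]
GT_FOLDERS = ["models_GT_color", "train_pbr_GT", "train_real_GT", "test_GT"]


def missing_folders(bop_root, dataset, expected=None):
    """Return the expected sub-folders that are absent from bop_root/dataset."""
    if expected is None:
        expected = BOP_FOLDERS + GT_FOLDERS
    dataset_dir = Path(bop_root) / dataset
    return [name for name in expected if not (dataset_dir / name).is_dir()]


if __name__ == "__main__":
    # "/path/to/BOP_ROOT" is a placeholder; point it at your own BOP root.
    for name in missing_folders("/path/to/BOP_ROOT", "ycbv"):
        print("missing:", name)
```

An empty result means every listed folder was found.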
pip install imgaug
pip install cyglfw3
pip install pyassimp
pip install mmcv termcolor chardet numba

To solve the problem "Error: BadWindow (invalid window parameter)": I found that newer versions of matplotlib cause it. The following steps fix it:

sudo apt-get install libjpeg-dev zlib1g-dev
sudo apt-get install libopenexr-dev
sudo apt-get install openexr
sudo apt-get install python3-dev
sudo apt-get install libglfw3-dev libglfw3
sudo apt-get install libassimp-dev
pip install matplotlib==3.1.3
pip uninstall pillow
CC="cc -mavx2" pip install -U --force-reinstall pillow-simd
Adjust the paths in the config files, and train the network with train.py, e.g.
python train.py --cfg config/config_BOP/lmo/exp_lmo_BOP.txt --obj_name ape
The script will save the last 3 checkpoints and the best checkpoint, as well as the TensorBoard log.
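The retention policy described here (last 3 checkpoints plus the best one) can be illustrated with a small bookkeeping class. This is only a sketch of the idea, not the code used in train.py: the class name CheckpointKeeper is hypothetical, and it assumes that a higher score means a better checkpoint.

```python
from collections import deque


class CheckpointKeeper:
    """Track which checkpoints to keep: the newest `keep_last` plus the best one."""

    def __init__(self, keep_last=3):
        self.recent = deque()
        self.keep_last = keep_last
        self.best_path = None
        self.best_score = float("-inf")  # assumption: higher score is better

    def update(self, path, score):
        """Register a new checkpoint; return the paths that can now be deleted."""
        self.recent.append(path)
        removable = []
        if score > self.best_score:
            # The old best can go only if it is no longer among the recent ones.
            if self.best_path is not None and self.best_path not in self.recent:
                removable.append(self.best_path)
            self.best_path, self.best_score = path, score
        while len(self.recent) > self.keep_last:
            old = self.recent.popleft()
            if old != self.best_path:
                removable.append(old)
        return removable

    def kept(self):
        """Paths currently retained: the best (if older) followed by the recent ones."""
        paths = list(self.recent)
        if self.best_path is not None and self.best_path not in paths:
            paths.insert(0, self.best_path)
        return paths
```

With this scheme at most four files are on disk at any time: three recent checkpoints and, if it is older than those, the best one.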
For most datasets, a specific object occurs only once in a test image.
python test.py --cfg config/config_BOP/lmo/exp_lmo_BOP.txt --obj_name ape --ckpt_file path/to/the/best/checkpoint --ignore_bit 0 --eval_output_path path/to/save/the/evaluation/report
python test.py --cfg config/config_paper/ycbv/exp_ycbv_paper.txt --obj_name bowl --ckpt_file /home/lyltc/git/ZebraPose/results/zebra_ckpts/paper/ycbv/bowl --eval_output_path /home/lyltc/git/ZebraPose/results --debug
For datasets like tless, the number of instances of a specific object is unknown at test time.
python test_vivo.py --cfg config/config_BOP/tless/exp_tless_BOP.txt --ckpt_file path/to/the/best/checkpoint --ignore_bit 0 --obj_name obj01 --eval_output_path path/to/save/the/evaluation/report
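When evaluating many objects, the two commands above can be assembled programmatically. The sketch below only builds the argument list from the flags shown in the examples; the function name build_test_command and the multi_instance switch are hypothetical, and nothing is executed until the list is passed to subprocess.run.

```python
def build_test_command(cfg, obj_name, ckpt_file, eval_output_path,
                       ignore_bit=0, multi_instance=False):
    """Assemble the evaluation command as an argument list for subprocess.run."""
    # test_vivo.py is meant for datasets where the number of instances of an
    # object is unknown (e.g. tless); test.py covers the single-instance case.
    script = "test_vivo.py" if multi_instance else "test.py"
    return [
        "python", script,
        "--cfg", cfg,
        "--obj_name", obj_name,
        "--ckpt_file", ckpt_file,
        "--ignore_bit", str(ignore_bit),
        "--eval_output_path", eval_output_path,
    ]


# Example (paths are placeholders, as in the commands above):
cmd = build_test_command(
    "config/config_BOP/tless/exp_tless_BOP.txt", "obj01",
    "path/to/the/best/checkpoint", "path/to/save/the/evaluation/report",
    multi_instance=True,
)
# import subprocess; subprocess.run(cmd, check=True)
```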
Download our trained model from this link. Progressive-X does not allow setting a random seed through its Python API, so the ADD results can vary by +/- 0.5%.
Merge the .csv files generated in the last step using tools_for_BOP/merge_csv.py, e.g.
python merge_csv.py --input_dir /dir/to/pose_result_bop/lmo --output_fn zebrapose_lmo-test.csv
Then evaluate the merged file according to bop_toolkit.
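For readers who want to see what such a merge amounts to, here is a minimal sketch of concatenating per-object result files. It is not the implementation of tools_for_BOP/merge_csv.py: the function name merge_csv_files is hypothetical, and it assumes every input file starts with the same header row.

```python
import csv
from pathlib import Path


def merge_csv_files(input_dir, output_fn):
    """Concatenate all .csv files in input_dir into output_fn, keeping one header."""
    output_path = Path(output_fn).resolve()
    header, rows = None, []
    for csv_path in sorted(Path(input_dir).glob("*.csv")):
        if csv_path.resolve() == output_path:
            continue  # do not re-read the result of a previous merge
        with open(csv_path, newline="") as f:
            reader = csv.reader(f)
            file_header = next(reader, None)
            if file_header is None:
                continue  # skip empty files
            if header is None:
                header = file_header
            elif file_header != header:
                raise ValueError(f"unexpected header in {csv_path}")
            rows.extend(reader)
    with open(output_path, "w", newline="") as f:
        writer = csv.writer(f)
        if header is not None:
            writer.writerow(header)
        writer.writerows(rows)
    return len(rows)  # number of merged result rows
```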
The results were reported with the same checkpoints. We fixed a bug that only influences the inference results:
The PnP solver requires the Bbox size to calculate the 2D pixel location in the original image. We modified the Bbox size in the dataloader but did not propagate this modification to the PnP solver, which was the bug. If you remove get_final_Bbox from the dataloader, you will get the results reported in v1.
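To see why a mismatched Bbox shifts the 2D locations fed to PnP, consider how a pixel in the resized crop is mapped back to the original image. The sketch below is an illustration only: the function crop_to_image, the Bbox values, and the crop size of 128 are all hypothetical and are not taken from the repository.

```python
def crop_to_image(u, v, bbox, crop_size):
    """Map pixel (u, v) of a crop_size x crop_size crop back to the original image.

    bbox is (x, y, w, h): the region of the original image that was resized
    into the crop.
    """
    x, y, w, h = bbox
    return x + u * w / crop_size, y + v * h / crop_size


# Bbox actually used to cut the crop (assumed here to have been enlarged).
bbox_used = (90.0, 40.0, 120.0, 120.0)
# Bbox before that modification; using it for the back-mapping is the mistake.
bbox_stale = (100.0, 50.0, 100.0, 100.0)

right = crop_to_image(32, 32, bbox_used, crop_size=128)
wrong = crop_to_image(32, 32, bbox_stale, crop_size=128)
print(right, wrong)  # (120.0, 70.0) (125.0, 75.0)
```

The same crop pixel lands 5 pixels away in each direction when the stale Bbox is used, and every 2D-3D correspondence given to the PnP solver is displaced in this way.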
The bug has more influence when the Bbox is resized using crop_square_resize. After fixing the bug, we used crop_square_resize for the BOP challenge (instead of the crop_resize set in the config files in config_paper). We think this resize method should work better since it does not introduce distortion. However, we did not compare the resize methods experimentally.
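The distortion argument can be made concrete by comparing the horizontal and vertical scale factors of the two resize strategies. This is a sketch under assumptions: it supposes that a square resize expands the Bbox to a square around its centre before resizing, which may differ from what crop_square_resize does in detail, and the helper names and numbers are hypothetical.

```python
def resize_scales(bbox_w, bbox_h, crop_size):
    """Horizontal and vertical scale when a w x h Bbox is resized to a square crop."""
    return crop_size / bbox_w, crop_size / bbox_h


def square_bbox(x, y, w, h):
    """Expand a Bbox to a square with the same centre (side = longer edge)."""
    side = max(w, h)
    cx, cy = x + w / 2, y + h / 2
    return cx - side / 2, cy - side / 2, side, side


bbox = (100.0, 50.0, 80.0, 160.0)  # x, y, w, h of a tall, narrow object

# Resizing the Bbox directly: the two axes are scaled differently.
sx, sy = resize_scales(bbox[2], bbox[3], 128)
print(sx, sy)  # 1.6 0.8

# Squaring the Bbox first: both axes share one scale, so shapes are preserved.
_, _, sw, sh = square_bbox(*bbox)
ssx, ssy = resize_scales(sw, sh, 128)
print(ssx, ssy)  # 0.8 0.8
```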
The original code has been developed together with Mahdi Saleh. Some code is adapted from Pix2Pose, SingleShotPose, GDR-Net, and Deeplabv3.
@article{su2022zebrapose,
title={ZebraPose: Coarse to Fine Surface Encoding for 6DoF Object Pose Estimation},
author={Su, Yongzhi and Saleh, Mahdi and Fetzer, Torben and Rambach, Jason and Navab, Nassir and Busam, Benjamin and Stricker, Didier and Tombari, Federico},
journal={arXiv preprint arXiv:2203.09418},
year={2022}
}