OnePose++: Keypoint-Free One-Shot Object Pose Estimation without CAD Models

Project Page | Paper

OnePose++: Keypoint-Free One-Shot Object Pose Estimation without CAD Models
Xingyi He^*, Jiaming Sun^*, Yu'ang Wang, Di Huang, Hujun Bao, Xiaowei Zhou
NeurIPS 2022

TODO List

Training, inference and demo code.
Pipeline to reproduce the evaluation results on the OnePose dataset and proposed OnePose_LowTexture dataset.
Use multiple GPUs for parallelized reconstruction and evaluation of multiple objects.
OnePose Cap app: we are preparing to release the data capture app on the App Store (iOS only). Please stay tuned.

Installation

conda env create -f environment.yaml
conda activate oneposeplus

LoFTR and DeepLM are used in this project. We thank their authors for their great work and their contribution to the community. Please follow their installation instructions and LICENSE:

git submodule update --init --recursive

# Install DeepLM (run from the repository root; REPO_ROOT must point to it)
cd submodules/DeepLM
sh example.sh
cp ${REPO_ROOT}/backup/deeplm_init_backup.py ${REPO_ROOT}/submodules/DeepLM/__init__.py

Note that the efficient optimizer DeepLM is used in our SfM refinement phase. If you have difficulty installing it, you can still run the code with our first-order optimizer, which is slightly slower.

COLMAP is also used in this project for Structure-from-Motion. Please refer to the official instructions for the installation.

Download the pretrained models, including our 2D-3D matching and LoFTR models. Then move them to ${REPO_ROOT}/weights.

[Optional] Try our web-based 3D visualization tool Wis3D for convenient, interactive visualization of feature matches and point clouds. Wis3D also provides many other visualization features.

# Work in progress; should be ready very soon. Currently only available on test-pypi.
pip install -i https://test.pypi.org/simple/ wis3d

Demo

After the installation, you can refer to this page to run the demo with your custom data.

Training and Evaluation

Dataset setup

Download the OnePose dataset from here and the OnePose_LowTexture dataset from here, and extract them into /your/path/to/datasets. To evaluate on the LINEMOD dataset, download the real training data, test data, and 3D object models from CDPN, and the YOLOv5 detection results from here, then extract them into /your/path/to/datasets/LINEMOD. The directory should be organized in the following structure:
```
|--- /your/path/to/datasets
|       |--- train_data
|       |--- val_data
|       |--- test_data
|       |--- lowtexture_test_data
|       |--- LINEMOD
|       |      |--- real_train
|       |      |--- real_test
|       |      |--- models
|       |      |--- yolo_detection
```
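
Before running reconstruction, it can help to confirm that the extracted data matches this layout. The following is a minimal sketch and not part of the repository: the function name `missing_dataset_dirs` is hypothetical, and it only checks that the directories listed in the tree above exist.

```python
from pathlib import Path

# Sub-directories taken from the layout above. The LINEMOD entries are only
# needed if you plan to evaluate on LINEMOD.
CORE_DIRS = ["train_data", "val_data", "test_data", "lowtexture_test_data"]
LINEMOD_DIRS = [
    "LINEMOD/real_train",
    "LINEMOD/real_test",
    "LINEMOD/models",
    "LINEMOD/yolo_detection",
]


def missing_dataset_dirs(root, include_linemod=False):
    """Return the expected sub-directories that are absent under `root`."""
    root = Path(root)
    expected = CORE_DIRS + (LINEMOD_DIRS if include_linemod else [])
    return [rel for rel in expected if not (root / rel).is_dir()]
```

For example, `missing_dataset_dirs("/your/path/to/datasets", include_linemod=True)` returns an empty list when every directory in the tree above is present, and otherwise lists the relative paths that still need to be extracted.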

You can refer to the dataset document for more information about the OnePose_LowTexture dataset.

Build the dataset symlinks

REPO_ROOT=/path/to/OnePose_Plus_Plus
ln -s /your/path/to/datasets $REPO_ROOT/data/datasets

Reconstruction

Reconstructed semi-dense object point clouds and 2D-3D correspondences are needed for both training and test objects:

python run.py +preprocess=sfm_train_data.yaml use_local_ray=True  # for train data
python run.py +preprocess=sfm_inference_onepose_val.yaml use_local_ray=True # for val data
python run.py +preprocess=sfm_inference_onepose.yaml use_local_ray=True # for test data
python run.py +preprocess=sfm_inference_lowtexture.yaml use_local_ray=True # for lowtexture test data

Inference

# Eval OnePose dataset:
python inference.py +experiment=inference_onepose.yaml use_local_ray=True verbose=True

# Eval OnePose_LowTexture dataset:
python inference.py +experiment=inference_onepose_lowtexture.yaml use_local_ray=True verbose=True

Note that we perform parallel evaluation on a single GPU with two workers by default. If your GPU has less than 6GB of memory, set use_local_ray=False to turn off the parallelization.
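
The 6GB rule above can be turned into a small helper. This is a sketch under stated assumptions: `pick_use_local_ray` and `inference_command` are hypothetical names that are not part of the repository, the "6GB" threshold is read as 6 GiB, and the total GPU memory is supplied by the caller (for example, read from `nvidia-smi` or from `torch.cuda.get_device_properties(0).total_memory`).

```python
GIB = 1024 ** 3
MIN_PARALLEL_BYTES = 6 * GIB  # assumption: the "6GB" threshold read as 6 GiB


def pick_use_local_ray(total_gpu_memory_bytes):
    """Return True when the GPU is large enough for the two-worker default."""
    return total_gpu_memory_bytes >= MIN_PARALLEL_BYTES


def inference_command(experiment, total_gpu_memory_bytes):
    """Build the inference command shown above with a matching use_local_ray."""
    use_local_ray = pick_use_local_ray(total_gpu_memory_bytes)
    return [
        "python",
        "inference.py",
        f"+experiment={experiment}",
        f"use_local_ray={use_local_ray}",
        "verbose=True",
    ]


# Example: a 4 GiB GPU falls below the threshold, so parallelization is off.
print(" ".join(inference_command("inference_onepose.yaml", 4 * GIB)))
# prints: python inference.py +experiment=inference_onepose.yaml use_local_ray=False verbose=True
```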

Evaluation on LINEMOD Dataset

# Parse LINEMOD Dataset to OnePose Dataset format:
sh scripts/parse_linemod_objs.sh

# Reconstruct SfM model on real training data:
python run.py +preprocess=sfm_inference_LINEMOD.yaml use_local_ray=True

# Eval LINEMOD dataset:
python inference.py +experiment=inference_LINEMOD.yaml use_local_ray=True verbose=True

Training

Prepare ground-truth annotations. Merge annotations of training/val data:

python merge.py +preprocess=merge_annotation_train.yaml
python merge.py +preprocess=merge_annotation_val.yaml

Begin training
```
python train_onepose_plus.py +experiment=train.yaml exp_name=onepose_plus_train
```
Note that the default config for training uses 8 GPUs with around 23GB VRAM for each GPU. You can set the GPU number or ID in trainer.gpus and reduce the batch size in datamodule.batch_size to reduce the GPU VRAM footprint.
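
As an illustration of adjusting these two settings, the sketch below assembles a training command with extra `key=value` overrides. It assumes that `trainer.gpus` and `datamodule.batch_size` can be overridden on the command line in the same style as `exp_name`; the helper name `training_command` is hypothetical, and the values `4` and `2` are placeholders. Check the training config for the value format that `trainer.gpus` accepts.

```python
def training_command(exp_name, overrides=None):
    """Assemble the training command, appending key=value overrides."""
    cmd = [
        "python",
        "train_onepose_plus.py",
        "+experiment=train.yaml",
        f"exp_name={exp_name}",
    ]
    for key, value in (overrides or {}).items():
        cmd.append(f"{key}={value}")
    return cmd


# Placeholder values: fewer GPUs and a smaller batch size than the default.
cmd = training_command(
    "onepose_plus_train",
    {"trainer.gpus": 4, "datamodule.batch_size": 2},
)
print(" ".join(cmd))
```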

All model weights will be saved under ${REPO_ROOT}/models/checkpoints/${exp_name} and logs will be saved under ${REPO_ROOT}/logs/${exp_name}. You can visualize the training process with TensorBoard:

tensorboard --logdir logs --bind_all --port your_port_number

Citation

If you find this code useful for your research, please use the following BibTeX entry.

@inproceedings{
    he2022oneposeplusplus,
    title={OnePose++: Keypoint-Free One-Shot Object Pose Estimation without {CAD} Models},
    author={Xingyi He and Jiaming Sun and Yuang Wang and Di Huang and Hujun Bao and Xiaowei Zhou},
    booktitle={Advances in Neural Information Processing Systems},
    year={2022}
}

Acknowledgement

Part of our code is borrowed from hloc and LoFTR. Thanks to their authors for their great work.
