(BMVC 2019) PyTorch implementation of the paper "Pose from Shape: Deep Pose Estimation for Arbitrary 3D Objects" [PDF] [Project webpage]
If our project is helpful for your research, please consider citing:
@INPROCEEDINGS{Xiao2019PoseFromShape,
author = {Yang Xiao and Xuchong Qiu and Pierre{-}Alain Langlois and Mathieu Aubry and Renaud Marlet},
title = {Pose from Shape: Deep Pose Estimation for Arbitrary {3D} Objects},
booktitle = {British Machine Vision Conference (BMVC)},
year = {2019}}
The generated point clouds for Pascal3D and ObjectNet3D can be downloaded directly from our repo; see `./data/Pascal3D/pointcloud` and `./data/ObjectNet3D/pointcloud`.
The code can be used on a Linux system with the following dependencies: Python 3.6, PyTorch 1.0.1, Python-Blender 2.77, and meshlabserver.
We recommend using a conda environment to install all dependencies and to test the code.
## Download the repository
git clone 'https://github.com/YoungXIAO13/PoseFromShape'
cd PoseFromShape
## Create python env with relevant packages
conda create --name PoseFromShape --file auxiliary/spec-file.txt
source activate PoseFromShape
conda install -c conda-forge matplotlib
## Install blender as a python module
conda install auxiliary/python-blender-2.77-py36_0.tar.bz2
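As an optional sanity check, you can verify that the Blender Python module is importable from the activated environment (this one-liner is not part of the repo's scripts; it only confirms the install succeeded):

```bash
# Optional check: the bpy module installed above should import and report Blender 2.77.
python -c "import bpy; print(bpy.app.version_string)"
```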
To download and prepare the datasets for training and testing (Pascal3D, ObjectNet3D, ShapeNetCore, SUN397, Pix3D, LineMod):
cd data
bash prepare_data.sh
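Once the script finishes, a quick listing of `./data` should show one subfolder per dataset; the folder names in the comment below are assumptions based on the dataset list above, so check `prepare_data.sh` for the exact layout:

```bash
# Expect subfolders such as Pascal3D, ObjectNet3D, ShapeNetCore, SUN397, Pix3D and LineMod
# (names are assumptions; prepare_data.sh defines the actual layout).
ls ./data
```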
To generate the point clouds from the `.obj` files of Pascal3D and ObjectNet3D yourself, check the `data` folder.
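For reference, here is a minimal, hypothetical sketch of how a point cloud can be sampled from an `.obj` with the meshlabserver dependency; the filter script name `sampling.mlx` and the file names are assumptions, and the scripts under `./data` define the pipeline actually used for the released point clouds:

```bash
# Hypothetical example only: sample a point cloud from a mesh with meshlabserver.
# "sampling.mlx" stands for an assumed MeshLab filter script (e.g. Poisson-disk sampling);
# see the scripts under ./data for the exact filters and paths used by this repo.
meshlabserver -i model.obj -o pointcloud.ply -s sampling.mlx
```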
To download the pretrained models (Pascal3D, ObjectNet3D, ShapeNetCore):
cd model
bash download_models.sh
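If you want to confirm that a downloaded checkpoint loads under PyTorch 1.0.1, a small check is sketched below; the checkpoint file name is an assumption, so list `./model` first and adapt the path to what `download_models.sh` actually downloaded:

```bash
# Hypothetical checkpoint name; replace it with one of the files actually present in ./model.
python -c "import torch; ckpt = torch.load('model/ObjectNet3D.pth', map_location='cpu'); print(type(ckpt))"
```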
To train on the ObjectNet3D dataset with real images and coarse alignment:
bash run/train_ObjectNet3D.sh
To train on the Pascal3D dataset with real images and coarse alignment:
bash run/train_Pascal3D.sh
To train on the ShapeNetCore dataset with synthetic images and precise alignment:
bash run/train_ShapeNetCore.sh
While the network was trained on real or synthetic images, all the testing was done on real images.
To test on the ObjectNet3D dataset:
bash run/test_ObjectNet3D.sh
You should obtain the results in Table 1 in the paper (*indicates testing on the novel categories):
Method | Average | bed | bookcase | calculator | cellphone | computer | door | cabinet | guitar | iron | knife | microwave | pen | pot | rifle | shoe | slipper | stove | toilet | tub | wheelchair |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
StarMap | 56 | 73 | 78 | 91 | 57 | 82 | - | 84 | 73 | 3 | 18 | 94 | 13 | 56 | 4 | - | 12 | 87 | 71 | 51 | 60 |
StarMap* | 42 | 37 | 69 | 19 | 52 | 73 | - | 78 | 61 | 2 | 9 | 88 | 12 | 51 | 0 | - | 11 | 82 | 41 | 49 | 14 |
Ours(MV) | 73 | 82 | 90 | 95 | 65 | 93 | 97 | 89 | 75 | 52 | 32 | 95 | 54 | 82 | 45 | 67 | 46 | 95 | 82 | 67 | 66 |
Ours(MV)* | 62 | 65 | 90 | 88 | 65 | 84 | 93 | 84 | 67 | 2 | 29 | 94 | 47 | 79 | 15 | 54 | 32 | 89 | 61 | 68 | 39 |
To test on the Pascal3D dataset with real images:
bash run/test_Pascal3D.sh
You should obtain the results in Table 2 in the paper (*indicates category-agnostic):
Method | Accuracy (%) | Median Error (degrees) |
---|---|---|
Keypoints and Viewpoints | 80.75 | 13.6 |
Render for CNN | 82.00 | 11.7 |
Mousavian | 81.03 | 11.1 |
Grabner | 83.92 | 10.9 |
Grabner* | 81.33 | 11.5 |
StarMap* | 81.67 | 12.8 |
Ours(MV)* | 82.66 | 10.0 |
To test on the Pix3D dataset with real images:
bash run/test_Pix3D.sh
You should obtain the results in Table 3 in the paper (Accuracy in % / Median Error in degrees):
Method | Bed | Chair | Desk |
---|---|---|---|
Georgakis | 50.8 / 28.6 | 31.2 / 57.3 | 34.9 / 51.6 |
Ours(MV) | 59.8 / 20.0 | 52.4 / 26.6 | 56.6 / 26.6 |
To test on another 3D model, first generate multiviews from the `.obj` file by running `python ./data/render_utils.py` with the correct path, and save the testing images picturing this model in a folder. Then run `bash ./demo/inference.sh` with the right `model_path`, `image_path`, `render_path`, and `obj_path` to get the predictions and the images rendered under the predicted pose.
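For concreteness, here is a hedged sketch of that workflow; the argument order, names, and paths are assumptions, so check `./data/render_utils.py` and `./demo/inference.sh` for the interface they actually expose:

```bash
# Hypothetical invocation; arguments and paths are assumptions, not the scripts' real interface.
# 1) Render multiviews of the CAD model:
python ./data/render_utils.py /path/to/my_model.obj
# 2) Run inference with model_path, image_path, render_path and obj_path:
bash ./demo/inference.sh ./model/ObjectNet3D.pth ./my_test_images ./my_model_renders /path/to/my_model.obj
```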
Some examples of applying our model, trained on ObjectNet3D objects with keypoint annotations, to armadillo images can be seen below:
[Figure: five armadillo input images (Input Image 1–5) and the corresponding predictions rendered under the estimated poses (Prediction 1–5)]