ObjectPoseEstimationDatasets

A repo summarizing the datasets used for object pose estimation and the rendering methods used to generate synthetic training data.

In the following tables, a 3D CAD model is referred to as a model and an object pictured in a 2D image is referred to as an object.

Papers summarizing some related datasets can be found here and here.

Objects in controlled environments

This table lists the datasets commonly known as BOP: Benchmark for 6D Object Pose Estimation, which provide accurate 3D object models and precise 2D-3D alignment.

You can download all the BOP datasets here and use the toolkit provided by the authors.

You can use our code ply2obj.py to convert original .ply files to .obj files, and run create_annotation.py to create a single annotation file for all the scenes in a dataset.
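As a rough illustration of what such a conversion involves, here is a minimal sketch. It is not the repo's ply2obj.py: the helper name `ply_to_obj_text` is hypothetical, and it assumes an ASCII PLY file whose vertex element comes before the face element, with x, y, z as the first three vertex properties. Binary PLY files, normals, colors and textures are not handled.

```python
def ply_to_obj_text(ply_text):
    """Convert the text of an ASCII PLY mesh into OBJ text (vertices and faces only).

    Hypothetical helper for illustration; assumes x, y, z are the first three
    vertex properties and that the vertex element precedes the face element.
    """
    lines = ply_text.strip().splitlines()
    if not lines or lines[0].strip() != "ply":
        raise ValueError("not a PLY file")

    n_vertices = 0
    n_faces = 0
    header_end = None
    for i, line in enumerate(lines):
        tokens = line.split()
        if tokens[:1] == ["format"] and tokens[1] != "ascii":
            raise ValueError("only ASCII PLY is supported in this sketch")
        if tokens[:2] == ["element", "vertex"]:
            n_vertices = int(tokens[2])
        elif tokens[:2] == ["element", "face"]:
            n_faces = int(tokens[2])
        elif tokens[:1] == ["end_header"]:
            header_end = i
            break
    if header_end is None:
        raise ValueError("missing end_header")

    body = lines[header_end + 1:]
    out = []
    # Vertex lines: keep only the first three values (x, y, z).
    for line in body[:n_vertices]:
        x, y, z = line.split()[:3]
        out.append(f"v {x} {y} {z}")
    # Face lines: "<count> i0 i1 ..." with 0-based indices; OBJ is 1-based.
    for line in body[n_vertices:n_vertices + n_faces]:
        tokens = line.split()
        count = int(tokens[0])
        indices = [int(t) + 1 for t in tokens[1:1 + count]]
        out.append("f " + " ".join(str(i) for i in indices))
    return "\n".join(out) + "\n"
```

To use it on a file, read the .ply as text, pass the string to the function, and write the result to a .obj file.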

The dataset format can be found here. In our annotation, we use an instance id to distinguish the different instances pictured in the same image.
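The idea of a single annotation file with per-image instance ids can be sketched as follows. This is not the repo's create_annotation.py: the function name `merge_scene_annotations` and the output record layout are assumptions made for illustration. The input is assumed to follow the BOP `scene_gt.json` layout, where each image id maps to a list of ground-truth entries with `obj_id`, `cam_R_m2c` and `cam_t_m2c`.

```python
def merge_scene_annotations(scenes):
    """Flatten per-scene ground truth into one list of records.

    `scenes` maps a scene id to a dict shaped like a BOP scene_gt.json:
    {"<image_id>": [{"obj_id": ..., "cam_R_m2c": [...], "cam_t_m2c": [...]}, ...]}.
    The record layout below is hypothetical, not the repo's actual format.
    """
    records = []
    for scene_id in sorted(scenes):
        scene_gt = scenes[scene_id]
        for image_id in sorted(scene_gt, key=int):
            # The instance id is the position of the entry within its image,
            # so two instances of the same model in one image stay distinct.
            for instance_id, gt in enumerate(scene_gt[image_id]):
                records.append({
                    "scene_id": scene_id,
                    "image_id": int(image_id),
                    "instance_id": instance_id,
                    "obj_id": gt["obj_id"],
                    "cam_R_m2c": gt["cam_R_m2c"],
                    "cam_t_m2c": gt["cam_t_m2c"],
                })
    return records
```

The resulting list can be written to disk with `json.dump` to obtain one annotation file per dataset.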

Dataset	Annotation	Statistics	Reference
HomebrewedDB	6D Pose + Depth + BoundingBox	33 models in 13 videos with 17,420 frames	Preprint 2019
YCB-Video	6D Pose + Depth + Mask	21 models in 92 videos with 133,827 frames	RSS 2018
T-LESS	6D Pose + Depth	30 models in 20 videos with ~49K frames	WACV 2017
Doumanoglou	6D Pose + Depth	2 models in 3 videos with 183 frames	CVPR 2016
Tejani	6D Pose + Depth	6 models in 6 videos with 2,067 frames	ECCV 2014
Occluded-LINEMOD	6D Pose + Depth	8 models in 1,214 frames with 8,992 objects	ECCV 2014
LINEMOD	6D Pose + Depth for one object	15 models in 15 videos with 18,273 frames	ACCV 2012

Objects in the wild

In this table, Pix3D and ScanNet provide precise 2D-3D alignment, while the others provide only a coarse alignment.

PASCAL3D+ is the de facto benchmark used for viewpoint estimation.

ScanNet is usually used to evaluate scene reconstruction and segmentation.

Dataset	Annotation	Statistics	Reference
ApolloCar3D	6D Pose + Mask	34 car models with 60K+ objects in 5,277 images	CVPR 2019
Pix3D	6D Pose + Mask	9 categories containing 395 models in 10,069 images	CVPR 2018
ScanNet	6D Pose + Segmentation + Depth	2.5M RGB-D frames in 1,515 scenes	CVPR 2017
ObjectNet3D	Euler Angles + BoundingBox	100 categories with 201,888 objects in 90,127 images	ECCV 2016
PASCAL3D+	Euler Angles + BoundingBox	12 categories with 36,292 objects in 30,889 images	WACV 2014
KITTI	3D BoundingBox	80,256 objects in 14,999 images	CVPR 2012
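Datasets annotated with Euler angles need a conversion step before they can be compared with 6D-pose rotations. The sketch below builds a rotation matrix from azimuth, elevation and in-plane rotation. The composition order `Rz(inplane) * Rx(elevation) * Rz(azimuth)` and the function names are assumptions for illustration; each dataset defines its own axis and sign conventions, so check the dataset's documentation before relying on it.

```python
import math


def rot_x(angle):
    """Rotation matrix about the X axis (angle in radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_z(angle):
    """Rotation matrix about the Z axis (angle in radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def euler_to_rotation(azimuth, elevation, inplane):
    """Compose Rz(inplane) * Rx(elevation) * Rz(azimuth).

    The order and axes are an assumed convention, not taken from any
    specific dataset.
    """
    return matmul(rot_z(inplane), matmul(rot_x(elevation), rot_z(azimuth)))
```

Whatever convention is used, the result should be orthonormal, which is a cheap sanity check when adapting the function to a particular dataset.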

3D model datasets

To test a network's generalization ability (that is, evaluating it on images containing 3D models not seen during training), the following datasets can be used to generate synthetic training data.

Note that ABC contains generic, arbitrary industrial CAD models, while ShapeNetCore and ModelNet contain objects from common categories such as cars and chairs.

Dataset	Number of categories	Number of models	Reference
ABC	-	1 million	CVPR 2019
ShapeNetCore	55	~51,300	ArXiv 2015
ModelNet-40	40	26,960	CVPR 2015

Rendering methods

Differentiable Renderer

Neural 3D Mesh Renderer: Kato et al. CVPR 2018

RenderNet: Nguyen-Phuoc et al. NIPS 2018

Blender Render

In this repo, we provide Python code to render images from 3D models using Blender as a Python module, which is easy to install and generates photo-realistic images : )

TODO: add scripts showing how to use it.

Other work using Blender, which renders one model at a time, can be found here.
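When rendering a model from many viewpoints, one common way to choose camera positions is to spread them roughly evenly over a sphere around the model. The sketch below uses a Fibonacci lattice for this. It is not part of the repo's rendering code, and the function name `sample_viewpoints` and its parameters are hypothetical.

```python
import math


def sample_viewpoints(n, radius=1.0):
    """Return n camera positions spread roughly evenly on a sphere.

    Uses a Fibonacci lattice; the sphere is centered at the origin, where
    the model is assumed to sit. Illustrative helper, not the repo's code.
    """
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        # z runs from near +1 down to near -1 in equal steps.
        z = 1.0 - 2.0 * (i + 0.5) / n
        r = math.sqrt(1.0 - z * z)
        phi = i * golden_angle
        points.append((radius * r * math.cos(phi),
                       radius * r * math.sin(phi),
                       radius * z))
    return points
```

Each returned position can then be used as a camera location pointing at the origin in whichever renderer is chosen.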

Physical Simulator

PyBullet: very popular in the robotics community.

Others

Glumpy: does not support headless rendering (it failed when run over SSH).

UnrealCV: an extension for Unreal Engine 4 that helps interact with the virtual world and communicate with external programs.

SyntheticComputerVision: summarizes many techniques used to generate synthetic images.

Attention: 3D models should be aligned in the same way (using MeshLab) to ensure a consistent orientation across the different datasets.
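To make the alignment issue concrete, here is a minimal sketch of one typical fix: converting vertices from a Y-up convention to a Z-up convention with a 90 degree rotation about the X axis. Which datasets use which convention is not stated here, so treat the choice of axes as an assumption; the function name `y_up_to_z_up` is hypothetical, and the same result can be obtained interactively in MeshLab.

```python
def y_up_to_z_up(vertices):
    """Rotate vertices by +90 degrees about X so that +Y becomes +Z.

    `vertices` is an iterable of (x, y, z) tuples. Illustrative helper:
    verify the source and target conventions of your datasets first.
    """
    # Rotation about X by 90 degrees: (x, y, z) -> (x, -z, y).
    return [(x, -z, y) for (x, y, z) in vertices]
```

Applying the same transform to every model of a dataset, and the inverse to its pose annotations if they are reused, keeps models and poses consistent with each other.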
