The MPI3D datasets were introduced to benchmark representation learning algorithms across simulated and real-world environments. The first transfer learning results for unsupervised disentangled representations are presented in our NeurIPS 2019 paper.
The dataset is also used in the NeurIPS Disentanglement Challenge. If you use this dataset in your work, please cite us.
@inproceedings{NEURIPS2019_d97d404b,
author = {Gondal, Muhammad Waleed and Wuthrich, Manuel and Miladinovic, Djordje and Locatello, Francesco and Breidt, Martin and Volchkov, Valentin and Akpo, Joel and Bachem, Olivier and Sch\"{o}lkopf, Bernhard and Bauer, Stefan},
booktitle = {Advances in Neural Information Processing Systems},
editor = {H. Wallach and H. Larochelle and A. Beygelzimer and F. d\textquotesingle Alch\'{e}-Buc and E. Fox and R. Garnett},
pages = {},
publisher = {Curran Associates, Inc.},
title = {On the Transfer of Inductive Bias from Simulation to the Real World: a New Disentanglement Dataset},
url = {https://proceedings.neurips.cc/paper/2019/file/d97d404b6119214e4a7018391195240a-Paper.pdf},
volume = {32},
year = {2019}
}
There are four different datasets. The GIFs were created using the disentanglement_lib visualization tool.
The first three datasets each consist of 1,036,800 images, corresponding to all possible combinations of the following factors of variation:
Factors | Possible Values |
---|---|
object_color | white=0, green=1, red=2, blue=3, brown=4, olive=5 |
object_shape | cone=0, cube=1, cylinder=2, hexagonal=3, pyramid=4, sphere=5 |
object_size | small=0, large=1 |
camera_height | top=0, center=1, bottom=2 |
background_color | purple=0, sea green=1, salmon=2 |
horizontal_axis | 0,...,39 |
vertical_axis | 0,...,39 |
The real-world complex dataset consists of 460,800 images, covering the combinations of the following factors of variation:
Factors | Possible Values |
---|---|
object_color | yellow=0, green=1, olive=2, red=3 |
object_shape | coffee-cup=0, tennis-ball=1, croissant=2, beer-cup=3 |
object_size | small=0, large=1 |
camera_height | top=0, center=1, bottom=2 |
background_color | purple=0, sea green=1, salmon=2 |
horizontal_axis | 0,...,39 |
vertical_axis | 0,...,39 |
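The image counts quoted for the two dataset variants follow from multiplying the number of possible values of each factor. The snippet below is a minimal sketch of that arithmetic; the names `STANDARD_FACTORS`, `COMPLEX_FACTORS`, and `num_images` are illustrative and not part of the dataset release.

```python
import math

# Number of values per factor, copied from the two tables above.
STANDARD_FACTORS = {
    "object_color": 6,
    "object_shape": 6,
    "object_size": 2,
    "camera_height": 3,
    "background_color": 3,
    "horizontal_axis": 40,
    "vertical_axis": 40,
}
# The real-world complex dataset differs only in its colors and shapes.
COMPLEX_FACTORS = {**STANDARD_FACTORS, "object_color": 4, "object_shape": 4}


def num_images(factor_sizes):
    """Return the number of distinct factor combinations."""
    return math.prod(factor_sizes.values())


print(num_images(STANDARD_FACTORS))  # 1036800
print(num_images(COMPLEX_FACTORS))   # 460800
```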
So far, we only provide the datasets at 64x64 resolution. Higher resolution versions will be made available in the near future.
The datasets are provided as NumPy arrays. Once the data is loaded, you can use array.reshape([6,6,2,3,3,40,40,64,64,3]) to obtain an array whose first seven dimensions correspond to the generative factors, in the order given in the table above, and whose last three correspond to the image dimensions. Note that for the real-world complex dataset you need to use array.reshape([4,4,2,3,3,40,40,64,64,3]) instead.
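import numpy as np

# Load the flat image array stored under the 'images' key.
data = np.load('./mpi3d_real.npz')['images']
# To visualize each factor of the data independently, reshape the array so
# that the first seven axes index the factors of variation.
data = data.reshape([6,6,2,3,3,40,40,64,64,3])
# For the real-world complex dataset, use this shape instead
# (the two reshapes are alternatives; do not apply both to the same array):
# data = data.reshape([4,4,2,3,3,40,40,64,64,3])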
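Because the reshape uses NumPy's default row-major order, the position of an image in the flat array and its factor values determine each other, with the last factor (vertical_axis) varying fastest. The following is a minimal sketch of that mapping; `index_to_factors` and `factors_to_index` are hypothetical helper names, and an array of integer ids stands in for the images so the check runs without downloading the data.

```python
import numpy as np

# Factor axis sizes in table order. For the real-world complex dataset,
# use (4, 4, 2, 3, 3, 40, 40) instead.
FACTOR_SHAPE = (6, 6, 2, 3, 3, 40, 40)


def index_to_factors(flat_index, factor_shape=FACTOR_SHAPE):
    """Map a flat image index to a tuple of factor values."""
    return tuple(int(v) for v in np.unravel_index(flat_index, factor_shape))


def factors_to_index(factors, factor_shape=FACTOR_SHAPE):
    """Map a tuple of factor values back to the flat image index."""
    return int(np.ravel_multi_index(factors, factor_shape))


# Stand-in for the image array: one integer id per image.
flat_ids = np.arange(np.prod(FACTOR_SHAPE))
grid = flat_ids.reshape(FACTOR_SHAPE)

# red cube, small, center camera, salmon background, axis positions 10 and 35
factors = (2, 1, 0, 1, 2, 10, 35)
assert grid[factors] == factors_to_index(factors)
assert index_to_factors(factors_to_index(factors)) == factors
```

With the real data reshaped as described above, the same tuple can be used directly as an index, for example `data[factors]` to obtain a single 64x64x3 image.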
Use the following links to download the datasets.
- mpi3d_toy (simplistic rendered): link
- mpi3d_realistic (realistic rendered): link
- mpi3d_real (real-world images): link
- mpi3d_real_complex (real-world complex shapes images): link
Please send any feedback to waleed.gondal10@gmail.com
This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/).