This is the official implementation of DreamerPro: Reconstruction-Free Model-Based Reinforcement Learning with Prototypical Representations in TensorFlow 2. A re-implementation of Temporal Predictive Coding for Model-Based Planning in Latent Space is also included.
DreamerPro makes large performance gains on the DeepMind Control suite both in the standard setting and when there are complex background distractions. This is achieved by combining Dreamer with prototypical representations that free the world model from reconstructing visual details.
First clone the repository, then set up a conda environment with all required dependencies using the requirements.txt file:
git clone https://github.com/FrankTianTT/dreamer-pro.git
cd dreamer-pro
conda create --name dreamer-pro python=3.8 cudatoolkit=11.0 cudnn=8.1 ffmpeg conda-forge::cudatoolkit conda-forge::cudnn
conda activate dreamer-pro
pip install --upgrade pip setuptools==57.5.0
pip install torch --index-url https://download.pytorch.org/whl/cu110
pip install -r requirements.txt
DreamerPro has not been tested on Atari, but if you would like to try, the Atari ROMs can be imported by following these instructions.
Our natural background setting follows TPC. For convenience, we have included their code to download the background videos. Simply run:
python download_videos.py
This will download the background videos into kinetics400/videos.
For the baseline, run one of the following, depending on the environment and observation setting:
MUJOCO_GL=egl python dreamerv2/train.py --logdir log/cheetah_noisy --task test_cheetah_video_background_noisy_sensor --configs defaults noisy
MUJOCO_GL=egl python dreamerv2/train.py --logdir log/walker_noisy --task test_walker_video_background_noisy_sensor --configs defaults noisy
MUJOCO_GL=egl python dreamerv2/train.py --logdir log/reacher_noisy --task test_reacher_video_background_noisy_sensor --configs defaults noisy
MUJOCO_GL=egl python dreamerv2/train.py --logdir log/cheetah_jitter --task test_cheetah_video_background_camera_jitter --configs defaults noisy
MUJOCO_GL=egl python dreamerv2/train.py --logdir log/walker_jitter --task test_walker_video_background_camera_jitter --configs defaults noisy
MUJOCO_GL=egl python dreamerv2/train.py --logdir log/reacher_jitter --task test_reacher_video_background_camera_jitter --configs defaults noisy
MUJOCO_GL=egl python dreamerv2/train.py --logdir log/cheetah_video --task test_cheetah_video_background --configs defaults noisy
MUJOCO_GL=egl python dreamerv2/train.py --logdir log/walker_video --task test_walker_video_background --configs defaults noisy
MUJOCO_GL=egl python dreamerv2/train.py --logdir log/reacher_video --task test_reacher_video_background --configs defaults noisy
MUJOCO_GL=egl python dreamerv2/train.py --logdir log/cheetah_noiseless --task test_cheetah_noiseless --configs defaults noisy
MUJOCO_GL=egl python dreamerv2/train.py --logdir log/walker_noiseless --task test_walker_noiseless --configs defaults noisy
MUJOCO_GL=egl python dreamerv2/train.py --logdir log/reacher_noiseless --task test_reacher_noiseless --configs defaults noisy
To restrict a run to a single GPU, prefix the command with CUDA_VISIBLE_DEVICES=1 (or another device index).
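The twelve baseline commands above follow a regular pattern: three environments crossed with four observation settings. The following is a minimal sketch that regenerates them as strings, which may be convenient for launch scripts. The helper name `baseline_command` and its optional `gpu` argument are hypothetical and not part of this repository; the task names, log directories, and flags are copied from the commands above.

```python
import itertools
import shlex

ENVS = ["cheetah", "walker", "reacher"]

# Maps the log-directory suffix to the task-name suffix used in the commands above.
SETTINGS = {
    "noisy": "video_background_noisy_sensor",
    "jitter": "video_background_camera_jitter",
    "video": "video_background",
    "noiseless": "noiseless",
}


def baseline_command(env, setting, gpu=None):
    """Build one baseline training command as a shell string (hypothetical helper)."""
    args = [
        "python", "dreamerv2/train.py",
        "--logdir", f"log/{env}_{setting}",
        "--task", f"test_{env}_{SETTINGS[setting]}",
        "--configs", "defaults", "noisy",
    ]
    prefix = "MUJOCO_GL=egl "
    if gpu is not None:
        # CUDA_VISIBLE_DEVICES limits which GPUs the process can see.
        prefix = f"CUDA_VISIBLE_DEVICES={gpu} " + prefix
    return prefix + shlex.join(args)


# Settings vary in the outer loop so the order matches the list above.
commands = [baseline_command(env, setting)
            for setting, env in itertools.product(SETTINGS, ENVS)]

for command in commands:
    print(command)
```

Calling `baseline_command("walker", "jitter", gpu=1)` yields the walker camera-jitter command pinned to GPU 1.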
For standard DMC, run:
cd DreamerPro
python dreamerv2/train.py --logdir log/dmc_{task}/dreamer_pro/{run} --task dmc_{task} --configs defaults dmc norm_off
Here, {task} should be replaced by the actual task name, and {run} by an integer that identifies one independent run of the same model on the same task. For example, to start the first run on walker_run:
cd DreamerPro
python dreamerv2/train.py --logdir log/dmc_walker_run/dreamer_pro/1 --task dmc_walker_run --configs defaults dmc norm_off
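The placeholder substitution can also be scripted. The following is a small sketch, assuming you want to prepare several independent runs of one task; the function name `standard_dmc_command` and the choice of three runs are illustrative assumptions, while the command template is the one given above.

```python
# Template copied from the standard DMC command above.
TEMPLATE = (
    "python dreamerv2/train.py "
    "--logdir log/dmc_{task}/dreamer_pro/{run} "
    "--task dmc_{task} --configs defaults dmc norm_off"
)


def standard_dmc_command(task, run):
    """Fill in {task} and {run}; run must be an integer run index."""
    if not isinstance(run, int):
        raise TypeError("run must be an integer")
    return TEMPLATE.format(task=task, run=run)


# Three independent runs of walker_run (the count is an arbitrary example).
for run in range(1, 4):
    print(standard_dmc_command("walker_run", run))
```

Each run writes to its own log directory, so the runs do not overwrite each other's results.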
For natural background DMC, run:
cd DreamerPro
python dreamerv2/train.py --logdir log/nat_{task}/dreamer_pro/{run} --task nat_{task} --configs defaults dmc reward_1000
DreamerPro is based on a newer version of Dreamer. For fair comparison, we re-implement TPC based on the same version. Our re-implementation obtains better results in the natural background setting than reported in the original TPC paper.
For standard DMC, run:
cd TPC
python dreamerv2/train.py --logdir log/dmc_{task}/tpc/{run} --task dmc_{task} --configs defaults dmc
For natural background DMC, run:
cd TPC
python dreamerv2/train.py --logdir log/nat_{task}/tpc/{run} --task nat_{task} --configs defaults dmc reward_1000
To train Dreamer on standard DMC, run:
cd Dreamer
python dreamerv2/train.py --logdir log/dmc_{task}/dreamer/{run} --task dmc_{task} --configs defaults dmc
For natural background DMC, run:
cd Dreamer
python dreamerv2/train.py --logdir log/nat_{task}/dreamer/{run} --task nat_{task} --configs defaults dmc reward_1000 --precision 32
We find it necessary to use --precision 32 in the natural background setting for numerical stability.
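This README does not say what goes wrong at lower precision. As general background only, and assuming the alternative to --precision 32 is 16-bit floating point (an assumption, not something confirmed here), the sketch below shows two generic limits of IEEE 754 half precision using only the Python standard library: coarse rounding of moderately large values and overflow beyond the representable range. It does not reproduce any computation from the training code.

```python
import struct


def to_float16(x):
    """Round a Python float to IEEE 754 half precision and back."""
    return struct.unpack("e", struct.pack("e", x))[0]


def to_float32(x):
    """Round a Python float to IEEE 754 single precision and back."""
    return struct.unpack("f", struct.pack("f", x))[0]


# Half precision has an 11-bit significand, so 2049 is not representable.
half_rounded = to_float16(2049.0)
single_rounded = to_float32(2049.0)
print(half_rounded, single_rounded)

# Values far above the half-precision maximum (65504) cannot be packed at all.
try:
    struct.pack("e", 70000.0)
    half_overflowed = False
except OverflowError:
    half_overflowed = True
print(half_overflowed, to_float32(70000.0))
```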
The training process can be monitored via TensorBoard. We have also included performance curves in plots. Note that these curves may appear different from what is shown in TensorBoard. This is because the evaluation return in the performance curves is averaged over 10 episodes, while TensorBoard only shows the evaluation return of the last episode.
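To illustrate why the two views can disagree, here is a minimal sketch that computes both quantities from the same set of evaluation episodes. The return values are made up for illustration, and the function names are hypothetical; they do not correspond to code in this repository.

```python
from statistics import mean


def curve_value(episode_returns):
    """Value plotted in the performance curves: mean over all evaluation episodes."""
    return mean(episode_returns)


def tensorboard_value(episode_returns):
    """Value shown in TensorBoard: return of the last evaluation episode only."""
    return episode_returns[-1]


# Hypothetical returns from 10 evaluation episodes at one checkpoint.
returns = [500.0, 520.0, 480.0, 510.0, 490.0, 530.0, 470.0, 505.0, 495.0, 600.0]

averaged = curve_value(returns)
last_only = tensorboard_value(returns)
print(averaged, last_only)
```

With these example numbers the last episode happens to be an outlier, so the single-episode value sits well above the 10-episode mean. The averaged curve is the less noisy of the two.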
Standard DMC
Natural Background DMC
This repository is largely based on the TensorFlow 2 implementation of Dreamer. We would like to thank Danijar Hafner for releasing and updating his clean implementation. In addition, we also greatly appreciate the help from Tung Nguyen in implementing TPC.
@inproceedings{deng2022dreamerpro,
title={Dreamerpro: Reconstruction-free model-based reinforcement learning with prototypical representations},
author={Deng, Fei and Jang, Ingook and Ahn, Sungjin},
booktitle={International Conference on Machine Learning},
pages={4956--4975},
year={2022},
organization={PMLR}
}