A reimplementation of "Improving PILCO with Bayesian Neural Network Dynamics Models" by Yarin Gal et al. in PyTorch.
The hyperparameters of the Deep Ensembles variant have not been optimised, hence its comparatively poor performance.
Even after an extensive search over the hyperparameters not specified in the paper, the results do not quite match those reported by the original authors in either [1] or [2].
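For readers unfamiliar with the Deep Ensembles variant mentioned above, the sketch below shows one common way to merge the Gaussian predictions of several ensemble members into a single mean and variance by moment matching a uniform mixture. It is a minimal illustration in plain Python, not this repository's implementation: the function name `combine_ensemble` and the example numbers are assumptions made up for this sketch.

```python
from statistics import fmean


def combine_ensemble(means, variances):
    """Moment-match a uniform mixture of 1-D Gaussians.

    Hypothetical helper for illustration only; it is not part of this
    repository. Each ensemble member i predicts N(means[i], variances[i]).
    """
    if not means or len(means) != len(variances):
        raise ValueError("need at least one member and one variance per mean")
    mix_mean = fmean(means)
    # Second moment of the mixture: average of (variance + mean^2).
    second_moment = fmean([v + m * m for m, v in zip(means, variances)])
    # Clamp at zero to guard against tiny negative values from rounding.
    mix_var = max(second_moment - mix_mean ** 2, 0.0)
    return mix_mean, mix_var


# Made-up next-state predictions from three ensemble members.
mean, var = combine_ensemble([0.9, 1.0, 1.1], [0.04, 0.05, 0.03])
print(mean, var)
```

The combined variance is the average member variance plus the spread of the member means, so disagreement between members shows up as extra predictive uncertainty.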
pip install -r requirements.txt
From the root of this repository (.../deep-pilco-torch):
pip install -e .
python torchpilco/run/train_deep_pilco.py
python run_plot_rewards.py --log_dirs {runs/deep_pilco_XX runs/deep_pilco_XX2} --labels {label-for-logdir-1 label-for-logdir-2} --save_path {where to save}
python run_plot_trajectories.py --log_dir {runs/deep_pilco_XX} --iter {chosen iteration} --save_path {where to save}
Sample trajectories using the trained policy at iteration 5 and at iteration 40.
[1] Yarin Gal, Rowan Thomas McAllister, and Carl Edward Rasmussen. Improving PILCO with Bayesian Neural Network Dynamics Models.