This repository implements the H-UCRL algorithm introduced in Curi, S., Berkenkamp, F., & Krause, A. (2020). Efficient Model-Based Reinforcement Learning through Optimistic Policy Search and Planning.
To install, create a conda environment:
$ conda create -n hucrl python=3.7
$ conda activate hucrl
$ pip install -e .[test,logging,experiments]
For MuJoCo (license required), run:
$ pip install -e .[mujoco]
For the inverted pendulum experiment, run:
$ python exps/inverted_pendulum/run.py
For the MuJoCo (license required) experiment, run:
$ python exps/mujoco/run.py --environment ENV_NAME --agent AGENT_NAME --action
The supported values for ENV_NAME are MBHalfCheetah-v0, MBPusher-v0, MBReacher-v0, MBAnt-v0, MBCartPole-v0, MBHopper-v0, MBInvertedDoublePendulum-v0, MBInvertedPendulum-v0, MBReacher3D-v0, MBSwimmer-v0, and MBWalker2d-v0.
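To run the same agent on several environments, the command can be generated from the environment list. The following is a minimal sketch, not part of the repository: `ENVIRONMENTS` and `build_command` are hypothetical names introduced here, and only the `--environment` and `--agent` flags shown in the command are used. `AGENT_NAME` remains a placeholder that you must replace with an agent the run script accepts.

```python
import shlex

# Environment names copied from the supported list in this README.
ENVIRONMENTS = [
    "MBHalfCheetah-v0",
    "MBPusher-v0",
    "MBReacher-v0",
    "MBAnt-v0",
    "MBCartPole-v0",
    "MBHopper-v0",
    "MBInvertedDoublePendulum-v0",
    "MBInvertedPendulum-v0",
    "MBReacher3D-v0",
    "MBSwimmer-v0",
    "MBWalker2d-v0",
]


def build_command(environment, agent):
    """Return the argument list for one run of exps/mujoco/run.py.

    Hypothetical helper: it only assembles the command and does not
    check that the agent name is valid.
    """
    if environment not in ENVIRONMENTS:
        raise ValueError(f"Unsupported environment: {environment}")
    return [
        "python",
        "exps/mujoco/run.py",
        "--environment",
        environment,
        "--agent",
        agent,
    ]


if __name__ == "__main__":
    # AGENT_NAME is a placeholder; substitute a real agent name.
    for env in ENVIRONMENTS:
        command = build_command(env, "AGENT_NAME")
        # Print a shell-safe command line, one per environment.
        print(" ".join(shlex.quote(part) for part in command))
```

The script only prints the commands, so they can be inspected first and then pasted into a shell or passed to `subprocess.run` from the repository root.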
If you use this repo for your research, please cite it with the following BibTeX entry:
@article{curi2020efficient,
title={Efficient model-based reinforcement learning through optimistic policy search and planning},
author={Curi, Sebastian and Berkenkamp, Felix and Krause, Andreas},
journal={Advances in Neural Information Processing Systems},
volume={33},
year={2020}
}