keras-rl

Deep Reinforcement Learning for Keras.

Join the chat at https://gitter.im/keras-rl/Lobby

What is it?

keras-rl implements some state-of-the-art deep reinforcement learning algorithms in Python and integrates seamlessly with the deep learning library Keras. Just like Keras, it works with either Theano or TensorFlow, which means that you can train your algorithms efficiently on either CPU or GPU. Furthermore, keras-rl works with OpenAI Gym out of the box, so evaluating and playing around with different algorithms is easy. Of course, you can extend keras-rl according to your own needs. You can use built-in Keras callbacks and metrics or define your own. It is also easy to implement your own environments and even algorithms by extending a few simple abstract classes.

In a nutshell: keras-rl makes it really easy to run state-of-the-art deep reinforcement learning algorithms, uses Keras (and thus Theano or TensorFlow), and was built with OpenAI Gym in mind.
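
To give a feel for what a custom environment involves, here is a minimal sketch of the reset/step contract that OpenAI Gym environments follow. CoinGuessEnv and everything in it are hypothetical names made up for this illustration. It is a plain Python class and does not inherit from keras-rl's abstract environment class, which you would extend in real code.

```python
import random


class CoinGuessEnv:
    """Toy environment with a Gym-style interface (hypothetical example).

    On each step the agent guesses a coin flip (action 0 or 1) and gets
    reward 1.0 for a correct guess and 0.0 otherwise. An episode lasts
    `max_steps` steps.
    """

    def __init__(self, max_steps=10, seed=None):
        self.max_steps = max_steps
        self.rng = random.Random(seed)
        self.steps_taken = 0

    def reset(self):
        # Start a new episode and return the initial observation.
        self.steps_taken = 0
        return [0.0]

    def step(self, action):
        # Apply the action and return (observation, reward, done, info).
        coin = self.rng.randint(0, 1)
        reward = 1.0 if action == coin else 0.0
        self.steps_taken += 1
        done = self.steps_taken >= self.max_steps
        observation = [float(coin)]  # the agent sees the last coin flip
        return observation, reward, done, {}


# Run one episode with a fixed policy that always guesses 0.
env = CoinGuessEnv(max_steps=3, seed=0)
observation = env.reset()
done = False
total_reward = 0.0
while not done:
    observation, reward, done, info = env.step(0)
    total_reward += reward
```

An agent only needs these two methods to interact with the environment: reset starts an episode, and step advances it by one action until done becomes true.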

What is included?

As of today, the following algorithms have been implemented:

  • Deep Q Learning (DQN) [1], [2]
  • Double DQN [3]
  • Deep Deterministic Policy Gradient (DDPG) [4]
  • Continuous DQN (CDQN or NAF) [6]
  • Cross-Entropy Method (CEM) [7], [8]
  • Dueling network DQN (Dueling DQN) [9]
  • Deep SARSA [10]

You can find more information on each agent in the wiki.
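
As a rough illustration of how two of the agents above differ, the following sketch contrasts the bootstrap target used by DQN [2] with the one used by Double DQN [3]. It uses plain Python lists instead of networks, and the function names and Q-values are made up for this example. It is not how keras-rl implements the agents internally.

```python
def dqn_target(reward, next_q_target, gamma=0.99, done=False):
    """DQN target: bootstrap from the target network's maximum Q-value."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target,
                      gamma=0.99, done=False):
    """Double DQN target: the online network selects the action,
    the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)),
                      key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


# Hypothetical Q-value estimates for the next state (two actions).
next_q_online = [1.0, 2.0]  # online network prefers action 1
next_q_target = [3.0, 0.5]  # target network rates action 0 highest

y_dqn = dqn_target(1.0, next_q_target, gamma=0.9)
y_double = double_dqn_target(1.0, next_q_online, next_q_target, gamma=0.9)
```

With these numbers, y_dqn is about 3.7 while y_double is about 1.45. Decoupling action selection from action evaluation is what keeps Double DQN from always bootstrapping from the largest, and possibly overestimated, Q-value.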

I'm currently working on the following algorithms, which can be found on the experimental branch:

  • Asynchronous Advantage Actor-Critic (A3C) [5]

Note that these are experimental and might not even run at the moment.

How do I install it and how do I get started?

Installing keras-rl is easy. Just run the following command and you should be good to go:

pip install keras-rl

This will install keras-rl and all necessary dependencies.

If you want to run the examples, you'll also have to install gym by OpenAI. Please refer to their installation instructions. It's quite easy and works nicely on Ubuntu and Mac OS X. You'll also need the h5py package to load and save model weights, which can be installed using the following command:

pip install h5py

Once you have installed everything, you can try out a simple example:

python examples/dqn_cartpole.py

This is a very simple example and it should converge relatively quickly, so it's a great way to get started! It also visualizes the game during training, so you can watch it learn. How cool is that?

Unfortunately, the documentation of keras-rl is currently almost non-existent. However, you can find a couple more examples that illustrate the usage of both DQN (for tasks with discrete actions) and DDPG (for tasks with continuous actions). While these examples are no replacement for proper documentation, they should be enough to get started quickly and to see the magic of reinforcement learning yourself. I also encourage you to play around with other environments (OpenAI Gym has plenty) and maybe even try to find better hyperparameters for the existing ones.

If you have questions or problems, please file an issue or, even better, fix the problem yourself and submit a pull request!

Do I have to train the models myself?

Training times can be very long, depending on the complexity of the environment. This repo provides weights that were obtained by running at least some of the examples included in keras-rl. You can load these weights using the load_weights method on the respective agents.
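
A minimal sketch of that workflow is shown below. The helper name evaluate_pretrained and its parameters are hypothetical. It assumes the agent has already been built and compiled with the same architecture the weights were trained with, and that it exposes load_weights together with a test method accepting nb_episodes and visualize.

```python
def evaluate_pretrained(agent, env, weights_path, nb_episodes=5):
    """Load saved weights into an agent and run a few test episodes.

    Assumptions: `agent` is already built and compiled, and provides
    `load_weights(filepath)` and `test(env, nb_episodes=..., visualize=...)`.
    `weights_path` points to a weights file on disk.
    """
    # Restore the trained parameters instead of training from scratch.
    agent.load_weights(weights_path)
    # Run evaluation episodes without rendering the environment.
    return agent.test(env, nb_episodes=nb_episodes, visualize=False)
```

You would call it as evaluate_pretrained(dqn, env, 'path/to/weights.h5f'), where dqn, env and the path stand in for your own agent, environment and weights file.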

Requirements

  • Python 2.7 or Python 3.5
  • Keras >= 1.0.7

That's it. However, if you want to run the examples, you'll also need the following dependencies:

  • OpenAI Gym
  • h5py

keras-rl also works with TensorFlow. To find out how to use TensorFlow instead of Theano, please refer to the Keras documentation.

Documentation

We are currently in the process of putting proper documentation together. The latest version of the documentation is available online. All contributions to the documentation are greatly appreciated!

Support

You can ask questions and join the development discussion on Gitter at https://gitter.im/keras-rl/Lobby.

You can also post bug reports and feature requests (only!) in GitHub issues.

Running the Tests

To run the tests locally, you'll first have to install the following dependencies:

pip install pytest pytest-xdist pep8 pytest-pep8 pytest-cov python-coveralls

You can then run all tests using this command:

py.test tests/.

If you want to check if the files conform to the PEP8 style guidelines, run the following command:

py.test --pep8

Citing

If you use keras-rl in your research, you can cite it as follows:

@misc{plappert2016kerasrl,
    author = {Matthias Plappert},
    title = {keras-rl},
    year = {2016},
    publisher = {GitHub},
    journal = {GitHub repository},
    howpublished = {\url{https://github.com/matthiasplappert/keras-rl}},
}

Acknowledgments

The foundation for this library was developed during my work at the High Performance Humanoid Technologies (H²T) lab at the Karlsruhe Institute of Technology (KIT). It has since been adapted to become a general-purpose library.

References

  1. Playing Atari with Deep Reinforcement Learning, Mnih et al., 2013
  2. Human-level control through deep reinforcement learning, Mnih et al., 2015
  3. Deep Reinforcement Learning with Double Q-learning, van Hasselt et al., 2015
  4. Continuous control with deep reinforcement learning, Lillicrap et al., 2015
  5. Asynchronous Methods for Deep Reinforcement Learning, Mnih et al., 2016
  6. Continuous Deep Q-Learning with Model-based Acceleration, Gu et al., 2016
  7. Learning Tetris Using the Noisy Cross-Entropy Method, Szita et al., 2006
  8. Deep Reinforcement Learning (MLSS lecture notes), Schulman, 2016
  9. Dueling Network Architectures for Deep Reinforcement Learning, Wang et al., 2016
  10. Reinforcement learning: An introduction, Sutton and Barto, 2011

Todos

  • Documentation: Work on the documentation has begun, but not everything is documented in code yet. Additionally, it would be super nice to have a guide for each agent that describes the basic ideas behind it.
  • TRPO, priority-based memory, A3C, async DQN, ...