Reinforcement Learning from Scratch

Spinning Up is an educational resource produced by OpenAI that makes it easier to learn about deep reinforcement learning (deep RL). I really appreciate Spinning Up because I learned a lot from it.

Why I Built This

I decided to write my own implementations after reading the article Spinning Up as a Deep RL Researcher, and in particular the following paragraph:

Write your own implementations. You should implement as many of the core deep RL algorithms from scratch as you can, with the aim of writing the shortest correct implementation of each. This is by far the best way to develop an understanding of how they work, as well as intuitions for their specific performance characteristics.

I will first re-implement the existing algorithms in openai/spinningup in my preferred code style. Then I will implement some algorithms that openai/spinningup does not include.

My design principles:

Write the shortest correct implementation of each core deep RL algorithm.
Write more readable code.

Algorithms

VPG
TRPO
PPO
DDPG
TD3
SAC
DQN
C51
QR-DQN

Installation

Creating the Python environment

conda create -n spinningup python=3.6
source activate spinningup

Installing Spinning Up

git clone https://github.com/XFFXFF/spinningup.git
cd spinningup
pip install -e .

Running Tests

Training a model

cd spinningup
python -m spinup.algos.ppo --env Pendulum-v0 --seed 0
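
Results in deep RL often vary from seed to seed, so it can be useful to repeat the training command with several seeds. The following is a minimal sketch, not part of this repository: the helper names `build_train_command` and `build_seed_sweep` are hypothetical, and the only flags used are `--env` and `--seed`, taken from the command above. It only builds and prints the commands (a dry run).

```python
import shlex


def build_train_command(env, seed, module="spinup.algos.ppo"):
    """Return the argument list for one training run.

    Hypothetical helper: it mirrors the training command shown above
    and uses only the --env and --seed flags from that command.
    """
    return ["python", "-m", module, "--env", env, "--seed", str(seed)]


def build_seed_sweep(env, seeds):
    """Return one training command per seed (hypothetical helper)."""
    return [build_train_command(env, seed) for seed in seeds]


if __name__ == "__main__":
    # Dry run: print the commands instead of launching them.
    for cmd in build_seed_sweep("Pendulum-v0", [0, 1, 2]):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

To launch the runs instead of printing them, each argument list could be passed to `subprocess.run(cmd, check=True)` from the repository root.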

Plotting the performance (average epoch return)

cd spinningup
python -m spinup.plot data/ppo/Pendulum-v0/seed0

See the page on plotting results for documentation of the plotter.

References

C51

A Distributional Perspective on Reinforcement Learning, Bellemare et al., 2017.
Marc G. Bellemare, Pablo Samuel Castro, Carles Gelada, Saurabh Kumar, Subhodeep Moitra. Dopamine, https://github.com/google/dopamine, 2018.

QR-DQN

Distributional Reinforcement Learning with Quantile Regression, Dabney et al., 2017.
