Clean PyTorch implementations of Reinforcement Learning algorithms forked from OpenAI Spinning Up. This serves as a resource for understanding deep RL methods.
This fork contains all the PyTorch implementations from the original repo, plus additional algorithms. The new implementations all follow the Spinning Up code format.
The following algorithms have been implemented.
Algorithm | Implementation | Box | Discrete | Multi Processing |
---|---|---|---|---|
REINFORCE [1] | RLBase | ✔️ | ✔️ | ✔️ |
VPG [2, 3] | Spinning Up | ✔️ | ✔️ | ✔️ |
PPO [4] | Spinning Up | ✔️ | ✔️ | ✔️ |
Double DQN [5, 6] | RLBase | 🔲 | ✔️ | 🔲 |
DDPG [7] | Spinning Up | ✔️ | 🔲 | 🔲 |
TD3 [8] | Spinning Up | ✔️ | 🔲 | 🔲 |
SAC [9] | Spinning Up | ✔️ | 🔲 | 🔲 |
HER [10] | RLBase | ✔️ | ✔️ | 🔲 |
Note: This is a work in progress and more algorithms will be added over time.
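For orientation, the policy-gradient update at the heart of REINFORCE [1] can be sketched in a few lines of PyTorch. This is an illustrative example with dummy data, not this repo's implementation; the function name and tensor shapes are assumptions:

```python
import torch

def reinforce_loss(logits, acts, rets):
    """REINFORCE-style loss for a categorical policy (illustrative sketch).

    logits: (batch, n_actions) policy logits; acts, rets: (batch,) actions and returns.
    """
    logp = torch.distributions.Categorical(logits=logits).log_prob(acts)
    # Gradient ascent on E[log pi(a|s) * R] == gradient descent on the negated mean
    return -(logp * rets).mean()

# Dummy batch: a uniform policy over 2 actions, 3 sampled steps
logits = torch.zeros(3, 2, requires_grad=True)
acts = torch.tensor([0, 1, 0])
rets = torch.tensor([1.0, -1.0, 0.5])  # return observed after each action
loss = reinforce_loss(logits, acts, rets)
loss.backward()                        # logits.grad now holds the policy gradient
```

Actions followed by positive returns get their log-probability pushed up, negative returns push it down; everything else in the table builds on refinements of this idea.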
- You can install the package and all its dependencies using `pip`:

  ```
  # From ~/rlbase
  pip install -e .
  ```
- Follow the Spinning Up instructions for running the experiments and plotting the results.
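As a reminder, upstream Spinning Up launches experiments and plots results with its `spinup.run` entry point, along the lines of the commands below. The environment and experiment names here are placeholders, and this fork's entry point may differ:

```shell
# Train PPO on a Gym environment and name the experiment
python -m spinup.run ppo --env CartPole-v1 --exp_name ppo_cartpole

# Plot the learning curves saved under data/
python -m spinup.run plot data/ppo_cartpole
```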
1. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning
2. Policy Gradient Methods for Reinforcement Learning with Function Approximation
3. High-Dimensional Continuous Control Using Generalized Advantage Estimation
4. Proximal Policy Optimization Algorithms
5. Human-level control through deep reinforcement learning
6. Deep Reinforcement Learning with Double Q-learning
7. Continuous control with deep reinforcement learning
8. Addressing Function Approximation Error in Actor-Critic Methods
9. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
10. Hindsight Experience Replay
New implementations in this fork:

- REINFORCE
- DQN and Double DQN
- Hindsight Experience Replay (HER)

Planned:

- Option-Critic
- FeUdal Networks

TODO:

- Refactor the code to share components among algorithms of the same class
- Add CNN option for policies
- Add Tensorboard support
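As a companion to the Double DQN entry above: the key change from vanilla DQN [5] is that the online network selects the next action while the target network evaluates it [6], which reduces overestimation. A minimal sketch of the target computation follows; the tensor names and shapes are assumptions, not this repo's code:

```python
import torch

def double_dqn_target(q_online_next, q_target_next, rewards, dones, gamma=0.99):
    """Compute Double DQN bootstrap targets (illustrative sketch).

    q_online_next, q_target_next: (batch, n_actions) Q-values at the next states.
    rewards, dones: (batch,) tensors; dones is 1.0 at terminal transitions.
    """
    best_acts = q_online_next.argmax(dim=1, keepdim=True)      # select action with online net
    next_vals = q_target_next.gather(1, best_acts).squeeze(1)  # evaluate it with target net
    return rewards + gamma * (1.0 - dones) * next_vals

# Tiny example: the online net prefers action 1, which the target net values at 0.1
y = double_dqn_target(
    q_online_next=torch.tensor([[1.0, 2.0]]),
    q_target_next=torch.tensor([[0.5, 0.1]]),
    rewards=torch.tensor([1.0]),
    dones=torch.tensor([0.0]),
)
# y = 1.0 + 0.99 * 0.1 = 1.099
```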