This repository contains from-scratch PyTorch implementations of the following deep reinforcement learning algorithms:
- Deep Q-Network (DQN)
- Proximal Policy Optimization (PPO)
- Twin Delayed DDPG (TD3)
- Soft Actor-Critic (SAC)
This project requires Python 3.5+, PyTorch 1.0.1+, and Gym 0.16+.
$ git clone
$ pip3 install -r requirements.txt
To run the code:
$ cd DQN
$ python3
To use Boltzmann exploration, add the --boltzmann flag:
$ python3 --boltzmann
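As background, Boltzmann (softmax) exploration samples each action with probability proportional to exp(Q / T) rather than always taking the greedy action. The snippet below is a minimal, framework-free sketch of that idea and is not taken from this repository; the function names and the `temperature` parameter are assumptions made for illustration.

```python
import math
import random


def boltzmann_probabilities(q_values, temperature=1.0):
    """Softmax over Q-values; a higher temperature gives a flatter distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the max before exponentiating for numerical stability.
    max_q = max(q_values)
    exps = [math.exp((q - max_q) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def boltzmann_action(q_values, temperature=1.0, rng=random):
    """Sample an action index with probability proportional to exp(Q / T)."""
    probs = boltzmann_probabilities(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]
```

A low temperature makes the choice nearly greedy, while a high temperature approaches uniform random sampling.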
Training parameters can be changed in the cfg/config_dqn.yaml file.
PyTorch implementation of the paper Proximal Policy Optimization Algorithms (2017).
To run the code (env with discrete action space):
$ cd PPO
$ python --env CartPole-v1
Or (env with continuous action space):
$ python --env LunarLanderContinuous-v2
You can change the training parameters in the cfg/config_ppo.yaml file.
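For readers new to PPO, the core of the paper is the clipped surrogate objective, which limits how far the updated policy can move from the one that collected the data. The snippet below is a plain-Python sketch of that objective under stated assumptions: `ratios` are the probability ratios between new and old policies, `advantages` are precomputed advantage estimates, and `clip_eps` is a hypothetical name for the clipping range. It does not reflect how this repository structures its code.

```python
def ppo_clipped_objective(ratios, advantages, clip_eps=0.2):
    """Mean of min(r * A, clip(r, 1 - eps, 1 + eps) * A) over a batch."""
    if len(ratios) != len(advantages):
        raise ValueError("ratios and advantages must have the same length")
    if not ratios:
        raise ValueError("batch must not be empty")
    terms = []
    for ratio, advantage in zip(ratios, advantages):
        # Clip the probability ratio to the interval [1 - eps, 1 + eps].
        clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        terms.append(min(ratio * advantage, clipped_ratio * advantage))
    return sum(terms) / len(terms)
```

The objective is maximized, so a training loop would typically minimize its negative.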
To run the code:
$ cd TD3
$ python
You can change the training parameters in the cfg/config_td3.yaml file.
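As a reminder of what distinguishes TD3 from DDPG, the critic target combines target policy smoothing (clipped noise added to the target action) with clipped double-Q learning (the minimum of two target critics). The snippet below sketches both steps for a single scalar action in plain Python; all names and default values are assumptions chosen for illustration, not this repository's settings.

```python
import random


def smoothed_target_action(target_action, noise_std=0.2, noise_clip=0.5,
                           max_action=1.0, rng=random):
    """Add clipped Gaussian noise to the target action, then clip to bounds."""
    noise = rng.gauss(0.0, noise_std)
    noise = max(-noise_clip, min(noise, noise_clip))
    return max(-max_action, min(target_action + noise, max_action))


def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Bellman target using the smaller of the two target critic estimates."""
    # (1 - done) zeroes out the bootstrap term at episode termination.
    return reward + gamma * (1.0 - float(done)) * min(q1_next, q2_next)
```

Here `q1_next` and `q2_next` stand for the two target critics evaluated at the next state and the smoothed target action.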
To run the modern version of the SAC algorithm (clipped double-Q trick, no extra value function), which follows the pseudocode available at Spinning Up (OpenAI):
$ cd SAC
$ python3
To run the older version of the SAC algorithm:
$ python3 --old_agent
You can change the training parameters in the cfg/config_sac.yaml file.
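To make the "clipped double-Q, no extra value function" variant concrete: the critic target bootstraps directly from the minimum of two target Q-values and subtracts an entropy term, so no separate state-value network is needed. The snippet below is a scalar, plain-Python sketch of that target as it appears in the Spinning Up pseudocode; the function and parameter names are assumptions and do not mirror this repository's code.

```python
def sac_target(reward, done, q1_next, q2_next, next_log_prob,
               gamma=0.99, alpha=0.2):
    """Soft Bellman target: r + gamma * (1 - d) * (min Q' - alpha * log pi)."""
    # The entropy bonus -alpha * log pi(a'|s') rewards stochastic policies.
    soft_value = min(q1_next, q2_next) - alpha * next_log_prob
    return reward + gamma * (1.0 - float(done)) * soft_value
```

Here `next_log_prob` is the log-probability of an action freshly sampled from the current policy at the next state, and `alpha` is the entropy temperature.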
- Mnih, Kavukcuoglu, Silver, Graves, Antonoglou, Wierstra & Riedmiller. (2013). Playing Atari with Deep Reinforcement Learning.