RL Algorithms using PyTorch on OpenAI Gym

This repository contains implementations of several deep reinforcement learning (RL) algorithms in PyTorch. I am going to evaluate and compare each of them on one or more environments from OpenAI Gym. The purpose of this repository is to help kickstart my journey in RL and to document my learning experience. I hope it is useful for other people who are starting out as well. :)

I am planning to write a blog post to accompany this repo, so stay tuned!

Implementations

| Algorithm | Features | Solved* (Episodes**) | Paper |
| --- | --- | --- | --- |
| REINFORCE (Monte-Carlo Policy Gradient) | Baseline, Causality | `CartPole-v0`, `LunarLander-v2` | Williams 1992 |
| Deep Q-Networks (DQN) | Huber Loss, Gradient Clipping, Polyak Averaging | `CartPole-v0` (783), `LunarLander-v2` (344) | Mnih et al. 2013 |
| Double DQN | Same as DQN | `CartPole-v0` (626), `LunarLander-v2` (375) | van Hasselt et al. 2015 |
| Prioritized Experience Replay (PER) | Same as DQN, Proportional Prioritization | `CartPole-v0` (538), `LunarLander-v2` (278) | Schaul et al. 2016 |
| Dueling DQN | Same as DQN + PER | `CartPole-v0` (698), `LunarLander-v2` (275) | Wang et al. 2016 |
| A3C | Generalized Advantage Estimation | `PongDeterministic-v4` | Mnih et al. 2016 |
| Rainbow | | | Hessel et al. 2017 |

and many more...
*These are the environments I attempted to solve using my code so far. The algorithms are certainly capable of solving more (check the attached papers for details). I will be trying them on more diverse environments in the future to evaluate my implementation.
**The average number of episodes it took to solve the environment across 10 runs with different seeds.
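
To make a few of the feature names in the table concrete, here is a minimal sketch of three of them: reward-to-go returns ("causality") with a baseline, Polyak averaging of target parameters, and proportional prioritization. It is not taken from this repository's code. It uses plain Python lists and dicts instead of PyTorch tensors so it runs without dependencies, and the function names, the mean-return baseline, and the default values of `gamma`, `tau`, `alpha`, and `eps` are all assumptions made for illustration.

```python
from typing import Dict, List


def rewards_to_go(rewards: List[float], gamma: float = 0.99) -> List[float]:
    """Discounted return from each step onward ("causality").

    An action at step t is only credited with rewards received at or after t.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each return reuses the one that follows it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def subtract_baseline(returns: List[float]) -> List[float]:
    """Subtract the mean return (an assumed, very simple constant baseline)."""
    if not returns:
        return []
    mean = sum(returns) / len(returns)
    return [g - mean for g in returns]


def polyak_update(
    target: Dict[str, float], online: Dict[str, float], tau: float = 0.005
) -> Dict[str, float]:
    """Soft target update: target <- tau * online + (1 - tau) * target."""
    return {k: tau * online[k] + (1.0 - tau) * target[k] for k in target}


def per_probabilities(
    td_errors: List[float], alpha: float = 0.6, eps: float = 1e-6
) -> List[float]:
    """Proportional prioritization.

    P(i) = p_i^alpha / sum_k p_k^alpha, with priority p_i = |td_error_i| + eps.
    """
    priorities = [(abs(d) + eps) ** alpha for d in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


if __name__ == "__main__":
    returns = rewards_to_go([1.0, 1.0, 1.0], gamma=0.5)
    print(returns)
    print(subtract_baseline(returns))
    print(polyak_update({"w": 0.0}, {"w": 1.0}, tau=0.5))
    print(per_probabilities([0.1, 2.0, 0.5]))
```

In a PyTorch implementation the same arithmetic would typically be applied to tensors and to the parameters of the online and target networks; the scalar version above only shows the update rules themselves.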

Configurations

Each implementation has its own YAML config file, so model and environment parameters are easy to change.
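
As a rough illustration of how such a config might be consumed, the sketch below turns an already-parsed config into a typed object with defaults. Everything in it is hypothetical: the `DQNConfig` class, its field names, and its default values are not the repository's actual keys, and the `raw` dict stands in for whatever a YAML parser (for example PyYAML's `yaml.safe_load`) would return for a real config file.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict


@dataclass
class DQNConfig:
    # Hypothetical parameter names and defaults; the real config keys may differ.
    env_name: str = "CartPole-v0"
    learning_rate: float = 1e-3
    gamma: float = 0.99
    batch_size: int = 64


def config_from_dict(raw: Dict[str, Any]) -> DQNConfig:
    """Build a config from parsed values, rejecting keys that are not fields."""
    known = {f.name for f in fields(DQNConfig)}
    unknown = set(raw) - known
    if unknown:
        raise KeyError(f"Unknown config keys: {sorted(unknown)}")
    # Keys missing from `raw` fall back to the dataclass defaults.
    return DQNConfig(**raw)


# `raw` stands in for the result of parsing a YAML file.
raw = {"env_name": "LunarLander-v2", "gamma": 0.995}
config = config_from_dict(raw)
print(config)
```

Rejecting unknown keys is a design choice in this sketch: it catches a misspelled parameter in the config file instead of silently ignoring it.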

BKHMSI/RL-Playground