
🐋 Implementation of various Deep RL algorithms using TensorFlow2

Primary language: Python. License: Apache License 2.0.


🐋 Deep RL in TensorFlow2

This repository implements a variety of popular reinforcement learning algorithms in TensorFlow2. The implementations run on OpenAI Gym environments, and the goal is to keep extending the repository until it covers all of the algorithms described in OpenAI Spinning Up.

Reward plot: CartPole-v1 (discrete)

Algorithms

DQN

Name Deep Q-Learning
Paper Playing Atari with Deep Reinforcement Learning
Author Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, Martin Riedmiller
Method Temporal Difference / Off-Policy
Action Discrete
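The temporal-difference, off-policy update listed above can be illustrated with the one-step Q-learning target. This is a minimal framework-free sketch in plain Python, not the repository's TensorFlow code; the function names `td_target` and `td_error` and the default `gamma` are assumptions made for illustration.

```python
def td_target(reward, next_q_values, done, gamma=0.99):
    """One-step Q-learning target: r + gamma * max_a' Q(s', a').

    `next_q_values` is a list of Q-value estimates for the next state,
    one entry per discrete action.
    """
    if done:
        # A terminal state has no future return to bootstrap from.
        return reward
    return reward + gamma * max(next_q_values)


def td_error(q_value, reward, next_q_values, done, gamma=0.99):
    """Gap between the target and the current estimate Q(s, a)."""
    return td_target(reward, next_q_values, done, gamma) - q_value
```

Taking the maximum over next-state actions, rather than the action the behavior policy actually chose, is what makes the update off-policy.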

DRQN

Name Deep Recurrent Q-Learning
Paper Deep Recurrent Q-Learning for Partially Observable MDPs
Author Matthew Hausknecht, Peter Stone
Method Temporal Difference / Off-Policy
Action Discrete

A2C

Name Advantage Actor-Critic
Paper Actor-Critic Algorithms
Author Vijay R. Konda, John N. Tsitsiklis
Method Temporal Difference / On-Policy
Action Discrete / Continuous
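As a rough illustration of the advantage actor-critic idea, the sketch below computes a one-step TD advantage and the per-sample actor and critic losses in plain Python. It is a sketch under stated assumptions, not the repository's implementation: the names `advantage` and `actor_critic_losses`, the one-step estimator, and the squared-error critic loss are all choices made here for illustration.

```python
import math


def advantage(reward, value, next_value, done, gamma=0.99):
    """One-step TD advantage: r + gamma * V(s') - V(s)."""
    # Do not bootstrap from the value of a terminal state.
    bootstrap = 0.0 if done else gamma * next_value
    return reward + bootstrap - value


def actor_critic_losses(action_prob, adv):
    """Per-sample losses for the actor and the critic.

    `action_prob` is the probability the current policy assigned to the
    action that was taken; `adv` is treated as a constant for the actor.
    """
    actor_loss = -math.log(action_prob) * adv  # policy-gradient term
    critic_loss = adv ** 2                     # squared TD error
    return actor_loss, critic_loss
```

A positive advantage lowers the actor loss when the probability of the chosen action increases, so actions that turned out better than the critic expected become more likely.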

A3C

Name Asynchronous Advantage Actor-Critic
Paper Asynchronous Methods for Deep Reinforcement Learning
Author Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Timothy P. Lillicrap, Tim Harley, David Silver, Koray Kavukcuoglu
Method Temporal Difference / On-Policy
Action Discrete / Continuous
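The asynchronous part of A3C can be pictured as several workers that each pull a snapshot of shared parameters, compute a gradient locally, and push an update back. The sketch below shows only that update pattern on a toy quadratic objective in plain Python threads; it contains no environment, actor, or critic, and the name `run_async_workers` and all of its parameters are hypothetical.

```python
import threading


def run_async_workers(num_workers=4, steps=50, lr=0.05, goal=3.0):
    """Minimise (w - goal)^2 with several threads updating one shared w."""
    shared = {"w": 0.0}      # parameter shared by all workers
    lock = threading.Lock()  # guards reads and writes of shared["w"]

    def worker():
        for _ in range(steps):
            with lock:
                local_w = shared["w"]        # pull a snapshot
            grad = 2.0 * (local_w - goal)    # gradient computed locally
            with lock:
                shared["w"] -= lr * grad     # push the update

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared["w"]
```

Because another worker may update the shared parameter between the pull and the push, a gradient can be slightly stale when it is applied. With a small learning rate the shared value still converges toward `goal`.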

PPO

Name Proximal Policy Optimization
Paper Proximal Policy Optimization Algorithms
Author John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, Oleg Klimov
Method Temporal Difference / On-Policy
Action Discrete / Continuous
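The PPO paper's best-known objective is the clipped surrogate, which limits how far one update can move the policy away from the policy that collected the data. The sketch below evaluates that objective for a single sample in plain Python. The name `clipped_surrogate` and the default `clip_eps=0.2` are assumptions for illustration, and this README does not state which PPO variant the repository implements.

```python
def clipped_surrogate(ratio, adv, clip_eps=0.2):
    """Clipped surrogate objective for one sample.

    `ratio` is pi_new(a|s) / pi_old(a|s) and `adv` is the advantage
    estimate. The objective is maximised, so a training loss would be
    its negative.
    """
    # Restrict the probability ratio to [1 - eps, 1 + eps].
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    # Taking the minimum keeps the more pessimistic of the two estimates.
    return min(ratio * adv, clipped_ratio * adv)
```

For a positive advantage the objective stops growing once the ratio exceeds `1 + clip_eps`, which removes the incentive to push the probability of that action any further in a single update.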

Coming Soon...

Usage

Discrete Action Space Asynchronous Advantage Actor-Critic

$ python A3C/a3c_discrete_action.py

Deep Q-Learning

$ python DQN/dqn_discrete_action.py

Continuous Action Space Proximal Policy Optimization

$ python PPO/ppo_continuous_action.py

Papers

Reference