/RL-practice-keras


Deep Reinforcement Learning

Here I implement various RL algorithms in Python 2.7, using Keras for the neural nets and the OpenAI Gym to test the algorithms. The methods are listed below; they roughly divide into two categories, value based and policy based, followed by multi-agent and other methods.

I took or adapted code from various online sources, which I list non-exhaustively below (and in the code itself).

Value based methods

  • Q-learning (tabular)
  • Deep Q-Network (DQN)
  • Double DQN (DDQN)
  • DQN with prioritised replay
  • Dueling DQN
  • Distributional Bellman
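As a rough illustration of the simplest item in the list above, here is a minimal sketch of tabular Q-learning. It is not the code from this repository: the chain environment, the function name `q_learning_chain`, and all hyperparameters are hypothetical choices made for the example, and it uses only the standard library rather than Gym or Keras.

```python
import random


def q_learning_chain(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
                     epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D chain environment.

    States are 0..n_states-1. Action 0 moves left (clamped at state 0),
    action 1 moves right. Reaching the right-most state yields reward 1
    and ends the episode; every other transition yields reward 0.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]

    for _ in range(episodes):
        state = 0
        for _ in range(100):  # step cap so early random episodes terminate
            # Epsilon-greedy action selection with random tie-breaking.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                best = max(q[state])
                action = rng.choice(
                    [a for a in range(2) if q[state][a] == best])

            next_state = max(0, state - 1) if action == 0 else state + 1
            done = next_state == n_states - 1
            reward = 1.0 if done else 0.0

            # Q-learning target: bootstrap from the greedy next-state value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])

            state = next_state
            if done:
                break
    return q


if __name__ == "__main__":
    for s, (left, right) in enumerate(q_learning_chain()):
        print("state %d: Q(left)=%.3f Q(right)=%.3f" % (s, left, right))
```

With these settings the learned table should prefer "right" in every non-terminal state. The deep variants in the list replace the table `q` with a neural network, but the bootstrapped target has the same shape.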

Policy based methods

  • Policy gradient (REINFORCE, with and without a baseline)
  • Actor critic (A2C)
  • Deep Deterministic Policy Gradient (DDPG)
  • Proximal policy optimization (PPO)
  • Soft Actor-Critic (SAC)
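To make the first item in the list above concrete, here is a small sketch of a REINFORCE-style update with an optional baseline. It is reduced to a one-step (bandit) problem so that it fits in a few lines, and it is not taken from this repository: the function names, the two-armed bandit with Gaussian rewards, and the running-average baseline are all assumptions made for illustration.

```python
import math
import random


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_bandit(arm_means=(0.2, 0.8), steps=2000, lr=0.1,
                     use_baseline=True, seed=0):
    """REINFORCE on a hypothetical bandit with a softmax policy.

    Each step samples an action, observes a noisy reward, and moves the
    preferences along (reward - baseline) * grad log pi(action).
    """
    rng = random.Random(seed)
    prefs = [0.0] * len(arm_means)
    baseline = 0.0  # running average of observed rewards

    for t in range(1, steps + 1):
        probs = softmax(prefs)

        # Sample an action from the current policy.
        u, cum, action = rng.random(), 0.0, len(probs) - 1
        for a, p in enumerate(probs):
            cum += p
            if u < cum:
                action = a
                break

        reward = rng.gauss(arm_means[action], 0.1)
        advantage = reward - baseline if use_baseline else reward

        # For a softmax policy, d log pi(action) / d prefs[a]
        # equals 1[a == action] - probs[a].
        for a in range(len(prefs)):
            indicator = 1.0 if a == action else 0.0
            prefs[a] += lr * advantage * (indicator - probs[a])

        if use_baseline:
            baseline += (reward - baseline) / t

    return softmax(prefs)


if __name__ == "__main__":
    print(reinforce_bandit(use_baseline=True))
    print(reinforce_bandit(use_baseline=False))
```

Subtracting the baseline leaves the expected gradient unchanged and typically reduces its variance. The actor-critic methods in the list replace the running average with a learned value estimate.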

Multi-agent

  • Multi-agent deep deterministic policy gradient (MADDPG)
  • Actor-Attention-Critic (AAC)
  • Value Decomposition Networks (VDN)
  • QMIX

Others

  • Explore-and-go
  • Curiosity driven learning (CDL)
  • Rainbow (RB)

Resources

Papers

Blogs

Textbooks

Acknowledgements