Deep Reinforcement Learning with PyTorch

1. Dynamic Programming (Update: 13. 2. 2019)
    1. Conditional GAN
    1. Policy Iteration & Value Iteration
2. Value Based Methods (Update: 17. 2. 2019)
    1. Vanilla DQN
    2. PDD DQN
3. Policy Based Methods (Update: 23. 2. 2019)
    1. A2C
    2. PPO
4. Off-policy Policy Based Methods (Update: 10. 3. 2019)
    1. SAC
    2. SIL (with SAC, not with A2C or PPO)
5. Exploration Techniques (Update: 16. 3. 2019)
    1. Thompson sampling with MCDO
    2. RND
        Breakout with only intrinsic rewards
6. Uncertainty in RL (Update: 24. 3. 2019)
7. Imitation Learning (Update: 30. 3. 2019)
    1. GAIL
8. Multi-Agent RL (Update: 4. 4. 2019)
    1. Upper Confidence Bounds applied to Trees (UCT)
    2. Counterfactual Hedge

airopti/Deep_RL_with_pytorch