Deep Reinforcement Learning

Build and test DRL algorithms in different environments. Each folder in this repository contains all the files needed to run the notebooks and train an agent. The goal of this repository is to implement and experiment with different algorithms in order to learn and better understand the methods.

List of implemented algorithms and environments (work in progress)

- [x] Deep Q-Network (DQN) for LunarLander-v2
- [x] Double Deep Q-Network (DDQN) for LunarLander-v2
- [x] Dueling Deep Q-Network (Dueling DQN) for LunarLander-v2
- [x] DDQN with Prioritized Experience Replay (PER) for LunarLander-v2
- [x] Dueling DDQN with PER for LunarLander-v2
- [x] Dueling DDQN with PER and N-step returns for LunarLander-v2
- [x] Rainbow DQN (Dueling DDQN with PER and N-step and Noisy Nets) for LunarLander-v2
- [x] Rainbow DQN (Dueling DDQN with PER and N-step and Noisy Nets) for Unity Banana Collector
- [x] REINFORCE for CartPole-v0
- [x] Proximal Policy Optimization (PPO) for LunarLander-v2
- [ ] Proximal Policy Optimization (PPO) for Unity Crawler (Multi-Agent)
- [x] Deep Deterministic Policy Gradient (DDPG) for Unity Reacher (Multi-Agent)
- [x] Soft Actor-Critic (SAC) for Continuous LunarLander-v2
- [x] Soft Actor-Critic (SAC) for Unity Reacher (Single-Agent)
- [x] AlphaZero for Connect4 (3x3, N=3)
- [x] Multi-Agent Deep Deterministic Policy Gradient (MADDPG) for Unity Tennis
- [ ] Multi-Agent Proximal Policy Optimization (MAPPO) for Unity Soccer
- [ ] QMIX for Overcooked Environment
- [ ] MuZero for Connect4
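
As a rough illustration of how the first two entries in the list differ, the sketch below computes the one-step bootstrap target for DQN and for Double DQN. It is not taken from the notebooks: the function names, the plain-list representation of Q-values, and the example numbers are all assumptions made for this example.

```python
from typing import Sequence


def dqn_target(reward: float, next_q_target: Sequence[float],
               gamma: float = 0.99, done: bool = False) -> float:
    """DQN target: bootstrap from the target network's largest Q-value."""
    if done:
        # Terminal transition: nothing to bootstrap from.
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward: float, next_q_online: Sequence[float],
                      next_q_target: Sequence[float],
                      gamma: float = 0.99, done: bool = False) -> float:
    """Double DQN target: the online network picks the action,
    the target network evaluates it."""
    if done:
        return reward
    # Action selection uses the online network's estimates ...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ... while the value of that action comes from the target network.
    return reward + gamma * next_q_target[best_action]


if __name__ == "__main__":
    # Hypothetical Q-values for the next state (one entry per discrete action).
    q_online = [1.0, 2.0, 0.5, 0.0]
    q_target = [3.0, 1.0, 0.5, 0.0]
    print(dqn_target(1.0, q_target, gamma=0.5))
    print(double_dqn_target(1.0, q_online, q_target, gamma=0.5))
```

With these made-up numbers the DQN target follows the target network's own maximum, while the Double DQN target uses the target network's value for the action the online network prefers, which is the decoupling intended to reduce overestimation.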

Papers for the implemented algorithms