Welcome to the world of Reinforcement Learning! The code in this repository is implemented in PyTorch.

Environment: Python 3.8, PyTorch 1.8.1, gym 0.15.7

- Monte Carlo
- Dynamic Programming
  - Policy iteration
  - Value iteration
- Sarsa
- Q-learning
- Policy Gradient
  - REINFORCE_discrete
  - REINFORCE_continuous
  - REINFORCE with Baseline
- DQN
  - Nature_DQN
  - Naive_DQN
  - Double_DQN
  - Dueling_DQN
  - DQN with Prioritized Experience Replay
  - N_Step_DQN
  - Distributional_DQN
  - Noisy_DQN
  - Rainbow
- Actor-Critic
- DDPG/TD3
- PPO
- SAC
- Dyna-Q
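
To give a flavor of the tabular methods listed above, here is a minimal sketch of the Q-learning update. It is not taken from this repository: the 5-state chain environment, the `step` and `train_q_learning` helpers, and all hyperparameters are hypothetical and chosen only for illustration. It uses the Python standard library only, so it runs without gym or PyTorch.

```python
import random

N_STATES = 5          # hypothetical chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment (an assumption): reward 1.0 only on reaching the terminal state."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train_q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Run tabular Q-learning on the toy chain and return the Q-table."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy behaviour policy, breaking ties at random.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Off-policy target: bootstrap from the greedy value of the next state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train_q_learning()
# Greedy action for each non-terminal state.
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(greedy_policy)
```

With these settings the greedy policy should pick "right" (action 1) in every non-terminal state. Sarsa differs only in the target: it bootstraps from the action actually taken in the next state instead of the greedy maximum.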