Reinforcement learning algorithm implementations in PyTorch
References:
https://github.com/AI4Finance-LLC/ElegantRL (DRL algo impl)
https://github.com/starry-sky6688/StarCraft (MARL algo impl)
- MazeEnv: my gym-like environment, for tabular algos
- MonteCarlo
- off-policy MonteCarlo (with importance sampling)
- Sarsa
- QLearning
- DoubleQLearning
- n-step Sarsa
- Sarsa(lambda)
- Deep Q Network
- DDQN(Double DQN)
- Dueling DQN
- D3QN(Dueling DDQN)
- REINFORCE
- REINFORCE with baseline
- DDPG(Deep Deterministic Policy Gradient)
- TD3(Twin Delayed DDPG)
- PPO(Proximal Policy Optimization) (PPO-Clip)
(Env: Multi-Agent particle world)
- QMIX
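
For a quick feel of what the tabular entries in the list do, here is a minimal sketch of the Q-learning update. It is not the code from this repo: the 1-D corridor environment, the function name `q_learning`, and all hyperparameter values are assumptions made up for illustration, and MazeEnv is not used.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor.

    The agent starts at state 0; the last state is the goal (reward 1).
    Action 0 moves left, action 1 moves right; walls clamp the position.
    """
    rng = random.Random(seed)
    moves = (-1, +1)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]

    for _ in range(episodes):
        s = 0
        while s != goal:
            # Epsilon-greedy behaviour policy; ties are broken at random.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1

            s_next = min(max(s + moves[a], 0), goal)
            r = 1.0 if s_next == goal else 0.0

            # Off-policy TD target: bootstrap from the greedy value of the
            # next state, with no bootstrap from the terminal state.
            bootstrap = 0.0 if s_next == goal else gamma * max(q[s_next])
            q[s][a] += alpha * (r + bootstrap - q[s][a])
            s = s_next
    return q


if __name__ == "__main__":
    for state, values in enumerate(q_learning()):
        print(state, [round(v, 3) for v in values])
```

Sarsa differs only in the target: it bootstraps from the action actually taken in the next state rather than from the greedy maximum.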