Policy Gradient
A minimal implementation of the stochastic policy gradient algorithm in Keras.
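Monte Carlo policy gradient methods such as REINFORCE weight each action's log-probability by the discounted return that followed it. The sketch below shows that return computation in plain Python. It is not taken from this repository: the function names, the `gamma` default, and the optional `reset_on_nonzero` flag (a common Pong convention in which every nonzero reward marks the end of a rally) are assumptions made for illustration.

```python
def discounted_returns(rewards, gamma=0.99, reset_on_nonzero=False):
    """Return the discounted sum of future rewards for every time step.

    reset_on_nonzero is a hypothetical flag: when True, the running sum is
    cleared at each nonzero reward, treating it as the end of a Pong rally.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step can reuse the return of the step after it.
    for t in reversed(range(len(rewards))):
        if reset_on_nonzero and rewards[t] != 0:
            running = 0.0
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def standardize(values, eps=1e-8):
    """Shift to zero mean and scale to unit variance, a common variance-reduction step."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = variance ** 0.5
    return [(v - mean) / (std + eps) for v in values]


if __name__ == "__main__":
    print(discounted_returns([0, 0, 1], gamma=0.5))  # [0.25, 0.5, 1.0]
    print(discounted_returns([0, 1, 0, -1], gamma=0.5, reset_on_nonzero=True))
```

In a Keras training loop, the standardized returns would typically scale the per-step loss, for example as sample weights on a cross-entropy loss over the actions taken. How this repository wires that in is not shown here.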
Pong Agent
This PG agent seems to start winning more often after about 8,000 episodes. The score graph is shown below.
Minimal Monte Carlo Policy Gradient (REINFORCE) Algorithm Implementation in Keras
Language: Python. License: MIT.