/reinforcement_learning

An implementation of reinforcement learning for CartPole-v0 using policy optimization

Primary language: Python
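To make "policy optimization" concrete, here is a minimal sketch of one policy-gradient (REINFORCE-style) update for a two-action task such as CartPole-v0. It is not taken from this repository: the linear softmax policy, the mean-return baseline, and the names `action_probs`, `discounted_returns`, and `reinforce_update` are all assumptions made for illustration. It uses only the standard library and takes already-collected trajectories as input, so no environment is needed to run it.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def action_probs(theta, state):
    """Action probabilities of a linear softmax policy (assumed form).

    theta holds one weight vector per action; state is a list of floats.
    """
    logits = [sum(w * s for w, s in zip(row, state)) for row in theta]
    return softmax(logits)


def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns, running = [], 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def reinforce_update(theta, states, actions, rewards, lr=0.01, gamma=0.99):
    """One gradient-ascent step on the episode's policy-gradient estimate.

    Uses the mean return as a simple baseline (an illustrative choice).
    """
    returns = discounted_returns(rewards, gamma)
    baseline = sum(returns) / len(returns)
    new_theta = [row[:] for row in theta]
    for s, a, g in zip(states, actions, returns):
        probs = action_probs(theta, s)
        for k in range(len(theta)):
            # Gradient of log pi(a|s) w.r.t. theta[k] is (1[k == a] - pi(k|s)) * s.
            coeff = (1.0 if k == a else 0.0) - probs[k]
            for j, sj in enumerate(s):
                new_theta[k][j] += lr * (g - baseline) * coeff * sj
    return new_theta
```

In a full training loop, `states`, `actions`, and `rewards` would come from rolling out the current policy in the environment for one episode, and the update would be repeated over many episodes.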

[Image: record]

Step plot of the result:

[Figure: step]

Histogram of the results of 100 simulations (mean value: 199):

[Figure: hist]
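A summary like the one above can be computed along the following lines. This is a sketch, not the repository's script: it assumes each simulation result is the number of steps the episode lasted (CartPole-v0 ends an episode after at most 200 steps), and the function name `summarize_episodes` and the bin width are hypothetical.

```python
from collections import Counter
from statistics import mean


def summarize_episodes(steps_per_episode, bin_width=10):
    """Return the mean episode length and a histogram keyed by bin start."""
    avg = mean(steps_per_episode)
    bins = Counter((s // bin_width) * bin_width for s in steps_per_episode)
    return avg, dict(sorted(bins.items()))


def print_histogram(bins):
    """Print one text bar per bin, e.g. '190 | ##'."""
    for start, count in bins.items():
        print(f"{start:>4} | {'#' * count}")
```

Calling `summarize_episodes` on the list of 100 episode lengths gives the mean and the bin counts; `print_histogram` renders the counts as a quick text plot.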

References

  1. CartPole-v0: https://gym.openai.com/envs/CartPole-v0/
  2. Ilyas, Andrew, et al. "A closer look at deep policy gradients." arXiv preprint arXiv:1811.02553 (2018).