/rlpg

Primary language: Python

Policy Gradient for CartPole-v1

This is a TensorFlow implementation of a policy gradient algorithm for the CartPole-v1 environment of OpenAI Gym. In addition to the policy network, a value network is also learned in order to reduce variance during training.
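As a rough illustration of how a value estimate can reduce variance, the sketch below computes discounted returns for one episode and subtracts a baseline value from each to obtain advantages. This is one common way to use a learned value function, and it is not necessarily how this repository does it. The function names, the discount factor, and the placeholder value estimates are assumptions made for this example and are not taken from `main.py`.

```python
def discounted_returns(rewards, gamma=0.99):
    """Compute G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def advantages(returns, values):
    """Subtract a baseline value estimate from each return."""
    return [g - v for g, v in zip(returns, values)]


# CartPole-v1 gives a reward of +1 for every step the pole stays up.
rewards = [1.0, 1.0, 1.0]
# Hypothetical value-network outputs for the three visited states.
values = [2.5, 1.8, 0.9]

returns = discounted_returns(rewards, gamma=0.5)  # gamma chosen for easy arithmetic
adv = advantages(returns, values)
```

In a policy gradient update, the log-probability of each action taken would then be weighted by the corresponding advantage rather than by the raw return. The value network itself would typically be trained by regressing its outputs toward the returns.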

Requirements

  • tensorflow 0.11
  • OpenAI gym

Training

	$ python main.py