Policy Gradient for CartPole-v1

This is a TensorFlow implementation of a policy gradient algorithm for the CartPole-v1 environment of OpenAI Gym. In addition to the policy network, a value network is also learned and used as a baseline to reduce the variance of the gradient estimate during training.
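
To illustrate how a learned value estimate can reduce variance, here is a minimal sketch in plain Python. It is not the code in this repository: the function names, the discount factor `gamma`, and the example numbers are all assumptions made for illustration. The idea is that the policy gradient is weighted by the advantage (discounted return minus the value network's prediction) instead of by the raw return.

```python
def discounted_returns(rewards, gamma=0.99):
    """Compute the discounted return G_t for every step of one episode."""
    returns = []
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(returns, values):
    """Subtract the baseline (value predictions) from the returns."""
    return [g - v for g, v in zip(returns, values)]


# Hypothetical 3-step episode; CartPole gives a reward of 1 per step.
rewards = [1.0, 1.0, 1.0]
# Hypothetical value-network predictions for the same three states.
values = [1.5, 1.5, 0.5]

returns = discounted_returns(rewards, gamma=0.5)
adv = advantages(returns, values)
print(returns)  # [1.75, 1.5, 1.0]
print(adv)      # [0.25, 0.0, 0.5]
```

In a full training loop, the advantages would typically scale the log-probability gradient of the policy network, while the value network would be regressed toward the returns.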

Requirements

TensorFlow 0.11
OpenAI Gym

Training

	$ python main.py

