Proximal-Policy-Optimization

Tensorflow implementation of PPO from (https://arxiv.org/abs/1707.06347). Without any changed parameters, the program trains an agent in the Humanoid-v2 environment from OpenAI Gym.

Dependencies:

Mujoco-py (Mujoco150+)
OpenAI gym
Numpy
Tensorflow
Matplotlib
Scipy

Usecase: Example call:

python ppoMain.py --env Humanoid-v2 --episodes 1000 --localsteps 2000 --batchSize 64

python ppoMain.py -hcan be used to learn more about the input format.

JL321/Proximal-Policy-Optimization

Proximal-Policy-Optimization