An implementation of Proximal Policy Optimization (PPO) for both discrete and continuous action spaces.
To train, set the hyperparameters and choose the environment in test_continuous.py or test_discrete.py, then run the file.
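For example, the configuration section of test_continuous.py might look something like the sketch below; the variable names (env_name, num_iterations, lr, gamma, clip_ratio) are assumptions for illustration and may not match the actual script.

```python
# Hypothetical configuration sketch -- variable names are assumptions
# and may differ from those used in test_continuous.py.
env_name = "Pendulum-v1"   # Gym environment with a continuous action space
num_iterations = 1000      # total number of PPO training iterations
lr = 3e-4                  # learning rate for policy and value networks
gamma = 0.99               # discount factor
clip_ratio = 0.2           # PPO surrogate objective clipping parameter
```

Running the file, e.g. `python test_continuous.py`, then starts training with these settings.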
A plot of the learning results is saved automatically, along with video files of rendered episodes every 100 iterations.