Forked from stable-baselines.
Documentation is available online: https://stable-baselines.readthedocs.io/
Code location: stable-baselines/ppo2_ssup/ppo2/PPO2_SSup
TODO:
- Set up dev environment on flanders
- Specify pseudo-code
- Set initial hyper-parameter values
- Implementation for PPO
- Run experiments and tune hyper-parameters
- Validate with additional experiments (similar states vs. timesteps, etc.)
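
A possible starting point for the hyper-parameter items above, as a minimal sketch: it takes the default values of the upstream stable-baselines PPO2 constructor as initial values and expands a small grid for tuning. The search space and the helper name grid_configs are hypothetical and chosen only for illustration; whether PPO2_SSup accepts the same keyword arguments as PPO2 is an assumption.

```python
import itertools

# Default keyword arguments of stable-baselines' PPO2, used as initial values.
PPO2_DEFAULTS = {
    "gamma": 0.99,
    "n_steps": 128,
    "ent_coef": 0.01,
    "learning_rate": 2.5e-4,
    "vf_coef": 0.5,
    "max_grad_norm": 0.5,
    "lam": 0.95,
    "nminibatches": 4,
    "noptepochs": 4,
    "cliprange": 0.2,
}

# Hypothetical search space: these values are illustrative, not tuned.
SEARCH_SPACE = {
    "learning_rate": [1e-4, 2.5e-4, 1e-3],
    "cliprange": [0.1, 0.2],
}


def grid_configs(defaults, search_space):
    """Yield one full config per combination of values in the search space."""
    keys = sorted(search_space)
    for values in itertools.product(*(search_space[k] for k in keys)):
        config = dict(defaults)           # copy so the defaults stay untouched
        config.update(zip(keys, values))  # override only the searched keys
        yield config


configs = list(grid_configs(PPO2_DEFAULTS, SEARCH_SPACE))
print(len(configs), "configurations to run")
```

Each resulting dict could then be passed as keyword arguments when constructing the model, for example PPO2("MlpPolicy", env, **config) with upstream PPO2.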