pg_travel

Policy Gradient algorithms (REINFORCE, NPG, TRPO, PPO)
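
As a rough illustration of the simplest algorithm listed, REINFORCE, the sketch below computes discounted returns for one episode and the resulting surrogate loss. It is a minimal example in plain Python, not code taken from this repository; the names `discounted_returns` and `reinforce_loss` and the default `gamma` are assumptions made for illustration.

```python
from typing import List


def discounted_returns(rewards: List[float], gamma: float = 0.99) -> List[float]:
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def reinforce_loss(log_probs: List[float], rewards: List[float],
                   gamma: float = 0.99) -> float:
    """REINFORCE surrogate loss: -sum_t log pi(a_t | s_t) * G_t.

    `log_probs` holds the log-probability of each action actually taken.
    Minimizing this value follows the policy gradient estimate.
    """
    returns = discounted_returns(rewards, gamma)
    return -sum(lp * g for lp, g in zip(log_probs, returns))


if __name__ == "__main__":
    # Hypothetical three-step episode with a reward of 1 at every step.
    print(discounted_returns([1.0, 1.0, 1.0], gamma=0.5))
    print(reinforce_loss([-1.0, -2.0], [1.0, 1.0], gamma=0.5))
```

In a real training loop the log-probabilities would come from a differentiable policy network, so the loss could be backpropagated; here they are plain floats to keep the example dependency-free.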

Primary language: Python. License: MIT.

This project reimplements deeprm, which was originally implemented elsewhere, to fit the code structure used in pg_travel.

Usage:
python3 main.py [options]

TODO:

  • make the environment more realistic
  • integrate Kubernetes features
  • use a more sophisticated, task-specific model and algorithm