Powered by stable-baselines3
- PORT - port to listen for action requests on; defaults to 80
- OBS_SHAPE - observation shape as a JSON array
- ACTION_SHAPE - action shape as an int or a JSON array
- POLICY - Stable Baselines PPO policy to use; defaults to MlpPolicy
- BATCH_SIZE - batch size when training; defaults to 32
- BUFFER_SIZE - defaults to 1000000
- SAVE_STEPS - number of steps between saves; defaults to 1000
- MODEL_PATH - load/save path for the Stable Baselines PPO model; defaults to "models/model"
- RESET - if true, creates a new model instead of loading an existing one
- VERBOSE - Stable Baselines PPO2 verbosity level (int)
- TAU - defaults to 0.005
- GAMMA - reward discount rate; defaults to 0.99
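
The environment variables above could be read roughly as in the following sketch. It is an illustration only, not this project's actual code: the `load_config` and `parse_shape` names, the dictionary keys, the accepted spellings of a true `RESET` value, and the fallbacks for variables with no documented default (`OBS_SHAPE`, `ACTION_SHAPE`, `RESET`, `VERBOSE`) are all assumptions.

```python
import json
import os


def parse_shape(raw):
    """Parse a shape given as a JSON int ("2") or a JSON array ("[4, 84]")."""
    if raw is None:
        return None  # assumption: no default is documented, so leave unset
    value = json.loads(raw)
    if isinstance(value, int):
        return (value,)
    return tuple(int(dim) for dim in value)


def load_config(env=None):
    """Build a settings dict from a mapping of environment variables.

    Defaults follow the list above where one is documented; the rest
    are assumptions made for this sketch.
    """
    env = os.environ if env is None else env
    return {
        "port": int(env.get("PORT", "80")),
        "obs_shape": parse_shape(env.get("OBS_SHAPE")),
        "action_shape": parse_shape(env.get("ACTION_SHAPE")),
        "policy": env.get("POLICY", "MlpPolicy"),
        "batch_size": int(env.get("BATCH_SIZE", "32")),
        "buffer_size": int(env.get("BUFFER_SIZE", "1000000")),
        "save_steps": int(env.get("SAVE_STEPS", "1000")),
        "model_path": env.get("MODEL_PATH", "models/model"),
        # Assumption: "1", "true" and "yes" (any case) count as true.
        "reset": env.get("RESET", "false").strip().lower() in ("1", "true", "yes"),
        # Assumption: verbosity falls back to 0 when VERBOSE is unset.
        "verbose": int(env.get("VERBOSE", "0")),
        "tau": float(env.get("TAU", "0.005")),
        "gamma": float(env.get("GAMMA", "0.99")),
    }


if __name__ == "__main__":
    print(load_config())
```

Passing a plain dict instead of `os.environ`, for example `load_config({"OBS_SHAPE": "[4]", "ACTION_SHAPE": "2"})`, makes it easy to check the parsing without touching the real environment.
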
- more documentation (env vars, request/response)
- support LSTM/CNN policies
- test :)
- allow done to be set through a route, e.g. /done