Status: Archive (code is provided as-is, no updates expected)
Our approach to the ProcGen FruitBot final project uses Phasic Policy Gradient (PPG), extended with reward normalization and action penalization. The explanation for PPG can be found at (
Supported platforms:
- macOS 10.14 (Mojave)
- Ubuntu 16.04
Supported Pythons:
- 3.7 64-bit
If you don't have miniconda, install it first; alternatively, install the dependencies listed in environment.yml yourself.
git clone
conda env update --name phasic-policy-gradient --file phasic-policy-gradient/environment.yml
conda activate phasic-policy-gradient
pip install -e phasic-policy-gradient
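To train an agent on FruitBot, use the following command. The --rnorm (reward normalization) and --acpenalization (action penalization) flags each take either True or False.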
python -m phasic_policy_gradient.train --rnorm [False, True] --acpenalization [False, True]
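The two training flags toggle reward normalization and action penalization. The sketch below illustrates one common way each idea can be implemented; it is not taken from this repository. It assumes rewards are divided by a running standard deviation of the discounted return, and that action penalization subtracts a fixed amount whenever the agent picks an action from a penalized set. The names `RunningStd`, `RewardNormalizer`, `penalize_action`, `penalized_actions`, and `penalty` are hypothetical.

```python
import math


class RunningStd:
    """Welford's online algorithm for the variance of a scalar stream."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the running mean

    def update(self, x):
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    @property
    def std(self):
        if self.count < 2:
            return 1.0  # not enough data yet, so leave rewards unscaled
        return math.sqrt(self.m2 / self.count)


class RewardNormalizer:
    """Assumed scheme: divide each reward by the running std of the discounted return."""

    def __init__(self, gamma=0.99, eps=1e-8):
        self.gamma = gamma
        self.eps = eps
        self.ret = 0.0  # discounted return accumulated in the current episode
        self.stats = RunningStd()

    def __call__(self, reward, done):
        self.ret = self.gamma * self.ret + reward
        self.stats.update(self.ret)
        if done:
            self.ret = 0.0  # start a fresh return at episode boundaries
        return reward / (self.stats.std + self.eps)


def penalize_action(reward, action, penalized_actions, penalty=0.01):
    """Assumed scheme: subtract a fixed penalty when the action is in the penalized set."""
    return reward - penalty if action in penalized_actions else reward
```

In a training loop, a sketch like this would be applied per step, for example `r = normalizer(penalize_action(r, a, {some_action_id}), done)`, where the order of the two transformations is itself a design choice that the source does not specify.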
To test the model on more difficult levels:
python -m phasic_policy_gradient.test --model_path path/to/model.jd
For either testing or training, to modify the levels you run on, change start_level and num_levels in line 9 of
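As a rough illustration of what `start_level` and `num_levels` control, the sketch below builds keyword arguments for Procgen's `ProcgenEnv` constructor, which accepts both parameters. How this repository actually passes them is not shown here, so the helper names `fruitbot_env_kwargs` and `make_fruitbot_venv` and the example level ranges are assumptions made for illustration only.

```python
def fruitbot_env_kwargs(start_level=0, num_levels=0, distribution_mode="easy"):
    """Build keyword arguments for a Procgen FruitBot environment.

    In Procgen, num_levels=0 means an unlimited set of levels, and
    start_level is the lowest level seed that will be used.
    """
    if start_level < 0 or num_levels < 0:
        raise ValueError("start_level and num_levels must be non-negative")
    return {
        "env_name": "fruitbot",
        "start_level": start_level,
        "num_levels": num_levels,
        "distribution_mode": distribution_mode,
    }


def make_fruitbot_venv(num_envs, **level_kwargs):
    """Create a vectorized FruitBot environment (requires the procgen package)."""
    # Imported lazily so fruitbot_env_kwargs can be used without procgen installed.
    from procgen import ProcgenEnv
    return ProcgenEnv(num_envs=num_envs, **fruitbot_env_kwargs(**level_kwargs))


# Hypothetical split: train on level seeds 0..49, evaluate on seeds 50..549.
train_kwargs = fruitbot_env_kwargs(start_level=0, num_levels=50)
test_kwargs = fruitbot_env_kwargs(start_level=50, num_levels=500)
```

Keeping the training and evaluation ranges disjoint, as in the hypothetical split above, is what makes the evaluation a test of generalization to unseen levels.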