shakti365/ppo

TF2 Implementation of Proximal Policy Optimization

Proximal Policy Optimization

An implementation of the Proximal Policy Optimization (PPO) algorithm in TensorFlow 2 (TF2).

Notes: https://shivamshakti.dev/posts/ppo
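For orientation, here is a minimal sketch of the clipped surrogate objective that PPO is generally built around, written in plain Python for a single sample. It is not taken from this repository's TF2 code: the function name `clipped_surrogate`, its parameters, and the default `clip_eps=0.2` are assumptions chosen for illustration.

```python
import math


def clipped_surrogate(logp_new, logp_old, advantage, clip_eps=0.2):
    """Per-sample PPO clipped objective (a quantity to maximize).

    All names and the default clip_eps are illustrative assumptions,
    not identifiers from this repository.
    """
    # Probability ratio between the new and old policy for the taken action.
    ratio = math.exp(logp_new - logp_old)
    # Restrict the ratio to the interval [1 - clip_eps, 1 + clip_eps].
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    # Take the more pessimistic of the unclipped and clipped terms.
    return min(ratio * advantage, clipped_ratio * advantage)
```

In a full implementation this value would typically be averaged over a batch and negated to form a loss; the sketch only shows the per-sample term.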

Usage

Create a virtual environment for Python (I use this setup)
Install the dependencies
```
pip install -r requirements.txt
```

Run the training script

```
cd src
python main.py  # Uses MountainCarContinuous-v0 by default
```

Run the evaluation script

```
python play.py --model_name <PATH_TO_SAVED_MODEL>
```
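As a rough illustration of how a script could accept the `--model_name` flag used in the evaluation command above, here is a small `argparse` sketch. It is an assumption about one possible structure, not the actual contents of `play.py`: the `parse_args` helper, the `required=True` setting, and the placeholder path are all hypothetical.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line arguments for an evaluation script (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="Evaluate a saved PPO model (hypothetical sketch)."
    )
    # Making the flag required is an assumption; the real script may differ.
    parser.add_argument(
        "--model_name",
        required=True,
        help="Path to the saved model to evaluate.",
    )
    return parser.parse_args(argv)


# Example invocation with a placeholder path (not a real file).
args = parse_args(["--model_name", "path/to/saved_model"])
print(args.model_name)
```

Passing `argv` explicitly keeps the parser easy to exercise without touching the real command line; calling `parse_args()` with no arguments falls back to `sys.argv`.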

References