Primary Language: Python

Proximal Policy Optimization

An implementation of Proximal Policy Optimization (PPO), a state-of-the-art family of reinforcement learning algorithms, using normalized Generalized Advantage Estimation (GAE) and optional batch-mode training. The loss function includes an entropy bonus.
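To make the terms above concrete, here is a minimal sketch of normalized GAE and a clipped PPO policy loss with an entropy bonus. It is written in plain Python rather than PyTorch so that it runs without dependencies, and it is not taken from this repository: the function names, argument names, and default values (gamma, lam, clip_eps, entropy_coef) are assumptions chosen for illustration, and the value-function loss is left out.

```python
import math


def normalized_gae(rewards, values, last_value, dones,
                   gamma=0.99, lam=0.95, eps=1e-8):
    """Illustrative sketch (hypothetical helper, not the repository's code).

    rewards[t], values[t] = V(s_t) and dones[t] describe one rollout;
    last_value is the value estimate used to bootstrap after the last step.
    Returns (normalized advantages, value targets).
    """
    advantages = [0.0] * len(rewards)
    gae = 0.0
    next_value = last_value
    # Walk the rollout backwards, accumulating discounted TD errors.
    for t in reversed(range(len(rewards))):
        not_done = 0.0 if dones[t] else 1.0
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        gae = delta + gamma * lam * not_done * gae
        advantages[t] = gae
        next_value = values[t]
    # Value targets are computed from the unnormalized advantages.
    returns = [a + v for a, v in zip(advantages, values)]
    # Normalize advantages to zero mean and unit standard deviation.
    mean = sum(advantages) / len(advantages)
    var = sum((a - mean) ** 2 for a in advantages) / len(advantages)
    std = math.sqrt(var)
    normalized = [(a - mean) / (std + eps) for a in advantages]
    return normalized, returns


def ppo_loss(new_log_probs, old_log_probs, advantages, entropies,
             clip_eps=0.2, entropy_coef=0.01):
    """Clipped surrogate policy loss minus an entropy bonus (sketch)."""
    surrogate = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        ratio = math.exp(new_lp - old_lp)  # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped objectives.
        surrogate += min(ratio * adv, clipped * adv)
    n = len(advantages)
    policy_loss = -surrogate / n
    mean_entropy = sum(entropies) / n
    # Subtracting the entropy term rewards more exploratory policies.
    return policy_loss - entropy_coef * mean_entropy
```

In a PyTorch version the same arithmetic would operate on tensors so that gradients can flow through the new log-probabilities and entropies; the structure of the computation stays the same.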

The code is heavily commented and can help in understanding both PPO and PyTorch.

How to use

  1. Clone the repository to get a local copy of the files (see https://git-scm.com/book/en/v2/Git-Basics-Getting-a-Git-Repository, section "Cloning an Existing Repository").

  2. Navigate into the root folder of the project: /ppo

  3. Install the dependencies listed in requirements.txt. Any package manager or installer will work; we recommend pip. Run the following command in the root folder of the project (where requirements.txt is located):

    pip install -r requirements.txt

  4. To train, you need an instance of the Environment class (see the source code for its specification; two implementations are already provided) and a Learner object. See the example in main.py.
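The actual Environment specification lives in the source code and is not reproduced here, and main.py remains the authoritative example. Purely as an illustration of the general shape such an interface often takes, the sketch below defines a toy environment and a rollout loop. Everything in it is hypothetical: the class name CoinFlipEnvironment, the reset and step methods, their return values, and the collect_rollout helper are assumptions and may not match this repository's Environment or Learner classes.

```python
import random


class CoinFlipEnvironment:
    """Hypothetical toy environment; NOT the repository's Environment class.

    The state is a coin face (0 or 1). Guessing the face earns reward 1.0.
    """

    def __init__(self, episode_length=5, seed=0):
        self.episode_length = episode_length
        self.rng = random.Random(seed)
        self.t = 0
        self.coin = 0

    def reset(self):
        # Start a new episode and return the initial state.
        self.t = 0
        self.coin = self.rng.randint(0, 1)
        return self.coin

    def step(self, action):
        # Assumed convention: return (next_state, reward, done).
        reward = 1.0 if action == self.coin else 0.0
        self.t += 1
        done = self.t >= self.episode_length
        self.coin = self.rng.randint(0, 1)
        return self.coin, reward, done


def collect_rollout(env, policy):
    """Run one episode and return the data a PPO update would consume."""
    states, actions, rewards, dones = [], [], [], []
    state = env.reset()
    done = False
    while not done:
        action = policy(state)
        next_state, reward, done = env.step(action)
        states.append(state)
        actions.append(action)
        rewards.append(reward)
        dones.append(done)
        state = next_state
    return states, actions, rewards, dones


if __name__ == "__main__":
    env = CoinFlipEnvironment(episode_length=4)
    # A policy that simply echoes the observed coin face.
    rollout = collect_rollout(env, policy=lambda state: state)
    print(rollout)
```

When writing your own environment for this project, check the Environment class in the source code for the methods and return types it actually requires.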