RecurrentMaskablePPO

RecurrentMaskablePPO is a custom implementation of the Proximal Policy Optimization (PPO) algorithm for environments where the policy must carry recurrent state across steps and where some actions must be masked out. It is based on the stable-baselines3-contrib repository, which extends the reinforcement learning library stable-baselines3.

Features

Compatible with environments that have recurrent states and require masking of certain actions.
Built on top of the stable-baselines3 library, inheriting its modularity and ease of use.
Efficient and scalable implementation for complex tasks.
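
To make "masking of certain actions" concrete, here is a minimal sketch in plain Python. It is not code from this repository: the function name masked_softmax and the example values are hypothetical, and it only illustrates the common approach of setting the logits of invalid actions to negative infinity before normalizing, so those actions receive zero probability.

```python
import math


def masked_softmax(logits, mask):
    """Turn logits into probabilities, giving masked-out actions probability 0.

    `mask[i]` is True when action i is allowed. Hypothetical helper for
    illustration only; not part of stable-baselines3 or this repository.
    """
    if len(logits) != len(mask):
        raise ValueError("logits and mask must have the same length")
    if not any(mask):
        raise ValueError("at least one action must be valid")

    # Invalid actions get a logit of -inf, so exp() maps them to exactly 0.
    masked = [logit if ok else -math.inf for logit, ok in zip(logits, mask)]

    # Subtract the largest valid logit for numerical stability.
    peak = max(masked)
    exps = [math.exp(logit - peak) for logit in masked]
    total = sum(exps)
    return [e / total for e in exps]


# Action 1 has the highest logit but is masked out, so it can never be sampled.
probs = masked_softmax([1.0, 2.0, 0.5], [True, False, True])
print(probs)
```

The remaining probability mass is redistributed over the valid actions only, which is why the mask has to be supplied at every step when the set of valid actions changes over time.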

Installation

To install RecurrentMaskablePPO, follow the steps below:

Make sure you have Python 3.7 or later installed on your system. You can download the latest version from the official Python website.
Clone this repository:

git clone https://github.com/yourusername/RecurrentMaskablePPO.git

Navigate to the cloned repository (substitute the directory name if yours differs):

cd recurrent_maskable

Install stable-baselines3-contrib and the other dependencies using requirements.txt:

pip install -r requirements.txt

Install the package in editable mode:

pip install -e .
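
After installation, a quick check that the dependencies are importable can save some debugging later. The sketch below is an assumption-laden convenience, not part of this repository: the helper name missing_dependencies is hypothetical, and it only checks the import names stable_baselines3 and sb3_contrib, which are the module names used by stable-baselines3 and stable-baselines3-contrib.

```python
import importlib.util


def missing_dependencies(modules=("stable_baselines3", "sb3_contrib")):
    """Return the names in `modules` that cannot be found in this environment.

    Hypothetical helper for illustration; it looks modules up without
    importing them, so it is cheap to run.
    """
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = missing_dependencies()
if missing:
    print("Missing dependencies:", ", ".join(missing))
else:
    print("All dependencies found.")
```

If anything is reported as missing, re-running the requirements.txt step inside the same Python environment is the first thing to try.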
