AINE-DRL is a deep reinforcement learning (DRL) baseline framework. AINE means "Agent IN Environment". If you want to know how to use it, see the AINE-DRL Documentation.
| Implementation | Experiments | Setup |
We always welcome your contributions! Please feel free to open an issue or pull request.
AINE-DRL provides the following features:
- deep reinforcement learning agents
- compatible with OpenAI Gym (see the sketch after this list)
- compatible with Unity ML-Agents
- inference (rendering, gif, picture)
- model save/load
- YAML configuration
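For reference, the sketch below shows the standard OpenAI Gym interaction loop that DRL agents are built around. It is not AINE-DRL's own API (see the Agent docs for that); it only illustrates the environment interface the framework is compatible with, assuming the pre-0.26 `gym` reset/step signatures.

```python
# Minimal sketch of the standard OpenAI Gym interaction loop (gym < 0.26 API).
# This is NOT AINE-DRL's own API; it only shows the environment interface
# the framework is compatible with.
import gym

env = gym.make("CartPole-v1")
obs = env.reset()
done = False
total_reward = 0.0
while not done:
    action = env.action_space.sample()  # an agent's policy would choose this
    obs, reward, done, info = env.step(action)
    total_reward += reward
env.close()
print(f"episode return: {total_reward}")
```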
If you're using AINE-DRL for the first time, please read Getting Started.
AINE-DRL provides deep reinforcement learning (DRL) agents. If you want to use them, it's helpful to read Agent docs.
Agent | Source Code |
---|---|
REINFORCE | reinforce |
A2C | a2c |
Double DQN | dqn |
PPO | ppo |
Recurrent PPO | ppo |
PPO RND | ppo |
Recurrent PPO RND | ppo |
TODO:
- DDPG
- Prioritized Experience Replay
- SAC
- Intrinsic Curiosity Module (ICM)
- Random Network Distillation (RND)
You can see our experiments (source code and results) in experiments. Some recent experiments are shown below.
Train agents in the OpenAI Gym BipedalWalker-v3 environment, which is a continuous action space task.
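If you are not familiar with this environment, the short sketch below (standard gym API with `gym[box2d]` installed; not AINE-DRL-specific) shows why it is a continuous control task: actions are 4-dimensional real-valued vectors in [-1, 1].

```python
# Hedged illustration of BipedalWalker-v3's continuous action space,
# using only the standard gym API (not AINE-DRL's interfaces).
import gym

env = gym.make("BipedalWalker-v3")
print(env.action_space)           # Box(-1.0, 1.0, (4,), float32)
print(env.action_space.sample())  # a random 4-dimensional continuous action
```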
Fig 1. BipedalWalker-v3 inference (cumulative reward - PPO: 248):
To train the agent, enter the following command:
python experiments/bipedal_walker_v3/run.py
Detail options:
Usage:
experiments/bipedal_walker_v3/run.py [options]
Options:
-i --inference           Whether to run inference [default: False].
If a paging file error happens, see Paging File Error.
Compare Recurrent PPO (using LSTM) and Naive PPO in OpenAI Gym CartPole-v1 with No Velocity, which is a Partially Observable Markov Decision Process (POMDP) setting. Specifically, we remove "cart velocity" and "pole velocity at tip" from the observation space. This experiment shows that memory is required in the POMDP setting.
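The experiment's actual setup lives in experiments/cartpole_v1_no_velocity. As a hedged sketch of the observation masking described above (not the experiment's real code; the wrapper name is hypothetical), a standard `gym.ObservationWrapper` that drops the two velocity entries could look like this, following CartPole-v1's [cart position, cart velocity, pole angle, pole velocity] observation layout:

```python
# Hypothetical sketch (not the experiment's actual code): drop the two
# velocity entries from CartPole-v1 observations to create a POMDP.
import gym
import numpy as np

class NoVelocityWrapper(gym.ObservationWrapper):
    KEEP = [0, 2]  # cart position and pole angle; both velocities are removed

    def __init__(self, env):
        super().__init__(env)
        low = env.observation_space.low[self.KEEP]
        high = env.observation_space.high[self.KEEP]
        self.observation_space = gym.spaces.Box(low=low, high=high, dtype=np.float32)

    def observation(self, obs):
        return np.asarray(obs, dtype=np.float32)[self.KEEP]

env = NoVelocityWrapper(gym.make("CartPole-v1"))
```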
Fig 2. CartPole-v1 with No Velocity inference (cumulative reward - Recurrent PPO: 500, Naive PPO: 41):
Recurrent PPO | Naive PPO |
---|---|
Fig 3. CartPole-v1 with No Velocity cumulative reward (black: Recurrent PPO, cyan: Naive PPO):
To train the Recurrent PPO agent, enter the following command:
python experiments/cartpole_v1_no_velocity/run.py
Detail options:
Usage:
experiments/cartpole_v1_no_velocity/run.py [options]
Options:
-a --agent=<AGENT_NAME> Agent name (recurrent_ppo, naive_ppo) [default: recurrent_ppo].
-i --inference           Whether to run inference [default: False].
Follow the instructions below.
This installation guide is simple. If you have a problem or want to see details, refer to Installation docs.
First, install Python 3.9.
If you want to use NVIDIA CUDA, install PyTorch with CUDA support manually:
pip install torch==1.11.0+cu113 --extra-index-url https://download.pytorch.org/whl/cu113
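To confirm that the CUDA build is actually picked up, you can check it from Python using the standard PyTorch API:

```python
import torch

print(torch.__version__)          # should report a +cu113 build
print(torch.cuda.is_available())  # True if PyTorch can see your GPU
```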
Now, install AINE-DRL package by entering the command below:
pip install aine-drl
Run a sample script in the samples directory. Enter the following command:
python samples/<FILE_NAME>
Example:
python samples/cartpole_v1_ppo.py
See details in Getting Started docs.
When you use too many workers (e.g., more than 8), the error "The paging file is too small for this operation to complete." may occur because too many parallel environments run in multiple threads. If it happens, you can mitigate it with the following commands (Windows):
pip install pefile
python fixNvPe.py --input=C:\<Anaconda3 Path>\envs\aine-drl\Lib\site-packages\torch\lib\*.dll
`<Anaconda3 Path>` is the directory in which your Anaconda3 is installed.
Reference: cobryan05/fixNvPe.py (Github)