AINE-DRL is a deep reinforcement learning (DRL) baseline framework. AINE means "Agent IN Environment". If you want to know how to use it, see the AINE-DRL Documentation.
| Implementation | Experiments | Setup |
We always welcome your contributions! Please feel free to open an issue or pull request.
AINE-DRL provides the following features:
- deep reinforcement learning agents
- compatible with OpenAI Gym (see the sketch after this list)
- compatible with Unity ML-Agents
- inference (rendering, gif, picture)
- model save/load
- YAML configuration
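For reference, the sketch below shows the standard OpenAI Gym interaction loop that DRL agents are built around. It is not AINE-DRL's own API (see the Agent docs for that); it only illustrates the environment interface the framework is compatible with, assuming the pre-0.26 `gym` reset/step signatures.

```python
# Minimal sketch of the standard OpenAI Gym interaction loop (gym < 0.26 API).
# This is NOT AINE-DRL's own API; it only shows the environment interface
# the framework is compatible with.
import gym

env = gym.make("CartPole-v1")
obs = env.reset()
done = False
total_reward = 0.0
while not done:
    action = env.action_space.sample()  # an agent's policy would choose this
    obs, reward, done, info = env.step(action)
    total_reward += reward
env.close()
print(f"episode return: {total_reward}")
```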
If you're using AINE-DRL for the first time, please read Getting Started.
AINE-DRL provides deep reinforcement learning (DRL) agents. If you want to use them, it's helpful to read Agent docs.
Agent | Source Code |
---|---|
REINFORCE | reinforce |
A2C | a2c |
Double DQN | dqn |
PPO | ppo |
Recurrent PPO | ppo |
PPO RND | ppo |
Recurrent PPO RND | ppo |
TODO:
- DDPG
- Prioritized Experience Replay
- SAC
- Intrinsic Curiosity Module (ICM)
- Random Network Distillation (RND)
You can see our experiments (source code and results) in experiments. Some recent experiments are shown below.
Train agents in the OpenAI Gym BipedalWalker-v3 environment, which is a continuous action space task.
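If you are not familiar with this environment, the short sketch below (standard gym API with `gym[box2d]` installed; not AINE-DRL-specific) shows why it is a continuous control task: actions are 4-dimensional real-valued vectors in [-1, 1].

```python
# Hedged illustration of BipedalWalker-v3's continuous action space,
# using only the standard gym API (not AINE-DRL's interfaces).
import gym

env = gym.make("BipedalWalker-v3")
print(env.action_space)           # Box(-1.0, 1.0, (4,), float32)
print(env.action_space.sample())  # a random 4-dimensional continuous action
```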
Fig 1. BipedalWalker-v3 inference (cumulative reward - PPO: 248):
To train the agent, enter the following command:
python experiments/bipedal_walker_v3/run.py
Detail options:
Usage:
experiments/bipedal_walker_v3/run.py [options]
Options:
-i --inference           Whether to run inference [default: False].
If a paging file error happens, see Paging File Error.
Compare Recurrent PPO (using LSTM) and Naive PPO in OpenAI Gym CartPole-v1 with No Velocity, which is a Partially Observable Markov Decision Process (POMDP) setting. Specifically, we remove "cart velocity" and "pole velocity at tip" from the observation space. This experiment shows that memory is required in the POMDP setting.
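The experiment's actual setup lives in experiments/cartpole_v1_no_velocity. As a hedged sketch of the observation masking described above (not the experiment's real code; the wrapper name is hypothetical), a standard `gym.ObservationWrapper` that drops the two velocity entries could look like this, following CartPole-v1's [cart position, cart velocity, pole angle, pole velocity] observation layout:

```python
# Hypothetical sketch (not the experiment's actual code): drop the two
# velocity entries from CartPole-v1 observations to create a POMDP.
import gym
import numpy as np

class NoVelocityWrapper(gym.ObservationWrapper):
    KEEP = [0, 2]  # cart position and pole angle; both velocities are removed

    def __init__(self, env):
        super().__init__(env)
        low = env.observation_space.low[self.KEEP]
        high = env.observation_space.high[self.KEEP]
        self.observation_space = gym.spaces.Box(low=low, high=high, dtype=np.float32)

    def observation(self, obs):
        return np.asarray(obs, dtype=np.float32)[self.KEEP]

env = NoVelocityWrapper(gym.make("CartPole-v1"))
```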
Fig 2. CartPole-v1 with No Velocity inference (cumulative reward - Recurrent PPO: 500, Naive PPO: 41):
Recurrent PPO | Naive PPO |
---|---|
Fig 3. CartPole-v1 with No Velocity cumulative reward (black: Recurrent PPO, cyan: Naive PPO):
To train the Recurrent PPO agent, enter the following command:
python experiments/cartpole_v1_no_velocity/run.py
Detail options:
Usage:
experiments/cartpole_v1_no_velocity/run.py [options]
Options:
-a --agent=<AGENT_NAME> Agent name (recurrent_ppo, naive_ppo) [default: recurrent_ppo].
-i --inference           Whether to run inference [default: False].
Follow the instructions below.
This installation guide is simple. If you have a problem or want to see details, refer to Installation docs.
First, install Python 3.9.
If you want to use NVIDIA CUDA, install PyTorch with CUDA support manually:
pip install torch==1.11.0+cu113 --extra-index-url https://download.pytorch.org/whl/cu113
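To confirm that the CUDA build is actually picked up, you can check it from Python using the standard PyTorch API:

```python
import torch

print(torch.__version__)          # should report a +cu113 build
print(torch.cuda.is_available())  # True if PyTorch can see your GPU
```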
Now, install AINE-DRL package by entering the command below:
pip install aine-drl
Run a sample script in the samples directory. Enter the following command:
python samples/<FILE_NAME>
Example:
python samples/cartpole_v1_ppo.py
See details in Getting Started docs.
When you use too many workers (e.g., more than 8), the error "The paging file is too small for this operation to complete." may occur because too many parallel environments run in multiple threads. If it happens, you can mitigate it with the following commands (Windows):
pip install pefile
python fixNvPe.py --input=C:\<Anaconda3 Path>\envs\aine-drl\Lib\site-packages\torch\lib\*.dll
`<Anaconda3 Path>` is the directory in which your Anaconda3 is installed.
Reference: cobryan05/fixNvPe.py (Github)