RL Dresden Algorithm Suite

This suite implements several model-free off-policy deep reinforcement learning algorithms for discrete and continuous action spaces in PyTorch.

Algorithms

| Name             | Single-/Multi-Agent | Action Space | Source                                         |
|------------------|---------------------|--------------|------------------------------------------------|
| DQN              | Single              | Discrete     | Mnih et al. 2015                               |
| Double DQN       | Single              | Discrete     | van Hasselt et al. 2016                        |
| Bootstrapped DQN | Single              | Discrete     | Osband et al. 2016                             |
| Ensemble DQN     | Single              | Discrete     | Anschel et al. 2017                            |
| MaxMin DQN       | Single              | Discrete     | Lan et al. 2020                                |
| SCDQN            | Single              | Discrete     | Zhu et al. 2021                                |
| ACCDDQN          | Single              | Discrete     | Jiang et al. 2021                              |
| KE-BootDQN       | Single              | Discrete     | Waltz, Okhrin 2022                             |
| DDPG             | Single              | Continuous   | Lillicrap et al. 2015                          |
| LSTM-DDPG        | Single              | Continuous   | Meng et al. 2021                               |
| TD3              | Single              | Continuous   | Fujimoto et al. 2018                           |
| LSTM-TD3         | Single              | Continuous   | Meng et al. 2021                               |
| SAC              | Single              | Continuous   | Haarnoja et al. 2019                           |
| LSTM-SAC         | Single              | Continuous   | Own implementation following Meng et al. 2021  |
| TQC              | Single              | Continuous   | Kuznetsov et al. 2020                          |
| MADDPG           | Multi               | Continuous   | Lowe et al. 2017                               |
| MATD3            | Multi               | Continuous   | Ackermann et al. 2019                          |
| DiscMADDPG       | Multi               | Discrete     | Gumbel-Softmax discretization of MADDPG        |
| DiscMATD3        | Multi               | Discrete     | Gumbel-Softmax discretization of MATD3         |

Prerequisites

To use the basic functions of this package, you need to have at least the following installed:

In order to use the package to its full capabilities, it is recommended to install the following dependencies:

Installation

The package is set up to be used as an editable install, which makes prototyping very easy and does not require you to rebuild the package after every change.

Install it using pip:

$ git clone https://github.com/MarWaltz/TUD_RL.git
$ cd TUD_RL/
$ pip install -e .

Note that a normal package install via pip is not supported at the moment and will lead to import errors.
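
As a quick sanity check, you can verify the editable install from a Python shell (a minimal check; nothing more is assumed about the package layout):

import tud_rl  # should not raise an ImportError after `pip install -e .`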

Usage

Configuration files

In order to train in an environment using this package, you must specify a training configuration .yaml file and place it in one of the two folders in /tud_rl/configs, depending on the type of action space (discrete or continuous).

You will also find a variety of example configuration files in this folder.

For increased flexibility, please familiarize yourself with the parameters each algorithm offers.

Training

The recommended way to train in or visualize your environment is to run the tud_rl package as a module via the python -m flag.

To run the package, you have to supply the following flags to the module:

-m [--mode=]

The mode can be either train or visualize. If you want to visualize your environment, you must ensure that trained weights are supplied in the config file:

For discrete training, the config entry looks like:

---
dqn_weights: /path/to/weights.pth

For continuous training, you must supply both actor and critic weights:

---
actor_weights: /path/to/actor_weights.pth
critic_weights: /path/to/critic_weights.pth

-c [--config_file=]

Name of your configuration file placed in either /tud_rl/configs/discrete_actions or /tud_rl/configs/continuous_actions.

-a [--agent_name=]

Name of the agent you want to use for training or visualization. The specified agent must be present in your configuration file.

Example:

$ python -m tud_rl -m train -c myconfig.yaml -a DDQN

Gym environment integration

This package provides an interface for specifying your own custom training environment based on the OpenAI Gym framework. Once this is done, no further adjustment is needed and you can start training as described in the section above.

Structure

In order to integrate your own environment, you have to create a new file in /tud_rl/envs/_envs. There, you need to specify a class for your environment that implements at least the three methods seen in the following blueprint:

Empty custom env [minimal example]

# This file is named Dummy.py
import gym

class MyEnv(gym.Env):
    def __init__(self):
        super().__init__()
        """Your code"""

    def reset(self):
        """Reset your env"""
        pass

    def step(self, action):
        """Perform one step in the environment"""
        pass

    def render(self):  # optional
        """Render your env to an output"""
        pass

See this blog article for a detailed explanation of how to set up your own gym environment.
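
For orientation, below is a slightly fuller, hypothetical sketch of such an environment with explicit observation and action spaces and toy dynamics. The feature layout, dynamics, and reward are illustrative assumptions only (following the classic gym API of obs, reward, done, info); this is not a template shipped with the package.

# Hypothetical example: a feature-based env with explicit spaces and toy dynamics
import gym
import numpy as np
from gym import spaces

class MyEnv(gym.Env):
    def __init__(self):
        super().__init__()
        # illustrative choices: two continuous features, three discrete actions
        self.observation_space = spaces.Box(low=-1.0, high=1.0, shape=(2,), dtype=np.float32)
        self.action_space = spaces.Discrete(3)
        self.state = np.zeros(2, dtype=np.float32)

    def reset(self):
        self.state = np.zeros(2, dtype=np.float32)
        return self.state

    def step(self, action):
        # toy dynamics: the action nudges the first feature left, nowhere, or right
        self.state[0] = np.clip(self.state[0] + (action - 1) * 0.1, -1.0, 1.0)
        reward = -abs(float(self.state[0]))      # highest reward at the origin
        done = bool(abs(self.state[0]) >= 1.0)   # episode ends at the boundary
        return self.state, reward, done, {}      # classic gym API: obs, reward, done, info

    def render(self):  # optional
        print(f"state: {self.state}")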

Integration of custom environment into TUD_RL

Once your environment is specified, you need to register it with gym in order to add it to the list of callable environments. The registration is done in the /tud_rl/__init__.py file by choosing the id your environment will be called with and the entry point that tells gym where your custom environment is located (loc is the fixed base location, while the rest is the class name of your environment):

register(
    id="MyEnv-v0",
    entry_point=loc + "MyEnv",
)
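
To check that the registration worked, you can try to instantiate the environment directly via gym (a minimal sketch, assuming the MyEnv-v0 id from above; importing tud_rl executes the register() calls in /tud_rl/__init__.py):

import gym
import tud_rl  # executes the register() calls in /tud_rl/__init__.py

env = gym.make("MyEnv-v0")  # raises an error if the id or entry point is wrong
obs = env.reset()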

You can now select your environment in your configuration file under the env category.

Example (incomplete):

---
env:
  name: MyEnv-v0
  max_episode_steps: 100
  state_type: feature
  wrappers: []
  wrapper_kwargs: {}
  env_kwargs: {}
  info: ""
agent:
  DQN: {}
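
Assuming this configuration is saved as myconfig.yaml in /tud_rl/configs/discrete_actions, training with the DQN agent would then be started as described above:

$ python -m tud_rl -m train -c myconfig.yaml -a DQN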

Citation

If you use this code in one of your projects or papers, please cite it as follows:

@misc{TUDRL,
  author = {Waltz, Martin and Paulig, Niklas},
  title = {RL Dresden Algorithm Suite},
  year = {2022},
  publisher = {GitHub},
  journal = {GitHub Repository},
  howpublished = {\url{https://github.com/MarWaltz/TUD_RL}}
}