
SafeRL-Kit: Evaluating Efficient Reinforcement Learning Methods for Safe Autonomous Driving


Implemented algorithms: TD3, EPO, LAG, FAC, REC, QPSL

Description

SafeRL-Kit is a toolkit for benchmarking efficient and safe RL methods for autonomous driving tasks. It consists of the components described below.

Repository Structure

The repository is organized as follows:

/metadrive: the safe-RL environment for car driving, implemented with MetaDrive.

/bullet_safety_gym: the safe-RL environment for car speed control, implemented with Bullet-Safety-Gym.

/safe_algo: this folder contains all the reinforcement learning algorithms used in this repository. All algorithms are implemented on top of the TD3 structure; every method shares the same class structure, and the only differences lie in how the methods of these classes are implemented.
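For illustration, a minimal sketch of that shared structure is given below; the class and method names are illustrative assumptions and may differ from the actual code in /safe_algo.

```python
# Illustrative sketch only: the concrete class/method names in /safe_algo may differ.
import torch


class TD3BasedAgent:
    """Common TD3-style skeleton shared by the safe-RL variants."""

    def __init__(self, actor, critic, device="cpu"):
        self.device = torch.device(device)
        self.actor = actor.to(self.device)
        self.critic = critic.to(self.device)

    @torch.no_grad()
    def select_action(self, state):
        # Deterministic policy output used when interacting with the environment.
        state = torch.as_tensor(state, dtype=torch.float32, device=self.device)
        return self.actor(state).cpu().numpy()

    def train(self, replay_buffer, batch_size=256):
        # Each safe-RL variant (LAG, FAC, QPSL, ...) changes how the cost signal
        # enters this update; the TD3 actor/critic machinery stays the same.
        raise NotImplementedError
```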

/saferl_utils: this folder defines the experience replay buffer and the network architectures used by all algorithms, matching the folder /safe_algo. networks.py defines all the networks used in this repository, which follow the TD3 structure and use two hidden layers of 256 units each; the replay buffer used by the off-policy algorithms is defined in replay_buffer.py.
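As a rough illustration of this architecture, the sketch below shows a TD3-style actor/critic pair with two 256-unit hidden layers; the exact definitions in networks.py may differ in detail.

```python
# Minimal sketch of TD3-style actor/critic MLPs with two 256-unit hidden layers.
import torch
import torch.nn as nn


class Actor(nn.Module):
    def __init__(self, state_dim, action_dim, max_action):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, action_dim), nn.Tanh(),
        )
        self.max_action = max_action

    def forward(self, state):
        # Scale the tanh output to the environment's action range.
        return self.max_action * self.net(state)


class Critic(nn.Module):
    def __init__(self, state_dim, action_dim):
        super().__init__()
        # TD3 normally keeps two such Q-networks (twin critics); one is shown for brevity.
        self.q = nn.Sequential(
            nn.Linear(state_dim + action_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, state, action):
        return self.q(torch.cat([state, action], dim=-1))
```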

/safety_plotter: this folder contains the logging and plotting utilities (log-file settings and figure drawing).

plot.py plots figures from the log files generated by the logger.

train_metadrive.py: the main script for running experiments in MetaDrive; it parses command-line arguments from the user.
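The sketch below illustrates how such argument parsing typically looks, using the flags from the example command in the Running Experiments section; the defaults shown here are assumptions, not a copy of the actual script.

```python
# Illustrative sketch of the command-line interface; the real
# train_metadrive.py may define more options and different defaults.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--use_td3", action="store_true")     # select the algorithm, one flag per method
parser.add_argument("--env", default="SafeMetaDriveEnv")  # environment name
parser.add_argument("--save_model", action="store_true")  # save checkpoints during training
parser.add_argument("--exp_name", default="td3")          # experiment name used by the logger
parser.add_argument("--seed", type=int, default=0)        # random seed
parser.add_argument("--loop", type=int, default=5)        # number of repeated runs
parser.add_argument("--device", type=int, default=0)      # GPU index
parser.add_argument("--max_timesteps", type=int, default=1_000_000)  # total environment steps
args = parser.parse_args()
```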

Installation and Setup

To install the dependencies, run the following command:

pip install -r requirements.txt

Running Experiments

We can run and test all the safe reinforcement learning algorithms and both environments included in the framework with a command line such as:

python main.py --use_td3 --env SafetyCarRun-v0 --save_model --exp_name td3 --seed 0 --loop 5 --device 1 --max_timesteps 1000000

We can select any of the algorithms defined in this repository with a flag analogous to --use_td3, as shown in the train_metadrive.py file. We can also choose the environment for an experiment by setting --env env_name; currently, SafeMetaDriveEnv and SafetyCarRun-v0 are supported. In addition, we can use a custom environment by implementing a standard Gym environment.
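A custom environment only has to follow the standard Gym interface and expose a per-step safety cost. The sketch below is a hypothetical minimal example that assumes the toolkit reads the cost from info["cost"]; check the bundled environments for the actual convention.

```python
# Hypothetical minimal custom environment; assumes a scalar per-step safety
# cost is reported in info["cost"], which is a common safe-RL convention.
import gym
import numpy as np
from gym import spaces


class MyCustomSafeEnv(gym.Env):
    def __init__(self):
        self.observation_space = spaces.Box(-1.0, 1.0, shape=(4,), dtype=np.float32)
        self.action_space = spaces.Box(-1.0, 1.0, shape=(2,), dtype=np.float32)
        self._t = 0

    def reset(self):
        self._t = 0
        return self.observation_space.sample()

    def step(self, action):
        self._t += 1
        obs = self.observation_space.sample()            # placeholder observation
        reward = float(np.linalg.norm(action))           # placeholder reward
        cost = float(np.abs(action[0]) > 0.8)            # placeholder safety cost
        done = self._t >= 1000
        return obs, reward, done, {"cost": cost}
```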

Showing Results

We can inspect the results by plotting the logs generated by the logger, which are placed under /logs by default.

Use the following command in the root folder to plot the results:

./plot.sh

This will plot the episode reward, episode cost, and cost rate in sequence.
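To plot a single run manually, something like the fragment below could be used; the log path and column names here (EpRet, EpCost, CostRate) are assumptions and should be checked against the actual files under /logs.

```python
# Hypothetical manual plotting of one run; the log location and column
# names are assumptions, check the files under /logs for the real format.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("logs/td3/progress.txt", sep="\t")          # assumed tab-separated log
fig, axes = plt.subplots(1, 3, figsize=(12, 3))
for ax, col in zip(axes, ["EpRet", "EpCost", "CostRate"]):   # assumed column names
    ax.plot(df[col])
    ax.set_title(col)
    ax.set_xlabel("iteration")
plt.tight_layout()
plt.show()
```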