RL-DWA

Summary: Using the Stable-Baselines3 PPO reinforcement learning algorithm to train a dynamic window approach (DWA) agent

💡Installation and Description

1️⃣ Stable-Baselines3 [🔗LINK]

PPO from the open-source Stable-Baselines3 library is used as the reinforcement learning algorithm. Click the installation link for Stable-Baselines3 above.

The code was tested on Windows in a conda virtual environment with Python 3.7.

Please make sure to install mutually compatible versions of stable-baselines3, tensorboard, PyTorch, Python, and so on.
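One possible setup is sketched below; the environment name rl-dwa and the unpinned package versions are assumptions, so pin whatever versions are compatible with each other:

$ conda create -n rl-dwa python=3.7
$ conda activate rl-dwa
$ pip install stable-baselines3 tensorboard pygame

Installing stable-baselines3 via pip should pull in a compatible PyTorch as a dependency.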

2️⃣ Pygame Environment [🔗LINK]

The base idea for the dynamic window approach pygame environment comes from the link above.

You can check the modified code in scripts/dynamic_window_approach_game.py.

The main difference is that the mobile robot's control output was changed from (vr, vl), the right and left wheel angular speeds, to (v, w), the vehicle's linear and angular speed, as sketched below.
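For reference, (vr, vl) and (v, w) are related by standard differential-drive kinematics. The sketch below is illustrative and not taken from the repository; the wheel radius r and wheel separation L are hypothetical parameters:

# Illustrative differential-drive conversion between wheel speeds (vr, vl)
# and body velocities (v, w). Not from the repository: the wheel radius r
# and wheel separation L are hypothetical parameters.

def wheels_to_body(vr, vl, r=0.05, L=0.3):
    """Wheel angular speeds (rad/s) -> linear speed v (m/s) and angular speed w (rad/s)."""
    v = r * (vr + vl) / 2.0
    w = r * (vr - vl) / L  # positive w = counter-clockwise turn
    return v, w

def body_to_wheels(v, w, r=0.05, L=0.3):
    """Linear and angular body velocities -> wheel angular speeds (rad/s)."""
    vr = (2.0 * v + w * L) / (2.0 * r)
    vl = (2.0 * v - w * L) / (2.0 * r)
    return vr, vl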

⭐Main

In your command prompt (e.g., Anaconda PowerShell Prompt), execute:

$ python DWA_learn_main.py

There is an option to log only rewards, or to log full details including training hyperparameters and losses. If you wish to see the detailed training logs, go to scripts/DWA_learn_main.py and uncomment lines 22 and 28, or add the following if you cannot find them:

from stable_baselines3.common.logger import configure

c_logger = configure(logdir, ["stdout", "csv", "tensorboard"])  # write logs to stdout, CSV, and TensorBoard
model.set_logger(c_logger)
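For context, a minimal training script that wires these logger calls into PPO might look like the sketch below. It is illustrative, not the repository's exact code: the DWAGameEnv class, the logs/detail directory, and the hyperparameters shown are assumptions.

# Minimal PPO training sketch (illustrative; not the repository's exact script).
# DWAGameEnv is a hypothetical gym.Env wrapper around the pygame DWA game,
# and "logs/detail" is an assumed log directory.
from stable_baselines3 import PPO
from stable_baselines3.common.logger import configure

from dynamic_window_approach_game import DWAGameEnv  # hypothetical import

env = DWAGameEnv()
model = PPO("MlpPolicy", env, verbose=1, tensorboard_log="logs")

# Optional detailed logging (the two lines discussed above):
c_logger = configure("logs/detail", ["stdout", "csv", "tensorboard"])
model.set_logger(c_logger)

model.learn(total_timesteps=100_000)
model.save("dwa_ppo")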

While training, you can monitor progress by executing:

$ tensorboard --logdir=logs   # simple logs
$ tensorboard --logdir=${your saved log directory name}   # detailed logs
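By default, TensorBoard serves its dashboard at http://localhost:6006.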

🚎Algorithm Application in a Real-World Mobile Robot

For detailed information, check the following link: HERE.