Reinforcement Learning on Autonomous Race Car

A reinforcement learning approach to the "Formula Student Technion Driverless" project, simulated in Unreal Engine 4 with the AirSim plugin, using the Soft Actor-Critic (SAC) algorithm and a Variational Autoencoder (VAE).

YouTube Video

Full Project Report



Prerequisites

  • Operating System: Ubuntu 18.04 or Windows 10
  • Software: Unreal Engine 4.24.3
  • GPU: Nvidia GTX 1080 or higher (recommended)

How To Build

  1. Set up Unreal Engine 4, AirSim, and the FSTD environment by following the guide for your OS: Ubuntu, Windows
  2. Windows only (Ubuntu users can skip this step): download the updated RaceCourse folder from this link and place it in ProjectName\Content
  3. Create a new Conda environment with Python 3.6 and install the requirements with pip install -r requirements.txt
  4. Download the pretrained VAE model and place it in the repository's directory

Run in Test Mode

If you wish to reproduce the results with the trained model:

  • Choose a map, press Play, and run test.py

Run in Train Mode

If you wish to train your own model:

  • Choose a map, press Play, and open train.py (see the sketch after this list)
    • To train from scratch, use initial learning in train.py and run it
    • To continue training a model on another map, use continual learning in train.py and run it
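A minimal sketch of the two modes, assuming a stable-baselines-style SAC API (the project's actual entry points live in train.py and custom_sac.py, and the environment wrapper name below is hypothetical):

```python
from stable_baselines import SAC

from airsim_env import AirSimEnv  # hypothetical AirSim car environment wrapper

env = AirSimEnv()

# Initial learning: train a fresh model from scratch.
model = SAC("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=100000)
model.save("sac_racecar")

# Continual learning: load the trained model and keep training it
# on another map (launch the new map in the simulator first).
model = SAC.load("sac_racecar", env=env)
model.learn(total_timesteps=100000)
model.save("sac_racecar_map2")
```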

Run in Train Mode + VAE Training

It is recommended to use the pretrained vae.json model, as training the VAE and SAC together takes considerably longer.

  1. Uncomment the vae.optimize() line in custom_sac.py
  2. Choose a map, press Play, use initial learning in train.py, and run it (see the sketch below)
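A hedged sketch of where that call sits in the training loop; the function and object names here are illustrative, not the repository's API:

```python
# Illustrative joint-optimization step at an episode boundary; the real
# hook is the commented-out vae.optimize() call in custom_sac.py.
def on_episode_end(sac, vae, replay_buffer, gradient_steps=64, batch_size=64):
    # SAC optimization steps run when the episode ends (see Overview).
    for _ in range(gradient_steps):
        batch = replay_buffer.sample(batch_size)
        sac.train_step(batch)
    # With VAE training enabled, the encoder is also refit on the
    # cropped camera frames gathered so far; this is what makes
    # joint VAE + SAC training slow.
    vae.optimize()
```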

Overview

This project is an experiment in deep reinforcement learning: training a fully autonomous race car to drive through a cone-marked race track, as a contribution to the Technion team in the "Formula Student Driverless" competition.

The main goal is to learn a steering policy that keeps the car between the cones at a constant throttle, which brings the car to a speed of about 7.5 m/s (27 km/h). After about an hour of training, the car completes a lap successfully.

Pipeline:

  • The observation is obtained from a simulated camera mounted on the car, cropped to remove redundant data, and then encoded by the VAE.
  • Each VAE-encoded observation is concatenated with the last 20 actions.
  • Every 4 VAE-encoded observations are stacked together and fed into the SAC algorithm (following Google DeepMind's frame-stacking idea); a sketch of this preprocessing follows the figure below.
  • When the car goes out of bounds or hits a cone, the episode ends and SAC optimization steps are performed.
  • If VAE optimization is enabled, the VAE is optimized as well.
Figure: cropped camera input and the corresponding VAE output.
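A minimal sketch of the preprocessing described above; the crop boundaries, the VAE interface, and the latent dimension are illustrative assumptions, not the repository's exact values:

```python
import numpy as np

N_ACTION_HISTORY = 20  # last 20 actions appended to each encoding
N_STACK = 4            # consecutive encodings stacked for SAC

def encode_observation(frame, vae, action_history):
    """Crop the camera frame, encode it with the VAE, and append recent actions."""
    cropped = frame[40:120, :, :]  # drop sky/hood rows (illustrative crop)
    z = vae.encode(cropped)        # latent vector, e.g. shape (latent_dim,)
    actions = np.asarray(action_history[-N_ACTION_HISTORY:]).ravel()
    return np.concatenate([z, actions])

def stack_observations(encoded):
    """Stack the 4 most recent encoded observations into one SAC input."""
    return np.concatenate(encoded[-N_STACK:])
```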

The reward function is based on the car's distance from the center of the track: the closer the car is to the center, the higher the reward. If the car hits a cone or exits the track, it receives a penalty. The center of the track is found by taking the 2 WayPoints closest to the car and computing the distance between the car and the line connecting those 2 WayPoints.

Figures: WayPoints example (left) and distance calculation (right).

Calculation of the distance from the car at $(x_0, y_0)$ to the line through the two closest WayPoints $(x_1, y_1)$ and $(x_2, y_2)$:

$$d = \frac{\left|(x_2 - x_1)(y_1 - y_0) - (x_1 - x_0)(y_2 - y_1)\right|}{\sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}}$$
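A short sketch of this computation and the resulting reward; the penalty value and distance normalization below are illustrative assumptions:

```python
import math

def distance_to_center(car, wp1, wp2):
    """Perpendicular distance from the 2D point `car` to the line through wp1 and wp2."""
    (x0, y0), (x1, y1), (x2, y2) = car, wp1, wp2
    numerator = abs((x2 - x1) * (y1 - y0) - (x1 - x0) * (y2 - y1))
    denominator = math.hypot(x2 - x1, y2 - y1)
    return numerator / denominator

def reward(car, wp1, wp2, crashed, max_dist=1.5, penalty=-10.0):
    """Higher reward near the track center; fixed penalty on cone hit / track exit."""
    if crashed:
        return penalty
    return 1.0 - distance_to_center(car, wp1, wp2) / max_dist
```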

Citation

If this project helped you, please cite this repository in publications:

@misc{Reinforcement-Learning-on-Autonomous-Race-Car,
  author = {Kanfi, Elior},
  title = {Reinforcement Learning on Autonomous Race Car},
  year = {2021},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/eliork/Reinforcement-Learning-on-Autonomous-Race-Car/}},
}

Credits