Reinforcement Learning on Autonomous Race Car

A reinforcement learning approach to the "Formula Student Technion Driverless" project, simulated in Unreal Engine 4 with the AirSim plugin, using the Soft Actor-Critic (SAC) algorithm and a Variational Autoencoder (VAE).

YouTube Video

Full Project Report



Prerequisites

  • Operating System: Ubuntu 18.04 or Windows 10
  • Software: Unreal Engine 4.24.3
  • GPU: Nvidia GTX 1080 or higher (recommended)

How To Build

  1. Set up Unreal Engine 4, AirSim, and the FSTD environment by following the guide for your OS: Ubuntu, Windows
  2. Windows only (Ubuntu users can skip this step): download the updated RaceCourse folder from this link and place it in ProjectName\Content
  3. Create a new Conda environment with Python 3.6 and install the requirements with pip install -r requirements.txt
  4. Download the pretrained VAE model and place it in the repository's directory

Run in Test Mode

If you wish to reproduce the results with the trained model:

  • Choose a map, press Play, and run test.py

Run in Train Mode

If you wish to train your own model:

  • Choose a map, press Play, and open train.py (see the sketch after this list)
    • To train from scratch, use initial learning in train.py and run it
    • To continue training a model on another map, use continual learning in train.py and run it
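A minimal sketch of the two modes, assuming a stable-baselines-style SAC API (the project's actual entry points live in train.py and custom_sac.py, and the environment wrapper name below is hypothetical):

```python
from stable_baselines import SAC

from airsim_env import AirSimEnv  # hypothetical AirSim car environment wrapper

env = AirSimEnv()

# Initial learning: train a fresh model from scratch.
model = SAC("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=100000)
model.save("sac_racecar")

# Continual learning: load the trained model and keep training it
# on another map (launch the new map in the simulator first).
model = SAC.load("sac_racecar", env=env)
model.learn(total_timesteps=100000)
model.save("sac_racecar_map2")
```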

Run in Train Mode + VAE Training

It is recommended to use the pretrained vae.json model, as training the VAE and SAC together takes considerably longer.

  1. Uncomment the vae.optimize() line in custom_sac.py
  2. Choose a map, press Play, use initial learning in train.py, and run it (see the sketch below)
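A hedged sketch of where that call sits in the training loop; the function and object names here are illustrative, not the repository's API:

```python
# Illustrative joint-optimization step at an episode boundary; the real
# hook is the commented-out vae.optimize() call in custom_sac.py.
def on_episode_end(sac, vae, replay_buffer, gradient_steps=64, batch_size=64):
    # SAC optimization steps run when the episode ends (see Overview).
    for _ in range(gradient_steps):
        batch = replay_buffer.sample(batch_size)
        sac.train_step(batch)
    # With VAE training enabled, the encoder is also refit on the
    # cropped camera frames gathered so far; this is what makes
    # joint VAE + SAC training slow.
    vae.optimize()
```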

Overview

This project is an experiment in deep reinforcement learning: training a fully autonomous race car to drive through a cone-marked race track, as a contribution to the Technion team in the "Formula Student Driverless" competition.

The main goal is to learn a steering policy that keeps the car between the cones at a constant throttle, which brings the car to a speed of about 7.5 m/s (27 km/h). After about an hour of training, the car completes a lap successfully.

Pipeline:

  • The observation is obtained from a simulated camera mounted on the car, cropped to remove redundant data, and then encoded by the VAE.
  • Each VAE-encoded observation is concatenated with the last 20 actions.
  • Every 4 VAE-encoded observations are stacked together and fed into the SAC algorithm (following Google DeepMind's frame-stacking idea); a sketch of this preprocessing follows the figure below.
  • When the car goes out of bounds or hits a cone, the episode ends and SAC optimization steps are performed.
  • If VAE optimization is enabled, the VAE is optimized as well.
Figure: cropped camera input and the corresponding VAE output.
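A minimal sketch of the preprocessing described above; the crop boundaries, the VAE interface, and the latent dimension are illustrative assumptions, not the repository's exact values:

```python
import numpy as np

N_ACTION_HISTORY = 20  # last 20 actions appended to each encoding
N_STACK = 4            # consecutive encodings stacked for SAC

def encode_observation(frame, vae, action_history):
    """Crop the camera frame, encode it with the VAE, and append recent actions."""
    cropped = frame[40:120, :, :]  # drop sky/hood rows (illustrative crop)
    z = vae.encode(cropped)        # latent vector, e.g. shape (latent_dim,)
    actions = np.asarray(action_history[-N_ACTION_HISTORY:]).ravel()
    return np.concatenate([z, actions])

def stack_observations(encoded):
    """Stack the 4 most recent encoded observations into one SAC input."""
    return np.concatenate(encoded[-N_STACK:])
```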

The reward function is based on the car's distance from the center of the track: the closer the car is to the center, the higher the reward. If the car hits a cone or exits the track, it receives a penalty. The center of the track is found by taking the 2 WayPoints closest to the car and computing the distance between the car and the line connecting those 2 WayPoints.

Figures: WayPoints example (left) and distance calculation (right).

Calculation of the distance from the car at $(x_0, y_0)$ to the line through the two closest WayPoints $(x_1, y_1)$ and $(x_2, y_2)$:

$$d = \frac{\left|(x_2 - x_1)(y_1 - y_0) - (x_1 - x_0)(y_2 - y_1)\right|}{\sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}}$$
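A short sketch of this computation and the resulting reward; the penalty value and distance normalization below are illustrative assumptions:

```python
import math

def distance_to_center(car, wp1, wp2):
    """Perpendicular distance from the 2D point `car` to the line through wp1 and wp2."""
    (x0, y0), (x1, y1), (x2, y2) = car, wp1, wp2
    numerator = abs((x2 - x1) * (y1 - y0) - (x1 - x0) * (y2 - y1))
    denominator = math.hypot(x2 - x1, y2 - y1)
    return numerator / denominator

def reward(car, wp1, wp2, crashed, max_dist=1.5, penalty=-10.0):
    """Higher reward near the track center; fixed penalty on cone hit / track exit."""
    if crashed:
        return penalty
    return 1.0 - distance_to_center(car, wp1, wp2) / max_dist
```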

Citation

If this project helped you, please cite this repository in publications:

@misc{Reinforcement-Learning-on-Autonomous-Race-Car,
  author = {Kanfi, Elior},
  title = {Reinforcement Learning on Autonomous Race Car},
  year = {2021},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/eliork/Reinforcement-Learning-on-Autonomous-Race-Car/}},
}

Credits