/deepracer-training-2019

Dump of all training tools used in 2019 races


DeepRacer 2019 Sandbox

This is my full collection of tools, notebooks, and scraps from my participation in the 2019 AWS DeepRacer Virtual League.

What you'll find in this repo:

  • Local training assets: container Dockerfiles, launch scripts mostly in bash, monitoring scripts
  • AWS cloud-based training scripts: pre-dating my local training setup, but most useful for cloud-based evaluations of local training
  • Analysis Notebook
  • Models/Experiments: each training session's hyperparameters, reward function, and action space
  • RoboMaker simapp: scripts to build the bundle, source files to add or replace files within the bundle
  • Twitch streaming assets: UI (flask-based), ffmpeg tools to stream from simulation
  • Airflow automation DAGs

NB: I am not an expert in ML/RL and participation in DeepRacer was a way to educate myself. Forgive me any naive or wrong approaches taken. Feel free to send me any observations, suggestions for different approaches, related papers or projects, or just to drop me a line.

Race Results

| Race | Standing |
| --- | --- |
| August 2019 Virtual Race (Shanghai Sudu) | 102 of 1375 |
| September 2019 Virtual Race (Cumulo Carrera) | 132 of 1338 |
| October 2019 Virtual Race (Toronto Turnpike) | 60 of 1983 |
| November 2019 Virtual Race (Championship Cup Warm-up) | 8 of 904 |
| AI Driving Olympics at NeurIPS, Phase I: Perception challenge | Top 10 |
| AI Driving Olympics at NeurIPS, Phase II: Simulation to Reality challenge | did not place |

The code and scripts are shared here unfiltered. Some items may be broken or hacky. The goal was to educate myself about reinforcement learning and train competitive models, sometimes at the expense of good coding practices. I'll be starting a new repo for any work I do on the 2020 DeepRacer races and won't be adding any more changes to this code.

Below are some select items that I hope will be of interest to those looking to compete in the 2020 DeepRacer League.


RoboMaker Bundle Management

The official SimApp bundle for DeepRacer is publicly readable and located at https://s3.amazonaws.com/deepracer-managed-resources/deepracer-github-simapp.tar.gz

robomaker/deepracer-simapp.tar.gz.md5 - MD5 of the bundle to verify we're using the correct base for file patches
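As a rough sketch of how that recorded checksum can be used to confirm the right base bundle before patching (the function names and the assumption that the `.md5` file follows the usual `<hash>  <filename>` layout are illustrative, not the repo's actual interface):

```python
import hashlib

def md5_of_file(path, chunk_size=1 << 20):
    """Stream the file through MD5 so large bundles aren't loaded into memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_bundle(bundle_path, md5_path):
    """Compare a downloaded bundle against the recorded .md5 file."""
    with open(md5_path) as f:
        expected = f.read().split()[0]  # assumes "<hash>  <filename>" format
    return md5_of_file(bundle_path) == expected
```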

airflow/monitor_deepracer_simapp.py - Script to monitor the hosted simapp bundle for changes. Currently uses a date-based validation, comparing the official bundle to a copy stored in an S3 bucket I own

patch/* - overlay files to add or replace files within the bundle. These are mostly local edits to the markov package, additional Gazebo assets, and added parameters for launch files.

scripts/bundle.sh - Create a bundle using the base simapp and overlaying files from patch/.
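bundle.sh itself is a shell script; as a minimal Python sketch of the overlay step it performs (assuming the base bundle has already been extracted to a directory — this is an illustration, not the script's actual logic):

```python
import shutil
from pathlib import Path

def overlay_patch(bundle_dir, patch_dir):
    """Copy every file under patch_dir into the extracted bundle,
    adding new files and replacing existing ones in place."""
    bundle_dir, patch_dir = Path(bundle_dir), Path(patch_dir)
    for src in patch_dir.rglob("*"):
        if src.is_file():
            dest = bundle_dir / src.relative_to(patch_dir)
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dest)  # preserves mtimes, overwrites in place
```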

scripts/publish.sh - Upload the patched bundle to an S3 bucket I own, from which RoboMaker can run patched simulations in the cloud


Local Training

This stack grew out of necessity rather than convenience. It is therefore fully customized to my preferences and does not use the well-known DeepRacer Community training stack on GitHub.

Goals for my local stack were:

  • Full access to the simapp bundle code to edit or add files
  • Fast iteration on code changes to the simapp bundle using Docker volumes to patch containers
  • Unified logging for later analysis
  • Replication of all training artifacts to S3, effectively making local storage a "cache" that can be cleared

Components:

  • dr-training - SageMaker/TensorFlow training
  • dr-simulation - RoboMaker/ROS/Gazebo simulation
  • dr-redis - pub/sub between dr-simulation and dr-training
  • dr-logger - "sidecar" logger to aggregate all container logs and write them to JSON files
  • dr-uploader - background synchronization of training assets and logs to S3 bucket
  • minio - S3 replacement to store training checkpoints locally
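The repo doesn't detail dr-uploader's internals in this overview, so the following is only an assumed sketch of one background sync pass using a modification-time watermark; the real sidecar may simply wrap `aws s3 sync` or boto3 (`files_newer_than` and `sync_once` are hypothetical names):

```python
import os
import time

def files_newer_than(root, last_sync):
    """Yield files under root modified after the previous sync pass."""
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) > last_sync:
                yield path

def sync_once(root, last_sync, upload):
    """One pass of the uploader loop: push changed files, return the new watermark."""
    for path in files_newer_than(root, last_sync):
        upload(path)  # e.g. an S3 put_object wrapper in the real sidecar
    return time.time()
```

Running this in a loop makes local storage a "cache": everything lands in S3 and the local copies can be cleared at any time.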

Interesting bits:

container/Dockerfile.* - Dockerfiles for the local training setup

scripts/launch_local.sh - Entrypoint for local training kickoff

models/* - Inputs for local training, a unique folder for each training session with hyperparameters, action space, reward function

docker-compose.yml - container configuration


Twitch Streaming

I streamed training at https://www.twitch.tv/deepstig later in the season. I used OBS to host a browser-based UI with a VLC stream overlay, sending frames out of my local training simulation via ffmpeg over UDP.

twitch/app.py - Flask app to show a UI with some near-real-time metrics

container/streamer.sh - In-container script that listens to ROS camera node RGB image messages and pipes them directly to ffmpeg stdin to generate an MPEG-TS stream over UDP
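streamer.sh's exact ffmpeg invocation isn't reproduced here, so this Python sketch only illustrates the general shape of such a pipeline: raw rgb24 frames written to ffmpeg's stdin come out as an MPEG-TS stream over UDP (the specific flag choices are assumptions, not the script's):

```python
def ffmpeg_stream_cmd(width, height, fps, dest):
    """Build an ffmpeg invocation that reads raw RGB frames from stdin
    and emits an MPEG-TS stream to a UDP destination."""
    return [
        "ffmpeg",
        "-f", "rawvideo",                # stdin carries raw, headerless frames
        "-pixel_format", "rgb24",
        "-video_size", f"{width}x{height}",
        "-framerate", str(fps),
        "-i", "-",                       # read frames from stdin
        "-f", "mpegts", dest,            # e.g. "udp://host:8000"
    ]
```

Each frame written to the subprocess's stdin must then be exactly `width * height * 3` bytes.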

scripts/monitor_video.sh - Script to launch streamer.sh within the container, passing in the udp stream destination


Log Analysis

Based on the AWS DeepRacer Workshop Jupyter notebook, but heavily modified. Any time I had a question about training progress or simulation behavior, I would add new features to it. It's really overgrown now, but it gives me a complete picture of training as I run it.

For brevity, I'll pull out a few interesting sections but you can click to the [full notebook](log_analysis/DeepRacer Log Analysis.ipynb) to see the code.

  • Training progress and loss. I would watch this to discover points at which I needed to stop training or adjust hyperparameters.
  • Action space usage. This helped me to know whether there were unused actions that could be culled.
  • Car performance during training. Mostly scatterplots of episodic metrics, with the mean for the iteration overlaid in orange. The most interesting is the fourth graph, which shows progress per lap as well as the ratio of completed laps. If the completion ratio was between the 20% (red) and 40% (green) lines, I would submit the model for racing. If the completion ratio was more than 40%, I would push the speed a little further and retrain.
  • Correlation of high rewards with high speed. If they don't correlate, there is most likely a problem in the reward function.
  • Heatmap showing rewards for each step. A good indicator of the line on the track that is being rewarded.
  • Exit points plot. Clumped exit points may indicate that the action space could be modified for a better turn angle, or that the reward function is rewarding a wrong action.
  • Actions mapping. Only really useful for an action space with one speed per steering angle. Correlates actions with track waypoints.
  • Single-episode summary. Shows step location, heading angle (black), steering angle (red), and episode pace.
  • Speed. The blue line is the actual speed, measured as incremental distance between steps. Yellow is throttle and cyan is steering. This makes it easy to see the effect of steering and throttle position on speed.
  • Correlation of steering with heading change.
  • Reward and progress. This graph verifies that higher rewards correspond to higher progress per step.
  • Slippage detection. The car can wipe out on turns if speed is too high. This graph shows when heading and direction of movement over ground diverge.
  • Inference on an image to find its action probabilities. This can indicate the health of the model.
  • GradCAM. Finds the aspects of the image that the network is focusing on.
  • Convolutional layer activations. Mostly for making the convolutional layers more interpretable by seeing the features they activate on.
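The 20%/40% submission rule above is simple enough to state in code. This is an illustrative restatement with the thresholds from the text, not the notebook's actual implementation:

```python
def completion_ratio(episode_progress):
    """Fraction of episodes in an iteration that reached 100% progress."""
    done = sum(1 for p in episode_progress if p >= 100.0)
    return done / len(episode_progress)

def next_step(ratio):
    """Decide what to do with the model based on lap-completion ratio."""
    if ratio < 0.20:
        return "keep training"
    if ratio <= 0.40:
        return "submit for racing"
    return "raise speed and retrain"
```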

Interesting items:

log_analysis/DeepRacer Log Analysis.ipynb - The notebook

log_analysis/images/* - Still image captures of a variety of tracks to use in analysis, such as running it through the model to get action probabilities


Airflow Automation

I had aspired to use Airflow to work through a queue of training and evaluation jobs, but ultimately didn't spend the time automating to that level. The primary use of Airflow was to submit the model to the virtual league every ~30 minutes.

It was unfortunate, but the leaders were so close that luck and brute force played a large part in reaching the top positions. The DAG used Selenium and ChromeDriver to submit the model, and also handled any authentication needed as part of that workflow.

airflow/deepracer_submit_dag.py - Submit a model for evaluation every 30 minutes


Resources

Official AWS Resources

Components of DeepRacer

  • AWS SageMaker
  • AWS RoboMaker
  • AWS Kinesis
  • AWS CloudWatch Logs
  • AWS S3
  • AWS Lambda

Useful Tools

Community Resources

Education

Other Useful Resources

Discussion Groups


DeepRacer Service Map