
Rare Event Sampling for Safety Verification within the CARLA simulator

Primary language: Python. License: MIT.

PEM Rare-Event Sampling in CARLA

Code for the paper: Testing Rare Downstream Safety Violations via Upstream Adaptive Sampling of Perception Error Models.

This code base runs a cross-entropy-based adaptive importance sampling algorithm for an automated braking scenario. In this scenario, the ego vehicle is following another vehicle on a straight road, until the car in front brakes at a red light. The ego vehicle must brake to avoid crashing. The ego vehicle uses a Perception Error Model as a surrogate for a YOLO-based obstacle detector trained on the KITTI dataset. The safety specification for avoiding a crash is written in signal temporal logic:

$$\square_{[0, T]} \left( \lVert C_{ego}.pos - C_{other}.pos \rVert \geq 2 \right)$$
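As a rough illustration of how a specification of this form can be scored, the sketch below computes the standard min-based robustness of "always (distance >= 2)" over a trajectory. The function name, the 2D position tuples, and the example values are assumptions made for illustration; they are not taken from this code base.

```python
import math


def always_distance_robustness(ego_positions, other_positions, min_distance=2.0):
    """Robustness of 'always (distance >= min_distance)' over a trajectory.

    The value is the minimum over time of (distance - min_distance).
    A positive value means the specification holds with that margin;
    a negative value means it was violated at some time step.
    """
    if len(ego_positions) != len(other_positions) or not ego_positions:
        raise ValueError("trajectories must be non-empty and of equal length")
    return min(
        math.dist(ego, other) - min_distance
        for ego, other in zip(ego_positions, other_positions)
    )


# Hypothetical positions for a short, made-up trajectory (not simulator output).
ego = [(0.0, 0.0), (4.0, 0.0), (8.0, 0.0)]
other = [(10.0, 0.0), (11.0, 0.0), (11.5, 0.0)]
robustness = always_distance_robustness(ego, other)
```

A rollout counts as a failure under this reading when the returned value is negative.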



  • Install and run the CARLA simulator, following the installation instructions in the CARLA documentation.
  • Install the Python package prerequisites with python -m pip install -r requirements.txt

Pre-trained models

A pre-trained version of the Perception Error Model used for the automated braking experiments (listed as ML-NN in the paper) can be found under models/det_baseline_full/pem_class_train_full. Pass this path as the pem argument in the experiment runner scripts.

If you want to train the PEMs from scratch, you will first need a YOLO obstacle detector trained on the KITTI dataset. For example, a pre-trained PyTorch version can be found here.

For the alternative baseline PEMs in the paper, the logistic-regression PEM (LR) was trained using scikit-learn, and the Bayesian Neural Network (B-NN) was trained using Pyro.

How to Run

Running Simulations

To run the adaptive importance sampling experiment for the automated braking scenario from the paper, run the command:

runSim.py <config-path>

Here, config-path is the path to one of the experiment configuration files (e.g., configs/classic.yaml).

Configuration Parameters

Within a given .yaml configuration file, you can set the following configuration parameters:

  • exp_name: Name given to experiment. Simulation rollout logs will be saved to


Learned proposal samplers will be saved to

  • render: True/False flag controlling whether simulations are rendered.
  • repetitions: Number of times to repeat the same experiment (used to compute averaged statistics).
  • cem_stages: Number of adaptive simulation stages to run in cross-entropy importance sampling. Corresponds to the $K$ parameter in the paper.
  • episodes: Number of simulation rollouts sampled per adaptive stage. Corresponds to the $N_{\kappa}$ parameter in the paper.
  • timesteps: Number of time steps per simulation episode. Corresponds to the $T$ parameter in the paper.
  • vel_burn_in_time: Number of time steps at the start of the simulation that are not recorded as part of the trajectory. Useful for getting the cars up to speed before tracking begins.
  • pem_path: Path to the pre-trained perception error model.
  • safety_func: STL robustness metric used for evaluation. Choices are ["classic", "agm", "smooth_cumulative"].
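To give a feel for how cem_stages and episodes interact, here is a minimal toy sketch of cross-entropy adaptive importance sampling. It is not this repository's implementation: it assumes a one-dimensional unit-variance Gaussian in place of the PEM and the CARLA rollouts, treats "x >= threshold" as the failure event, and uses a hypothetical elite_frac parameter that is not one of the configuration options above.

```python
import math
import random


def normal_logpdf(x, mean):
    """Log-density of a unit-variance Gaussian with the given mean."""
    return -0.5 * (x - mean) ** 2 - 0.5 * math.log(2.0 * math.pi)


def cross_entropy_sampling(cem_stages, episodes, threshold=3.0, elite_frac=0.1, seed=0):
    """Toy cross-entropy importance sampling for the rare event x >= threshold.

    cem_stages plays the role of K and episodes the role of N_kappa.
    Returns the learned proposal mean and the estimated failure probability
    under the nominal distribution N(0, 1).
    """
    rng = random.Random(seed)
    proposal_mean = 0.0  # start from the nominal distribution
    for _ in range(cem_stages):
        samples = [rng.gauss(proposal_mean, 1.0) for _ in range(episodes)]
        # Robustness-like score: negative means the failure event occurred.
        scores = sorted(threshold - x for x in samples)
        # Stage level: the elite quantile, clipped at the true failure level 0.
        level = max(scores[int(elite_frac * episodes)], 0.0)
        elite = [x for x in samples if threshold - x <= level]
        # Likelihood ratios nominal/proposal correct for sampling from the proposal.
        weights = [
            math.exp(normal_logpdf(x, 0.0) - normal_logpdf(x, proposal_mean))
            for x in elite
        ]
        proposal_mean = sum(w * x for w, x in zip(weights, elite)) / sum(weights)

    # Final importance-sampling estimate using the adapted proposal.
    samples = [rng.gauss(proposal_mean, 1.0) for _ in range(episodes)]
    estimate = sum(
        math.exp(normal_logpdf(x, 0.0) - normal_logpdf(x, proposal_mean))
        for x in samples
        if x >= threshold
    ) / episodes
    return proposal_mean, estimate


proposal_mean, failure_prob = cross_entropy_sampling(cem_stages=5, episodes=1000)
```

In this toy setting the proposal mean drifts toward the failure region over the stages, so failures that are rare under the nominal distribution are sampled often and then down-weighted by their likelihood ratios.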

If run with the --render flag, you should see the following:

Server window: [image: Automated Braking Simulation]

On-board camera output: [image: Camera Output]

Analysing Data

Summary metrics

To chart the number of failures and the average negative log-likelihood for a given set of simulation rollouts, run:

analyzeFailures.py [<experiment-folder-paths>]

Here, experiment-folder-paths are the paths to the top-level simulation data folders you want to compare, for example sim_data/STL_Classic/<timestamp>.
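The two summary quantities can be sketched as follows. The record layout (a list of dicts with "robustness" and "log_likelihood" keys) is a hypothetical stand-in for the rollout logs, and counting a failure as negative robustness is an assumption; the actual log format read by analyzeFailures.py may differ.

```python
def summarize_rollouts(rollouts):
    """Count failures and average the negative log-likelihood over all rollouts.

    Each rollout is assumed to be a dict with a 'robustness' value
    (negative means the safety specification was violated) and a
    'log_likelihood' value for the sampled trajectory.
    """
    if not rollouts:
        raise ValueError("at least one rollout is required")
    failures = sum(1 for r in rollouts if r["robustness"] < 0.0)
    avg_nll = sum(-r["log_likelihood"] for r in rollouts) / len(rollouts)
    return {"failures": failures, "avg_nll": avg_nll}


# Made-up example records, not real simulation output.
example_rollouts = [
    {"robustness": 1.5, "log_likelihood": -2.0},
    {"robustness": -0.2, "log_likelihood": -6.0},
    {"robustness": 0.3, "log_likelihood": -3.0},
    {"robustness": -1.0, "log_likelihood": -9.0},
]
summary = summarize_rollouts(example_rollouts)
```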

Visualizing the proposal

To see a plot of a given proposal distribution, run:

vizDistanceProposal.py <cem-model-path>

This plots the detection rate that the proposal model assigns to a state against that state's distance to the car in front.