Simulated data and experiments for the use of machine learning in outbreak parameter (CFR) estimation.

ebola_out_simulation

Comparison of machine learning methods for estimating case fatality ratios: An Ebola outbreak simulation study.

Alpha Forna¹, PhD, Ilaria Dorigatti², PhD, Pierre Nouvellet²,³, PhD, and Christl A. Donnelly²,⁴, ScD

  1. School of Computing Science, Simon Fraser University, Burnaby, BC, V5A 1S6, Canada
  2. MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, Imperial College London, London, United Kingdom.
  3. School of Life Sciences, University of Sussex, Brighton, UK.
  4. Department of Statistics, University of Oxford, Oxford, UK.

Introduction

Using simulated data, we apply a machine learning (ML) algorithmic framework to evaluate data imputation performance and the resulting case fatality ratio (CFR) estimates, focusing on the scale and type of data missingness: missing completely at random (MCAR), missing at random (MAR), or missing not at random (MNAR).
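To make the three missingness mechanisms concrete, the following base R sketch masks a simulated outcome under MCAR, MAR, and MNAR and compares the complete-case CFR with the true value. It is an illustration only and not the repository's simulation code: the covariate `age`, the fatality model, the missingness probabilities, and the helper `mask()` are all assumptions chosen for the example.

```r
# Illustrative sketch only: variable names, effect sizes, and missingness
# rates are assumptions, not taken from the repository.
set.seed(42)

n <- 5000
age <- round(runif(n, 1, 80))                 # hypothetical covariate
p_death <- plogis(-0.5 + 0.02 * (age - 40))   # assumed fatality model
died <- rbinom(n, 1, p_death)                 # 1 = died, 0 = recovered

true_cfr <- mean(died)

# Probability that the outcome is missing under each mechanism
p_mcar <- rep(0.3, n)                         # unrelated to any variable
p_mar  <- plogis(-1.5 + 0.03 * (age - 40))    # depends on observed age only
p_mnar <- ifelse(died == 1, 0.15, 0.45)       # depends on the outcome itself

# Hypothetical helper: set each outcome to NA with the given probability
mask <- function(outcome, p_missing) {
  outcome[runif(length(outcome)) < p_missing] <- NA
  outcome
}

outcomes <- list(
  MCAR = mask(died, p_mcar),
  MAR  = mask(died, p_mar),
  MNAR = mask(died, p_mnar)
)

# Complete-case CFR: deaths divided by cases with a known outcome
naive_cfr <- sapply(outcomes, mean, na.rm = TRUE)
missing_fraction <- sapply(outcomes, function(x) mean(is.na(x)))

print(round(c(true = true_cfr, naive_cfr), 3))
print(round(missing_fraction, 3))
```

In this sketch the complete-case estimate should stay close to the true CFR under MCAR, while under MNAR it is pushed upward because survivors are assumed to lose their outcome more often than fatal cases. Imputation methods are evaluated on how well they correct this kind of distortion.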

Content

Outbreak simulation

Algorithmic framework used to simulate outbreak data characteristics.

Requirements

Main R packages required to reproduce the simulation experiments.

Result visualisation

Sample script for visualising the outputs of the simulation experiments.

Simulated data

A sample of the simulated data used for these experiments. These data were simulated based on real outbreak data from the World Health Organization (WHO). Requests for access to the real outbreak data should be made directly to the WHO by individual researchers or research groups.