OneQuietNight Covid-19 Forecast

Forecast: National, state, and county numbers of new COVID-19 cases per week for the next 4 weeks.
Authors: Areum Jo (areumjo1@gmail.com), Jae Cho (jaehun.cho@gmail.com)
Last Updated: 2021-01-10
Paper: OneQuietNight Covid-19 Forecast

OneQuietNight Covid-19 Forecast uses scientifically driven machine learning models to predict the spread of Covid-19 infections using real-time data from Delphi COVIDcast, JHU CSSE, The COVID Tracking Project, Apple Mobility Trend Reports, Google COVID-19 Community Mobility Reports, and the C3 AI Covid-19 Data Lake. OneQuietNight forecasts the number of new Covid-19 cases per week for the next 4 weeks at the national, state, and county levels.

We publish the forecasts through a web application and submit them to the CDC to help inform public health decision-making.

Runbook

Install:

git clone https://github.com/One-Quiet-Night/COVID-19-forecast
cd COVID-19-forecast
python setup.py install

Retrain the model and generate real-time predictions:

python main.py

Basic usage

OneQuietNightEnvironment is the main entry point for the program.

from onequietnight.env import OneQuietNightEnvironment
env = OneQuietNightEnvironment()

OneQuietNightEnvironment() takes two optional parameters:

  • base_path: pathlib.Path directory where data can be cached for future analysis. If it is omitted, all data is fetched and kept in memory only. If it is provided, some data is persisted to disk under base_path.
  • today: ISO-format date string such as "2020-11-18" that overrides today's date. It sets the "end" date cutoff for the c3ai data and determines the epidemiological weeks for the forecast date and the target end date of the forecast.
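
To make the role of today more concrete, here is a minimal sketch of how an ISO date string could be mapped to weekly target end dates. It assumes epidemiological weeks run Sunday through Saturday and that the first forecast target is the Saturday one week after the end of the current week. The helper names epiweek_end and target_end_dates are hypothetical and are not part of the onequietnight package, whose actual rule may differ.

```python
from datetime import date, timedelta
from typing import List


def epiweek_end(day: date) -> date:
    """Return the Saturday closing the Sunday-to-Saturday week that contains `day`."""
    # date.weekday(): Monday=0 ... Saturday=5, Sunday=6
    days_until_saturday = (5 - day.weekday()) % 7
    return day + timedelta(days=days_until_saturday)


def target_end_dates(today: str, horizons: int = 4) -> List[date]:
    """Assumed rule: targets are the `horizons` Saturdays after the current week ends."""
    forecast_date = date.fromisoformat(today)
    current_week_end = epiweek_end(forecast_date)
    return [current_week_end + timedelta(weeks=k) for k in range(1, horizons + 1)]


for target in target_end_dates("2020-11-18"):
    print(target.isoformat())
```

Passing a fixed today string in this way makes a past forecast reproducible, since every derived date depends only on that one value.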

OneQuietNightEnvironment contains the following functions:

  • get_or_create_locations_df: Create the location data. The location data joins three pieces of data together:

  • get_data: Download source data.

  • The following data are downloaded from Apple, The COVID Tracking Project, Google, and JHU:

    • Apple_DrivingMobility
    • Apple_TransitMobility
    • Apple_WalkingMobility
    • CovidTrackingProject_ConfirmedCases
    • CovidTrackingProject_ConfirmedDeaths
    • CovidTrackingProject_ConfirmedHospitalizations
    • CovidTrackingProject_NegativeTests
    • CovidTrackingProject_PendingTests
    • Google_GroceryMobility
    • Google_ParksMobility
    • Google_ResidentialMobility
    • Google_RetailMobility
    • Google_TransitStationsMobility
    • Google_WorkplacesMobility
    • JHU_ConfirmedCases
    • JHU_ConfirmedDeaths
  • The following data are sourced from COVIDcast:

    • Chng_SmoothedOutpatientCovid
    • DoctorVisits_SmoothedCli
    • FbSurvey_RawWili
    • FbSurvey_RawWcli
    • FbSurvey_RawHhCmntyCli
    • Ght_RawSearch
    • Safegraph_CompletelyHomeProp
    • Safegraph_FullTimeWorkProp
    • Safegraph_PartTimeWorkProp
    • Safegraph_MedianHomeDwellTime
  • get_features: Transform source data to input features for modeling. It currently produces three sets of features for the three models that we have at each geographic hierarchical level.

  • train_models: Trains the machine learning algorithms using the features. We implement a model pipeline to expose the data to the models and to handle the fit and predict processes. The pipeline class can be extended to implement additional models for use with the C3 AI Covid-19 Data Lake data sets.

  • save_visualization_data: Make predictions using the latest features. Generate csv files for OneQuietNight web application.

  • save_covidhub_data: Make predictions using the latest features. Generate csv files for Covid-19 Forecast Hub submissions.
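
The extensible pipeline idea described under train_models can be illustrated with a small sketch. This is not the project's actual pipeline class: ForecastPipeline and LastValuePipeline are hypothetical names, and the toy model simply repeats the last observed weekly count. It only shows the general shape of a base class that fixes the fit and predict interface so that additional models can be added as subclasses.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class ForecastPipeline(ABC):
    """Hypothetical base class that fixes the fit/predict interface."""

    def __init__(self, horizons: int = 4) -> None:
        # Number of weeks ahead to forecast.
        self.horizons = horizons

    @abstractmethod
    def fit(self, weekly_cases: Dict[str, List[float]]) -> "ForecastPipeline":
        """Learn from weekly case counts keyed by location."""

    @abstractmethod
    def predict(self) -> Dict[str, List[float]]:
        """Return `horizons` weekly forecasts per location."""


class LastValuePipeline(ForecastPipeline):
    """Toy model: repeat each location's most recent weekly count."""

    def fit(self, weekly_cases: Dict[str, List[float]]) -> "LastValuePipeline":
        self.last_ = {loc: series[-1] for loc, series in weekly_cases.items()}
        return self

    def predict(self) -> Dict[str, List[float]]:
        return {loc: [value] * self.horizons for loc, value in self.last_.items()}


# Made-up example history, keyed by a location identifier.
history = {"US": [100.0, 120.0, 150.0], "06037": [10.0, 12.0, 9.0]}
forecast = LastValuePipeline().fit(history).predict()
print(forecast)
```

A new model would subclass the base class and override fit and predict, leaving the code that feeds data in and collects forecasts unchanged.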

When base_path is specified, the first run on a given day caches the output of each of the functions listed above under the base path, in the structure shown below. Later calls with the same today value (e.g. running the program twice on 2020-11-18) load the data from local storage rather than from the remote APIs.

└── data
    ├── [data_name].feather         <- Historical time-series data from C3 AI saved in feather format.
    ├── locations.feather           <- Tabular location dimension table.
    ├── feature_store.joblib        <- Transformed time-series data.
    ├── model_store.joblib          <- Model parameters.
    ├── [today]-OneQuietNight.csv   <- Forecast output for Covid-19 Forecast Hub submission.
    ├── JHU_[target].csv            <- JHU target values for OneQuietNight web application.
    └── OQN_[forecast].csv          <- Forecast output for OneQuietNight web application.
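
The caching behavior described above follows a common load-or-fetch pattern, sketched below under stated assumptions. The function load_or_fetch is hypothetical and not part of the package. The sketch stores JSON with the standard library instead of feather or joblib files, and it assumes the cache is keyed by recording the today value inside the file, which may not be how the project decides whether a cached file is current.

```python
import json
from pathlib import Path
from typing import Any, Callable, Optional


def load_or_fetch(
    name: str,
    fetch: Callable[[], Any],
    today: str,
    base_path: Optional[Path] = None,
) -> Any:
    """Return cached data for `today` if present, otherwise fetch and cache it."""
    if base_path is None:
        # No base path: nothing is persisted, data lives in memory only.
        return fetch()

    cache_file = Path(base_path) / "data" / f"{name}.json"
    if cache_file.exists():
        cached = json.loads(cache_file.read_text())
        # Assumption: a cache entry is valid only for the day it was written.
        if cached["today"] == today:
            return cached["data"]

    data = fetch()
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps({"today": today, "data": data}))
    return data
```

With this shape, a second call on the same day with the same base_path reads from disk, while a call with a different today value or without base_path goes back to the remote source.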