Zoltpy
A Python module that interfaces with Zoltar https://github.com/reichlab/forecast-repository
Installation requirements
- python 3.6
- pipenv for managing packages - see Pipfile
- click - for output, and for the demo application's handling of args
- pandas - for returning forecasts as dataframes
- requests
- numpy
Installation
Zoltpy is hosted on the Python Package Index (pypi.org), a repository for Python modules https://pypi.org/project/zoltpy/.
Install Zoltpy directly from its GitHub repository with the following command:
pip install git+https://github.com/reichlab/zoltpy/
One-time Environment Variable Configuration
Users must add their Zoltar username and password to environment variables on their machine before using this module.
For Mac/Unix
cd ~
nano .bash_profile
Add the following to your .bash_profile:
export Z_USERNAME=<your zoltar username>
export Z_PASSWORD=<your zoltar password>
After you are finished, press Ctrl + O, Enter, and Ctrl + X to save and quit.
Then enter the command:
source ~/.bash_profile
To ensure your environment variables are configured properly, run this command and check the output for Z_USERNAME and Z_PASSWORD:
printenv
For PC
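In the command prompt, run the following commands. Do not wrap the values in quotes, because set stores the quotes as part of the value. Note that set only applies to the current command prompt session:
set Z_USERNAME=<your zoltar username>
set Z_PASSWORD=<your zoltar password>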
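Before authenticating, it can help to confirm from Python that both variables are visible to the interpreter. The following is a minimal sketch using only the standard library; the helper name `missing_credentials` is hypothetical and not part of Zoltpy.

```python
import os

# Names of the environment variables described above.
REQUIRED_VARS = ("Z_USERNAME", "Z_PASSWORD")


def missing_credentials(env=None):
    """Return the required variable names that are unset or empty.

    `missing_credentials` is a hypothetical helper, not a Zoltpy function.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_credentials()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("Z_USERNAME and Z_PASSWORD are set.")
```

If a variable is reported missing in a new terminal, the shell profile has probably not been sourced in that session.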
Usage
Zoltpy is a Python module that communicates with Zoltar, the Reich Lab's forecast repository. To import the Zoltpy utility functions, run the following command after installing the package:
from zoltpy import util
Authentication
To access your project, you'll first need to authenticate. Authentication goes through the authenticate(username, password) method of the ZoltarConnection() object. Calling util.authenticate() returns an authenticated connection using the username and password saved in your environment variables:
from zoltpy import util, connection
conn = util.authenticate()
Now you can use your authentication token to access private projects:
projects = list(conn.projects)
print(projects)
- Be careful to store and use your username and password so that they're not accessible to others. The preferred method is to create environment variables.
- The Zoltar service uses a "token"-based scheme for authentication. These tokens have a five-minute expiration for security, which requires re-authentication after that period of time. The Zoltpy library takes care of re-authenticating as needed by passing your username and password back to the server to get another token. Note that the connection object returned by the re_authenticate_if_necessary() function stores a token internally, so be careful if saving that object into a file.
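The re-authentication behaviour described above can be illustrated with a generic sketch. This is not Zoltpy's implementation: `TokenSession`, `fetch_token`, and the injectable `clock` are hypothetical names, and the five-minute lifetime is taken from the text above.

```python
import time

TOKEN_LIFETIME_SECONDS = 5 * 60  # five-minute expiration, as described above


class TokenSession:
    """Hypothetical illustration of token refresh; not Zoltpy's actual code."""

    def __init__(self, fetch_token, clock=time.monotonic):
        self._fetch_token = fetch_token  # callable that returns a fresh token
        self._clock = clock              # injectable so the logic can be tested
        self._token = None
        self._issued_at = None

    def token(self):
        """Return a valid token, fetching a new one if the old one expired."""
        now = self._clock()
        expired = (
            self._token is None
            or now - self._issued_at >= TOKEN_LIFETIME_SECONDS
        )
        if expired:
            self._token = self._fetch_token()
            self._issued_at = now
        return self._token


# Usage with a stand-in token source and a fake clock.
fake_now = [0.0]
issued = []


def fake_fetch():
    issued.append(f"token-{len(issued) + 1}")
    return issued[-1]


session = TokenSession(fake_fetch, clock=lambda: fake_now[0])
first = session.token()   # no token yet, so one is fetched
fake_now[0] = 200.0
second = session.token()  # still within five minutes, token is reused
fake_now[0] = 301.0
third = session.token()   # older than five minutes, a new token is fetched
```

Because the session object holds the current token in memory, serializing it to disk would also write the token, which is the reason for the caution above.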
Zoltpy currently has 6 Key Functions
- print_projects() - Print project names
- print_models(conn, project_name) - Print model names for a specified project
- delete_forecast(conn, project_name, model_abbr, timezero_date) - Delete a forecast from Zoltar
- upload_forecast(conn, project_name, model_abbr, timezero_date, forecast_csv_file) - Upload a forecast to Zoltar
- download_forecast(conn, project_name, model_abbr, timezero_date) - Download a forecast from Zoltar
- query_project(conn, project_name, query_type, query) - Query a Zoltar project for forecasts or truth data
Print Project Names
This function prints the names of the projects that you are authorized to view in Zoltar.
util.print_projects()
Print Model Names
Given a project, this function prints the models in that project.
util.print_models(conn, project_name = 'My Project')
Delete a Forecast
Deletes a single forecast for a specified model and timezero.
util.delete_forecast(conn, project_name='My Project', model_abbr='My Model', timezero_date='YYYY-MM-DD')
Example:
conn = util.authenticate()
util.delete_forecast(conn, 'Impetus Province Forecasts', 'gam_lag1_tops3', '20181203')
Upload a Single Forecast
project_name = 'Docs Example Project'
model_abbr = 'docs forecast model'
timezero_date = '2011-10-09'
predx_json_file = 'examples/docs-predictions.json'
forecast_filename = 'docs-predictions'
conn = util.authenticate()
util.upload_forecast(conn, predx_json_file, forecast_filename, project_name, model_abbr, timezero_date, overwrite=True)
Uploading Multiple Forecasts
This method makes uploading multiple forecasts for a single model and project more efficient. The first step is to iterate through every forecast in your model and create the following three batch variables: predx_batch, forecast_filename_batch, and timezero_batch. Below is an example of building these batch variables:
# import libraries
import datetime
import glob
import pymmwr as pm
from zoltpy import util
# initialize parameters
project_name = 'private project'
model_abbr = 'Test ForecastModel1'
# authenticate once, before the loop
conn = util.authenticate()
# set up batch variables
predx_batch = []
forecast_filename_batch = []
timezero_batch = []
# iterate over the CSV files in the forecast directory
for csv_file in sorted(glob.glob('/Users/my/forecast/directory/*.csv')):
    # get timezero; ew must be the pymmwr Epiweek for this file (deriving it is not shown here)
    timezero = pm.epiweek_to_date(ew)
    timezero = timezero + datetime.timedelta(days=1)  # timezeros on Mondays
    timezero = timezero.strftime('%Y%m%d')
    # generate predx_json and forecast_filename
    predx_json, forecast_filename = util.convert_cdc_csv_to_json_io_dict(2016, csv_file)
    # save batch variables
    predx_batch.append(predx_json)
    forecast_filename_batch.append(forecast_filename)
    timezero_batch.append(timezero)
# upload all forecasts in one call, after the loop
util.upload_forecast_batch(conn, predx_batch, forecast_filename_batch, project_name, model_abbr, timezero_batch,
                           overwrite=False)
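The timezero step above shifts the epiweek start date forward by one day so that it lands on a Monday. A minimal standard-library sketch of that shift follows. It assumes the epiweek start date is a Sunday, which is what the "timezeros on Mondays" comment implies, and `monday_timezero` is a hypothetical helper name, not a Zoltpy or pymmwr function.

```python
import datetime


def monday_timezero(epiweek_start, fmt="%Y%m%d"):
    """Shift a Sunday epiweek start to the following Monday and format it.

    Hypothetical helper; assumes `epiweek_start` is a Sunday.
    """
    if epiweek_start.weekday() != 6:  # Monday is 0, Sunday is 6
        raise ValueError("expected a Sunday epiweek start date")
    timezero = epiweek_start + datetime.timedelta(days=1)
    return timezero.strftime(fmt)


# 2018-12-02 is a Sunday, so the timezero is Monday 2018-12-03.
print(monday_timezero(datetime.date(2018, 12, 2)))  # prints 20181203
```

Raising on a non-Sunday input is a design choice for this sketch: it surfaces a wrong epiweek conversion early instead of silently uploading a forecast under the wrong timezero.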
Download a Forecast
conn = util.authenticate()
project_name = 'Docs Example Project'
model_abbr = 'docs forecast model'
timezero_date = '2011-10-09'
json_io_dict = util.download_forecast(conn, project_name, model_abbr, timezero_date)
print(f"downloaded {len(json_io_dict['predictions'])} predictions")
Return Forecast as a Pandas Dataframe
df = util.dataframe_from_json_io_dict(json_io_dict)
print(f"dataframe:\n{df}")
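Besides converting to a dataframe, the downloaded dictionary can be inspected directly. The sketch below counts predictions by one of their fields. The sample dictionary is made up for illustration: only the top-level 'predictions' list is confirmed by the example above, and the per-prediction keys ('unit', 'target', 'class') are assumptions that should be checked against a real download. `count_by_key` is a hypothetical helper.

```python
from collections import Counter

# Made-up stand-in for a downloaded json_io_dict. Only the top-level
# 'predictions' list is confirmed by the download example; the keys on
# each prediction ('unit', 'target', 'class') are assumptions.
json_io_dict = {
    "predictions": [
        {"unit": "loc1", "target": "target A", "class": "point"},
        {"unit": "loc1", "target": "target A", "class": "quantile"},
        {"unit": "loc2", "target": "target A", "class": "point"},
    ]
}


def count_by_key(io_dict, key):
    """Count predictions by the value stored under `key` (None if absent)."""
    return Counter(p.get(key) for p in io_dict["predictions"])


class_counts = count_by_key(json_io_dict, "class")
print(dict(class_counts))
```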
Query a Project
We can query a project to retrieve forecasts or truth data:
conn = util.authenticate()
project_name = 'COVID-19 Forecasts'
query = {
    'models': ['epiforecasts-ensemble1', 'LNQ-ens1', 'UMass-MechBayes'],
    'units': ['39'],
    'targets': [str(h + 1) + ' wk ahead inc death' for h in range(4)],
    'timezeros': ['2021-02-14', '2021-02-15'],
    'types': ['quantile'],
}
forecasts_df = util.query_project(conn, project_name, connection.QueryType.FORECASTS, query)
print(f"dataframe:\n{forecasts_df}")
To retrieve truth data instead, pass connection.QueryType.TRUTH and leave out the 'models' and 'types' keys:
conn = util.authenticate()
project_name = 'COVID-19 Forecasts'
query = {
    'units': ['39'],
    'targets': [str(h + 1) + ' wk ahead inc death' for h in range(4)],
    'timezeros': ['2021-02-14', '2021-02-15'],
}
truth_df = util.query_project(conn, project_name, connection.QueryType.TRUTH, query)
print(f"dataframe:\n{truth_df}")
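The 'targets' entry in both queries is built with a list comprehension over forecast horizons. The sketch below shows what it expands to and wraps query construction in a small helper that leaves out keys that were not supplied. `horizon_targets` and `build_query` are hypothetical names, not part of Zoltpy, and only the query keys shown in the examples above are used.

```python
def horizon_targets(n_weeks, suffix="wk ahead inc death"):
    """Return target names for horizons 1..n_weeks."""
    return [f"{h} {suffix}" for h in range(1, n_weeks + 1)]


def build_query(units, targets, timezeros, models=None, types=None):
    """Assemble a query dict, leaving out keys that were not supplied.

    Hypothetical helper, not part of Zoltpy.
    """
    query = {"units": units, "targets": targets, "timezeros": timezeros}
    if models is not None:
        query["models"] = models
    if types is not None:
        query["types"] = types
    return query


targets = horizon_targets(4)
print(targets)

timezeros = ["2021-02-14", "2021-02-15"]
# Truth query: no 'models' or 'types' keys.
truth_query = build_query(["39"], targets, timezeros)
# Forecast query: restricted to named models and quantile predictions.
forecast_query = build_query(
    ["39"],
    targets,
    timezeros,
    models=["epiforecasts-ensemble1", "LNQ-ens1", "UMass-MechBayes"],
    types=["quantile"],
)
```

The resulting dictionaries have the same shape as the hand-written ones above and could be passed as the query argument of util.query_project.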