This application connects to Google Drive to download the Colorado COVID-19 case data. It then processes the county-specific data into the attribute table of a Colorado county shapefile.
The only requirement is python3 with pip installed.
git clone https://github.com/chizou/co-covid-shp-generator
cd co-covid-shp-generator
pip install --upgrade -r requirements.txt
mkdir downloads
cp sample.csv downloads/
python ./final_project.py
To download data files from Google Drive, you'll need to configure your Google credentials.
- Configure your Google credentials based on these instructions
- Ensure the credentials are configured to allow the Google Drive API. Implementing an IP-based restriction is also recommended
- Set those credentials in the `config.yml` file with the following format:
google-drive-creds:
your-api-key
After Google API credentials are provided, the application can be run with the `-d` parameter to download the entire set of data and process all of it into shapefiles. On a default run, one new shapefile is created in the `shapefiles` directory for every CSV file in the `downloads` directory.
If you only need to process a subset of the data, copy only that subset to the `downloads` directory to be processed into shapefiles.
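
The one-shapefile-per-CSV behavior can be pictured with a minimal sketch like the one below. It is not the code in `final_project.py`: the function name `plan_shapefile_outputs` is hypothetical, and the output naming (same base name with a `.shp` extension) is an assumption made only for illustration.

```python
from pathlib import Path


def plan_shapefile_outputs(downloads_dir, shapefiles_dir):
    """Map each CSV in downloads_dir to a shapefile path in shapefiles_dir.

    Assumption: the output keeps the CSV's base name and swaps the
    extension to .shp; the application's real naming scheme may differ.
    """
    downloads = Path(downloads_dir)
    shapefiles = Path(shapefiles_dir)
    plan = {}
    # sorted() keeps the processing order the same on every run.
    for csv_path in sorted(downloads.glob("*.csv")):
        plan[csv_path] = shapefiles / (csv_path.stem + ".shp")
    return plan
```

Calling `plan_shapefile_outputs("downloads", "shapefiles")` returns one entry per CSV file, so copying only a subset of files into `downloads` yields only that subset of outputs.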
The application is meant to be repeatable: subsequent runs always return the same result as long as the data source doesn't change. This matters because the data changes quickly as our understanding of COVID-19 evolves. Additionally, the State of Colorado continues to add data points over time.
This application uses the base shapefile provided in the `shapefiles/base` directory to create derivative shapefiles. Updates to the base shapefile will appear in all derivative shapefiles as well, within the limits of what [pyshp](https://pypi.org/project/pyshp/) supports.
Because the ESRI Shapefile format limits attribute field names to 10 characters, the attribute table field names have been shortened. All statistics below are per county. Here is the mapping:
Field name | Mapped Statistic |
---|---|
CASECOUNT | Total number of cases per county |
CASEPER100 | Number of cases per 100,000 people |
DEATHS | Total number of deaths |
PCR | Total number of PCR tests conducted |
SEROLOGY | Total number of serology tests conducted |
TESTRATE | Rate of tests per 100,000 people |
TOTALTESTS | Total number of tests conducted |
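
As a rough illustration of the 10-character constraint, the mapping above can be written as a Python dictionary and checked before any fields are written. The names `FIELD_DESCRIPTIONS` and `check_field_names` are hypothetical and are not taken from the repository; only the field names and descriptions come from the table.

```python
# Field names and descriptions copied from the table above.
FIELD_DESCRIPTIONS = {
    "CASECOUNT": "Total number of cases per county",
    "CASEPER100": "Number of cases per 100,000 people",
    "DEATHS": "Total number of deaths",
    "PCR": "Total number of PCR tests conducted",
    "SEROLOGY": "Total number of serology tests conducted",
    "TESTRATE": "Rate of tests per 100,000 people",
    "TOTALTESTS": "Total number of tests conducted",
}

# Shapefile attribute tables are stored as dBASE (.dbf) files, whose
# field names are limited to 10 characters.
MAX_FIELD_NAME_LENGTH = 10


def check_field_names(names):
    """Return the names as a list, or raise ValueError if any is too long."""
    too_long = [n for n in names if len(n) > MAX_FIELD_NAME_LENGTH]
    if too_long:
        raise ValueError(
            f"field names longer than {MAX_FIELD_NAME_LENGTH} characters: {too_long}"
        )
    return list(names)


# Every field name in the table fits within the limit.
check_field_names(FIELD_DESCRIPTIONS)
```

A check like this fails early if a new statistic is added with a name that would not fit in the attribute table.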
This repo uses pre-commit and requires Python version 3.7. Because Google API secrets could end up in the repository, we use Yelp's detect-secrets, which will need to be installed.