This repo contains the final part of the project for the course DATA 512 - Human Centered Data Science.
The goal of this analysis is to explore the
- impact of masking mandates on the spread of COVID-19 in Oakland County, MI (Part 1)
- impact of stay-at-home policies on the spread of COVID-19 in Oakland County, MI (Part 2)
Part 1 is a simple analysis that fits an SIRS model to the data to estimate the parameters governing infection spread. Part 2 is a regression analysis that compares the infection doubling rate against changes in mobility relative to the baseline.
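To make the Part 1 approach more concrete, here is a minimal sketch of an SIRS model, assuming the standard formulation on population fractions and simple forward-Euler integration. The function names, the parameter names (`beta`, `gamma`, `xi`), and the grid-search fit are illustrative assumptions and are not taken from src/model.py or the notebooks.

```python
def simulate_sirs(beta, gamma, xi, i0, days, steps_per_day=10):
    """Return the infected fraction at the end of each day under an SIRS model.

    beta  = transmission rate, gamma = recovery rate,
    xi    = rate at which recovered people become susceptible again
    (all per day). Names and defaults are illustrative assumptions.
    """
    s, i, r = 1.0 - i0, i0, 0.0
    dt = 1.0 / steps_per_day
    infected = [i]
    for _ in range(days):
        for _ in range(steps_per_day):
            # Compute all flows from the current state before updating it.
            new_infections = beta * s * i
            recoveries = gamma * i
            waning = xi * r
            s += dt * (waning - new_infections)
            i += dt * (new_infections - recoveries)
            r += dt * (recoveries - waning)
        infected.append(i)
    return infected


def fit_beta(observed, candidates, gamma, xi, i0):
    """Grid search: pick the candidate beta with the smallest squared error."""
    def sse(beta):
        simulated = simulate_sirs(beta, gamma, xi, i0, days=len(observed) - 1)
        return sum((a - b) ** 2 for a, b in zip(simulated, observed))
    return min(candidates, key=sse)


# Synthetic check: generate a curve with a known beta, then recover it.
synthetic = simulate_sirs(beta=0.3, gamma=0.1, xi=0.01, i0=0.001, days=60)
candidates = [0.1, 0.2, 0.3, 0.4, 0.5]
best_beta = fit_beta(synthetic, candidates, gamma=0.1, xi=0.01, i0=0.001)
```

Comparing the fitted transmission rate before and after a mandate takes effect is one way such a fit could be used. A real analysis would likely fit several parameters at once with a proper optimizer rather than a one-dimensional grid.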
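The Part 2 idea can be sketched in a similar way. The example below assumes that doubling is measured as a doubling time under exponential growth between two case counts and that the regression is an ordinary least-squares line. Both the helper names and the numbers are made up for illustration and are not results or code from this project.

```python
import math


def doubling_time(cases_start, cases_end, days):
    """Days needed for cases to double, assuming exponential growth
    from cases_start to cases_end over the given number of days."""
    growth_rate = math.log(cases_end / cases_start) / days
    return math.log(2) / growth_rate


def ols_fit(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up illustrative numbers, NOT data from the mobility reports.
mobility_change = [-40.0, -30.0, -20.0, -10.0, 0.0]  # % change from baseline
doubling_days = [30.0, 25.0, 20.0, 15.0, 10.0]       # doubling time in days
slope, intercept = ols_fit(mobility_change, doubling_days)
```

In this toy data a negative slope means that larger drops in mobility go together with slower doubling. Whether the real data shows such a relationship is what the Part 2 notebook examines.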
We use the following data sources for this assignment.
The data related to COVID-19 cases can be found here
It is licensed under Creative Commons Attribution 4.0 International (CC BY 4.0)
The CDC dataset on masking mandates can be found here
Its licensing information can be found here
The Google Community Mobility Reports can be found here
To use them, one must accept Google's Terms of Service, found here
├── data_clean
│ ├── cases.pq
│ ├── deaths.pq
│ └── mask_mandates.pq
├── data_raw
│ ├── mask-mandate-by-county.csv
│ ├── mask-use-by-county.csv
│ ├── RAW_us_confirmed_cases.csv
│ └── RAW_us_deaths.csv
├── notebooks
│ ├── part1.ipynb
│ └── part2.ipynb
├── README.md
├── requirements.txt
├── src
│ ├── clean_data.py
│ ├── clean_data_part2.py
│ ├── main.py
│ └── model.py
└── visualizations
There are four inputs used by the code in this repository.
The cases data is present in data_raw/RAW_us_confirmed_cases.csv
The deaths data is present in data_raw/RAW_us_deaths.csv
The masking mandate data is expected at data_raw/mask-mandate-by-county.csv (see Note 1)
The community mobility reports are expected at data_raw/2020_US_Region_Mobility_Report.csv, data_raw/2021_US_Region_Mobility_Report.csv, and data_raw/2022_US_Region_Mobility_Report.csv (see Note 2)
Note 1: Download the masking mandate file from the link above and rename it mask-mandate-by-county.csv, as it is too large to commit to GitHub.
Note 2: Download the community mobility reports from the link above, since they are too large to commit to GitHub.
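Because some of the inputs have to be downloaded by hand, it can help to check that every raw file is in place before running the cleaning script. The following is a small sketch using only the standard library. The `missing_inputs` helper is hypothetical and not part of this repository; the paths are the ones listed above and in the directory tree.

```python
from pathlib import Path

# Raw input files expected by the code, relative to the repository root.
EXPECTED_INPUTS = [
    "data_raw/RAW_us_confirmed_cases.csv",
    "data_raw/RAW_us_deaths.csv",
    "data_raw/mask-mandate-by-county.csv",
    "data_raw/2020_US_Region_Mobility_Report.csv",
    "data_raw/2021_US_Region_Mobility_Report.csv",
    "data_raw/2022_US_Region_Mobility_Report.csv",
]


def missing_inputs(repo_root="."):
    """Return the expected input paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [p for p in EXPECTED_INPUTS if not (root / p).is_file()]


# Report anything that still needs to be downloaded.
for path in missing_inputs():
    print(f"missing: {path}")
```

Run it from the repository root; no output means all six files were found.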
The following data files are generated by the notebook.
- data_clean/cases.pq: stores the cleaned data of daily cases in the US at a county level
- data_clean/deaths.pq: stores the cleaned data of daily deaths in the US at a county level
- data_clean/mask_compliance.pq: stores the cleaned data of mask compliance in the US at a county level
- data_clean/mask_mandates.pq: stores the cleaned data of masking mandates in the US at a county level
Clone this repo using
git clone git@github.com:abhishekiitm/data-512-project_part1.git
cd data-512-project_part1
First install the necessary Python libraries in a virtual environment by executing the following steps in the Terminal (assuming you are running Linux):
$ virtualenv proj_env
$ source proj_env/bin/activate
Then install the libraries using
$ pip install -r requirements.txt
Download the raw files mentioned in the Input Files section if you don't already have them.
Run the data-cleaning script to generate the cleaned data from the raw data.
$ python src/clean_data_part2.py
Execute the notebook notebooks/visualize.ipynb using your choice of notebook environment (Jupyter Notebook or the VS Code extension).