Health in Numbers

This was Part 3 of the Experimental Design coursework. It was intended as a continuation of regression and classification techniques, alongside traditional train-test data splits, using open data sets to form predictions and learn data science concepts.

Abstract

In this work, we performed binomial logistic regression to examine the relative likelihood that schools in the city of Toronto reach herd immunity levels for the Diphtheria, Tetanus, and Pertussis (“DTP”) and Measles, Mumps, and Rubella (“MMR”) vaccines. Herd immunity is an important measure of disease resistance because it protects individuals who are not, or cannot be, immune to a given disease. Our model found that schools are under-protected, and that this shortfall is inadequately explained by the religious exemption data collected. We hypothesize that the rise of the vaccine hesitancy movement (more colloquially referred to as “anti-vax”) has likely contributed to this shortfall. However, conclusions from this analysis are limited by insufficient data depth and are difficult to draw with confidence. Nonetheless, applying this model to like-structured datasets over time may help confirm the effect of the anti-vax movement on herd immunity in Toronto schools.
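
To make the modelling step concrete, here is a minimal sketch of a binomial logistic regression on a binary "reached herd immunity" outcome. It is not the project's R code: it is written in plain Python, the 95% threshold is an assumed cutoff, the single predictor (religious exemption rate) is chosen only for illustration, and every data point is synthetic.

```python
import math

# ASSUMPTION: hypothetical cutoff; the abstract does not state the threshold used.
HERD_IMMUNITY_THRESHOLD = 95.0

# SYNTHETIC (religious exemption rate %, vaccine coverage %) pairs,
# invented purely for illustration. They are not Toronto data.
SCHOOLS = [
    (0.2, 97.1), (0.5, 96.0), (0.8, 95.4), (1.0, 93.8), (1.5, 95.2),
    (2.0, 92.5), (2.5, 95.1), (3.0, 91.0), (3.5, 90.2), (4.0, 88.7),
]


def sigmoid(z):
    """Logistic function mapping a linear score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, steps=5000):
    """Fit intercept b0 and slope b1 by gradient descent on the mean log-loss."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(b0 + b1 * x) - y  # predicted probability minus label
            g0 += err
            g1 += err * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1


# Binary outcome: 1 if the school meets the assumed threshold, else 0.
xs = [rate for rate, _ in SCHOOLS]
ys = [1 if coverage >= HERD_IMMUNITY_THRESHOLD else 0 for _, coverage in SCHOOLS]

b0, b1 = fit_logistic(xs, ys)

# exp(b1) is the multiplicative change in the odds of reaching the
# threshold for each one-point increase in the exemption rate.
odds_ratio = math.exp(b1)

print(f"intercept={b0:.3f}, slope={b1:.3f}, odds ratio={odds_ratio:.3f}")
print(f"P(reach threshold | exemption rate 1.0%) = {sigmoid(b0 + b1 * 1.0):.3f}")
```

An odds ratio below 1 in this toy setting means that higher exemption rates are associated with lower odds of reaching the threshold. In practice, a fitted model would be evaluated on a held-out test split rather than on the data used for fitting.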

See the corresponding GitHub Pages site for the data visualization and analysis.

See the R Markdown file for the full code.