/m3-4-rdd-project

Regression Discontinuity Design Project!

Primary LanguageJupyter Notebook

Analytics project

Regression discontinuity design (RDD) is a method used to identify the effect of a change implemented at a cutoff point. This usually leads to pretty diagrams like this one:

Which is using a linear regression.

The regression equation for it is

y = const + b1 * x + b2 * c

Where const is the Y intercept, x is the independent (feature) variable, and c is a dummy variable which indicates if x > cutoff. Then, the parameter on b2 can be t-tested to see if the cutoff had an effect on y.

It can also be used with a polynomial regression on both sides:

Here both sides have a different polynomial degree 2 regression, so the equation uses interaction variables:

y = const + b1 * x + b2 * x^2 + b3 * c + b4 * x * c + b5 * x^2 * c

Note that you can always overfit the regressions on each to make sure that the gap seems meaningful when it's not. Gelman & Imbens (2018) argue that you should never use more than degree 2 polynomials in RDD. Even degree 2 is sometimes an overfit.

This method has even been used to estimate the effect of lockdown measures, lie in this CDC study. For reference on RDD design, Lee & Lemieux's book chapter on RDD is a good technical reference.

Your task

Your task is to use RDD to estimate the effect of the following events in Quebec:

  1. The 20/3/2020 lockdown

  2. The reopening of schools on 31/8/2020

  3. The 25/12/2020 lockdown

Requirements

You need to find data on at least one COVID measure for y (either COVID cases, hospitalizations or deaths) and provide the following for each:

  • A RDD plot similar to the ones shown above

  • An interpretation of the p-value on the effect of the measure taken (the cutoff parameter)

  • A justification on the design of your regression:

  1. The amount of time included on both sides of the cutoff (longer is not necessarily better)

  2. The polynomial degree (higher is not always better)

  3. Other regression design considerations

  • A 2 paragraph explanation of your findings for that event.