pymc-labs/CausalPy

Add augmented synthetic control model

Opened this issue · 4 comments

As of now, "vanilla" synthetic control works by using cp.pymc_experiments.SyntheticControl as the experiment class and feeding it cp.pymc_models.WeightedSumFitter as the model.

It is cp.pymc_models.WeightedSumFitter that implements the vanilla synthetic control model: the control-unit weights are constrained to sum to 1, which is done via a Dirichlet distribution.

We want to add the ability to do augmented synthetic control. This will still use the cp.SyntheticControl experiment class (previously cp.pymc_experiments.SyntheticControl), but instead we will feed it a new model, something like cp.pymc_models.AugmentedSyntheticControlModel. (However, see below, because we may not need a new model.)

Implementation notes

As far as I understand, the algorithm for augmented synthetic control is along the lines of:

  • Based on the pre-treatment data, fit a vanilla synthetic control model where the weights are constrained to sum to 1.
  • Calculate the residuals between the model's pre-treatment predictions and the observations.
  • Fit these residuals with a second model.
  • Use the predictions of that second model to adjust the synthetic control predictions.
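
To make the four steps above concrete, here is a minimal non-Bayesian sketch in NumPy. It is not CausalPy code: the function names are hypothetical, the simplex-constrained fit uses projected gradient descent in place of the Dirichlet-based model, and the residual model is assumed to be ordinary least squares on an intercept and a linear trend.

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) == 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)


def fit_simplex_weights(X, y, n_iter=5000):
    """Step 1: least squares with weights constrained to sum to 1."""
    n_controls = X.shape[1]
    w = np.full(n_controls, 1.0 / n_controls)
    # Step size 1/L, where L is the largest eigenvalue of X^T X
    step = 1.0 / np.linalg.norm(X, 2) ** 2
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y)
        w = project_to_simplex(w - step * grad)
    return w


def trend_design(t):
    """Design matrix for the assumed residual model: residuals ~ 1 + trend."""
    return np.column_stack([np.ones(len(t)), t.astype(float)])


def augmented_synthetic_control(X_pre, y_pre, X_post):
    """Two-step sketch: constrained weights, then a model of the residuals."""
    n_pre, n_post = len(y_pre), X_post.shape[0]
    t_pre = np.arange(n_pre)
    t_post = np.arange(n_pre, n_pre + n_post)

    # Step 1: vanilla synthetic control on the pre-treatment data
    w = fit_simplex_weights(X_pre, y_pre)
    # Step 2: pre-treatment residuals
    resid = y_pre - X_pre @ w
    # Step 3: fit the residuals with an intercept and a trend
    coef, *_ = np.linalg.lstsq(trend_design(t_pre), resid, rcond=None)
    # Step 4: adjust the synthetic control predictions
    counterfactual = X_post @ w + trend_design(t_post) @ coef
    return w, coef, counterfactual
```

Here X_pre and X_post would hold one column per control unit, and y_pre the treated unit's pre-treatment outcomes. Because the weights are fitted before the residual model, the two sets of coefficients are not estimated jointly.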

That need not be done in separate steps. What you could do is to have a model where the weightings of the control groups are constrained to sum to 1, but then simply add in more components to the model, such as an intercept and trend. For example, the model formula in one of the examples is currently:

Denmark ~ 0 + Austria + Belgium + Bulgaria + Croatia + Cyprus + Czech_Republic

but you could implement augmented synthetic control with something like

Denmark ~ 1 + trend + Austria + Belgium + Bulgaria + Croatia + Cyprus + Czech_Republic

You would have to ensure, though, that the weights of the control units are still constrained to sum to 1, while the 1 (intercept) and trend predictors are weighted by 'unconstrained' coefficients.

So practically we might want to keep the original model formula and add a second formula for the residuals, such as residuals ~ 1 + trend, or something similar. It could be simpler, though, to write a custom model that takes something like:

  • control_units = ["Austria", "Belgium", "Bulgaria", "Croatia", "Cyprus", "Czech_Republic"]
  • residuals ~ 1 + trend

Hi Ben! I actually sent an email. I'm very interested in this issue, so I was wondering whether you could assign it to me. Thanks!

Hi @Jayhyung. It would be fantastic to have you contribute. All the technical info should be in the CONTRIBUTING.md guide, but do let me know if anything is unclear. I'll assign this to you.

Just to add @Jayhyung, we're soon going to merge a PR which completes a relatively large refactor of the code. It should be a relatively simple job to adapt any work you do to the new code structure, so no real need to hold off I think. I'll try to push that refactor along so it's done sooner rather than later.

Hi @Jayhyung. Just to mention that last week we merged the big code refactor (#381) and released 0.4.0. So if you still wanted / had time to work on this then it would be much smoother now :) Feel very free to ask questions or drop ideas in here if you want.