/differences

difference-in-differences in Python

difference-in-differences estimation and inference for Python

It supports the following use cases:

  • Balanced panels, unbalanced panels & repeated cross-section
  • Two + Multiple time periods
  • Fixed + Staggered treatment timing
  • Binary + Multi-Valued treatment
  • Heterogeneous treatment effects & triple difference
  • One + Multiple treatments per entity

See the documentation for more details.

Installing

The latest release can be installed using pip:

pip install differences

Requires Python >= 3.8.

Quick Start

ATTgt

The ATTgt class implements the estimation procedures suggested by Callaway and Sant'Anna (2021) and Sant'Anna and Zhao (2020), as well as the multi-valued treatment case discussed in Callaway, Goodman-Bacon and Sant'Anna (2021).

from differences import ATTgt, simulate_data

df = simulate_data()

att_gt = ATTgt(data=df, cohort_name='cohort')

att_gt.fit(formula='y')

att_gt.aggregate('event')

The ATTgt implementation in differences benefited substantially from the original authors' R packages: Callaway and Sant'Anna's did, and Sant'Anna and Zhao's DRDID.

NOTE on performance: the ATTgt class currently accepts string entity identifiers, as in the example above, where the first index level of df = simulate_data() holds the entity identifiers as strings. The ATT computation (the call to .fit()) runs considerably faster if you cast the entity identifiers to integers before initializing ATTgt. One easy way to do that is with pandas category codes.
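As a minimal sketch of the cast described above, the snippet below replaces string entity identifiers with integer category codes using only pandas. The small panel and the index level names "entity" and "time" are assumptions made for illustration; the level names in the frame returned by simulate_data() may differ, so adjust them to match your data.

```python
import pandas as pd

# Hypothetical panel: the first index level holds string entity identifiers.
# The level names "entity" and "time" are illustrative assumptions.
panel = pd.DataFrame(
    {
        "entity": ["a", "a", "b", "b", "c", "c"],
        "time": [2000, 2001, 2000, 2001, 2000, 2001],
        "y": [1.0, 1.5, 0.3, 0.9, 2.1, 2.4],
    }
).set_index(["entity", "time"])

# Move the index levels into columns so the entity column can be recoded.
flat = panel.reset_index()

# Convert the string identifiers to a categorical and keep a lookup table
# so the integer codes can be mapped back to the original labels later.
entity_cat = flat["entity"].astype("category")
code_to_entity = dict(enumerate(entity_cat.cat.categories))

# Replace the strings with their integer category codes and restore the index.
flat["entity"] = entity_cat.cat.codes
panel_int = flat.set_index(["entity", "time"])
```

The recoded frame has the same shape and columns as the original, only with an integer first index level, so it would be passed to ATTgt in place of the string-indexed frame. Keeping code_to_entity around makes it possible to translate results back to the original identifiers.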

TWFE

from differences import TWFE, simulate_data

df = simulate_data()

twfe = TWFE(data=df, cohort_name='cohort')

twfe.fit(formula='y')