/Causal-Inference-1

Causal Inference 1 Mixtape Session taught by Scott Cunningham

About

Causal Inference Part I kickstarts a new 4-day series on design-based causal inference. It covers the foundations of causal inference grounded in a counterfactual theory of causality built on the Neyman-Rubin model of potential outcomes. It will also cover randomization inference, independence, matching, regression discontinuity and instrumental variables. We will review the theory behind each of these designs in detail, with the aim being comprehension, competency and confidence. Each day is 8 hours with 15-minute breaks on the hour plus an hour for lunch. To help accomplish this, we will hold ongoing discussions via Discourse, work through assignments and exercises together, and have detailed walk-throughs of code in R and Stata. This is the prequel to the Part II course that covers difference-in-differences and synthetic control.

Schedule

Potential Outcomes

About

The modern theory of causality is based on a seemingly simple idea called the "counterfactual". The counterfactual is an unusual feature of the arsenal of modern statistics because it is more or less storytelling about alternative worlds that may or may not exist, but could have existed had one single decision gone a different way. Out of this idea grew a model, complete with its own language, on top of which the field of causal inference is based, and the purpose of this lecture is to learn that language. The language is called potential outcomes and it forms the basis for many causal objects we tend to be interested in, such as the average treatment effect. I also cover randomization, selection bias and randomization inference.
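
As a minimal sketch of this language, the Python snippet below builds a tiny table of potential outcomes, applies the switching equation, and compares the simple difference in means with the average treatment effect when units sort into treatment. It then runs an exact randomization inference test under the sharp null of no effect for any unit. All numbers, variable names and the "lottery" assignment are hypothetical and chosen for illustration. They are not taken from the course's own R or Stata code.

```python
from itertools import combinations
from statistics import mean

# Hypothetical potential outcomes for 8 units (illustrative numbers only).
y1 = [7, 5, 5, 7, 4, 10, 1, 5]  # outcome if treated
y0 = [1, 6, 1, 8, 2, 1, 10, 6]  # outcome if not treated

# Unit-level effects and the ATE are knowable here only because we
# pretend to observe both potential outcomes for every unit.
delta = [a - b for a, b in zip(y1, y0)]
ate = mean(delta)


def observed_outcomes(treat):
    """Switching equation: Y = D * Y1 + (1 - D) * Y0."""
    return [t * a + (1 - t) * b for t, a, b in zip(treat, y1, y0)]


def diff_in_means(outcome, treat):
    """Simple difference in mean outcomes, treated minus untreated."""
    treated = [o for o, t in zip(outcome, treat) if t == 1]
    control = [o for o, t in zip(outcome, treat) if t == 0]
    return mean(treated) - mean(control)


# Non-random assignment: each unit is treated only if treatment helps it.
d_sorted = [1 if e > 0 else 0 for e in delta]
sdo = diff_in_means(observed_outcomes(d_sorted), d_sorted)

# Decompose the simple difference into
# ATE + selection bias + heterogeneous treatment effect bias.
att = mean([e for e, t in zip(delta, d_sorted) if t == 1])
atu = mean([e for e, t in zip(delta, d_sorted) if t == 0])
selection_bias = diff_in_means(y0, d_sorted)
het_bias = (1 - mean(d_sorted)) * (att - atu)


def randomization_p_value(outcome, treat):
    """Exact two-sided p-value under the sharp null of no unit-level effect."""
    n, n_treated = len(outcome), sum(treat)
    observed = abs(diff_in_means(outcome, treat))
    extreme = total = 0
    # Enumerate every assignment with the same number of treated units.
    for idx in combinations(range(n), n_treated):
        chosen = set(idx)
        placebo = [1 if i in chosen else 0 for i in range(n)]
        extreme += abs(diff_in_means(outcome, placebo)) >= observed - 1e-12
        total += 1
    return extreme / total


# Hypothetical randomized assignment, as if drawn by lottery.
d_random = [1, 1, 0, 0, 1, 0, 0, 1]
p_value = randomization_p_value(observed_outcomes(d_random), d_random)

print(f"ATE={ate}, simple difference under sorting={sdo}")
print(f"selection bias={selection_bias}, heterogeneity bias={het_bias}")
print(f"randomization inference p-value={p_value:.3f}")
```

In this toy table the simple difference in means has the opposite sign from the average treatment effect, because units that choose treatment differ from the rest in their untreated outcomes and in how much they gain from treatment.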

Slides

Foundations of causality DAGs

Code

Doctor PO

Replication of Thornton (2008)

Shiny App for Randomization Inference

Readings

Mixtape chapter 3: Directed Acyclic Graphs

Mixtape chapter 4: Potential Outcomes Causal Model

Known and Quantified Confounder Methods

About

In observational studies, researchers typically are not able to assume that a treatment is randomly assigned as in an experiment. However, random assignment becomes more plausible in some cases after conditioning on a set of covariates. For example, it is not likely that attending college is random, since individuals will sort into college based on a bunch of personal characteristics and their social setting. However, when comparing two individuals who share many of the same characteristics and come from similar backgrounds, it becomes more plausible that any difference in whether they attend college is as good as random. This is often called selection on observables, and this section covers how to try to "match" two individuals based on their characteristics when you believe this assumption.
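
One simple way to act on selection on observables is subclassification, also called stratification weighting: compare treated and untreated units within each covariate stratum, then average those within-stratum gaps using the stratum shares as weights. The Python sketch below shows the arithmetic on made-up data. The strata, outcomes and function names are hypothetical, and it assumes a single discrete covariate is enough to make assignment as good as random.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (stratum, treated?, outcome). Illustrative only.
rows = [
    ("young", 1, 10), ("young", 1, 12), ("young", 0, 8),
    ("old", 1, 4), ("old", 0, 2), ("old", 0, 4), ("old", 0, 3), ("old", 0, 3),
]


def naive_difference(data):
    """Difference in mean outcomes that ignores the covariate."""
    treated = [y for _, d, y in data if d == 1]
    control = [y for _, d, y in data if d == 0]
    return mean(treated) - mean(control)


def subclassification_ate(data):
    """Average the within-stratum gaps, weighting by each stratum's share."""
    by_stratum = defaultdict(lambda: {0: [], 1: []})
    for stratum, d, y in data:
        by_stratum[stratum][d].append(y)

    estimate = 0.0
    for stratum, groups in by_stratum.items():
        # Common support: every stratum needs treated and untreated units.
        if not groups[0] or not groups[1]:
            raise ValueError(f"no common support in stratum {stratum!r}")
        n_stratum = len(groups[0]) + len(groups[1])
        gap = mean(groups[1]) - mean(groups[0])
        estimate += (n_stratum / len(data)) * gap
    return estimate


naive = naive_difference(rows)
adjusted = subclassification_ate(rows)
print(f"naive difference: {naive:.3f}")
print(f"stratification-weighted estimate: {adjusted:.3f}")
```

Here the treated group is mostly "young" and the young have higher outcomes regardless of treatment, so the naive comparison overstates the effect relative to the weighted within-stratum comparison.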

Slides

Known Observed Confounders

Code

Titanic exercise using stratification weighting (see lab section under Titanic)

Replication of Lalonde (1986) and Dehejia and Wahba (2002)

Readings

Mixtape chapter 5: Matching and Subclassification

Instrumental Variables

About

When we are not willing to assume selection on observables, we often turn to an instrumental variables (IV) strategy to estimate a causal effect. In short, an instrument is a sort of "external shock" to the equilibrium we're thinking about. This section shows how to leverage these "external shocks" to identify causal effects.
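
To make the "external shock" idea concrete, here is a minimal Python sketch of the Wald estimator for a binary instrument: the instrument's effect on the outcome (reduced form) divided by its effect on the treatment (first stage). The data are hypothetical and deterministic. The outcome is built as `y = 2*d + 3*u` with an unobserved confounder `u`, so the true effect of `d` is 2 by construction, and the instrument `z` is balanced on `u`. None of this is drawn from the Graddy or Card replications.

```python
from statistics import mean

# Hypothetical data. u is an unobserved confounder, z a binary instrument
# that is balanced on u, d the treatment, and y = 2*d + 3*u.
u = [0, 1, 0, 1, 0, 1, 0, 1]
z = [0, 0, 0, 0, 1, 1, 1, 1]
d = [0, 1, 0, 1, 1, 1, 0, 1]
y = [2 * di + 3 * ui for di, ui in zip(d, u)]


def mean_where(values, flags, level):
    """Mean of values over units whose flag equals level."""
    return mean([v for v, f in zip(values, flags) if f == level])


# Naive comparison by treatment status is contaminated by u.
naive = mean_where(y, d, 1) - mean_where(y, d, 0)

# Reduced form: effect of the instrument on the outcome.
reduced_form = mean_where(y, z, 1) - mean_where(y, z, 0)
# First stage: effect of the instrument on the treatment.
first_stage = mean_where(d, z, 1) - mean_where(d, z, 0)

# Wald estimator: reduced form scaled by the first stage.
wald = reduced_form / first_stage

print(f"naive difference in means: {naive:.2f}")
print(f"first stage: {first_stage:.2f}, reduced form: {reduced_form:.2f}")
print(f"Wald IV estimate: {wald:.2f}")
```

The naive comparison picks up the confounder, while the Wald ratio only uses variation in treatment that the instrument induces. With heterogeneous effects the ratio would be interpreted as a local average effect for units whose treatment responds to the instrument, under the usual IV assumptions.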

Slides

Instrumental Variables

Code

Replication of Graddy (1995) and Card (1995)

Readings

Mixtape chapter 7: Instrumental Variables

Regression Discontinuity Design

About

One of the most desired quasi-experimental designs -- desired because it is viewed as highly credible despite being based on observational data -- is the regression discontinuity design. Here I will discuss the sharp RDD in great detail, going through identification, estimation, specification tests and tips, as well as a replication.
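
As a rough sketch of sharp RDD estimation, the Python snippet below fits a separate line on each side of the cutoff within a fixed bandwidth and takes the difference of the two fitted values at the cutoff. The data are simulated without noise and have a built-in jump of 3. The cutoff, the bandwidth, the uniform weighting and all function names are assumptions made for illustration. This is not the data-driven optimal bandwidth procedure and it does not use the Hansen (2015) data.

```python
from statistics import mean

CUTOFF = 0.0
BANDWIDTH = 2.0  # hand-picked for illustration, not a data-driven choice


def simulated_outcome(x):
    """Hypothetical noiseless outcome with a jump of 3 at the cutoff."""
    return 4.0 + 0.2 * x if x >= CUTOFF else 1.0 + 0.5 * x


# Running variable on an evenly spaced grid from -3 to 3.
xs = [i / 4 for i in range(-12, 13)]
ys = [simulated_outcome(x) for x in xs]


def fit_line(x_vals, y_vals):
    """OLS intercept and slope for a single regressor."""
    x_bar, y_bar = mean(x_vals), mean(y_vals)
    sxx = sum((x - x_bar) ** 2 for x in x_vals)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(x_vals, y_vals))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def sharp_rd_estimate(x_vals, y_vals, cutoff, bandwidth):
    """Difference in fitted outcomes at the cutoff from two local lines."""
    # Center the running variable so each intercept is the fitted
    # outcome exactly at the cutoff.
    left = [(x - cutoff, y) for x, y in zip(x_vals, y_vals)
            if cutoff - bandwidth <= x < cutoff]
    right = [(x - cutoff, y) for x, y in zip(x_vals, y_vals)
             if cutoff <= x <= cutoff + bandwidth]
    left_intercept, _ = fit_line(*zip(*left))
    right_intercept, _ = fit_line(*zip(*right))
    return right_intercept - left_intercept


rd_estimate = sharp_rd_estimate(xs, ys, CUTOFF, BANDWIDTH)
print(f"estimated jump at the cutoff: {rd_estimate:.3f}")
```

Allowing the slope to differ on each side of the cutoff keeps a change in trend from being mistaken for a jump in level. With real, noisy data the estimate would also depend on the bandwidth and kernel choice.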

Slides

Regression Discontinuity Designs

Code

Replication of Hansen (2015)

Shiny App for RD Optimal Bandwidth

Readings

Mixtape chapter 6: Regression Discontinuity