
Causal-inference oriented doctoral econometrics course at UO


EC 607, Spring 2024

Welcome to Economics 607: Econometrics III (Spring 2024) at the University of Oregon (taught by Dr. Ed Rubin).

/\\\\\\\\\\\\\\\        /\\\\\\\\\            /\\\\\     /\\\\\\\     /\\\\\\\\\\\\\\\        
\/\\\///////////      /\\\////////         /\\\\////    /\\\/////\\\  \/////////////\\\       
 \/\\\               /\\\/               /\\\///        /\\\    \//\\\            /\\\/       
  \/\\\\\\\\\\\      /\\\               /\\\\\\\\\\\    \/\\\     \/\\\          /\\\/        
   \/\\\///////      \/\\\              /\\\\///////\\\  \/\\\     \/\\\        /\\\/         
    \/\\\             \//\\\            \/\\\      \//\\\ \/\\\     \/\\\      /\\\/          
     \/\\\              \///\\\          \//\\\      /\\\  \//\\\    /\\\     /\\\/           
      \/\\\\\\\\\\\\\\\    \////\\\\\\\\\  \///\\\\\\\\\/    \///\\\\\\\/    /\\\/            
       \///////////////        \/////////     \/////////        \///////     \///      

Schedule

Lecture Monday and Wednesday 10:00am–11:20am, Friendly 221

Lab Friday 12:00pm–12:50pm, 330 Condon

Office hours

Books

Main texts

We will mainly use two books.

Mostly Harmless Econometrics: An Empiricist's Companion (MHE)
by Angrist and Pischke
Your new best friend. Read it.

Microeconometrics (C&T)
by Cameron and Trivedi
Also very readable and accessible.

Runners-up

Econometric Analysis (Greene)
by Greene
The standard—an encyclopedic resource for many of the questions MHE does not answer.

Introduction to Causal Inference (Neal)
by Brady Neal
A free, under-development, causal-inference book targeting folks who come from a prediction (think: machine learning) background.

Also helpful

Causal Inference in Statistics: A Primer (Pearl)
by Pearl, Glymour, and Jewell

Causal Inference: The Mixtape (Mixtape)
by Cunningham

Lecture slides

Note: The linked slides (below) are .html files that will only work properly if you are connected to the internet. If you're going off grid (camping + metrics?), grab the PDFs. You'll miss out on gifs and interactive plots, but the equations will actually show up.

The content of the lectures mainly follows MHE and Michael Anderson—with additional inspiration from Max Auffhammer and many other sources.

Another note on the notes: I create the slides with xaringan in R. Thanks to Grant McDermott for encouraging me to make this switch.

Lecture 01: Research + R + You = 💖

  1. An introduction to empirical research via applied econometrics.
  2. R: Light introduction—objects, functions, and help.

Note formats: .html | .pdf | .rmd

Readings: MHE preface + MHE chapter 1
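
A minimal sketch of the R basics this lecture introduces (the object names below are just examples):

```r
# Objects, functions, and help
a <- 2 + 2                   # assign a value to an object
square <- function(x) x^2    # define a function
square(a)                    # returns 16
?mean                        # open the help file for mean()
```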

Lecture 02: The Experimental Ideal

  1. Neyman potential outcomes framework (Rubin causal model)
  2. Selection bias and experimental variation in treatment

Note formats: .html | .pdf | .rmd

Readings: MHE chapter 2
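
A minimal base-R simulation of the selection-bias point: the naive difference in means is biased when treatment depends on potential outcomes but recovers the effect under random assignment (the data-generating process below is made up for illustration):

```r
set.seed(607)
n <- 1e4
y0 <- rnorm(n)                          # potential outcome without treatment
y1 <- y0 + 1                            # constant treatment effect of 1
d_select <- rbinom(n, 1, plogis(-y0))   # treatment more likely when y0 is low
d_random <- rbinom(n, 1, 0.5)           # randomly assigned treatment
y_select <- ifelse(d_select == 1, y1, y0)
y_random <- ifelse(d_random == 1, y1, y0)
# Naive difference in means: biased under selection, close to 1 under randomization
mean(y_select[d_select == 1]) - mean(y_select[d_select == 0])
mean(y_random[d_random == 1]) - mean(y_random[d_random == 0])
```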

Lecture 03: Why Regression?

  1. What's the big deal about least-squares (population) regression?
  2. What does the CEF tell us?
  3. How does least-squares regression relate to the CEF?

Note formats: .html | .pdf | .rmd

Readings: MHE chapter 3.1
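
A small base-R illustration of the regression–CEF link: with a discrete regressor the CEF is a set of group means, and a saturated regression reproduces them exactly (simulated data, illustrative only):

```r
set.seed(607)
x <- sample(0:3, 1e4, replace = TRUE)
y <- 2 + 0.5 * x^2 + rnorm(1e4)
cef <- tapply(y, x, mean)                          # E[y | x] at each value of x
fit <- lm(y ~ factor(x))                           # saturated regression
cbind(cef, ols = predict(fit, data.frame(x = 0:3)))
```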

Lecture 04: Inference and Simulation

  1. How do we move from populations to samples?
  2. What matters for drawing basic statistical inferences about the population?
  3. How can we learn about inference from simulation?
  4. How do we run (parallelized) simulations in R?

Note formats: .html | .pdf | .rmd

Readings: MHE chapter 3
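
One way to parallelize a simulation with the built-in parallel package; a minimal sketch in which each iteration draws a sample and stores the OLS slope (sample size, DGP, and worker count are arbitrary):

```r
library(parallel)
one_iter <- function(i, n = 50) {
  x <- rnorm(n)
  y <- 1 + 2 * x + rnorm(n)
  coef(lm(y ~ x))[2]                    # return the estimated slope
}
cl <- makeCluster(2)                    # e.g., detectCores() - 1 workers in practice
b_hat <- parSapply(cl, 1:1000, one_iter)
stopCluster(cl)
c(mean = mean(b_hat), sd = sd(b_hat))   # simulated sampling distribution of the slope
```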

Lecture 05: Regression Stuff

  1. Saturated models
  2. When is regression causal?
  3. The conditional-independence assumption

Note formats: .html | .pdf | .rmd

Readings: Still MHE chapter 3

Lecture 06: Controls

  1. Omitted-variable bias
  2. Good and bad controls

Note formats: .html | .pdf | .rmd

Readings: Still MHE chapter 3
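
A quick base-R simulation of omitted-variable bias: the "short" regression that drops the control w is biased away from the true coefficient on d, while the "long" regression recovers it (numbers are illustrative):

```r
set.seed(607)
n <- 1e4
w <- rnorm(n)
d <- 0.5 * w + rnorm(n)            # treatment is correlated with w
y <- 1 + 2 * d + 3 * w + rnorm(n)  # true coefficient on d is 2
coef(lm(y ~ d))                    # short regression: biased upward
coef(lm(y ~ d + w))                # long regression: approximately 2
```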

Lecture 07: DAGs

  1. Defining graphs
  2. Underlying theory for DAGs
  3. Building blocks
  4. Examples

Note formats: .html | .pdf | .rmd
Readings: Brady Neal's book, chapters 1–3 (especially 2–3)

Extras: dagitty and ggdag
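
A tiny DAG example, assuming the ggdag and dagitty packages are installed; a confounder w creates a backdoor path from d to y:

```r
library(ggdag)
dag <- dagify(y ~ d + w, d ~ w, exposure = "d", outcome = "y")
ggdag(dag)                      # draw the DAG with ggplot2
dagitty::adjustmentSets(dag)    # conditioning on { w } closes the backdoor path
```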

Lecture 08: Matching

  1. Matching estimators: Nearest neighbor and kernel
  2. Propensity-score methods: Regression control, treatment-effect heterogeneity, blocking, weighting, doubly robust

Note formats: .html | .pdf | .rmd
Readings: MHE chapter 3 + C&T section 25.4

Bonus: Slides outlining logistic regression.
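
A base-R sketch of propensity-score weighting: fit a logistic regression for treatment, form inverse-probability weights, and reweight the outcome regression (simulated data; the DGP is made up):

```r
set.seed(607)
n <- 5e3
w <- rnorm(n)
d <- rbinom(n, 1, plogis(w))                  # treatment probability depends on w
y <- 1 + d + 2 * w + rnorm(n)                 # true treatment effect is 1
ps <- fitted(glm(d ~ w, family = binomial))   # estimated propensity score
ipw <- ifelse(d == 1, 1 / ps, 1 / (1 - ps))   # inverse-probability weights
coef(lm(y ~ d))                               # unweighted: biased by selection on w
coef(lm(y ~ d, weights = ipw))                # IPW: roughly recovers 1
```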

Lecture 09: Instrumental Variables

  1. General research designs
  2. Instrumental variables (IV)
  3. Two-stage least squares (2SLS)
  4. Heterogeneous treatment effects and the LATE

Note formats: .html | .pdf | .rmd
Readings: MHE chapter 4 + C&T sections 4.8–4.9
Additional material: Paper on machine learning the first stage of 2SLS
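
A minimal 2SLS sketch on simulated data: regress the endogenous x on the instrument z, then regress y on the first-stage fitted values (in practice, use a packaged estimator such as estimatr::iv_robust so the standard errors are computed correctly):

```r
set.seed(607)
n <- 5e3
z <- rnorm(n)                    # instrument
u <- rnorm(n)                    # unobserved confounder
x <- z + u + rnorm(n)            # endogenous regressor
y <- 2 * x + u + rnorm(n)        # true effect of x is 2
coef(lm(y ~ x))                  # OLS: biased
x_hat <- fitted(lm(x ~ z))       # first stage
coef(lm(y ~ x_hat))              # second stage: approximately 2
# estimatr::iv_robust(y ~ x | z) gives the same point estimate with valid SEs
```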

Lecture 10: Regression Discontinuity

  1. Sharp regression discontinuities
  2. Fuzzy regression discontinuities
  3. Graphical analyses

Note formats: .html | .pdf | .Rmd
Readings: MHE chapter 6 + C&T section 25.6
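
A sharp-RD sketch in base R: keep observations within an (arbitrary) bandwidth of the cutoff and fit separate linear trends on each side (simulated data; real applications typically use a package like rdrobust for bandwidth selection and robust inference):

```r
set.seed(607)
n <- 5e3
run <- runif(n, -1, 1)                        # running variable, cutoff at 0
d <- as.numeric(run >= 0)                     # treatment assigned above the cutoff
y <- 0.5 * run + d + rnorm(n, sd = 0.3)       # true jump at the cutoff is 1
dat <- data.frame(y, d, run)
h <- 0.2                                      # bandwidth (chosen arbitrarily here)
fit <- lm(y ~ d + run + d:run, data = subset(dat, abs(run) < h))
coef(fit)["d"]                                # estimated discontinuity, roughly 1
```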

Lecture 11: Inference: Clustering

  1. General inference
  2. Moulton
  3. Cluster-robust standard errors

Note formats: .html | .pdf | .Rmd

Readings: MHE chapter 8
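
A minimal cluster-robust example, assuming the estimatr package is installed; the grouped error structure below is simulated:

```r
library(estimatr)
set.seed(607)
g <- rep(1:50, each = 20)                    # 50 clusters of 20 observations
a <- rnorm(50)[g]                            # cluster-level shock
x <- rnorm(1000)
y <- 1 + 0.5 * x + a + rnorm(1000)
dat <- data.frame(y, x, g)
lm_robust(y ~ x, data = dat, clusters = g)   # cluster-robust (CR2) standard errors
```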

Lecture 12: Inference: Resampling and Randomization

  1. Resampling
  2. The bootstrap
  3. Permutation tests (Fisher)
  4. Randomization inference (Neyman-Pearson)

Note formats: .html | .pdf | .Rmd

Readings: MHE chapter 6 + C&T section 25.6
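
A base-R sketch of both ideas: a pairs bootstrap for the standard error of a difference in means, and a Fisher-style permutation test of the sharp null (replication counts and the DGP are arbitrary):

```r
set.seed(607)
n <- 200
d <- rbinom(n, 1, 0.5)
y <- 0.3 * d + rnorm(n)
diff_obs <- mean(y[d == 1]) - mean(y[d == 0])
# Bootstrap: resample (y, d) pairs with replacement
boot <- replicate(2000, {
  i <- sample(n, replace = TRUE)
  mean(y[i][d[i] == 1]) - mean(y[i][d[i] == 0])
})
sd(boot)                              # bootstrap standard error
# Permutation test: reshuffle treatment labels under the sharp null
perm <- replicate(2000, {
  d_s <- sample(d)
  mean(y[d_s == 1]) - mean(y[d_s == 0])
})
mean(abs(perm) >= abs(diff_obs))      # two-sided permutation p-value
```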

Lecture 13: Machine learning (in one lecture)

  1. Prediction basics
  2. The bias-variance tradeoff
  3. In-sample vs. out-of-sample performance
  4. Hold-out methods (including cross validation)
  5. Ridge regression and lasso
  6. Decision trees
  7. Ensembles (of trees)

Note formats: .html | .pdf | .Rmd

Readings: An Introduction to Statistical Learning
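
A short example of penalized regression with cross-validated tuning, assuming the glmnet package is installed (simulated data in which only two of fifty predictors matter):

```r
library(glmnet)
set.seed(607)
n <- 500; p <- 50
x <- matrix(rnorm(n * p), n, p)
y <- x[, 1] - 2 * x[, 2] + rnorm(n)
cv <- cv.glmnet(x, y, alpha = 1)      # alpha = 1 is the lasso
cv$lambda.min                         # penalty chosen by cross validation
coef(cv, s = "lambda.min")            # most coefficients are shrunk to exactly zero
```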

Lab

Owen Jetton will walk you through R and applications of the course content. You should attend.

Previous lab slides

Note: From a previous iteration of this class.

Lab 01: R Intro/Review

  1. Object types/classes/structures
  2. Package management
  3. Math and statistics in R
  4. Indexing

Note formats: .html | .html (no pause) | .pdf | .pdf (no pause) | .Rmd
Solutions: .html | .pdf
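
A minimal base-R sketch touching each of these topics:

```r
x <- c(1.5, 2, 7)              # a numeric vector
class(x)                       # "numeric"
m <- matrix(1:6, nrow = 2)     # a 2-by-3 matrix
m[2, 3]                        # index by row and column
# install.packages("dplyr")    # install a package once...
# library(dplyr)               # ...then load it each session
mean(x); sd(x)                 # basic statistics
```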

Lab 02: Data in/and R

  1. Data frames
  2. Data work with dplyr

Note formats: .html | .html (no pause) | .pdf | .pdf (no pause) | .Rmd
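
A small taste of data work with dplyr, assuming the package is installed (mtcars ships with R):

```r
library(dplyr)
mtcars |>
  filter(cyl != 6) |>                 # keep rows
  mutate(kpl = 0.425 * mpg) |>        # create a column
  group_by(cyl) |>
  summarize(mean_kpl = mean(kpl), n = n())
```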

Lab 03: RStudio + Data i/o with R

  1. RStudio
  2. Getting data into and out of R

Note formats: .html | .html (no pause) | .pdf | .pdf (no pause) | .Rmd
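
A minimal example of getting data out of and back into R with base functions (packages like readr, data.table, and haven cover more formats and larger files):

```r
write.csv(mtcars, "mtcars.csv", row.names = FALSE)  # write a CSV to the working directory
dat <- read.csv("mtcars.csv")                       # read it back in as a data frame
head(dat)
```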

Lab 04: Regression in R

  1. lm() and lm objects
  2. estimatr and lm_robust()
  3. Other regressions, e.g., glm()

Note formats: .html | .html (no pause) | .pdf | .pdf (no pause) | .Rmd
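
A quick comparison of lm() and estimatr::lm_robust() on a built-in dataset, assuming estimatr is installed:

```r
library(estimatr)
fit_ols <- lm(mpg ~ wt + hp, data = mtcars)
fit_rob <- lm_robust(mpg ~ wt + hp, data = mtcars)  # HC2 robust SEs by default
summary(fit_ols)    # classical standard errors
summary(fit_rob)    # heteroskedasticity-robust standard errors
# glm(am ~ wt, data = mtcars, family = binomial) fits a logistic regression
```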

Lab 05: Plotting in R

  1. Default plot() methods
  2. ggplot2

Note formats: .html | .html (no pause) | .pdf | .pdf (no pause) | .Rmd
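
The same scatterplot two ways: the default plot() method and ggplot2 (assumes ggplot2 is installed):

```r
library(ggplot2)
plot(mtcars$wt, mtcars$mpg)                 # base-R default scatterplot
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +  # add a fitted line
  labs(x = "Weight (1,000 lbs)", y = "Miles per gallon")
```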

Lab 06: Simulation in R

  1. General simulation strategies
  2. Simulating IV in finite samples

Note formats: .html | .html (no pause) | .pdf | .pdf (no pause) | .Rmd
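
A sketch of a finite-sample IV simulation: with a weak instrument and a small sample, the 2SLS estimator tends to sit well away from the truth (all parameter values below are made up for illustration):

```r
set.seed(607)
iv_once <- function(n, pi = 0.1) {
  z <- rnorm(n)
  u <- rnorm(n)
  x <- pi * z + u + rnorm(n)              # weak first stage
  y <- x + u + rnorm(n)                   # true effect of x is 1
  coef(lm(y ~ fitted(lm(x ~ z))))[2]      # 2SLS by hand
}
est <- replicate(2000, iv_once(n = 50))
median(est)                               # tends to be biased toward the OLS estimate
```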

Lab 07: Miscellaneous R Tips and Tricks

  1. The apply family
  2. for() loops
  3. Lists
  4. Logical vectors and which()

Note formats: .html | .html (no pause) | .pdf | .pdf (no pause) | .Rmd
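
A few of these tools in one short base-R sketch:

```r
x_list <- list(a = 1:5, b = rnorm(10), c = runif(3))
sapply(x_list, mean)                 # apply a function over a list
out <- numeric(length(x_list))       # the equivalent for() loop
for (i in seq_along(x_list)) out[i] <- mean(x_list[[i]])
v <- c(3, -1, 7, 0, -5)
which(v < 0)                         # positions of negative elements
v[v < 0]                             # logical indexing returns the elements
```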

Problem sets

Problem sets combining econometric theory and R.

Problem set 1
Due Thursday, 18 April 2024

Problem set 2
Due Sunday, 12 May 2024

Problem set 3
Due Friday, 24 May 2024

Project

The course has two projects:

  1. A research proposal that centers on a causal question.
  2. A presentation of a topic that extends what we cover during the course.

Project 1: Research proposal

Building a research project/proposal.

Why? You are wrapping up your first year in the PhD. It's time to start thinking about how you could apply what you've learned.

Step 1: Research question (causal relationship of interest) and motivation

Assignment: Pitch a project that includes a causal question of interest. Include motivation.

  • This project should be something you could turn into a legitimate research project.
  • Length: 150–250 words
  • You should have several drafts (only submit the last one).
  • Talk with your classmates (and me!).

More information here

Due: May 1, 2024; submit on Canvas

Step 2: Full project proposal

Assignment: Incorporate feedback from step 1 and write a "full" project proposal (~3 pages).

  1. Motivate and outline the causal question of interest.
  2. Explain potential sources of selection that could bias estimation.
  3. Describe the ideal experiment for your setting.
  4. Discuss a practical research design through which one could answer the question. Explain how this research design avoids selection bias.

Note: You do not need to actually estimate anything.

More information here.

Due: May 29, 2024

Project 2: Extensions

Assignment

  • Choose a topic related to causal inference that we do not cover in class (e.g., difference-in-differences, the wild cluster bootstrap, synthetic control methods).
  • Write a summary/tutorial of the topic that includes (a) the math behind the approach and (b) an empirical example.
  • Present a five-minute summary of the topic to your classmates.

Why? In the course of the PhD, we want to teach you how to learn. We will not provide you with everything you need to know to be able to do research on any topic. But hopefully we provide you with a foundation and the ability to learn new things. Also: You need to learn how to communicate both in writing and in person.

Due: All material (including slides) is due June 2, 2024. Presentations will be during class on June 5, 2024.

See here for more information and example topics.

Practice problems

  1. Inference and simulation
  2. Matching
  3. Instrumental variables
  4. Regression discontinuity
  5. Inference: Clustering and resampling

Exams

The final exam has two parts:

  • In class: 10:15am–12:15pm on Tuesday, June 11th (2024).
  • Take-home exam: Responses due by 11:59pm Pacific on Thursday, June 13th, 2024.

We do not have a midterm exam.

Examples of past exams:

Grades

As you've hopefully figured out by now, our PhD program is not "about grades." This class is critical to building the intuition and skills that you will rely upon in your own empirical work and in communicating with others about their empirical work. Commit to (and focus on) learning the material—the theory, the intuition, and the programming.

That said, I do have to turn in grades (and there is a GPA requirement to sit for the qualifying exam). I will weight your grades as follows:

  • Exam: The final exam is worth 45% of your course grade.
  • Projects: Each of the two projects is worth 12.5% of your course grade (so together they are worth 25%).
  • Assignments: Assignments jointly cover the remaining 30% of the grade (and may not be weighted equally).

Note: Anything you turn in with your name on it should be legitimately your own work. I encourage you to work with classmates and to get good with ChatGPT/Copilot/Google, but you still need to put things in your own words and understand what you've submitted. Submitting other people's work as your own will result in you failing this course.

Resources

Metrics books

R resources

Metrics and R

More