jonathandroth/staggered

Stata version bug


Hi, I installed your code exactly as described on this page. When I enter the following command line in Stata, I get an error, but the code runs fine in R, and the package the error message refers to is installed. In general, the Stata wrapper seems to have install errors for package dependencies in R.

I'll just run the code in R for now, but thought you might want to know about this!

. staggered, y("complaints") g("first_trained") t("period") i("uid") estimand("simple")
Installing devtools and staggered packages in R
Error in install.packages("devtools", repos = "http://cran.us.r-project.org") : unable to install packages

@johnklopfer can you send it to my email? pedro.h.santanna@vanderbilt.edu
GitHub is blocking it.

Some thoughts on the errors:

  1. It seems that FIPS is not a cross-sectional identifier, since several observations share the same FIPS in the same time period. You should set i = leaid, as leaid seems to be the correct cross-sectional identifier here. I did this in R and had no problems.

  2. I just double-checked in R that did and staggered_cs agree on point estimates. They do. In did, you need to make sure that you set control_group = "notyettreated".

  3. I am not getting any error with staggered or staggered_cs in R. Would you mind updating or reinstalling the R package to see whether you still get the error?

  4. I am also getting the error message "vcv_neyman not found" in Stata. I need to dig further into why this is happening.
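
On point 1, a quick way to check whether a candidate identifier really is cross-sectional is to count how many rows share each (unit, period) pair. Below is a minimal sketch in Python, using made-up toy rows rather than the actual data. The helper name `duplicate_cells` and the toy values are illustrative assumptions; only the column names FIPS, leaid and period come from the discussion above.

```python
from collections import Counter

def duplicate_cells(rows, unit_key, time_key):
    """Return {(unit, period): count} for every pair that appears more than once."""
    counts = Counter((row[unit_key], row[time_key]) for row in rows)
    return {cell: n for cell, n in counts.items() if n > 1}

# Toy panel (made-up values): two leaid units share one FIPS code.
rows = [
    {"FIPS": "01001", "leaid": "A", "period": 2010},
    {"FIPS": "01001", "leaid": "B", "period": 2010},
    {"FIPS": "01001", "leaid": "A", "period": 2011},
    {"FIPS": "01001", "leaid": "B", "period": 2011},
]

# FIPS repeats within a period, so it cannot serve as the unit identifier.
fips_dupes = duplicate_cells(rows, "FIPS", "period")
# leaid appears once per period, so it identifies units uniquely.
leaid_dupes = duplicate_cells(rows, "leaid", "period")

print(fips_dupes)
print(leaid_dupes)
```

An empty result means the identifier is unique within each period, which is what the i argument needs.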

On the Stata bug: we are digging into this because it is a weird one. Depending on the event-time structure, the Stata wrapper currently may or may not work, which does not happen in R!

Now, on the single treated cluster: the theory in our paper, and in Callaway and Sant'Anna (and, I'd say, in every paper on staggered DiD), does not cover that case. The issue is that we can't identify sigma with a single observation; there is not much we can do about this...

I will keep you posted on the Stata bug. Keeping this open for that.

Thanks

The theory behind CS relies on the number of units in each group growing. I believe the same is true for BJS and Gardner. The fact that csdid, did_imputation and did2s produce results does not imply that those results are necessarily reliable in this particular case with one unit per group.

As an author of CS, I can say more about that one. For instance, with a single treated unit in a group, CS essentially only accounts for the uncertainty coming from the construction of the comparison group. That is, it treats Y(g)|G=g as fixed (zero variance).

With staggered, this is trickier because we adopt a design-based perspective, and we can't estimate the variance of Y(g) if group G=g has a single observation. Treating the variance as zero is not really a "solution"...
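
To make the single-observation problem concrete, consider the usual within-group sample variance. This is only an illustration: the notation N_g (number of units in group g) and \bar{Y}_g (the group mean) is assumed here, and this is not necessarily the exact estimator the package uses.

```latex
\hat{\sigma}_g^{2} = \frac{1}{N_g - 1} \sum_{i:\, G_i = g} \left( Y_i - \bar{Y}_g \right)^{2},
\qquad
\bar{Y}_g = \frac{1}{N_g} \sum_{i:\, G_i = g} Y_i .
```

With N_g = 1, the single observation equals the group mean, so the sum is zero and so is the denominator. The expression is 0/0, and the data carry no information about the within-group variance.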

Does this help?

Thanks