NickCH-K/did

estpost matrix fails when there are missing values in returned matrix from R

Closed this issue · 4 comments

did in R pushes back the table, which can, for (identification?) reasons, fail to obtain a parameter for a day. For context, I am running att_gt on daily data (and estimating upwards of 200 ATTs). For some days the parameter isn't identified, and the matrix from did looks like so:
<snip>
Group-Time Average Treatment Effects:
Group  Time   ATT(g,t)  Std. Error  [95% Simult. Conf. Band]
22165  22138    0.0013      0.0082  -0.0315   0.0341
22165  22139    0.0138      0.0079  -0.0177   0.0453
22165  22140    0.0057      0.0082  -0.0270   0.0384
22165  22141   -0.0547      0.0670  -0.3215   0.2120
22165  22142    0.0051      0.0074  -0.0243   0.0346
22165  22143    0.0466      0.0674  -0.2217   0.3150
22165  22144    0.0135      0.0082  -0.0192   0.0461
22165  22145    0.0551      0.0670  -0.2118   0.3221
22165  22146    0.0466      0.0674  -0.2217   0.3150
22165  22147    0.0000          NA       NA       NA
22165  22148    0.6654      0.6104  -1.7663   3.0972
22165  22149    0.0000          NA       NA       NA
22165  22150    0.0455      0.1005  -0.3549   0.4459
22165  22151    0.0000          NA       NA       NA
22165  22152    0.0000          NA       NA       NA
22165  22153    0.0726      0.0717  -0.2132   0.3584
22165  22154   -0.1263      0.0799  -0.4446   0.1920
22165  22155   -0.6982      0.6901  -3.4477   2.0512
22165  22156    0.0000          NA       NA       NA
<snip>

Now, those NAs are coded as missings in Stata, but estpost matrix really doesn't like that.

I have hacked at the Rcall code to return my matrix for the moment (I am basically injecting zeros if those NAs are detected). I have raised this as a bug/feature request in Rcall, but I don't know if there is anything you could do in R before passing the matrix back so that the issue never arises?

The bug prevents att_gt from completing as nothing can be posted, and then it blocks Rcall (as it remembers this error) and in future runs it thinks that did is not installed.

Hmm, that makes sense. I mean, you probably don't want to be using the results at that point, but it should be done in a way that doesn't mess up the rest of the Rcall session. It should be fixable by converting each matrix to a character matrix in R and replacing all the NAs with ".". I'll keep it as a to-do
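For illustration only, the conversion described above might look roughly like the sketch below. It is not the code from the package: the matrix `res` and its column names are hypothetical stand-ins built from two rows of the table in the report, and only base R calls are used.

```r
# Hypothetical stand-in for two rows of the att_gt results table
# (column names are assumptions, not the package's own).
res <- matrix(
  c(0.0466, 0.0674, -0.2217, 0.3150,
    0.0000, NA,     NA,      NA),
  nrow = 2, byrow = TRUE,
  dimnames = list(NULL, c("att", "se", "lower", "upper"))
)

# Switch the storage type to character, keeping dimensions and names,
# then replace every NA with Stata's missing-value symbol ".".
res_chr <- res
storage.mode(res_chr) <- "character"
res_chr[is.na(res_chr)] <- "."

storage.mode(res_chr)  # "character"
```

One consequence of this approach is that the result is a character matrix rather than a numeric one, so whatever receives it on the Stata side has to cope with strings.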

This should be fixed with 5be213d, can you update your did installation and try your code again? Thanks!

Thanks for that. I have tested, and I'm afraid it's not worked. The original issue with estpost stems from the fact that estpost is not allowed to have missing entries in a matrix. Your solution replaced the NAs by . which leads to the same issue.

I believe the only way for it to work and not poison the next call of Rcall is to replace the missings with zeros. As the missing values will be for the bounds and the SEs (and the estimate is already returned as zero), I do not believe this would be an issue for users.
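A minimal sketch of that zero-fill workaround in base R, assuming the results arrive as a numeric matrix. The matrix `res` and its column names are hypothetical, with values taken from two rows of the table in the report.

```r
# Hypothetical stand-in for two rows of the att_gt results table
# (column names are assumptions, not the package's own).
res <- matrix(
  c(0.0466, 0.0674, -0.2217, 0.3150,
    0.0000, NA,     NA,      NA),
  nrow = 2, byrow = TRUE,
  dimnames = list(NULL, c("att", "se", "lower", "upper"))
)

# Replace every NA with 0; the matrix stays numeric throughout.
res_zero <- res
res_zero[is.na(res_zero)] <- 0

is.numeric(res_zero)  # TRUE
```

The matrix no longer contains missings and keeps its numeric type, but an unidentified ATT(g,t) then shows up as an estimate of zero with a zero standard error and zero-width band.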

A side issue is that the fix in 5be213d turns the matrix into a string matrix on the way back, and the Rcall beta cannot handle that, as it saves the matrix in R as a Stata dataset and then side-loads it into Stata using mkmat.

So if you could ensure the matrix is numeric, that would be grand. Otherwise I will flag this as an issue in Rcall. (For the moment I am, for example, destringing in the code in call_return.ado.)

Oh I see. So you are getting a matrix back but estpost won't take it. That seems like an estpost issue :p I'm not inclined to replace with 0s since that would be misrepresenting the results. If someone wants to replace . with 0 in Stata they can do so after the fact.

I was getting back numeric matrices, so I'm a bit surprised you're getting string. Were you getting proper numeric matrices with missings before the update?

Also, crucially, if you can share a minimal example of your data/code that would make this way easier to help with. I'm sorta shooting in the dark here.