sergiocorreia/reghdfe

clustering by categorical variable

aferrari0 opened this issue · 0 comments

I have recently updated ivreghdfe, ftools, and reghdfe.
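For reference, a minimal sketch of how the installed versions could be listed, assuming all three packages are on the ado-path (output not reproduced here):

```stata
* Print the location and version header of each installed package.
which ivreghdfe
which reghdfe
which ftools
```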

With the older versions, I had been successfully running panel IV regressions of the form:

ivreghdfe y (x=z), absorb(id) cluster(id)

where id is the panel identifier. Importantly, id is a string variable.

I have 32,251 observations and 2,329 different id values.

After the update (and only then), ivreghdfe started giving me the following error:

insufficient observations
r(2001);

This only happens when estimating the regression above. If I encode the id variable

encode id, gen(id_code)

and estimate

ivreghdfe y (x=z), absorb(id) cluster(id_code)

or

ivreghdfe y (x=z), absorb(id_code) cluster(id_code)

the regressions work, and I get the same results as before the update.
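The workaround can also be sketched with egen's group() function instead of encode. This is only an illustration: id_num is a hypothetical variable name, and I am assuming that group() defines the same clusters as encode because both assign one integer per distinct value of id.

```stata
* Sketch of the workaround: cluster on a numeric copy of the string id.
* id_num is a hypothetical name; y, x, z, and id are the variables above.

* Stop with an error if id is not a string variable.
confirm string variable id

* Assign one integer per distinct value of id.
egen long id_num = group(id)

* Absorb and cluster on the numeric identifier.
ivreghdfe y (x=z), absorb(id_num) cluster(id_num)
```

Unlike encode, group() does not attach value labels, so the original strings remain available only in id.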

For context, clustering by the original string variable does not seem to be a problem in reghdfe.

Best