What's the difference between double ML and double selection?
Alalalalaki opened this issue · 1 comment
My question is more about the theory and practice of double ML than about the package per se. I apologize, but I could not find another place to ask it. The original paper is beyond my capability, and I learned about double ML through your documentation.
The thing is, I read Belloni, Chernozhukov, and Hansen (2014, JEP) and found that, in a setting similar to the partially linear regression, they recommend applying a variable selection method (lasso) to each of the two reduced-form equations and then using all of the selected controls in a traditional estimation (OLS) of the treatment effect of interest. There is no mention of cross-fitting for this double selection method. Is double selection simply double ML without cross-fitting? And is double ML with cross-fitting strictly better than double selection in any specific cases?
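For concreteness, here is a minimal sketch of the post-double-selection recipe on simulated data; the variable names and the cross-validated choice of the lasso penalty are my own illustrative choices, not prescribed by the paper:

```python
import numpy as np
import statsmodels.api as sm
from sklearn.linear_model import LassoCV

# Toy data: outcome y, treatment d, high-dimensional controls X (all made up).
rng = np.random.default_rng(0)
n, p = 500, 100
X = rng.normal(size=(n, p))
d = X[:, 0] + rng.normal(size=n)
y = 0.5 * d + X[:, 0] + X[:, 1] + rng.normal(size=n)

# Step 1: lasso of y on X (first reduced form).
sel_y = np.flatnonzero(LassoCV(cv=5).fit(X, y).coef_)
# Step 2: lasso of d on X (second reduced form).
sel_d = np.flatnonzero(LassoCV(cv=5).fit(X, d).coef_)
# Step 3: OLS of y on d plus the *union* of the controls selected in steps 1-2.
union = np.union1d(sel_y, sel_d)
ols = sm.OLS(y, sm.add_constant(np.column_stack([d, X[:, union]]))).fit()
print(ols.params[1])  # estimated treatment effect (column 1, after the constant)
```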
A related question: how do I do double ML when there are some covariates that I do not want to put into the ML algorithms but instead want to estimate in a traditional way (like the covariates in a simple 2SLS)? With double selection, I can simply add them to the final OLS. But how can I do this with DoubleMLPLR? Or does adding such variables make sense at all?
Thanks in advance for any suggestions.
The double selection approach implicitly creates an orthogonal moment condition, so it is equivalent to the double machine learning approach, which relies on a Neyman orthogonal moment condition.
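For reference, a sketch of what that moment looks like in the partially linear model $Y = \theta_0 D + g_0(X) + \zeta$, $D = m_0(X) + V$ (this is the standard partialling-out score, written out here for concreteness):

$$
\psi(W; \theta, \eta) = \bigl(Y - \ell(X) - \theta\,(D - m(X))\bigr)\,\bigl(D - m(X)\bigr),
\qquad \ell(X) = E[Y \mid X], \quad m(X) = E[D \mid X].
$$

Its derivative with respect to the nuisance functions $(\ell, m)$ vanishes at their true values, which is what makes the estimate of $\theta$ insensitive to small errors in the ML fits. Post-double-selection implicitly targets the same moment through the union of the selected controls.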
For the lasso, cross-fitting is usually not required, because the entropy of the function class can be controlled. For other methods, e.g. random forests, this is not easily possible, and hence cross-fitting has to be employed. (In general, cross-fitting is recommended.)
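As an illustration of cross-fitting with a random forest, here is a minimal sketch with the Python package, assuming a recent API in which the PLR nuisance learners are passed as `ml_l` and `ml_m`; `n_folds` controls the cross-fitting splits:

```python
from sklearn.ensemble import RandomForestRegressor
from doubleml import DoubleMLData, DoubleMLPLR
from doubleml.datasets import make_plr_CCDDHNR2018

# Simulated partially linear data shipped with the package.
df = make_plr_CCDDHNR2018(n_obs=500, return_type='DataFrame')
dml_data = DoubleMLData(df, y_col='y', d_cols='d')

# Random forests as nuisance learners: this is a case where cross-fitting matters.
ml_l = RandomForestRegressor(n_estimators=200)  # learns E[Y | X]
ml_m = RandomForestRegressor(n_estimators=200)  # learns E[D | X]

dml_plr = DoubleMLPLR(dml_data, ml_l, ml_m, n_folds=5)  # 5-fold cross-fitting
dml_plr.fit()
print(dml_plr.summary)
```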
If you know that certain variables should be included in the model, you can of course force them to be in the model. This is not yet implemented in DoubleML and hdm (maybe in the future), but as a workaround you could manually partial out the effect of the relevant variables, or you could define them as treatment variables.
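A sketch of the "define them as treatment variables" workaround in the Python package, with all data and column names hypothetical: `DoubleMLData` accepts several `d_cols`, and each of them gets its own parametric coefficient.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LassoCV
from doubleml import DoubleMLData, DoubleMLPLR

# Hypothetical data: treatment d, two "must-keep" covariates w1, w2,
# and high-dimensional controls x0..x49 (all names made up for illustration).
rng = np.random.default_rng(0)
n, p = 500, 50
X = rng.normal(size=(n, p))
w = rng.normal(size=(n, 2))
d = X[:, 0] + rng.normal(size=n)
y = 0.5 * d + w @ np.array([1.0, -1.0]) + X[:, 0] + rng.normal(size=n)
df = pd.DataFrame(np.column_stack([y, d, w, X]),
                  columns=['y', 'd', 'w1', 'w2'] + [f'x{i}' for i in range(p)])

# Declaring w1 and w2 as additional treatments keeps them out of the x_cols
# that feed the ML step; each variable in d_cols gets its own coefficient.
# (Note: by default the other treatment columns are still used as covariates
# while one of them is estimated, via use_other_treat_as_covariate=True.)
dml_data = DoubleMLData(df, y_col='y', d_cols=['d', 'w1', 'w2'],
                        x_cols=[f'x{i}' for i in range(p)])
dml_plr = DoubleMLPLR(dml_data, LassoCV(), LassoCV(), n_folds=5)
dml_plr.fit()
print(dml_plr.summary)  # one row each for d, w1, w2
```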
Literature: for background reading, the vignettes of the hdm and DoubleML packages might be helpful.