[Bug]: KeyError in DoubleMLPLIV.fit() with multiple instruments and store_predictions=True
vnastl opened this issue · 1 comments
vnastl commented
Describe the bug
With multiple instruments, DoubleMLPLIV.fit() raises a KeyError when called with store_predictions=True.
Minimum reproducible code snippet
import numpy as np
import doubleml as dml
from doubleml.datasets import make_pliv_CHS2015
from sklearn.ensemble import RandomForestRegressor
from sklearn.base import clone
np.random.seed(3141)
learner = RandomForestRegressor(n_estimators=100, max_features=20, max_depth=5, min_samples_leaf=2)
ml_l = clone(learner)
ml_m = clone(learner)
ml_r = clone(learner)
obj_dml_data = make_pliv_CHS2015(n_obs=500, alpha=1.0, dim_x=10, dim_z=10, return_type='DoubleMLData')
dml_pliv_obj = dml.DoubleMLPLIV(obj_dml_data, ml_l, ml_m, ml_r)
dml_pliv_fit = dml_pliv_obj.fit(store_predictions=True)
Expected Result
Predictions are stored for every learner in 'params_names', i.e. for:
print(dml_pliv_obj.params_names)
['ml_l',
'ml_r',
'ml_m_Z1',
'ml_m_Z2',
'ml_m_Z3',
'ml_m_Z4',
'ml_m_Z5',
'ml_m_Z6',
'ml_m_Z7',
'ml_m_Z8',
'ml_m_Z9',
'ml_m_Z10']
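To make the expectation concrete, here is a rough sketch of the container I would expect after fitting. It is not DoubleML code: the dict layout is inferred only from the indexing `self._predictions[learner][:, self._i_rep, self._i_treat]` visible in the traceback, `n_obs=500` comes from the snippet above, and `n_rep=1` and `n_treat=1` are assumptions.

```python
import numpy as np

# n_obs is taken from the snippet; n_rep and n_treat are assumed values
n_obs, n_rep, n_treat = 500, 1, 1

# The learner names printed by params_names above
params_names = ["ml_l", "ml_r"] + [f"ml_m_Z{i}" for i in range(1, 11)]

# Hypothetical container: one (n_obs, n_rep, n_treat) array per learner,
# initialised to NaN until predictions are written into it
predictions = {
    name: np.full((n_obs, n_rep, n_treat), np.nan) for name in params_names
}

# Writing one learner's predictions for one repetition and treatment,
# using the same indexing pattern as in the traceback
i_rep, i_treat = 0, 0
predictions["ml_m_Z1"][:, i_rep, i_treat] = np.zeros(n_obs)

print(len(predictions))              # 12
print(predictions["ml_m_Z1"].shape)  # (500, 1, 1)
```

In other words, each of the 12 names should have its own entry, including one per instrument.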
Actual Result
Running the code raises the following error:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
/var/folders/sb/q_1b_jtx6_x55nw95r50s0tr0002mt/T/ipykernel_44055/2685974828.py in <module>
11 obj_dml_data = make_pliv_CHS2015(n_obs=500, alpha=1.0, dim_x=10, dim_z=10, return_type='DoubleMLData')
12 dml_pliv_obj = dml.DoubleMLPLIV(obj_dml_data, ml_l, ml_m, ml_r)
---> 13 dml_pliv_fit = dml_pliv_obj.fit(store_predictions=True)
/opt/anaconda3/envs/py39/lib/python3.10/site-packages/doubleml/double_ml.py in fit(self, n_jobs_cv, keep_scores, store_predictions, store_models)
500
501 if store_predictions:
--> 502 self._store_predictions(preds['predictions'])
503 if store_models:
504 self._store_models(preds['models'])
/opt/anaconda3/envs/py39/lib/python3.10/site-packages/doubleml/double_ml.py in _store_predictions(self, preds)
1000 def _store_predictions(self, preds):
1001 for learner in self.params_names:
-> 1002 self._predictions[learner][:, self._i_rep, self._i_treat] = preds[learner]
1003
1004 def _store_models(self, models):
KeyError: 'ml_m_Z1'
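The pattern behind the error can be reproduced without DoubleML. The sketch below is only an illustration with hypothetical stand-in names. It assumes that the dict of predictions holds a single 'ml_m' entry instead of one entry per instrument. The traceback does not show which of the two lookups on line 1002 actually fails, so this is a guess at the mechanism, not a diagnosis.

```python
# Hypothetical stand-ins that mirror the names in the traceback;
# this is not DoubleML code.
params_names = ["ml_l", "ml_r"] + [f"ml_m_Z{i}" for i in range(1, 11)]

# Assumption: predictions were returned under a single 'ml_m' key
# rather than under one key per instrument.
preds = {"ml_l": [0.1, 0.2], "ml_r": [0.3, 0.4], "ml_m": [0.5, 0.6]}

# Names that are expected but have no matching key in preds
missing = [name for name in params_names if name not in preds]


def store_predictions(names, preds):
    """Copy preds[name] for every name; raises KeyError on the first missing key."""
    stored = {}
    for name in names:
        stored[name] = preds[name]
    return stored


try:
    store_predictions(params_names, preds)
    error_key = None
except KeyError as err:
    error_key = err.args[0]

print(error_key)     # ml_m_Z1
print(len(missing))  # 10
```

Under that assumption, the loop fails on the first per-instrument name, which matches the key reported in the KeyError.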
Versions
macOS-10.16-x86_64-i386-64bit
Python 3.10.6 (main, Oct 24 2022, 11:04:34) [Clang 12.0.0 ]
DoubleML 0.6.dev0
Scikit-Learn 1.1.3
SvenKlaassen commented
Thanks for reporting the issue.
It will be fixed with #182. The same bug occurred when calculating the RMSE for the nuisance functions. I will leave the issue open until the fix is merged into the dev version.