Error running simpleimputer_intro.py in the example
CLWcynthia opened this issue · 11 comments
When I ran simpleimputer_intro.py from the examples, the following error occurred:
Traceback (most recent call last):
  File "/Users/chen/PycharmProjects/test2/datamissing/examples/simpleimputer_intro.py", line 41, in <module>
    predictions = imputer.predict(df_test)
  File "/usr/local/lib/python3.7/site-packages/datawig/simple_imputer.py", line 420, in predict
    score_suffix, inplace=inplace)
  File "/usr/local/lib/python3.7/site-packages/datawig/imputer.py", line 822, in predict
    if data_frame.columns.contains(imputation_col):
AttributeError: 'Index' object has no attribute 'contains'
It could be a data processing error in the predict function.
This was due to a backwards-incompatible API change in pandas 1.0 and was addressed in this PR. Sorry that we have not released a new version yet.
A quick fix could be to pip install pandas==0.25.0 manually before installing datawig.
In a Jupyter notebook this could be done like this:
!pip install pandas==0.25.0
!pip install datawig
Does this solve the problem?
It works for the moment, but there are warnings about the future. I find it feasible to modify the following two lines in imputer.py:
if data_frame.columns.str.contains(imputation_col).any():
in line 822
if data_frame.columns.str.contains(imputation_proba_col).any():
in line 829
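One caveat about the two lines above: `Index.str.contains` does substring (regex) matching on each column name, whereas the removed `Index.contains` was a membership test. As far as I know, the pandas deprecation note for `Index.contains` pointed to `key in index` as the replacement. A minimal sketch of the difference, using a made-up DataFrame and column names (these are illustrative assumptions, not taken from datawig):

```python
import pandas as pd

# Hypothetical frame; the column names are invented for illustration only.
df = pd.DataFrame({"advice_id_imputed_extra": [1], "reason": ["a"]})
imputation_col = "advice_id_imputed"

# Exact membership test: is there a column with exactly this name?
exact_match = imputation_col in df.columns

# Substring (regex) test: does any column name contain this text?
substring_match = bool(df.columns.str.contains(imputation_col).any())

print(exact_match, substring_match)
```

With `.str.contains`, a column that merely includes the name as a substring would trigger the overwrite check, so `imputation_col in data_frame.columns` may be the closer equivalent of the original line.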
You're right, and we've fixed those lines in the PR I mentioned earlier; in particular, the lines you mentioned are now compliant with the new pandas API, see for instance here.
While the source code has been fixed for some time, that commit has not been released on pip yet. We'll make sure that this and some other mxnet-related fixes are released asap.
Thanks for noticing this!
Should be solved with the latest release; please reopen if the problem persists.
Hey, @felixbiessmann
It seems that this issue still persists. I get the initial error when trying to do:
imputer = datawig.SimpleImputer(
input_columns = ["advice", "reason", "reason_id"],
output_column = "advice_id"
)
imputer.fit(train_df = df3_train)
predictions = imputer.predict(df3_test)
Followed by error:
AttributeError Traceback (most recent call last)
<ipython-input-56-89a3fce1d6ab> in <module>
----> 1 predictions = imputer.predict(df3_test)
~/opt/anaconda3/lib/python3.8/site-packages/datawig/simple_imputer.py in predict(self, data_frame, precision_threshold, imputation_suffix, score_suffix, inplace)
417 :return: data_frame original dataframe with imputations and likelihood in additional column
418 """
--> 419 imputations = self.imputer.predict(data_frame, precision_threshold, imputation_suffix,
420 score_suffix, inplace=inplace)
421
~/opt/anaconda3/lib/python3.8/site-packages/datawig/imputer.py in predict(self, data_frame, precision_threshold, imputation_suffix, score_suffix, inplace)
820 for label, imputations in predictions:
821 imputation_col = label + imputation_suffix
--> 822 if data_frame.columns.contains(imputation_col):
823 raise ColumnOverwriteException(
824 "DataFrame contains column {}; remove column and try again".format(
AttributeError: 'Index' object has no attribute 'contains'
Following the advice from the thread above, I tried this locally in a notebook:
!pip install pandas==0.25.0
!pip install datawig
However, installing that pandas version fails, as it seems to have been deprecated. Is there another way to get around this? Thanks!
Hey,
Thanks for the heads-up. We're currently refactoring the package and there's a pending PR that should solve some of these problems, but it's at a preliminary stage. Until the next release I'd recommend using the old package versions.
Thanks
Felix
Okay, thank you @felixbiessmann - in the meantime, do you recommend a particular old package version?
Same problem. I am facing it too :(
I am still facing the same issue. Any advice?
Hi,
Is there an update on this issue? I am facing the same issue -
AttributeError: 'Index' object has no attribute 'contains'
I have pandas 1.2.4 and going back to 0.25.0 is not an option since it's been deprecated.
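For anyone who cannot downgrade pandas, one possible stopgap is to put the removed method back before calling `imputer.predict`. This is only a sketch and an assumption on my part, not something the maintainers suggested in this thread: the helper name `_index_contains` is made up, and it only addresses the `Index.contains` call shown in the traceback, so other incompatibilities with newer pandas may remain.

```python
import pandas as pd

# Stopgap shim: pandas 1.0 removed Index.contains, which the installed
# datawig release still calls. Re-add it as a plain membership test.
if not hasattr(pd.Index, "contains"):
    def _index_contains(self, key):
        # Hypothetical helper: True if `key` is one of the index labels.
        return key in self

    pd.Index.contains = _index_contains

# Quick check on a made-up frame that the method is available again.
columns = pd.DataFrame({"advice": ["x"], "reason": ["y"]}).columns
print(columns.contains("advice"))
print(columns.contains("advice_id_imputed"))
```

Run this once per session before importing or using datawig. It patches pandas globally, so it is best treated as a temporary measure until a fixed release is available.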