SUNCAT-Center/CatLearn

sklearn deprecated Imputer breaking `clean_data.py` module


The most recent version of sklearn (0.22) has removed the Imputer class from its former preprocessing.imputation location, and as a result importing clean_data.py fails with the following traceback:

~/TEMP/CatLearn/catlearn/preprocess/clean_data.py in <module>
      2 import numpy as np
      3 from collections import defaultdict
----> 4 from sklearn.preprocessing import Imputer
      5 from scipy.stats import skew
      6 

ImportError: cannot import name 'Imputer'

The following deprecation message is located in the old preprocessing.imputation file:

@deprecated("Imputer was deprecated in version 0.20 and will be "
            "removed in 0.22. Import impute.SimpleImputer from "
            "sklearn instead.")

It looks like simply replacing Imputer with SimpleImputer would be sufficient, but we should make sure that these classes are in fact equivalent before fixing.
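One cheap way to check part of that equivalence is to compare constructor signatures. The sketch below is only an illustration: `unsupported_kwargs` is a hypothetical helper, and the keyword arguments in the usage note are placeholders, not necessarily what clean_data.py passes. It checks that keywords are accepted, not that they behave the same. One documented difference worth testing for is that SimpleImputer has no `axis` argument and always imputes column-wise.

```python
import inspect


def unsupported_kwargs(cls, kwargs):
    """Return the keyword names in `kwargs` that `cls(...)` would reject.

    Hypothetical helper for illustration; it only inspects the constructor
    signature and says nothing about whether behaviour is identical.
    """
    params = inspect.signature(cls).parameters
    # A **kwargs catch-all means every keyword is accepted.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    return sorted(name for name in kwargs if name not in params)
```

With sklearn >= 0.20 installed, something like `unsupported_kwargs(SimpleImputer, {"strategy": "mean", "axis": 0})` should flag `axis`, so any call site that passes it needs adjusting.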

Making the following change, using SimpleImputer instead of the Imputer class, seems to work fine out of the box; the arguments passed to the Imputer class within the clean_data.py file don't need to be modified:

diff --git a/catlearn/preprocess/clean_data.py b/catlearn/preprocess/clean_data.py
index f96fce0..fed2119 100644
--- a/catlearn/preprocess/clean_data.py
+++ b/catlearn/preprocess/clean_data.py
@@ -1,7 +1,7 @@
 """Functions to clean data."""
 import numpy as np
 from collections import defaultdict
-from sklearn.preprocessing import Imputer
+from sklearn.impute import SimpleImputer
 from scipy.stats import skew
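If older sklearn releases still need to be supported, an alternative to the one-line change is a fallback import. This is a minimal sketch, not what the fix referenced in this thread does: `import_first`, `get_imputer_class` and `IMPUTER_CANDIDATES` are hypothetical names, and only the two import paths are taken from the issue.

```python
import importlib

# Candidate locations, newest first. Both paths are quoted in this issue.
IMPUTER_CANDIDATES = [
    ("sklearn.impute", "SimpleImputer"),   # available from sklearn 0.20
    ("sklearn.preprocessing", "Imputer"),  # removed in sklearn 0.22
]


def import_first(candidates):
    """Return the first attribute that imports from (module, name) pairs."""
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr_name)
        except (ImportError, AttributeError):
            # Module missing, or the name was removed: try the next one.
            continue
    raise ImportError("could not import any of: %r" % (candidates,))


def get_imputer_class():
    """Resolve whichever imputer class the installed sklearn provides."""
    return import_first(IMPUTER_CANDIDATES)
```

The caller would then write `Imputer = get_imputer_class()`. Because the two classes do not take identical arguments, the call site still has to stick to keywords that both accept.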

#89 fixes this issue.