Consistent handling of NaN/missing values
ml-evs opened this issue · 1 comments
ml-evs commented
We should check this throughout the whole featurization process. Do we consistently replace NaN with the same value? Here it is zero, but in cleaning it is -1.
Originally posted by @ppdebreuck in #23 (comment)
As suggested, we should probably move all this to clean_df, or handle it with a scikit-learn transformer/scaler.
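
For illustration, the scikit-learn option could look roughly like the sketch below. It is not the project's code: `NAN_FILL_VALUE` and `make_nan_imputer` are hypothetical names, and -1 is used only because it is one of the two values mentioned above. The idea is that every featurization and cleaning step takes its fill value from one place, so the zero and -1 cases cannot drift apart.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Hypothetical project-wide sentinel. The issue notes that both 0 and -1 are
# currently in use; which one to keep is a decision for the maintainers.
NAN_FILL_VALUE = -1.0


def make_nan_imputer(fill_value=NAN_FILL_VALUE):
    """Build an imputer that replaces every NaN with a single constant."""
    return SimpleImputer(
        missing_values=np.nan,
        strategy="constant",
        fill_value=fill_value,
    )


# Toy feature matrix with missing entries (no column is entirely NaN).
X = np.array([[1.0, np.nan], [np.nan, 0.5], [3.0, 2.0]])
X_filled = make_nan_imputer().fit_transform(X)
```

Because the imputer is an ordinary scikit-learn transformer, it could also be placed as a step in a `Pipeline` ahead of a scaler, which would keep the replacement value out of the individual featurizers.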