awslabs/datawig

Unify precision filter

felixbiessmann opened this issue · 0 comments

Precision filtering for categorical predictions appears to be done in two places in the code:

  • in imputer.predict, where below-threshold values are replaced by empty strings; here the resulting data frame has the same number of rows as the data frame that was passed to predict

  • in imputer.__filter_predictions, where below-threshold values are discarded; the resulting list can have fewer rows than the input, which leads to an error in imputer.predict

We should make sure filtering is done consistently, preferably without changing the size of the input data frame.
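
To illustrate the size-preserving behaviour, here is a minimal sketch of what a single shared filter might look like. It is not the current datawig code: the function name `filter_by_precision`, the `(label, score)` input shape, and the per-label `thresholds` dict are all assumptions made for this example.

```python
def filter_by_precision(predictions, thresholds, fill_value=""):
    """Blank out predictions whose score is below the per-label threshold.

    Hypothetical helper, not part of datawig.
    predictions: list of (label, score) tuples, one per input row.
    thresholds:  dict mapping label -> minimum score required to keep it.
    Returns a list with exactly one entry per input row.
    """
    filtered = []
    for label, score in predictions:
        # Assumption: labels without a threshold are always kept.
        if score >= thresholds.get(label, 0.0):
            filtered.append(label)
        else:
            # Replace instead of discarding, so the row count never changes.
            filtered.append(fill_value)
    return filtered


# Made-up example data: one prediction per input row.
preds = [("red", 0.95), ("blue", 0.40), ("green", 0.80)]
thresholds = {"red": 0.9, "blue": 0.7, "green": 0.75}

filtered = filter_by_precision(preds, thresholds)
print(filtered)  # ['red', '', 'green']
```

Because the output always has one entry per input row, it could be assigned directly as a column of the data frame passed to predict, and both code paths could call the same helper.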