yuenshingyan/MissForest

How to deal with categorical variables

brtang63 opened this issue · 2 comments

Thanks for sharing the implementation. I haven't figured out how you deal with categorical variables. Could you please tell me what form the input categorical variables should take? It seems to me they could be string labels, and that you apply one-hot encoding to them before imputation. I'm not sure if I understand this correctly. Thanks in advance for your help.

I agree, this is missing in the Readme.

  • Great package, thanks

The class method '_get_map_and_revmap' checks whether all values in the columns of the data are strings. If so, it constructs and returns two dictionaries, 'mappings' and 'rev_mappings'. 'mappings' is used to encode the categorical variables from strings to integers, and 'rev_mappings' is used to decode those integers back to the original strings.
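
To illustrate the idea, here is a minimal sketch of a string-to-integer mapping and its reverse. It is not the package's actual code: the standalone function 'get_map_and_revmap', the per-column check, the alphabetical ordering of the integer codes, and the example column names are all assumptions made for illustration.

```python
import pandas as pd


def get_map_and_revmap(df):
    """Hypothetical stand-in for '_get_map_and_revmap' (not the package's code)."""
    mappings, rev_mappings = {}, {}
    for col in df.columns:
        values = df[col].dropna()
        # Assumption: a column counts as categorical only if every
        # non-missing value in it is a string.
        if len(values) > 0 and all(isinstance(v, str) for v in values):
            categories = sorted(values.unique())
            mappings[col] = {cat: code for code, cat in enumerate(categories)}
            rev_mappings[col] = {code: cat for cat, code in mappings[col].items()}
    return mappings, rev_mappings


# Made-up example data: one string column and one numeric column, both with gaps.
df = pd.DataFrame({
    "color": ["red", "blue", None, "red"],
    "size": [1.0, 2.5, 3.0, None],
})
mappings, rev_mappings = get_map_and_revmap(df)

# Encode strings to integers before imputation; missing values stay missing.
encoded = df.copy()
for col, mapping in mappings.items():
    encoded[col] = encoded[col].map(mapping)

# Decode integers back to the original strings after imputation.
decoded = encoded.copy()
for col, rev in rev_mappings.items():
    decoded[col] = decoded[col].map(lambda code, rev=rev: rev.get(code))
```

In this sketch the categorical input is plain string labels, and the numeric column 'size' is left untouched because it gets no entry in either dictionary.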