ResidentMario/missingno

Use placeholder for NA in categorical variables

Closed issue · 1 comment

I have a dataset in which categorical variables with value "blank" are considered missing.
Is there any option for recognizing this kind of thing when plotting?

I think a parameter like the following could be useful:

  • na_values: str or list of str containing values that will be recognized as missing.

In my case, I would do something like this:
msno.matrix(source_df, na_values='blank')

This is a great example of something that missingno shouldn't do for you! Do it yourself instead, before passing your data to missingno:

msno.matrix(source_df.replace("blank", np.nan))

Replacing sentinel values with a true null type is one of the most common data preprocessing tasks; you should be doing this basically every time. 🙂
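
For anyone landing here with more than one placeholder value, here is a minimal sketch of the same preprocessing step using only pandas and NumPy. The sample data, the second sentinel "unknown", and the names `sentinels`, `cleaned_df`, and `missing_per_column` are made up for illustration; only "blank" comes from the question above.

```python
import numpy as np
import pandas as pd

# Hypothetical data: "blank" and "unknown" stand in for missing values.
source_df = pd.DataFrame({
    "color": ["red", "blank", "blue", "unknown"],
    "size": ["S", "M", "blank", "L"],
})

# Hypothetical list of sentinel strings to treat as missing.
sentinels = ["blank", "unknown"]

# Replace every sentinel with a true null so that null-aware tools see it.
cleaned_df = source_df.replace(sentinels, np.nan)

# Count the nulls per column to confirm the replacement worked.
missing_per_column = cleaned_df.isna().sum()

# cleaned_df can now be passed to missingno, e.g. msno.matrix(cleaned_df)
```

If the data is loaded from a CSV file, the `na_values` argument of `pd.read_csv` can do the same conversion at load time, which is probably the closest match to the parameter proposed in the question.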