fct_lump(): add an extra argument to remove Other
PursuitOfDataScience opened this issue · 3 comments
PursuitOfDataScience commented
fct_lump() should have an extra argument to remove the Other level. Currently, a separate filtering step is needed to remove it after lumping. It would be convenient if a future version of the forcats package added such a feature. Not a big deal, just a suggestion, as I need to do this most of the time. Thanks!
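For context, the filtering step mentioned above might look roughly like the sketch below. The data and the names `x`, `lumped`, and `kept` are made up for illustration, and `fct_lump_min()` stands in for whichever lumping function is actually used.

```r
library(forcats)

# Hypothetical data: "c" and "d" each occur once
x <- factor(c("a", "a", "a", "b", "b", "c", "d"))

# Lump levels that occur fewer than 2 times into "Other"
lumped <- fct_lump_min(x, min = 2)

# The extra step: drop the "Other" observations, then the unused level
kept <- droplevels(lumped[lumped != "Other"])
kept
#> [1] a a a b b
#> Levels: a b
```

Note that this removes the lumped observations entirely, so `kept` is shorter than `x`.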
hadley commented
Remove it and replace it with what?
huftis commented
Remove it and replace it with what?
I’m not the OP, but replace it with NA, perhaps? In theory, other = NA would work, but this generates an explicit NA level, which isn’t handled ‘correctly’ by is.na():
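library(forcats)
# A factor with a true missing value: NA is not one of the levels
x_with_na = factor(c("a", "a", NA))
# Lump the rare level "c" into NA; `other` partially matches `other_level`
x_with_other_na = fct_lump_min(c("a", "a", "c"), min = 2, other = NA)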
x_with_na
#> [1] a a <NA>
#> Levels: a
x_with_other_na
#> [1] a a <NA>
#> Levels: a <NA>
is.na(x_with_na)
#> [1] FALSE FALSE TRUE
is.na(x_with_other_na)
#> [1] FALSE FALSE FALSE
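One possible workaround, sketched in base R only: rebuilding the factor with `factor()`, whose default `exclude = NA` drops an explicit NA level, so those observations become ordinary missing values again. Here `addNA()` is used as a stand-in to create a factor shaped like `x_with_other_na`, and the names `x_explicit_na` and `x_real_na` are illustrative, not part of forcats.

```r
# Stand-in for x_with_other_na: addNA() creates an explicit NA level
x_explicit_na <- addNA(factor(c("a", "a", NA)))
is.na(x_explicit_na)
#> [1] FALSE FALSE FALSE

# Rebuild the factor; the default exclude = NA removes NA from the levels
x_real_na <- factor(x_explicit_na)
levels(x_real_na)
#> [1] "a"
is.na(x_real_na)
#> [1] FALSE FALSE  TRUE
```

Unlike filtering, this keeps the vector the same length, which may matter when the factor is a column in a data frame.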
hadley commented
It's not clear what is needed/wanted here, so I'm going to close the issue.