Make forcats 1.0.0 Tidyverse blog article more instructive
turbanisch opened this issue · 1 comment
turbanisch commented
Not sure if this is worth opening an issue, but I just went through the forcats 1.0.0 tidyverse blog article and found it mildly confusing at first glance. Introducing fct_na_value_to_level() into the plot doesn't do anything; the plot printed below is exactly the same.
We can make fct_infreq() do what we want by moving the NA from the values to the levels:
ggplot(starwars, aes(y = fct_rev(fct_infreq(fct_na_value_to_level(hair_color))))) +
  geom_bar() +
  labs(y = "Hair color")
I would find it more instructive to rename the NA level in the same step to show that ggplot will then properly adjust:
fct_na_value_to_level(hair_color, "missing")
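To make the suggestion concrete, here is a minimal sketch of the first plot with the NA level renamed in the same step. It assumes forcats 1.0.0 or later, with dplyr loaded only for the starwars data; the label "missing" and the names hair and p are arbitrary choices for illustration, not something the article uses.

```r
library(forcats)
library(dplyr)    # only needed here for the starwars dataset
library(ggplot2)

# Move NA from the values into an explicitly named level.
# "missing" is an arbitrary label chosen for this sketch.
hair <- fct_na_value_to_level(starwars$hair_color, level = "missing")

# The renamed level is now part of the factor, so fct_infreq() can sort it
# by frequency and ggplot2 can label it like any other level.
p <- ggplot(starwars, aes(y = fct_rev(fct_infreq(hair)))) +
  geom_bar() +
  labs(y = "Hair color")

p
```

With a visible label on the axis, it should be easier to see where the formerly missing values end up once fct_infreq() orders the levels.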
Otherwise it is easy to miss this bit, because it appears only further down, in the context of lumping factor levels together:
starwars |>
  mutate(
    hair_color = hair_color |>
      fct_na_value_to_level("(Unknown)") |>
      fct_infreq() |>
      fct_lump_min(2, other_level = "(Other)") |>
      fct_rev()
  ) |>
  ggplot(aes(y = hair_color)) +
  geom_bar() +
  labs(y = "Hair color")
cwdjankoski commented
I found the plots confusing as well. Basically, these two are the same, no?
ggplot(starwars, aes(y = fct_rev(fct_infreq(fct_na_value_to_level(hair_color))))) +
  geom_bar() +
  labs(y = "Hair color")
and this:
starwars |>
  mutate(
    hair_color = hair_color |>
      fct_na_value_to_level() |>
      fct_infreq() |>
      fct_rev()
  ) |>
  ggplot(aes(y = hair_color)) +
  geom_bar() +
  labs(y = "Hair color")
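One way to check this, as a rough sketch rather than anything from the article: build the factor both ways and compare the results directly. The names nested and piped are made up for this example, and the native pipe assumes R 4.1 or later.

```r
library(forcats)
library(dplyr)    # provides starwars, mutate() and pull()

# Nested form, as used inside aes() in the first snippet.
nested <- fct_rev(fct_infreq(fct_na_value_to_level(starwars$hair_color)))

# Piped form, as used inside mutate() in the second snippet.
piped <- starwars |>
  mutate(
    hair_color = hair_color |>
      fct_na_value_to_level() |>
      fct_infreq() |>
      fct_rev()
  ) |>
  pull(hair_color)

# Both forms apply the same three functions in the same order.
identical(nested, piped)
```

If the comparison returns TRUE, the two snippets feed ggplot the same factor, and only the way the code is written differs.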