`include_reference = TRUE` doesn't work in combination with `as.factor()`
snhansen opened this issue · 2 comments
snhansen commented
Shouldn't the following two examples yield the same output? It appears that the reference level isn't included when a variable is converted to a factor inside the model formula. You get the same behaviour when the explanatory variable is a character variable, probably because it's converted to a factor on the fly.
```r
mtcars |>
  dplyr::mutate(gear = factor(gear)) |>
  lm(mpg ~ gear, data = _) |>
  parameters::parameters() |>
  print(include_reference = TRUE)
#> Parameter   | Coefficient |   SE |         95% CI | t(29) |      p
#> ------------------------------------------------------------------
#> (Intercept) |       16.11 | 1.22 | [13.62, 18.59] | 13.25 | < .001
#> gear [3]    |        0.00 |      |                |       |
#> gear [4]    |        8.43 | 1.82 | [ 4.70, 12.16] |  4.62 | < .001
#> gear [5]    |        5.27 | 2.43 | [ 0.30, 10.25] |  2.17 | 0.038

lm(mpg ~ as.factor(gear), data = mtcars) |>
  parameters::parameters() |>
  print(include_reference = TRUE)
#> Parameter   | Coefficient |   SE |         95% CI | t(29) |      p
#> ------------------------------------------------------------------
#> (Intercept) |       16.11 | 1.22 | [13.62, 18.59] | 13.25 | < .001
#> gear [4]    |        8.43 | 1.82 | [ 4.70, 12.16] |  4.62 | < .001
#> gear [5]    |        5.27 | 2.43 | [ 0.30, 10.25] |  2.17 | 0.038
```
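The guess about on-the-fly conversion can be checked with base R alone. The following is a minimal sketch, not a description of how `parameters` works internally; the object names `fit_factor` and `fit_char` are made up for illustration. It shows that the coefficient names keep the full term label (for example `as.factor(gear)4`), that the reference level `3` gets no coefficient of its own, and that `lm()` records the levels of a character predictor just as it does for a factor.

```r
# Base R only; uses the built-in mtcars data set.
# fit_factor and fit_char are illustrative names, not part of any package.
fit_factor <- lm(mpg ~ as.factor(gear), data = mtcars)
fit_char   <- lm(mpg ~ as.character(gear), data = mtcars)

# Coefficient names carry the full term label; the reference level
# has no entry here, so it must be recovered from somewhere else.
names(coef(fit_factor))
names(coef(fit_char))

# lm() stores the levels of factor and character predictors in xlevels,
# keyed by the term label as written in the formula.
fit_factor$xlevels
fit_char$xlevels

# Both parameterisations give the same estimates.
all.equal(unname(coef(fit_factor)), unname(coef(fit_char)))
```

Because the reference level only shows up in `xlevels` under the label `as.factor(gear)` or `as.character(gear)`, any code that adds a reference row presumably has to match that label against the cleaned parameter name (`gear`), which may be where the two examples above diverge.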
strengejacke commented
`as.character()` can still be improved...
```r
lm(mpg ~ as.factor(gear) + factor(am) + hp, data = mtcars) |>
  parameters::parameters() |>
  print(include_reference = TRUE)
#> Parameter   | Coefficient |   SE |         95% CI | t(27) |      p
#> ------------------------------------------------------------------
#> (Intercept) |       27.48 | 1.97 | [23.43, 31.53] | 13.92 | < .001
#> gear [3]    |        0.00 |      |                |       |
#> gear [4]    |        0.08 | 1.83 | [-3.68,  3.83] |  0.04 | 0.967
#> gear [5]    |        2.39 | 2.38 | [-2.50,  7.29] |  1.00 | 0.324
#> am [0]      |        0.00 |      |                |       |
#> am [1]      |        4.14 | 1.81 | [ 0.42,  7.85] |  2.29 | 0.030
#> hp          |       -0.06 | 0.01 | [-0.09, -0.04] | -6.24 | < .001
#>
#> Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
#>   using a Wald t-distribution approximation.

lm(mpg ~ as.factor(gear) + as.character(am) + hp, data = mtcars) |>
  parameters::parameters() |>
  print(include_reference = TRUE)
#> Parameter            | Coefficient |   SE |         95% CI | t(27) |      p
#> ---------------------------------------------------------------------------
#> (Intercept)          |       27.48 | 1.97 | [23.43, 31.53] | 13.92 | < .001
#> gear [3]             |        0.00 |      |                |       |
#> gear [4]             |        0.08 | 1.83 | [-3.68,  3.83] |  0.04 | 0.967
#> gear [5]             |        2.39 | 2.38 | [-2.50,  7.29] |  1.00 | 0.324
#> am [0]               |        0.00 |      |                |       |
#> as character(am) [1] |        4.14 | 1.81 | [ 0.42,  7.85] |  2.29 | 0.030
#> hp                   |       -0.06 | 0.01 | [-0.09, -0.04] | -6.24 | < .001
#>
#> Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
#>   using a Wald t-distribution approximation.
```
Created on 2024-03-14 with reprex v2.1.0
bwiernik commented
I don't think `as.character()` is something to be concerned about, as it isn't really a modeling function.