System fit coefficient names are not used if two equations share coefficient names
Closed this issue · 6 comments
Relates to #1233
library(car)
library(systemfit)
library(marginaleffects)
set.seed(12345)
N <- 100
C <- matrix(c(1, .3, .3,
.3, 1, .3,
.3, .3, 1), 3, 3)
colnames(C) <- rownames(C) <- c("x1", "x2", "x3")
data <- as.data.frame(mvrnorm(N, rep(0, 3), C))
sys <- with(data, {
y1 <- 2 * x1 + 3 * x2 + x3 + 10 * rnorm(N)
m1 <- lm(y1 ~ x1 + x2 + x3)
y2 <- x1 + 3 * x2 + 2 * x3 + 10 * rnorm(N)
m2 <- lm(y2 ~ x1 + x2 + x3)
systemfit(list(formula(m1), formula(m2)), data = data)
})
h <- hypotheses(sys)
# This works
hypotheses(sys, "b3 = b7")
# This works
linearHypothesis(sys, "eq1_x2 = eq2_x2")
# This does not work
hypotheses(sys, "eq1_x2 = eq2_x2")
Hypothesis seems to work only with parameter indices, not parameter names. This should not be the case because using parameter indices is error-prone and does not make the intent of the code clear.
systemfit
objects just need a get_coef()
method for this to work. Alternatively, we could modify the default method of get_coef()
to paste in the Component
column of the insight::get_parameters()
output when it is present.
For now,
get_coef.systemfit <- function (model, ...) {
out <- insight::get_parameters(model, component = "all")
out <- stats::setNames(out$Estimate, paste(out$Component, out$Parameter, sep = "_"))
return(out)
}
should do the trick. You can run this interactively to get better labels before this is implemented in the package.
Oh, it looks like Vincent already fixed this in 9496606. Installing the development version should fix it.
I do not think that this has been fully addressed. Using my example, get_coef produces names with equation numbers
> get_coef(sys)
eq1_(Intercept) eq1_x1 eq1_x2 eq1_x3 eq2_(Intercept)
1.7242089 2.5062010 4.4522016 -2.3640198 -0.4534193
eq2_x1 eq2_x2 eq2_x3
-0.4228978 3.4664965 2.6820931
However, If I run hypotheses without hypothesis argument, the equation names are missing:
> hypotheses(sys)
Term Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 %
(Intercept) 1.724 0.952 1.811 0.07014 3.8 -0.142 3.59
x1 2.506 1.025 2.445 0.01449 6.1 0.497 4.52
x2 4.452 1.079 4.125 < 0.001 14.7 2.337 6.57
x3 -2.364 0.966 -2.447 0.01442 6.1 -4.258 -0.47
(Intercept) -0.453 0.911 -0.498 0.61867 0.7 -2.239 1.33
x1 -0.423 0.981 -0.431 0.66634 0.6 -2.345 1.50
x2 3.466 1.033 3.356 < 0.001 10.3 1.442 5.49
x3 2.682 0.925 2.901 0.00372 8.1 0.870 4.49
Columns: term, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high
Using the coefficient names from get_coef(sys) in a hypothesis still produces an error:
> hypotheses(sys, "eq1_x2 = eq2_x2")
Error: To use term names in a `hypothesis` string, the same function call without
`hypothesis` argument must produce a `term` column with unique row identifiers. You
can use `b1`, `b2`, etc. indices instead of term names in the `hypotheses` string Ex:
"b1 + b2 = 0" Alternatively, you can use the `newdata`, `variables`, or `by`
arguments:
mod <- lm(mpg ~ am * vs + cyl, data = mtcars)
comparisons(mod, newdata = "mean", hypothesis = "b1 = b2")
comparisons(mod, newdata = "mean", hypothesis = "am = vs")
comparisons(mod, variables = "am", by = "cyl", hypothesis = "pairwise")
Here is my session info
> sessionInfo()
R version 4.4.1 (2024-06-14)
Platform: aarch64-apple-darwin20
Running under: macOS 15.0.1
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: Europe/Helsinki
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] texreg_1.39.4 systemfit_1.1-30 lmtest_0.9-40
[4] zoo_1.8-12 Matrix_1.7-0 marginaleffects_0.23.0.2
[7] MASS_7.3-61 lattice_0.22-6 car_3.1-3
[10] carData_3.0-5 lubridate_1.9.3 forcats_1.0.0
[13] stringr_1.5.1 dplyr_1.1.4 purrr_1.0.2
[16] readr_2.1.5 tidyr_1.3.1 tibble_3.2.1
[19] ggplot2_3.5.1 tidyverse_2.0.0
loaded via a namespace (and not attached):
[1] sandwich_3.1-1 utf8_1.2.4 generics_0.1.3 stringi_1.8.4 hms_1.1.3
[6] magrittr_2.0.3 grid_4.4.1 timechange_0.3.0 backports_1.5.0 Formula_1.2-5
[11] httr_1.4.7 fansi_1.0.6 scales_1.3.0 abind_1.4-8 cli_3.6.3
[16] rlang_1.1.4 munsell_0.5.1 withr_3.0.1 tools_4.4.1 tzdb_0.4.0
[21] checkmate_2.3.2 colorspace_2.1-1 vctrs_0.6.5 R6_2.5.1 lifecycle_1.0.4
[26] insight_0.20.5.1 pkgconfig_2.0.3 pillar_1.9.0 gtable_0.3.5 glue_1.8.0
[31] data.table_1.16.2 Rcpp_1.0.13 tidyselect_1.2.1 rstudioapi_0.16.0 compiler_4.4.1
@mronkko can you try the latest github version from this morning?
This works with the current version. Thanks!