emmeans over geepack::geeglm accepts "vcov" but glmtoolbox::glmgee does not
Generalized opened this issue · 6 comments
Dear @rvlenth
I noticed that in the newest update of the emmeans the glmtoolbox is supported without the need for qdrg(). I thank you for it very, very much!
When I fit some model via geepack::geeglm and provide the vcov parameter, then emmeans recognizes the fact it's user-supplied:
mg <- geepack::geeglm(X~ Y* T+ ....., corstr = "exchangeable", id=Pat, data=dd)
emmeans(mg, specs = ~Y*T, vcov=my_estimator)
....
....
Covariance estimate used: user-supplied # here!
Confidence level used: 0.95
The message "user-supplied" confirms it recognized the v-cov estimator provided by the user.
Now I want to use it with glmtoolbox:
> emmeans(mg, specs = ~Y*T, vcov=vcov(mg, type = "bias-corrected"))
Y T emmean SE df lower.CL upper.CL
B 6 4.11 0.370 46 3.36 4.85
K 6 5.39 0.310 46 4.77 6.02
B 12 3.86 0.467 46 2.92 4.80
K 12 5.06 0.341 46 4.37 5.75
Confidence level used: 0.95
There's no message that user-provided vcov was used.
Let's use a different estimator
> emmeans(mg, specs = ~Y*T, vcov=vcov(mg, type = "model"))
Y T emmean SE df lower.CL upper.CL
B 6 4.11 0.370 46 3.36 4.85
K 6 5.39 0.310 46 4.77 6.02
B 12 3.86 0.467 46 2.92 4.80
K 12 5.06 0.341 46 4.37 5.75
Confidence level used: 0.95
No message, same SEs and CIs. Nothing changed.
So, while for geepack the user-supplied vcov parameter is recognized, for glmtoolbox it's not.
I need then the qdrg() to account for it:
mq <- qdrg(formula = Y*T, coef=coef(mg), vcov= vcov(mg, type = "bias-corrected"), data=dd, df = 46)
emmeans(mq, specs = ~Y*T)
Y T emmean SE df lower.CL upper.CL
B 6 4.11 0.410 46 3.28 4.93
K 6 5.39 0.358 46 4.67 6.12
B 12 3.86 0.526 46 2.80 4.92
K 12 5.06 0.393 46 4.27 5.85
Confidence level used: 0.95
vs.
mq <- qdrg(formula = Y*T, coef=coef(mg), vcov= vcov(mg, type = "model"), data=dd, df = 46)
emmeans(mq, specs = ~Y*T)
Y T emmean SE df lower.CL upper.CL
B 6 4.11 0.403 46 3.29 4.92
K 6 5.39 0.403 46 4.58 6.21
B 12 3.86 0.403 46 3.05 4.68
K 12 5.06 0.403 46 4.25 5.87
Confidence level used: 0.95
And indeed, now the SEs and CIs are different for different type of vcov().
I'm sorry for giving no data, but I'd need to anonymize it and make several changes. But you can use any single dataset and example, it's not data-specific. It's package-level specific.
Thanks for the report. I understand the concern about your confidential data. However, since you say that any single data set can reproduce this error, I would like to ask you to provide a reproducible example, preferably from datasets or one of the packages. That will ensure that I am working with a suitable example, and also verify that the bug indeed is not an artifact of your particular dataset.
Please find the dataset
d <- structure(list(ID = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L,
9L, 10L, 11L, 12L, 13L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L,
11L, 12L, 13L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L,
12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L,
25L, 26L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L,
25L, 26L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L,
25L, 26L), levels = c("1", "2", "3", "4", "5", "6", "7", "8",
"9", "10", "11", "12", "13", "101", "102", "103", "104", "105",
"106", "107", "108", "109", "110", "111", "112", "113"), class = "factor"),
Method = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), levels = c("B", "K"), class = "factor"),
Time = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), levels = c("1", "6", "12"
), class = "factor"), Response = c(5.5, 5.5, 6, 11.5, 7,
5.5, 7, 7.5, 12, 8, 8.5, 8.5, 8.5, 3, 4.5, 3, 4.5, 3, 4,
3.5, 3.5, 3.5, 8, 4.5, 6.5, 4, 6, 3, 3, 4, 2, 3, 3, 3, 3,
8, 4, 6, 4, 5.5, 5.5, 5, 6, 7, 5.5, 8, 5.5, 6.5, 5, 8, 8,
6, 5.5, 4.5, 4.5, 5.5, 6.5, 5.5, 4, 5, 5, 3, 6, 7.5, 5.5,
5.5, 3, 4, 5, 6, 5, 4, 5, 5, 3, 6, 7.5, 5), Baseline = c(5.5,
5.5, 6, 11.5, 7, 5.5, 7, 7.5, 12, 8, 8.5, 8.5, 8.5, 5.5,
5.5, 6, 11.5, 7, 5.5, 7, 7.5, 12, 8, 8.5, 8.5, 8.5, 5.5,
5.5, 6, 11.5, 7, 5.5, 7, 7.5, 12, 8, 8.5, 8.5, 8.5, 5.5,
5.5, 5, 6, 7, 5.5, 8, 5.5, 6.5, 5, 8, 8, 6, 5.5, 5.5, 5,
6, 7, 5.5, 8, 5.5, 6.5, 5, 8, 8, 6, 5.5, 5.5, 5, 6, 7, 5.5,
8, 5.5, 6.5, 5, 8, 8, 6)), row.names = c(NA, 78L), class = "data.frame")
For glmtoolbox + emmeans:
> mg <- glmtoolbox::glmgee(Response ~ Method * Time, corstr = "unstructured", id=ID, data=d)
> (mg_em_1 <- emmeans(mg, specs = ~ Method * Time, vcov = vcov(mg, type="robust")))
Method Time emmean SE df lower.CL upper.CL
B 1 7.77 0.562 72 6.65 8.89
K 1 6.27 0.301 72 5.67 6.87
B 6 4.27 0.391 72 3.49 5.05
K 6 5.23 0.301 72 4.63 5.83
B 12 4.00 0.449 72 3.11 4.89
K 12 4.92 0.330 72 4.26 5.58
Confidence level used: 0.95
> (mg_em_2 <- emmeans(mg, specs = ~ Method * Time, vcov = vcov(mg, type="model")))
Method Time emmean SE df lower.CL upper.CL
B 1 7.77 0.562 72 6.65 8.89
K 1 6.27 0.301 72 5.67 6.87
B 6 4.27 0.391 72 3.49 5.05
K 6 5.23 0.301 72 4.63 5.83
B 12 4.00 0.449 72 3.11 4.89
K 12 4.92 0.330 72 4.26 5.58
Confidence level used: 0.95
> identical(mg_em_1, mg_em_2)
[1] TRUE
For glmtoolbox + qdrg + emmeans:
> (mg_em_1 <- emmeans( qdrg(formula = Response ~ Method * Time,
+ coef=coef(mg), vcov= vcov(mg, type = "robust"), data=d),
+ specs = ~ Method * Time))
Method Time emmean SE df asymp.LCL asymp.UCL
B 1 7.77 0.562 Inf 6.67 8.87
K 1 6.27 0.301 Inf 5.68 6.86
B 6 4.27 0.391 Inf 3.50 5.04
K 6 5.23 0.301 Inf 4.64 5.82
B 12 4.00 0.449 Inf 3.12 4.88
K 12 4.92 0.330 Inf 4.28 5.57
Confidence level used: 0.95
>
> (mg_em_2 <- emmeans( qdrg(formula = Response ~ Method * Time,
+ coef=coef(mg), vcov= vcov(mg, type = "model"), data=d),
+ specs = ~ Method * Time))
Method Time emmean SE df asymp.LCL asymp.UCL
B 1 7.77 0.416 Inf 6.95 8.59
K 1 6.27 0.416 Inf 5.45 7.09
B 6 4.27 0.416 Inf 3.45 5.09
K 6 5.23 0.416 Inf 4.41 6.05
B 12 4.00 0.416 Inf 3.18 4.82
K 12 4.92 0.416 Inf 4.11 5.74
Confidence level used: 0.95
> identical(mg_em_1, mg_em_2)
[1] FALSE
For geepack::geeglm + emmeans it works from the emmeans level. geepack doesn't provide the vcov(type), but I can use one from the glmtoolbox to illustrate (I didn't want to push totally made up values)
> mg1 <- geepack::geeglm(Response ~ Method * Time, corstr = "ar1", id=ID, data=d)
> (mg_em_1 <- emmeans(mg1, specs = ~ Method * Time, vcov = vcov(mg, type="robust")))
Method Time emmean SE df asymp.LCL asymp.UCL
B 1 7.77 0.562 Inf 6.67 8.87
K 1 6.27 0.301 Inf 5.68 6.86
B 6 4.27 0.391 Inf 3.50 5.04
K 6 5.23 0.301 Inf 4.64 5.82
B 12 4.00 0.449 Inf 3.12 4.88
K 12 4.92 0.330 Inf 4.28 5.57
Covariance estimate used: user-supplied
Confidence level used: 0.95
> (mg_em_2 <- emmeans(mg1, specs = ~ Method * Time, vcov = vcov(mg, type="model")))
Method Time emmean SE df asymp.LCL asymp.UCL
B 1 7.77 0.416 Inf 6.95 8.59
K 1 6.27 0.416 Inf 5.45 7.09
B 6 4.27 0.416 Inf 3.45 5.09
K 6 5.23 0.416 Inf 4.41 6.05
B 12 4.00 0.416 Inf 3.18 4.82
K 12 4.92 0.416 Inf 4.11 5.74
Covariance estimate used: user-supplied
Confidence level used: 0.95
> identical(mg_em_1, mg_em_2)
[1] FALSE
This is what I get:
For mg (glmgee)
> emmeans(mg, specs = ~ Method * Time, vcov = vcov(mg, type="robust"))
Error in match.arg(type) : 'arg' must be NULL or a character vector
> emmeans(mg, specs = ~ Method * Time, vcov = "robust")
Method Time emmean SE df lower.CL upper.CL
B 1 7.77 0.562 72 6.65 8.89
K 1 6.27 0.301 72 5.67 6.87
B 6 4.27 0.391 72 3.49 5.05
K 6 5.23 0.301 72 4.63 5.83
B 12 4.00 0.449 72 3.11 4.89
K 12 4.92 0.330 72 4.26 5.58
Confidence level used: 0.95
> emmeans(mg, specs = ~ Method * Time, vcov = vcov(mg, type="model")))
Error: unexpected ')' in "emmeans(mg, specs = ~ Method * Time, vcov = vcov(mg, type="model")))"
> emmeans(mg, specs = ~ Method * Time, vcov = "model")
Method Time emmean SE df lower.CL upper.CL
B 1 7.77 0.416 72 6.94 8.60
K 1 6.27 0.416 72 5.44 7.10
B 6 4.27 0.416 72 3.44 5.10
K 6 5.23 0.416 72 4.40 6.06
B 12 4.00 0.416 72 3.17 4.83
K 12 4.92 0.416 72 4.09 5.75
Confidence level used: 0.95
So for me, the user-supplied covariances in your test don't work at all, but you can just supply the named covariance methods provided
For mg1 (geeglm)
> ## user-supplied covariances: as you show
> ## named covariance methods...
> emmeans(mg1, specs = ~ Method * Time, vcov.method = "robust")
Method Time emmean SE df asymp.LCL asymp.UCL
B 1 7.77 0.562 Inf 6.67 8.87
K 1 6.27 0.301 Inf 5.68 6.86
B 6 4.27 0.391 Inf 3.50 5.04
K 6 5.23 0.301 Inf 4.64 5.82
B 12 4.00 0.449 Inf 3.12 4.88
K 12 4.92 0.330 Inf 4.28 5.57
Covariance estimate used: vbeta
Confidence level used: 0.95
> emmeans(mg1, specs = ~ Method * Time, vcov = "model")
Method Time emmean SE df asymp.LCL asymp.UCL
B 1 7.77 0.562 Inf 6.67 8.87
K 1 6.27 0.301 Inf 5.68 6.86
B 6 4.27 0.391 Inf 3.50 5.04
K 6 5.23 0.301 Inf 4.64 5.82
B 12 4.00 0.449 Inf 3.12 4.88
K 12 4.92 0.330 Inf 4.28 5.57
Covariance estimate used: vbeta
Confidence level used: 0.95
So this is just the opposite! named covariance methods don't work for geeglm, and user-supplied ones don't work for glmgee. And I really, really wish these class names weren't so damn easy to confuse.
I think the real source of the problem is that ref_grid()
has an optional argument vcov.
(really part of the ...
argument), whereas the emm_basis
methods for these objects both have a vcov.method
argument, and there is masking going on.
I can't exaplin why you are getting the user-supplied covariances to work for geeglm objects. What version of emmeans are you using?
I've found an easy fix for glmgee
, but so far not providing a "user-supplied" message:
> emmeans(mg, specs = ~ Method * Time, vcov = vcov(mg, type="model"))
Method Time emmean SE df lower.CL upper.CL
B 1 7.77 0.416 72 6.94 8.60
K 1 6.27 0.416 72 5.44 7.10
B 6 4.27 0.416 72 3.44 5.10
K 6 5.23 0.416 72 4.40 6.06
B 12 4.00 0.416 72 3.17 4.83
K 12 4.92 0.416 72 4.09 5.75
Confidence level used: 0.95
> emmeans(mg, specs = ~ Method * Time, vcov = vcov(mg, type="robust"))
Method Time emmean SE df lower.CL upper.CL
B 1 7.77 0.562 72 6.65 8.89
K 1 6.27 0.301 72 5.67 6.87
B 6 4.27 0.391 72 3.49 5.05
K 6 5.23 0.301 72 4.63 5.83
B 12 4.00 0.449 72 3.11 4.89
K 12 4.92 0.330 72 4.26 5.58
Confidence level used: 0.95
OK, I got it through my thick skull that with geeglm
objects, vcov.method = <character value>
matches an internal slot whereas with glmgee
, vcov.method = <character value>
is passed as the type
argument to vcov()
. So I think everything works as it should now (at least by one definition of "should": handling character values per code design, non-character as the internal vcov.
argument). My trouble is I don't really use GEE methods and geepack is undocumented as far as vcov()
methods go. EWe now have annotations for glmgee.
Some examples...
Text values
> ## geeglm...
> emmeans(mg1, specs = ~ Method * Time, vcov = "robust")
Method Time emmean SE df asymp.LCL asymp.UCL
B 1 7.77 0.562 Inf 6.67 8.87
K 1 6.27 0.301 Inf 5.68 6.86
B 6 4.27 0.391 Inf 3.50 5.04
K 6 5.23 0.301 Inf 4.64 5.82
B 12 4.00 0.449 Inf 3.12 4.88
K 12 4.92 0.330 Inf 4.28 5.57
Covariance estimate used: vbeta
Confidence level used: 0.95
> emmeans(mg1, specs = ~ Method * Time, vcov = "naive")
Method Time emmean SE df asymp.LCL asymp.UCL
B 1 7.77 0.4 Inf 6.99 8.55
K 1 6.27 0.4 Inf 5.49 7.05
B 6 4.27 0.4 Inf 3.49 5.05
K 6 5.23 0.4 Inf 4.45 6.01
B 12 4.00 0.4 Inf 3.22 4.78
K 12 4.92 0.4 Inf 4.14 5.71
Covariance estimate used: vbeta.naiv
Confidence level used: 0.95
> ## glmgee...
> emmeans(mg, specs = ~ Method * Time, vcov = "model")
Method Time emmean SE df lower.CL upper.CL
B 1 7.77 0.416 72 6.94 8.60
K 1 6.27 0.416 72 5.44 7.10
B 6 4.27 0.416 72 3.44 5.10
K 6 5.23 0.416 72 4.40 6.06
B 12 4.00 0.416 72 3.17 4.83
K 12 4.92 0.416 72 4.09 5.75
Covariance estimate used: model
Confidence level used: 0.95
> emmeans(mg, specs = ~ Method * Time, vcov = "robust")
Method Time emmean SE df lower.CL upper.CL
B 1 7.77 0.562 72 6.65 8.89
K 1 6.27 0.301 72 5.67 6.87
B 6 4.27 0.391 72 3.49 5.05
K 6 5.23 0.301 72 4.63 5.83
B 12 4.00 0.449 72 3.11 4.89
K 12 4.92 0.330 72 4.26 5.58
Covariance estimate used: robust
Confidence level used: 0.95
Matrix values
> emmeans(mg1, specs = ~ Method * Time, vcov = vcov(mg, type = "jackknife"))
Method Time emmean SE df asymp.LCL asymp.UCL
B 1 7.77 0.608 Inf 6.58 8.96
K 1 6.27 0.326 Inf 5.63 6.91
B 6 4.27 0.423 Inf 3.44 5.10
K 6 5.23 0.326 Inf 4.59 5.87
B 12 4.00 0.486 Inf 3.05 4.95
K 12 4.92 0.358 Inf 4.22 5.62
Covariance estimate used: user-supplied
Confidence level used: 0.95
> emmeans(mg, specs = ~ Method * Time, vcov = vcov(mg, type = "jackknife"))
Method Time emmean SE df lower.CL upper.CL
B 1 7.77 0.608 72 6.56 8.98
K 1 6.27 0.326 72 5.62 6.92
B 6 4.27 0.423 72 3.43 5.11
K 6 5.23 0.326 72 4.58 5.88
B 12 4.00 0.486 72 3.03 4.97
K 12 4.92 0.358 72 4.21 5.64
Covariance estimate used: user-supplied
Confidence level used: 0.95
I think this is resolved, so am closing this issue