Comparing each level to a weighted mean of all other levels
walrossker opened this issue · 5 comments
In a one-way ANOVA, I want to compare each level's mean to the mean of all other samples ignoring group/level. My understanding is that the del.eff
contrast method compares each level with the "average of averages" of all other levels. I thought the weights = "proportional"
option might do what I'm aiming for, but it looks like a one-way ANOVA returns the same output regardless of weighting:
library(emmeans)
# Create data
dat <- mtcars
dat$cyl <- factor(dat$cyl)
# Run ANOVA
mod <- aov(mpg ~ cyl, data = dat)
# emmeans w/ equal weighting
emmeans(mod, del.eff ~ cyl, weights = "equal")
#> $emmeans
#> cyl emmean SE df lower.CL upper.CL
#> 4 26.7 0.972 29 24.7 28.7
#> 6 19.7 1.218 29 17.3 22.2
#> 8 15.1 0.861 29 13.3 16.9
#>
#> Confidence level used: 0.95
#>
#> $contrasts
#> contrast estimate SE df t.ratio p.value
#> cyl4 effect 9.24 1.23 29 7.544 <.0001
#> cyl6 effect -1.14 1.38 29 -0.825 0.4161
#> cyl8 effect -8.10 1.16 29 -6.976 <.0001
#>
#> P value adjustment: fdr method for 3 tests
# emmeans w/ proportional weighting
emmeans(mod, del.eff ~ cyl, weights = "proportional")
#> $emmeans
#> cyl emmean SE df lower.CL upper.CL
#> 4 26.7 0.972 29 24.7 28.7
#> 6 19.7 1.218 29 17.3 22.2
#> 8 15.1 0.861 29 13.3 16.9
#>
#> Confidence level used: 0.95
#>
#> $contrasts
#> contrast estimate SE df t.ratio p.value
#> cyl4 effect 9.24 1.23 29 7.544 <.0001
#> cyl6 effect -1.14 1.38 29 -0.825 0.4161
#> cyl8 effect -8.10 1.16 29 -6.976 <.0001
#>
#> P value adjustment: fdr method for 3 tests
# emmeans w/ cell weighting
emmeans(mod, del.eff ~ cyl, weights = "cell")
#> $emmeans
#> cyl emmean SE df lower.CL upper.CL
#> 4 26.7 0.972 29 24.7 28.7
#> 6 19.7 1.218 29 17.3 22.2
#> 8 15.1 0.861 29 13.3 16.9
#>
#> Confidence level used: 0.95
#>
#> $contrasts
#> contrast estimate SE df t.ratio p.value
#> cyl4 effect 9.24 1.23 29 7.544 <.0001
#> cyl6 effect -1.14 1.38 29 -0.825 0.4161
#> cyl8 effect -8.10 1.16 29 -6.976 <.0001
#>
#> P value adjustment: fdr method for 3 tests
I can specify a custom contrast to get the tests I want (i.e., 10 is the mean difference for cyl == 4
that I was looking for, not the 9.24 above):
# Get the proportions for each level excluding focal level (i.e., cyl == 4)
props <- prop.table(table(dat[!dat$cyl == "4", "cyl"]))
# Run contrast comparing cyl == 4 mean to prop-weighted mean of other levels
cyl_4_contr <- c(1, 0, 0)
all_but_cyl_4_contr <- c(0, props["6"], props["8"])
emmeans(mod, "cyl", contr = list(cyl_4_contr - all_but_cyl_4_contr))
#> $emmeans
#> cyl emmean SE df lower.CL upper.CL
#> 4 26.7 0.972 29 24.7 28.7
#> 6 19.7 1.218 29 17.3 22.2
#> 8 15.1 0.861 29 13.3 16.9
#>
#> Confidence level used: 0.95
#>
#> $contrasts
#> contrast estimate SE df
#> c(1, "6" = -0.333333333333333, "8" = -0.666666666666667) 10 1.2 29
#> t.ratio p.value
#> 8.349 <.0001
...but this would require a wrapper function to iterate over all levels. So I'm curious if emmeans
can already do this in another way or if you have any other recommendations. Would also like to be able to do this for the "eff" method (i.e., compare each level to the overall mean ignoring group/level). Thanks
This looks very similar to #346, so check that out.
I think your confusion is that weights
is an argument for emmeans()
to specify what kind of averaging to use when computing marginal means when averaging over other factors. Here, you have only the one factor. Moreover, weights
is not an argument to contrast()
.
I also would add that I spend a lot of time regretting that I provided that two-sided formula interface, because it has you doing two different things: means and contrasts. That creates confusion because it's not always obvious which of those tasts is affected by optional arguments that you provide. I strongly recommend keeping means and contrasts separate.
Here's an approach to your example for obtaining eff
-style contrasts against the weighted grand mean.
First, get the means and the weighted grand mean:
> emm = emmeans(mod, ~ cyl)
> gm = emmeans(emm, ~ 1, weights = "prop")
Then combine those into one object:
> ( tmp = rbind(emm, gm) )
cyl 1 emmean SE df lower.CL upper.CL
4 . 26.7 0.972 29 24.1 29.3
6 . 19.7 1.218 29 16.5 23.0
8 . 15.1 0.861 29 12.8 17.4
. overall 20.1 0.570 29 18.6 21.6
Results are averaged over some or all of the levels of: cyl
Confidence level used: 0.95
Conf-level adjustment: bonferroni method for 4 estimates
Finally, contrast each of these with the last one:
> contrast(tmp, "trt.vs.ctrlk")
contrast estimate SE df t.ratio p.value
4 . - . overall 6.573 0.787 29 8.349 <.0001
6 . - . overall -0.348 1.077 29 -0.323 0.9576
8 . - . overall -4.991 0.646 29 -7.725 <.0001
Results are averaged over some or all of the levels of: cyl
P value adjustment: dunnettx method for 3 tests
It's harder doing the del.eff
ones, but see that other issue.
Note: Making gm
is an instance where weights
makes a difference. Without weights
, we get
> emmeans(emm, ~1)
1 emmean SE df lower.CL upper.CL
overall 20.5 0.594 29 19.3 21.7
Results are averaged over the levels of: cyl
Confidence level used: 0.95
The weighted grand mean is 20.1
Hmmm, just this same day, I ran across the contrMat()
function in multcomp package. I think del.eff
corresponds to type = "AVE"
and eff
corresponds to type = "GrandMean"
, though I don't readily see where this is documented.
In any case, you need to transpose the matrix this function produces. Example:
> library(multcomp)
> wts = emm@grid$.wgt.
> mth = data.frame(t(contrMat(wts, "AVE")))
> mth
C.1 C.2 C.3
4 1.0000000 -0.44 -0.6111111
6 -0.3333333 1.00 -0.3888889
8 -0.6666667 -0.56 1.0000000
> contrast(emm, mth)
contrast estimate SE df t.ratio p.value
C.1 10.016 1.20 29 8.349 <.0001
C.2 -0.445 1.38 29 -0.323 0.7490
C.3 -8.872 1.15 29 -7.725 <.0001
Using data.frame(t(X))
transposes the matrix and creates a data frame, which is really a list of its columns, so it works with the method
argument of contrast()
.
Well, if I don't feel like a fool. It would help both you and me to read the documentation. There is already an optional wts
argument in del.eff.emmc
and eff.emmc
. (I think I added that feature in response to #346, actually.) All you have to do is use 'em!
> w = emm@grid$.wgt.
> contrast(emm, "del.eff", wts = w)
contrast estimate SE df t.ratio p.value
cyl4 effect 10.016 1.20 29 8.349 <.0001
cyl6 effect -0.445 1.38 29 -0.323 0.7490
cyl8 effect -8.872 1.15 29 -7.725 <.0001
P value adjustment: fdr method for 3 tests
> contrast(emm, "eff", wts = w)
contrast estimate SE df t.ratio p.value
cyl4 effect 6.573 0.787 29 8.349 <.0001
cyl6 effect -0.348 1.077 29 -0.323 0.7490
cyl8 effect -4.991 0.646 29 -7.725 <.0001
P value adjustment: fdr method for 3 tests
In the next update, I am adding a weights()
method so we can use weights(emm)
instead of what I did above for w
.
I think this is resolved, so am closing this issue
@rvlenth Thanks for looking into this! That’s already a good solution, and will keep an eye out for the update