rvlenth/emmeans

emmeans gives wrong results for `afex_aov`s if the `at = ` argument is reversed

Joao-O-Santos opened this issue · 4 comments

Describe the bug

When I run emmeans(model, ~ Var, at = list(Var = c("levelx", "levely"))) the results get switched for afex_aovs if I place levelx and levely in a different order, but this doesn't happen for models of class afex_mixed, and it probably won't happen for other models.

To reproduce

library(emmeans)

df <- data.frame(a = rnorm(300), b = rnorm(300, 3), pp = 1:300)
df <- tidyr::pivot_longer(df, -"pp")
afex_aov <- afex::aov_4(value ~ name + (name | pp), df)
afex_mixed <- afex::mixed(value ~ name + (1 | pp), df)

ordered <- list(name = c("a", "b"))
reversed <- list(name = c("b", "a")

Expected behavior

If we list the factors in the order they appear we always get correct results.

emmeans(afex_aov, ~ name, at = ordered)
emmeans(afex_mixed, ~ name, at = ordered)

If we reverse the names of the factors we get wrong values in for the model of class afex_aov.

emmeans(afex_aov, ~ name, at = reversed)
emmeans(afex_mixed, ~ name, at = reversed)

How can I help?

Thank you so much for developing emmeans it is such an amazing package!
I noticed in the README that you are looking for another maintainer, I'm afraid I can't take that burden right now, but let me know if there's any way I can help, with this bug, or with other smaller tasks.

Sincery,
J

P.S: I checked the ground rules.

Ground rules

Make sure that you have used things as they are documented.

It might be user error on my part, but I found no documentation about having to list the labels in the correct order for afex_aovs but not for other models.

More than one or two pipes is usually too many.

Check. I used no pipes.

Did you know that there is an index of vignette topics?

I looked up the vignettes and found no mention of this, but I might have missed something.

Please examine the output from what you have tried.

There are no annotations below the output.

I also looked at the afex package to see if it was implementing this functionality.
I couldn't tell for sure if that was the case, so I submitted this report and cross-posted it in the afex package as well.

Hmmmm. The issue isn't really in afex per se, it is in how multivariate models are handled. Specifically, the at argument comes into play twice: once when we select which predictor levels to use, and again when we select which multivariate levels to keep. I'll do a parallel example using one of emmeans's provided datasets:

library(emmeans)
MOats.lm <- lm (yield ~ Block + Variety, data = MOats)

# All levels of Variety
emmeans(MOats.lm, "Variety")
##  Variety     emmean   SE df lower.CL upper.CL
##  Golden Rain  104.5 5.01 10     93.3      116
##  Marvellous   109.8 5.01 10     98.6      121
##  Victory       97.6 5.01 10     86.5      109
## 
## Results are averaged over the levels of: Block, rep.meas 
## Confidence level used: 0.95

# Two levels in order
emmeans(MOats.lm, "Variety", at = list(Variety = c("Golden Rain", "Victory")))
##  Variety     emmean   SE df lower.CL upper.CL
##  Golden Rain  104.5 5.01 10     93.3      116
##  Victory       97.6 5.01 10     86.5      109
## 
## Results are averaged over the levels of: Block, rep.meas 
## Confidence level used: 0.95

# Two levels in reverse order
emmeans(MOats.lm, "Variety", at = list(Variety = rev(c("Golden Rain", "Victory"))))
##  Variety     emmean   SE df lower.CL upper.CL
##  Victory       97.6 5.01 10     86.5      109
##  Golden Rain  104.5 5.01 10     93.3      116
## 
## Results are averaged over the levels of: Block, rep.meas 
## Confidence level used: 0.95

########################################################
# All levels of rep.meas
emmeans(MOats.lm, "rep.meas")
##  rep.meas emmean   SE df lower.CL upper.CL
##  0          79.4 3.20 10     72.3     86.5
##  0.2        98.9 3.81 10     90.4    107.4
##  0.4       114.2 5.02 10    103.0    125.4
##  0.6       123.4 4.22 10    114.0    132.8
## 
## Results are averaged over the levels of: Block, Variety 
## Confidence level used: 0.95

# Three levels in order
emmeans(MOats.lm, "rep.meas", at = list(rep.meas = c(0,.2,.6)))
##  rep.meas emmean   SE df lower.CL upper.CL
##       0.0   79.4 3.20 10     72.3     86.5
##       0.2   98.9 3.81 10     90.4    107.4
##       0.6  123.4 4.22 10    114.0    132.8
## 
## Results are averaged over the levels of: Block, Variety 
## Confidence level used: 0.95

# Three levels in reverse order
emmeans(MOats.lm, "rep.meas", at = list(rep.meas = rev(c(0,.2,.6))))
##  rep.meas emmean   SE df lower.CL upper.CL
##       0.6   79.4 3.20 10     72.3     86.5
##       0.2   98.9 3.81 10     90.4    107.4
##       0.0  123.4 4.22 10    114.0    132.8
## 
## Results are averaged over the levels of: Block, Variety 
## Confidence level used: 0.95

Created on 2023-07-13 with reprex v2.0.2

What we see is that at is working right for Variety, which is a predictor, but not for rep.meas, which is a multivariate ressponse. I'll try to fix this.

Thanks for reporting this. I'll post again when I have fixed it.

The code I had worked by excluding the levels not used, but not re-ordering the rest of it. I re-coded this section so it picks out the cases that match the specified levels. It seems to work right now:

library(emmeans)
MOats.lm <- lm (yield ~ Block + Variety, data = MOats)

# All levels of rep.meas
emmeans(MOats.lm, "rep.meas")
##  rep.meas emmean   SE df lower.CL upper.CL
##  0          79.4 3.20 10     72.3     86.5
##  0.2        98.9 3.81 10     90.4    107.4
##  0.4       114.2 5.02 10    103.0    125.4
##  0.6       123.4 4.22 10    114.0    132.8
## 
## Results are averaged over the levels of: Block, Variety 
## Confidence level used: 0.95

# Three levels in order
emmeans(MOats.lm, "rep.meas", at = list(rep.meas = c(0,.2,.6)))
##  rep.meas emmean   SE df lower.CL upper.CL
##       0.0   79.4 3.20 10     72.3     86.5
##       0.2   98.9 3.81 10     90.4    107.4
##       0.6  123.4 4.22 10    114.0    132.8
## 
## Results are averaged over the levels of: Block, Variety 
## Confidence level used: 0.95

# Three levels in reverse order
emmeans(MOats.lm, "rep.meas", at = list(rep.meas = rev(c(0,.2,.6))))
##  rep.meas emmean   SE df lower.CL upper.CL
##       0.6  123.4 4.22 10    114.0    132.8
##       0.2   98.9 3.81 10     90.4    107.4
##       0.0   79.4 3.20 10     72.3     86.5
## 
## Results are averaged over the levels of: Block, Variety 
## Confidence level used: 0.95

# All levels plus an illegal one, haphazard order
emmeans(MOats.lm, "rep.meas", at = list(rep.meas = c(.2,.6,.4,.75,0)))
##  rep.meas emmean   SE df lower.CL upper.CL
##       0.2   98.9 3.81 10     90.4    107.4
##       0.6  123.4 4.22 10    114.0    132.8
##       0.4  114.2 5.02 10    103.0    125.4
##       0.0   79.4 3.20 10     72.3     86.5
## 
## Results are averaged over the levels of: Block, Variety 
## Confidence level used: 0.95

Created on 2023-07-13 with reprex v2.0.2

I'll push this up soon.

Thank you, so, so much for being on top of this and for fixing it so fast, Prof. Lenth!

P.S: feel free to close the issue.

Closing this issue as resolved