rvlenth/emmeans

emmeans is giving oposite results for averaging than for each of the 8 models

Closed this issue · 11 comments

Hi Russell, I found an error diging more into the averaging issues, and I am still not sure if is a emmeans error or a MuMIn error, but I thought it would be good to notify you so this is the same dataset and analysis, again all data is on this repo this repo

first we load the needed r packages and the dataset

library(emmeans)
library(lme4)
library(MuMIn)
library(doParallel)

Data <- readRDS("Data.rds")

Now we fit a general model:

Model <- glmer(richness ~ aspect + elevation +
  initial_habitat +
  I(abs(year - 1)) +
  I((year - 1)^2) +
  slope +
  treatment:initial_habitat +
  year:initial_habitat +
  year:treatment +
  year:treatment:initial_habitat +
  (1 | block_no), family = poisson, data = Data, control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 100000)))

And we make a model selection (skip this step since it takes a while, the “SelectRichness.rds” file is in this github)

options(na.action = "na.fail")

library(doParallel)
cl <- makeCluster(4)
registerDoParallel(cl)

clusterEvalQ(cl, library(lme4))

clusterExport(cl, "Data")

Select <- MuMIn::pdredge(Model, extra = list(R2m = function(x) r.squaredGLMM(x)[1, 1], R2c = function(x) r.squaredGLMM(x)[1, 2]), fixed = ~ YEAR:Treatment, cluster = cl)

saveRDS(Select, "SelectRichness.rds")

And now we Select the best models, I will do this twice, since the outcome of the subset function will be used to show how the best model does not have issues and the averaged model from get.models which is the result I need is not working

Select <- readRDS("SelectRichness.rds")
SelectedList <- get.models(Select, delta < 2)

Lets see all models

So now, if we look at each model we get this

SelectedList <- get.models(Select, delta < 2)


# Calculate emmeans fore each of the 8 models

ListEMM <- SelectedList |>
  purrr::map(~ emmeans(.x, ~ year * initial_habitat * treatment, at = list(year = 4), data = Data, type = "response"))

# Then get the pairs table for each model

TablePairs <- ListEMM |>
  purrr::map(~ pairs(.x, simple = "treatment")) |>
  purrr::map(as.data.frame)
## add a model name
for (i in 1:length(TablePairs)) {
  TablePairs[[i]]$Model <- paste("Model", i)
}

# Join all tables and show it

TablePairs <- TablePairs |> purrr::reduce(rbind)

As you can see in this table we have significant values for all comparisons, where all ratios are bellow 1, pointing at PermanentExclosure having lower richness than Control

contrast year initial_habitat ratio SE df null z.ratio p.value Model
PermanentExclosure / Control 4 Forest 0.8030333 0.0422571 Inf 1 -4.168592 3.06e-05 Model 1
PermanentExclosure / Control 4 Meadow 0.8030333 0.0422571 Inf 1 -4.168592 3.06e-05 Model 1
PermanentExclosure / Control 4 Rangeland 0.8030333 0.0422571 Inf 1 -4.168592 3.06e-05 Model 1
PermanentExclosure / Control 4 Forest 0.7964861 0.0420561 Inf 1 -4.309411 1.64e-05 Model 2
PermanentExclosure / Control 4 Meadow 0.7964861 0.0420561 Inf 1 -4.309411 1.64e-05 Model 2
PermanentExclosure / Control 4 Rangeland 0.7964861 0.0420561 Inf 1 -4.309411 1.64e-05 Model 2
PermanentExclosure / Control 4 Forest 0.8031875 0.0421844 Inf 1 -4.172918 3.01e-05 Model 3
PermanentExclosure / Control 4 Meadow 0.8031875 0.0421844 Inf 1 -4.172918 3.01e-05 Model 3
PermanentExclosure / Control 4 Rangeland 0.8031875 0.0421844 Inf 1 -4.172918 3.01e-05 Model 3
PermanentExclosure / Control 4 Forest 0.7966193 0.0419842 Inf 1 -4.314336 1.60e-05 Model 4
PermanentExclosure / Control 4 Meadow 0.7966193 0.0419842 Inf 1 -4.314336 1.60e-05 Model 4
PermanentExclosure / Control 4 Rangeland 0.7966193 0.0419842 Inf 1 -4.314336 1.60e-05 Model 4
PermanentExclosure / Control 4 Forest 0.8006527 0.0422373 Inf 1 -4.214460 2.50e-05 Model 5
PermanentExclosure / Control 4 Meadow 0.8006527 0.0422373 Inf 1 -4.214460 2.50e-05 Model 5
PermanentExclosure / Control 4 Rangeland 0.8006527 0.0422373 Inf 1 -4.214460 2.50e-05 Model 5
PermanentExclosure / Control 4 Forest 0.8039527 0.0423231 Inf 1 -4.145121 3.40e-05 Model 6
PermanentExclosure / Control 4 Meadow 0.8039527 0.0423231 Inf 1 -4.145121 3.40e-05 Model 6
PermanentExclosure / Control 4 Rangeland 0.8039527 0.0423231 Inf 1 -4.145121 3.40e-05 Model 6
PermanentExclosure / Control 4 Forest 0.7939884 0.0420386 Inf 1 -4.357007 1.32e-05 Model 7
PermanentExclosure / Control 4 Meadow 0.7939884 0.0420386 Inf 1 -4.357007 1.32e-05 Model 7
PermanentExclosure / Control 4 Rangeland 0.7939884 0.0420386 Inf 1 -4.357007 1.32e-05 Model 7
PermanentExclosure / Control 4 Forest 0.8008168 0.0421648 Inf 1 -4.218685 2.46e-05 Model 8
PermanentExclosure / Control 4 Meadow 0.8008168 0.0421648 Inf 1 -4.218685 2.46e-05 Model 8
PermanentExclosure / Control 4 Rangeland 0.8008168 0.0421648 Inf 1 -4.218685 2.46e-05 Model 8

Working with the average model gives oposite result

If we go with the model averaging, we see the opposite result

AV <- model.avg(SelectedList, fit = TRUE, random = TRUE)

noise.emm <- emmeans(AV, ~ year * initial_habitat * treatment, at = list(year = 4), data = Data, type = "response")

AverageTable <- pairs(noise.emm, simple = "treatment")

We can see that the table gives the opposite result, where all comparisons have a ration over 1 with PermanentExclosure having higher richness than control, but it is not significant.

knitr::kable(AverageTable)
contrast year initial_habitat ratio SE df null t.ratio p.value
PermanentExclosure / Control 4 Forest 1.070482 0.2635936 233 1 0.2765966 0.7823351
PermanentExclosure / Control 4 Meadow 1.070482 0.2635936 233 1 0.2765966 0.7823351
PermanentExclosure / Control 4 Rangeland 1.070482 0.2635936 233 1 0.2765966 0.7823351
Standard output and standard error `/home/au687614/Documents/emmeansProblem/Problem_std_out_err.txt`
Session info
sessioninfo::session_info()
#;-) ─ Session info ───────────────────────────────────────────────────────────
#;-)  setting  value
#;-)  version  R version 4.2.2 Patched (2022-11-10 r83330)
#;-)  os       Ubuntu 20.04.5 LTS
#;-)  system   x86_64, linux-gnu
#;-)  ui       RStudio
#;-)  language en_US:en
#;-)  collate  en_US.UTF-8
#;-)  ctype    en_US.UTF-8
#;-)  tz       Europe/Copenhagen
#;-)  date     2023-03-13
#;-)  rstudio  2022.07.2+576 Spotted Wakerobin (desktop)
#;-)  pandoc   2.19.2 @ /usr/lib/rstudio/bin/quarto/bin/tools/ (via rmarkdown)
#;-) 
#;-) ─ Packages ───────────────────────────────────────────────────────────────
#;-)  package      * version     date (UTC) lib source
#;-)  boot           1.3-28      2021-05-03 [4] CRAN (R 4.0.5)
#;-)  bslib          0.4.2       2022-12-16 [1] CRAN (R 4.2.2)
#;-)  cachem         1.0.7       2023-02-24 [3] CRAN (R 4.2.2)
#;-)  cli            3.6.0       2023-01-09 [1] CRAN (R 4.2.2)
#;-)  cluster        2.1.4       2022-08-22 [4] CRAN (R 4.2.1)
#;-)  coda           0.19-4      2020-09-30 [3] CRAN (R 4.0.2)
#;-)  codetools      0.2-19      2023-02-01 [4] CRAN (R 4.2.2)
#;-)  digest         0.6.31      2022-12-11 [1] CRAN (R 4.2.2)
#;-)  doParallel   * 1.0.17      2022-02-07 [1] CRAN (R 4.2.1)
#;-)  dplyr          1.1.0       2023-01-29 [1] CRAN (R 4.2.2)
#;-)  emmeans      * 1.8.4-90005 2023-03-07 [1] Github (rvlenth/emmeans@2b0273d)
#;-)  estimability   1.4.1       2022-08-05 [1] CRAN (R 4.2.1)
#;-)  evaluate       0.20        2023-01-17 [1] CRAN (R 4.2.2)
#;-)  fansi          1.0.4       2023-01-22 [3] CRAN (R 4.2.2)
#;-)  fastmap        1.1.1       2023-02-24 [3] CRAN (R 4.2.2)
#;-)  foreach      * 1.5.2       2022-02-02 [1] CRAN (R 4.2.1)
#;-)  fs             1.6.1       2023-02-06 [3] CRAN (R 4.2.2)
#;-)  generics       0.1.3       2022-07-05 [1] CRAN (R 4.2.1)
#;-)  glue           1.6.2       2022-02-24 [3] CRAN (R 4.1.2)
#;-)  htmltools      0.5.4       2022-12-07 [1] CRAN (R 4.2.2)
#;-)  iterators    * 1.0.14      2022-02-05 [1] CRAN (R 4.2.1)
#;-)  janitor        2.1.0       2021-01-05 [1] CRAN (R 4.2.1)
#;-)  jquerylib      0.1.4       2021-04-26 [1] CRAN (R 4.2.0)
#;-)  jsonlite       1.8.4       2022-12-06 [1] CRAN (R 4.2.2)
#;-)  knitr          1.42        2023-01-25 [1] CRAN (R 4.2.2)
#;-)  lattice        0.20-45     2021-09-22 [4] CRAN (R 4.2.0)
#;-)  lifecycle      1.0.3       2022-10-07 [3] CRAN (R 4.2.1)
#;-)  lme4         * 1.1-31      2022-11-01 [1] CRAN (R 4.2.2)
#;-)  lubridate      1.9.0       2022-11-06 [1] CRAN (R 4.2.2)
#;-)  magrittr       2.0.3       2022-03-30 [3] CRAN (R 4.1.3)
#;-)  MASS           7.3-58.2    2023-01-23 [4] CRAN (R 4.2.2)
#;-)  Matrix       * 1.5-3       2022-11-11 [4] CRAN (R 4.2.2)
#;-)  mgcv           1.8-42      2023-03-02 [4] CRAN (R 4.2.2)
#;-)  minqa          1.2.5       2022-10-19 [3] CRAN (R 4.2.1)
#;-)  multcomp       1.4-20      2022-08-07 [1] CRAN (R 4.2.1)
#;-)  MuMIn        * 1.47.3      2023-02-28 [1] R-Forge (R 4.2.2)
#;-)  mvtnorm        1.1-3       2021-10-08 [1] CRAN (R 4.2.1)
#;-)  nlme           3.1-162     2023-01-31 [4] CRAN (R 4.2.2)
#;-)  nloptr         2.0.3       2022-05-26 [3] CRAN (R 4.2.0)
#;-)  permute        0.9-7       2022-01-27 [1] CRAN (R 4.2.0)
#;-)  pillar         1.8.1       2022-08-19 [1] CRAN (R 4.2.1)
#;-)  pkgconfig      2.0.3       2019-09-22 [3] CRAN (R 4.0.0)
#;-)  purrr          1.0.1       2023-01-10 [1] CRAN (R 4.2.2)
#;-)  R.cache        0.16.0      2022-07-21 [1] CRAN (R 4.2.1)
#;-)  R.methodsS3    1.8.2       2022-06-13 [1] CRAN (R 4.2.1)
#;-)  R.oo           1.25.0      2022-06-12 [1] CRAN (R 4.2.1)
#;-)  R.utils        2.12.2      2022-11-11 [1] CRAN (R 4.2.2)
#;-)  R6             2.5.1       2021-08-19 [3] CRAN (R 4.1.1)
#;-)  Rcpp           1.0.10      2023-01-22 [3] CRAN (R 4.2.2)
#;-)  reprex         2.0.2       2022-08-17 [1] CRAN (R 4.2.2)
#;-)  rlang          1.0.6       2022-09-24 [1] CRAN (R 4.2.1)
#;-)  rmarkdown      2.20        2023-01-19 [1] CRAN (R 4.2.2)
#;-)  rsconnect      0.8.28      2022-10-24 [1] CRAN (R 4.2.2)
#;-)  rstudioapi     0.14        2022-08-22 [1] CRAN (R 4.2.1)
#;-)  sandwich       3.0-2       2022-06-15 [1] CRAN (R 4.2.1)
#;-)  sass           0.4.5       2023-01-24 [1] CRAN (R 4.2.2)
#;-)  sessioninfo    1.2.2       2021-12-06 [3] CRAN (R 4.1.2)
#;-)  snakecase      0.11.0      2019-05-25 [1] CRAN (R 4.2.1)
#;-)  stringi        1.7.12      2023-01-11 [1] CRAN (R 4.2.2)
#;-)  stringr        1.5.0       2022-12-02 [1] CRAN (R 4.2.2)
#;-)  styler         1.8.1       2022-11-07 [1] CRAN (R 4.2.2)
#;-)  survival       3.5-3       2023-02-12 [4] CRAN (R 4.2.2)
#;-)  TH.data        1.1-1       2022-04-26 [1] CRAN (R 4.2.1)
#;-)  tibble         3.1.8       2022-07-22 [1] CRAN (R 4.2.1)
#;-)  tidyselect     1.2.0       2022-10-10 [1] CRAN (R 4.2.1)
#;-)  timechange     0.1.1       2022-11-04 [1] CRAN (R 4.2.2)
#;-)  TMB            1.9.2       2023-01-23 [1] CRAN (R 4.2.2)
#;-)  utf8           1.2.3       2023-01-31 [3] CRAN (R 4.2.2)
#;-)  vctrs          0.5.2       2023-01-23 [1] CRAN (R 4.2.2)
#;-)  vegan          2.6-4       2022-10-11 [1] CRAN (R 4.2.1)
#;-)  withr          2.5.0       2022-03-03 [3] CRAN (R 4.1.3)
#;-)  xfun           0.37        2023-01-31 [1] CRAN (R 4.2.2)
#;-)  xtable         1.8-4       2019-04-21 [1] CRAN (R 4.2.0)
#;-)  yaml           2.3.7       2023-01-23 [3] CRAN (R 4.2.2)
#;-)  zoo            1.8-11      2022-09-17 [1] CRAN (R 4.2.1)
#;-) 
#;-)  [1] /home/au687614/R/x86_64-pc-linux-gnu-library/4.2
#;-)  [2] /usr/local/lib/R/site-library
#;-)  [3] /usr/lib/R/site-library
#;-)  [4] /usr/lib/R/library
#;-) 
#;-) ──────────────────────────────────────────────────────────────────────────

Before I look at this, is the same SelectRichness.rds file as before? If so, I won't need to download it. If not, it should have a different name.

BTW, I am not asking for a re-do, but in future reporting of bugs to me or another person, it is best to avoid elaborating things as much as possible. Just straight calls please to the functions that are needed to reproduce the bug. Pre- and post-processing through pipes, dplyr(), kable(), etc. are just extra, unneeded steps and mask the actual objects produced by my package functions.

Hey Thank you Russell, I pushed the new Select.Richness.rds right now, sorry about the piping, I tried to keep it at a minimum this time, if there is a new issue I will avoid it entirely

NO!!!! If SelectRichness has changed, please, please do not give it the same name. Because you have now probably made issue 402 non-reproducible.

This is what I am seeing based on the original SelectRichness from #402 -- just using model predictions for a common grid:

> ref_grid(AV, data = Data)
'emmGrid' object with variables:
    aspect = 125
    elevation = 37.718
    year = 1.9918
    initial_habitat = Forest, Meadow, Rangeland
    treatment = PermanentExclosure, Control
Transformation: “log” 

> grid = transform(noise.emm@grid, aspect = 125, elevation = 37.718, block_no = NA)
> grid
  year initial_habitat          treatment .wgt. aspect elevation block_no
1    4          Forest PermanentExclosure    50    125    37.718       NA
2    4          Meadow PermanentExclosure    15    125    37.718       NA
3    4       Rangeland PermanentExclosure    70    125    37.718       NA
4    4          Forest            Control    25    125    37.718       NA
5    4          Meadow            Control    14    125    37.718       NA
6    4       Rangeland            Control    70    125    37.718       NA

> preds = sapply(SelectedList, \(m) predict(m, newdata = grid, re.form = NA, type = "response"))
> preds
        275      403       274      402       283       279       411       282
1  7.825350  8.49674  8.057187  8.73896  8.158278  7.764152  8.869441  8.399967
2 14.208545 14.67030 14.646058 15.14676 15.814345 14.649633 16.364545 16.301204
3 16.670437 16.13882 17.164182 16.61635 15.902183 16.644227 15.377029 16.373395
4  9.744739 10.66778 10.031513 10.97005 10.189530  9.657477 11.170740 10.489261
5 17.693594 18.41878 18.234917 19.01379 19.751807 18.222015 20.610552 20.355744
6 20.759335 20.26253 21.370080 20.85857 19.861515 20.703000 19.366812 20.445891

> (avgpred = apply(preds, 1, mean))
        1         2         3         4         5         6 
 8.288759 15.225174 16.360828 10.365137 19.037650 20.453468 
 
> predict(AV, newdata = grid, re.form = NA, type = "response")
        1         2         3         4         5         6 
 8.252539 15.003734 16.452004 10.317070 18.755675 20.562313 
> ### These are the same results ###
 
> fun = function(x) c(r1 = x[1]/x[4], r2 = x[2]/x[5], r3 = x[3]/x[6])
> apply(preds, 2, fun)
           275      403       274       402       283       279       411       282
r1.1 0.8030333 0.796486 0.8031876 0.7966197 0.8006531 0.8039524 0.7939887 0.8008159
r2.2 0.8030333 0.796486 0.8031876 0.7966197 0.8006531 0.8039524 0.7939887 0.8008159
r3.3 0.8030333 0.796486 0.8031876 0.7966197 0.8006531 0.8039524 0.7939887 0.8008159

> fun(avgpred)
      r1.1      r2.2      r3.3 
0.7996768 0.7997402 0.7999049 

On the other hand, if I use noise.emm, I do get different results:

> predict(noise.emm)
[1] 11.03103 20.05298 22.00366 10.30473 18.73265 20.55491

> ### calculated "manually"
> as.numeric(exp(noise.emm@linfct %*% noise.emm@bhat))
[1] 11.03103 20.05298 22.00366 10.30473 18.73265 20.55491

> fun(predict(noise.emm))
      r1       r2       r3 
1.070482 1.070482 1.070482 

> ### The linear functions and coefs look right:
> noise.emm@linfct
     (Intercept) aspect elevation I((year - 1)^2) I(abs(year - 1)) initial_habitatMeadow initial_habitatRangeland initial_habitatForest:year initial_habitatMeadow:year
[1,]           1    125  37.71776               9                3                     0                        0                          4                          0
[2,]           1    125  37.71776               9                3                     1                        0                          0                          4
[3,]           1    125  37.71776               9                3                     0                        1                          0                          0
[4,]           1    125  37.71776               9                3                     0                        0                          4                          0
[5,]           1    125  37.71776               9                3                     1                        0                          0                          4
[6,]           1    125  37.71776               9                3                     0                        1                          0                          0
     initial_habitatRangeland:year year:treatmentControl
[1,]                             0                     0
[2,]                             0                     0
[3,]                             4                     0
[4,]                             0                     4
[5,]                             0                     4
[6,]                             4                     4

> noise.emm@bhat
                  (Intercept)                        aspect                     elevation               I((year - 1)^2)              I(abs(year - 1))         initial_habitatMeadow 
                 2.155205e+00                  2.044456e-05                  7.891242e-04                 -1.824446e-02                  1.649255e-01                  6.357188e-01 
     initial_habitatRangeland    initial_habitatForest:year    initial_habitatMeadow:year initial_habitatRangeland:year         treatmentControl:year 
                 7.759404e-01                 -2.934701e-02                 -3.886043e-02                 -5.070798e-02                 -1.702733e-02 

> coef(AV, full = TRUE)
                     (Intercept)                  I((year - 1)^2)                 I(abs(year - 1))            initial_habitatMeadow         initial_habitatRangeland 
                    2.155205e+00                    -1.824446e-02                     1.649255e-01                     6.357188e-01                     7.759404e-01 
treatmentPermanentExclosure:year            treatmentControl:year       initial_habitatForest:year       initial_habitatMeadow:year    initial_habitatRangeland:year 
                   -7.279999e-02                    -1.702733e-02                    -2.934701e-02                    -3.886043e-02                    -5.070798e-02 
           treatmentControl:year                        elevation                           aspect 
                   -1.702733e-02                     7.891242e-04                     2.044456e-05 

> length(noise.emm@bhat)
[1] 11
> length(coef(AV, full = TRUE))
[1] 13
> length(unique(names(coef(AV, full = TRUE))))
[1] 12

The question is why; and I'm not quite sure because everything seems right. except for the last bit, where the coefs are mismatched. One more thing:

> setdiff(names(coef(AV, full = TRUE)), names(noise.emm@bhat))
[1] "treatmentPermanentExclosure:year"

So apparently the issue is that the above coefficient is omitted in noise.emm
I'm not sure how to resolve this. And BTW I don't know what is supposed to be the effect of random = TRUE in the call to model.avg.

Just to clarify my previous comment. I'd say the gold standard for implementation of emm_basis methods is that it produces predictions identical to those obtained via predict(model, newdata = ref_grid(model)@grid). My tests prove that the predictions obtained this way are identical to the average of the predictions from the individual models, and that this is different from predict(ref_grid(model)), So the test fails.

Not knowing how to fix it does not mitigate that fact.

I'm going to take back some of what I said above. The gold standard I refer to must not be real gold, because it has some tarnish on it! Namely, if you follow what is done in my tests, we produced predictions from the separate models and averaged them together, and that process produced exactly the same results as predict(AV, newdata = grid, re.form = NA, type = "response"). Read every detail of the previous call...

  • It doesn't work without re.form = NA. That's because it needs that argument. That's a clue.
  • It says type = "response". I should have noticed earlier that we averaged those predictions on the response scale, not on the link scale.

Because the link function is the log, which is nonlinear, it is impossible to develop a linear predictor for the link scale that will back-transform to exactly the rersults we had on the response scale. For that reason, it is clear there is something shady about predict.averaging(). Looking at its code, we find that the returned value we obtained was obtained exactly the way I did in my tests: obtain the predictions for each model (passing the needed argument re.form = NA and type = "response") and average them together.

So I think the so-called gold standard I mention is a sham, because there is no way those results could be obtained using the model coefficients. And it makes me wonder what we even mean by model averaging and whether the implementation of predictions in MuMIn is correct. Specifically, note that

> exp(predict(AV, newdata = grid, re.form=NA, type = "link"))
        1         2         3         4         5         6 
 8.244221 14.986917 16.444795 10.304731 18.732655 20.554906

These are different results than obtained before, showing that predict.averaging() is inconsistent with itself, assuming we really are talking about a single averaged model that has its own regression coefficients.

The code for predict.averaging has a clause like this:

if ((missing(se.fit) || !se.fit) && (is.na(type) || type == 
        "link") && !backtransform && all(linherits(models, c(gam = FALSE, 
        lm = TRUE))) && !anyNA(object$coefficients[1L, ])) {
            ... code that does it kind of like what emmeans does ...
}

I ran this in debug mode and the if condition tests FALSE but still I could run the code it skipped over. I obtained exactly the same model matrix as in the @linfct slot of noise.emm, and exactly the same estimates as was obtained in my tests above from emmeans().

So I don't know what to think, other than I'm less prone to believe I did it wrong.

... and not inclined to believe that "I can fix it."

OK, here's something to ponder...

> colnames(model.matrix(AV))
 [1] "(Intercept)"                      "I((year - 1)^2)"                 
 [3] "I(abs(year - 1))"                 "initial_habitatMeadow"           
 [5] "initial_habitatRangeland"         "treatmentPermanentExclosure:year"
 [7] "treatmentControl:year"            "initial_habitatForest:year"      
 [9] "initial_habitatMeadow:year"       "initial_habitatRangeland:year"   
[11] "year:treatmentControl"            "elevation"                       
[13] "aspect"                          

> colnames(model.matrix(terms(delete.response(formula(AV))), data = Data))
 [1] "(Intercept)"                   "aspect"                       
 [3] "elevation"                     "I((year - 1)^2)"              
 [5] "I(abs(year - 1))"              "initial_habitatMeadow"        
 [7] "initial_habitatRangeland"      "initial_habitatForest:year"   
 [9] "initial_habitatMeadow:year"    "initial_habitatRangeland:year"
[11] "year:treatmentControl"

The first lists the names of the regression coefficients that can potentially appear in a model. The second is the column names of the model matrix generated by the model's terms component. They don't match; and in particular, the latter excludes treatmentPermanentExclosure:year. I think this happens because the model formula in the individual models lacks a main effect of treatment. But in any case, the bottom line is that the models are parameterized differently, meaning that we are using the wrong coefficients.

By golly, I think I'm getting close:

> RG = ref_grid(AV, at = list(year = 4), data = Data, type = "response")
> (noise.emm = emmeans(RG, ~ year * initial_habitat * treatment))
 year initial_habitat treatment           rate   SE  df lower.CL upper.CL
    4 Forest          PermanentExclosure  8.24 1.25 233     6.11     11.1
    4 Meadow          PermanentExclosure 14.99 2.99 233    10.11     22.2
    4 Rangeland       PermanentExclosure 16.44 1.63 233    13.53     20.0
    4 Forest          Control             9.63 3.02 233     5.19     17.9
    4 Meadow          Control            17.50 5.57 233     9.34     32.8
    4 Rangeland       Control            19.20 4.67 233    11.90     31.0

Confidence level used: 0.95 
Intervals are back-transformed from the log scale 

> summary(pairs(noise.emm, simple = "treatment"), by = NULL)
Note: adjust = "tukey" was changed to "sidak"
because "tukey" is only appropriate for one set of pairwise comparisons
 contrast                     year initial_habitat ratio    SE  df null t.ratio p.value
 PermanentExclosure / Control 4    Forest          0.856 0.209 233    1  -0.634  0.8940
 PermanentExclosure / Control 4    Meadow          0.856 0.209 233    1  -0.634  0.8940
 PermanentExclosure / Control 4    Rangeland       0.856 0.209 233    1  -0.634  0.8940

P value adjustment: sidak method for 3 tests 
Tests are performed on the log scale 

What I did was, instead of creating a model matrix and matching coefficients to its columns, I did just the opposite by creating a model matrix column corresponding to the name of each regression coefficient -- including any duplicates.

In the above output, we find that the first 3 estimates are equal to the results of exp(predict(AV, newdata = grid, re.form=NA, type = "link")) shown in a previous comment. The 4--6th estimates are somewhat smaller, so somehow the effect of treatmentControl is coming out wrong. I'll see if I can figure it out.

OK, we've got it. Starting with the above,

n.emm = noise.emm
> n.emm@bhat[11] = 0   # zero out the duplicated coefficient
> n.emm
 year initial_habitat treatment           rate   SE  df lower.CL upper.CL
    4 Forest          PermanentExclosure  8.24 1.25 233     6.11     11.1
    4 Meadow          PermanentExclosure 14.99 2.99 233    10.11     22.2
    4 Rangeland       PermanentExclosure 16.44 1.63 233    13.53     20.0
    4 Forest          Control            10.30 3.23 233     5.55     19.1
    4 Meadow          Control            18.73 5.97 233    10.00     35.1
    4 Rangeland       Control            20.55 4.99 233    12.74     33.2

Confidence level used: 0.95 
Intervals are back-transformed from the log scale 
> summary(pairs(n.emm, simple = "treatment"), by = NULL)
Note: adjust = "tukey" was changed to "sidak"
because "tukey" is only appropriate for one set of pairwise comparisons
 contrast                     year initial_habitat ratio    SE  df null t.ratio p.value
 PermanentExclosure / Control 4    Forest            0.8 0.196 233    1  -0.913  0.7407
 PermanentExclosure / Control 4    Meadow            0.8 0.196 233    1  -0.913  0.7407
 PermanentExclosure / Control 4    Rangeland         0.8 0.196 233    1  -0.913  0.7407

P value adjustment: sidak method for 3 tests 
Tests are performed on the log scale 

Now all 6 estimates match those using the predict function. So the key is simply to omit the duplicates.

I have done this and pushed up the revised package. If you re-install emmeans from GitHub, it should work right. This issue was a tough one! Thanks for reporting it and thanks for your patience.

Hey Russel this is amazing you really did it!! thanks for all your work