strengejacke/ggeffects

zi_prob-type predictions with new predict_response function

jml-science opened this issue · 7 comments

Greetings,

I was wondering how to implement adjusted predictions with ggpredict(... type = "zi_prob") with the new predict_response "wrapper" function.

In the documentation for zero-inflated mixed models, it says that "zprob" is only available for predict_response(margin = "empirical"), which takes forever with my large dataset.

There seems to be no option to use "zi_prob" in combination with mean_mode or mean_reference (which I understand to be the "new" ggpredict). Is that correct? mean_reference or mean_reference ultimately passes down to predict(), so shouldn't there be a way of predicting the zero probabilities for models of class glmmTMB?

Ideally, I would like to use "marginalmeans" to marginalize over categorial variables for all three "components" of my zero-inflated negative binomial model (conditional model only, conditional + zero-inflation component, structural zero probabilities). Is there any plan to implement that? I realise that this was also not possible before with ggemmeans().

Thank you!
Julia

That documentation is probably misleading. "zi_prob" is available for ggpredict() (i.e. the default predict_response()), but not for predict_response(margin = "empirical") (which only takes the original type options from the model's predict() method).

emmeans indeed supports marginal means for the zero-inflation probabilities, but is not yet implemented in ggeffects. I could add that, however. Currently, you can only use predict_response() / ggpredict(), or manually convert the results from emmeans into probabilities.

library(ggeffects)
library(glmmTMB)
data(Salamanders)
m <- glmmTMB(
  count ~ spp + mined + (1 | site),
  ziformula = ~ spp + mined,
  family = truncated_poisson,
  data = Salamanders
)

predict_response(m, "spp", type = "zi_prob")
#> # Predicted zero-inflation probabilities of count
#> 
#> spp   | Predicted |     95% CI
#> ------------------------------
#> GP    |      0.85 | 0.77, 0.91
#> PR    |      0.97 | 0.94, 0.98
#> DM    |      0.79 | 0.69, 0.87
#> EC-A  |      0.95 | 0.90, 0.97
#> EC-L  |      0.79 | 0.69, 0.87
#> DES-L |      0.75 | 0.64, 0.83
#> DF    |      0.79 | 0.69, 0.87
#> 
#> Adjusted for:
#> * mined = yes
#> *  site = NA (population-level)
predict(m, newdata = data_grid(m, "spp"), type = "zprob")
#> [1] 0.8526565 0.9687523 0.7906218 0.9458406 0.7906218 0.7472506 0.7906218

out <- emmeans::emmeans(m, "spp", component = "zi")
plogis(as.data.frame(out)$emmean)
#> [1] 0.6349916 0.9031009 0.5316524 0.8400022 0.5316524 0.4705601 0.5316524

Created on 2024-03-27 with reprex v2.1.0

Thank you, this helps a lot! It was a combination of being confused by the documentation and having a typo in the code. predict_response (default mode) in combination with "zi_prob" does work for me, too.

I will have a look at using emmeans::emmeans directly and converting the results into probabilities. I would appreciate the additional implementation in ggeffects (and being able to easily plot the probabilities marginalised with emmeans), but this is probably a niche request :-)

Easier than expected:

library(ggeffects)
library(glmmTMB)
data(Salamanders)
m <- glmmTMB(
  count ~ spp + mined + (1 | site),
  ziformula = ~ spp + mined,
  family = truncated_poisson,
  data = Salamanders
)

out <- emmeans::emmeans(m, "spp", component = "zi", level = NULL)
out <- as.data.frame(out)
cbind(pred = plogis(out$emmean), ci_low = plogis(out$asymp.LCL), ci_hi = plogis(out$asymp.UCL))
#>           pred    ci_low     ci_hi
#> [1,] 0.6349916 0.5161122 0.7394115
#> [2,] 0.9031009 0.8336782 0.9454429
#> [3,] 0.5316524 0.4114694 0.6482722
#> [4,] 0.8400022 0.7521894 0.9008017
#> [5,] 0.5316524 0.4114694 0.6482722
#> [6,] 0.4705601 0.3535621 0.5908851
#> [7,] 0.5316524 0.4114694 0.6482722

ggemmeans(m, "spp", type = "zi_prob")
#> # Predicted zero-inflation probabilities of count
#> 
#> spp   | Predicted |     95% CI
#> ------------------------------
#> GP    |      0.63 | 0.52, 0.74
#> PR    |      0.90 | 0.83, 0.95
#> DM    |      0.53 | 0.41, 0.65
#> EC-A  |      0.84 | 0.75, 0.90
#> EC-L  |      0.53 | 0.41, 0.65
#> DES-L |      0.47 | 0.35, 0.59
#> DF    |      0.53 | 0.41, 0.65

ggpredict(m, "spp", type = "zi_prob")
#> # Predicted zero-inflation probabilities of count
#> 
#> spp   | Predicted |     95% CI
#> ------------------------------
#> GP    |      0.85 | 0.77, 0.91
#> PR    |      0.97 | 0.94, 0.98
#> DM    |      0.79 | 0.69, 0.87
#> EC-A  |      0.95 | 0.90, 0.97
#> EC-L  |      0.79 | 0.69, 0.87
#> DES-L |      0.75 | 0.64, 0.83
#> DF    |      0.79 | 0.69, 0.87
#> 
#> Adjusted for:
#> * mined = yes
#> *  site = NA (population-level)

Created on 2024-03-27 with reprex v2.1.0

Will add tests, and then this PR should be ready to be merged.

predict_response(m, "spp", type = "zi_prob", margin = "marginalmeans")
#> # Predicted zero-inflation probabilities of count
#> 
#> spp   | Predicted |     95% CI
#> ------------------------------
#> GP    |      0.63 | 0.52, 0.74
#> PR    |      0.90 | 0.83, 0.95
#> DM    |      0.53 | 0.41, 0.65
#> EC-A  |      0.84 | 0.75, 0.90
#> EC-L  |      0.53 | 0.41, 0.65
#> DES-L |      0.47 | 0.35, 0.59
#> DF    |      0.53 | 0.41, 0.65

Created on 2024-03-27 with reprex v2.1.0

And plot shows probality-axis-scale:

library(ggeffects)
library(glmmTMB)
data(Salamanders)
m <- glmmTMB(
  count ~ spp + mined + (1 | site),
  ziformula = ~ spp + mined,
  family = truncated_poisson,
  data = Salamanders
)

ggemmeans(m, "spp", type = "zi_prob") |> plot()

ggemmeans(m, "spp", type = "zi_prob") |> plot(log_y = TRUE)

Created on 2024-03-27 with reprex v2.1.0

btw, you can install with ggeffects::install_latest() (once the binaries are built on r-universe)

Thank you - greatly appreciated! I was able to already install it from GitHub and works like a charm.