vincentarelbundock/marginaleffects

Pooling fails for `avg_comparisons()` after multiple imputation with `transform` specified

Closed this issue · 3 comments

When pooling estimates after multiple imputation (e.g., using `avg_comparisons()` on a `mira` object), an error is thrown if `transform` is supplied. See below:

library(mice)
library(marginaleffects)

data("lalonde_mis", package = "cobalt")

imp <- mice(lalonde_mis, print = FALSE)

# Logistic regression
fits <- with(imp, glm(treat ~ age + married, family = binomial))

# Mean difference okay
avg_comparisons(fits, variables = "married")
#> Warning in get.dfcom(object, dfcom): Infinite sample size assumed.
#> 
#>  Estimate Std. Error     t Pr(>|t|)    S  2.5 % 97.5 %   Df
#>     -0.27     0.0372 -7.25   <0.001 40.8 -0.343 -0.197 2920
#> 
#> Term: married
#> Type:  response 
#> Comparison: mean(1) - mean(0)
#> Columns: term, contrast, estimate, std.error, s.value, predicted_lo, predicted_hi, predicted, df, statistic, p.value, conf.low, conf.high

# Pooling fails with transform
avg_comparisons(fits, variables = "married",
                comparison = "lnratioavg",
                transform = "exp")
#> Warning in get.dfcom(object, dfcom): Infinite sample size assumed.
#> Error: The output does not include a `std.error` column. Some models do not
#>   generate standard errors when estimates are backtransformed (e.g., GLM
#>   models). One solution is to use `type="response"` for those models.

Created on 2024-11-02 with reprex v2.1.1

Note that adding `type = "response"` does not help, and this is not an issue for non-imputed data. I think pooling should occur before the transform step anyway, with `transform` applied to the pooled output, so the code should not be searching for the `std.error` column after transformation. Thanks for looking into this :)
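To make the "pool first, then transform" order concrete, here is a minimal base-R sketch using Rubin's rules. It is only an illustration of the ordering I have in mind, not how `marginaleffects` does or should implement it. The function name `pool_then_exp()` and the per-imputation numbers are hypothetical and are not output from the reprex above.

```r
# Hypothetical per-imputation log-ratio estimates and standard errors
# (made-up numbers for illustration only).
est <- c(-0.52, -0.48, -0.55, -0.50, -0.47)
se  <- c(0.071, 0.069, 0.073, 0.070, 0.068)

# Pool on the log scale with Rubin's rules, then exponentiate the
# pooled estimate and its confidence limits.
pool_then_exp <- function(est, se, conf_level = 0.95) {
  m     <- length(est)
  qbar  <- mean(est)                  # pooled point estimate
  ubar  <- mean(se^2)                 # within-imputation variance
  b     <- var(est)                   # between-imputation variance
  total <- ubar + (1 + 1 / m) * b     # total variance

  # Rubin's large-sample degrees of freedom
  df   <- (m - 1) * (1 + ubar / ((1 + 1 / m) * b))^2
  crit <- qt(1 - (1 - conf_level) / 2, df)

  pooled <- c(
    estimate  = qbar,
    conf.low  = qbar - crit * sqrt(total),
    conf.high = qbar + crit * sqrt(total)
  )

  # The transform is applied last, so no standard error is needed
  # on the transformed scale.
  exp(pooled)
}

print(pool_then_exp(est, se))
```

Because the interval is built on the log scale and only then exponentiated, nothing downstream of the transform needs a `std.error` column.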

@ngreifer can you try this PR and tell me if it works as expected?

#1275

Looks to be working fine to me! Thanks for the fix.

Merged into main.