AIC model selection VS Likelihood Ratio Tests
RodolfoPelinson opened this issue · 1 comments
Hello. First of all, thank you for this great package that has been helping me a lot with the analysis of community data.
I am currently analyzing community data from an experiment where I have a fully-factorial design of two factors with three levels each (agrochemical treatment and spatial isolation). So I wish to test whether I have additive or interactive effects of these two treatments on community data. Something like:
community ~ treatments * isolation
So I first ran the following models and compared them using AICc (I am omitting some arguments just to make the code cleaner). Also, I am using zero latent variables because the AIC value for this model was by far the smallest one if compared to 1, 2, 3, 4, and 5 latent variables (all tested with the full model).
model_1 <- gllvm(formula = ~ 1, num.lv = 0, n.init = 10, seed = 1:10)
model_2 <- gllvm(formula = ~ treatments, num.lv = 0, n.init = 10, seed = 1:10)
model_3 <- gllvm(formula = ~ isolation, num.lv = 0, n.init = 10, seed = 1:10)
model_4 <- gllvm(formula = ~ treatments + isolation, num.lv = 0, n.init = 10, seed = 1:10)
model_5 <- gllvm(formula = ~ treatments * isolation, num.lv = 0, n.init = 10, seed = 1:10)
When I compare those models using AICc (using the function AICc()
from the gllvm
package), this is the result:
model_1 = 1783.451
model_2 = 1777.148
model_3 = 1778.979
model_4 = 1776.965
model_5 = 1804.377
In a cleaner model selection table (this is from a function that I designed myself):
model AICc delta_AICc df nobs
1 No Effect 1783.451 6.4863018 14 315
2 Treatments 1777.148 0.1829908 28 315
3 Isolation 1778.979 2.0145949 28 315
4 Isolation + Treatments 1776.965 0.0000000 42 315
5 Isolation * Treatments 1804.377 27.4121353 70 315
It seems that the most plausible model is the one that only includes additive effects of both factors. Considering the rule that models with delta AICc < 2 are equally plausible, I would say that the most plausible model is actually the one that only includes the effects of treatments, since it das delta AICc < 2 and is a simpler model. Also, the model that includes the interaction is by far the less plausible model.
However, if I do likelihood ratio tests using the anova()
function from the gllvm
package, these are the results:
Testing the effects of treatments:
> anova(model_3, model_4)
Model 1 : ~ treatments
Model 2 : ~ treatments + isolation
Resid.Df D Df.diff P.value
1 287 0.00000 0
2 273 35.78408 14 0.00112398
Testing the effects of isolation:
> anova(model_2, model_4)
Model 1 : ~ isolation
Model 2 : ~ treatments + isolation
Resid.Df D Df.diff P.value
1 287 0.00000 0
2 273 37.61569 14 0.00059483
Testing the effects of the interaction:
> anova(model_4, model_5)
Model 1 : ~ treatments + isolation
Model 2 : ~ treatments * isolation
Resid.Df D Df.diff P.value
1 273 0.00000 0
2 245 56.04616 28 0.0012698
It basically says that all main effects and the interaction are significant and with very small p-values.
I understand that model selection using AIC is not supposed to return exactly the same results (qualitatively) as likelihood ratio tests. I am also aware that the anova()
function only returns approximate p-values when df.diff is larger than 20. Still, these results are strongly discrepant. Usually, I don`t have such conflicting results between AIC and LRT in univariate models. I wonder if I am not doing something wrong. If not, do you know what is probably happening? Any guesses on what approach should I use?
Thank you!
Thanks for your question; there is no reason why anova
and AICc
should correspond here (though ideally they would of course).
In general I would suggest caution in doing extensive model-selection with either LRT or AIC in models with mixed-effects (or generally, actually). LRT does not penalize for the number of parameters, AICc does, and a model is likely to improve if you throw in a number of parameters equal to the number of species! As the anova
function says, please do not rely on it when the different in the number of parameters is so large (as is the case here).
I am very hesitant on making a recommendation what you should do, as there are many different opinions and thoughts on the subject of model-selection (and how to do it in mixed-effects models). I would personally go with AICc
here, or "just" rely on the model where all terms are included and comparing the statistical uncertainties of parameter estimates, avoiding model-selection altogether.
As an aside, the seed
argument is ignored when you specify n.init
, as the seeds are randomly sampled internally.