How many parameters does phylolm estimate? A comparison with caper::pgls()
Closed this issue · 4 comments
Hello!
I need some help interpreting the number of parameters fitted by phylolm().
In the example below, I have fitted a phylogenetic regression with Brownian motion, and a single continuous predictor. I would expect the number of parametrs to be two: intercept and slope. This is indeed the result I get with caper::pgls().
Instead, phylolm() estimates3 parameters. While the two packages yield models with the same coefficients (and same log-likelihood), they differ in number of estimated parameters, and therefore they differ in AIC and AICc.
Which parametrs is phylolm estimating? Why does it estimate an "extra" parameter compared to pgls::caper?
Thank you!
library(ade4)
library(ape)
library(caper)
library(MuMIn)
library(phylolm)
library(tidyverse)
#data from the ade4 package
data(lizards)
#phylogenetic tree
tree <- ape::read.tree(text = lizards$hprA)
#traits
data <- lizards$traits %>% dplyr::mutate(names = rownames(.))
#two implementations of the same regression: caper::pgls vs phylolm::phylolm
gls.list <-
list(
caper =
caper::pgls(matur.L ~ age.mat, data = caper::comparative.data(phy = tree, data = data, names.col = names, vcv = TRUE, warn.dropped = TRUE), lambda = 1),
phylolm =
phylolm::phylolm(matur.L ~ age.mat, data = data, phy = a.01.tree, model = "BM")
)
#the models have the same coefficients and log-likelihood
lapply(gls.list, coef)
lapply(gls.list, stats::logLik)
#but different number of estimated parameters
gls.list$caper$k
gls.list$phylolm$p
#and therefore, different AIC and AICc
lapply(gls.list, stats::AIC)
lapply(gls.list, MuMIn::AICc)
Some software packages don't count the variance as a parameter, because it's present in all models. Here, could the BM variance be the parameter counted by phylolm
but not caper
? You didn't include the number of parameters output by each package; that would be useful.
Possibly! In their book on information theoretic methods, Brunham&Anderson (2002, p.63), state that in an ordinary least squares regresion, the variance must be counted in the number of estimated paramaters (together with intercept, slopes, etc). I don't know whether the same should apply to gls.
As for the number of parameters per package, the lines of code are:
#but different number of estimated parameters
gls.list$caper$k
gls.list$phylolm$p
The correct number of parameters is 3 (intercept, slope, and variance). I think caper doesn't count variance.
I see, thank you! :)