JenniNiku/gllvm

Interpreting concurrent latent variables

Closed this issue · 4 comments

I'm running an ordination that looks like this when unconstrained, with a clear LV2 gradient.

Screenshot 2023-05-03 at 19 37 06

When I run a concurrent ordination, i still get clear separation, but no longer along one LV. (The clusters are individual water bodies).
fit <- gllvm(y = y, X = Xreduced, family = binomial(), num.lv.c = 2, lv.formula = ~ Overall.Status.Grouped + PC1, studyDesign = Site, row.eff = ~ (1 | Site), control.start = list(n.init = 20))

Screenshot 2023-05-03 at 19 37 33

Conceptually, this is fine, but if i ask for the contributions of Overall.Status.Grouped and PC1 to the concurrent LVs, Overall.Status.Grouped is not significant, since the separation is along the diagonal. In PCA, one would do a rotation to make the PCs intepretable.

I'm curious what you think about this situation. I'd like to use the partial R2 as an estimate of variable importance.

Screenshot 2023-05-03 at 19 37 48

Screenshot 2023-05-03 at 19 38 05

You can definitely rotate the LVs post-hoc! The best way to do that (currently) is via the GPArotation R-package.

Alternatively, it is not too difficult to find a rotation yourself that maximizes the relationship of an LV with a particular predictor. I think I have some code laying around somewhere, but it is a bit late on the evening now to dig it up. I will put it on my to-do list, and get back you either tomorrow (if I can find the time) or within the next fews days otherwise.

I should add that one should not put too much faith in the significance of predictor effects in the ordination. As you note those effects and their uncertainties are sensitive to the rotation. It is definitely helpful for interpreting what drives vertical or horizontal separation, but I generally advice also to take a good look at coefplot(fit) on how the predictors affect the model.

As an aside, why on earth would you include a Principal Component inside another (model-based) ordination method?

to answer the PCA question, i had four very correlated landcover variables (% A + % B + % C + % D = 100%) that i was able to reduce to one covariate.

I am indeed also looking at coefplot() values. These are more useful for species that we know something about, but not so much for many of my datasets, which are obscure fungal or meiofaunal OTUs.

please take your time with the rotation code. The partial R2 is what i am going to focus on. thanks!

OK, here is an example of how to rotate the ordination so that the first LV is unrelated to the first predictor. Note that you can do this for any predictor or LV of course, but if you want to simultaneously minimize or maximize for more predictors or in more dimensions things get a lot harder and the below does not apply. Also the uncertainties and test-statistics of the coefficients are affected by the rotation, and I do not correct that in the example below. The R² from van der Veen et al. 2022 is invariant to the rotation and remains the same.

library(units)
library(gllvm)
data(spider)

mod <- gllvm(y = spider$abund, X = scale(spider$x), num.RR = 2, family = "poisson")
ordiplot(mod)

# rotate fiirst predictor to 0 degrees from y axis
angle <- abs(set_units(as_units(atan(mod$params$LvXcoef[1,1]/mod$params$LvXcoef[1,2]),"radians"),"degrees"))
# construct rotation matrix
rot <- diag(cos(angle), ncol = 2, nrow = 2)
rot[row(rot)!=col(rot)]<-c(sin(angle), -sin(angle))
# Rotate LV coefficients
LVcoef <- mod$params$LvXcoef%*%rot
mod$params$LvXcoef <- LVcoef
# note that SEs are now not correct so arrow.ci needs to be false
# otherwise we would 
ordiplot(mod,rotate=F, arrow.ci = F)

unrotated:
49270ad8-334b-48ee-8ff6-0149289bc0e8
rotated:
6e334681-27ed-4beb-af2b-f356e3d1e008

thanks for this. i'm going to have to remember some trigonometry...