Model fit and ppmc() with ordinal data
littlehifive opened this issue · 2 comments
Hi Ed,
@bgoodri and I are working on a project together where we wish to build a Bayesian CFA model for a few measures with items on an ordinal scale (e.g., 1-4 Likert scale). We have a few questions:
1. I am a bit puzzled by the fact that the model fit of the Bayesian CFA (given by blavFitIndices()) is very different from that of the Frequentist CFA (given by lavaan::cfa()). I understand that the Frequentist model may be overfitting the data by giving CFI = 0.99 and RMSEA = 0.04. But the Bayesian model fit is very different. I checked the predicted ordinal values from the model and they seem to correspond well with the raw distribution of the items. Could it be because these Bayesian fit indices do not work too well with ordinal data?

   Posterior mean (EAP) of devm-based fit indices:
         BRMSEA    BGammaHat adjBGammaHat          BMc
          0.637        0.246       -0.497        0.000

2. When I tried to use ppmc() on my fitted Bayesian CFA model, I got this error. Is this because ppmc() does not work too well with ordinal data? Would adding mcmcextra = list(data = list(emiter = 50)) in bcfa() help? I am using blavaan_0.3-18.853.

   Error in "mcmcdata" %in% names(lavobject@external) :
     trying to get slot "external" from an object of a basic class ("NULL") with no slots

3. The MargLogLik is still NA after adding mcmcextra = list(data = list(llnsamp = 200)). Is there another way to compute the likelihood?
Thanks!
Hi Zezhen, thanks for the questions. I think that these issues are mostly due to the fact that ordinal models are very new to blavaan. I will have to look at question 1 some more... it might well be that these metrics do not work well for ordinal data (they were developed for continuous data), but I cannot rule out a bug right now. I will let you know if I find something.
For 2, this is a bug and should be fixed soon.
For 3, the "MargLogLik" is the likelihood used for Bayes factors (marginal over all parameters, as opposed to marginal over only the latent variables... the llnsamp setting is only used to compute a likelihood that is marginal over latent variables). In the continuous case, blavaan uses a Laplace approximation here that does not immediately work for ordinal. The ordinal models return NA for now because I have not gotten around to implementing it.
And I would recommend upgrading blavaan to 0.4-1, or to the GitHub version.
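To make the distinction in point 3 concrete, here is a minimal sketch of where llnsamp enters the call and where the marginal log-likelihood is read from. The simulated data, the item names y1 to y4, the cut points, and the sampler settings are all made up for illustration. The mcmcextra form is the one quoted in this thread; other blavaan versions may expect llnsamp in a different place, so check ?bcfa for the installed version.

```r
library(blavaan)

# Hypothetical data: four 4-category items driven by one latent factor.
set.seed(123)
n   <- 300
eta <- rnorm(n)
to_likert <- function(x) as.integer(cut(x, breaks = c(-Inf, -1, 0, 1, Inf)))
dat <- data.frame(
  y1 = to_likert(0.8 * eta + rnorm(n, sd = 0.6)),
  y2 = to_likert(0.7 * eta + rnorm(n, sd = 0.7)),
  y3 = to_likert(0.6 * eta + rnorm(n, sd = 0.8)),
  y4 = to_likert(0.8 * eta + rnorm(n, sd = 0.6))
)

model <- 'f =~ y1 + y2 + y3 + y4'

# llnsamp only affects the likelihood that is marginal over the latent
# variables; the argument form below is the one used in this thread.
fit <- bcfa(model, data = dat,
            ordered = c("y1", "y2", "y3", "y4"),
            n.chains = 2, burnin = 500, sample = 500,
            mcmcextra = list(data = list(llnsamp = 200)))

# margloglik is marginal over all parameters. As explained above, ordinal
# models report NA for it regardless of the llnsamp setting.
fm <- fitMeasures(fit)
fm["margloglik"]
```

The point of the sketch is only that the two likelihoods are separate quantities: raising llnsamp should not be expected to change the margloglik entry.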
Just some follow-ups, involving commit b0c191e from earlier today:
- I have turned off blavFitIndices() for ordinal models because I think some underlying code is assuming continuous data, and it will take some extra research to ensure this works on the ordinal side (the continuous metrics are based on recent publications, and I don't think the analogous publications exist for ordinal). This was an oversight on my part: in 0.4-1, I was focused on getting the Stan model working and did not test the fit indices as much as I should have.
- ppmc() should now be working better with the ordinal models. But there is still some unresolved ambiguity. For example, its default argument is fit.measures = c("srmr", "chisq"). This creates a lavaan object using each posterior sample, then computes the usual frequentist metrics of srmr and chisq. But chisq is ambiguous for ordinal models. Under the default ppmc() arguments, it computes the DWLS chi-square statistic using lavaan. If you change the argument to just fit.measures = "chisq" (leaving out srmr and any others), then it approximates the multinomial likelihood underlying the Bayesian model and does the usual likelihood ratio statistic. This is clearly not the best situation, and I need to find an intuitive way to distinguish between them.