Specification of factor model with auto.var = TRUE, std.lv = TRUE with ordinal indicators
jpritikin opened this issue · 5 comments
library(blavaan) # 0.5-5
future::plan("multicore")
NoYesLevels <- c('No','Not sure','Yes')
EuphoricLevels <- c('Euphoric', 'Mildly euphoric', 'No change', 'Mild discomfort', 'Severe discomfort')
ImpactLevels <- c('Helped','Slightly helped','No impact','Slightly hindered','Hindered')
AbsItemNames <- c('boredom','frustration','anxiety', 'abort','bodyAcute','bodyAfter','sleep')
ImportStudyItems <- function(df) {
if (any(is.na(match(AbsItemNames, colnames(df))))) stop("Items are mising")
df[['boredom']] <- ordered(df[['boredom']], levels=NoYesLevels)
df[['frustration']] <- ordered(df[['frustration']], levels=NoYesLevels)
df[['anxiety']] <- ordered(df[['anxiety']], levels=NoYesLevels)
df[['abort']] <- ordered(df[['abort']], levels=NoYesLevels)
df[['bodyAcute']] <- ordered(df[['bodyAcute']], levels=EuphoricLevels)
df[['bodyAfter']] <- ordered(df[['bodyAfter']], levels=EuphoricLevels)
df[['sleep']] <- ordered(df[['sleep']], levels=ImpactLevels)
df
}
psiData <- ImportStudyItems(read.csv("psi.csv"))
pmData <- ImportStudyItems(read.csv("pm.csv"))
combinedData <- rbind(cbind(psiData, group="psi"),
cbind(pmData, group="pm"))
combinedData$group <- factor(combinedData$group)
spec <- '
phenom =~ boredom + frustration + anxiety + abort + bodyAcute + bodyAfter + sleep
boredom ~ 1
frustration ~ 1
anxiety ~ 1
abort ~ 1
bodyAcute ~ 1
bodyAfter ~ 1
sleep ~ 1
boredom | -.43*t1 + .43*t2
frustration | -.43*t1 + .43*t2
anxiety | -.43*t1 + .43*t2
abort | -.43*t1 + .43*t2
bodyAcute | -.84*t1 + -.25*t2 + .25*t3 + .84*t4
bodyAfter | -.84*t1 + -.25*t2 + .25*t3 + .84*t4
sleep | -.84*t1 + -.25*t2 + .25*t3 + .84*t4
'
fit <- blavaan(spec, data=combinedData,
n.chains = 2, sample=500,
ordered = AbsItemNames,
group = "group",
group.equal = c("loadings", "intercepts", "means", "residuals", "regressions"),
group.partial = c("phenom~1"),
auto.var = TRUE, std.lv = TRUE)
summary(fit)
There are three things that I don't understand:
- I have
auto.var=TRUE
but I don't see any estimated parameters related to the observed indicator variances. It seems like there has got to be some estimated observed variances otherwise I don't understand how the latent variance can be fixed to 1. - I have
group.partial = c("phenom~1")
but thephenom
mean is fixed to zero in both groups. Weirdly, the variance forphenom
is free in one group. Isn't~1
the notation for the intercept, not the variance? How do I free it for one group and keep fixed in other group? - I have
std.lv = TRUE
and all of my indicators are unbounded. Don't I need to constrain at least one of them to be positive or negative? Or do you take care of this automagically?
These simulated data have at least one observation in every ordinal outcome category,
pm.csv
psi.csv
Thanks for the example, and see below. In general, I think the model syntax in spec
will provide more flexibility than the arguments in blavaan()
. But I wonder whether the following gets closer to what you are looking for:
fit <- blavaan(spec, data=combinedData,
n.chains = 2, sample=500,
ordered = AbsItemNames,
group = "group",
group.equal = c("loadings", "intercepts", "regressions"),
auto.var = TRUE, std.lv = TRUE)
And in response to your bullets:
- In the summary() there is a "variance" section and a "scales" section for residual variance of each variable and for full latent variance (which includes variance due to factors), respectively. In your original model, I think that
group.equal = "residuals"
was fixing all the variances to 1 for both groups. Removing that, you should now see estimated variances for group 2 but not group 1. I am not sure that it is possible to stop the group 1 variances from equaling 1 via ablavaan()
argument, it might be necessary to add lines like below to the model syntax:
boredom ~~ c("v1", "v2") * boredom
- Yes,
~1
is for the mean. I am not quite sure why thegroup.partial
thing is not working (maybe a bug), and I think the variance is being estimated independently of this statement (due to the multiple groups). For now, I think you need to get rid ofgroup.equal = "means"
and then add something like this to the model syntax:
phenom ~ c(NA, 0)*1
- The first loading is automagically constrained to be positive. Some more detail is here.
Okay, I changed it to
spec <- '
phenom =~ boredom + frustration + anxiety + abort + bodyAcute + bodyAfter + sleep
boredom ~ 1
frustration ~ 1
anxiety ~ 1
abort ~ 1
bodyAcute ~ 1
bodyAfter ~ 1
sleep ~ 1
phenom ~ c(0, NA)*1
phenom ~~ 1*phenom
boredom ~~ v1*boredom
frustration ~~ v2*frustration
anxiety ~~ v3*anxiety
abort ~~ v4*abort
bodyAcute ~~ v5*bodyAcute
bodyAfter ~~ v6*bodyAfter
sleep ~~ v7*sleep
boredom | -.43*t1 + .43*t2
frustration | -.43*t1 + .43*t2
anxiety | -.43*t1 + .43*t2
abort | -.43*t1 + .43*t2
bodyAcute | -.84*t1 + -.25*t2 + .25*t3 + .84*t4
bodyAfter | -.84*t1 + -.25*t2 + .25*t3 + .84*t4
sleep | -.84*t1 + -.25*t2 + .25*t3 + .84*t4
'
fit <- blavaan(spec, data=combinedData,
n.chains = 2, sample=500,
ordered = AbsItemNames,
group = "group",
group.equal = c("loadings", "intercepts", "regressions"),
std.lv=TRUE)
This is looking really good. The only weird thing that I notice is that the variance of the ordinal item abort
got larger than 1.0. Shouldn't the variance be bounded above at 1.0? Can I put an upper bound on the variances?
.abort (v4) 0.643 0.249 0.304 1.212 0.998 gamma(1,.5)[sd]
Good to hear things are working. I don't think the variance is bounded at 1 because this is using the theta parameterization, i.e., there is no restriction that the total variance is 1 (and the "scales" part of the summary output shows that the total variances are above 1). There is no way to add an upper bound, but you can change the prior so there is little density above 1, using an argument like
dp = dpriors(theta = "gamma(5,7)[sd]")
where you could also use [var]
or [prec]
to put a prior on the variance or precision, instead of the SD.
You seem to imply that it's fine, in theory, to have an ordinal indicator with a total variance greater than 1. I'm not sure where I got the idea that ordinal indicators should max out at a variance of 1. Maybe I'm misremembering something. I guess I'll leave the model as is. I doubt it would make much practical difference anyway.
I think that a common identification constraint for ordinal CFA involves fixing the total variance to 1, i.e.,
Then you are right that the variances (theta estimates) should not go above 1. The default for blavaan (the "theta" parameterization) is instead to fix