Extract df from under-identified model
isaactpetersen opened this issue · 2 comments
I'd like to extract the degrees of freedom from an under-identified model (that has more parameters than known values). That is, I'd like to extract the df even when the df is negative. As a teaching exercise, I'm fitting a formative model and would like to demonstrate that additional constraints are required to make the model identifiable. It used to be possible to extract the df from an under-identified model using the fitMeasures()
function. However, that function now throws an error when the model is under-identified and doesn't provide the df:
fitMeasures(formativeModelFit, "df")
#> numeric(0)
fitMeasures(formativeModelFit)
#> Warning in lav_partable_check(lavpartable, categorical = lavoptions$.categorical, : lavaan WARNING: automatically added intercepts are set to zero:
#> [formative]
#> Warning in lav_fit_cfi_lavobject(lavobject = object, fit.measures =
#> fit.measures, : lavaan WARNING: computation of robust CFI failed.
#> Error in if (!is.finite(X2) || !is.finite(df) || !is.finite(X2.null) || : missing value where TRUE/FALSE needed
Interestingly, the model output shows the df (see below), but I'd like to be able to extract it so I can assign it to an object:
#> lavaan 0.6.16 ended normally after 1 iteration
#>
#> Estimator ML
#> Optimization method NLMINB
#> Number of model parameters 5
#>
#> Used Total
#> Number of observations 50 75
#> Number of missing patterns 1
#>
#> Model Test User Model:
#>
#> Test statistic NA
#> Degrees of freedom -5
#> P-value (Unknown) NA
I posted on the google group, but nobody responded, which leads me to think that this is not currently possible (though it used to be in prior versions of lavaan), though I could be wrong.
Here is a reproducible example:
library("lavaan")
#> This is lavaan 0.6-16
#> lavaan is FREE software! Please report any bugs.
complement <- function(y, rho, x) {
x <- rnorm(length(y))
y.perp <- residuals(lm(x ~ y))
rho * sd(y.perp) * y + y.perp * sd(y) * sqrt(1 - rho^2)
}
set.seed(1)
v1 <- complement(PoliticalDemocracy$y1, .4)
v2 <- complement(PoliticalDemocracy$y1, .4)
v3 <- complement(PoliticalDemocracy$y1, .4)
v4 <- complement(PoliticalDemocracy$y1, .4)
PoliticalDemocracy$v1 <- v1
PoliticalDemocracy$v2 <- v2
PoliticalDemocracy$v3 <- v3
PoliticalDemocracy$v4 <- v4
PoliticalDemocracy <-
as.data.frame(lapply(
PoliticalDemocracy,
function(cc) cc[ sample(
c(TRUE, NA),
prob = c(0.9, 0.1),
size = length(cc),
replace = TRUE)]))
formativeModel_syntax <- '
#Formative model factor loadings
formative <~ v1 + v2 + v3 + v4
formative ~~ formative
'
formativeModelFit <- sem(
formativeModel_syntax,
data = PoliticalDemocracy,
missing = "ML",
estimator = "MLR")
#> Warning in lav_data_full(data = data, group = group, cluster = cluster, :
#> lavaan WARNING: all observed variables are exogenous; model may not be
#> identified
#> Warning in lav_data_full(data = data, group = group, cluster = cluster, : lavaan WARNING: 25 cases were deleted due to missing values in
#> exogenous variable(s), while fixed.x = TRUE.
#> Warning in lav_partable_check(lavpartable, categorical = lavoptions$.categorical, : lavaan WARNING: automatically added intercepts are set to zero:
#> [formative]
#> Warning in lav_model_vcov(lavmodel = lavmodel, lavsamplestats = lavsamplestats, : lavaan WARNING:
#> Could not compute standard errors! The information matrix could
#> not be inverted. This may be a symptom that the model is not
#> identified.
#> Warning in max(abs(Hessian - t(Hessian))): no non-missing arguments to max;
#> returning -Inf
#> Warning in max(diag(Hessian)): no non-missing arguments to max; returning -Inf
#> Warning in lav_test_yuan_bentler(lavobject = NULL, lavsamplestats = lavsamplestats, : lavaan WARNING: could not invert information [matrix needed for robust test statistic
formativeModelFit
#> lavaan 0.6.16 ended normally after 1 iteration
#>
#> Estimator ML
#> Optimization method NLMINB
#> Number of model parameters 5
#>
#> Used Total
#> Number of observations 50 75
#> Number of missing patterns 1
#>
#> Model Test User Model:
#>
#> Test statistic NA
#> Degrees of freedom -5
#> P-value (Unknown) NA
fitMeasures(formativeModelFit, "df")
#> numeric(0)
fitMeasures(formativeModelFit)
#> Warning in lav_partable_check(lavpartable, categorical = lavoptions$.categorical, : lavaan WARNING: automatically added intercepts are set to zero:
#> [formative]
#> Warning in lav_fit_cfi_lavobject(lavobject = object, fit.measures =
#> fit.measures, : lavaan WARNING: computation of robust CFI failed.
#> Error in if (!is.finite(X2) || !is.finite(df) || !is.finite(X2.null) || : missing value where TRUE/FALSE needed
sessionInfo()
#> R version 4.3.1 (2023-06-16 ucrt)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 19045)
#>
#> Matrix products: default
#>
#>
#> locale:
#> [1] LC_COLLATE=English_United States.utf8
#> [2] LC_CTYPE=English_United States.utf8
#> [3] LC_MONETARY=English_United States.utf8
#> [4] LC_NUMERIC=C
#> [5] LC_TIME=English_United States.utf8
#>
#> time zone: America/Chicago
#> tzcode source: internal
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] lavaan_0.6-16
#>
#> loaded via a namespace (and not attached):
#> [1] vctrs_0.6.4 cli_3.6.1 knitr_1.45 rlang_1.1.2
#> [5] xfun_0.41 purrr_1.0.2 styler_1.10.2 glue_1.6.2
#> [9] pbivnorm_0.6.0 htmltools_0.5.7 stats4_4.3.1 rmarkdown_2.25
#> [13] quadprog_1.5-8 R.cache_0.16.0 evaluate_0.23 fastmap_1.1.1
#> [17] yaml_2.3.7 lifecycle_1.0.4 compiler_4.3.1 fs_1.6.3
#> [21] rstudioapi_0.15.0 R.oo_1.25.0 R.utils_2.12.3 digest_0.6.33
#> [25] reprex_2.0.2 parallel_4.3.1 mnormt_2.1.1 magrittr_2.0.3
#> [29] R.methodsS3_1.8.2 tools_4.3.1 withr_2.5.2
Created on 2023-11-18 with reprex v2.0.2
I think lav_partable_df()
can give you what you're looking for. Bearing in mind that sem()
sets fixed.x=TRUE
by default, you need to pass that argument to get the matching parameter tables. Also, your example sets estimator = "MLR"
and missing = "ML"
, both of which implicitly set meanstructure = TRUE
.
parTable(formativeModelFit)
(PT <- lavaanify(formativeModel_syntax, fixed.x=TRUE, meanstructure=TRUE))
## wrong degrees of freedom using defaults
lav_partable_df(lavaanify(formativeModel_syntax)) # -1
## the right ones (matches summary() output)
lav_partable_df(PT)
That perfectly answered my question--thank you!