simongrund1/mitml

Error message with spatial regression (spml)


Dear Simon Grund,

Please see the reproducible example below.
I am trying to use mitml with a standard spatial regression (spml) and receive an error message when I run fit1 with formula fm1, which contains the imputed variable unemp1.
If I run fit1 with formula fm2, which does not contain an imputed variable, I do not receive an error message.

Ran

# a reproducible example
library(splm)
library(Ecdat)
library(spdep)
library(mitml)


data("Produc", package = "Ecdat")
data("usaww")

#create data with missing values
usalw <- mat2listw(usaww)
Produc$unemp1<-Produc$unemp
Produc$unemp1[Produc$unemp>6.2]<-NA #to produce missing observations
usalw=as.matrix(usaww) #spml needs a spatial weights matrix

#create multiple imputations for variable unemp1
fml<-unemp1~hwy+water+ (1|state)
imp <- panImpute(Produc, formula=fml, n.burn=1000, n.iter=100, m=5)
implist <- mitmlComplete(imp, print=1:5)

fm <- log(gsp) ~ log(pcap) + log(pc) + log(emp)
fm1<- log(gsp) ~ log(pcap) + log(pc) + log(emp)+unemp1
fm2<- log(gsp) ~ log(pcap) + log(pc) + log(emp)+pc 

# using fm1 produces an error message in fit1 below
# using fm2 INSTEAD of fm1 produces no error message in fit1, however
# that formula does not contain the imputed variable
fit0 <- with(implist, spml(formula = fm, data = Produc, index = NULL,
                           listw = usaww, lag = TRUE, spatial.error = "b", model = "within",
                           method = "eigen", na.action = na.fail))

fit1 <- with(implist, spml(formula = fm1, data = Produc, index = NULL,
                           listw = usaww, lag = TRUE, spatial.error = "b", model = "within",
                           method = "eigen", na.action = na.fail))

testEstimates(fit1)
testModels(fit1, fit0)

Hi Ran,

this is not strictly a mitml issue, so I'll mark this as "solved". For a workaround, see my suggestion below and check if it solves your problem.

The error message occurs because you specified data = Produc when you used with. This results in the model being run with the original rather than the imputed data. Then, because unemp1 is incomplete in the original data, you receive an error with fm1.
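
The same effect can be reproduced with base R alone. The sketch below is only an illustration under stated assumptions: it uses lm in place of spml, and the data frames orig and filled are made-up stand-ins for Produc and for one imputed data set.

```r
# Toy stand-ins (hypothetical names): 'orig' plays the role of Produc,
# 'filled' the role of one imputed data set.
set.seed(1)
orig <- data.frame(y = rnorm(20), x = rnorm(20))
orig$x[1:5] <- NA                 # the original data are incomplete

filled <- orig
filled$x[is.na(filled$x)] <- 0    # crude fill-in, for illustration only

# An explicit data argument overrides the data supplied to with(),
# so the model sees the incomplete original data and na.fail stops it.
res_orig <- try(
  with(filled, lm(y ~ x, data = orig, na.action = na.fail)),
  silent = TRUE
)
inherits(res_orig, "try-error")   # TRUE

# Without a data argument, lm looks up y and x in the data supplied to with().
res_filled <- with(filled, lm(y ~ x, na.action = na.fail))
nobs(res_filled)                  # 20
```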

The with function is intended for use with functions that can be evaluated in the local environment (i.e., without having to specify data). Unfortunately, spml does not support that and instead requires data to be specified explicitly. As a workaround, you can fit the model using lapply, where you can explicitly do that.

fit0 <- lapply(implist, function(i){
   spml(formula = fm, data = i, index = NULL,
        listw = mat2listw(usaww), lag = TRUE, spatial.error = "b", model = "within",
        method = "eigen", na.action = na.fail)
})

fit1 <- lapply(implist, function(i){
   spml(formula = fm1, data = i, index = NULL,
        listw = mat2listw(usaww), lag = TRUE, spatial.error = "b", model = "within",
        method = "eigen", na.action = na.fail)
})

Notice the data = i, which fits the two models to each of the imputed data sets. The resulting objects can be analyzed as usual using testEstimates and testModels.
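
The pattern itself does not depend on spml. As a minimal sketch with lm and a made-up list of filled-in data sets (the names orig, imps, and fits are hypothetical, and the random fill-in is not a real imputation method), passing each list element as data gives one fit per data set:

```r
set.seed(2)
orig <- data.frame(y = rnorm(30), x = rnorm(30))
orig$x[1:6] <- NA

# Three filled-in copies; random draws stand in for real imputations.
imps <- lapply(1:3, function(k) {
  d <- orig
  d$x[is.na(d$x)] <- rnorm(sum(is.na(d$x)))
  d
})

# One model per data set; 'd' is the current list element.
fits <- lapply(imps, function(d) lm(y ~ x, data = d, na.action = na.fail))

length(fits)          # one fit per data set
sapply(fits, nobs)    # each fit uses all 30 rows
sapply(fits, coef)    # estimates vary across the filled-in data sets
```

Because every call receives its own complete data set, na.fail never triggers, and the result is a plain list with one fitted model per data set.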

testEstimates(fit1)
testModels(fit1, fit0)
# Call:
# 
# testEstimates(model = fit1)
# 
# Final parameter estimates and inferences obtained from 5 imputed data sets.
# 
#            Estimate Std.Error   t.value        df   P(>|t|)       RIV       FMI
# lambda        0.002     0.006     0.324   172.529     0.746     0.180     0.162
# rho          -0.058     0.066    -0.871    23.456     0.393     0.703     0.457
# log(pcap)     0.152     0.018     8.555  1708.294     0.000     0.051     0.050
# log(pc)       0.303     0.011    28.754  5401.140     0.000     0.028     0.028
# log(emp)      0.598     0.014    42.186  1791.654     0.000     0.050     0.048
# unemp1        0.006     0.004     1.381    23.057     0.181     0.714     0.461
# 
# Unadjusted hypothesis test as appropriate in larger samples.

# Call:
# 
# testModels(model = fit1, null.model = fit0)
# 
# Model comparison calculated from 5 imputed data sets.
# Combination method: D1
# 
#    F.value     df1     df2   P(>F)     RIV
#      1.907       1  23.057   0.181   0.714
# 
# Unadjusted hypothesis test as appropriate in larger samples.

See also Example 2 in ?testModels for a similar problem. However, the solution with lapply is more flexible and should be preferred.

Best wishes,
Simon