leeper/margins

Error when svyglm drops observations due to missingness

mattysimonson opened this issue · 2 comments

margins() throws an error when it is given a svyglm object that was fit on fewer rows than the survey design contains because rows with missing values were dropped. It appears unable to reconcile the number of rows in the original data with the number of rows actually used in the fit.

## load package
library("margins")

# Create a survey design using the survey package vignette
library(survey)
data(api)
dstrat <- svydesign(id = ~1, strata = ~stype, data = apistrat, fpc = ~fpc)
dclus2 <- svydesign(id = ~dnum + snum, fpc = ~fpc1 + fpc2, data = apiclus2)

# Run a regression
m1 <- svyglm(api00 ~ ell + meals + mobility, design = dclus2)

# So far so good, margins() works
margins(m1, design = dclus2)

# Now simulate what happens if values are missing
apiclus2_modified <- apiclus2
apiclus2_modified[1:10, "meals"] <- NA

# Create survey design and run regression
dclus2_modified <- svydesign(id = ~dnum + snum, fpc = ~fpc1 + fpc2, data = apiclus2_modified)
m2 <- svyglm(api00 ~ ell + meals + mobility, design = dclus2_modified)

# margins() fails
margins(m2, design = dclus2_modified)


## session info for your system
R version 3.6.1 (2019-07-05)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.5

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] survey_4.0      survival_3.1-12 Matrix_1.2-18   margins_0.3.26 

loaded via a namespace (and not attached):
 [1] MASS_7.3-51.4     compiler_3.6.1    DBI_1.1.0         tools_3.6.1      
 [5] splines_3.6.1     data.table_1.13.4 packrat_0.5.0     lattice_0.20-38  
 [9] mitools_2.4       prediction_0.3.14

# Error message:
Error in data.frame(..., check.rows = FALSE, check.names = FALSE, fix.empty.names = FALSE,  : 
  arguments imply differing number of rows: 252, 232

# Traceback:
13: stop(gettextf("arguments imply differing number of rows: %s", 
        paste(unique(nrows), collapse = ", ")), domain = NA)
12: data.frame(..., check.rows = FALSE, check.names = FALSE, fix.empty.names = FALSE, 
        stringsAsFactors = FALSE)
11: make_data_frame(out, fitted = unclass(tmp), se.fitted = sqrt(unname(attributes(tmp)[["var"]])))
10: prediction.svyglm(model = model, data = data.table::rbindlist(list(d0, 
        d1)), type = type, calculate_se = FALSE, ...)
9: prediction(model = model, data = data.table::rbindlist(list(d0, 
       d1)), type = type, calculate_se = FALSE, ...)
8: dydx.default(X[[i]], ...)
7: FUN(X[[i]], ...)
6: lapply(c(varslist$nnames, varslist$lnames), dydx, data = data, 
       model = model, type = type, eps = eps, as.data.frame = as.data.frame, 
       ...)
5: marginal_effects.glm(model = model, data = data, variables = variables, 
       type = type, eps = eps, varslist = varslist, ...)
4: marginal_effects(model = model, data = data, variables = variables, 
       type = type, eps = eps, varslist = varslist, ...)
3: build_margins(model = model, data = data_list[[i]], variables = variables, 
       type = type, vcov = vcov, vce = vce, iterations = iterations, 
       unit_ses = unit_ses, weights = wts, eps = eps, varslist = varslist, 
       ...)
2: margins.svyglm(m2, design = dclus2_modified)
1: margins(m2, design = dclus2_modified)
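The two row counts in the error message can be checked against the reproduction above. The following is a minimal sketch, not a fix: it rebuilds the same modified design, then compares the number of rows stored in the design with the number of rows the model was actually fit on. The names `n_design` and `n_used` are introduced here for illustration only. The doubling reflects an assumption read off the traceback, where `d0` and `d1` are stacked with `data.table::rbindlist()` before being passed to `prediction()`.

```r
library(survey)
data(api)

# Same setup as the report: set `meals` to NA for the first 10 rows
apiclus2_modified <- apiclus2
apiclus2_modified[1:10, "meals"] <- NA

dclus2_modified <- svydesign(id = ~dnum + snum, fpc = ~fpc1 + fpc2,
                             data = apiclus2_modified)
m2 <- svyglm(api00 ~ ell + meals + mobility, design = dclus2_modified)

# Illustrative names, not part of margins or survey
n_design <- nrow(dclus2_modified$variables)  # rows stored in the design
n_used   <- length(fitted(m2))               # rows kept after NA removal

# Assumption from the traceback: two copies of the data (d0, d1) are stacked,
# so the counts in the error message would be twice these values.
c(design = n_design, used = n_used,
  stacked_in = 2 * n_design, stacked_out = 2 * n_used)
```

If the stacked counts come out as 252 and 232, that would be consistent with the full data being passed in and the predictions being returned only for the rows without missing values.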

Did this ever get resolved? I am having the same problem. Thanks!

It looks like my pull request #159 solves this problem too.