mfdb_sample_count does not return bootstrapped samples
I am attempting to set up a bootstrapping gadget model; however, I cannot seem to export samples from bootstrapped areas using mfdb_sample_count. I am following code already set up here. Below is a minimum reproducible example that gives the same effect.
Basically, I am setting up n resampled areacells in a list and then trying to export them using mfdb_sample_count, which should return a list with n data.frames. However, it only returns 1. Am I thinking about this correctly or not?
Thanks for any help.
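To make the expectation concrete, here is a minimal base-R sketch of what bootstrapping an area group is assumed to mean here: drawing the members of the group with replacement n times, which gives a list of n resampled groups. It does not call mfdb, and the helper name resample_group is hypothetical; it only illustrates the idea and is not how mfdb_bootstrap_group is implemented.

```r
# Hypothetical helper (not part of mfdb): draw n bootstrap resamples of a
# group's members, sampling with replacement.
resample_group <- function(members, n, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  lapply(seq_len(n), function(i) {
    # Index-based sampling avoids the sample() surprise for length-1 input
    members[sample.int(length(members), replace = TRUE)]
  })
}

areas <- 1:10
resamples <- resample_group(areas, n = 10, seed = 270)

length(resamples)    # one resampled group per bootstrap replicate
lengths(resamples)   # each replicate keeps the original group size
```

Under that reading, a bootstrapped query would be expected to give one result per replicate, which is why a single data.frame looked wrong.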
Minimum Reproducible Example
library(mfdb)

# set up a simple database and populate it with data
mdb <- mfdb('zebrafish')
year <- 1:10
month <- 1
area <- 1:10
base_data <- expand.grid(year = year,
                         month = month,
                         areacell = area)
count <- sample(1:1000, 100, replace = TRUE)
data <- cbind(base_data, count)
mfdb_import_area(mdb, data.frame(name = 1:10))

# solved: must also import division for bootstrapping to work
# these two lines were not included originally
divList <- as.list(1:10); names(divList) <- 1:10
mfdb_import_division(mdb, divList)
mfdb_import_survey(mdb, data, data_source = 'test')

# attempt to retrieve samples
defaults <- list(
    areacell = mfdb_group(`1` = 1:10),
    timestep = mfdb_timestep_quarterly,
    year = 1:10
)
n <- 10 # number of bootstrap samples to retrieve

# query without bootstrapping
export_test <- mfdb_sample_count(mdb,
                                 cols = NULL,
                                 params = c(list(), defaults))

# replace the area group with n bootstrap resamples of it
defaults <- within(defaults,
                   {areacell = mfdb_bootstrap_group(n,
                                                    defaults$areacell,
                                                    seed = 270)})
bootstrap_test <- mfdb_sample_count(mdb,
                                    cols = NULL,
                                    params = c(list(), defaults))

# the bootstrapped query should return more list elements, but does not
length(export_test) == length(bootstrap_test)
# > TRUE

# clean up: disconnect and drop the test schema
mfdb_disconnect(mdb)
mfdb('zebrafish', destroy_schema = TRUE)
Okay, I believe this has to do with the distinction between column names in the data_in argument for mfdb_import_survey. For example, if we change the name in the defaults list to area instead of areacell, then mfdb_sample_count returns a list with the appropriate number of bootstrap samples. However, every data.frame in the list has 0 columns and 0 rows. But if I try to change the imported data column name in mfdb_import_survey to area instead of areacell, I get the following error:
Error in sanitise_col(mdb, data_in, "areacell", lookup = "areacell") :
Input data is missing 'areacell'. Columns available: year,month,area,count
How can I import data_in with a column name of area instead of areacell?
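Going by the error message, the importer looks for a column literally named areacell, so if the data arrive with a column called area, one workaround is to rename it before importing. A small base-R sketch, assuming a data frame shaped like the one in the example (the name survey_data is only for illustration, and nothing here touches the database):

```r
# Data shaped like the example above, but with the column called 'area'
survey_data <- expand.grid(year = 1:10, month = 1, area = 1:10)
survey_data$count <- seq_len(nrow(survey_data))

# Rename 'area' to 'areacell', the column name the error message asks for
names(survey_data)[names(survey_data) == "area"] <- "areacell"

names(survey_data)
```

This only addresses the import error; it does not by itself explain why the bootstrapped query returns a single result.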
I think I've solved my own problem, thanks to help from @bthe. The areas must also be added with mfdb_import_division, as in the following two lines.
divList <- as.list(1:10); names(divList) <- 1:10
mfdb_import_division(mdb, divList)
See updated code in the original post.
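For anyone reading along, the division list is just a named list that maps each division name to the areacells it contains; here each division holds exactly one areacell. A slightly more explicit way to build the same structure in base R is sketched below. setNames is base R, and the variable names are only for illustration.

```r
# One division per areacell: division "1" contains areacell 1, and so on
areacells <- 1:10
divList <- setNames(as.list(areacells), areacells)

# Inspect the first few divisions
str(divList[1:3])

# Same structure as the two-step version used in the fix
divList2 <- as.list(1:10); names(divList2) <- 1:10
identical(divList, divList2)
```

A division could equally hold several areacells, for example list(north = 1:5, south = 6:10), if the survey areas should be resampled in larger blocks; whether that suits a given model is a separate question.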
Yeah, area -> divisions is an odd special case, picked up from DST^2. Glad you got it sorted though, sorry I didn't get here in time.