Bug when set.seed() is called from within Experiment components
tiffanymtang opened this issue · 0 comments
If set.seed() is called from within DGPs/Methods/Evaluators/Visualizers and add_vary_across(.dgp = ...) is used, the seed affects all replicates after the first vary-across parameter value. Consequently, for every vary-across parameter value after the first, the results are identical across all replicates.
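The general effect can be reproduced in base R without simChef. The sketch below is only an illustration of how a set.seed() call inside a component leaks into later random draws; draw_data and seeded_method are hypothetical stand-ins, and the call order is an assumption, not a description of simChef's internals.

```r
# Hypothetical stand-ins for a DGP and a Method; this is not simChef code.
draw_data <- function(n) rnorm(n)
seeded_method <- function(x) {
  set.seed(1)  # side effect: resets the global RNG stream
  mean(x)
}

set.seed(1234)

# "Replicate 1"
rep1_first <- draw_data(3)    # depends on the outer seed
invisible(seeded_method(rep1_first))
rep1_second <- draw_data(3)   # drawn immediately after set.seed(1)

# "Replicate 2"
rep2_first <- draw_data(3)
invisible(seeded_method(rep2_first))
rep2_second <- draw_data(3)   # again drawn immediately after set.seed(1)

# Every draw that directly follows the seeded call is the same.
identical(rep1_second, rep2_second)  # TRUE
```

Any data generated right after the seeded call restarts from the same RNG state, which is consistent with the identical replicates described above.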
For now, it is highly recommended not to call set.seed() within DGPs/Methods/Evaluators/Visualizers.
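If a component really does need a fixed seed, one possible workaround is to restore the caller's RNG state afterwards so the seed does not leak out. The helper below, with_local_seed, is a hypothetical base R sketch (the withr package offers a similar with_seed function); whether it avoids this particular simChef bug has not been verified here.

```r
# Hypothetical helper: evaluate `code` under `seed`, then restore the
# global RNG state that was in place before the call.
with_local_seed <- function(seed, code) {
  had_seed <- exists(".Random.seed", envir = globalenv(), inherits = FALSE)
  if (had_seed) {
    old_seed <- get(".Random.seed", envir = globalenv(), inherits = FALSE)
  }
  on.exit({
    if (had_seed) {
      assign(".Random.seed", old_seed, envir = globalenv())
    } else if (exists(".Random.seed", envir = globalenv(), inherits = FALSE)) {
      rm(".Random.seed", envir = globalenv())
    }
  }, add = TRUE)
  set.seed(seed)
  code  # lazily evaluated argument: forced here, after set.seed()
}

# Usage: a variant of a seeded method whose seed stays local.
seed_fun_local <- function(X, y) {
  with_local_seed(1, {
    lm_df <- cbind(data.frame(X), .y = y)
    lm_fit <- lm(.y ~ ., data = lm_df)
    coef(lm_fit)
  })
}
```

The design choice is to treat the seed as scoped state: the seeded block is reproducible, while random draws made by the caller before and after it continue along the original stream.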
Reproducible Example:
library(simChef)
library(magrittr)
rm(list = ls())
N_REPS <- 2
set.seed(1234)
#### DGPs ####
dgp_fun <- function(n, p) {
  X <- matrix(rnorm(n * p), nrow = n, ncol = p)
  y <- rnorm(n)
  return(list(X = X, y = y))
}
dgp <- create_dgp(dgp_fun, .name = "DGP", n = 300, p = 3)
#### Methods ####
# method that does not touch the RNG
noseed_fun <- function(X, y) {
  lm_df <- cbind(data.frame(X), .y = y)
  lm_fit <- lm(.y ~ ., data = lm_df)
  return(coef(lm_fit))
}
noseed_method <- create_method(noseed_fun, .name = "No Seed")
# same method, but it calls set.seed() internally (this triggers the bug)
seed_fun <- function(X, y) {
  set.seed(1)
  lm_df <- cbind(data.frame(X), .y = y)
  lm_fit <- lm(.y ~ ., data = lm_df)
  return(coef(lm_fit))
}
seed_method <- create_method(seed_fun, .name = "Seed")
# this works: results differ across replicates for every value of n
experiment <- create_experiment() %>%
  add_dgp(dgp) %>%
  add_method(noseed_method) %>%
  add_vary_across(
    .dgp = "DGP", n = c(100, 200, 300)
  )
out <- run_experiment(experiment, n_reps = N_REPS)
out$fit_results %>%
  dplyr::arrange(n)
# this gives the same results across all replicates for each value of n after the first
experiment <- create_experiment() %>%
  add_dgp(dgp) %>%
  add_method(noseed_method) %>%
  add_method(seed_method) %>%
  add_vary_across(
    .dgp = "DGP", n = c(100, 200, 300)
  )
out <- run_experiment(experiment, n_reps = N_REPS)
out$fit_results %>%
  dplyr::arrange(n)