Error when passing optional arguments to `cross_validate`
osofr opened this issue · 10 comments
Minimal example below, taken from vignette. The function cross_validate
currently errors out whenever I try to pass any optional arguments to cv_fun
. The error appears to be downstream with future
package. This is crucial to my functionality, I have to pass optional arguments to cv_fun
. Any advice on work-arounds?
library(origami)
cvlm <- function(fold, mydata) {
train_data <- training(mydata)
valid_data <- validation(mydata)
mod <- lm(mpg ~ ., data = train_data)
preds <- predict(mod, newdata = valid_data)
list(coef = data.frame(t(coef(mod))), SE = ((preds - valid_data$mpg)^2))
}
# cross-validated estimate
folds <- make_folds(mtcars)
results <- cross_validate(cvlm, folds, mtcars)
results <- cross_validate(cvlm, folds, mydata = mtcars)
In general, I think it might not be the best practice to provide an example with a function cvlm
that depends on some object (mtcars
) that was defined in the calling environment of this function. This creates quite a bit of confusion and makes it hard to read the code. That's just my opinion though.
Yep I ran into that exact same issue myself, and reported it here (see workarounds): HenrikBengtsson/future#156
Thanks for reporting this error, Oleg. And, thanks for having already reported this error in future
to Henrik, Chris.
I'm honestly not sure of a quick fix off of the top of my head, but can do a bit of digging and report anything back here. Based on the conversation in the issue reported to future
, it looks like Henrik has plans to fix this soon.
Also, thanks for pointing out that slip up in the definition of the cvlm
function. I definitely agree that having the function defined in that way is bad practice and will update that example (and any others like it) to reflect this.
future.globals=FALSE
should work fine for plan(sequential)
and plan(multicore)
, but for plan(multisession)
you'll need to specify the globals and packages needed by the different sessions manually. You can see an example of that here: https://github.com/jeremyrcoyle/SuperLearner/blob/integrate_origami/tests/testthat/test-origami_SuperLearner.R#L39
This seems like something of a regression from when we were using foreach
. Depending on how long the underlying issue takes to get fixed, we might want to consider going back to foreach
or having an option to use either foreach
or future
. Thoughts?
Ok -- good to know that you encountered this problem in the origami-SL integration as well.
Given that its unclear when the error might be resolved on the foreach
side, the most sensible thing seems to be to restore the foreach
implementation, having either an option for users to specify use of foreach
v. future
(as you suggested), or simply having foreach
be a failsafe for any problems that might arise from use of future
(the latter assumes that the foreach
implementation was fairly robust and suggests something in the style of try-catch).
This wasn't encountered in the original SL implementation, it was added today: https://github.com/jeremyrcoyle/SuperLearner/commit/b0b3201befb7c262a8a4220ce45e403903056c14
With examples (and/or tests) of the parallelization options in origami it would have been identified though.
@ck37 I've been aware of the issue for a while now (see https://app.asana.com/0/270379281729322/369465305300769/f), but haven't had time to look into it in more detail (I'm currently on vacation). There's a commented out test here: https://github.com/jeremyrcoyle/origami/blob/master/tests/testthat/test_future_plan.R#L42 that is disabled because it was failing in some environments, and it didn't seem to make sense to delay our package's release due to an upstream issue we don't yet fully understand/have to wait for Henrik to fix.
Henrik has fixed the bug: HenrikBengtsson/future#156
Looks like we can move to close this issue once we confirm that the change in globals
resolved our problem (after installing globals
via remotes::install_github("HenrikBengtsson/globals@develop")
as per Henrik's recommendation).
It sounds like he's planning a new CRAN release, so we should be ok without having to mess with how dependencies are currently specified for origami
.
Closing since globals
0.10.1 has been added to CRAN and made a dependency for future
.