earthlab/rslurm

Execute slurm_apply() locally when submit is FALSE

wlandau-lilly opened this issue · 9 comments

Maybe this was the original intention, but in any case, I do not see output when I try the example at the top of the vignette with submit = FALSE:

test_func <- function(par_mu, par_sd) {
    samp <- rnorm(10^6, par_mu, par_sd)
    c(s_mu = mean(samp), s_sd = sd(samp))
}
pars <- data.frame(par_mu = 1:10,
                   par_sd = seq(0.1, 1, length.out = 10))
library(rslurm)
sjob <- slurm_apply(test_func, pars, jobname = 'test_apply',
                    nodes = 2, cpus_per_node = 2, submit = FALSE)
res <- get_slurm_out(sjob, outtype = 'table')
Warning message:
In get_slurm_out(sjob, outtype = "table") :
  The following files are missing: results_0.RDS, results_1.RDS

I believe the intended usage of slurm_apply(..., submit = FALSE) is given in the documentation:

If submit = TRUE, the job is sent to the cluster and a confirmation message (or error) is output to the console. If submit = FALSE, a message indicates the location of the saved data and script files; the job can be submitted manually by running the shell command sbatch submit.sh from that directory.

Fair enough. What about a dryrun flag? Since I might be developing on top of rslurm, I would appreciate the ability to go through the motions with platform-independent automated unit tests.

I agree this would be useful.

You may be interested in my typical workflow for using rslurm, which I've documented here. It's a bit clunky and uses do.call as a "local" version of slurm_apply.
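For readers who have not seen the linked write-up, the do.call idea can be sketched roughly as follows. This is a minimal base-R illustration, not the workflow from the link: `run_locally` is a hypothetical helper name, and it simply calls the function once per row of the parameter data frame, which is an assumption about what a "local" slurm_apply should do.

```r
set.seed(42)

# Same function and parameters as in the vignette example above.
test_func <- function(par_mu, par_sd) {
  samp <- rnorm(10^6, par_mu, par_sd)
  c(s_mu = mean(samp), s_sd = sd(samp))
}
pars <- data.frame(par_mu = 1:10,
                   par_sd = seq(0.1, 1, length.out = 10))

# Hypothetical helper (not part of rslurm): call f once per row of params,
# passing that row's columns as named arguments via do.call.
run_locally <- function(f, params) {
  rows <- lapply(seq_len(nrow(params)), function(i) {
    do.call(f, as.list(params[i, , drop = FALSE]))
  })
  # Stack the per-row results into one data frame, one row per parameter set.
  as.data.frame(do.call(rbind, rows))
}

local_res <- run_locally(test_func, pars)
```

Here `local_res` has one row per parameter set and the columns s_mu and s_sd, which is roughly the shape one would expect from get_slurm_out(sjob, outtype = 'table') after a real run.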

Another "hack" I use to mimic the kind of behavior you're describing is documented on stackoverflow.

You can see I've defined two versions of the is_suspicious() function, a "local" version and a "slurm" version.

My preferred terminology for this flag would be:

default:
slurm_apply()
slurm_apply(local = FALSE)

for a dry run:
slurm_apply(local = TRUE)
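Until such a flag exists, the same toggle could be approximated with a thin wrapper in user code. The sketch below is only an illustration of the proposal: `apply_params` and its `local` argument are hypothetical and not part of rslurm, and the cluster branch assumes the job has finished by the time get_slurm_out() is called.

```r
# Hypothetical wrapper, not part of rslurm: one switch between a local
# run and a Slurm submission.
apply_params <- function(f, params, local = FALSE, ...) {
  if (local) {
    # Local branch: evaluate f row by row in the current R session.
    rows <- lapply(seq_len(nrow(params)), function(i) {
      do.call(f, as.list(params[i, , drop = FALSE]))
    })
    return(as.data.frame(do.call(rbind, rows)))
  }
  # Cluster branch: requires rslurm and a working Slurm installation.
  # Extra arguments (jobname, nodes, cpus_per_node, ...) are passed through.
  sjob <- rslurm::slurm_apply(f, params, ...)
  rslurm::get_slurm_out(sjob, outtype = "table")
}

# Usage on a machine without Slurm, e.g. in a unit test:
add_pair <- function(a, b) c(total = a + b)
res <- apply_params(add_pair, data.frame(a = 1:3, b = 4:6), local = TRUE)
```

With local = TRUE nothing from rslurm is touched, so the call can run in platform-independent automated tests; flipping the argument to FALSE sends the same function and parameters to the cluster.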

All this reminds me of mockr.

There's currently a non-exported function 'local_slurm_array' in the slurm_utils.R source file, which we use in package unit tests to execute a slurm_job object locally (the object is first created with slurm_apply, with submit = FALSE).

Sounds like you did the heavy lifting already. I may be in the minority here, but I feel strongly about the ability to toggle between a project's test mode and production mode with the flip of a master switch. That's the whole reason why I wrote downsize.

I've addressed this old issue by cleaning up the local_slurm_array function and exporting it with the package, rather than leaving it as an unexported utility function used only for tests. See #49