nismod/smif

An alternative to the current linear scheduler

willu47 opened this issue · 1 comments

Is your feature request related to a problem? Please describe.
The current naive scheduler can only run jobs one after another. It would be great to exploit parallel paths when running on a laptop, while still being able to scale up to a cluster or the cloud.

Describe the solution you'd like
Snakemake provides all of these features. Smif could write a Snakemake workflow file from the system-of-systems configuration, which could then be deployed on a laptop or a cluster. Snakemake also allows each "rule" within a workflow to be run in a specific conda environment or in a container, enabling good scalability.
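As a rough illustration of the idea, here is a minimal sketch of how a Snakefile might be generated from a dependency graph. Everything in it is an assumption made for illustration: the `render_snakefile` helper, the model names, the `results/<model>.done` marker files, and the `echo` placeholder command are hypothetical and are not part of smif. It only relies on Snakemake's standard `rule`, `input`, `output`, `shell` and `touch()` syntax.

```python
def render_snakefile(dependencies, command="echo running {model}"):
    """Render Snakefile text with one rule per model.

    `dependencies` maps each model name to the models it depends on.
    `command` is a placeholder shell command, not a real smif invocation.
    """
    def marker(model):
        # Hypothetical marker file that records a finished model
        return f'"results/{model}.done"'

    def block(keyword, items):
        # One indented Snakemake section with comma-separated entries
        body = ",\n".join(f"        {item}" for item in items)
        return [f"    {keyword}:", body]

    upstream_only = {d for deps in dependencies.values() for d in deps}
    models = sorted(set(dependencies) | upstream_only)

    # The conventional first rule requests every marker file
    lines = ["rule all:"] + block("input", [marker(m) for m in models])

    for model in models:
        lines += ["", f"rule {model}:"]
        upstream = sorted(dependencies.get(model, []))
        if upstream:
            lines += block("input", [marker(u) for u in upstream])
        lines += block("output", [f"touch({marker(model)})"])
        lines += block("shell", [f'"{command.format(model=model)}"'])

    return "\n".join(lines) + "\n"


# Hypothetical system-of-systems dependencies
example = {
    "energy_demand": ["population"],
    "water_supply": ["population"],
    "energy_supply": ["energy_demand", "water_supply"],
}
snakefile_text = render_snakefile(example)
```

Writing `snakefile_text` to a file named `Snakefile` and running `snakemake --cores 4` in that directory should let Snakemake run `energy_demand` and `water_supply` concurrently once `population` has finished, since neither depends on the other.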

Describe alternatives you've considered
There are many alternative workflow management systems, but this seems a good candidate.

Additional context
Snakemake has an experimental GUI and can render images of a workflow using dot. It's written in pure Python. It can be installed with pip or from the bioconda channel.

I like the idea of getting smif clearly out of the scheduler business 🙂

In the meantime, here's a practical way to run a stack of smif model runs in parallel, which should tackle one variant of the problem (but won't help with running jobs in parallel within a single model run):

cat batchfile | parallel -j 20 smif run {} -v -i local_binary

Here, batchfile is a text file containing one model run name per line. smif could run these sequentially using smif -b batchfile. Piping the file into parallel instead means the model runs execute concurrently, up to the limit optionally passed in with -j (jobs).
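For anyone without GNU parallel installed, roughly the same thing can be sketched in Python with the standard library. This is an untested-against-smif sketch: the helper names (`read_batchfile`, `build_command`, `run_all`) are hypothetical, and the command simply mirrors the arguments of the shell one-liner above. The `runner` parameter exists only so the subprocess call can be swapped out.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def read_batchfile(path):
    """Return the model run names listed one per line, skipping blank lines."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def build_command(name):
    """Build the same command the shell one-liner passes to parallel."""
    return ["smif", "run", name, "-v", "-i", "local_binary"]


def run_all(names, jobs=20, runner=subprocess.run):
    """Run each model run as a separate process, at most `jobs` at a time.

    Returns a dict mapping model run name to process return code.
    """
    def run_one(name):
        return name, runner(build_command(name)).returncode

    # Threads are enough here: each one just waits on a child process
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return dict(pool.map(run_one, names))
```

Calling `run_all(read_batchfile("batchfile"), jobs=20)` would then correspond to the `-j 20` invocation. As with the shell version, this only parallelises across model runs, not within a single one.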