insitro/redun

SLURM/HPC executor


From what I'm reading in the docs, the three executors are AWS Batch, AWS Glue, and local. However, for HPC users it would be helpful to have a dedicated executor that submits tasks to the cluster's queueing system (e.g. SLURM). A slightly easier way to do this in Python might be to write a single Dask executor: since Dask has implementations for many platforms (e.g. http://jobqueue.dask.org/en/latest/), you kind of get this for free.
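To make the Dask idea concrete, here is a rough sketch of what such an adapter could look like. It is not redun's executor API (the thread does not show that interface): `FutureBackedExecutor`, `make_slurm_backend`, and `run_local_demo` are hypothetical names invented for illustration. The only assumption about the backend is that it has a `submit(fn, *args, **kwargs)` method returning a future with `.result()`, which is true of both `concurrent.futures` executors and `dask.distributed.Client`. The demo uses a local thread pool so it runs without a cluster.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, List


class FutureBackedExecutor:
    """Hypothetical adapter (not part of redun): forwards tasks to any
    backend exposing submit(fn, *args, **kwargs) -> future."""

    def __init__(self, backend: Any) -> None:
        self.backend = backend

    def run(self, fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        # Hand the task to the backend and return its future immediately.
        return self.backend.submit(fn, *args, **kwargs)

    def gather(self, futures: List[Any]) -> List[Any]:
        # Block until every future is done, preserving submission order.
        return [future.result() for future in futures]


def make_slurm_backend(queue: str, cores: int = 1, memory: str = "4GB", jobs: int = 2) -> Any:
    """Assumed wiring for a SLURM cluster via dask-jobqueue.

    Imports are local so the rest of the sketch runs without Dask installed.
    The queue name and resource values are placeholders.
    """
    from dask.distributed import Client
    from dask_jobqueue import SLURMCluster

    cluster = SLURMCluster(queue=queue, cores=cores, memory=memory)
    cluster.scale(jobs=jobs)  # ask the scheduler for this many worker jobs
    return Client(cluster)


def square(x: int) -> int:
    return x * x


def run_local_demo() -> List[int]:
    # A thread pool stands in for the cluster; it has the same submit() shape.
    with ThreadPoolExecutor(max_workers=2) as pool:
        executor = FutureBackedExecutor(pool)
        futures = [executor.run(square, n) for n in range(4)]
        return executor.gather(futures)
```

On a real cluster you would pass `make_slurm_backend("some-partition")` to `FutureBackedExecutor` instead of the thread pool. Swapping `SLURMCluster` for another dask-jobqueue cluster class is what would give you the other schedulers "for free".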

Thanks @multimeric for posting this issue. You are correct that we intend to add additional executors over time, and HPC clusters are indeed an important use case. Piggybacking off of Dask to get multiple executor backends at once is a great idea to investigate. Thanks for sharing!

Hoeze commented

Hi @mattrasmus, I would be interested in trying out redun as an alternative to Snakemake, but according to the documentation the only viable way to use redun at scale is by running it on AWS. This is a big no-go.

Are there any updates on a SLURM / HPC executor?
Is it possible to configure custom executors, as Snakemake allows?

+1. I'd also recommend considering the use of PSI/J to streamline such an addition: https://github.com/ExaWorks/psij-python. It is a lightweight dependency with a unified interface to various job schedulers, including up-and-coming ones.