This cookiecutter provides a template Snakemake profile for configuring Snakemake to run on the SLURM Workload Manager. The profile defines three scripts

- `slurm-submit.py` - submits a jobscript to slurm
- `slurm-jobscript.sh` - a template jobscript
- `slurm-status.py` - checks the status of jobs in slurm

and a configuration file `config.yaml` that defines default values for snakemake command line arguments. The default `config.yaml` file is:
```yaml
restart-times: 3
jobscript: "slurm-jobscript.sh"
cluster: "slurm-submit.py"
cluster-status: "slurm-status.py"
max-jobs-per-second: 1
max-status-checks-per-second: 10
local-cores: 1
latency-wait: 60
```
Given an installed profile `profile_name`, when snakemake is run with `--profile profile_name`, the configuration above would imply the following snakemake call:

```
snakemake --jobscript slurm-jobscript.sh --cluster slurm-submit.py --cluster-status slurm-status.py --restart-times 3 --max-jobs-per-second 1 --max-status-checks-per-second 10 --local-cores 1 --latency-wait 60
```

plus any additional options to snakemake that the user has applied.

See the snakemake documentation on profiles for more information.
To create a slurm profile from the cookiecutter, simply run

```
cookiecutter https://github.com/Snakemake-Profiles/slurm.git
```

in a directory. You will be prompted to set some values for your profile (here assumed to be called `profile_name`), after which the profile scripts and configuration file will be installed in the current working directory as `./profile_name`. Then you can run Snakemake with

```
snakemake --profile profile_name ...
```

Note that the `--profile` argument can be either a relative or absolute path. In addition, snakemake will search for a corresponding folder `profile_name` in `/etc/xdg/snakemake` and `$HOME/.config/snakemake`, where globally accessible profiles can be placed.
One typical use case is to set up a profile that uses a specific slurm account:

```console
$ cd ~ && mkdir -p my_project && cd my_project
$ cookiecutter https://github.com/Snakemake-Profiles/slurm.git
profile_name [slurm]: slurm.my_account
sbatch_defaults []: account=my_account no-requeue exclusive
cluster_config []:
Select advanced_argument_conversion:
1 - no
2 - yes
Choose from 1, 2 [1]:
cluster_name []:
```

The command `snakemake --profile slurm.my_account ...` will submit jobs with `sbatch --parsable --account=my_account --no-requeue --exclusive`. Note that the option `--parsable` is always added.
It is possible to install multiple profiles in a project directory. Assuming our HPC defines a multi-cluster environment, we can create a profile that uses a specified cluster:
```console
$ cookiecutter slurm
profile_name [slurm]: slurm.dusk
sbatch_defaults []: account=my_account
cluster_config []:
Select advanced_argument_conversion:
1 - no
2 - yes
Choose from 1, 2 [1]:
cluster_name []: dusk
```
(Note that once a cookiecutter has been installed, we can reuse it without using the GitHub URL.)
The command `snakemake --profile slurm.dusk ...` will now submit jobs with `sbatch --parsable --account=my_account --cluster=dusk`. In addition, the `slurm-status.py` script will check for jobs in the `dusk` cluster job queue.
As a final example, assume we want to use advanced argument conversion:
```console
$ cookiecutter slurm
profile_name [slurm]: slurm.convert_args
sbatch_defaults []: account=my_account
cluster_config []:
Select advanced_argument_conversion:
1 - no
2 - yes
Choose from 1, 2 [1]: 2
cluster_name []:
```
The command `snakemake --profile slurm.convert_args ...` will now submit jobs with `sbatch --parsable --account=my_account`. The advanced argument conversion feature will attempt to adjust memory settings and number of cpus to comply with the cluster configuration. See the section on advanced argument conversion below for more information.

The cookiecutter prompts for the following options:

- `profile_name`: A name to address the profile via the `--profile` Snakemake option.
- `sbatch_defaults`: List of default arguments to sbatch, e.g. `qos=short time=60`. This is a convenience argument to avoid `cluster_config` for a few arguments.
- `cluster_config`: Path to a YAML or JSON configuration file analogous to the Snakemake `--cluster-config` option. The path may be relative to the profile directory or absolute, and may include environment variables (e.g. `$PROJECT_ROOT/config/slurm_defaults.yaml`).
- `advanced_argument_conversion`: If True, try to adjust/constrain mem, time, nodes and ntasks (i.e. cpus) to the parsed or default partition after converting resources. This may fail due to heterogeneous slurm setups, i.e. code adjustments will likely be necessary.
- `cluster_name`: Some HPCs define multiple SLURM clusters. Set the cluster name, or leave empty to use the default. This will add the `--cluster` string to the sbatch defaults, and adjust `slurm-status.py` to check status on the relevant cluster.
Default arguments to `snakemake` may be adjusted in the `<profile path>/config.yaml` file.

Arguments are overridden in the following order and must be named according to sbatch long option names:

- `sbatch_defaults` cookiecutter option
- Profile `cluster_config` file `__default__` entries
- Snakefile threads and resources (time, mem)
- Profile `cluster_config` file entries
- `--cluster-config` parsed to Snakemake (deprecated since Snakemake 5.10)
- Any other argument conversion (experimental, currently time, ntasks and mem) if `advanced_argument_conversion` is True.
Resources specified in Snakefiles must all be in the correct unit/format as expected by `sbatch`. The implemented resource names are given (and may be adjusted) in the `slurm_utils.RESOURCE_MAPPING` global. This is intended for system-agnostic resources such as time and memory.
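For example, a rule can declare memory in megabytes and runtime in minutes, which sbatch accepts directly for its memory and time options. The sketch below is only illustrative: the rule name, file names and shell command are hypothetical, and the resource names `mem_mb` and `runtime` follow the example further down.

```python
# Illustrative Snakefile rule (hypothetical names); resources are given in the
# units sbatch expects: memory in megabytes, runtime in minutes.
rule sort_file:
    input:
        "data.txt"
    output:
        "data.sorted.txt"
    threads: 4
    resources:
        mem_mb=16000,   # 16 GB total memory, in MB
        runtime=120     # 2 hours, in minutes
    shell:
        "sort {input} > {output}"
```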
By default, Snakefile resources are provided as-is to the `sbatch` submission step. Although `sbatch` does adjust options to match the cluster configuration, it will throw an error if resources exceed available cluster resources. For instance, if the memory is set larger than the maximum memory of any node, `sbatch` will exit with the message

```
sbatch: error: CPU count per node can not be satisfied
sbatch: error: Batch job submission failed: Requested node configuration is not available
```
By choosing the advanced argument conversion upon creating a profile, an attempt will be made to adjust memory, cpu and time settings if these do not comply with the cluster configuration. As an example, consider a rule with the following resources and threads:
```python
rule bwa_mem:
    resources:
        mem_mb = lambda wildcards, attempt: attempt * 8000,
        runtime = lambda wildcards, attempt: attempt * 1200
    threads: 1
```
Assume further that the available cores provide 6400MB memory per core. If the job reaches a peak memory (8000MB), it will likely be terminated. The advanced argument conversion will compare the requested memory requirements to those available (defined as memory per cpu times number of requested cpus) and adjust the number of cpus accordingly.
Other checks are also performed to make sure memory, runtime, and number of cpus don't exceed the maximum values specified by the cluster configuration.
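As a rough sketch of the arithmetic involved (this is not the profile's actual implementation in `slurm_utils`, just an illustration of the memory-to-cpu adjustment described above):

```python
import math

def cpus_for_memory(mem_mb, requested_cpus, mem_per_cpu_mb):
    """Raise the cpu count until cpus * mem_per_cpu covers the memory request.

    Illustration only; the real conversion also constrains time, nodes and
    ntasks against partition limits.
    """
    needed = math.ceil(mem_mb / mem_per_cpu_mb)
    return max(requested_cpus, needed)

# For the bwa_mem example: 8000 MB requested, 1 thread, 6400 MB per core -> 2 cpus
print(cpus_for_memory(8000, 1, 6400))
```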
Although cluster configuration has been officially deprecated in favor of profiles since snakemake 5.10, the `--cluster-config` option can still be used to configure default and per-rule options. Upon creating a slurm profile, the user will be prompted for the location of a cluster configuration file, which is a YAML or JSON file (see example below).
```yaml
__default__:
  account: staff
  mail-user: slurm@johndoe.com

large_memory_requirement_job:
  constraint: mem2000MB
  ntasks: 16
```
The `__default__` entry will apply to all jobs.
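Entries other than `__default__` are keyed by rule name, so the `large_memory_requirement_job` options above would apply to jobs submitted for a Snakefile rule of the same name. A minimal, hypothetical sketch of such a rule (input, output and shell command are placeholders):

```python
# Hypothetical rule whose name matches the per-rule cluster_config entry above;
# its jobs would be submitted with --constraint=mem2000MB and --ntasks=16.
rule large_memory_requirement_job:
    input:
        "big_input.dat"
    output:
        "big_output.dat"
    shell:
        "process_big_data {input} > {output}"
```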
Tests can be run on an HPC running SLURM or locally in a docker stack. To execute tests, run

```
pytest -v -s tests
```

from the source code root directory. Test options can be configured via the pytest configuration file `tests/pytest.ini`.

Test dependencies are listed in `test-environment.yml` and can be installed in e.g. a conda environment.
Test fixtures are set up in temporary directories created by pytest. Usually fixtures end up in `/tmp/pytest-of-user` or something similar. In any case, these directories are usually not accessible on the nodes where the tests are run. Therefore, when running tests on an HPC running SLURM, test fixtures will by default be written to the directory `.pytest` relative to the current working directory (which should be the source code root!). You can change the location with the pytest option `--basetemp`.
For local testing the test suite will deploy a docker stack `cookiecutter-slurm` that runs two services based on the following images:

The docker stack will be automatically deployed provided that the user has installed docker and enabled docker swarm (`docker swarm init`). The docker stack can also be deployed manually from the top-level directory as follows:

```
DOCKER_COMPOSE=tests/docker-compose.yaml ./tests/deploystack.sh
```

See the deployment script `tests/deploystack.sh` for details.
Testing of the cookiecutter template is enabled through the pytest plugin for Cookiecutters.
The slurm tests all depend on fixtures and fixture factories defined in `tests/conftest.py` to function properly. The `cookie_factory` fixture factory generates slurm profiles, and the `data` fixture copies the files `tests/Snakefile` and `tests/cluster-config.yaml` to the test directory. There is also a `datafile` fixture factory that copies any user-provided data file to the test directory. Finally, the `smk_runner` fixture provides an instance of the `tests/wrapper.SnakemakeRunner` class for running Snakemake tests.
As an example, `tests/test_slurm_advanced.py` defines a fixture `profile` that uses the `cookie_factory` fixture factory to create a slurm profile with advanced argument conversion:

```python
@pytest.fixture
def profile(cookie_factory, data):
    cookie_factory(advanced="yes")
```

The test `tests/test_slurm_advanced.py::test_adjust_runtime` depends on this fixture and `smk_runner`:

```python
def test_adjust_runtime(smk_runner, profile):
    smk_runner.make_target(
        "timeout.txt", options=f"--cluster-config {smk_runner.cluster_config}"
    )
```
The `make_target` method makes a snakemake target with additional options passed to Snakemake.
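As a hedged sketch of how these fixtures could combine in a test of your own (the target name and extra option are hypothetical and assume a matching rule exists in `tests/Snakefile`; it also assumes `cookie_factory` can be called without arguments to get a default profile):

```python
import pytest

@pytest.fixture
def profile(cookie_factory, data):
    # Generate a slurm profile with default cookiecutter settings (assumption:
    # cookie_factory accepts no arguments for the defaults)
    cookie_factory()

def test_simple_target(smk_runner, profile):
    # Ask Snakemake to build a hypothetical target from tests/Snakefile,
    # passing extra command line options through make_target
    smk_runner.make_target("hello.txt", options="--verbose")
```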
See `tests/test_issues.py` and `tests/Snakefile_issue49` for an example of how to write a test with a custom Snakefile.