Is it confusing to call ssh a clustermq scheduler?
mattwarkentin opened this issue
Hi @mschubert,
While reading over some of the clustermq documentation again, I am reminded of some confusion I had early on when using clustermq to SSH into my HPC. It seems to me that `ssh` isn't really a scheduler, but rather a means to connect to the scheduler/resource you want to use. So I think it can be confusing to list it alongside the schedulers, when you actually need to choose SSH + a scheduler. This is really just semantics, but I wonder if it would make things clearer to separate `ssh` from the schedulers (see here).
For example, the docs show this as a way to run `multiprocess` on a remote machine via `ssh`:
```r
# REMOTE
options(
    clustermq.scheduler = "multiprocess"  # or multicore, LSF, SGE, Slurm etc.
)

# LOCAL
options(
    clustermq.scheduler = "ssh",
    clustermq.ssh.host = "user@host",    # use your user and host, obviously
    clustermq.ssh.log = "~/cmq_ssh.log"  # log for easier debugging
)
```
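For context, here is a minimal sketch of how a job would then be submitted with this setup (assuming the options above are already set on the local and remote machines, respectively; the toy function `fx` is just for illustration):

```r
# LOCAL: with clustermq.scheduler = "ssh" set, Q() tunnels to the remote
# host, where the "multiprocess" scheduler actually runs the workers
library(clustermq)

fx <- function(x, y) x * 2 + y
Q(fx, x = 1:3, const = list(y = 10), n_jobs = 1)
# should return list(12, 14, 16)
```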
It is sort of confusing that you specify two "different" schedulers when the actual scheduler is `multiprocess`; you just happen to be connecting via `ssh`. It might help users' mental model, and clean up the API a little, to separate things out into something like:
```r
# LOCAL
options(
    clustermq.connection = "ssh",
    clustermq.ssh.host = "user@host"
)
```
Where the default connection is `"local"` or something.
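To make the idea concrete, here is a rough sketch of how the option could be resolved internally (purely hypothetical: `clustermq.connection` and the `"local"` default are not part of the current API, and the real dispatch logic may look quite different):

```r
# Hypothetical dispatch: pick the transport first, then the scheduler
connection <- getOption("clustermq.connection", "local")
if (connection == "ssh") {
    host <- getOption("clustermq.ssh.host")
    # tunnel to `host`; the remote session then reads clustermq.scheduler
} else {
    # no tunnel: use clustermq.scheduler directly in this session
}
```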
What do you think?