hu-macsy/simexpal

Tighter integration with batch schedulers

Closed this issue · 0 comments

Right now, we can use Slurm to launch experiments, but we cannot monitor or kill them through Slurm.

  • In simex e, query the batch scheduler to determine whether jobs are still alive.
  • Add a command to kill currently running jobs.

As an implementation strategy, we could store the job IDs of experiments in some file and use that to invoke squeue and scancel.
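
A minimal sketch of that strategy, assuming job IDs are kept in a JSON file that maps experiment names to Slurm job IDs. The helper names (`record_job`, `load_jobs`, `alive_jobs`, `kill_jobs`) and the file layout are hypothetical and not part of simexpal. Only the `squeue` and `scancel` invocations are real Slurm commands, and the command runner is a parameter so the logic can be exercised without a cluster.

```python
import json
import subprocess
from pathlib import Path


def load_jobs(path):
    """Return the stored {experiment name: job ID} mapping (empty if no file yet)."""
    path = Path(path)
    return json.loads(path.read_text()) if path.exists() else {}


def record_job(path, experiment, job_id):
    """Remember the Slurm job ID under which an experiment was submitted."""
    jobs = load_jobs(path)
    jobs[experiment] = str(job_id)
    Path(path).write_text(json.dumps(jobs, indent=2))


def run_command(argv):
    """Run a command and return its stdout; a non-zero exit status is ignored here."""
    result = subprocess.run(argv, capture_output=True, text=True, check=False)
    return result.stdout


def alive_jobs(path, run=run_command):
    """Ask squeue which recorded jobs it still lists (pending or running)."""
    jobs = load_jobs(path)
    if not jobs:
        return {}
    # --noheader and --format=%i make squeue print one bare job ID per line.
    out = run(["squeue", "--noheader", "--format=%i",
               "--jobs=" + ",".join(jobs.values())])
    listed = {line.strip() for line in out.splitlines() if line.strip()}
    return {exp: jid for exp, jid in jobs.items() if jid in listed}


def kill_jobs(path, run=run_command):
    """Pass every recorded job ID to scancel and return the IDs that were passed."""
    jobs = load_jobs(path)
    if jobs:
        run(["scancel", *jobs.values()])
    return sorted(jobs.values())
```

In this sketch, `simex e` could call `alive_jobs` to decide whether an experiment is still running, and a new kill command could call `kill_jobs`. Error handling (for example, `squeue` rejecting an unknown job ID) and job arrays are deliberately left out.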