LLNL/maestrowf

study key 'gpus' results in unsubmittable sbatch script

Jmast opened this issue · 2 comments

Jmast commented

On LC pascal/surface, the sbatch script generated when a step specifies the gpus key cannot be submitted (maestro 1.1.9dev0).
The gpus key worked on maestro 1.1.6, but it was likely being ignored there.

@Jmast -- I think we came to the conclusion that this was a cluster-specific configuration issue: the SLURM node definitions on that cluster didn't include the GPUs. As a result, actually requesting GPUs caused SLURM to bounce the submission back with an error. I'm inclined to say that isn't necessarily a Maestro bug. Thoughts?

Jmast commented

I think that is correct ... LC needs to fix the slurm config to support gres on clusters that have gpus.
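
To illustrate the failure mode discussed above, here is a minimal sketch of how a gpus value could end up in an sbatch header. It assumes the key is translated into a `--gres=gpu:<count>` directive, which the thread does not confirm. `sbatch_header` and its parameters are hypothetical names for illustration only and are not Maestro's actual code.

```python
def sbatch_header(nodes, procs, gpus=None):
    """Build a minimal sbatch header (hypothetical helper, not Maestro's code)."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --ntasks={procs}",
    ]
    if gpus:
        # Assumption: the gpus key maps to a generic-resource request.
        # SLURM can only satisfy --gres=gpu:<count> if the cluster's node
        # definitions declare a matching Gres entry.
        lines.append(f"#SBATCH --gres=gpu:{gpus}")
    return "\n".join(lines)


# With gpus set, the header gains a --gres line; without it, no such line.
print(sbatch_header(nodes=1, procs=1, gpus=2))
```

If the key is ignored (as was likely the case on 1.1.6), no `--gres` line is written and the script submits regardless of how the nodes are defined. Once the line is emitted, submission depends on the cluster's SLURM configuration listing GPUs as a gres.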