ECP-VeloC/VELOC

SLURM restart-in-place script double counts down node

CamStan opened this issue · 1 comment

While testing veloc_srun on SLURM, I did back-to-back runs after a node had already been taken down. The second run counted that same downed node twice in down_nodes.

Unfortunately, I don't have the output from this test, as it was run on a different machine.
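
To illustrate the kind of double counting described above, here is a minimal sketch in Python. It is not the actual veloc_srun logic: the function name `merge_down_nodes` and the node name `n3` are hypothetical, and it assumes down_nodes can be treated as a list of hostnames. It only shows that merging through a set keeps a node that was already down from being recorded a second time on the next run.

```python
def merge_down_nodes(previous, newly_detected):
    """Combine two lists of down nodes, listing each hostname once.

    Hypothetical helper for illustration only; order of first
    appearance is preserved.
    """
    seen = set()
    merged = []
    for node in list(previous) + list(newly_detected):
        if node not in seen:
            seen.add(node)
            merged.append(node)
    return merged


# Hypothetical back-to-back runs: node "n3" went down before the first run
# and is detected as down again by the second run.
down_nodes = merge_down_nodes([], ["n3"])          # state after the first run
down_nodes = merge_down_nodes(down_nodes, ["n3"])  # second run sees n3 again
print(down_nodes, len(down_nodes))
```

A plain append of the second run's result would leave two entries for the same node, which matches the symptom reported here.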

This issue stayed inactive for a long time. Please reopen if still relevant.