Exporter dies when Slurm accounting not enabled
Hey,
We're using this to monitor a small Slurm cluster, and it's very useful, thanks! However, we've hit an issue after recently upgrading to 0.17.
In ParseAllocatedGPUs(), sacct is executed to get some data. We don't use Slurm accounting, so the subprocess exits with code 1 to show failure. Execute() receives the non-zero code and considers this fatal, killing the entire exporter.
I'm happy to attempt a fix myself, but do you have any suggestions for a good logic flow in this case?
Perhaps something like an optional argument to Execute() that designates "allowable" exit codes; meaning blank data is returned and execution continues.
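A minimal sketch of the "allowable exit codes" idea, for discussion only. The helper name executeAllowing and its signature are hypothetical; this is not the exporter's actual Execute(), whose signature isn't shown in this thread. It assumes the command is run with os/exec and that an allowed exit code should yield empty data and no error.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// executeAllowing runs a command and returns its stdout.
// Hypothetical helper: exit codes listed in allowed are treated as
// "no data available", so the caller gets nil output and a nil error
// instead of a failure it would consider fatal.
func executeAllowing(allowed []int, name string, args ...string) ([]byte, error) {
	out, err := exec.Command(name, args...).Output()
	if err == nil {
		return out, nil
	}

	// A non-zero exit status is reported as *exec.ExitError.
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		code := exitErr.ExitCode()
		for _, a := range allowed {
			if code == a {
				return nil, nil // tolerated failure: blank data, keep going
			}
		}
	}

	// Anything else (unexpected exit code, binary not found) is still an error.
	return nil, fmt.Errorf("running %s: %w", name, err)
}
```

With something like this, ParseAllocatedGPUs() could call executeAllowing([]int{1}, "sacct", ...) and treat nil output as zero allocated GPUs. One caveat: sacct may also exit with code 1 for reasons other than accounting being disabled, so those failures would be silently hidden as well.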
I do not know how, since I am not a Go programmer, but what I would suggest is adding a command-line flag --disable-accounting to the project that, when passed, disables all submodules that depend on calls to sacct. You can find them by running git grep sacct.
In my PR #43 I have added uniform error reporting for failing commands which you may find useful.
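For what it's worth, here is a rough sketch of how a --disable-accounting flag could be wired up with Go's standard flag package. The function names and the collector names ("nodes", "queue", "gpus") are placeholders for illustration, not the exporter's real structure. The only thing taken from this thread is that the GPU parsing depends on sacct.

```go
package main

import "flag"

// parseFlags reads the hypothetical --disable-accounting flag from args.
// Go's flag package accepts both -disable-accounting and --disable-accounting.
func parseFlags(args []string) (disableAccounting bool, err error) {
	fs := flag.NewFlagSet("exporter", flag.ContinueOnError)
	fs.BoolVar(&disableAccounting, "disable-accounting", false,
		"skip collectors that depend on sacct")
	err = fs.Parse(args)
	return disableAccounting, err
}

// enabledCollectors returns the collector names to register.
// Names are placeholders; "gpus" stands in for anything that calls sacct.
func enabledCollectors(disableAccounting bool) []string {
	collectors := []string{"nodes", "queue"}
	if !disableAccounting {
		collectors = append(collectors, "gpus")
	}
	return collectors
}
```

In main(), this would be called as parseFlags(os.Args[1:]), and only the collectors returned by enabledCollectors() would be registered, so sacct is never invoked on clusters without accounting.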
Great news, thanks. Apologies for the wait; I'd intended to test it earlier. I have just installed 0.19 and it works fine.