vpenso/prometheus-slurm-exporter

export queue info from squeue

Closed this issue · 3 comments

jlec commented

Hi

Would it be possible to attach queue info to the jobs? It would be nice to plot the job state graph filtered by queue.

Best
Justin

mtds commented

It is theoretically possible but slightly tricky: we would have to filter the partition names every time (any admin can decide to partition the cluster in multiple 'slices') before running the 'squeue' commands.

This would bloat the code considerably. What we can do instead is add a command line option to filter by partition name. In this case the exporter will extract only the data related to a certain partition, including its jobs.

In this case you'll have to run multiple exporters (in theory one for every partition) and just change the listening port with the command line flag.
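To illustrate the per-partition filtering discussed above, here is a minimal Python sketch, not the exporter's actual code. It assumes the job list was produced by something like `squeue -h -o "%P|%T"` (partition and extended job state, without a header). The function name `count_job_states` and the sample data are hypothetical. It also assumes that a pending job submitted to several partitions may report them as a comma-separated list, and counts that job once under each listed partition.

```python
from collections import Counter


def count_job_states(squeue_output, partition=None):
    """Count jobs per (partition, state) from 'PARTITION|STATE' lines.

    If `partition` is given, only jobs in that partition are counted,
    which mimics running one exporter per partition.
    """
    counts = Counter()
    for line in squeue_output.splitlines():
        line = line.strip()
        if not line or "|" not in line:
            continue  # skip blank or malformed lines
        partitions, state = line.split("|", 1)
        # Assumption: multiple partitions appear as "a,b" for one job.
        for name in partitions.split(","):
            if partition is None or name == partition:
                counts[(name, state.lower())] += 1
    return counts


# Hypothetical sample output, for illustration only.
sample = """batch|RUNNING
batch|PENDING
debug|RUNNING
batch|RUNNING
"""

batch_only = count_job_states(sample, partition="batch")
all_partitions = count_job_states(sample)
```

Calling the function without `partition` yields counts for every partition in one pass, which is the alternative to running one exporter per partition.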

I can try to add this feature in the coming weeks but I cannot say for sure when I'll be able to deliver it.

Regards,
Matteo

Hi,

I have a closely related if not identical question. I want to essentially create a table in the Grafana dashboard that has all of the squeue information. We only have a single partition if that's an issue. Is this possible with the current code base, or would it require a significant update?

Thank you,
Collin

mtds commented

If any of you are still interested in this functionality, you should check our latest commits.

Now this exporter is able to provide the following additional info:

  • Running/suspended jobs per partition, broken down by Slurm account and user.
  • Total/allocated/idle CPUs per partition, plus used CPUs per user ID.
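As a rough illustration of how per-partition job counts could be exposed so that a dashboard can filter by partition, the sketch below renders counts in the Prometheus text exposition format with `partition` and `state` labels. The metric name `slurm_partition_jobs`, the label names, and the sample counts are all hypothetical and are not taken from the exporter.

```python
def escape_label(value):
    """Escape a label value for the Prometheus text format."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def render_partition_jobs(counts, metric="slurm_partition_jobs"):
    """Render {(partition, state): count} as Prometheus text-format lines."""
    lines = [f"# TYPE {metric} gauge"]
    for (partition, state), n in sorted(counts.items()):
        labels = (
            f'partition="{escape_label(partition)}",'
            f'state="{escape_label(state)}"'
        )
        lines.append(f"{metric}{{{labels}}} {n}")
    return "\n".join(lines) + "\n"


# Hypothetical counts, for illustration only.
example_counts = {("batch", "running"): 2, ("debug", "pending"): 1}
exposition = render_partition_jobs(example_counts)
```

With labels like these, a query can select a single partition by matching on the `partition` label instead of requiring a separate exporter per partition.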