Solution to expose HTCondor metrics through node_exporter
ATENTION !!! In this case specific case, HTCondor-ce runs inside of HTCondor-CE Container.
If your HTCondor resides on the host machine, you need to modify condor_textfile_collector.sh and remove "docker exec".
Run installation script
wget -O - https://raw.githubusercontent.com/ejr004/condor_textfile_collector/master/node_exporter_install.sh | bash
Add this line to your crontab
crontab -e
*/5 * * * * /scripts/condor_node_exporter.sh <your-container-ce>
Firewall rules (just in case)
firewall-cmd --zone=public --add-port=9090/tcp --permanent
Now you are good to go !
Metrics:
# HELP condor_queue_held Metric read from /var/lib/node_exporter/textfile_collector/condor.queue.prom
# TYPE condor_queue_held counter
condor_queue_held{ce="ce01"} 0
# HELP condor_queue_held_alice Metric read from /var/lib/node_exporter/textfile_collector/condor.queue_alice.prom
# TYPE condor_queue_held_alice untyped
condor_queue_held_alice{ce="ce01"} 0
# HELP condor_queue_held_lhcb Metric read from /var/lib/node_exporter/textfile_collector/condor.queue_lhcb.prom
# TYPE condor_queue_held_lhcb untyped
condor_queue_held_lhcb{ce="ce01"} 0
# HELP condor_queue_idle Metric read from /var/lib/node_exporter/textfile_collector/condor.queue.prom
# TYPE condor_queue_idle counter
condor_queue_idle{ce="ce01"} 1
# HELP condor_queue_idle_alice Metric read from /var/lib/node_exporter/textfile_collector/condor.queue_alice.prom
# TYPE condor_queue_idle_alice untyped
condor_queue_idle_alice{ce="ce01"} 0
# HELP condor_queue_idle_lhcb Metric read from /var/lib/node_exporter/textfile_collector/condor.queue_lhcb.prom
# TYPE condor_queue_idle_lhcb untyped
condor_queue_idle_lhcb{ce="ce01"} 1
# HELP condor_queue_removed Metric read from /var/lib/node_exporter/textfile_collector/condor.queue.prom
# TYPE condor_queue_removed counter
condor_queue_removed{ce="ce01"} 0
# HELP condor_queue_removed_alice Metric read from /var/lib/node_exporter/textfile_collector/condor.queue_alice.prom
# TYPE condor_queue_removed_alice untyped
condor_queue_removed_alice{ce="ce01"} 0
# HELP condor_queue_removed_lhcb Metric read from /var/lib/node_exporter/textfile_collector/condor.queue_lhcb.prom
# TYPE condor_queue_removed_lhcb untyped
condor_queue_removed_lhcb{ce="ce01"} 0
# HELP condor_queue_running Metric read from /var/lib/node_exporter/textfile_collector/condor.queue.prom
# TYPE condor_queue_running counter
condor_queue_running{ce="ce01"} 782
# HELP condor_queue_running_alice Metric read from /var/lib/node_exporter/textfile_collector/condor.queue_alice.prom
# TYPE condor_queue_running_alice untyped
condor_queue_running_alice{ce="ce01"} 0
# HELP condor_queue_running_lhcb Metric read from /var/lib/node_exporter/textfile_collector/condor.queue_lhcb.prom
# TYPE condor_queue_running_lhcb untyped
condor_queue_running_lhcb{ce="ce01"} 782
# HELP condor_queue_suspended Metric read from /var/lib/node_exporter/textfile_collector/condor.queue.prom
# TYPE condor_queue_suspended counter
condor_queue_suspended{ce="ce01"} 0
# HELP condor_queue_suspended_alice Metric read from /var/lib/node_exporter/textfile_collector/condor.queue_alice.prom
# TYPE condor_queue_suspended_alice untyped
condor_queue_suspended_alice{ce="ce01"} 0
# HELP condor_queue_suspended_lhcb Metric read from /var/lib/node_exporter/textfile_collector/condor.queue_lhcb.prom
# TYPE condor_queue_suspended_lhcb untyped
condor_queue_suspended_lhcb{ce="ce01"} 0
# HELP condor_queue_suspended_dune Metric read from /var/lib/node_exporter/textfile_collector/condor.queue_dune.prom
# TYPE condor_queue_suspended_dune untyped
condor_queue_suspended_dune{ce="ce01"} 0
# HELP condor_status_claimed Metric read from /var/lib/node_exporter/textfile_collector/condor.status.prom
# TYPE condor_status_claimed counter
condor_status_claimed{ce="ce01"} 778
# HELP condor_status_machines Metric read from /var/lib/node_exporter/textfile_collector/condor.status.prom
# TYPE condor_status_machines counter
condor_status_machines{ce="ce01"} 1040
# HELP condor_status_unclaimed Metric read from /var/lib/node_exporter/textfile_collector/condor.status.prom
# TYPE condor_status_unclaimed counter
condor_status_unclaimed{ce="ce01"} 262
# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
You can wombo combo with grafana and add Condor general view Dashboard to your grafana dashboard.
Installs node exporter v0.18.1 and setup textfile collector directory on host machine.
Generate condor-ce metrics for node exporter file collector
Condor Grafana Dashboard