Create a counter/gauge metric that exposes scrape errors
marcelcorso opened this issue · 4 comments
Right now it's not easy to monitor whether clickhouse_exporter can reach its target.
For example, Prometheus' jmx_exporter solves that problem by
exposing a gauge:
# HELP jmx_scrape_error Non-zero if this scrape failed.
# TYPE jmx_scrape_error gauge
jmx_scrape_error 0.0
that goes to 1 when there is a problem scraping its target.
It would be great to have something like this for clickhouse_exporter.
Will look at this
It's already exposed as exporter_scrape_failures_total.
In Grafana you can use sum(exporter_scrape_failures_total) OR vector(0)
PS: if you are using Grafana, you can also look at https://grafana.com/dashboards/882
We are using the Grafana dashboard 👍. But because we have many ClickHouse instances, it's getting a bit busy. But that's another problem :-)
I saw exporter_scrape_failures_total,
but I'm not sure how to alert on it.
Alerting on rate(exporter_scrape_failures_total[1m]) > 0
sounds weird.
Looks legit. If you have further questions feel free to reopen an issue.
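For anyone landing here later, the expression discussed above could be wrapped in a Prometheus alerting rule roughly like the sketch below. Only the metric name exporter_scrape_failures_total comes from this thread. The group name, alert name, 5m window, `for` duration, severity label, and annotation text are all assumptions to adapt. The sketch uses increase() over a longer window than 1m because rate() and increase() need at least two samples inside the range, which a 1m window may not contain if Prometheus scrapes the exporter every 30s or 60s.

```yaml
# Hypothetical rule file, e.g. clickhouse_exporter.rules.yml (name is an assumption).
groups:
  - name: clickhouse_exporter            # assumed group name
    rules:
      - alert: ClickHouseExporterScrapeFailing   # assumed alert name
        # Fires when the failure counter grew during the last 5 minutes.
        # The window is an assumption; keep it at several scrape intervals.
        expr: increase(exporter_scrape_failures_total[5m]) > 0
        for: 5m                          # assumed; avoids alerting on a single blip
        labels:
          severity: warning              # assumed label
        annotations:
          summary: "clickhouse_exporter failed to scrape its target ({{ $labels.instance }})"
```

Because the counter only increases on failure, `> 0` on its increase is effectively the same signal as jmx_scrape_error going to 1, just measured over a window instead of per scrape.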