Explanation of metrics
Love the exporter, but is there somewhere with a good description of what each of the metrics tracks?
There is no official documentation from the Elastic team (as far as I know), but the most important metrics are these:
# HELP filebeat_libbeat_output_events libbeat.output.events
# TYPE filebeat_libbeat_output_events untyped
filebeat_libbeat_output_events{type="acked"} 0
filebeat_libbeat_output_events{type="active"} 0
filebeat_libbeat_output_events{type="batches"} 0
filebeat_libbeat_output_events{type="dropped"} 0
filebeat_libbeat_output_events{type="duplicates"} 0
filebeat_libbeat_output_events{type="failed"} 0
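For anyone who wants to poke at these numbers outside Prometheus, here is a minimal Python sketch that groups samples like the ones above by their type label. The function name parse_samples and the regular expression are illustrative assumptions, not part of the exporter, and the pattern only covers the simple single-label lines shown here.

```python
import re

# Matches lines such as: filebeat_libbeat_output_events{type="acked"} 0
# Assumption: exactly one type="..." label per sample, as in the output above.
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)\{type="(?P<type>[^"]*)"\}\s+(?P<value>\S+)$'
)


def parse_samples(text):
    """Return {metric_name: {type_label: value}} for the lines that match."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # ignore anything this simple pattern does not cover
        result.setdefault(match["name"], {})[match["type"]] = float(match["value"])
    return result


EXAMPLE = '''\
# HELP filebeat_libbeat_output_events libbeat.output.events
# TYPE filebeat_libbeat_output_events untyped
filebeat_libbeat_output_events{type="acked"} 0
filebeat_libbeat_output_events{type="failed"} 0
'''

if __name__ == "__main__":
    print(parse_samples(EXAMPLE))
```

Grouping by the type label makes it easier to compare, say, acked against failed for the same metric family while working out what each one counts.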
If you feel like digging in and figuring everything out, a PR to update the metrics' help messages is welcome.
@shivas can you please advise on the difference between filebeat_libbeat_output_events and filebeat_libbeat_pipeline_events?
Also, which metric would indicate connection issues to logstash/elasticsearch?
Thanks!
Also there's filebeat_filebeat_events, to add some complication :)
+1 here. I'm trying to understand how many messages we are collecting and sending, but I'm not sure which way is up.
+1, agreed. It would be great to have some documentation, something like this: https://github.com/ClusterLabs/ha_cluster_exporter/blob/master/doc/metrics.md
This can be good for inspiration.
+1. A bit of semantics can be read out of the official Kibana Filebeat monitoring built-in, but that is just a screenshot with very limited explanatory potential.
+1
+1
This metric appears to be related to the events waiting to be sent: filebeat_libbeat_pipeline_events{type="active"}. I'm going to use this as an initial effort to monitor the Filebeat queue (formatted for Helm):
---
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    {{- include "alert-rules.labels" . | nindent 4 }}
    {{- include "common-library.labels" . | nindent 4 }}
  name: alert-rules-filebeat
spec:
  groups:
    - name: alert-rules-filebeat
      rules:
        - alert: FileBeatQueueEmpty
          expr: |
            filebeat_libbeat_pipeline_events{type="active"} == 0
          for: 30m
          labels:
            severity: warning
          annotations:
            description: Filebeat queue is empty
        - alert: FileBeatQueueGrowing
          expr: |
            filebeat_libbeat_pipeline_events{type="active"} > 500 and
            delta(filebeat_libbeat_pipeline_events{type="active"}[15m]) > 0
          for: 15m
          labels:
            severity: warning
          annotations:
            description: |
              {{ `Filebeat queue is {{printf "%.0f" $value}} and growing` }}
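
To spell out what the second rule is meant to catch, here is a rough Python sketch of its condition: the latest value is above 500 and the value rose over the window. It is a simplification under stated assumptions: queue_growing and its list of samples are hypothetical, Prometheus's delta() also extrapolates to the window boundaries (which this does not), and the rule's "for: 15m" hold time is not modelled.

```python
def queue_growing(samples, threshold=500):
    """Approximate the FileBeatQueueGrowing expression for one series.

    `samples` is a list of (timestamp_seconds, value) pairs covering the
    window, oldest first. Hypothetical helper, not Prometheus code.
    """
    if len(samples) < 2:
        return False  # a delta needs at least two points
    first_value = samples[0][1]
    last_value = samples[-1][1]
    # Simplified delta: last minus first, without Prometheus's extrapolation.
    simple_delta = last_value - first_value
    return last_value > threshold and simple_delta > 0


if __name__ == "__main__":
    # Queue went from 400 to 650 over 15 minutes (900 seconds).
    print(queue_growing([(0, 400), (900, 650)]))
```

A queue that is large but shrinking (for example 700 down to 650) does not match, which is the point of combining the threshold with the delta.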