linux-system-roles/metrics

Missing installation of BCC agent when "metrics_graph_service: yes" is set

kurik opened this issue · 4 comments

kurik commented

When metrics_graph_service: yes is set in a playbook, the role installs the grafana-pcp package. On the latest releases of RHEL and Fedora, the grafana-pcp package (version 3.x.y) delivers a Grafana dashboard called PCP Vector: eBPF/BCC Overview. However, this dashboard requires the pcp-pmda-bcc package to be installed and configured.

The metrics role currently does not install the BCC PMDA, so all the charts on that dashboard are in an error state because they cannot get metrics from the BCC agent.

natoscott commented

Hmm, I don't think this is fixable in practice within the metrics role. It's a generic kind of problem, not specific to BCC: there can be grafana-pcp dashboards for any metrics (e.g. there is an mssql dashboard in newer versions; should we install the mssql PMDA everywhere? Most people don't run SQL Server, though). Many people may want a 'lighter' PCP install without the BCC PMDA and its dependencies (it requires llvm, kernel headers, extra Python modules, another agent running, etc.).

The best way to tackle this (if possible, @andreasgerstmayr?) may be to make the Grafana dashboards' diagnostics really clear about how to enable any optional sources of metrics they need.

kurik commented

The idea behind this was to have aligned the default set of metrics provided by PCP with the default set of charts delivered in Grafana.

Otherwise, I do understand your comment and I agree we cannot cover all the customer use cases.

natoscott commented

> The idea behind this was to have aligned the default set of metrics provided by PCP with the default set of charts delivered in Grafana.

+1 ... the BCC metrics are not enabled by default in PCP, though. Let's see what Andreas reckons here, but I think bcc and mssql metrics are in exactly the same category now (non-default PMDAs, both with dashboards available in grafana-pcp).

andreasgerstmayr commented

With grafana-pcp v3, (almost) all dashboards are optional, i.e. not installed by default. The eBPF/BCC Overview dashboard has text with installation instructions at the top: "This dashboards requires the bcc PMDA to be installed and configured with the following modules: runqlat, biolatency, tcptop, tcplife."

So I think it's fine to not install the BCC PMDA by default - same as with the MSSQL dashboard/server, as @natoscott mentioned.
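
For anyone who does want the eBPF/BCC Overview dashboard populated, the extra steps can be added next to the role in the same playbook. The following is only a rough sketch, not something the metrics role does itself. The role name `linux-system-roles.metrics`, the PMDA directory `/var/lib/pcp/pmdas/bcc` and the task layout are assumptions and may differ on your distribution.

```yaml
# Sketch only: opt-in BCC PMDA setup alongside the metrics role.
- name: Metrics with Grafana plus the optional BCC PMDA
  hosts: all
  become: true
  roles:
    # Assumed role name; adjust to however the role is installed locally.
    - role: linux-system-roles.metrics
      vars:
        metrics_graph_service: yes
  tasks:
    - name: Install the BCC PMDA package
      ansible.builtin.package:
        name: pcp-pmda-bcc
        state: present

    # Assumption: default PMDA location on RHEL/Fedora. The module list in
    # bcc.conf in this directory needs to include runqlat, biolatency,
    # tcptop and tcplife before Install is run.
    - name: Register the BCC PMDA with pmcd
      ansible.builtin.command:
        cmd: ./Install
        chdir: /var/lib/pcp/pmdas/bcc
```

Keeping this outside the role preserves the 'lighter' default install discussed above, while hosts that need the BCC dashboards can opt in explicitly.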