Latency metrics are missing in Cassandra 4.1
c3-clement opened this issue · 6 comments
Hello,
Our latency Grafana dashboards are not showing any data with management api 4.1.5-v0.1.79, while they are working fine on 4.0.
The following metrics are missing from Prometheus:
mcac_client_request_latency_bucket
mcac_table_range_latency_bucket
mcac_table_read_latency_bucket
mcac_table_write_latency_bucket
mcac_table_coordinator_read_latency_bucket
mcac_table_coordinator_scan_latency_bucket
In the system logs I'm seeing this error message that could be related:
INFO [insights-8-1] 2024-06-12 14:24:03,177 NoSpamLogger.java:105 - Not able to get buckets for org.apache.cassandra.metrics.dropped_message.internal_dropped_latency.finalize_propose_msg 128 type org.apache.cassandra.metrics.DecayingEstimatedHistogramReservoir$EstimatedHistogramReservoirSnapshot
I have tried querying the MCAC metrics endpoint on port 9103. In 4.1.5 there is not a single entry starting with collectd_mcac_micros_bucket, while I'm seeing them in 4.0.X.
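The comparison described above can be made repeatable with a small script. The following is a minimal sketch, not part of the original report: it lists the metric names found in Prometheus text-format output that start with a given prefix. The function names, the sample text and its label values are illustrative assumptions.

```python
import urllib.request


def metric_names(exposition_text, prefix=""):
    """Return the sorted, de-duplicated metric names in Prometheus
    text-format output that start with `prefix`."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample line is `name{labels} value` or `name value`;
        # the name ends at the first "{" or at the first whitespace.
        head = line.split("{", 1)[0].split()
        if head and head[0].startswith(prefix):
            names.add(head[0])
    return sorted(names)


def fetch_metrics(url, timeout=10):
    """Download the raw text served by a metrics endpoint."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


# Hypothetical sample; the label names and values are made up.
sample = """\
# TYPE collectd_mcac_micros_bucket histogram
collectd_mcac_micros_bucket{mcac="x",le="10"} 3
collectd_mcac_micros_bucket{mcac="x",le="+Inf"} 5
some_other_metric 1
"""
print(metric_names(sample, prefix="collectd_mcac_"))
```

Against a live node, the text would come from something like fetch_metrics("http://<pod-ip>:9103/metrics"); the /metrics path on port 9103 is an assumption here, so adjust it to whatever URL was actually queried.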
I'm using this telemetry configuration on the k8ssandracluster:
telemetry:
  mcac:
    enabled: true
    metricFilters:
      - allow:org.apache.cassandra.metrics.Table
      - allow:org.apache.cassandra.metrics.table
      - allow:org.apache.cassandra.metrics.client_request
  prometheus:
    enabled: true
@adejanovski @burmanm I've seen the closed issue #444. However, it seems that the issue is still happening.
#444 should have fixed the missing metrics, and in our testing it did, assuming you use the newer metrics endpoints. The names of the metrics are a bit different, to align with the naming inside Cassandra. Only the older endpoint returns mcac* metrics, and that endpoint is deprecated; no further changes will be made to it.
Thanks for the feedback @burmanm .
assuming you use the newer metrics endpoints. The names of the metrics are a bit different
Is there any documentation about those new metrics endpoints and those new metrics names?
We are using k8ssandra-operator and it's creating a Prometheus ServiceMonitor to scrape Cassandra metrics, so I assume it should hit the correct endpoint automatically when Cassandra 4.1 is deployed.
However, if the metric names have changed, we will probably have to update our Grafana dashboards.
Port 9103 is the old "MCAC" port. The new /metrics endpoint listens on port 9000. The k8ssandra-operator will create ServiceMonitors for the new endpoints if MCAC is no longer enabled:
telemetry:
  mcac:
    enabled: false
But yes, you would need new dashboards to support the new naming. Our example dashboards, along with installation instructions, are here: https://docs.k8ssandra.io/tasks/monitor/prometheus-grafana/#install-the-grafana-dashboards
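Before switching dashboards, it may help to know which existing panels still query the old names. The following is a rough sketch, assuming the usual Grafana dashboard JSON layout (a top-level "panels" list, row panels that nest their own "panels", and Prometheus queries stored under "targets" as "expr"). The sample dashboard, its panel titles and the helper names are hypothetical.

```python
import json
import re

# Matches legacy MCAC metric names such as mcac_table_read_latency_bucket
# or collectd_mcac_micros_bucket.
MCAC_NAME = re.compile(r"\b(?:collectd_)?mcac_[A-Za-z0-9_]+")


def iter_panels(panels):
    """Yield every panel, descending into row panels that nest their own."""
    for panel in panels:
        yield panel
        yield from iter_panels(panel.get("panels", []))


def legacy_metric_usage(dashboard):
    """Map panel title -> sorted legacy metric names found in its queries."""
    usage = {}
    for panel in iter_panels(dashboard.get("panels", [])):
        found = set()
        for target in panel.get("targets", []):
            found.update(MCAC_NAME.findall(target.get("expr") or ""))
        if found:
            usage[panel.get("title", "<untitled>")] = sorted(found)
    return usage


# Hypothetical dashboard export, trimmed to the fields the sketch reads.
dashboard = json.loads("""
{
  "panels": [
    {"title": "Read latency",
     "targets": [{"expr": "histogram_quantile(0.99, rate(mcac_table_read_latency_bucket[5m]))"}]},
    {"title": "Requests", "type": "row",
     "panels": [
       {"title": "Client requests",
        "targets": [{"expr": "rate(mcac_client_request_latency_bucket[5m])"}]}
     ]},
    {"title": "CPU", "targets": [{"expr": "node_cpu_seconds_total"}]}
  ]
}
""")
for title, names in legacy_metric_usage(dashboard).items():
    print(title, names)
```

The sketch only reports where the old names appear; it does not attempt to translate them, since the thread does not list the new names.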
If you don't wish to disable MCAC yet, you can also simply create a new ServiceMonitor for the new endpoint. The endpoints would look like this in the ServiceMonitor spec:
spec:
  endpoints:
    - port: metrics
      interval: 15s
      path: /metrics
      scheme: http
      scrapeTimeout: 15s
The rest can be copied from the old one.
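The copy-and-adjust step can be sketched in code as well. This is an illustration only, assuming the old ServiceMonitor has already been loaded into a Python dict (for example from kubectl output in JSON form). The function name, the sample manifest, its labels and the old port name are hypothetical; only the endpoint values come from the reply above, and the result has not been tested against a real cluster.

```python
import copy

# Endpoint settings quoted in the reply above.
NEW_ENDPOINT = {
    "port": "metrics",
    "interval": "15s",
    "path": "/metrics",
    "scheme": "http",
    "scrapeTimeout": "15s",
}


def derive_service_monitor(old, new_name):
    """Copy an existing ServiceMonitor manifest and point it at the new endpoint."""
    new = copy.deepcopy(old)
    meta = new.setdefault("metadata", {})
    meta["name"] = new_name
    # Drop server-populated fields so the copy can be created as a new object.
    for field in ("resourceVersion", "uid", "creationTimestamp", "generation", "managedFields"):
        meta.pop(field, None)
    new.pop("status", None)
    # Replace the endpoints; everything else in spec is kept as it was.
    new.setdefault("spec", {})["endpoints"] = [copy.deepcopy(NEW_ENDPOINT)]
    return new


# Hypothetical existing manifest; names, labels and the old port are made up.
old = {
    "apiVersion": "monitoring.coreos.com/v1",
    "kind": "ServiceMonitor",
    "metadata": {"name": "demo-old-monitor", "namespace": "demo", "resourceVersion": "12345"},
    "spec": {
        "selector": {"matchLabels": {"app": "demo"}},
        "endpoints": [{"port": "prometheus", "path": "/metrics"}],
    },
}
new = derive_service_monitor(old, "demo-new-metrics-monitor")
print(new["spec"]["endpoints"])
```

Deep-copying keeps the original manifest untouched, so both ServiceMonitors can exist side by side while MCAC is still enabled.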
Thanks a lot @burmanm ! We will try this shortly