Distribution-to-Histogram conversion yields invalid data

Question

cboggs opened this issue 7 years ago · 2 comments

Setup

Run stackdriver_exporter with --monitoring.metrics-type-prefixes=bigtable.googleapis.com.

Expected

Metrics like stackdriver_bigtable_table_bigtable_googleapis_com_server_latencies_count should present as monotonically increasing counters, as would normal Prometheus histogram counters.

Actual

Metrics like stackdriver_bigtable_table_bigtable_googleapis_com_server_latencies_count frequently decrease and oscillate, which a monotonic counter should never do.

Screenshot:

This throws off quantile calculations on the respective histogram buckets, because the individual bucket counts oscillate in the same manner. The resulting per-bucket rates produce erroneous quantiles (e.g. 47-day-long latency measurements on p50).
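
To illustrate why an oscillating series is so damaging: Prometheus's rate() and increase() treat any decrease in a counter as a counter reset. Below is a minimal Python sketch of that reset handling. It is not Prometheus's actual implementation (it ignores extrapolation and timestamps), and the function name and sample values are hypothetical.

```python
def naive_increase(samples):
    """Approximate the increase of a counter over a window of samples.

    Any drop between consecutive samples is treated as a counter reset,
    so the new value is counted as if the counter restarted from zero.
    """
    total = 0.0
    prev = samples[0]
    for cur in samples[1:]:
        if cur < prev:
            # Apparent reset: everything in `cur` is counted as new.
            total += cur
        else:
            total += cur - prev
        prev = cur
    return total


# A well-behaved cumulative counter versus a series that oscillates.
monotonic = [5, 8, 12, 18]
oscillating = [10, 4, 7, 3]

monotonic_increase = naive_increase(monotonic)
oscillating_increase = naive_increase(oscillating)
```

With an oscillating series, every downward step is counted as a reset, so the computed increase no longer reflects the real number of observations, and histogram_quantile() inherits that error through the per-bucket rates.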

Answer 1 · 2021-07-29T06:29:03.000Z

It's not just _count, but bucket values as well.
The problem is that DISTRIBUTION + DELTA metrics in Cloud Monitoring (Stackdriver) return the delta for a given interval, whereas a Prometheus counter should present the total accumulated count.
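
One way to bridge the two models is to accumulate the deltas before exposing them. The sketch below is only an illustration in Python, not necessarily how the exporter handles it: the class name, method names, and sample bucket bounds are all hypothetical. It assumes each delta arrives as per-bucket counts keyed by upper bound, sums them over time, and then makes the buckets cumulative across bounds, as Prometheus `le` buckets are.

```python
import math
from collections import defaultdict


class DeltaHistogramAccumulator:
    """Hypothetical helper: turns per-interval bucket deltas into running totals."""

    def __init__(self):
        # Running per-bucket totals, keyed by the bucket's upper bound.
        self._per_bucket = defaultdict(int)

    def observe_delta(self, bucket_counts):
        """Add one interval's per-bucket counts to the running totals."""
        for upper_bound, n in bucket_counts.items():
            self._per_bucket[upper_bound] += n

    def cumulative_buckets(self):
        """Return counts cumulative across bounds (Prometheus `le` style)."""
        running = 0
        out = {}
        for upper_bound in sorted(self._per_bucket):
            running += self._per_bucket[upper_bound]
            out[upper_bound] = running
        return out

    def count(self):
        """Total number of observations seen so far (the `_count` series)."""
        return sum(self._per_bucket.values())


acc = DeltaHistogramAccumulator()
acc.observe_delta({0.1: 5, 0.5: 2, math.inf: 1})  # first interval
acc.observe_delta({0.1: 3, 0.5: 0, math.inf: 0})  # second, smaller interval
```

Even though the second interval reports fewer observations than the first, the accumulated totals never decrease. Note that state kept in memory like this is lost when the process restarts, which Prometheus would see as an ordinary counter reset.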

Related: #116 (comment)

Answer 2 · 2023-05-26T19:08:26.000Z

@SuperQ I think this one can be closed. I think I got the syntax wrong on #168, so it didn't auto-close this issue when it merged.