prometheus-community/stackdriver_exporter

Distribution-to-Histogram conversion yields invalid data

cboggs opened this issue · 2 comments

Setup

Run stackdriver_exporter with --monitoring.metrics-type-prefixes=bigtable.googleapis.com.

Expected

Metrics like stackdriver_bigtable_table_bigtable_googleapis_com_server_latencies_count should behave as monotonically increasing counters, as the counters of a normal Prometheus histogram do.

Actual

The stackdriver_bigtable_table_bigtable_googleapis_com_server_latencies_count metric frequently decreases and oscillates, which a counter should never do.

Screenshot: [image: screen shot 2018-04-12 at 9 10 57 am]

This throws off quantile calculations on the corresponding histogram buckets, because the individual bucket counts oscillate in the same manner. The resulting per-bucket rates produce erroneous quantiles (e.g. a p50 latency of 47 days).
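To illustrate why oscillating values distort rates: Prometheus's rate() and increase() treat any decrease in a counter as a reset to zero and count the post-drop value in full. The Python sketch below is a simplified model of that reset handling (it ignores extrapolation and staleness, and the sample values are made up for illustration); the function name is hypothetical and is not part of Prometheus or the exporter.

```python
def reset_aware_increase(samples):
    """Simplified model of a counter increase with reset detection.

    A drop between consecutive samples is treated as a counter reset,
    so the new value is added in full. Extrapolation is ignored.
    """
    total = 0
    for prev, cur in zip(samples, samples[1:]):
        if cur >= prev:
            total += cur - prev   # normal counter growth
        else:
            total += cur          # assumed reset to zero, then growth to cur
    return total


# Hypothetical per-interval request counts (what a DELTA metric reports).
deltas = [120, 80, 130, 90, 110]

# What a real counter would have exposed: the running total of the deltas.
cumulative = []
running = 0
for d in deltas:
    running += d
    cumulative.append(running)

print(reset_aware_increase(deltas))      # deltas misread as a counter: 240
print(reset_aware_increase(cumulative))  # true cumulative counter: 410
```

With these made-up numbers, the misread series reports an increase of 240 where the true increase is 410. The same kind of distortion applies independently to each bucket, which is why the derived quantiles become meaningless.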

It's not just _count; the bucket values are affected as well.
The problem is that DISTRIBUTION + DELTA metrics in Cloud Monitoring (Stackdriver) return the delta for a given interval, whereas a Prometheus counter should present the total accumulated count.
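One way to bridge the two models is to keep a running total per time series and add each interval's delta to it. The following is a minimal Python sketch of that idea under stated assumptions, not the exporter's actual implementation (which is written in Go): the class name, the series key, and the use of the interval end time to skip already-seen points are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DeltaAccumulator:
    """Accumulates per-interval bucket deltas into running totals."""

    totals: dict = field(default_factory=dict)    # series key -> list of running counts
    last_end: dict = field(default_factory=dict)  # series key -> last interval end seen

    def observe(self, series_key, interval_end, bucket_deltas):
        """Add one DELTA point and return the accumulated bucket counts.

        A point whose interval_end is not newer than the last one seen
        for this series is ignored, so re-reading the same interval
        does not double count.
        """
        seen = self.last_end.get(series_key)
        if seen is not None and interval_end <= seen:
            return list(self.totals[series_key])

        running = self.totals.setdefault(series_key, [0] * len(bucket_deltas))
        for i, delta in enumerate(bucket_deltas):
            running[i] += delta
        self.last_end[series_key] = interval_end
        return list(running)


acc = DeltaAccumulator()
key = "server_latencies{table=example}"  # hypothetical series identity
print(acc.observe(key, 60, [1, 2]))      # first interval
print(acc.observe(key, 120, [3, 0]))     # second interval is added on top
print(acc.observe(key, 120, [3, 0]))     # same interval again: unchanged
```

The totals in this sketch live only in memory, so they restart from zero when the process restarts; Prometheus would see that as an ordinary counter reset.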

Related: #116 (comment)

@SuperQ I think this one can be closed. I think I got the syntax wrong in #168, so this issue did not auto-close when it merged.