Distribution-to-Histogram conversion yields invalid data
cboggs opened this issue · 2 comments
Setup
Run stackdriver_exporter with --monitoring.metrics-type-prefixes=bigtable.googleapis.com.
Expected
Metrics like stackdriver_bigtable_table_bigtable_googleapis_com_server_latencies_count should present as monotonically increasing counters, as normal Prometheus histogram counters do.
Actual
Metrics like stackdriver_bigtable_table_bigtable_googleapis_com_server_latencies_count frequently decrease and oscillate, which a counter should never do.
This throws off quantile calculations on the corresponding histogram buckets, because the individual bucket counts oscillate in the same manner. The resulting per-bucket rates produce erroneous quantiles (e.g. 47-day-long latency values at p50).
It's not just _count, but the bucket values as well.
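To illustrate why the oscillation skews rates, here is a minimal sketch of counter-reset handling in the style of Prometheus's increase(): any drop between consecutive samples is treated as a reset to zero. This is a simplification (no extrapolation to window boundaries), and the function name and sample values are made up for illustration, not taken from the exporter or from real Bigtable data.

```python
def increase_with_reset_handling(samples):
    """Approximate a Prometheus-style increase over raw counter samples.

    Any drop between consecutive samples is treated as a counter reset,
    so the full post-drop value is counted as new increase.
    """
    total = 0.0
    for prev, cur in zip(samples, samples[1:]):
        if cur >= prev:
            total += cur - prev
        else:
            # Assumed reset: the counter restarted at 0 and climbed to cur.
            total += cur
    return total


# Hypothetical data: the same four intervals, exposed two ways.
true_counter = [100, 130, 150, 190]   # cumulative totals
delta_as_counter = [100, 30, 20, 40]  # per-interval deltas exposed as a counter

print(increase_with_reset_handling(true_counter))      # 90.0
print(increase_with_reset_handling(delta_as_counter))  # 70.0
```

The second series yields 70 instead of the true 90 because the 20-to-40 step is read as an increase of 20 rather than 40 new observations. When this happens independently per bucket, the bucket rates no longer relate to each other the way a real histogram's would, which is consistent with the nonsensical quantiles described above.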
The problem is that DISTRIBUTION + DELTA metrics in Cloud Monitoring (Stackdriver) return deltas for a given interval, whereas a Prometheus counter should present the total accumulated count.
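One possible direction, sketched below, is to keep a running total per series and add each interval's delta to it before exposing the value. This is only an illustration of the idea, not the exporter's code: the class and method names are hypothetical, and it assumes each interval is delivered exactly once, that the bucket layout stays fixed per series, and that the incoming bucket counts are per-bucket (so they are summed across buckets to get Prometheus-style cumulative le buckets).

```python
from collections import defaultdict
from itertools import accumulate


class DeltaHistogramAccumulator:
    """Hypothetical helper: turns per-interval bucket deltas into running totals."""

    def __init__(self):
        self._buckets = {}               # series key -> running per-bucket totals
        self._count = defaultdict(int)   # series key -> running observation count

    def observe(self, series_key, bucket_deltas):
        """Add one interval's per-bucket delta counts to the running totals."""
        running = self._buckets.setdefault(series_key, [0] * len(bucket_deltas))
        for i, delta in enumerate(bucket_deltas):
            running[i] += delta
        self._count[series_key] += sum(bucket_deltas)

    def cumulative_buckets(self, series_key):
        """Return bucket totals summed across buckets (le-style cumulative)."""
        return list(accumulate(self._buckets[series_key]))

    def count(self, series_key):
        """Return the total number of observations seen so far."""
        return self._count[series_key]


acc = DeltaHistogramAccumulator()
acc.observe("table-1", [5, 3, 0])  # first interval
acc.observe("table-1", [1, 0, 2])  # second interval
print(acc.cumulative_buckets("table-1"))  # [6, 9, 11]
print(acc.count("table-1"))               # 11
```

With this approach the exposed values only ever grow while the process is running. The state is lost on restart, which shows up as an ordinary counter reset, and a missed or duplicated interval would make the totals drift, so a real implementation would need to track interval end times as well.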
Related: #116 (comment)