infrawatch/sg-core

Timestamps are not updating

Closed this issue · 2 comments

While testing recently I saw these warnings from Prometheus:

level=warn ts=2020-04-17T15:58:56.610684616Z caller=manager.go:514 component="rule manager" group=./openstack.rules msg="Error on ingesting results from rule evaluation with different value but same timestamp" numDropped=1
level=warn ts=2020-04-17T15:59:06.056020486Z caller=scrape.go:1094 component="scrape manager" scrape_pool=service-telemetry/stf-default/0 target=http://10.129.0.31:8081/metrics msg="Error on ingesting samples that are too old or are too far into the future" num_dropped=33

Closer inspection reveals that the timestamps are not updating.

collectd_cpu_percent{host="compute-0.redhat.local",plugin_instance="0",type_instance="idle"} 99.8 1587147485180
collectd_cpu_percent{host="compute-0.redhat.local",plugin_instance="0",type_instance="idle"} 98.8 1587147510180
collectd_cpu_percent{host="compute-0.redhat.local",plugin_instance="0",type_instance="idle"} 94.8 1587147525180

but also

collectd_cpu_percent{host="controller-0.redhat.local",plugin_instance="0",type_instance="idle"} 91.3654618473896 1587147521092
collectd_cpu_percent{host="controller-0.redhat.local",plugin_instance="0",type_instance="idle"} 91.9191919191919 1587147576092
collectd_cpu_percent{host="controller-0.redhat.local",plugin_instance="0",type_instance="idle"} 92.9859719438878 1587147581092

I'm pretty sure this is my bug: the timestamp seems to be recorded only when the first metric with a particular label set is received.

There is definitely code to pull the timestamp each time a message is received, but perhaps the Prometheus API code is holding a reference that it expects to be updated?

I'm recording here for posterity that it was a long day: those timestamps are clearly updating, as the output I put in the issue description shows! I don't know how I could have missed it. The log messages are real, though; I'll still need to figure out what's causing them and whether it has anything to do with the smoketest failing.
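
For anyone who wants to repeat the check without eyeballing it, here is a rough sketch in Go that parses scraped lines like the ones in the issue description and reports any series whose timestamp fails to advance between consecutive samples. The helper names `parseSample` and `staleSeries` are made up for this illustration and are not part of sg-core or the Prometheus client library. It also assumes each line ends with a value and a millisecond timestamp, as in the samples above.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseSample (hypothetical helper) splits one scraped line into its series
// identifier (metric name plus labels), value, and millisecond timestamp.
// It assumes the last two whitespace-separated fields are value and timestamp.
func parseSample(line string) (string, float64, int64, error) {
	fields := strings.Fields(line)
	n := len(fields)
	if n < 3 {
		return "", 0, 0, fmt.Errorf("expected series, value and timestamp: %q", line)
	}
	ts, err := strconv.ParseInt(fields[n-1], 10, 64)
	if err != nil {
		return "", 0, 0, fmt.Errorf("bad timestamp in %q: %w", line, err)
	}
	val, err := strconv.ParseFloat(fields[n-2], 64)
	if err != nil {
		return "", 0, 0, fmt.Errorf("bad value in %q: %w", line, err)
	}
	return strings.Join(fields[:n-2], " "), val, ts, nil
}

// staleSeries (hypothetical helper) returns each series, at most once, whose
// timestamp did not strictly increase from one sample to the next.
func staleSeries(lines []string) ([]string, error) {
	last := map[string]int64{}
	reported := map[string]bool{}
	var stale []string
	for _, line := range lines {
		series, _, ts, err := parseSample(line)
		if err != nil {
			return nil, err
		}
		if prev, ok := last[series]; ok && ts <= prev && !reported[series] {
			stale = append(stale, series)
			reported[series] = true
		}
		last[series] = ts
	}
	return stale, nil
}

func main() {
	// Samples copied from the issue description.
	lines := []string{
		`collectd_cpu_percent{host="compute-0.redhat.local",plugin_instance="0",type_instance="idle"} 99.8 1587147485180`,
		`collectd_cpu_percent{host="compute-0.redhat.local",plugin_instance="0",type_instance="idle"} 98.8 1587147510180`,
		`collectd_cpu_percent{host="compute-0.redhat.local",plugin_instance="0",type_instance="idle"} 94.8 1587147525180`,
	}
	stale, err := staleSeries(lines)
	if err != nil {
		panic(err)
	}
	fmt.Printf("series with non-advancing timestamps: %d\n", len(stale))
}
```

Run against the samples from the description, this finds no stale series, which matches the conclusion that the timestamps are updating. It says nothing about the two Prometheus warnings, which still need a separate explanation.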