grafana/metrictank

Stored 'lastupdate' being approximate can cause inconsistent missing data

shanson7 opened this issue · 6 comments

Describe the bug

The cassandra idx has a configurable update-interval. The stored lastupdate value is only rewritten once it falls out of date by more than the configured update-interval (default 4h). There is an equivalent setting in bigtable (default 3h).

The bug is that when a Metrictank instance restarts it will load the lastupdate value from storage, which can be behind by as much as update-interval. For series that are not regularly published, this means there is a range of time during which the series will not be returned, despite having data, because the lastupdate value is imprecise.

Possible Solutions

1. Reduce update-interval

This was our first step. It's effective at reducing the likelihood of the issue, but reducing the interval too far adds a decent amount of load on the Metrictank instances and cassandra, so this workaround has diminishing returns.

2. Periodically persist series whose index entries haven't been saved in a while and whose stored lastupdate is stale.

This seems like a reasonable solution since it would reduce the window that lastupdate is out-of-date in the database, but any instances that restart while it's out of date will not pick up changes afterward. Also, instances shutting down without updating lastupdate would not be able to update it later.

3. Add update-interval seconds to lastupdate on startup.

This solution seems the most sound to me. If update-interval is 60 minutes, add up to 3600 seconds to the lastupdate value we read from the database (without going beyond the current time), to err on the side of including series that might not have data rather than excluding series that might have data.

Helpful Information
Metrictank Version: latest
Golang Version: 1.12
OS: RHEL

I thought this exact issue has come up before, but I couldn't find an issue or PR.
Could only find some related issues (e.g. #550, #1532)

sidenote: somewhat confusingly, LastSave is the property that governs writes to the persistent index (subject to "update-interval"), and it causes the full metricdef to be written, but yeah i can see how this would be a problem for the lastUpdate field specifically.

Option 3 seems like the "obviously correct" solution to me. In fact, it essentially "extends" the exact same solution of #1532, taking the "extra allowance" used at query time and enabling it at index loading time. Actually, for the former fix to function correctly, it requires the latter fix (shortly after startup, anyway).
I'm not too worried about this causing a lot more index entries to be loaded, as the update-interval is typically a low value (a couple of hours), much shorter than the pruning durations used in index-rules.conf.

So yes, let's do number 3. Should be a very easy fix, I think.

Actually, for the former fix to function correctly, it requires the latter fix (shortly after startup, anyway).

Not sure I understand this bit. It seems to me that Option 3 would obviate the need for #1532 (and would support the large number of Find style operations that have been added since). So implementing Option 3 would likely mean undoing the change in #1532.

#1532 makes sure that e.g. a metric with lastUpdate=x-3600 is included in the response to a query from: x to: y, if update-interval is 2*3600 (because if the MT instance started after the last point was sent for a metric, the metric may have been updated after x-3600 but the lastUpdate would not have been saved to the persistent index). If the MT instance is not "fresh" and the metric is still alive, iow if it has ingested points for the metric with newer timestamp values, then the lastUpdate field in the memory index is perfectly accurate, and the #1532 tweak is needlessly "eager".

I think I understand now what you meant with option 3. The above assumes that the lastUpdate field is retained from cassandra/bigtable when loading into memory. But you're suggesting to increment the live, in memory lastUpdate property by update-interval. I think this will work indeed, and involves undoing #1532

The main concern I have is whether there's an edge case where repeated restart cycles will cause the lastUpdate to keep increasing. That would require updating the "adjusted lastUpdate" to the persistent index, but i have gone through the AddOrUpdate code and it seems this wouldn't happen precisely because we always wait update-interval before saving any update.

and would support the large number of Find style operations that have been added since

TagQuery, basically?

The main concern I have is whether there's an edge case where repeated restart cycles will cause the lastUpdate to keep increasing. That would require updating the "adjusted lastUpdate" to the persistent index, but i have gone through the AddOrUpdate code and it seems this wouldn't happen precisely because we always wait update-interval before saving any update.

Right, we only write updates when updates are received, so I think we should be safe from this scenario.

TagQuery, basically?

Yeah, the various functions that take a TagQuery.

I think I understand now what you meant with option 3. The above assumes that the lastUpdate field is retained from cassandra/bigtable when loading into memory. But you're suggesting to increment the live, in memory lastUpdate property by update-interval. I think this will work indeed, and involves undoing #1532

The upside is that we also remove the "needless eagerness" of #1532, by only adjusting lastUpdate when it makes sense: in the average case (with a MT instance that has been up for a while) lastUpdate will always be precise (because the in-memory index always updates its lastUpdate values), whereas #1532 needlessly bumped the values.

Let me whip up a PR for you...