grafana/dashboard-linter

Check every query for a counter (`_total`) is `irate`'d, and that we never use `rate`

mshahzeb opened this issue · 6 comments

Always use irate instead of rate.

Hi, I'm not sure about the explanation: it seems to me that this thread explains the disadvantages of using irate more than anything else. I also think that rate should be preferred over irate almost everywhere...

@MrFreezeex Yes, that is the consensus, and we will be putting a check for that in the dashboard linter.

I will remove the link to the confusing discussion :)

Do you mind explaining why? I very much agree with the thread you posted, which explains why irate skips data points and why the graph will be "incorrect" (quoting directly: No, irate doesn’t capture spikes - it just returns random sample of data points for the given time series.)... I'm also not sure this is much of a consensus, since even the Mimir dashboards seem to prefer rate over irate in most cases, if I am not mistaken.
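To make the quoted point concrete, here is a minimal sketch of the difference. It assumes a simplified model: `rate` is taken as the per-second increase between the first and last sample in the window, and `irate` as the per-second increase between the last two samples. Real Prometheus also handles counter resets and extrapolates to the window boundaries, which this ignores. The function names and the sample data are hypothetical.

```python
# Simplified model only: no counter-reset handling, no boundary extrapolation.

def simple_rate(samples):
    """Per-second increase between the first and last sample in the window."""
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)


def simple_irate(samples):
    """Per-second increase between the last two samples in the window."""
    (t0, v0), (t1, v1) = samples[-2], samples[-1]
    return (v1 - v0) / (t1 - t0)


# Hypothetical counter scraped every 15s as (timestamp_seconds, value):
# a burst of 300 between t=15 and t=30, then nothing.
window = [(0, 0), (15, 0), (30, 300), (45, 300), (60, 300)]

print(simple_rate(window))   # 5.0 (the burst is averaged over the window)
print(simple_irate(window))  # 0.0 (the burst is not in the last two samples)
```

Whether the burst shows up under `irate` depends on where the evaluation point happens to fall relative to it, which is roughly what the quoted comment means by a "random sample".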

This is a healthy discussion, and I believe it may be an indicator that this isn't appropriate as a hard-and-fast rule.

I'll also confess that while I have a decent understanding of the difference between rate and irate, I do not have a clear understanding of the recommendation to always use irate. It seems to me that you'd use an appropriate function for the type of counter you're trying to represent, which would include the choice of rate or irate as well as the min step and the interval chosen (typically $__rate_interval, which is also a rule).

I would defer a bit to @tomwilkie here, as this rule was his suggestion. I am inclined to either forego creating this rule, or clearly understand and document the intent behind it so that users can make an informed decision about whether they exclude it or not.

Yeah, I think in hindsight it's best not to mandate the use of irate; still worth having a rule that checks that counters (i.e. anything ending in _total) are rate'd/irate'd/increased etc., no?

That sounds useful for sure! I can't think of any use-case why you would not do that on a counter...
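For illustration, the rule suggested above (anything ending in `_total` should be wrapped in `rate`, `irate`, or `increase`) could be sketched roughly like this. This is not the dashboard linter's implementation: it is a regex-and-parenthesis heuristic rather than a real PromQL parser, the name `unwrapped_counters` is hypothetical, and it will misfire on things like `_total` inside label values or recording-rule names.

```python
import re

# Functions that count as "handling" a counter for this sketch.
ALLOWED = {"rate", "irate", "increase"}

# A metric-like identifier ending in _total.
COUNTER = re.compile(r"[a-zA-Z_:][a-zA-Z0-9_:]*_total\b")

# The identifier (if any) sitting directly before an opening parenthesis.
IDENT_BEFORE_PAREN = re.compile(r"([a-zA-Z_][a-zA-Z0-9_]*)\s*$")


def unwrapped_counters(expr):
    """Return the *_total names in expr not enclosed by rate/irate/increase."""
    bad = []
    for match in COUNTER.finditer(expr):
        enclosing = []  # names of the calls still open at the counter's position
        for i, ch in enumerate(expr[:match.start()]):
            if ch == "(":
                name = IDENT_BEFORE_PAREN.search(expr[:i])
                enclosing.append(name.group(1) if name else "")
            elif ch == ")" and enclosing:
                enclosing.pop()
        if not ALLOWED.intersection(enclosing):
            bad.append(match.group(0))
    return bad


print(unwrapped_counters('sum(rate(http_requests_total{job="api"}[$__rate_interval]))'))
# []
print(unwrapped_counters("sum(http_requests_total)"))
# ['http_requests_total']
```

A real rule would more likely walk the parsed PromQL expression tree, so that label matchers and nested subqueries are handled properly.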