scylladb/scylla-monitoring

Add scylla_io_queue_consumption plots

xemul opened this issue · 5 comments

xemul commented

There's a convenient metric called scylla_io_queue_consumption which shows the fraction of the maximum disk bandwidth+iops (configured with io-properties) consumed by an individual class. It's extremely useful when checking query latencies -- the metric shows whether the delays are due to the disk being full or whether there's still room to be utilized. It's also good for comparing individual classes' consumption fractions with each other.

The reported metric is of counter type. The resulting rate is in the [0.0;1.0] range, so it's better to convert it into percent rather than reporting plain decimals. It's important to sum the numbers by the iogroup label, otherwise the reported number will make little sense. Summing or splitting on a per-shard basis doesn't make sense either.
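Something along these lines should do (just a sketch -- the `$node` template variable and the exact label names are taken from the description above and may need adjusting):

```
# Per-class disk consumption for one node, as a percentage of the
# configured capacity; summed by iogroup, never split per shard.
sum by (class, iogroup) (
  rate(scylla_io_queue_consumption{instance=~"$node"}[1m])
) * 100
```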

It's good to have different IO classes (the class label) shown on the same plot, not on separate ones as is done for the per-class latency/queue-length/bandwidth/etc. It's also good to have the ability to sum up the classes and show the total consumption for the instance.
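For the per-instance total it could be the same expression with the class label summed away (again, a sketch with assumed label names):

```
# Total consumption for the instance: all classes summed, still
# grouped by iogroup so the number stays meaningful.
sum by (iogroup) (
  rate(scylla_io_queue_consumption{instance=~"$node"}[1m])
) * 100
```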

Example (per class for a single instance):
[screenshot]

mykaul commented

"are due to disk being full" - you mean disk queues?

xemul commented

Well, not quite. Let me explain it differently.

If the IO latency is high, it's almost always coupled with long IO scheduler queues, with virtually no exceptions. The question then is -- why are the scheduler queues that long? Shouldn't the scheduler dispatch more requests to the disk instead of keeping them in its queues? This metric helps answer that.

If it is close to 1.0, the answer is -- the scheduler sees that the disk capacity is exhausted (the configured capacity; seastar doesn't estimate the capacity at runtime) and it has to hold requests in the queue for longer. In that case the next question to answer is -- why does scylla put that many requests into the IO queue?

If this metric shows low numbers, say all classes sum up to 50% per iogroup, then the scheduler indeed shouldn't be keeping requests in its queues, and it's the scheduler itself that contributes to the high latency. For example, we discovered scylladb/seastar#1641 as one of the problems some time ago; had we had this metric earlier, it would have been instantly obvious.

xemul commented

BTW, comparing classes' consumption to each other is also of great interest. We expect that the query class, with its 1k shares, consumes all it wants, and that the compaction class, with its ~100 shares, doesn't dominate. This metric will show whether our expectations are met.
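As a rough sketch of that comparison, each class's share of the total consumption per iogroup could be queried like this (label names assumed as above):

```
# Fraction of the consumed capacity attributable to each class.
# Compaction dominating query here would mean the shares (1k vs ~100)
# are not being respected.
  sum by (class, iogroup) (rate(scylla_io_queue_consumption{instance=~"$node"}[1m]))
/ ignoring(class) group_left
  sum by (iogroup) (rate(scylla_io_queue_consumption{instance=~"$node"}[1m]))
```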

amnonh commented

@xemul in your example query you've ignored the mountpoint. I've checked a random cluster and saw that the mountpoint can be the scylla directory, or none. Does it make sense to just sum over all mountpoints like you did in your example?

What would be a good dashboard for this panel, detailed? advanced?

xemul commented

> @xemul in your example query you've ignored the mountpoint. I've checked a random cluster and saw that the mountpoint can be the scylla directory, or none. Does it make sense to just sum over all mountpoints like you did in your example?

It would work, because the consumption for "none" is always zero :) But technically this is incorrect and the mountpoints should be kept separate.

> What would be a good dashboard for this panel, detailed? advanced?

Good question. On one hand it fits naturally into Advanced, next to the other IO-related (and scheduler-related) metrics. But on the other hand, the grouping for this metric is different. As I wrote, I'd like the ability to see different classes on one plot next to each other. Ideally -- as a stacked plot, because these numbers are fractions of 100% and it's natural to stack them on top of each other, and the total consumption falls out of that for free. With a non-stacked plot the total would probably be just another plot line with an "all" label. In any case -- it can be a single plot with a "disk consumption for foo mountpoint" title, which matches the "OS metrics" dashboard :D
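As a rough illustration only (plain Grafana graph-panel JSON, not the dashboard template format this repo uses; the mountpoint label and the $node/$mountpoint variables are assumptions), a stacked per-class panel where the top of the stack gives the instance total:

```
{
  "type": "graph",
  "title": "Disk consumption for $mountpoint",
  "stack": true,
  "targets": [
    {
      "expr": "sum by (class, iogroup) (rate(scylla_io_queue_consumption{instance=~\"$node\", mountpoint=~\"$mountpoint\"}[1m])) * 100",
      "legendFormat": "{{class}}"
    }
  ]
}
```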