strimzi/kafka-quotas-plugin

Block replication and not just producers

scholzj opened this issue · 10 comments

Currently, when the disk is getting full, the quota plugin blocks producers => it only stops writes to the partitions for which the affected broker is the leader. But it does not block replication. So, under the right circumstances, producers will be stopped from producing directly to this node but will keep producing to partitions with leaders on other nodes, and that data will get replicated to the broker whose disk is over the limit and can fill it up completely.

The idea discussed in the past was to have the plugin just expose the information to the CO, and the CO would control the quotas both for producers through the plugin and for replication through replication throttles set via the Kafka Admin API.

what's CO?

@k-wall Cluster Operator

My 2 cents ...

The ClientQuotaCallback implementations are handled by the ClientQuotaManager in the Kafka broker, and it is in charge of handling quotas for clients only (produce, fetch, ...), not for replica followers.
The Kafka broker has a different component named ReplicationQuotaManager which handles replication throttling "internally" (with no plugin extension point) via the corresponding dynamic configuration parameter follower.replication.throttled.rate, which can be changed, as Jakub mentioned, via the Admin Client API with no need for a broker roll.

Having the current quota plugin do this configuration would be a hack, and it's a solution that I don't like for several reasons:

  • the scope of the plugin and its corresponding interface is about clients, as already mentioned, but ...
  • ... it might somehow be possible to instantiate an Admin Client in the plugin, even though its configuration would become really problematic. The plugin would need configuration for TLS and even mutual authentication (consider, for example, that the Strimzi operator does some of its handling via the Admin Client API by connecting to the replication port 9091, which has TLS enabled and requires mutual authentication). Maybe not impossible, but a real "hack", which I don't like.

It's possible I am missing some other good reason why the above "hack" would not work at all.
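To make the configuration burden concrete, here is a minimal sketch (hypothetical file paths and passwords) of the client configuration the plugin would have to carry just to reach a TLS-plus-mutual-auth listener; the property keys are the standard Kafka client ones:

```java
import java.util.Properties;

public class AdminClientTlsConfigSketch {

    // Configuration the plugin would need to embed to talk to a listener
    // with TLS and mutual authentication (paths/passwords are made up).
    static Properties adminClientProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "my-cluster-kafka-brokers:9091");
        props.put("security.protocol", "SSL");                        // TLS ...
        props.put("ssl.truststore.location", "/tmp/cluster-ca.p12");  // ... trusting the cluster CA ...
        props.put("ssl.truststore.password", "changeit");
        props.put("ssl.keystore.location", "/tmp/client.p12");        // ... plus a client cert for mutual auth
        props.put("ssl.keystore.password", "changeit");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(adminClientProps().getProperty("security.protocol"));
    }
}
```

These properties would then be passed to `Admin.create(...)` so the plugin could alter the broker's dynamic `follower.replication.throttled.rate` configuration — and keeping those keystores and passwords in sync inside the plugin is exactly the part that makes this a hack.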

So, the logic for throttling replication should live somewhere else, with the plugin providing more information through metrics.
It already exposes JMX metrics for "total storage used bytes" and "soft limit bytes", but it would need some additions:

  • the broker.id they refer to (so that replication throttling can be configured for that specific broker)
  • a "hard limit bytes" metric, to allow making better decisions about the throttling

I would say that the quota plugin itself would only need the above changes, in order to allow a different component to make decisions about replication throttling.
This component could be the CO, as Jakub mentioned, which:

  • should be able to scrape metrics from the quota plugin via JMX. It could reuse much the same logic as the JMX Exporter that we use today for scraping Kafka JMX metrics and translating them into Prometheus format.
  • already has all the data/information needed to easily configure an Admin Client, connect to the cluster, and make the change on a specific broker.

If users are running this quota plugin outside of Strimzi, they could still use the JMX metrics to let a software component of their own do the replication throttling. It's just a matter of giving them the possibility to make decisions based on metrics.

@k-wall @scholzj any thoughts on this?

Whatever the solution is, it will probably require a proposal => especially if it involves the cluster operator doing remote configuration.

@SamBarker @robobario I believe this was more or less addressed in your changes to the plugin, right?

Not directly; there can still be replication between brokers, but I think blocking replication directly is likely to lead to many subtle and awkward issues.

What we have done addresses the core issues and makes replication-induced disk problems very unlikely.

Right. But the plugin now blocks producers on all nodes when the disk gets full on any one of them, not just on that single node, correct? Which I think mostly addresses this indirectly.

The motivation for this issue was that the old version of the plugin blocked only the producers talking directly to the node running out of disk space, but not producers talking to other nodes. So it could have happened that replication from the other nodes filled up the disk anyway. Whereas when we block producers on all nodes as soon as just one of them starts running out of disk space, no new messages should enter the cluster, and thus there is little new data to replicate.

Correct, the quota plugin prevents production on all brokers when any disk fills up and so replication should cease soon after.

Great, thanks.

Closing as this is fixed through blocking producers on all nodes.