reddit/baseplate.py

Complain if # kafka consumers exceeds # partitions

Opened this issue · 1 comments

It is quite easy for someone who is not familiar with Kafka to think that more consumers are better. However, if you run more consumers than there are partitions, the extra consumers will never process any messages. See if we can get baseplate to complain loudly if the number of consumers exceeds the number of partitions on a topic.

Hi, I am interested in taking this on if it is available!

My proposed change would go in kafka.py:BaseKafkaQueueConsumerFactory:make_kafka_consumer. What I'd like to do is iterate through the topics in ClusterMetadata and check the number of partitions in each TopicMetadata. The number of partitions on a topic can then be compared to the number of consumers currently listening to that topic. If the number of consumers is already equal to the number of partitions, creating another consumer would exceed it, so we would throw an error when creating the consumer.
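
A minimal sketch of what such a check could look like, assuming the metadata object has the shape of confluent_kafka's ClusterMetadata (a `topics` dict whose values expose a `partitions` dict). The function name `check_consumer_count`, the `TooManyConsumersError` exception, and the `Fake*` stand-in classes are hypothetical and are not part of baseplate or confluent_kafka. How `existing_consumers` is obtained is left open here.

```python
from dataclasses import dataclass, field
from typing import Dict


class TooManyConsumersError(Exception):
    """Hypothetical error type; not part of baseplate or confluent_kafka."""


def check_consumer_count(metadata, topic: str, existing_consumers: int) -> int:
    """Raise if one more consumer would exceed the topic's partition count.

    `metadata` only needs a `topics` dict whose values have a `partitions`
    dict, which is the shape of ClusterMetadata / TopicMetadata.
    Returns the partition count when the check passes.
    """
    topic_metadata = metadata.topics.get(topic)
    if topic_metadata is None:
        raise KeyError(f"topic {topic!r} not found in cluster metadata")

    partition_count = len(topic_metadata.partitions)
    if existing_consumers >= partition_count:
        raise TooManyConsumersError(
            f"topic {topic!r} has {partition_count} partitions and already "
            f"has {existing_consumers} consumers; another would sit idle"
        )
    return partition_count


# Stand-ins for illustration only, so the sketch runs without a broker.
@dataclass
class FakeTopicMetadata:
    partitions: Dict[int, object] = field(default_factory=dict)


@dataclass
class FakeClusterMetadata:
    topics: Dict[str, FakeTopicMetadata] = field(default_factory=dict)


metadata = FakeClusterMetadata(
    topics={"events": FakeTopicMetadata(partitions={0: None, 1: None, 2: None})}
)

# Two consumers on three partitions: a third consumer is still useful.
check_consumer_count(metadata, "events", existing_consumers=2)
```

With a real confluent_kafka Consumer, the metadata could come from `consumer.list_topics(topic)`. Raising is only one option; logging a loud warning instead would avoid breaking existing deployments.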

A question I have is whether the consumers use consumer group functionality. I ask because, if so, I could use the confluent_kafka AdminClient to list the consumer groups and then find each consumer that is a member of a given group.

I hope this makes sense and appreciate any insight you can provide. 😊