LeaderNotAvailable disguised as UnknownTopicOrPartition
the problem
When running SimpleConsumer.subscribe in a LeaderNotAvailable scenario, an UnknownTopicOrPartition error is thrown:
KafkaError: This request is for a topic or partition that does not exist on this broker.
This can be reproduced intermittently (sometimes the error appears, sometimes it doesn't) by running the code at https://github.com/Quadric/radiaction/tree/40d3433be9da803ab2c2207e51f4088bcb4ed069/examples/basic-example
It's important to address this because the case is very hard to catch and debug. It took me days to find this error hidden inside SimpleConsumer.client.topicMetadata. Keep in mind that the error is never guaranteed to be there the next time you run your code, given the nature of a LeaderNotAvailable issue. This is what my topicMetadata looks like sometimes (other times it's just empty):
{
  "rick-morty__BUY_SAUCE": {
    "0": {
      "error": {
        "name": "KafkaError",
        "code": "LeaderNotAvailable",
        "message": "This error is thrown if we are in the middle of a leadership election and there is currently no leader for this partition and hence it is unavailable for writes."
      },
      "partitionId": 0,
      "leader": -1,
      "replicas": [],
      "isr": []
    }
  },
  ... // repeats for every topic
}
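To make this condition easier to spot, the metadata can be scanned for leaderless partitions. The following is only a sketch: findLeaderlessPartitions is a hypothetical helper, not part of the library, and it assumes nothing beyond the object shape shown in the dump above (topic name, then partition id, then an optional error with a code field and a leader field).

```javascript
// Hypothetical helper, not a library API. It relies only on the
// topicMetadata shape shown in the dump above.
function findLeaderlessPartitions(topicMetadata) {
  const leaderless = [];
  for (const [topic, partitions] of Object.entries(topicMetadata || {})) {
    for (const [partition, meta] of Object.entries(partitions || {})) {
      const code = meta && meta.error && meta.error.code;
      // A partition counts as leaderless if it carries the error code
      // or reports leader -1, as in the dump.
      if (code === 'LeaderNotAvailable' || (meta && meta.leader === -1)) {
        leaderless.push({ topic, partition: Number(partition), code: code || null });
      }
    }
  }
  return leaderless;
}

// Sample input copied from the dump (message shortened).
const sampleMetadata = {
  'rick-morty__BUY_SAUCE': {
    '0': {
      error: { name: 'KafkaError', code: 'LeaderNotAvailable', message: '...' },
      partitionId: 0,
      leader: -1,
      replicas: [],
      isr: []
    }
  }
};

console.log(findLeaderlessPartitions(sampleMetadata));
```

Since the metadata is sometimes empty, an empty result here does not prove that every partition has a leader.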
the solutions
- a LeaderNotAvailable error should be thrown as the result of a subscribe that fails due to the lack of a leader
- there needs to be a way to wait for a leader to be elected, and then be able to call subscribe again.
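As a rough illustration of the second point, a caller-side workaround could retry the subscribe call until a leader appears. This is a sketch under assumptions, not the library's behavior: retryWhileLeaderless, its retries and delayMs options, and the attempt callback are all hypothetical names, and it assumes the rejected error exposes the Kafka error name in a code field, as the error object in topicMetadata does.

```javascript
// Hypothetical helper, not a library API. `attempt` is any function that
// returns a promise, for example a wrapper around the consumer's subscribe.
const RETRYABLE_CODES = new Set(['LeaderNotAvailable', 'UnknownTopicOrPartition']);

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function retryWhileLeaderless(attempt, { retries = 5, delayMs = 500 } = {}) {
  for (let i = 0; ; i++) {
    try {
      return await attempt();
    } catch (err) {
      // Assumption: the error carries the Kafka error name in `code`.
      const retryable = RETRYABLE_CODES.has(err && err.code);
      if (!retryable || i >= retries) throw err;
      // Linear backoff to give the election time to finish.
      await sleep(delayMs * (i + 1));
    }
  }
}

// Demo with a fake subscribe that fails twice before a leader is elected.
let calls = 0;
function fakeSubscribe() {
  calls += 1;
  if (calls < 3) {
    const err = new Error('no leader yet');
    err.code = 'LeaderNotAvailable';
    return Promise.reject(err);
  }
  return Promise.resolve('subscribed');
}

retryWhileLeaderless(fakeSubscribe, { retries: 5, delayMs: 10 })
  .then((result) => console.log(result, 'after', calls, 'attempts'));
```

UnknownTopicOrPartition is treated as retryable here only because this issue reports it masking LeaderNotAvailable; for a topic that really does not exist, the retries are exhausted and the error is rethrown.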
Topic:partition pairs that receive a LeaderNotAvailable error during subscribe are retried on each _fetch call. So that's exactly what you describe as the second solution:
"there needs to be a way to wait for a leader to be elected, and then be able to call subscribe again."