Brooklin Kafka mirroring task produces duplicated records on re-balance
sanjay24 opened this issue · 3 comments
sanjay24 commented
Subject of the issue
If the group coordinator becomes unreachable for a Kafka mirroring task (on the consumer end), a re-balance is triggered and the task produces duplicated records.
Your environment
- Operating System: CentOS 7.6
- Brooklin version: master/1.0.2
- Java version: 1.8
- Kafka version: 2.1.0
- ZooKeeper version: 3.4.13
Steps to reproduce
- Enable and start a Kafka mirroring task
- Make the source broker unreachable
- Observe that a re-balance is triggered and check for duplicates
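
One possible way to do the duplicate check in the last step is sketched below. It assumes each mirrored record can be tagged with an identifier that is unique in the source cluster; the `(source_partition, source_offset)` tuples and the `find_duplicates` helper are hypothetical and not part of Brooklin.

```python
from collections import Counter


def find_duplicates(record_ids):
    """Return {record_id: count} for every id that appears more than once."""
    counts = Counter(record_ids)
    return {rid: n for rid, n in counts.items() if n > 1}


# Hypothetical ids of the form (source_partition, source_offset),
# as they might be collected by reading the destination topic.
mirrored = [(0, 10), (0, 11), (0, 12), (0, 11), (0, 12), (0, 13)]
duplicates = find_duplicates(mirrored)
```

An empty result means no record was mirrored twice in the sampled range.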
Expected behaviour
No duplicates
Actual behaviour
Duplicated data
sanjay24 commented
If exactly-once semantics are not supported, are there any suggested configurations that could reduce potential duplicates?
ahmedahamid commented
This is by design, @sanjay24. Brooklin supports at-least-once semantics. We haven't assessed what it would take to have Brooklin operate under exactly-once semantics when mirroring Kafka clusters.
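
To illustrate why at-least-once delivery can yield duplicates after a re-balance, here is a toy simulation. It is not Brooklin code: the `mirror` and `dedupe` functions, the `commit_every` and `rebalance_at` parameters, and the single-partition model are all assumptions made for the sketch. The idea is that records are produced to the destination before the consumed offset is committed, so a re-balance replays everything after the last commit.

```python
def mirror(source, commit_every, rebalance_at=None):
    """Simulate an at-least-once mirroring loop over a single partition.

    source:       list of records; the list index plays the role of the offset
    commit_every: commit the consumed position after this many records
    rebalance_at: position at which one re-balance interrupts the consumer
    """
    destination = []
    committed = 0      # position the consumer resumes from after a re-balance
    position = 0
    rebalanced = False
    while position < len(source):
        # Produce to the destination first ...
        destination.append((position, source[position]))
        position += 1
        # ... and commit the source position only afterwards.
        if position % commit_every == 0:
            committed = position
        if rebalance_at is not None and not rebalanced and position == rebalance_at:
            rebalanced = True
            position = committed  # uncommitted progress is lost, records replay
    return destination


def dedupe(records):
    """Downstream de-duplication keyed on the (hypothetical) source offset."""
    seen = set()
    unique = []
    for offset, value in records:
        if offset not in seen:
            seen.add(offset)
            unique.append((offset, value))
    return unique


records = list("abcdefgh")
with_replay = mirror(records, commit_every=3, rebalance_at=5)
offsets = [offset for offset, _ in with_replay]  # offsets 3 and 4 appear twice
```

In this model, a smaller commit interval narrows the replay window (with `commit_every=1` the same re-balance replays nothing), and a downstream consumer that tracks source offsets can drop the replays. Whether Brooklin exposes an equivalent setting, and under what name, is not something this sketch establishes.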
ahmedahamid commented
Please feel free to reach out if you have any questions.