Attempted bug fix: two consumers compete for partitions on zkpath: $kafka_root/consumers/$group/owners/$topic/{$partid1; $partid2...}
sirfangx opened this issue · 2 comments
Hi,
I have run into a bug: I have one topic with 32 partitions, and two consumers in the same group consuming that topic. When consumer A starts, it consumes all 32 partitions, as expected. But when the second consumer B starts, B tries to claim 16 partitions that consumer A has not yet released.
So I added some retry code as a workaround while waiting for the author's official fix. Anyone else who has hit this bug can use this code temporarily.
In github.com/wvanbergen/kafka/consumergroup/consumer_group.go, line 339, function partitionConsumer:
change code from:
err := cg.instance.ClaimPartition(topic, partition)
to
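retrySec := 10 // retry once per second, for up to 10 seconds
var err error
for i := 0; i < retrySec; i++ {
    err = cg.instance.ClaimPartition(topic, partition)
    if err == nil {
        break
    }
    // partition is an integer, so it is formatted with %d (the extra %s argument was a bug)
    cg.Logf("%s/%d :: Retrying to claim the partition\n", topic, partition)
    time.Sleep(1 * time.Second) // needs "time" in the file's imports
}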
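The same retry idea can be pulled out into a small standalone helper. This is only a sketch: `claimWithRetry`, `claimFunc`, and `errAlreadyClaimed` are hypothetical names that are not part of the library, and the simulated claim in `main` stands in for `cg.instance.ClaimPartition(topic, partition)`.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errAlreadyClaimed is a hypothetical error used only by this sketch.
var errAlreadyClaimed = errors.New("partition already claimed by another instance")

// claimFunc is a hypothetical stand-in for a call such as
// cg.instance.ClaimPartition(topic, partition).
type claimFunc func() error

// claimWithRetry calls claim up to attempts times, sleeping for delay
// between failed attempts. It returns nil on the first success, or the
// last error if every attempt fails.
func claimWithRetry(claim claimFunc, attempts int, delay time.Duration) error {
	if attempts < 1 {
		return errors.New("attempts must be at least 1")
	}
	var err error
	for i := 0; i < attempts; i++ {
		if err = claim(); err == nil {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(delay) // skip the sleep after the final attempt
		}
	}
	return err
}

func main() {
	// Simulate a partition that the previous owner releases
	// only after two claim attempts have already failed.
	calls := 0
	claim := func() error {
		calls++
		if calls < 3 {
			return errAlreadyClaimed
		}
		return nil
	}
	err := claimWithRetry(claim, 10, 10*time.Millisecond)
	fmt.Println("error:", err, "attempts used:", calls)
}
```

Compared with the inline loop, the helper returns as soon as the claim succeeds and does not sleep after the last failed attempt, so a permanent failure is reported one delay sooner.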
Thanks.
I have added debug logs to my code, and it really does go into the retry path.
So I put my bug fix into production, and it works well, haha.
On 2015-08-11 at 11:08, Joey wrote:
I ran into this too, but I am not sure what is wrong.
@wvanbergen Hi, I also hit this issue. Below is my error log:
[Sarama]2015/08/17 14:58:20 [kafka_topic_push_group_live/afd3b0cf4ce6] kafka_topic_push_live :: Started topic consumer
[Sarama]2015/08/17 14:58:20 [kafka_topic_push_group_live/afd3b0cf4ce6] kafka_topic_push_live :: Claiming 1 of 2 partitions
[Sarama]2015/08/17 14:58:20 [kafka_topic_push_group_live/afd3b0cf4ce6] kafka_topic_push_live/0 :: FAILED to claim the partition: Cannot claim partition: it is already claimed by another instance
[Sarama]2015/08/17 14:58:20 [kafka_topic_push_group_live/afd3b0cf4ce6] kafka_topic_push_live :: Stopped topic consumer
[Sarama]2015/08/17 14:58:29 [kafka_topic_push_group_live/afd3b0cf4ce6] Deregistered consumer instance livecmt-1:e009dbd6-5d0c-4c8e-8ffc-afd3b0cf4ce6.
I think it is because the first consumer had not yet released the partition when the second consumer tried to claim it.
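That ordering problem can be illustrated with a minimal in-memory sketch. Everything here is an assumption made for illustration: `ownerRegistry`, `claim`, and `release` are hypothetical names standing in for the ZooKeeper owners path, not the library's actual implementation.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errAlreadyClaimed is a hypothetical error used only by this sketch.
var errAlreadyClaimed = errors.New("cannot claim partition: it is already claimed by another instance")

// ownerRegistry is a hypothetical in-memory stand-in for the owners path
// in ZooKeeper; it maps a partition ID to the instance that owns it.
type ownerRegistry struct {
	mu     sync.Mutex
	owners map[int32]string
}

func newOwnerRegistry() *ownerRegistry {
	return &ownerRegistry{owners: make(map[int32]string)}
}

// claim records instance as the owner of partition, and fails if a
// different instance still holds it.
func (r *ownerRegistry) claim(partition int32, instance string) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if owner, ok := r.owners[partition]; ok && owner != instance {
		return errAlreadyClaimed
	}
	r.owners[partition] = instance
	return nil
}

// release removes the ownership entry, but only if instance is the current owner.
func (r *ownerRegistry) release(partition int32, instance string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.owners[partition] == instance {
		delete(r.owners, partition)
	}
}

func main() {
	reg := newOwnerRegistry()
	_ = reg.claim(0, "A") // consumer A starts first and owns partition 0

	// Consumer B tries to claim before A has released: the claim fails.
	fmt.Println("B before release:", reg.claim(0, "B"))

	// Once A releases the partition, the same claim succeeds.
	reg.release(0, "A")
	fmt.Println("B after release:", reg.claim(0, "B"))
}
```

In this sketch a single failed claim is final, which matches the log above, where the topic consumer stops right after the failed claim. Retrying the claim for a short period gives the first consumer time to release.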