confluentinc/librdkafka

Consumer data backlog after "Heartbeat failed: REBALANCE_IN_PROGRESS: group is rebalancing"

guoliushui opened this issue · 0 comments

Description

While running our service, we noticed a data backlog on some partitions. Our investigation showed that a rebalance was in progress when the heartbeat was sent. After the heartbeat failed, the client triggered a rebalance and its assigned partitions were revoked, but consumer group coordination was not triggered afterwards: neither the coordinator nor the group assignment was redistributed, so the revoked partitions were never consumed again.

497:W0517 16:16:25.694190 367 ] topic=MY_TOPIC, lvl=7, facility=HEARTBEAT, msg=[thrd:main]: Heartbeat failed: REBALANCE_IN_PROGRESS: group is rebalancing

./block_consumer.log:112:336:I0517 16:16:21.057873 371 g] Got revocated 1 partitions!, partition info is: [ MY_TOPIC[25:#] ]
./block_consumer.log:137:485:I0517 16:16:25.693912 371 ] Got assigned 1 partitions!, partition info is: [ MY_TOPIC[25:#] ]
./block_consumer.log:153:501:I0517 16:16:25.694259 371 g] Got revocated 1 partitions!, partition info is: [ MY_TOPIC[25:#] ]
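The three log lines above show partition 25 being revoked, assigned, and then revoked again within about 0.4 ms of the assignment. As a rough aid for scanning a larger log for the same pattern, here is a minimal sketch that reports partitions whose most recent event is a revoke with no later assignment. It assumes the application log format shown above (this format belongs to the reporting service, not to librdkafka) and that lines appear in chronological order. The names `EVENT_RE`, `PART_RE`, `last_events`, and `revoked_without_reassign` are hypothetical helpers introduced only for this illustration.

```python
import re

# Assumption: matches the reporter's application log lines shown above,
# e.g. "... 16:16:25.694259 371 g] Got revocated 1 partitions!, partition info is: [ MY_TOPIC[25:#] ]".
# "revocated" is the spelling used by the application log itself.
EVENT_RE = re.compile(
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d+).*?"
    r"Got (?P<kind>assigned|revocated) \d+ partitions!, "
    r"partition info is: \[ (?P<parts>.*) \]"
)
# One "TOPIC[partition:offset]" entry inside the partition info list.
PART_RE = re.compile(r"(?P<topic>[^\s\[\]]+)\[(?P<partition>\d+):[^\]]*\]")


def last_events(lines):
    """Return {(topic, partition): (time, kind)} for the latest event per partition.

    Assumes the lines are already in chronological order.
    """
    latest = {}
    for line in lines:
        m = EVENT_RE.search(line)
        if m is None:
            continue  # not an assign/revoke line
        for p in PART_RE.finditer(m.group("parts")):
            key = (p.group("topic"), int(p.group("partition")))
            latest[key] = (m.group("time"), m.group("kind"))
    return latest


def revoked_without_reassign(lines):
    """Partitions whose last recorded event is a revoke (backlog candidates)."""
    return sorted(
        key for key, (_, kind) in last_events(lines).items() if kind == "revocated"
    )


SAMPLE = [
    "./block_consumer.log:112:336:I0517 16:16:21.057873 371 g] Got revocated 1 partitions!, partition info is: [ MY_TOPIC[25:#] ]",
    "./block_consumer.log:137:485:I0517 16:16:25.693912 371 ] Got assigned 1 partitions!, partition info is: [ MY_TOPIC[25:#] ]",
    "./block_consumer.log:153:501:I0517 16:16:25.694259 371 g] Got revocated 1 partitions!, partition info is: [ MY_TOPIC[25:#] ]",
]

if __name__ == "__main__":
    print(revoked_without_reassign(SAMPLE))  # [('MY_TOPIC', 25)]
```

Run against the full log, a partition that stays in this list until the end of the file would be consistent with the backlog described here, although it does not by itself show why the group was not rejoined.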

block_consumer.log

Checklist

Please provide the following information:

  • librdkafka version (release number or git tag): v1.0.0-RC7
  • Apache Kafka version: not provided
  • librdkafka client configuration: enable.auto.commit=false, enable.sparse.connections=false, debug=consumer,cgrp,fetch
  • Operating system: CentOS Linux 7 (Core)
  • Provide logs (with debug=consumer,cgrp,fetch)
  • Provide broker log excerpts
  • Critical issue