quixio/quix-streams

Application leaves the group faster than expected on `stop()`

Tell us about the bug
When the application stops gracefully (via `Application.stop()`), it closes the Consumer. Closing the Consumer triggers the `on_revoke` callback, which in turn triggers the checkpoint commit.
The problem is that once the Consumer is being closed, it stops sending heartbeats, which limits the time the app has to commit its work before stopping.

If the commit takes too long (e.g., while flushing the sinks), the app may fail with the `Broker: Unknown member id` error instead of stopping gracefully.

What did you expect to see?
The commit deadline is expected to be dictated by the `max.poll.interval.ms` setting, which defaults to 5 minutes.

Actual behavior
Because the Consumer stops sending heartbeats to the broker, the much shorter `session.timeout.ms` (45s by default) applies instead, and checkpoint commits may fail if they take longer than `session.timeout.ms`.
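
To make the gap between the two deadlines concrete, here is a minimal sketch of the timing described above. It is a simplified model, not library code: `shutdown_outcome` and `ShutdownOutcome` are hypothetical names, the two constants are the defaults quoted in this report, and it assumes the commit fails exactly when it outlasts the applicable timeout.

```python
from dataclasses import dataclass

# Defaults quoted in this report, in milliseconds.
MAX_POLL_INTERVAL_MS = 300_000  # max.poll.interval.ms (5 minutes)
SESSION_TIMEOUT_MS = 45_000     # session.timeout.ms (45 seconds)


@dataclass
class ShutdownOutcome:
    deadline_ms: int
    commit_succeeds: bool


def shutdown_outcome(commit_ms: int, heartbeats_running: bool) -> ShutdownOutcome:
    """Hypothetical model of the commit deadline during a graceful stop."""
    # Assumption: while heartbeats continue, only the poll interval bounds
    # the commit; once they stop, the session timeout becomes the deadline.
    deadline_ms = MAX_POLL_INTERVAL_MS if heartbeats_running else SESSION_TIMEOUT_MS
    return ShutdownOutcome(deadline_ms, commit_ms <= deadline_ms)


# A 60-second sink flush under the expected and the reported behavior.
expected = shutdown_outcome(60_000, heartbeats_running=True)
actual = shutdown_outcome(60_000, heartbeats_running=False)
print(expected.commit_succeeds, actual.commit_succeeds)
```

Under this model, any commit that takes between 45 seconds and 5 minutes succeeds in the expected case but fails in the reported one.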

What version of the library are you using?
v3.3.0