Lease renewal logic breaks consumer state when a shard is split
cfstout opened this issue · 3 comments
I recently ran into an issue with the library when trying to consume from a shard that was split into two. In this case a lease is still maintained for the parent shard (shardId-000000000000), but the polling fetcher then fails with:
Fail(software.amazon.awssdk.services.kinesis.model.ResourceNotFoundException: Shard shardId-000000000000 in stream <stream> under account <account> does not exist)
This happened in a loop many times and did not seem to recover on its own. Should the lease renewal logic check that the shard still exists before renewing its lease?
I believe I also tried redeploying my app, which didn't fix things, though I wasn't very systematic in my troubleshooting. Ideally the library would detect this case and handle it dynamically. I'm happy to take on the work, unless I'm missing some existing behavior that handles this and that I was just too impatient to wait for.
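To make the proposal concrete, here is a minimal sketch of the kind of check I have in mind. Everything in it is hypothetical: `HeldLease`, `RenewalCheck` and `partitionByExistence` are names made up for illustration and are not part of the library, and `existingShardIds` is assumed to come from a fresh listing of the stream's shards (for example via the Kinesis ListShards operation).

```scala
// Hypothetical types for illustration only; these are not the library's own.
final case class HeldLease(shardId: String, owner: String)

object RenewalCheck {
  // Splits the leases a worker holds into two groups:
  //   1. leases whose shard is still listed for the stream (safe to renew)
  //   2. leases whose shard is no longer listed (release instead of renewing)
  def partitionByExistence(
    held: Seq[HeldLease],
    existingShardIds: Set[String]
  ): (Seq[HeldLease], Seq[HeldLease]) =
    held.partition(lease => existingShardIds.contains(lease.shardId))
}
```

With a lease held for shardId-000000000000 and a listing that only contains the child shards, the parent lease would land in the second group and could be dropped instead of being renewed forever.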
Thanks for reporting.
I assume this is with the zio-native consumer, not the DynamicConsumer?
Yes, zio-native. This was not the DynamicConsumer.
There is logic for handling splitting shards. The lease should be checkpointed with a SHARD_END checkpoint value and the lease should not be taken anymore, see shardHasEnded usage in DefaultLeaseCoordinator. It's been a while since I checked that, so I couldn't tell you exactly what is supposed to happen.
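Roughly, the intended behavior can be sketched as follows. This is a simplified model written from memory, not the actual code: the `Checkpoint` and `Lease` types and the `takeable` helper are assumptions for illustration, and the body of `shardHasEnded` here only mirrors the name used in DefaultLeaseCoordinator, not its real implementation.

```scala
// Simplified, hypothetical model; not the library's actual types.
sealed trait Checkpoint
object Checkpoint {
  final case class SequenceNumber(value: String) extends Checkpoint
  // Stands in for the SHARD_END checkpoint value written when a shard is fully consumed.
  case object ShardEnd extends Checkpoint
}

final case class Lease(
  shardId: String,
  owner: Option[String],
  checkpoint: Option[Checkpoint]
)

object LeaseSelection {
  // A lease whose checkpoint is the shard-end marker has been read to completion.
  def shardHasEnded(lease: Lease): Boolean =
    lease.checkpoint.contains(Checkpoint.ShardEnd)

  // Leases a worker may still take: currently unowned and not ended.
  def takeable(leases: Seq[Lease]): Seq[Lease] =
    leases.filter(lease => lease.owner.isEmpty && !shardHasEnded(lease))
}
```

Under this model, a parent shard lease that was checkpointed at shard end is never offered to a worker again, so the fetcher should not be started for it. If the parent lease never received that checkpoint, it would still look takeable, which might explain the loop described above.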
Do you have a bit more of a timeline / logs perhaps?