Azure/durabletask

[DurableTask-AzureStorage] Eternal Orchestration Stuck and Consistently Abandoning the Message

ykhazbak opened this issue · 0 comments

The eternal orchestration "SiteNetworkServiceStateBillingOrchestrator" started executing and then got stuck while processing a message after a lease reassignment.

The partition "ansmsitenetworkservicehub-control-06" was reassigned from worker node "_armBEaz_10" to worker node "_armBEaz_11". Immediately after the lease reassignment, worker node "_armBEaz_11" was unable to process one message (a TimerFired event) and has been consistently abandoning that message for days.

The orchestration is stuck at line 114 of the code below. Note that four task activities had already executed at this point:
image

Logs:
https://jarvis-int-west.microsoftgeneva.com/E06F8A5F
https://jarvis-int-west.microsoftgeneva.com/8D6D9236

Instance Id: 613e83a4-eb15-42c6-aa12-329f0e215894:SiteNetworkServiceStateBillingOrchestrator:V1
Event Type: TimerFired

image
image

Can someone help identify whether this is a race condition, and how we can resolve it? This is a billing orchestration that runs periodically, so it is very important that it runs smoothly and emits billing events consistently.