Java-based Durable Functions are failing randomly
Environment details:
- Service: Azure Function Apps
- Language: Java 17
- OS: Linux
- Runtime version (current): 4.1036.2.2
We have a Durable Functions orchestrator that we invoke over HTTP. It in turn invokes an activity function.
The activity function runs some processing logic and returns a response, which the orchestrator then returns to the caller.
We are using Java-based Durable Functions. The issue occurs in the first Durable function that is called: it is unable to invoke the next steps, and we get the error below:
a0998fba-a586-4728-9aa3-8f1c34ce851e: Function '(Orchestrator)' failed with an error. Reason: DurableTask.Core.Exceptions.OrchestrationFailureException
at Microsoft.Azure.WebJobs.Extensions.DurableTask.OutOfProcMiddleware.<>c__DisplayClass10_0.<<CallOrchestratorAsync>b__0>d.MoveNext() in D:\a\_work\1\s\src\WebJobs.Extensions.DurableTask\OutOfProcMiddleware.cs:line 145
--- End of stack trace from previous location ---
at Microsoft.Azure.WebJobs.Host.Executors.TriggeredFunctionExecutor`1.<>c__DisplayClass7_0.<<TryExecuteAsync>b__0>d.MoveNext() in D:\a\_work\1\s\src\Microsoft.Azure.WebJobs.Host\Executors\TriggeredFunctionExecutor.cs:line 51
--- End of stack trace from previous location ---
at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.InvokeWithTimeoutAsync(IFunctionInvoker invoker, ParameterHelper parameterHelper, CancellationTokenSource timeoutTokenSource, CancellationTokenSource functionCancellationTokenSource, Boolean throwOnTimeout, TimeSpan timerInterval, IFunctionInstance instance) in D:\a\_work\1\s\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs:line 581
at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.ExecuteWithWatchersAsync(IFunctionInstanceEx instance, ParameterHelper parameterHelper, ILogger logger, CancellationTokenSource functionCancellationTokenSource) in D:\a\_work\1\s\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs:line 527
at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.ExecuteWithLoggingAsync(IFunctionInstanceEx instance, FunctionStartedMessage message, FunctionInstanceLogEntry instanceLogEntry, ParameterHelper parameterHelper, ILogger logger, CancellationToken cancellationToken) in D:\a\_work\1\s\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs:line 306. IsReplay: False. State: Failed. RuntimeStatus: Failed. ExtensionVersion: 2.13.5. SequenceNumber: 17. TaskEventId: -1
2024-10-17T08:20:17Z [Information] a0998fba-a586-4728-9aa3-8f1c34ce851e: Orchestration awaited and scheduled 1 durable operation(s).
2024-10-17T08:20:17Z [Information] a0998fba-a586-4728-9aa3-8f1c34ce851e: Orchestration completed with a 'Failed' status and 0 bytes o
Below is the screenshot which shows the exception:
The same durable function was working fine until 8 October 2024. We suspect the issue started after this release of the Azure Functions host, as it includes changes related to middleware:
@nayansc568 can you provide us with a small reproducer project which demonstrates these failures?
Also, you mentioned that the failures are random. Can you expand on this a bit? For example, what percentage of the time does the function work and what percentage does it fail (roughly)?
@cgillum: Sorry for the delayed response.
Starting on 8 October 2024, we began observing the OutOfProcMiddleware error on one of our durable functions. On 31 October we downgraded the function runtime to version 4.34.1.22669, and everything started working again. In parallel, we worked with Azure support to find the cause of the issue. About a week ago, we upgraded the function runtime back to the latest version (4.1036.2.2), and the issue is no longer reproducible.
Nature of the OutOfProcMiddleware error:
- 1-2 API calls work, then the remaining API calls to the durable function fail for some time, then 1-2 API calls work again, and so on.
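
To put rough numbers on a pattern like this (the failure percentage asked about earlier in the thread), one option is to record whether each call succeeded and summarize the outcomes. The following is only a minimal sketch in plain Java: the `FailureSummary` class, its method names, and the sample outcomes are all hypothetical and made up for illustration. They are not measurements from this issue and not part of any Azure SDK.

```java
import java.util.List;

public class FailureSummary {

    /** Percentage of failed calls (0 to 100); returns 0 for an empty list. */
    public static double failurePercent(List<Boolean> succeeded) {
        if (succeeded.isEmpty()) {
            return 0.0;
        }
        long failures = succeeded.stream().filter(ok -> !ok).count();
        return 100.0 * failures / succeeded.size();
    }

    /** Length of the longest run of consecutive failed calls. */
    public static int longestFailureStreak(List<Boolean> succeeded) {
        int longest = 0;
        int current = 0;
        for (boolean ok : succeeded) {
            // Reset the counter on success, extend it on failure.
            current = ok ? 0 : current + 1;
            longest = Math.max(longest, current);
        }
        return longest;
    }

    public static void main(String[] args) {
        // Made-up outcomes (true = call succeeded), shaped like the pattern
        // described above: a couple of successes, then a run of failures.
        List<Boolean> outcomes = List.of(
                true, true, false, false, false, false, true, false, false, false);

        System.out.println("Failure percent: " + failurePercent(outcomes));
        System.out.println("Longest failure streak: " + longestFailureStreak(outcomes));
    }
}
```

In practice the list of outcomes would come from whatever records each HTTP call's result (for example, the orchestration's final runtime status), which this sketch leaves out.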