Azure/azure-functions-java-library

Add Retry annotation

pragnagopa opened this issue · 16 comments

cc @jeffhollan @TsuyoshiUshio @amamounelsayed

PR: Azure/azure-webjobs-sdk#2463 added support for function execution retries

Tracking item to add annotations

Note: this would require updating mvn plugin to consume the new library

@apawast this may be a good item to drop on our GitHub project to close out this feature

Thank you so much @pragnagopa. We will add this work item and work on it next.

Tagging @casper-79 to track progress

Hi @amamounelsayed

Do you have an estimate of when the updated mvn plugin will be available? We are working on an application that really needs the retry annotation.

Hi @jeffhollan @TsuyoshiUshio @amamounelsayed

Can any of you provide guidance on when the retry annotation will be ready to use?

The base implementation should be available in the next day or so.

Thank you @casper-79 and @jeffhollan. With the base implementation, you can manually add the retry tag to host.json or function.json, which will unblock you. Meanwhile, we will add the annotation to support generating the function.json tag; the ETA is mid-November. We will post an update here in case of any delays.
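As a rough illustration of what "annotation to function.json tag generation" could look like, here is a minimal sketch. The annotation `ExponentialBackoffRetry`, its members, and the class `RetryAnnotationSketch` are hypothetical names chosen for this example; only the JSON keys (`strategy`, `maxRetryCount`, `minimumInterval`, `maximumInterval`) are taken from the configurations posted in this thread. The real library and mvn plugin may well be designed differently.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Hypothetical annotation: the name and members are assumptions,
// modelled on the retry keys used in host.json in this thread.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface ExponentialBackoffRetry {
    int maxRetryCount();
    String minimumInterval();
    String maximumInterval();
}

class RetryAnnotationSketch {
    // Example function method carrying the hypothetical annotation.
    @ExponentialBackoffRetry(maxRetryCount = 6,
                             minimumInterval = "00:00:10",
                             maximumInterval = "00:05:00")
    public void run(String message) {
        // function body omitted
    }

    // Builds the "retry" fragment a build plugin could write into function.json.
    // Returns null when the named method is missing or carries no retry annotation.
    static String retryJson(Class<?> type, String methodName) {
        for (Method method : type.getDeclaredMethods()) {
            ExponentialBackoffRetry retry =
                method.getAnnotation(ExponentialBackoffRetry.class);
            if (method.getName().equals(methodName) && retry != null) {
                return "\"retry\": { \"strategy\": \"exponentialBackoff\""
                    + ", \"maxRetryCount\": " + retry.maxRetryCount()
                    + ", \"minimumInterval\": \"" + retry.minimumInterval() + "\""
                    + ", \"maximumInterval\": \"" + retry.maximumInterval() + "\" }";
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(retryJson(RetryAnnotationSketch.class, "run"));
    }
}
```

Until such an annotation exists, the same fragment has to be written by hand, as described in the comment above.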

Thanks @amamounelsayed

Is there any documentation you can point me to? I have tried to figure out how to do it based on this commit, but I am not exactly sure I did it right. Using a host.json file as seen below, I would expect failing messages to be retried in 5s, 10s, 20s, 40s, 80s ...?

{
  "version": "2.0",
  "extensionBundle": {
    "id": "Microsoft.Azure.Functions.ExtensionBundle",
    "version": "[2.*, 3.0.0)"
  },
  "retry": {
    "strategy": "exponentialBackoff",
    "maxRetryCount": 6,
    "delayInterval": "00:00:05"
  }
}
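For reference, the schedule expected in the question above (5s, 10s, 20s, 40s, 80s ...) corresponds to a plain doubling rule. The sketch below only spells out that expectation; it is not the runtime's algorithm, which this thread does not describe and which may differ (for example by adding randomization or a cap). The class and method names are made up for this example.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

class DoublingSchedule {
    // Returns the delay before each retry, ASSUMING the delay simply doubles
    // every time. This models the commenter's expectation, not the runtime.
    static List<Duration> delays(Duration first, int maxRetryCount) {
        List<Duration> result = new ArrayList<>();
        Duration next = first;
        for (int i = 0; i < maxRetryCount; i++) {
            result.add(next);
            next = next.multipliedBy(2);
        }
        return result;
    }

    public static void main(String[] args) {
        // Values from the host.json above: 5 second start, 6 retries.
        for (Duration d : delays(Duration.ofSeconds(5), 6)) {
            System.out.println(d.getSeconds() + "s");
        }
    }
}
```

With a 5 second start and `maxRetryCount` of 6, this rule yields delays of 5, 10, 20, 40, 80 and 160 seconds.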

We will be posting documentation for using retry soon. Please hold off using the feature until then. Docs will be updated as soon as the functions runtime version that supports this feature is rolled out to all regions in production.

Thanks @pragnagopa

I will try it out right away!

Hello again @pragnagopa

I have tested the retry functionality today and seen the exponential retry strategy in action. However, I am also seeing some very strange behaviour. As I understand the documentation, the retry strategy is implemented on the function instance itself, rather than by storing the delivery state on the queue. I am seeing what I believe are side effects of this approach. My experiments center on submitting poisonous messages, which will always fail, to a queue consumed by a Java Azure function. The function uses a retry strategy defined in host.json as seen below:

"retry":{
  "strategy":"exponentialBackoff",
  "maxRetryCount":6,
  "minimumInterval":"00:00:10",
  "maximumInterval":"00:05:00"
}

(1) Processing of poisonous messages does not always show up in Application Insights or in the "Monitor" section of Azure Functions. When I use the Azure portal to peek at test messages, I can tell DeliveryCount has gone up by 1, but more often than not there is no trace of the failed execution that increased the counter.

(2) Azure function instances are short-lived, which limits the useful range of the retry configuration parameters. Can you provide guidance on what will work in practice? I am guessing you will run into problems if you set maximumInterval to 24 hours and maxRetryCount to 30 in host.json?

(3) What is the recommended approach for dead lettering? The only solution I can think of is to set maxDeliveryCount=1 on the queue, but this will only work if all retry attempts of your strategy can be performed within the typical lifetime of an instance. Otherwise, I guess the message will be retried forever.

Regards,

Casper
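To put rough numbers on question (2), the sketch below adds up the waiting time of a retry policy, assuming the delay doubles from minimumInterval and is capped at maximumInterval. That doubling-and-cap rule is an assumption made for illustration; the thread does not state how the runtime actually computes delays. The class name and the 10 second minimum used for the 24 hour case are also assumptions.

```java
import java.time.Duration;

class BackoffBudget {
    // Sums the delays of all retries, ASSUMING each delay is double the
    // previous one and never exceeds max. Illustrative only.
    static Duration totalWait(Duration min, Duration max, int maxRetryCount) {
        Duration total = Duration.ZERO;
        Duration delay = min;
        for (int i = 0; i < maxRetryCount; i++) {
            // Cap the current delay before counting it.
            Duration capped = delay.compareTo(max) > 0 ? max : delay;
            total = total.plus(capped);
            delay = capped.multipliedBy(2);
        }
        return total;
    }

    public static void main(String[] args) {
        // Policy from the host.json above: 10s minimum, 5min maximum, 6 retries.
        Duration posted = totalWait(Duration.ofSeconds(10), Duration.ofMinutes(5), 6);
        System.out.println("posted policy: " + posted.getSeconds() + "s");

        // Hypothetical policy from question (2): 24h maximum, 30 retries,
        // with an assumed 10s minimum.
        Duration large = totalWait(Duration.ofSeconds(10), Duration.ofHours(24), 30);
        System.out.println("24h/30 policy: " + large.toDays() + " days");
    }
}
```

Under these assumptions the posted policy waits 610 seconds in total, while the 24 hour, 30 retry policy would need more than two weeks of cumulative waiting. That illustrates why retry state held only on the function instance becomes a concern for large values.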

I'm going to reply to this in another thread, as it is not related to this issue, though arguably some of these points may justify their own issue (specifically the non-durability of the current implementation). For now, let's move this conversation here: Azure/azure-webjobs-sdk#2595

Sorry for being late. I am also considering this for the Kafka extension: Azure/azure-functions-kafka-extension#122
Hi @pragnagopa
To add a retry policy to an extension for C#, do we need to do anything on the extension side, or does just adding an attribute like [FixedDelayRetry(5, "00:00:10")] work?

When do you plan to implement the retry annotation? Is there any progress?

Thanks.