Azure/azure-iot-sdk-csharp

[Bug Report] [SDK v2] Wrong currentRetryCount value for different RetryPolicy-s

bastyuchenko opened this issue · 5 comments

I'm implementing my own RetryPolicy

For ProvisioningClient:

var options = new ProvisioningClientOptions(mqttSettings)
{
    RetryPolicy = new MyProvisioningClientRetryPolicy()
};

public class MyProvisioningClientRetryPolicy : ProvisioningClientRetryPolicyBase
{
    public MyProvisioningClientRetryPolicy() : base(.....)
    {  }

    public override bool ShouldRetry(uint currentRetryCount, Exception lastException, out TimeSpan retryDelay)
    {
        if (!base.ShouldRetry(currentRetryCount, lastException, out retryDelay))
        {
            return false;
        }
        ....
    }
}

For IotHubClient:

var options = new IotHubClientOptions(mqttSettings)
{
    RetryPolicy = new MyIotHubClientRetryPolicy()
};

public class MyIotHubClientRetryPolicy : IotHubClientRetryPolicyBase
{
    public MyIotHubClientRetryPolicy() : base(.....)
    {  }

    public override bool ShouldRetry(uint currentRetryCount, Exception lastException, out TimeSpan retryDelay)
    {
        if (!base.ShouldRetry(currentRetryCount, lastException, out retryDelay))
        {
            return false;
        }
        ....
    }
}

For ProvisioningClientRetryPolicy, currentRetryCount equals 1 in the ShouldRetry method on the first attempt.
For IotHubClientRetryPolicy, currentRetryCount equals 0 in the ShouldRetry method on the first attempt.

From my point of view, if you implement a RetryPolicy for one client and see currentRetryCount == 0 on the first attempt, you expect the same behavior from the other client's RetryPolicy, which follows the same schema.
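
To illustrate why the starting value matters, here is a minimal sketch that does not use the SDK types. `IRetryPolicy`, `MaxCountPolicy`, and `RetryLoop.CountRetries` are hypothetical names made up for this example; only the `ShouldRetry` signature is taken from the snippets above. It is not how the SDK drives retries internally, just a simulation of a caller whose counter starts at 0 versus 1.

```csharp
using System;

// Hypothetical stand-in for a retry policy contract; only the ShouldRetry
// signature mirrors the snippets in this issue.
public interface IRetryPolicy
{
    bool ShouldRetry(uint currentRetryCount, Exception lastException, out TimeSpan retryDelay);
}

// Hypothetical policy: grants a retry while currentRetryCount is below maxRetries.
public sealed class MaxCountPolicy : IRetryPolicy
{
    private readonly uint _maxRetries;

    public MaxCountPolicy(uint maxRetries) => _maxRetries = maxRetries;

    public bool ShouldRetry(uint currentRetryCount, Exception lastException, out TimeSpan retryDelay)
    {
        retryDelay = TimeSpan.Zero;
        return currentRetryCount < _maxRetries;
    }
}

public static class RetryLoop
{
    // Simulates an operation that always fails and returns how many retries
    // the policy granted when the caller's counter starts at firstCount.
    public static int CountRetries(IRetryPolicy policy, uint firstCount)
    {
        int retries = 0;
        uint count = firstCount;
        var error = new TimeoutException("simulated transient failure");

        while (policy.ShouldRetry(count, error, out _))
        {
            retries++;
            count++;
        }

        return retries;
    }
}
```

With `new MaxCountPolicy(3)`, `RetryLoop.CountRetries(policy, 0)` grants 3 retries while `RetryLoop.CountRetries(policy, 1)` grants only 2, so the same policy logic yields a different number of attempts depending on which value the caller passes first.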

Hello @bastyuchenko, thanks for your patience here. Could you please tell us more about the use cases for retry policies in your application? That is, what kinds of exceptions are triggering retries in the first place? Additionally, which transport protocols are you using? This will help us narrow down possible causes here. Let me know if I can be more clear!

Hello @patilsnr,
We have a backend application and an application that we roll out on our devices.
Our customer wants to increase the number of devices, so the communication workload with IoT Hub, and the load that devices place on our entire system, will increase.
Currently, I'm trying to implement retry logic in the device-side code.
We use:

  • X509 Certificate for the authentication to connect to Azure DPS to register device in Azure IoT Hub
  • X509 Certificate for the authentication to connect to Azure IoT Hub
  • MQTT protocol for communication with Azure DPS
  • MQTT protocol for communication with Azure IoT Hub

I would like to implement retry logic, so I derived from IotHubClientRetryPolicyBase and ProvisioningClientRetryPolicyBase to implement my own retry policies. However, I see an inconsistency in how the two behave:

  • For ProvisioningClientRetryPolicy, the currentRetryCount parameter equals 1 in the ShouldRetry method on the first retry.
  • For IotHubClientRetryPolicy, the currentRetryCount parameter equals 0 in the ShouldRetry method on the first retry.

I believe this behavior is unexpected and not obvious to a developer who uses the Azure IoT SDK for .NET.

To trigger a transient exception and debug the retry policy, I just disconnect my network cable or turn off Wi-Fi before running provisioningClient.RegisterAsync(...) or iotHubClient.OpenAsync(...).

Hi @bastyuchenko, thanks for the clarification. We've made a change that will hopefully fix the issue and should be included in our next preview release. I'm adding the "fix checked in" tag to this issue, and will ping when the release is out so you can verify that this is fixed.

Hi @patilsnr, thanks for the update.

Closing with fix available.