Azure/azure-iot-sdk-csharp

[Bug Report] NullReferenceException was thrown from RegistryManager.GetTwinAsync

iryanzhang opened this issue · 7 comments

NuGet version: 1.38.2

Class: RegistryManager
Method: GetTwinAsync
Exception: System.NullReferenceException
Message: Object reference not set to an instance of an object
Source: Microsoft.Azure.Devices

From the stack trace, it looks like the error occurred in GetExceptionCodeAsync.

at Microsoft.Azure.Devices.ExceptionHandlingHelper.GetExceptionCodeAsync(HttpResponseMessage response)
at Microsoft.Azure.Devices.ExceptionHandlingHelper.<>c.<<-cctor>b__5_8>d.MoveNext()
--- End of stack trace from previous location ---
at Microsoft.Azure.Devices.HttpClientHelper.MapToExceptionAsync(HttpResponseMessage response, IDictionary`2 errorMapping)
at Microsoft.Azure.Devices.HttpClientHelper.ExecuteAsync(HttpClient httpClient, HttpMethod httpMethod, Uri requestUri, Func`3 modifyRequestMessageAsync, Func`2 isMappedToException, Func`3 processResponseMessageAsync, IDictionary`2 errorMappingOverrides, CancellationToken cancellationToken)
at Microsoft.Azure.Devices.HttpClientHelper.GetAsync[T](Uri requestUri, TimeSpan operationTimeout, IDictionary`2 errorMappingOverrides, IDictionary`2 customHeaders, Boolean throwIfNotFound, CancellationToken cancellationToken)

Hi @iryanzhang, thanks for reaching out to us! Could you also share the following info and a code snippet to reproduce this issue with us?

Context

  • OS, version, SKU and CPU architecture used: (Windows 10 Desktop x64, Ubuntu 15.04 x86, Windows 10 IoT Core arm32, etc.)
  • Application's .NET Target Framework : (See https://docs.microsoft.com/en-us/dotnet/standard/frameworks. E.g. netcoreapp2.1, net451, uap10.0, xamarin)
  • Device: (Laptop, Raspberry PI3, Android APIv25 etc.)

OS: Windows 10 x64
Framework: .NET 6.0
Device: One VM node in Service Fabric

I think it will be hard to repro, since we have seen this exception in our Geneva logs only twice in the past 2 years.
But from the call stack, I believe something could be done to avoid the NRE in the client code:
https://github.com/Azure/azure-iot-sdk-csharp/blob/a7bd887619d13d87675552d2b52ed9514d5a1c09/common/src/service/ExceptionHandlingHelper.cs

Our usage is straightforward:

RegistryManager registryManager = await _registryManagerFactory.CreateAsync(hostName, tokenCredential, cancellationToken);
var twin = await registryManager.GetTwinAsync(deviceId, cancellationToken);

I see. My best guess for now is that the response couldn't be parsed in the expected way in the GetExceptionCodeAsync() call. Since this issue occurs so rarely, I assume there are no more detailed logs or info you could share about what the response object looked like or which line threw. We will try to add null checks there wherever possible.
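
To illustrate the kind of null checks meant here, below is a minimal sketch of reading an error code from a response without assuming that the response, its content, or the expected JSON field is present. This is not the SDK's ExceptionHandlingHelper and not the merged fix: the class name SafeErrorCodeReader, the method TryGetErrorCodeAsync, and the "errorCode" field name are all assumptions made for illustration.

```csharp
#nullable enable
using System;
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical helper for illustration only; not part of Microsoft.Azure.Devices.
public static class SafeErrorCodeReader
{
    // Returns the error code found in the response body, or null when the
    // response, its content, or the expected JSON field is missing.
    public static async Task<string?> TryGetErrorCodeAsync(HttpResponseMessage? response)
    {
        if (response?.Content == null)
        {
            return null;
        }

        string body = await response.Content.ReadAsStringAsync().ConfigureAwait(false);
        if (string.IsNullOrWhiteSpace(body))
        {
            return null;
        }

        try
        {
            using JsonDocument doc = JsonDocument.Parse(body);
            JsonElement root = doc.RootElement;

            // "errorCode" is an assumed field name, not confirmed by the thread.
            if (root.ValueKind == JsonValueKind.Object
                && root.TryGetProperty("errorCode", out JsonElement code))
            {
                return code.ValueKind == JsonValueKind.String
                    ? code.GetString()
                    : code.GetRawText();
            }
        }
        catch (JsonException)
        {
            // The body was not valid JSON, for example an HTML page from a proxy.
        }

        return null;
    }
}
```

With this shape, an empty body or a non-JSON body yields null instead of an exception, so the caller can fall back to a generic error mapping.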

@iryanzhang would you be willing to share a copy of the twin document with us for the device in question? You can redact any information that you need to.

Also, we don't have a method called RegistryManager.CreateAsync(...). Is that your own factory, and does it simply initialize the RegistryManager with one of the standard initializers? Do you have a shim between your app code and the SDK's RegistryManager?

Does this app sit behind a firewall or proxy?

Correct. We have a factory that creates the RegistryManager, and it uses a standard initializer. We have run this service with the same code for quite a long time, and this was the first time we hit this error.
Our service is hosted in a Service Fabric cluster. There could have been a transient network communication issue between the node and the hub; we have observed that before.

Sorry, I can't share the twin, as it belongs to one of our tenants and I don't have direct access to the hub.
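
If transient network issues between the node and the hub are the suspected trigger, one caller-side mitigation is to retry the call a bounded number of times. The sketch below is a generic helper under stated assumptions: TransientRetry and ExecuteAsync are hypothetical names, not SDK APIs, and deciding which exceptions count as transient is left to the caller.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper for illustration only; not part of Microsoft.Azure.Devices.
public static class TransientRetry
{
    // Runs the operation up to maxAttempts times, waiting a fixed delay between
    // attempts. Only exceptions accepted by isTransient are retried; the last
    // failure, or any non-transient failure, propagates to the caller.
    public static async Task<T> ExecuteAsync<T>(
        Func<CancellationToken, Task<T>> operation,
        Func<Exception, bool> isTransient,
        int maxAttempts,
        TimeSpan delay,
        CancellationToken cancellationToken = default)
    {
        if (operation == null) throw new ArgumentNullException(nameof(operation));
        if (isTransient == null) throw new ArgumentNullException(nameof(isTransient));
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return await operation(cancellationToken).ConfigureAwait(false);
            }
            catch (Exception ex) when (attempt < maxAttempts && isTransient(ex))
            {
                // Wait before the next attempt; cancellation ends the loop.
                await Task.Delay(delay, cancellationToken).ConfigureAwait(false);
            }
        }
    }
}
```

A call site could wrap the existing usage as TransientRetry.ExecuteAsync(ct => registryManager.GetTwinAsync(deviceId, ct), ex => ex is TimeoutException, 3, TimeSpan.FromSeconds(1), cancellationToken), with the predicate adjusted to whatever failures you consider safe to retry.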

Hi @iryanzhang, the fix has been merged in and we will let you know once we ship another release including it.

Hi @iryanzhang, the fix has been shipped in Microsoft.Azure.Devices 1.39.0 with the latest release. I am closing this issue.