elastic/elasticsearch-net

NullReferenceException thrown when calling PingAsync with Elastic.Apm during DNS failure

Opened this issue · 5 comments

Elastic.Clients.Elasticsearch version:
8.18.3
8.17.4

Elasticsearch version:
8.18.1
8.17.4

.NET runtime version:
.NET 8

Operating system version:
Windows 11

Description of the problem including expected versus actual behavior:
A NullReferenceException is thrown when calling client.PingAsync() in a project that references Elastic.Apm, using an invalid URL to simulate a DNS failure.

Note: This issue is fixed in the Elasticsearch .NET client 9.0.5 and above.

Steps to reproduce:

using Elastic.Apm;
using Elastic.Clients.Elasticsearch;
using Elastic.Transport;

Console.WriteLine("Begin Testing.");

var user = "elasticuser";
var pwd = "randomepwd";

// NOTE: provide an unreachable URL to simulate a DNS failure
var searchUris = new Uri[] { new Uri("https://localhost2:9200") };

var pool = new StaticNodePool(searchUris);

var clientSettings = new ElasticsearchClientSettings(pool);
clientSettings.ServerCertificateValidationCallback((o, cert, chain, errors) => true)
    .DefaultFieldNameInferrer(name => name)
    .DefaultDisableIdInference()
    .EnableDebugMode()
    .Authentication(new BasicAuthentication(user, pwd));

var client = new ElasticsearchClient(clientSettings);

// NOTE: the PingAsync call below throws a NullReferenceException with Elastic.Apm 1.32.2 or 1.31.0
// and Elastic client 8.17.4, 8.18.3, 9.0.3, or 9.0.4.
// It works fine with Elastic client 9.0.5, 9.0.6, and 9.0.7.
var tracer = Agent.Tracer;
var pingResponse = await client.PingAsync().ConfigureAwait(false);

Console.WriteLine($"Is valid response: {pingResponse.IsValidResponse}");

Provide DebugInformation (if relevant):
Unhandled exception. System.NullReferenceException: Object reference not set to an instance of an object.
at Elastic.Transport.DistributedTransport`1.RequestCoreAsync[TResponse](Boolean isAsync, EndpointPath path, PostData data, Action`1 configureActivity, IRequestConfiguration localConfiguration, CancellationToken cancellationToken)
at Elastic.Clients.Elasticsearch.ElasticsearchClient.<>c__DisplayClass701_03.<<DoRequestCoreAsync>g__SendRequestWithProductCheckCore|2>d.MoveNext() in /_/src/Elastic.Clients.Elasticsearch/_Shared/Client/ElasticsearchClient.cs:line 213
--- End of stack trace from previous location ---
at Elastic.Clients.Elasticsearch.ElasticsearchClient.<>c__DisplayClass701_03.<g__SendRequestWithProductCheck|1>d.MoveNext() in /_/src/Elastic.Clients.Elasticsearch/_Shared/Client/ElasticsearchClient.cs:line 172
--- End of stack trace from previous location ---
at Program.<Main>$(String[] args) in C:\sandbox\elastic\TestUnreachableServer\Program.cs:line 33
at Program.<Main>(String[] args)

We also get the same issue in APM agent .NET: elastic/apm-agent-dotnet#2622

I have analyzed the issue and here are my findings:

After testing/replicating, I can confirm that there is indeed a NullReferenceException for some versions of 'Elastic.Apm' and 'Elastic.Clients.Elasticsearch'.

If the Elastic.Apm version is <= 1.23, everything works.
If the Elastic.Clients.Elasticsearch version is >= 9.0.5, everything works.

The PR https://github.com/elastic/elasticsearch-net/pull/8549/files, released in version 9.0.5, fixed the problem.

In 'src/Elastic.Clients.Elasticsearch/Elastic.Clients.Elasticsearch.csproj', the Elastic.Transport package reference was changed from Version="0.8.1" to Version="0.9.2".

I took Elastic.Apm version 1.32.2 and Elastic.Clients.Elasticsearch version 9.0.4 (the failing combination from the issue), added an explicit reference to Elastic.Transport Version="0.9.2", and it works.
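For anyone who wants to try the same combination, here is a minimal sketch of what the project file could look like. It assumes an SDK-style console project targeting net8.0 (the OutputType, ImplicitUsings, and Nullable settings are my assumptions, not taken from the issue). The only part that matters is the explicit Elastic.Transport reference: NuGet prefers a direct package reference over the version pulled in transitively by the client.

```xml
<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <!-- Assumed project settings for a console repro targeting .NET 8 -->
    <OutputType>Exe</OutputType>
    <TargetFramework>net8.0</TargetFramework>
    <ImplicitUsings>enable</ImplicitUsings>
    <Nullable>enable</Nullable>
  </PropertyGroup>

  <ItemGroup>
    <!-- Failing combination reported in the issue -->
    <PackageReference Include="Elastic.Apm" Version="1.32.2" />
    <PackageReference Include="Elastic.Clients.Elasticsearch" Version="9.0.4" />
    <!-- Explicit reference so the newer transport is used instead of the transitive one -->
    <PackageReference Include="Elastic.Transport" Version="0.9.2" />
  </ItemGroup>

</Project>
```

This is only the setup I tested locally, not an officially supported combination.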

@JeremyBessonElastic Appreciate your prompt investigation and response! Is the Elasticsearch .NET client 8.18.3 + Elastic.Transport 0.9.2 a valid combination? (We are using Elasticsearch server 8.17.4 and 8.14.3.) My initial test with client 8.18.3 and Transport 0.9.2 seems to work fine (I will test more within our apps).

It is a hard question as usual...

What we advise, for clear reasons, is to use the latest versions, which should be compatible because we test them together.

I am trying to get more information from the 'Elastic.Clients.Elasticsearch' people, but I am afraid that I might not get an answer before mid-August.

I would say that it is probably compatible and the best way to know is to test.

Best

We will discuss if we can backport the upgrade of <PackageReference Include="Elastic.Transport" Version="0.8.1" /> to <PackageReference Include="Elastic.Transport" Version="0.9.2" /> for previous versions of Elastic.Clients.Elasticsearch (for versions before 9.0.5).

Thank you @JeremyBessonElastic. In the meantime, I am testing our app with Elastic.Transport 0.9.2.