SocketException errors (TCP connection limit reached)

Question

hardware opened this issue 5 years ago · 3 comments

Hi,

Thank you for this HttpClientSample; it helped me a lot to implement HttpClientFactory inside an ASP.NET Core 2.2 application.

I used exactly the same implementation as yours, but I still get a lot of SocketExceptions with this error message:

An attempt was made to access a socket in a way forbidden by its access permissions

As you can see, my app crashes every time the Azure App Service plan's TCP connection limit is reached (it seems to be 700 for the B1 plan).

Steve Gordon said this:

When you add a typed client it will be registered as a transient service in Dependency Injection, so you will always get a fresh instance with a fresh HttpClient created for you by the factory. The factory will manage the underlying handler lifetime for that HttpClient instance, re-using a handler (and connection) if available.

Isn't HttpClientFactory supposed to address those issues and manage the lifetime of underlying resources such as TCP connections? What do you recommend?

The Microsoft documentation states:

Therefore, HttpClient is intended to be instantiated once and reused throughout the life of an application. Instantiating an HttpClient class for every request will exhaust the number of sockets available under heavy loads. That issue will result in SocketException errors. Possible approaches to solve that problem are based on the creation of the HttpClient object as singleton or static, as explained in this Microsoft article on HttpClient usage.

But there’s a second issue with HttpClient that you can have when you use it as singleton or static object. In this case, a singleton or static HttpClient doesn’t respect DNS changes, as explained in this issue at the .NET Core GitHub repo.

To address those mentioned issues and make the management of HttpClient instances easier, .NET Core 2.1 introduced a new HttpClientFactory that can also be used to implement resilient HTTP calls by integrating Polly with it.

Maybe I could try using typed clients from a singleton service, as described here:

https://www.stevejgordon.co.uk/ihttpclientfactory-patterns-using-typed-clients-from-singleton-services

But I'm not sure if it's the best way for my application. I'm developing a status website for an online game, which makes requests day and night to the game's API to get the status of the backend servers. This application is often under heavy load during game maintenance or downtime periods.

The current production implementation of this website uses Node.js, but I rewrote it from scratch with ASP.NET Core 2.2, learning how ASP.NET Core 2 works at the same time 😃 The new service has been in beta since today, and I observed this issue after a few hours.

I can share the application source if needed, but it's mostly what you did.

Best Regards.

@stevejgordon

Answer 1 · 2019-05-29T09:15:03.000Z

You could use a singleton HttpClient instead of a pool of them, but this has all the downsides around DNS not working properly.

If you're hitting a TCP connection limit on your server, there is not a lot you can do. You could consider using a circuit breaker to reduce pressure and fail fast but that doesn't solve the issue.

Answer 2 · 2019-05-29T10:12:54.000Z

You could also experiment with embedding a Polly Bulkhead policy in the configuration of your HttpClient created via HttpClientFactory. This can give you programmatic control over load while retaining the DNS-management and pooling benefits of HttpClientFactory.

Bulkhead policy is a parallelism throttle: when the parallelism limit is reached, you can choose whether to make the excess requests queue (at the expense of those requests experiencing greater latency, which may or may not be acceptable) or simply to shed the excess load. The advantage of proactively shedding excess load is that you shed it before it brings down your whole app. We discuss this more in the Polly wiki here: bulkhead as load shedding. You would have to experiment to find the parallelism limit that provides this stability for your chosen Azure App Service plan.

You can also use reaching or nearing the bulkhead parallelism limit as a trigger for horizontal scaling of your App Service. This way you can scale up with demand, and back down when utilisation drops. You can expose the bulkhead available capacity to AppInsights, and use it as a custom metric for auto-scaling.
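As a rough illustration of the bulkhead suggestion in this answer, here is a minimal sketch of attaching a Polly bulkhead to a typed client registered through HttpClientFactory. It assumes the Polly and Microsoft.Extensions.Http.Polly packages are referenced. The `GameStatusClient` name and the limits of 20 and 0 are hypothetical placeholders, not values from this thread; the right limits have to be found by experiment.

```csharp
using System.Net.Http;
using Microsoft.Extensions.DependencyInjection;
using Polly;

public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        // Assumed values for illustration only; tune them for your App Service plan.
        const int maxParallelization = 20; // outbound calls allowed to run concurrently
        const int maxQueuingActions = 0;   // 0 = shed excess calls instead of queuing them

        // A single policy instance is created here and shared by every
        // GameStatusClient, so the limit applies across the whole app.
        var bulkhead = Policy.BulkheadAsync<HttpResponseMessage>(
            maxParallelization, maxQueuingActions);

        services.AddHttpClient<GameStatusClient>() // hypothetical typed client
            .AddPolicyHandler(bulkhead);
    }
}

// Hypothetical typed client; the factory supplies the HttpClient.
public class GameStatusClient
{
    private readonly HttpClient _client;

    public GameStatusClient(HttpClient client) => _client = client;
}
```

With this setup, calls beyond the limit fail with a `BulkheadRejectedException` rather than opening more connections. The policy's `BulkheadAvailableCount` and `QueueAvailableCount` properties report the remaining capacity, which is the kind of value that could be published as a custom metric.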

Answer 3 · 2019-05-29T22:11:13.000Z

@RehanSaeed & @reisenberger Thank you both for your quick answers. But I'm not sure I understand, because I launched this web service in private beta yesterday, only for me and a few users who had the link.

The peak was 4 users, and that was enough to reach 700 TCP connections in only 2 hours. This is not a heavy load issue but very likely a bad implementation of HttpClientFactory on my end. I can't believe that even a simple implementation of HttpClientFactory, without Polly policies or other specific settings, can exhaust the number of available sockets under a very small load, while my Node.js app can handle 3000 simultaneous users on a free Heroku plan.

I'm sure that even by myself, without any other user on the service, I could have reached the limit after a few hours. I have a bad implementation of HttpClientFactory, and I do not know where the problem may come from. My app is supposed to keep connections open so they can be reused later, but this is obviously not happening: it keeps opening new connections again and again.

Furthermore, I use a caching system to make HTTP requests only when needed, so whatever the number of active users, the number of requests remains the same. My app makes 1 request to auth.hitman.io every 30 seconds and 1 request to hitmanforum.com every 60 seconds. The result of each request is cached in memory and reused until the next check.

EDIT: my bad, I just saw on the picture above that I had 100,000 calls to my NoSQL database, Cosmos DB. I now better understand the 109K dependency failures on DocumentDB... I focused only on the HTTP calls to the 2 main external services. I forgot that the Azure SDK for Cosmos DB internally uses HTTP calls to query the database via the SQL API.

Azure Cosmos DB is a globally distributed multi-model database that supports the document, graph, and key-value data models. The content in this section is for creating, querying, and managing document resources using the SQL API via REST.
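The thread does not show how the Cosmos DB client is created, so what follows is only a guess at one possible cause. If a new `DocumentClient` were constructed for every request (an assumption, not something stated above), each instance would open its own connections. Microsoft's performance guidance for the SDK is to use a single client for the lifetime of the application. A minimal sketch of that registration, where `AddCosmosClient` and its parameters are hypothetical names:

```csharp
using System;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;
using Microsoft.Extensions.DependencyInjection;

public static class CosmosRegistration
{
    // Hypothetical helper; endpoint and authKey would come from the app's configuration.
    public static IServiceCollection AddCosmosClient(
        this IServiceCollection services, Uri endpoint, string authKey)
    {
        // One DocumentClient for the whole application lifetime, so its
        // underlying connections are reused instead of being reopened per request.
        return services.AddSingleton<IDocumentClient>(
            _ => new DocumentClient(endpoint, authKey));
    }
}
```

Services that query the database would then take `IDocumentClient` as a constructor dependency instead of creating their own client.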