AsyncHttpClient/async-http-client

What is causing AHC/2.1 to create mass amounts of HTTP connections?

fredsted opened this issue · 5 comments

Hello

I'm the founder of Webhook.site, a tool for testing webhooks and HTTP requests.

I've found that a lot of users connecting with the AHC/2.1 user agent seem to create mass amounts of HTTP connections (seen with netstat), on the order of thousands, or tens of thousands, per IP address. These connections stay open for a while. It seems to be happening for different users, so it's not just a single misconfigured user.

So far I've resorted to blocking the AHC/2.1 user agent and manually blocking IP addresses and ranges of users who're using this software. But since this software seems to be a common way to send HTTP requests I'd like to know if there's anything I can do to avoid this on my side.

One thing of note is that we automatically block users sending large amounts of traffic with either a 410 or a 429 HTTP error. Could this be what's causing it? Does AHC/2.1 have some sort of auto-retry mechanism enabled per default?

Hello

You're sadly the victim of ill-intentioned or clueless users. This is not this library's fault.
As you expose an internet service, you have no choice but to implement countermeasures on your side.

These connections stay open for a while.

Keep-alive is the default behavior since HTTP/1.1. Both ends can decide to close the connection but by default the connection stays open until one end decides to close it.
In AHC, by default, connections will only be actively closed by this side when the engine is closed.

You definitely have to implement a keep-alive timeout on your side. You can't trust the counterpart to play nice and close idle connections.
Then, if those connections are not idle, well, they are being actively used.
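
To make the keep-alive timeout idea concrete, here is a minimal sketch of server-side idle tracking. Most HTTP servers and proxies expose this as a configuration setting, so you would rarely write it by hand. The class `IdleConnectionTracker`, its methods, and the string connection ids are all hypothetical and only illustrate the bookkeeping: record the last activity per connection, then close whatever has been idle longer than the timeout.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper: remembers the last activity time of each connection. */
final class IdleConnectionTracker {
    private final long idleTimeoutMillis;
    private final Map<String, Long> lastActivity = new ConcurrentHashMap<>();

    IdleConnectionTracker(long idleTimeoutMillis) {
        this.idleTimeoutMillis = idleTimeoutMillis;
    }

    /** Call when a connection is opened or carries a request. */
    void touch(String connectionId, long nowMillis) {
        lastActivity.put(connectionId, nowMillis);
    }

    /** Removes and returns the ids of connections the server should close now. */
    List<String> reapIdle(long nowMillis) {
        List<String> expired = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastActivity.entrySet()) {
            boolean idleTooLong = nowMillis - e.getValue() >= idleTimeoutMillis;
            // remove(key, value) only succeeds if the connection was not touched meanwhile
            if (idleTooLong && lastActivity.remove(e.getKey(), e.getValue())) {
                expired.add(e.getKey());
            }
        }
        return expired;
    }

    /** Number of connections currently tracked as open. */
    int openConnections() {
        return lastActivity.size();
    }
}
```

A periodic task would call `reapIdle` with the current time and close the sockets behind the returned ids. Connections that keep sending requests keep getting touched and are never reaped, which matches the point above: if they are not idle, they are being actively used.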

So far I've resorted to blocking the AHC/2.1 user agent

Please note that those users could override the user agent header value and use whatever other value, e.g. Chrome's, Firefox's or curl's.

users who're using this software

Not an application, a library. Anyone can build whatever software they like on top of it.

Could this be what's causing it?

People with ill intentions, or people who don't care about the consequences of their actions for the service maintainer.

Does AHC/2.1 have some sort of auto-retry mechanism enabled per default?

AHC has a built-in retry mechanism when a keep-alive TCP connection is closed while a request is under way, yes.
Then, users might implement their own mechanism.
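
As an illustration of what a connection-level retry looks like, here is a minimal sketch. It is not AHC's implementation: `RetrySketch`, the `Request` interface, and `sendWithRetry` are hypothetical names. The sketch retries only when the send fails with an `IOException` (for example because the keep-alive connection was closed mid-request) and returns any HTTP status untouched. The assumption here, not verified against AHC itself, is that a 410 or 429 is a completed response rather than a dropped connection, so this kind of retry would not fire on it.

```java
import java.io.IOException;

final class RetrySketch {
    /** Hypothetical request: returns an HTTP status, or throws if the connection drops. */
    interface Request {
        int send() throws IOException;
    }

    /** Retries only on connection-level failures, up to maxAttempts attempts in total. */
    static int sendWithRetry(Request request, int maxAttempts) throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                // Any status, including 410 or 429, is returned to the caller as-is.
                return request.send();
            } catch (IOException e) {
                last = e; // connection failed: try again on the next iteration
            }
        }
        throw last; // every attempt failed at the connection level
    }
}
```

A user-written retry loop that also retries on error statuses, without a cap or a delay, would behave very differently and could keep hammering a server that is already answering 429.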

Good luck

Thanks for the quick response @slandelle

One thing I'm noting is that this issue is specific to the AHC/2.1 software. I'm not seeing this issue on any other HTTP clients. Webhook.site is used by hundreds of thousands of users every month and our customers are using all sorts of HTTP clients.

I don't believe that this is caused by ill intentions, but rather that the AHC/2.1 software perhaps has defaults that can cause this behavior. I'm seeing a lot of different users that are only sending a few HTTP requests to their Webhook.site endpoint, but have thousands of connections. I don't think that they're knowingly DDoSing Webhook.site.

While this can perhaps be remediated with better configuration/timeouts on my side, there seems to be some behavior in this software that is causing this issue.

One thing I'm noting is that this issue is specific to the AHC/2.1 software. I'm not seeing this issue on any other HTTP clients.

Yeah, one badly behaved user is enough.
Also, AHC is pretty popular among web crawlers, and those people tend to go through public proxy lists to hide their deeds.

While this can perhaps be remediated with better configuration/timeouts on my side, there seems to be some behavior in this software that is causing this issue.

Honestly, I don't think so. Even if these users don't really mean to harm you, they could be misusing this library. Many times I've seen people creating one client instance per request (first error) and not closing it (second error).
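
To show why those two errors add up, here is a minimal sketch using a stand-in client rather than AHC itself. `ClientReuseSketch`, `FakeClient`, and the `example.invalid` URL are hypothetical. The stand-in only counts instances that were created and never closed, where a real client would be holding a connection pool and its keep-alive connections open.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class ClientReuseSketch {
    /** Number of stand-in clients created but not yet closed. */
    static final AtomicInteger OPEN_CLIENTS = new AtomicInteger();

    /** Hypothetical stand-in for an HTTP client that owns a connection pool. */
    static final class FakeClient implements AutoCloseable {
        FakeClient() {
            OPEN_CLIENTS.incrementAndGet();
        }

        String get(String url) {
            return "response from " + url;
        }

        @Override
        public void close() {
            OPEN_CLIENTS.decrementAndGet();
        }
    }

    /** Anti-pattern: a new client per request, never closed. */
    static void leakyRequests(String url, int count) {
        for (int i = 0; i < count; i++) {
            new FakeClient().get(url);
        }
    }

    /** Intended pattern: one client shared by all requests, closed when done. */
    static void sharedClientRequests(String url, int count) {
        try (FakeClient client = new FakeClient()) {
            for (int i = 0; i < count; i++) {
                client.get(url);
            }
        }
    }

    public static void main(String[] args) {
        leakyRequests("https://example.invalid/hook", 1000);
        System.out.println("open after leaky run: " + OPEN_CLIENTS.get());
        OPEN_CLIENTS.set(0);
        sharedClientRequests("https://example.invalid/hook", 1000);
        System.out.println("open after shared run: " + OPEN_CLIENTS.get());
    }
}
```

In the leaky variant every request leaves one more client open, which from the server's side would look like a handful of requests spread across a very large number of connections.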

Anyway, this project is basically dead. I stopped maintaining it years ago and the person who was supposed to take over never did the job. I just answered here because I felt sorry for you.

Thanks again, I guess if the project is dead, the issue won't be fixed anyways.

I'll be adding a note in our documentation that AHC/2.1 is broken and should be avoided.

No, AHC is not broken. The way some clueless people use it is.