getanteon/anteon

Large request loads produce non-http errors

douglasg14b opened this issue · 5 comments

When testing with, say, 30k requests over 30 seconds, the following errors start happening:

Are these local to the device? If so, are there recommended changes or setup steps that need to be performed before running these kinds of tests?

This happens on both Windows & Linux.

RESULT
-------------------------------------
Success Count:    1699  (5%)
Failed Count:     28301 (95%)

Durations (Avg):
  DNS                  :0.0176s
  Connection           :0.2787s
  Request Write        :0.0001s
  Server Processing    :0.1124s
  Response Read        :0.0000s
  Total                :0.6075s

Status Code (Message) :Count
  200 (OK)    :1699

Error Distribution (Count:Reason):
  17899     :connection timeout
  2         :context deadline exceeded (Client.Timeout or context cancellation while reading body)
  10400     :read timeout

Hi @douglasg14b, what's your timeout (-T flag) for this test?

It's 2. However, changing it seems to not have an effect on these results. It just delays the failures for a few more seconds.

It doesn't appear that all these requests hit my API either, as the load balancer request load stays low during this test. A sign of the API being overloaded is also 504 returns from the load balancer, which we're not getting any of here.
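For a sense of scale, the figures quoted in this thread (30k requests over 30 seconds, a timeout of 2, and an average total duration of 0.6075s) can be turned into a rough estimate of how many connections the client holds open at once. This is only a back-of-the-envelope sketch: it assumes the requests are spread evenly over the test and that the timeout is in seconds, and the helper name `peak_in_flight` is made up for illustration.

```python
def peak_in_flight(total_requests: int, duration_s: float, hold_time_s: float) -> float:
    """Estimate concurrent connections as arrival rate times hold time.

    Assumes a constant arrival rate over the whole test (an assumption,
    not something the thread confirms).
    """
    rate_per_s = total_requests / duration_s
    return rate_per_s * hold_time_s


# Figures taken from the thread.
TOTAL_REQUESTS = 30_000
DURATION_S = 30
TIMEOUT_S = 2.0          # assumed to be seconds
AVG_TOTAL_S = 0.6075     # "Total" from the RESULT block

rate = TOTAL_REQUESTS / DURATION_S
# If every request hangs until the timeout fires.
worst_case = peak_in_flight(TOTAL_REQUESTS, DURATION_S, TIMEOUT_S)
# If every request takes the reported average total duration.
typical = peak_in_flight(TOTAL_REQUESTS, DURATION_S, AVG_TOTAL_S)

print(f"arrival rate: {rate:.0f} req/s")
print(f"in flight if all requests hit the timeout: {worst_case:.0f}")
print(f"in flight at the average total duration: {typical:.1f}")
```

Under these assumptions the client may need somewhere between several hundred and a couple of thousand sockets open at the same time, which is worth comparing against the machine's own limits before concluding that the server is the bottleneck.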

Hi @douglasg14b, how can I reproduce this issue? What's your load balancer? Are there any throttling/security rules in front of the load balancer?

By the way, if you get "socket: too many open files" errors, you can run ulimit -n 1048575 to raise the open files limit, which is 256 on macOS by default.
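To see what the open files limit currently is before raising it, something like the following may help. It is a minimal sketch using Python's standard `resource` module, which is only available on Unix-like systems (so it will not run on Windows); the helper names and the `headroom` value are illustrative assumptions, not part of the load-testing tool.

```python
import resource


def open_file_limits() -> tuple[int, int]:
    """Return the (soft, hard) limits on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def can_hold(connections: int, headroom: int = 64) -> bool:
    """Check whether the soft limit leaves room for `connections` sockets.

    `headroom` is an arbitrary allowance for stdin/stdout, log files, and
    so on; 64 is a guess, not a measured value.
    """
    soft, _ = open_file_limits()
    if soft == resource.RLIM_INFINITY:
        return True
    return soft >= connections + headroom


soft, hard = open_file_limits()
print(f"soft limit: {soft}, hard limit: {hard}")
print(f"room for 2000 concurrent sockets: {can_hold(2000)}")
```

The soft limit is what `ulimit -n` reports and changes in a shell session; an unprivileged process cannot raise it above the hard limit.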

Hi.

The load balancer is Amazon Elastic Load Balancer in front of an ECS cluster.

Non-local. I'm also using Linux & Windows.

I can give you a test payload & URL privately that you can hit to test. It's scaled down right now, so you'll want to run a long test so it has time to scale back up based on load.