swift-server/async-http-client

Too low of a decompression ratio in `HTTPClient.shared` settings

MahdiBM opened this issue · 11 comments

Penny is having difficulty decompressing some GitHub responses after the recent Penny update to use HTTPClient.shared.
The decompression limit of 10-ratio seems to be too low for a request to https://api.github.com/repos/vapor/sql-kit/contributors, currently. This might not be reproducible sometime soon when the contents of the endpoint changes.

To reproduce:

docker run --rm -it swift:5.10-jammy bash -c "git clone https://github.com/MahdiBM/AHCLowDecompRatioIssue && cd AHCLowDecompRatioIssue && swift run"

For future reference:

curl https://api.github.com/repos/vapor/sql-kit/contributors -v \                                                                                                                             
    -H "User-Agent: AHC-test" \
    &> ahc-decomp-failure.txt

ahc-decomp-failure.txt

I'm not necessarily inclined to change that setting: I think it's appropriate to leave it not far from where it is. You can create another singleton for your use-case in your module 👍

@Lukasa that'd be fine by me to even revert the whole or most of the PR I did for Penny to use HTTPClient.shared (I regret it), but from my POV it seems like the decompression limit is just too low.

I didn't find proper references on what ratio would be reasonable when searching through some Go and Rust HTTP client libraries. If a ratio of 10 is a widely accepted value, I'm fine with keeping it at 10. Otherwise the only reference I found on Google (the link didn't even open, but was already indexed on Google) mentioned 95% compression is around the max value for normal text files, so I would suggest increasing the ratio to 15/20/25 (each representing 93.3, 95 and 96% compression of a file).

Optionally we can disable decompression to prevent such decompression failures, but that would make the configuration less like what a browser like Firefox would do.

My concern is only that I don't want to expose users to sudden spikes in memory usage caused by decompression. I'm tentatively open to widening the ratio, but I don't think I'd want us to go much beyond 25.

I would have marked this issue as resolved if it wasn't for swiftlang/swiftly#120 (comment)...

Let's mention this issue there and see what happens.

I've run into this problem recently with this URL: https://api.github.com/repos/apple/swift/releases?per_page=100&page=1

A workaround is to create a local shared HTTP client. Here is an example:

private class HTTPClientWrapper {
    fileprivate let inner = HTTPClient(eventLoopGroupProvider: .singleton)

    deinit {
        try? self.inner.syncShutdown()
    }
}

One question is why is the decompression settings different for the .shared HTTPClient, versus the initializer specified above that has the default settings and uses the shared event loop group provider?

We should get these settings aligned. We'd welcome a patch to do that.

@Lukasa The goal of HTTPClient.shared is that it works. And we said (see also docs) that if there's a conflict on how the settings behave we should align them with the platform's default browser. So if Safari works, then HTTPClient.shared should work. It appears that we need to raise the limits for this.

Oh, it appears we raised them already, good. @Lukasa do you know how real-world browsers limit this?

Off the top of my head I don't. I'd expect them to be quite relaxed: memory exhaustion as a user-agent is not really a scary threat. It's not ideal, but the principal impact is DoS on your own origin. That's not a terribly valuable attack to perform, and force-quit will resolve any memory or CPU issue you're hitting. Quite a different threat profile to a server.

@cmcgee1024 no rush for me, but just want to get the situation straight if possible.
The NIO team apparently had yet to release the decompression-ratio changes that I had done.
They released it after you brought it up.

So ... has that resolved your problem?
Or have you not needed to even test the situation again to come to any conclusions?