benoitc/hackney

Add transparent Content-Encoding: gzip/deflate support

Opened this issue · 29 comments

So, hackney should add Accept-Encoding: gzip, deflate to the request, then check the response for Content-Encoding: gzip or Content-Encoding: deflate and decompress the body (after de-chunking).
Decompression can be done in a single pass (zlib:gunzip/1 for gzip and zlib:unzip/1 for deflate) for synchronous requests, and by streaming (zlib:open/0, zlib:inflateInit/1, zlib:inflate/2, ..., zlib:close/1) for asynchronous ones.
A single-pass example can be found here: https://github.com/seriyps/xhttpc/blob/master/src/middlewares/compression_middleware.erl
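
For illustration, the single-pass case could look roughly like the sketch below. The module and function names are hypothetical and not part of hackney, and the encoding value is assumed to be already lowercased. One assumption worth flagging: servers are known to send "deflate" bodies either zlib-wrapped (RFC 1950) or as raw deflate, so the sketch tries zlib:uncompress/1 first and falls back to zlib:unzip/1.

```erlang
-module(content_decoding).
-export([decode/2]).

%% Hypothetical helper (not hackney API): decode a fully received,
%% already de-chunked body according to its Content-Encoding value.
-spec decode(binary(), binary()) -> binary().
decode(<<"gzip">>, Body) ->
    zlib:gunzip(Body);
decode(<<"deflate">>, Body) ->
    %% Assumption: try the zlib-wrapped form first, then raw deflate.
    try zlib:uncompress(Body)
    catch error:_ -> zlib:unzip(Body)
    end;
decode(_Other, Body) ->
    %% Unknown or absent encoding: hand the body back untouched.
    Body.
```

Calling content_decoding:decode(<<"gzip">>, Body) on a gzipped body returns the plain binary; anything that is not gzip or deflate passes through unchanged.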

Some option should be added to enable this feature.

Will add it in the next release. Thanks for the suggestion.

Also, bear in mind the new Erlang 18.0 API that protects against gzip bombs: http://www.erlang.org/documentation/doc-7.0-rc1/erts-7.0/doc/html/zlib.html#inflateChunk-2
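
A rough sketch of what bounded decompression could look like. Note the assumptions: it uses zlib:safeInflate/2, which newer OTP releases offer as the replacement for the inflateChunk API linked above, and the module name and MaxSize parameter are made up for the example.

```erlang
-module(bounded_gunzip).
-export([gunzip/2]).

%% Hypothetical helper: inflate a gzip body, giving up as soon as the
%% decompressed output exceeds MaxSize bytes.
-spec gunzip(binary(), non_neg_integer()) ->
          {ok, binary()} | {error, too_large}.
gunzip(Compressed, MaxSize) ->
    Z = zlib:open(),
    try
        %% Window bits 16 + 15 select gzip framing.
        ok = zlib:inflateInit(Z, 16 + 15),
        loop(Z, zlib:safeInflate(Z, Compressed), [], 0, MaxSize)
    after
        %% close/1 releases the stream even when we stop early.
        zlib:close(Z)
    end.

%% safeInflate/2 returns {continue, Output} while more output is pending
%% and {finished, Output} at the end of the stream.
loop(Z, {Status, Output}, Acc, Size, MaxSize) ->
    NewSize = Size + iolist_size(Output),
    if
        NewSize > MaxSize ->
            {error, too_large};
        Status =:= finished ->
            {ok, iolist_to_binary(lists:reverse([Output | Acc]))};
        Status =:= continue ->
            loop(Z, zlib:safeInflate(Z, []), [Output | Acc], NewSize, MaxSize)
    end.
```

The point of the design is that output is produced in small pieces, so a tiny body that expands to gigabytes is rejected after at most one piece past the limit instead of being inflated in full.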

The feature is coming. I am in the middle of adding some big improvements to hackney. I will probably include it right after. Thanks for the hint.

+1 :)

zyro commented

+1 Would be great to see this handled transparently for both requests and responses.

+1 :)

+1 on this :)

+1

Can I help with this? I would be glad to follow some guidance and help with this feature.

@edgurgel thanks! It should probably be added to the functions receiving the body; if you wrap them and keep the state it should be OK. Anyway, better to wait until after the weekend, for the new recv and recv_multipart functions :)
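
To make the "wrap them and keep the state" idea concrete, here is a minimal sketch, not hackney's actual implementation. It assumes a gzip-encoded body and wraps any zero-arity fun that returns {ok, Chunk} | done | {error, Reason}, which is the shape hackney:stream_body/1 returns. The module name inflate_reader is hypothetical.

```erlang
-module(inflate_reader).
-export([new/1, read/1]).

%% Hypothetical wrapper: the state is the underlying read fun plus an
%% open zlib stream. The stream belongs to the process that calls new/1,
%% so read/1 should be called from that same process.
new(ReadFun) when is_function(ReadFun, 0) ->
    Z = zlib:open(),
    ok = zlib:inflateInit(Z, 16 + 15),   % 16 + 15 selects gzip framing
    {ReadFun, Z}.

%% Read one compressed chunk and return whatever it inflates to.
%% A chunk may inflate to an empty binary if it only holds header bytes.
read({ReadFun, Z} = State) ->
    case ReadFun() of
        {ok, Chunk} ->
            {ok, iolist_to_binary(zlib:inflate(Z, Chunk)), State};
        done ->
            zlib:close(Z),
            done;
        {error, _Reason} = Error ->
            zlib:close(Z),
            Error
    end.
```

With a hackney client reference Ref, the reader would be created as inflate_reader:new(fun() -> hackney:stream_body(Ref) end) and then read/1 called until it returns done.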

Any update on this feature ?

+1 :)

What's the status of this issue?

No work has been done on my side yet, but any help is appreciated. Anyway, I expect that this feature will land after the current rework of the internals.

+1. Just ran into this: some servers send gzipped content and hackney returns the compressed bitstring as the body. I finally figured out the root cause with lots of people's help. It would be really good if hackney could transparently unzip the body based on Content-Encoding. :)

I have a shim wrapper for this that works for deflate/gzip, even when calling stream_body/1. I can put it up here if anyone wants to see the moving parts that are involved. It is not a hackney fork though.

Have the internals been reworked?

I'm working on a ueberauth strategy for StackOverflow and I would love to have this feature completed so I don't have to wire it into OAuth2's handling of the response.

I have some time available to actively help develop, test, or otherwise pitch in. How might I be able to lend a hand? Is a branch started for this that I can begin looking at?

Responses from StackOverflow are gzipped. Ueberauth hands off the request to the OAuth2 library, which just calls Poison to deserialize the body. The OAuth2 library does not pass the headers to Poison, just the body.

I'm getting around this by creating my own serializer that gunzips the body before handing it off to Poison (https://github.com/scrogson/oauth2/blob/master/lib/oauth2/serializer.ex#L27-L39). But in doing this I still don't have access to the headers, so it's blind faith right now.
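
One way to make the headerless case a bit less blind, sketched in Erlang: gzip streams start with the magic bytes 16#1f 16#8b (RFC 1952), so the body can be sniffed before gunzipping. This is only a heuristic and the function name is hypothetical; checking Content-Encoding remains the proper approach whenever the headers are available.

```erlang
-module(maybe_gunzip).
-export([maybe_gunzip/1]).

%% Hypothetical helper: gunzip only when the body starts with the gzip
%% magic bytes, and fall back to the original body if inflating fails.
-spec maybe_gunzip(binary()) -> binary().
maybe_gunzip(<<16#1f, 16#8b, _/binary>> = Body) ->
    try zlib:gunzip(Body)
    catch error:_ -> Body
    end;
maybe_gunzip(Body) ->
    Body.
```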

Hit me up on Elixir's slack if you are more interested, I feel like this is a tad off-topic for this issue.

Surely only when you include an accept-encoding: gzip, deflate header though?

@jimdigriz Can you share your shim with us?

Okay... it is ugly as sin though. Attached is code you will just have to adapt to fit into your setup.

zlib-shim.txt - let's call this the http module

So what you really want instead is:

  • hackney:get -> http:get (you cannot use the get body option here)
  • hackney:body -> http:body
  • hackney:stream_body -> http:stream_body
  • http:stream_body_drain is used when you need to drain and discard a response; though it probably should not exist, as you could just call http:body...

Has anyone tried #456? It has been in production for my use case for two weeks.

Was curious if this was supported and found this issue. Is it still planned for 2.0.0?

I am lagging on the release, but yes, that will be part of the next release. There is an RC that will land next week including it :)

👍