honor Content-Type charset

Question

honor Content-Type charset

Closed this issue 3 years ago · 3 comments

I've run into this issue, and it looks like others have as well:

lostisland/faraday#139
lostisland/faraday#443

It would be easy to handle charset, at least on par with other adapters. What do you think? Happy to put together a PR if there's interest.

Answer 1 · 2021-06-13T21:46:11.000Z

I wrote some code to handle the charset (see below). I really have no idea how to write the tests, unfortunately. I haven't used RSpec before. I see a bunch of nested tests but I'm not sure how to stub responses inside an adapter test. Any pointers?

      def encoded_body(http_response)
        body = http_response.body || +''
        if encoding = encoding_for(http_response)
          if body.encoding != encoding
            body = body.dup if body.frozen?
            body.force_encoding(encoding)
          end
        end
        body
      end

      def encoding_for(http_response)
        if content_type = http_response['Content-Type']
          if encoding = content_type[/\bcharset=\s*(.+?)\s*(;|$)/, 1]
            Encoding.find(encoding) rescue nil
          end
        end
      end

Answer 2 · 2021-06-13T21:50:14.000Z

Also, I am unfamiliar with this syntax http_response.body || +''. I copied it from the code above, but what does it do? :)

Answer 3 · 2021-06-14T08:44:08.000Z

Hi @gurgeous thanks for raising this and doing some initial research!
We can take this from here and incorporate your code with some RSpec tests in a PR, I'll just need to find some time to work on it (@olleolleolle @JanDintel feel free to jump in if you have some time!).

Also, I am unfamiliar with this syntax http_response.body || +''. I copied it from the code above, but what does it do? :)

The only "strange" thing in there I can see is the + prefix to the string literal, which makes the string mutable rather than the immutable default 👍
That together with body = body.dup if body.frozen? is necessary to be able to call force_encoding later