twingly/twingly-url

twingly-url doesn't support IDNA2008, only IDNA2003 (libidn)

dentarg opened this issue · 2 comments

Failures:

  1) Twingly::URL#origin internationalized domain name given in Unicode (http://straße.de) should eq "http://xn--strae-oqa.de"
     Failure/Error: it { is_expected.to eq("http://xn--strae-oqa.de") }

       expected: "http://xn--strae-oqa.de"
            got: "http://strasse.de"

       (compared using ==)
     # ./spec/lib/twingly/url_spec.rb:305:in `block (5 levels) in <top (required)>'

Finished in 0.20734 seconds (files took 0.34129 seconds to load)
251 examples, 1 failure

From https://curl.haxx.se/docs/adv_20161102K.html:

When curl is built with libidn to handle International Domain Names (IDNA), it translates them to puny code for DNS resolving using the IDNA 2003 standard, while IDNA 2008 is the modern and up-to-date IDNA standard.

This misalignment causes problems with for example domains using the German ß character (known as the Unicode Character 'LATIN SMALL LETTER SHARP S') which is used at times in the .de TLD and is translated differently in the two IDNA standards, leading to users potentially and unknowingly issuing network transfer requests to the wrong host.

For example, straße.de is translated into strasse.de using IDNA 2003 but is translated into xn--strae-oqa.de using IDNA 2008. Needless to say, those host names could very well resolve to different addresses and be two completely independent servers. IDNA 2008 is mandatory for .de domains.

curl is not alone with this problem, as there's currently a big flux in the world of network user-agents about which IDNA version to support and use.

This name problem exists for DNS-using protocols in curl, but only when built to use libidn.

We are not aware of any exploit of this flaw.
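
To make the straße.de example from the advisory concrete, here is a minimal sketch in plain Ruby. It is not what twingly-url, Addressable or libidn actually do. `punycode_encode` and `to_ascii` are hypothetical helpers written for this illustration, and `downcase(:fold)` is only a rough stand-in for the IDNA 2003 mapping step (real Nameprep does more). The Punycode part follows RFC 3492.

```ruby
# Punycode parameters from RFC 3492, section 5.
BASE = 36
TMIN = 1
TMAX = 26
SKEW = 38
DAMP = 700
INITIAL_BIAS = 72
INITIAL_N = 128

# Bias adaptation, RFC 3492 section 6.1.
def adapt(delta, num_points, first_time)
  delta = first_time ? delta / DAMP : delta / 2
  delta += delta / num_points
  k = 0
  while delta > ((BASE - TMIN) * TMAX) / 2
    delta /= BASE - TMIN
    k += BASE
  end
  k + (BASE - TMIN + 1) * delta / (delta + SKEW)
end

# Digit values 0..25 map to a..z, 26..35 map to 0..9.
def encode_digit(d)
  (d < 26 ? d + 97 : d - 26 + 48).chr
end

# Hypothetical helper: encodes one label with Punycode (no "xn--" prefix).
def punycode_encode(label)
  cps = label.codepoints
  basic = cps.select { |c| c < 128 }
  output = basic.pack("U*")
  h = b = basic.length
  output << "-" if b > 0
  n = INITIAL_N
  delta = 0
  bias = INITIAL_BIAS
  while h < cps.length
    m = cps.select { |c| c >= n }.min
    delta += (m - n) * (h + 1)
    n = m
    cps.each do |c|
      delta += 1 if c < n
      next unless c == n
      q = delta
      k = BASE
      loop do
        t = if k <= bias then TMIN
            elsif k >= bias + TMAX then TMAX
            else k - bias
            end
        break if q < t
        output << encode_digit(t + (q - t) % (BASE - t))
        q = (q - t) / (BASE - t)
        k += BASE
      end
      output << encode_digit(q)
      bias = adapt(delta, h + 1, h == b)
      delta = 0
      h += 1
    end
    delta += 1
    n += 1
  end
  output
end

# Hypothetical helper: converts a host name to its ASCII form.
# fold: true  approximates IDNA 2003, where ß is mapped to "ss" before encoding.
# fold: false approximates IDNA 2008, where ß is a valid character and is kept.
def to_ascii(host, fold:)
  host.split(".").map do |label|
    label = label.downcase(:fold) if fold
    label.ascii_only? ? label : "xn--#{punycode_encode(label)}"
  end.join(".")
end

idna2003 = to_ascii("straße.de", fold: true)
idna2008 = to_ascii("straße.de", fold: false)
puts idna2003 # strasse.de
puts idna2008 # xn--strae-oqa.de
```

The two results match the "got" and "expected" values in the failing spec above: the IDNA 2003 path loses the ß before Punycode is ever applied, so the host ends up as a different, plain ASCII name.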

I'd say #102 fixed this, and I don't know of any other cases. If we find any, let's open a new issue then.

We can keep an eye on sporkmonger/addressable#247 going forward, but I don't think it warrants its own issue on our repo.