URLs don't support fromText -> toURI with URLs containing IPv6 literals
hawkowl opened this issue · 2 comments
hawkowl commented
>>> URL.fromText(u"http://[3fff::1]/foo").asURI().asText()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/hawkowl/venvs/commands/lib/python2.7/site-packages/hyperlink/_url.py", line 1338, in to_uri
    new_host = self.host if not self.host else idna_encode(self.host, uts46=True).decode("ascii")
  File "/home/hawkowl/venvs/commands/lib/python2.7/site-packages/idna/core.py", line 340, in encode
    s = uts46_remap(s, std3_rules, transitional)
  File "/home/hawkowl/venvs/commands/lib/python2.7/site-packages/idna/core.py", line 332, in uts46_remap
    _unot(code_point), pos + 1, repr(domain)))
idna.core.InvalidCodepoint: Codepoint U+003A not allowed at position 5 in u'3fff::1'
mahmoud commented
Hey Hawkie! This was pretty concerning at first, since I thought we had a bunch of IPv6 coverage. Now I see that the problem is actually the to_uri() part and the newly integrated idna stuff:
>>> url = URL.from_text(u'https://[2001:0db8:85a3:0000:0000:8a2e:0370:7334]:80/')
>>> url.to_uri()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/mahmoud/virtualenvs/tmp-d364b3d6b21cd4e4/local/lib/python2.7/site-packages/hyperlink/_url.py", line 1338, in to_uri
    new_host = self.host if not self.host else idna_encode(self.host, uts46=True).decode("ascii")
  File "/home/mahmoud/virtualenvs/tmp-d364b3d6b21cd4e4/local/lib/python2.7/site-packages/idna/core.py", line 358, in encode
    s = alabel(label)
  File "/home/mahmoud/virtualenvs/tmp-d364b3d6b21cd4e4/local/lib/python2.7/site-packages/idna/core.py", line 270, in alabel
    ulabel(label)
  File "/home/mahmoud/virtualenvs/tmp-d364b3d6b21cd4e4/local/lib/python2.7/site-packages/idna/core.py", line 304, in ulabel
    check_label(label)
  File "/home/mahmoud/virtualenvs/tmp-d364b3d6b21cd4e4/local/lib/python2.7/site-packages/idna/core.py", line 261, in check_label
    raise InvalidCodepoint('Codepoint {0} at position {1} of {2} not allowed'.format(_unot(cp_value), pos+1, repr(label)))
idna.core.InvalidCodepoint: Codepoint U+003A at position 5 of u'2001:0db8:85a3:0000:0000:8a2e:0370:7334' not allowed
So I'm guessing we just need to skip IDNA encoding for IP literals, since they're pretty much guaranteed to be ASCII (some examples: http://www.gestioip.net/docu/ipv6_address_examples.html). How's that sound?
hawkowl commented
That's the approach that Twisted's internals use -- check if it's an IP
address, idna encode only if it's not.
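
For illustration, here is a rough sketch of that check. It is not hyperlink's or Twisted's actual code. The helper names is_ip_literal and encode_host are made up for this example, it assumes Python 3's ipaddress module, and it uses the standard library "idna" codec as a stand-in for the idna_encode(host, uts46=True) call in the tracebacks above, so the encoding rules may differ from the idna package.

```python
import ipaddress


def is_ip_literal(host):
    """Return True if host parses as an IPv4 or IPv6 address (hypothetical helper)."""
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return False
    return True


def encode_host(host):
    """IDNA-encode host unless it is empty or an IP literal (hypothetical helper)."""
    if not host or is_ip_literal(host):
        # IP literals are already ASCII and contain ':' or '.', so leave them alone.
        return host
    # Stand-in for idna_encode(host, uts46=True); the stdlib codec is an assumption.
    return host.encode("idna").decode("ascii")
```

With this shape, encode_host(u"3fff::1") returns the host unchanged instead of raising InvalidCodepoint, while ordinary hostnames still go through IDNA encoding.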