python-hyper/hyperlink

normalizing "free radical" percent signs

glyph opened this issue · 2 comments

glyph commented

I think that

>>> hyperlink.URL(path=['%%%']).normalize()
URL.from_text('%%%')

ought to be giving me %25%25%25?

I haven't managed to dredge up a spec reference for this, but a % in the path without 2 hex digits after it seems like it ought to just be quoted.
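For illustration, the rule described here (quote any % that is not followed by two hex digits) can be sketched with a regex. This is only a rough sketch of the expected behavior, not hyperlink's actual implementation; the helper name `encode_free_percents` is made up for this example.

```python
import re

# A "%" that is NOT followed by exactly two hex digits is a "free" percent sign.
_FREE_PERCENT = re.compile(r"%(?![0-9A-Fa-f]{2})")


def encode_free_percents(text: str) -> str:
    """Hypothetical helper: quote stray percent signs as %25.

    Valid percent-encoded triplets such as %41 are left untouched.
    """
    return _FREE_PERCENT.sub("%25", text)


print(encode_free_percents("%%%"))   # %25%25%25
print(encode_free_percents("%41%"))  # %41%25
```

The negative lookahead leaves well-formed escapes alone, so running the function a second time on its own output changes nothing.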

Yeah, I agree this would be nice behavior. The spec says free percents should always be encoded, but in a quick test, neither Firefox nor Chrome does this:

$ nc -l -p 9999

# firefox

GET /%%%/%%% HTTP/1.1
Host: localhost:9999
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:58.0) Gecko/20100101 Firefox/58.0
...

# chromium
GET /%%%/%%% HTTP/1.1
Host: 127.0.0.1:9999
Connection: keep-alive
Cache-Control: max-age=0
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/64.0.3282.167 Chrome/64.0.3282.167 Safari/537.36
...
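If nc isn't handy, the same check can be approximated with the standard library. This is a rough sketch, assuming a single plain-HTTP request on localhost; `request_target` and `listen_once` are names invented for this example.

```python
import socket


def request_target(raw: bytes) -> str:
    """Return the request target (the second token of the request line)."""
    request_line = raw.split(b"\r\n", 1)[0].decode("latin-1")
    _method, target, _version = request_line.split(" ", 2)
    return target


def listen_once(port: int = 9999) -> str:
    """Accept one connection and return the path the client actually sent."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind(("127.0.0.1", port))
        srv.listen(1)
        conn, _addr = srv.accept()
        with conn:
            raw = conn.recv(4096)  # enough for the request line and headers
    return request_target(raw)
```

Calling `listen_once()` and then pointing a browser at the port should return whatever path the browser put on the wire, which makes it easy to compare against the output above.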

So, given that expectation, .normalize() does seem like the right place to put this sort of thing. I'll look into it now.

@glyph, got a PR for ya (#62), as mentioned above. Thanks for the initial report and thanks in advance if you're able to review.