Percent encoding of ~
Closed this issue · 3 comments
Currently cmark percent-encodes `~`, but it doesn't do so for `.`, `_`, or `-`. All 4 of them are unreserved characters. Shouldn't `~` also be left unencoded?
I don't know; this comes from houdini_href_e.c, which was originally from GitHub.
It seems that `~` was required to be encoded in the past, and maybe the code is just playing it safe:
https://jkorpela.fi/tilde.html
RFC 3986, section 2.3, says:
For consistency, percent-encoded octets in the ranges of ALPHA
(%41-%5A and %61-%7A), DIGIT (%30-%39), hyphen (%2D), period (%2E),
underscore (%5F), or tilde (%7E) should not be created by URI
producers and, when found in a URI, should be decoded to their
corresponding unreserved characters by URI normalizers.
I'm happy to change this if you want to submit a PR.
Probably just need to change one item in the array in houdini_href_e.c from 1 to 0.
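
For anyone picking this up, here is a minimal sketch of the table-driven idea, not the actual cmark code. The names `init_unreserved` and `percent_encode` are hypothetical, and the table here marks only the RFC 3986 unreserved set as pass-through, whereas the real array in houdini_href_e.c also leaves other URL characters alone. The polarity of the flags in the real array may differ from this sketch, so check the file before flipping anything.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table layout: table[c] == 1 means "copy byte c unchanged",
 * 0 means "percent-encode it". Only the RFC 3986 unreserved set is marked. */
static void init_unreserved(unsigned char table[256]) {
    memset(table, 0, 256);
    for (int c = 'A'; c <= 'Z'; c++) table[c] = 1;
    for (int c = 'a'; c <= 'z'; c++) table[c] = 1;
    for (int c = '0'; c <= '9'; c++) table[c] = 1;
    table['-'] = 1;
    table['.'] = 1;
    table['_'] = 1;
    table['~'] = 1; /* the single entry this issue is about */
}

/* Encodes the NUL-terminated string src into dst (capacity cap bytes,
 * including the terminator). Returns the encoded length, or -1 if dst
 * is too small. */
static int percent_encode(const unsigned char table[256], const char *src,
                          char *dst, size_t cap) {
    static const char hex[] = "0123456789ABCDEF";
    size_t n = 0;

    if (cap == 0) return -1;
    for (; *src; src++) {
        unsigned char c = (unsigned char)*src;
        if (table[c]) {
            if (n + 1 >= cap) return -1;
            dst[n++] = (char)c;
        } else {
            if (n + 3 >= cap) return -1;
            dst[n++] = '%';
            dst[n++] = hex[c >> 4];
            dst[n++] = hex[c & 15];
        }
    }
    dst[n] = '\0';
    return (int)n;
}
```

With the table as initialized, `~` passes through untouched; setting `table['~'] = 0` before calling `percent_encode` reproduces the behavior reported above, where `~` comes out as `%7E`.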