Some HTML entities are incorrectly transformed to UTF8 symbols (e.g. in URLs)
samupl opened this issue · 2 comments
While working on a Django app, I noticed that some URLs were rendered incorrectly. The URL in question had a query param called copy_origin. When the query param was not first (e.g. rendered as &copy_origin=something), the &copy part got transformed to the © symbol. This doesn't happen if the param is just called copy; the following underscore seems to make minify-html think it's a valid entity.
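For comparison, here is a minimal sketch of the same decoding using Python's standard html.unescape, which recognises legacy entity names such as copy even without a trailing semicolon. It only illustrates the effect with the standard library; it is not a statement about how minify-html is implemented, and the example URLs are made up for the demonstration.

```python
import html

# "&copy" followed by "_" is decoded, because "copy" is a legacy entity name
# that is accepted without a closing ";".
not_first = html.unescape("/example?a=1&copy_origin=something")

# When the param comes first there is no "&" in front of it, so nothing matches.
first = html.unescape("/example?copy_origin=something&a=1")

print(not_first)
print(first)
```

In the first URL the query string comes back with © in place of &copy; the second URL is returned unchanged.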
I found a few more examples.
This has been happening since at least 0.11, up to and including the latest version, 0.15:
❯ echo '<a href="/example?attribute=something&copy_something=1&reg_something=1&euro_something=1&yen_something=1">test</a>' | ./minhtml-0.15.0-x86_64-unknown-linux-gnu
<a href=/example?attribute=something©_something=1®_something=1&euro_something=1¥_something=1>test</a>%
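A possible workaround on the application side, sketched under the assumption that the link is built in Python code rather than in a template: escape each & in the attribute value as &amp; before the HTML reaches the minifier, so no bare &copy or &reg appears in the markup. The parameter names are taken from the command above; I have not verified how minhtml treats the escaped input.

```python
import html
from urllib.parse import urlencode

# Build the query string; urlencode joins the pairs with a bare "&".
query = urlencode({"attribute": "something", "copy_something": 1, "reg_something": 1})

# Escape the URL for use inside an HTML attribute: every "&" becomes "&amp;".
href = html.escape("/example?" + query, quote=True)
link = f'<a href="{href}">test</a>'

print(link)
# Decoding the attribute value gives back the original URL.
print(html.unescape(href))
```

In Django templates, autoescaping of a variable containing the URL should have the same effect.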
@wilsonzlin Could you verify whether this is a bug, or whether I'm just making incorrect assumptions about the minification?
Hello there, I am leaving this link here to try out: https://denevcloud.azureedge.net/gumeristore/assets/js/minipopup-open.js. It is not minified correctly and the returned vector cannot be decoded. It contains UTF-8 chars. Try it yourselves, guys.