wilsonzlin/minify-html

Error with backarrow

Closed this issue · 4 comments

I ran into an error while using this minifier on strings of the following form: <a><- H</a>

Expected behavior: [image]

Actual behavior: [image]

Minimal reproduction:

from minify_html import minify

print(minify("<a><- H</a>"))  # `<a><- a h<>`

@gruvw - I don't think this is a bug. While > characters can optionally be encoded into HTML &gt; entities, < characters must be encoded into HTML &lt; entities so they are not mistaken for the beginning of a tag.

This should work for you:

from minify_html import minify

print(minify("<a>&lt;- H</a>"))  # `<a><- a h<>`

except that bug #191 prevents this from working properly.
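If the link text comes from a variable rather than a hand-written literal, the escaping described above can be done with Python's standard `html.escape` before the markup is assembled. This is only a minimal sketch: `link` is a hypothetical helper, and the result is deliberately not passed through `minify` because of the bug mentioned above.

```python
from html import escape


def link(text: str) -> str:
    """Wrap arbitrary text in an <a> element, escaping &, < and >."""
    # quote=False: the text is element content, not an attribute value,
    # so quotation marks can stay as they are.
    return f"<a>{escape(text, quote=False)}</a>"


markup = link("<- H")
print(markup)  # <a>&lt;- H</a>
```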

Well, conceptually, as the user of a minifier, I would expect the page to render the same with or without the minifier.

In this particular case, using the minifier changed the result, which is why I called it a bug/error.

I am not very familiar with this particular project (minify-html), but it's a general rule that I tend to apply.
If it does not apply here and there are edge cases of that sort, is there a documented list of common pitfalls and how to fix them?

@gruvw - for any utility that processes input, I think it's fair to expect uncertain output behavior for invalid inputs.

You can verify that this input is invalid by going to the W3C Markup Validation Service page and checking this HTML:

<!DOCTYPE html>
<html lang="en">
 <head><title>title</title></head>
 <body>
  <p><a><- H</a></p>
 </body>
</html>

I generally prefer a utility to raise an error instead of silently producing uncertain output, but I get your point.
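
The preference for failing loudly can be approximated on the caller's side with a pre-check that runs before minifying. The following is a rough sketch, not part of minify-html: `reject_stray_lt` is a hypothetical helper, and the heuristic is deliberately simple (it would also flag a legitimate `<` inside `<script>` or `<style>` content).

```python
import re

# A "<" that is not followed by a letter, "/", "!" or "?" cannot open a tag,
# closing tag, comment, or doctype, so it is most likely unescaped text.
_STRAY_LT = re.compile(r"<(?![A-Za-z/!?])")


def reject_stray_lt(html_source: str) -> str:
    """Return html_source unchanged, or raise ValueError on a stray '<'."""
    match = _STRAY_LT.search(html_source)
    if match:
        raise ValueError(f"unescaped '<' at offset {match.start()}")
    return html_source


try:
    reject_stray_lt("<a><- H</a>")
except ValueError as exc:
    print(exc)  # unescaped '<' at offset 3

reject_stray_lt("<a>&lt;- H</a>")  # passes: the "<" is encoded as &lt;
```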