(Intentional?) inconsistency between 4.6 block HTML and 6.6 raw HTML comments
wooorm opened this issue · 10 comments
The block HTML algorithm here allows `<!-->`, `<!--->`, etc., as comments.
These comments are also fine by the HTML parser (13.2.5.44, case for U+002D).
(Note that there are a couple of cases, such as `<!>` and `<!->`, which HTML also allows but treats as parse errors; I am not talking about these.)
The “inline” algorithm here does not allow `<!-->` or `<!--->`. They look a lot like comments, so I don’t really expect people to depend on these characters to be text. And it’s inconsistent with blocks. Can we change the spec to allow them?
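As a rough illustration of the proposal, here is a minimal Python sketch of a relaxed inline matcher. It assumes the rule would be: `<!-->`, `<!--->`, or `<!--` followed by any text up to the first `-->`. The function name `match_relaxed_comment` is hypothetical and is not taken from any reference implementation.

```python
from typing import Optional


def match_relaxed_comment(text: str, start: int = 0) -> Optional[int]:
    """Return the index just past a comment beginning at `start`, or None.

    Assumed rule (a sketch, not the spec wording): `<!-->`, `<!--->`,
    or `<!--` followed by anything up to the first `-->`.
    """
    if not text.startswith("<!--", start):
        return None
    body = start + 4
    # Degenerate forms: the opener is immediately followed by `>` or `->`.
    if text.startswith(">", body):
        return body + 1
    if text.startswith("->", body):
        return body + 2
    # General form: scan for the first closing `-->`.
    end = text.find("-->", body)
    return None if end == -1 else end + 3
```

With this rule the two degenerate forms need explicit handling, because a plain search for `-->` after the opener would not find a terminator in `<!-->` or `<!--->`.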
I can do the work
Yes, I'm in favor.
Good to hear! One thing I was wondering: `--` inside a comment is the same situation. For example, `<!-- some stuff -- some more stuff -->`. OK too?
If I recall, we deliberately simplified the comment parsing (even though this diverges from the HTML standard). I don't remember why, though. I'm okay with implementing something more standard as long as it doesn't increase complexity too much, both in the spec and in parsers.
I wouldn’t know why that was the case! Perhaps if you care more about XML than HTML?
In my case, this just removes states in my state machine that are needed for inline, but not for block.
I can see `--` in comments being used by humans, so that might even be considered a bug fix.
For reference, the HTML5 spec for comments:
https://html.spec.whatwg.org/multipage/syntax.html#comments
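For comparison, the stricter inline rule under discussion can be sketched as below. This assumes the rule reads: `<!--` + text + `-->`, where the text does not start with `>` or `->`, does not end with `-`, and does not contain `--`. The function name `is_strict_comment` is hypothetical, and this is only an illustration of that reading.

```python
def is_strict_comment(s: str) -> bool:
    """Check a whole string against the assumed stricter inline rule."""
    # Need at least the 4-char opener and 3-char closer without overlap.
    if len(s) < 7 or not s.startswith("<!--") or not s.endswith("-->"):
        return False
    body = s[4:-3]
    # The body may not start with `>` or `->`.
    if body.startswith(">") or body.startswith("->"):
        return False
    # The body may not end with `-` or contain `--` anywhere.
    if body.endswith("-") or "--" in body:
        return False
    return True
```

Under this reading, `<!-- some stuff -- some more stuff -->` is rejected only because of the inner `--`, which is the case the thread proposes to allow.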
Thanks for merging this, John!
Reopening until we get the issue of `<!-->` and `<!--->` (not to mention `<!-- hi -->`) sorted out. See comments on linked PR.
I think an inconsistency between the block and inline cases is okay, given that the spec for block HTML allows invalid HTML.
However, allowing `--` inside HTML comments is a change worth making.
commented in the PR: #713 (comment).