Retain closing br tag as though it were a normal br tag
IMSoP opened this issue · 2 comments
The WHATWG spec includes a special rule for handling </br>, in the section on parsing when "in body":
An end tag whose tag name is "br"
Parse error. Drop the attributes from the token, and act as described in the next entry; i.e. act as if this was a "br" start tag token with no attributes, rather than the end tag token that it actually is.
The result is that invalid HTML like Hello <br>World</br>! will be rendered by browsers as though it had two line breaks, Hello <br>World<br>!. This library currently (quite reasonably!) removes the erroneous end tag instead, giving Hello <br>World!
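To make the difference concrete, here is a minimal sketch in Python (not this library's code) that re-serializes a fragment in two modes: one that drops a stray </br>, and one that follows the spec rule and emits a <br> in its place. The class name Reserializer, the function reserialize and the spec_br flag are assumptions made up for this illustration. Attributes are dropped to keep it short, and Python's html.parser is only a tokenizer here, not a spec-compliant tree builder.

```python
from html.parser import HTMLParser


class Reserializer(HTMLParser):
    """Hypothetical helper: echoes tags and text back out as a string."""

    def __init__(self, spec_br):
        super().__init__(convert_charrefs=True)
        self.spec_br = spec_br  # True: treat </br> as <br>, as the spec says
        self.out = []

    def handle_starttag(self, tag, attrs):
        # Attributes are ignored to keep the sketch short.
        self.out.append(f"<{tag}>")

    def handle_startendtag(self, tag, attrs):
        # <br/> is a single start tag; do not route it through handle_endtag.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag == "br":
            if self.spec_br:
                # Spec rule: act as if this were a "br" start tag
                # token with no attributes.
                self.out.append("<br>")
            # Otherwise drop the erroneous end tag entirely.
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)


def reserialize(html, spec_br):
    parser = Reserializer(spec_br)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


print(reserialize("Hello <br>World</br>!", spec_br=False))  # Hello <br>World!
print(reserialize("Hello <br>World</br>!", spec_br=True))   # Hello <br>World<br>!
```

The first call mirrors what the library does today, the second what browsers do.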
I'm using this library for processing some messy HTML, and it would be useful to have this rule match the spec / browser behaviour.
If that is the browsers' behavior, I think that we should do the same.
Are you willing to submit a patch for this?
I'll give it a go; by the looks of it, it will just need a special case at the top of DOMTreeBuilder::endTag.
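As a rough illustration of what "a special case at the top of endTag" could look like, here is a toy tree builder in Python. It is a hypothetical stand-in and not the library's DOMTreeBuilder: the class TreeBuilder, its methods and the dict-based node shape are all assumptions for the sketch. The only point is the shape of the fix, where the end-tag handler checks for "br" first and delegates to the start-tag handler with no attributes.

```python
class TreeBuilder:
    """Toy tree builder (hypothetical; not the library's DOMTreeBuilder)."""

    VOID = {"br", "hr", "img"}  # small subset of void elements, for the demo

    def __init__(self):
        self.root = {"tag": "#root", "attrs": {}, "children": []}
        self.stack = [self.root]  # stack of open elements

    def start_tag(self, name, attrs=None):
        node = {"tag": name, "attrs": dict(attrs or {}), "children": []}
        self.stack[-1]["children"].append(node)
        if name not in self.VOID:
            self.stack.append(node)  # void elements are never left open

    def text(self, data):
        self.stack[-1]["children"].append(data)

    def end_tag(self, name):
        # Special case, checked before anything else: an end tag named
        # "br" is handled as a "br" start tag with no attributes.
        if name == "br":
            self.start_tag("br")
            return
        # Ordinary end tag: close the nearest matching open element,
        # and silently ignore the tag if nothing matches.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i]["tag"] == name:
                del self.stack[i:]
                return


builder = TreeBuilder()
builder.start_tag("p")
builder.text("Hello ")
builder.start_tag("br")
builder.text("World")
builder.end_tag("br")  # becomes a second <br> element
builder.text("!")
builder.end_tag("p")
```

Because end_tag here receives only a name, any attributes on the stray end tag are dropped by construction, which matches the "drop the attributes from the token" part of the rule.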