Empty tag attributes are not parsed correctly
Closed this issue · 4 comments
1player commented
iex(4)> Floki.parse_document("<a href></a>")
{:ok, [{"a", [{"href", "href"}], []}]}
Floki interprets this example as if it were <a href="href">, which is of course wrong. I would expect Floki either to represent the empty attribute as an empty string or to omit it altogether.
1player commented
This does not seem to affect fast_html.
philss commented
This is a limitation of the default parser, mochiweb_html. Please try to use FastHTML or HTML5ever, as the README suggests.
1player commented
That's worth documenting, rather than closing this bug as "completed", no? What's the point of shipping with a broken parser?
philss commented
@1player sorry, I didn't want to sound rude. I meant to point out that this is documented in our README: https://github.com/philss/floki?tab=readme-ov-file#alternative-html-parsers
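For anyone landing here with the same problem, a minimal sketch of switching parsers, following the README section linked above. It assumes the fast_html package has already been added to the deps in mix.exs (the version requirement is left to you), and that the `html_parser` option and the `Floki.HTMLParser.FastHtml` module behave as the README describes. No output is shown because it was not verified here.

```elixir
# Option 1: in config/config.exs, use FastHtml for every Floki call.
# Assumption: the fast_html dependency is already listed in mix.exs.
import Config

config :floki, :html_parser, Floki.HTMLParser.FastHtml

# Option 2: in application code, choose the parser for a single call
# and leave the global default (mochiweb_html) untouched.
{:ok, document} =
  Floki.parse_document("<a href></a>", html_parser: Floki.HTMLParser.FastHtml)

# Inspect how the empty attribute is represented by this parser.
document
|> Floki.attribute("a", "href")
|> IO.inspect(label: "href values")
```

The same two options should work with `Floki.HTMLParser.Html5ever` if the html5ever package is installed instead. The per-call form is handy for comparing parsers side by side on the input from this issue.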