Edge case: does not decode example string on w3 spec

Question

youming-lin opened this issue 8 years ago · 4 comments

I was testing encode/decode via https://mothereff.in/html-entities while cross-referencing the spec, and I noticed that he does not decode certain named references correctly. The w3 spec page lists the example string "I'm &notit; I tell you", which should be parsed into "I'm ¬it; I tell you" with a parse error. he returns the string un-parsed. It appears that he fails to parse a legacy named reference when it is followed by one or more alphanumeric characters and then a semicolon (;). he parses it correctly if the run of alphanumeric characters ends with any character other than a semicolon.
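For reference, the behaviour the spec example describes can be sketched roughly as follows. This is not he's implementation: the LEGACY table is a hypothetical, tiny subset of the names that may appear without a semicolon, decodeLegacyInText is an illustrative name, and the sketch ignores the full named reference table (so a longer name such as &notin; would need that table to win the match) as well as the special rules for attribute values.

```javascript
// Hypothetical, deliberately tiny subset of the legacy named references
// that HTML allows without a trailing semicolon (the real list is longer).
const LEGACY = { amp: '&', lt: '<', gt: '>', not: '\u00AC', copy: '\u00A9' };

// Try longer names first so a short name never shadows a longer one.
const NAMES = Object.keys(LEGACY).sort((a, b) => b.length - a.length);

// Decode legacy references in text content (not in attribute values).
function decodeLegacyInText(input) {
  let out = '';
  let i = 0;
  while (i < input.length) {
    if (input[i] === '&') {
      const name = NAMES.find((n) => input.startsWith(n, i + 1));
      if (name !== undefined) {
        out += LEGACY[name];
        i += 1 + name.length;
        // The semicolon is optional for legacy names; consume it if present.
        if (input[i] === ';') i += 1;
        continue;
      }
    }
    out += input[i];
    i += 1;
  }
  return out;
}

console.log(decodeLegacyInText("I'm &notit; I tell you")); // → I'm ¬it; I tell you
```

The point of the sketch is that the match stops at the end of the legacy name (&not), so the trailing "it;" is left in the output as ordinary text instead of causing the whole reference to be rejected.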

Answer 1 · 2016-10-08T15:14:38.000Z

Good catch! Thanks for the excellent bug report.

Answer 2 · 2017-07-21T18:08:47.000Z

Got bitten by this too, but I can't figure out how to fix it in he...

Answer 3 · 2019-09-04T22:32:38.000Z

Surely this has been fixed by now...

Answer 4 · 2019-10-02T03:26:06.000Z

The character with code 128, which lies just past the end of the ASCII range (0-127) and prints as a small square with alert(String.fromCharCode(128));, is not being encoded, while the next character, 129, is encoded as .
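A small sketch for inspecting the two characters mentioned here. The helper toHexReference is a hypothetical name, and it only shows the generic hexadecimal numeric character reference form; it is not necessarily the form he emits.

```javascript
// Character codes 128 and 129 are U+0080 and U+0081: C1 control characters,
// both outside the ASCII range, which ends at 127 (0x7F).
const c128 = String.fromCharCode(128);
const c129 = String.fromCharCode(129);

// True only for code points inside the 7-bit ASCII range.
function isAscii(ch) {
  return ch.codePointAt(0) <= 0x7f;
}

// Hypothetical helper: build a hexadecimal numeric character reference.
function toHexReference(ch) {
  return '&#x' + ch.codePointAt(0).toString(16).toUpperCase() + ';';
}

console.log(isAscii(c128), isAscii(c129)); // → false false
console.log(toHexReference(c128));         // → &#x80;
console.log(toHexReference(c129));         // → &#x81;
```

Since both characters sit in the same C1 control block, an encoder would be expected to treat them consistently.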