Consider escaping lone surrogates
mathiasbynens opened this issue · 5 comments
Lone surrogates are not valid in UTF-16 or UTF-8, and can be (and have been) used to break parsers that expect well-formed input. To protect against this, just escape them in the output.
- Explanation: https://speakerdeck.com/mathiasbynens/hacking-with-unicode-in-2016?slide=106 (there’s a video recording too)
- Related ECMAScript proposal: https://github.com/tc39/proposal-well-formed-stringify
FWIW, I worked on https://github.com/mathiasbynens/jsesc which shares devalue’s security goals (although it does not compete with devalue, as it doesn’t aim to support cycles).
Thanks! I opened #17, but I have to confess I don't really know what I'm doing. Does it look like a reasonable solution? Essentially it replaces `JSON.stringify` with a `stringifyString` function that behaves equivalently (in theory) except for the handling of lone surrogates.
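For readers following along, here is a minimal sketch of the idea rather than the code from #17. The name `stringifyWellFormed` is hypothetical. It assumes that handing every non-surrogate character to `JSON.stringify` is acceptable, and it emits lowercase hex digits for the escaped surrogates.

```javascript
// Hypothetical helper for illustration; not devalue's actual stringifyString.
// Serializes a string like JSON.stringify, but escapes lone surrogates.
function stringifyWellFormed(str) {
  let result = '"';
  for (let i = 0; i < str.length; i += 1) {
    const code = str.charCodeAt(i);
    if (code >= 0xd800 && code <= 0xdbff) {
      // High surrogate: valid only when a low surrogate follows.
      const next = str.charCodeAt(i + 1); // NaN at end of string
      if (next >= 0xdc00 && next <= 0xdfff) {
        // Valid pair: copy both code units through unchanged.
        result += str[i] + str[i + 1];
        i += 1;
      } else {
        // Lone high surrogate: emit an escape sequence instead.
        result += '\\u' + code.toString(16);
      }
    } else if (code >= 0xdc00 && code <= 0xdfff) {
      // Low surrogate not preceded by a high surrogate: escape it.
      result += '\\u' + code.toString(16);
    } else {
      // Everything else: let JSON.stringify choose the escape,
      // then strip the surrounding quotes it adds.
      result += JSON.stringify(str[i]).slice(1, -1);
    }
  }
  return result + '"';
}

console.log(stringifyWellFormed('a\uD800b')); // "a\ud800b"
```

The output contains only ASCII escapes where the input had a lone surrogate, so it can be written out as UTF-8 safely, and `JSON.parse` still round-trips it back to the original string.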
Note that with https://github.com/tc39/proposal-well-formed-stringify, `JSON.stringify` should behave equivalently (modulo casing for hex digits in escape sequences). V8 v7.2.10 / Chrome 72 implements this.
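A quick way to check whether a given engine already has this behavior is to stringify a lone surrogate and inspect the result. This is only a sketch; the function name `hasWellFormedStringify` is made up for illustration.

```javascript
// Hypothetical feature check; the function name is an assumption.
// Engines implementing the well-formed JSON.stringify proposal return
// the six-character escape \ud800 (lowercase hex) between the quotes,
// while older engines return the raw lone surrogate.
function hasWellFormedStringify() {
  return JSON.stringify('\uD800') === '"\\ud800"';
}

console.log(hasWellFormedStringify());
```

Either way the result parses back to the same string; the difference is only whether the serialized text itself is well-formed Unicode.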
Closing now that #17 is merged. Cheers, Rich!
Would jsesc make sense in this list? https://github.com/Rich-Harris/devalue#see-also
yeah! added it