Unicode escape write
peroket opened this issue · 5 comments
When serialising a string that contains unicode values, those values should be escaped.
For example, if I write this in C++:
auto result = glz::write_json("\x1f");
I would expect the resulting string to contain the escape "\u001f". Glaze does not seem to do any conversion. Is there a way to get conforming output that I missed?
Thanks for bringing this up. There are a few open issues in glaze about improved unicode handling, so this is certainly necessary, but I'm not sure when full support will be added. I will try to add support sooner rather than later, and I would be happy to accept a pull request.
Forcing escape checks and conversions on everyone I think is a bad idea, because it results in a significant performance loss when writing strings.
I think this should be opt-in, with a global compile-time value and a wrapper for use on specific strings.
I tend to agree with this discussion that escaping these characters is a fault in the specification: https://softwaremaniacs.org/blog/2015/03/22/json-encoding-problem/en/
However, I still think it must be supported, but opt-in, as described above. It is best to conform to the I-JSON RFC by default.
Per the specification, the only characters that need a \uXXXX escape are the control characters (values less than 32) that do not have shorthand escapes.
The original example, "\x1f", would need to be written as a JSON unicode escape ("\u001f").
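To make the rule concrete, here is a minimal sketch of the escaping described above. It is not glaze's implementation or API: `escape_json_string` is a hypothetical standalone helper, and it assumes the input is already valid UTF-8, so bytes of 0x20 and above pass through unchanged.

```cpp
#include <cstdio>
#include <string>
#include <string_view>

// Hypothetical helper, not part of glaze: writes `in` as a quoted JSON string.
// Quotes, backslashes, and control characters below 0x20 are escaped;
// everything else is copied through unchanged.
std::string escape_json_string(std::string_view in) {
    std::string out;
    out.reserve(in.size() + 2);
    out.push_back('"');
    for (unsigned char c : in) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            // Control characters that have a shorthand escape.
            case '\b': out += "\\b"; break;
            case '\f': out += "\\f"; break;
            case '\n': out += "\\n"; break;
            case '\r': out += "\\r"; break;
            case '\t': out += "\\t"; break;
            default:
                if (c < 0x20) {
                    // No shorthand exists, so emit a six-character \u00XX escape.
                    char buf[7];
                    std::snprintf(buf, sizeof buf, "\\u%04x", static_cast<unsigned>(c));
                    out += buf;
                } else {
                    out.push_back(static_cast<char>(c));
                }
        }
    }
    out.push_back('"');
    return out;
}
```

With this helper, `escape_json_string("\x1f")` yields the eight characters `"\u001f"` including the surrounding quotes. The per-character branch in the loop is the kind of check that costs performance when applied to every string.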
My plan is to add a compile-time option and wrappers to support this behavior, but not to enable it by default. A major reason is that escaped control values allow embedding null characters into strings, which can create ugly bugs when working with algorithms that rely on null termination.
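As a small illustration of the null-termination concern, the sketch below uses two hypothetical functions (neither is part of glaze). The first stands in for the result of decoding the JSON text "a\u0000b", and the second measures that result the way a null-terminated C API would.

```cpp
#include <cstring>
#include <string>

// Stand-in for what a parser would hold after decoding the JSON text
// "a\u0000b": three bytes with a null byte in the middle.
std::string decoded_example() {
    return std::string("a\0b", 3);
}

// Length as seen by null-terminated C APIs, which stop at the first null byte.
std::size_t c_string_length(const std::string& s) {
    return std::strlen(s.c_str());
}
```

Here `decoded_example().size()` is 3, while `c_string_length(decoded_example())` is 1: any code that passes the decoded value through a C-string interface silently loses everything after the embedded null.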
Escaping these control characters in general would also cause a major performance hit, even though they cannot be displayed as text.
However, this is a valid corner of JSON that will be supported. It will just be opt-in.
I think the current behaviour is a good choice. But it would be great if it could be mentioned under https://github.com/stephenberry/glaze/tree/main?tab=readme-ov-file#json-conformance ? Perhaps with a link to this ticket.
Thanks for the suggestion, I mentioned this under json-conformance and linked this ticket.