Inconsistency in Unicode noncharacter tests

Question

Closed this issue 8 years ago · 2 comments

These two tests seem inconsistent: both test the handling of a noncharacter, but one has an i_ prefix and the other has a y_ prefix.

$ cat y_string_escaped_noncharacter.json | xxd
0000000: 5b22 5c75 4646 4646 225d                 ["\uFFFF"]
$ cat i_string_unicode_U+FFFE_nonchar.json | xxd
0000000: 5b22 5c75 4646 4645 225d                 ["\uFFFE"]

I think they should both be implementation-defined FWIW.

Answer 1 · 2016-11-01T11:32:47.000Z

Unicode Standard v9, section 23.7, "Noncharacters":

"they are not illegal in interchange, nor does their presence cause Unicode text to be ill-formed"

Normally you would only use noncharacters internally, but passing around JSON internally is fine.

RFC 7159, section 7, is also fairly clear that any escape up to \uFFFF is fine, and that U+FFFF and U+FFFE are both valid unescaped as well.

Escaped or not, all the noncharacter tests should be y_ named.
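
As a rough illustration of what accepting these inputs looks like, here is a minimal sketch using Python's standard json module. It only shows how one parser behaves and is not evidence about the RFC itself. The helper names is_noncharacter and parse_single_string are made up for this example; the byte strings are copied from the xxd dumps in the question.

```python
import json

# Byte contents of the two test files, copied from the xxd dumps in the question.
ESCAPED_FFFF = b'["\\uFFFF"]'
ESCAPED_FFFE = b'["\\uFFFE"]'


def is_noncharacter(code_point: int) -> bool:
    """Return True for a Unicode noncharacter (hypothetical helper)."""
    # U+FDD0..U+FDEF, plus the last two code points of every plane.
    return 0xFDD0 <= code_point <= 0xFDEF or (code_point & 0xFFFE) == 0xFFFE


def parse_single_string(document) -> str:
    """Parse a JSON array holding exactly one string and return that string."""
    (value,) = json.loads(document)
    return value


# Escaped form, exactly as stored in the test files.
escaped_ffff = parse_single_string(ESCAPED_FFFF)
escaped_fffe = parse_single_string(ESCAPED_FFFE)

# Unescaped form: the noncharacter appears literally inside the JSON string.
raw_ffff = parse_single_string('["\uFFFF"]')
raw_fffe = parse_single_string('["\uFFFE"]')

for label, text in [("escaped U+FFFF", escaped_ffff), ("escaped U+FFFE", escaped_fffe),
                    ("raw U+FFFF", raw_ffff), ("raw U+FFFE", raw_fffe)]:
    print(label, hex(ord(text)), is_noncharacter(ord(text)))
```

With this parser all four inputs parse to a one-character string holding the noncharacter, which is the behaviour a y_ test would expect.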

Answer 2 · 2016-11-01T16:10:25.000Z

Escaped or not, all the noncharacter tests should be y_ named.

Ok, agreed.