pyfisch/cbor

Deserialisation of maps does not parse string keys to integers

Timmmm opened this issue · 0 comments

It's not exactly clear what the correct behaviour is here, but I feel like CBOR is mostly used as a compact version of JSON. So I think you should be able to write JSON using serde_json, convert it to CBOR, read it back in using serde_cbor, and get the same result.

This doesn't work for a type like HashMap<u32, String>. The only key type that JSON supports is String, so when serde_json writes or reads that type it converts the integer keys to or from strings (the code that does this is MapKey). serde_cbor has no equivalent code, presumably because CBOR itself supports integer keys.
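To make the mismatch concrete, here is a minimal std-only sketch of how the two kinds of key look on the wire. It hand-encodes CBOR items following the major-type rules of the CBOR spec and does not use serde_cbor; the helper names (cbor_head, cbor_uint, cbor_text, cbor_single_entry_map) are made up for illustration.

```rust
/// Encode a CBOR item head: a 3-bit major type plus an unsigned argument.
/// Hypothetical helper, not part of serde_cbor.
fn cbor_head(major: u8, arg: u64) -> Vec<u8> {
    let m = major << 5;
    if arg < 24 {
        // Small arguments are packed into the initial byte.
        vec![m | arg as u8]
    } else if arg <= u8::MAX as u64 {
        vec![m | 24, arg as u8]
    } else if arg <= u16::MAX as u64 {
        let mut v = vec![m | 25];
        v.extend_from_slice(&(arg as u16).to_be_bytes());
        v
    } else if arg <= u32::MAX as u64 {
        let mut v = vec![m | 26];
        v.extend_from_slice(&(arg as u32).to_be_bytes());
        v
    } else {
        let mut v = vec![m | 27];
        v.extend_from_slice(&arg.to_be_bytes());
        v
    }
}

/// Unsigned integer (major type 0).
fn cbor_uint(n: u64) -> Vec<u8> {
    cbor_head(0, n)
}

/// UTF-8 text string (major type 3): length head followed by the bytes.
fn cbor_text(s: &str) -> Vec<u8> {
    let mut v = cbor_head(3, s.len() as u64);
    v.extend_from_slice(s.as_bytes());
    v
}

/// Map (major type 5) holding exactly one already-encoded key/value pair.
fn cbor_single_entry_map(key: &[u8], value: &[u8]) -> Vec<u8> {
    let mut v = cbor_head(5, 1);
    v.extend_from_slice(key);
    v.extend_from_slice(value);
    v
}
```

With these helpers, a map from the integer 1 to "a" is a1 01 61 61, while a map from the string "1" to "a" is a1 61 31 61 61. A JSON-to-CBOR conversion produces the second form, and a decoder expecting u32 keys sees a text string where it wants an integer.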

The CBOR authors did consider this issue. The spec has this to say about map keys:

The encoding and decoding applications need to agree on what types of
keys are going to be used in maps. In applications that need to
interwork with JSON-based applications, keys probably should be
limited to UTF-8 strings only; otherwise, there has to be a specified
mapping from the other CBOR types to Unicode characters, and this
often leads to implementation errors. In applications where keys are
numeric in nature and numeric ordering of keys is important to the
application, directly using the numbers for the keys is useful.

If multiple types of keys are to be used, consideration should be
given to how these types would be represented in the specific
programming environments that are to be used. For example, in
JavaScript objects, a key of integer 1 cannot be distinguished from a
key of string "1". This means that, if integer keys are used, the
simultaneous use of string keys that look like numbers needs to be
avoided. Again, this leads to the conclusion that keys should be of
a single CBOR type.
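As a rough sketch of what the requested behaviour could look like, and of the ambiguity the spec warns about, the following std-only code normalises a mix of integer and string keys into u32 keys. It is not how serde_cbor works today; Key, KeyError and normalise_keys are hypothetical names used only for this example.

```rust
use std::collections::HashMap;

/// A decoded map key that is either a CBOR integer or a CBOR text string.
/// Hypothetical type for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum Key {
    Int(u32),
    Text(String),
}

#[derive(Debug, PartialEq)]
enum KeyError {
    /// A text key that does not parse as a u32.
    NotAnInteger(String),
    /// Two keys that collapse to the same integer, e.g. 1 and "1".
    Duplicate(u32),
}

/// Convert every key to u32, parsing text keys the way a JSON-style
/// map-key deserialiser would, and reject keys that collide afterwards.
fn normalise_keys(entries: Vec<(Key, String)>) -> Result<HashMap<u32, String>, KeyError> {
    let mut out = HashMap::new();
    for (key, value) in entries {
        let n = match key {
            Key::Int(n) => n,
            // Note: str::parse also accepts forms such as "01" and "+1",
            // so distinct text keys can map to the same integer.
            Key::Text(s) => match s.parse::<u32>() {
                Ok(n) => n,
                Err(_) => return Err(KeyError::NotAnInteger(s)),
            },
        };
        if out.insert(n, value).is_some() {
            return Err(KeyError::Duplicate(n));
        }
    }
    Ok(out)
}
```

A map containing both the integer key 1 and the string key "1" is rejected with Duplicate(1), which is the same collision the spec describes for JavaScript objects. Whether a decoder should fail in that case or let the last entry win is a design choice this sketch does not settle.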