Check and enforce string encoding
robfromboulder opened this issue · 3 comments
Currently KVTree.get assumes that all persisted strings are UTF-8, leaving several obvious gaps:
- Cases where libraries or applications use other string encodings
- KVTree.put does not validate the encoding of incoming strings
- Currently missing configuration for what encoding KVTree should require
Should follow same guidelines as pmem/pmemkv-ruby#3
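To make the gaps above concrete, here is a rough sketch of what a configurable, validating wrapper could look like. This is only an illustration under assumptions: EncodingCheckedStore, toBytes and the Map standing in for the persistent store are all hypothetical names, not part of the pmemkv bindings. It relies only on Node's built-in Buffer.

```javascript
// Hypothetical sketch: EncodingCheckedStore is NOT part of pmemkv.
// A Map of Buffers stands in for the persistent store.
class EncodingCheckedStore {
  // `encoding` is the single encoding this store requires (assumed to be configurable)
  constructor(encoding = 'utf8') {
    if (!Buffer.isEncoding(encoding)) {
      throw new Error(`unsupported encoding: ${encoding}`);
    }
    this.encoding = encoding;
    this.data = new Map();
  }

  // Encode a string, rejecting it if the bytes do not decode back to the same string.
  // This catches lone surrogates under utf8 and unrepresentable characters under latin1.
  toBytes(str) {
    if (typeof str !== 'string') {
      throw new TypeError('expected a string');
    }
    const bytes = Buffer.from(str, this.encoding);
    if (bytes.toString(this.encoding) !== str) {
      throw new Error(`string is not representable in ${this.encoding}`);
    }
    return bytes;
  }

  put(key, value) {
    // Buffers are compared by identity as Map keys, so index by the hex form of the key bytes
    this.data.set(this.toBytes(key).toString('hex'), this.toBytes(value));
  }

  get(key) {
    const bytes = this.data.get(this.toBytes(key).toString('hex'));
    // Decode with the configured encoding instead of assuming UTF-8
    return bytes === undefined ? undefined : bytes.toString(this.encoding);
  }
}
```

With this shape, a store created with 'latin1' would accept 'é' but reject '€', and a store created with 'utf8' would reject a string containing an unpaired surrogate. The round-trip comparison is a simple way to validate, not necessarily the cheapest one.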
Not sure if we should handle the string encoding at all. Perhaps KVTree should not care about the encoding and treat the keys as binary data (not strings)?
Hi @krzycz, the issue is that the JS bindings are based on JS strings, which are binary-safe, but still require use of the correct encoding when converting between raw bytes and interned string objects. Most of the time (especially ES6) these JS strings will be UTF-8 encoded, but they don't always have to be, and we probably have to handle those other cases gracefully. (BTW, this is the same issue for Ruby and Java bindings for converting between raw bytes and real String objects.)
Whether or not we allow binary-safe keys at the pmemkv layer is a slightly different issue -- JS strings are binary-safe regardless of what pmemkv does.
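As a quick illustration of the bytes-to-string conversion problem, here is a small sketch using only Node's built-in Buffer and TextDecoder, nothing pmemkv-specific. The helper names decodeStrict and isValidUtf8 are made up for this example.

```javascript
// The same four bytes, read back under two different assumed encodings.
const raw = Buffer.from([0x63, 0x61, 0x66, 0xe9]); // "café" encoded as Latin-1

const asLatin1 = raw.toString('latin1'); // decoded with the encoding it was written in
const asUtf8 = raw.toString('utf8');     // wrong guess: the trailing 0xE9 is replaced by U+FFFD

// Strict decoding: with fatal: true, TextDecoder throws on invalid input
// instead of silently substituting U+FFFD.
function decodeStrict(bytes, encoding = 'utf-8') {
  return new TextDecoder(encoding, { fatal: true }).decode(bytes);
}

// Returns true only if the bytes form a valid UTF-8 sequence.
function isValidUtf8(bytes) {
  try {
    decodeStrict(bytes);
    return true;
  } catch {
    return false;
  }
}
```

A lenient decode silently changes the data, so a get that assumes UTF-8 would hand back a different string than the one that was stored. A strict decode at least makes the mismatch visible.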
Anyway, hope this helps better frame the problem!
RobD
Closing, now obsolete