Codecs/bytes->str does not produce correct utf8 strings according to PostgreSQL
Rovanion opened this issue · 5 comments
Passing the output of
(buddy.core.codecs/bytes->str (nonce/random-nonce 16))
to a field of type text in PostgreSQL results in the following error:
org.postgresql.util.PSQLException: ERROR: invalid byte sequence for encoding "UTF8": 0x00
Using bytes->hex works, but that seems like a non-optimal way to go about solving the issue.
It seems like every run of (buddy.core.codecs/bytes->hex (nonce/random-nonce 16))
starts with the same sequence 000001549b, or, as Emacs displays the output of (buddy.core.codecs/bytes->str (nonce/random-nonce 16)),
"^@^@^A".
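One plausible explanation for the constant prefix, assuming random-nonce begins its output with the current time in milliseconds as an 8-byte big-endian long (an assumption about the library, not something stated in this thread): any present-day timestamp encoded that way starts with zero bytes. The sketch below uses only JDK classes; long->hex and the sample timestamp are hypothetical and chosen for illustration.

```clojure
(import '(java.nio ByteBuffer))

(defn long->hex
  "Hex string of the 8-byte big-endian encoding of n (illustrative helper)."
  [n]
  (let [bs (.array (.putLong (ByteBuffer/allocate 8) (long n)))]
    ;; mask each signed byte to 0..255 before formatting it as two hex digits
    (apply str (map #(format "%02x" (bit-and % 0xff)) bs))))

;; A millisecond timestamp from May 2016 (hypothetical sample value).
(long->hex 1462889349120)
;; => "000001549b000000"
```

The leading 00 00 01 bytes are what Emacs shows as ^@^@^A, and the 0x00 byte is the one PostgreSQL rejects in a text field.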
What are you trying to do? The bytes->str
function just converts a byte array containing UTF-8 encoded text into a readable string. But random-nonce
does not generate anything printable. It generates a cryptographic nonce. If you want to store it as a string, you need to use hex, base64, or some other encoding.
bytes->str
is not a magical function that converts an arbitrary byte array into a "printable" string. It just performs the reverse conversion of to-bytes
(which takes a string and returns an array of bytes (octets) encoded in UTF-8).
If you want to store a nonce in PostgreSQL you have two options: use a bytea
field and just store the byte array, or encode the byte array to some text representation (e.g. hex or base64).
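As a rough sketch of the second option, the snippet below encodes random bytes as base64 text and decodes them back. It deliberately uses only JDK classes (SecureRandom, java.util.Base64) rather than buddy's own helpers, and the names secure-bytes, bytes->base64, and base64->bytes are hypothetical.

```clojure
(import '(java.security SecureRandom)
        '(java.util Arrays Base64))

(defn secure-bytes
  "Returns n bytes from a cryptographically secure random source."
  [n]
  (let [bs (byte-array n)]
    (.nextBytes (SecureRandom.) bs)
    bs))

(defn bytes->base64
  "Encodes a byte array as ASCII-only base64 text, safe for a text column."
  [^bytes bs]
  (.encodeToString (Base64/getEncoder) bs))

(defn base64->bytes
  "Decodes base64 text back into the original byte array."
  [^String s]
  (.decode (Base64/getDecoder) s))

;; The text form round-trips to exactly the same bytes.
(let [salt (secure-bytes 16)
      text (bytes->base64 salt)]
  (Arrays/equals ^bytes salt ^bytes (base64->bytes text)))
;; => true
```

Base64 output contains only printable ASCII characters, so it can never include the 0x00 byte that triggered the error above.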
What I wanted to do is to store a salt, which as I understood from the docs [0] should be generated by nonce, in a field next to the hashed password. And I read "Converts byte array to string using UTF8 encoding" to mean that it produced a correct UTF8-string.
[0] https://funcool.github.io/buddy-core/latest/#nonces-and-salts
First, salts and nonces are different: if you need a salt, please use random-bytes;
if you need a nonce, use random-nonce.
About encoding: UTF-8
is one of the Unicode encoding formats that allow storing Unicode strings (Java's String) as bytes. A UTF-8 byte string (byte[]) can be converted to a Unicode string (a Java String instance) only if the byte[] contains a string properly encoded with the UTF-8
encoding. In this case, the random nonce does not contain a properly encoded UTF-8 byte string; it contains fully random data. I see that it is a little bit confusing, but this is how strings work.
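To make the point concrete, here is a minimal sketch of a strict UTF-8 check built on the JDK's CharsetDecoder, which reports malformed input instead of silently substituting replacement characters. The function name strict-utf8 is hypothetical and not part of buddy.

```clojure
(import '(java.nio ByteBuffer)
        '(java.nio.charset CharacterCodingException StandardCharsets))

(defn strict-utf8
  "Returns the decoded string, or nil when bs is not well-formed UTF-8."
  [^bytes bs]
  (try
    ;; a fresh decoder reports malformed input by throwing
    (str (.decode (.newDecoder StandardCharsets/UTF_8) (ByteBuffer/wrap bs)))
    (catch CharacterCodingException _ nil)))

;; Bytes that came from a real string decode fine.
(strict-utf8 (.getBytes "hello" "UTF-8"))
;; => "hello"

;; 0xff can never appear in well-formed UTF-8, so decoding fails.
(strict-utf8 (byte-array [(unchecked-byte 0xff) (unchecked-byte 0xfe)]))
;; => nil

;; A single 0x00 byte is well-formed UTF-8 (it decodes to U+0000),
;; but PostgreSQL does not accept that character in text values.
(= (str (char 0)) (strict-utf8 (byte-array 1)))
;; => true
```

The last case matches the error in the original report: the offending byte was 0x00, which a UTF-8 decoder accepts but a PostgreSQL text field does not.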
I'm terribly sorry for posting what essentially is a support issue caused by my poor understanding on your issue tracker. Thank you for your time!