funcool/buddy-core

Codecs/bytes->str does not produce correct utf8 strings according to PostgreSQL

Rovanion opened this issue · 5 comments

Passing the output of

(buddy.core.codecs/bytes->str (nonce/random-nonce 16))

to a field of type text in PostgreSQL results in the following error:

org.postgresql.util.PSQLException: ERROR: invalid byte sequence for encoding "UTF8": 0x00

Using bytes->hex works, but that seems like a non-optimal way to go about solving the issue.

It seems like every run of (buddy.core.codecs/bytes->hex (nonce/random-nonce 16)) starts with the same sequence 000001549b, or as emacs interprets (buddy.core.codecs/bytes->str (nonce/random-nonce 16)) "^@^@^A".

What are you trying to do? The bytes->str function just converts a byte array of UTF-8 encoded octets to a readable string. But random-nonce does not generate anything printable; it generates a cryptographic nonce. If you want to store it as a string, you need to use hex, base64, or some other encoding.

bytes->str is not a magical function that converts any byte array to a "printable" string. It just performs the reverse conversion of to-bytes (which takes a string and returns an array of bytes (octets) encoded in UTF-8).

If you want to store a nonce in PostgreSQL you have two options: use a bytea field and just store the byte array, or encode that byte array to some text representation (e.g. hex or base64).
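The second option can be sketched with plain JDK interop, so it runs without buddy on the classpath. The helper names below (bytes->hex-str, bytes->b64-str, b64-str->bytes) are hypothetical and only for illustration; they are not buddy-core's API, and the sample bytes are simply the 000001549b prefix quoted earlier in this thread.

```clojure
(import '(java.util Arrays Base64))

(defn bytes->hex-str
  "Hex-encodes a byte array (hypothetical helper, not buddy-core's bytes->hex)."
  [^bytes bs]
  ;; bit-and with 0xff maps signed JVM bytes (-128..127) to 0..255
  (apply str (map #(format "%02x" (bit-and % 0xff)) bs)))

(defn bytes->b64-str
  "Base64-encodes a byte array into a String that is safe for a text column."
  [^bytes bs]
  (.encodeToString (Base64/getEncoder) bs))

(defn b64-str->bytes
  "Decodes a Base64 String back into the original byte array."
  [^String s]
  (.decode (Base64/getDecoder) s))

;; Sample input: the bytes 00 00 01 54 9b (0x9b is -101 as a signed byte).
(def sample (byte-array [0 0 1 84 -101]))

(bytes->hex-str sample) ; => "000001549b"
(Arrays/equals ^bytes sample ^bytes (b64-str->bytes (bytes->b64-str sample))) ; => true
```

Either encoding round-trips losslessly, which is the property a stored salt or nonce needs. Hex doubles the length, while Base64 grows it by about a third.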

What I wanted to do is to store a salt, which as I understood from the docs [0] should be generated by nonce, in a field next to the hashed password. And I read "Converts byte array to string using UTF8 encoding" to mean that it produced a correct UTF-8 string.

[0] https://funcool.github.io/buddy-core/latest/#nonces-and-salts

First, salts and nonces are different: if you need a salt, please use random-bytes; if you need a nonce, use random-nonce.

About encoding: UTF-8 is one of the Unicode encoding formats that allow storing Unicode strings (Java's String) as bytes. A UTF-8 byte string (byte[]) can be converted to a Unicode string (a Java String instance) only if the byte[] contains a string properly encoded with UTF-8. In this case, the random nonce does not contain a properly encoded UTF-8 byte string; it contains fully random data. I see that it is a little bit confusing, but this is how strings work.
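A minimal sketch of this point, using only JDK classes: Java's String constructor decodes leniently and substitutes U+FFFD for malformed input, so arbitrary bytes do not survive a round trip, whereas a CharsetDecoder can be used to reject them. The function names (lenient-decode, strict-decode, valid-utf8?) are hypothetical, and whether bytes->str is implemented exactly like lenient-decode is an assumption. Separately, the leading 0x00 bytes seen in this issue decode as valid UTF-8 (the NUL character), but PostgreSQL text values cannot contain NUL, which would be consistent with the error reported above.

```clojure
(import '(java.nio ByteBuffer)
        '(java.nio.charset CharacterCodingException StandardCharsets)
        '(java.util Arrays))

(defn lenient-decode
  "Decodes bytes as UTF-8; malformed sequences become U+FFFD instead of failing."
  [^bytes bs]
  (String. bs StandardCharsets/UTF_8))

(defn strict-decode
  "Decodes bytes as UTF-8; throws CharacterCodingException on malformed input
  (a fresh CharsetDecoder reports errors by default)."
  [^bytes bs]
  (str (.decode (.newDecoder StandardCharsets/UTF_8) (ByteBuffer/wrap bs))))

(defn valid-utf8?
  "True when bs is a well-formed UTF-8 byte sequence."
  [^bytes bs]
  (try (strict-decode bs) true
       (catch CharacterCodingException _ false)))

;; Properly encoded text survives bytes -> String -> bytes.
(def text-bytes (.getBytes "h\u00e9llo" StandardCharsets/UTF_8))
;; 0xFF and 0xFE never occur in well-formed UTF-8.
(def junk-bytes (byte-array [-1 -2 0 1]))

(defn round-trips?
  "True when decoding and re-encoding returns the same bytes."
  [^bytes bs]
  (Arrays/equals bs (.getBytes ^String (lenient-decode bs) StandardCharsets/UTF_8)))

(round-trips? text-bytes)       ; => true
(round-trips? junk-bytes)       ; => false, the 0xFF/0xFE bytes were replaced
(valid-utf8? junk-bytes)        ; => false
(valid-utf8? (byte-array [0 0 1])) ; => true, NUL is valid UTF-8
```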

I'm terribly sorry for posting on your issue tracker what is essentially a support issue caused by my poor understanding. Thank you for your time!