Question about using Strings

Question

jehugaleahsa opened this issue 2 years ago · 4 comments

Hello! This is mostly a question about your library.

I noticed that many of the methods either accept or return a String, such as addSalt(String). When you convert these back to byte[], you use UTF-8. When I take an existing salt, which is just a byte[], and convert it to a String using new String(bytes, StandardCharsets.UTF_8), I always get an error saying the salt is invalid. These salts were generated before I started using your library.

I am curious how UTF-8 works in this situation. Normally, byte values >127 are interpreted as part of a multi-byte sequence, meaning the character is composed of 2 or more bytes. Some byte sequences are not valid UTF-8, and some code point ranges are reserved. I am curious how UTF-8 decoding handles this situation and was hoping you knew offhand. Is it replacing those byte sequences with a default character (like the one that appears as a square symbol or question mark diamond), or is it keeping the byte sequences as-is?

Would it be possible to create overloads of addSalt taking a byte[] so I could use this library? In case you're curious, when I persisted my passwords and salts to a database, I used Base64 to encode the byte[] rather than UTF-8.

Answer 1 · 2022-08-18T16:13:37.000Z

Hi @jehugaleahsa ,

Which algorithm are you using? With bcrypt, for example, the salt is generated internally and cannot be specified.

Can you provide an example where you get the error?
Thank you 🙂

Answer 2 · 2022-08-19T12:48:09.000Z

Sorry. I probably didn't explain that very well. Currently, I am using patrickfav's implementation of bcrypt. To generate the salt, I am using Java's SecureRandom class. It works just fine. I wasn't sure if you were saying that your implementation always generates its own salt or if that's the nature of the bcrypt algorithm in general.

I answered my own question about the UTF-8 thing, btw. The unit test below takes one of my salts, encoded as Base64, and converts it back to a byte array. I convert it to a string using UTF-8 and back again, and confirmed the byte arrays do not match. So the Java UTF-8 decoder must replace malformed byte sequences with a replacement character, which is why the byte[] doesn't round-trip.

    @Test
    public void testEncoding() {
        byte[] bytes = Base64.getDecoder().decode("CBrGvzDkT1UFDkmPQ94pOQ==");
        String utf8 = StandardCharsets.UTF_8.decode(ByteBuffer.wrap(bytes)).toString();
        byte[] encoded = utf8.getBytes(StandardCharsets.UTF_8);
        Assertions.assertArrayEquals(bytes, encoded); // <--- this fails
    }
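For anyone who wants to see the behaviour in isolation, here is a minimal sketch using only the JDK. The class and method names (SaltRoundTrip, viaUtf8, viaBase64, isValidUtf8) are made up for illustration and are not part of Password4j or patrickfav's library. It compares a UTF-8 round-trip with a Base64 round-trip, and uses a strict decoder to report malformed input instead of silently replacing it.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper class, for illustration only.
class SaltRoundTrip {

    // Lossy for arbitrary bytes: new String(...) replaces malformed UTF-8
    // input with U+FFFD, so re-encoding yields different bytes.
    static byte[] viaUtf8(byte[] raw) {
        String text = new String(raw, StandardCharsets.UTF_8);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Lossless: Base64 maps any byte sequence to ASCII text and back.
    static byte[] viaBase64(byte[] raw) {
        String text = Base64.getEncoder().encodeToString(raw);
        return Base64.getDecoder().decode(text);
    }

    // A strict decoder throws on malformed input instead of replacing it.
    static boolean isValidUtf8(byte[] raw) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(raw));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] salt = Base64.getDecoder().decode("CBrGvzDkT1UFDkmPQ94pOQ==");
        System.out.println("valid UTF-8:       " + isValidUtf8(salt));
        System.out.println("UTF-8 round-trip:  " + Arrays.equals(salt, viaUtf8(salt)));
        System.out.println("Base64 round-trip: " + Arrays.equals(salt, viaBase64(salt)));
    }
}
```

Random salts are arbitrary bytes, so most of them will contain at least one sequence that is not well-formed UTF-8. That is why a text encoding such as Base64 or hex is the usual choice for storing them.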

Does your library avoid generating salts that would run into this issue? I am not going to be able to port to using password4j because I can't use the existing salts in my database.

Answer 3 · 2022-08-19T16:29:14.000Z

There are no inconsistencies between the two libraries, as you can see:

    String password = "MySuperSecurePassword";

    // hash with patrickfav's implementation
    String hash1 = BCrypt.withDefaults().hashToString(12, password.toCharArray());

    // hash with Password4j's implementation
    BcryptFunction f = BcryptFunction.getInstance(Bcrypt.A, 12);
    String hash2 = Password.hash(password).with(f).getResult();

    // check patrickfav's hash with Password4j
    Password.check(password, hash1).with(f); // true

    // check Password4j's hash with patrickfav's implementation
    BCrypt.verifyer().verify(password.toCharArray(), hash2).verified; // true

What I find weird is that you stored the salt in a separate column in your database. You should always save the hash in its original form, like $2a$12$dp7gYK/LoX1Cm4jpZXp56Onv7Bnf178GpZQVYKwaS4VZvZ0fSLcPu.
bcrypt (like scrypt and Argon2) embeds the salt in the hash string, but bcrypt has stricter rules:

- a 16-byte salt
- salt and ciphertext are encoded with a modified version of Base64

The last point is probably where you have the issue: you are using the standard Base64 decoder, while the salt was originally encoded with a modified one. I didn't check how patrickfav's library encodes the salt in BCrypt.Result, but I'm quite sure it's not using the modified Base64.
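To make the embedded salt concrete, here is a small sketch that splits a bcrypt hash string into its fields. It assumes the common 60-character layout: `$`, version, `$`, two-digit cost, `$`, then 22 characters of salt followed by 31 characters of hash. The class name BcryptHashParts is hypothetical and not taken from either library.

```java
// Hypothetical helper, for illustration only: splits a bcrypt hash string
// of the form $<version>$<cost>$<22-char salt><31-char hash>.
class BcryptHashParts {
    final String version;   // e.g. "2a"
    final int cost;         // log2 of the number of rounds, e.g. 12
    final String salt;      // 22 chars, bcrypt's modified Base64 of 16 bytes
    final String checksum;  // 31 chars, bcrypt's modified Base64 of the hash

    private BcryptHashParts(String version, int cost, String salt, String checksum) {
        this.version = version;
        this.cost = cost;
        this.salt = salt;
        this.checksum = checksum;
    }

    static BcryptHashParts parse(String hash) {
        // The string starts with '$', so the first field after splitting is empty.
        String[] fields = hash.split("\\$");
        if (fields.length != 4 || !fields[0].isEmpty() || fields[3].length() != 53) {
            throw new IllegalArgumentException("Not a bcrypt hash: " + hash);
        }
        return new BcryptHashParts(
                fields[1],
                Integer.parseInt(fields[2]),
                fields[3].substring(0, 22),
                fields[3].substring(22));
    }

    public static void main(String[] args) {
        BcryptHashParts p =
                parse("$2a$12$dp7gYK/LoX1Cm4jpZXp56Onv7Bnf178GpZQVYKwaS4VZvZ0fSLcPu");
        System.out.println("version = " + p.version);
        System.out.println("cost    = " + p.cost);
        System.out.println("salt    = " + p.salt);
        System.out.println("hash    = " + p.checksum);
    }
}
```

Since the salt travels inside the hash string, verification only needs the password and that one string. A separate salt column is redundant for bcrypt.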

The only solution I see is to implement the modified version of Base64 (see for example com.password4j.BcryptFunction#decodeBase64(String, int)). Even if Password4j allowed salts in the form of byte[] (which would be a good feature), you would still have to decode your salts from standard Base64 and encode them back with the modified Base64.
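A rough sketch of that conversion, using only the JDK: bcrypt's Base64 packs bits the same way as standard Base64 but uses the alphabet `./A-Za-z0-9` and no padding, so one way to convert is to encode without padding and translate character by character. The class and method names (BcryptSaltEncoding, toBcryptBase64, fromBcryptBase64) are assumptions made for this example, not an existing API in either library.

```java
import java.util.Base64;

// Hypothetical helper, for illustration only.
class BcryptSaltEncoding {
    private static final String STANDARD =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    private static final String BCRYPT =
            "./ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

    // Replaces each character with the one at the same index in the other alphabet.
    private static String translate(String text, String from, String to) {
        StringBuilder out = new StringBuilder(text.length());
        for (char c : text.toCharArray()) {
            int index = from.indexOf(c);
            if (index < 0) {
                throw new IllegalArgumentException("Unexpected character: " + c);
            }
            out.append(to.charAt(index));
        }
        return out.toString();
    }

    // 16 raw salt bytes -> 22 characters in bcrypt's alphabet.
    static String toBcryptBase64(byte[] rawSalt) {
        if (rawSalt.length != 16) {
            throw new IllegalArgumentException("bcrypt salts are 16 bytes");
        }
        String standard = Base64.getEncoder().withoutPadding().encodeToString(rawSalt);
        return translate(standard, STANDARD, BCRYPT);
    }

    // 22 characters in bcrypt's alphabet -> 16 raw salt bytes.
    static byte[] fromBcryptBase64(String bcryptSalt) {
        // The JDK's basic decoder does not require '=' padding.
        return Base64.getDecoder().decode(translate(bcryptSalt, BCRYPT, STANDARD));
    }

    public static void main(String[] args) {
        // Salt stored in the database as standard Base64.
        byte[] raw = Base64.getDecoder().decode("CBrGvzDkT1UFDkmPQ94pOQ==");
        System.out.println(toBcryptBase64(raw));
    }
}
```

The resulting 22-character string is what would sit between the cost and the hash in a `$2a$...` string, assuming the stored bytes are the same 16 bytes that were used when hashing.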

Answer 4 · 2022-08-19T17:38:30.000Z

I am going to close this issue. Thanks for the education, it helped. What I was missing was that bcrypt stores the salt inside the hash. The database was designed to handle various algorithms, so the salt was basically just duplicated.