Hash password strategy for overlong password and NUL bytes handling
Indigo744 opened this issue ยท 7 comments
Hello,
This is a followup on the overlong password strategy, but this time with the hash one.
I read this article about using bcrypt for password hashing.
It suggests to pass the password through SHA384 before using bcrypt in order to circumvent the bcrypt limitation.
Your API support this approach, albeit with a SHA512.
However, looking at the code, I couldn't find any mention of base64 encoding after the hash and prior bcrypt.
base64 is suggested in the article because:
Bcrypt truncates on NUL bytes.
And
There is a nontrivial chance that one of the raw bytes in the hash will be 0x00. The sooner this byte appears in the string, the cost of finding a collision becomes exponentially cheaper.
For example, both 1]W and @1$ produce a SHA-256 hash output that begins with ab00.
And
A base64-encoded hash is guaranteed to not contain NUL bytes
Here are some other implementations seen elsewhere:
- In bcrypt.net, they do base64 after SHA256, SHA384 or SHA512. Note the legacy hash without base64, as I was the one opening a similar issue on the repo
- In PasswordLock PHP library, they effectively perform base64 after sha384.
- In passlib Python library, they also perform base64 after sha256.
So my suggestion are:
- Offer other hash (SHA256 and SHA384) to have better interoperability
- Do the same as BCrypt.Net, move the current SHA512 long password strategy as legacy, and implements base64 after hash (and before bcrypt)
Note that this should be a breaking change, so user would have to choose which strategy to use (and why it can be problematic to continue with the old one)
Thanks.
So, first of all I have my reservations on the expertise of Paragon guy, I read couple of their articles while researching another project and had some issues with their suggestions - to my understanding they are, like me, not security experts.
That being said, what Paragon is suggesting is to basically create another hashing protocol, which is fine, but cannot be a default in this lib. The current framework already supports your request. A quick and dirty solution would be to implement LongPasswordStrategy
with:
@Override
public byte[] innerDerive(byte[] rawPassword) {
return Bytes.from(Bytes.wrap(rawPassword).hash("SHA-384").encodeBase64()).array();
}
Here are my issues with the suggested protocol:
- "Use SHA-384 because it is resistant to length extension" - this is irrelavant since we do not use it as signature, just as a compression function, also one would probably use HMAC, or a proper KDF, like HKDF to circumvent this issue.
- "use base64 since it does not have a null byte" - I think this is an issue with some PHP implementations, the engine I use (jBcrypt) just transforms given bytes and does not terminate on null bytes (apart from my 71 byte truncation)
Bottom line: you are, with the current implementation, free to implement the suggested protocol - but it's only feature would be interoperability with other implementations.
Ah, I'm happy to see that this lib is not affected by the NULL byte issue.
However, offering more hash option (with base64) would help interoperability with the other libs I mentionned, I think.
Thanks for your reply!
Sure, but baking propriety hash protocols in this lib doesn't really make sense, especially if the developer can do it him/herself. If you like, I can assist you replicating other bcrypt dialects.
To reiterate the base64-ing: This doesn't really make sense from a technical or security standpoint; encoding is for either making byte-safe representation or transfer possible - this sounds like a very dirty hack for poor bcrypt implementations (no offence to the respective authors), so I dont really feel supporting this first-party.
EDIT: the main difference in view is that most bcrypt implementations use char
as input, while the inner engine of this lib uses bytes
, thats why Base64 just doesnt make much sense.
But stepping back a second: overlong passwords a really super special case in password hashing. I would say having a 70+ char password is seldom (if ever) used - I view myself as security aware, but don't really see the use case of such a long user-password. Here is an example of statistical analysis of passwords, where length 23+ is hardly ever used: https://www.codelord.net/2011/06/18/statistics-of-62k-passwords/
Not that it isn't an issue with the protocol, but it'll mostly affect your tester or QA ppl.
Sure, but baking propriety hash protocols in this lib doesn't really make sense, especially if the developer can do it him/herself.
Alright, then I don't see the point of the whole password strategy thing? Just remove it altogether and let the dev handle its special flavor. At least it will be consistent.
Again, I'm not saying peppering (hash, base64, etc) is good or bad, or anything like that.
This thread started as an interrogation on how this lib handled NULL byte from hash. Then the question was raised about interoperability and consistence.
I don't need any of these things, actually (the project I need this lib for uses bcrypt only, and only need the verify part). I am just offering some insight, hoping to improve your lib for future users.
From my point of view, a good lib is a consistent lib that works well with others.
Offering SHA512 flavor only while ignoring how others are doing it (SHA384, base64, etc...) is not consistent in my book.
But stepping back a second: overlong passwords a really super special case in password hashing
Of course, but it's not the topic and hand-waving it will not make it go away.
If it can happen once, it will happen, hence this is a case that should be tackled properly.
Every person with a password manager can use an overly long password. I, for one, have some.
As a dev, I would be sad if a user registered through my PHP website couldn't log in my Android app because of bcrypt interoperability issue.
Maybe I hit the wrong tone, but my criticism was not against you, but the general issue with long passwords and homebrew dialects.
Alright, then I don't see the point of the whole password strategy thing?
The point was that I am aware that there are a bunch of dialects to handle this issues, so I gave an entrypoint to the developer to customise the intended behaviour. I didn't want to provide an exhaustive list of popular dialects. The standard implementations are merely examples, enough for anyone to overcome this issue, if a different approach is needed, it is possible to implement it.
I don't need any of these things, actually
Me neither, but you brought the point up, thats why we are discussing :) (I actually enjoy discussions about these topics)
From my point of view, a good lib is a consistent lib that works well with others.
Consistent would be to just truncate and not support any long passwords at all, since according to the original paper, no handling was defined for such a case. From that point of view hardly any of the implementations are "consistent". I'm all for providing the tools to make them interoperable, but to call it consistent is misleading I think.
Of course, but it's not the topic and hand-waving it will not make it go away.
That's why I also wrote "Not that it isn't an issue with the protocol,..."; I was merely suggesting that beeing a special case it should should get the matching attention share.
Bottom Line: I think we are drifting a bit. Im with you, that this lib should provide the tools to support interoperability with most common bcrypt implementations. However, I wouldn't want to implement any "recommended" bcrypt dialect to handle overlong passwords (as suggested in the first post) as I think it just worsen the problem with bcrypt dialect fragmentation.
EDIT: If you are interested here is my take on a modernised password hash & key derivation function based on bcrypt (unfinished) - it's not intended for productive use, more like as a research topic: https://github.com/patrickfav/bkdf
Maybe I hit the wrong tone, but my criticism was not against you, but the general issue with long passwords and homebrew dialects.
I didn't take it personally, don't worry ๐ but English not being my first language, sometimes I can come across too blunt (or interpret a reply too harsh). My bad. I should use more emoticons maybe ^^
The current bcrypt landscape and its myriad of different implementation is really infuriating. The more I read into it, the more I'm baffled.
Anyway, thanks for listening ๐