cathugger/mkp224o

More secure approach to deterministic seeding?


Currently the main issue with -p as I understand it is that we derive an initial seed from a hashed password, and then continually add an offset to it to generate new private keys. Because of this, the private keys of multiple matches will be closely related to each other.

What if we started off the same way, but when a match is found, we derive an entirely new seed by incrementing the salt passed to the password hash function? The password hashing is slow, but that should not impact things too severely as long as matches are reasonably rare. The checkpoint would only need to add a second field for the salt.
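
Something like the following is what I have in mind, just as a sketch; libsodium's crypto_pwhash stands in for whatever mkp224o actually uses for -p, and the derive_seed name and salt encoding are made up for illustration:

```c
/* Rough sketch of the idea, not mkp224o code: libsodium's crypto_pwhash
 * stands in for the real password hash, and derive_seed is a made-up name.
 * The n-th seed uses the same password but a salt derived from the match
 * counter, so every new seed costs one full (slow) pwhash. */
#include <sodium.h>
#include <stddef.h>
#include <stdint.h>

static int derive_seed(unsigned char seed[32],
                       const char *pass, size_t passlen,
                       uint64_t matchcount)
{
	unsigned char salt[crypto_pwhash_SALTBYTES] = {0};

	/* encode the match counter into the salt (little-endian) */
	for (size_t i = 0; i < sizeof(matchcount); ++i)
		salt[i] = (unsigned char)(matchcount >> (8 * i));

	/* slow password hash; only re-run when a match was found */
	return crypto_pwhash(seed, 32, pass, passlen, salt,
	                     crypto_pwhash_OPSLIMIT_MODERATE,
	                     crypto_pwhash_MEMLIMIT_MODERATE,
	                     crypto_pwhash_ALG_DEFAULT);
}
```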

current scheme is, for each key, L_x|R_x = sha512(determ_seed++) (formally, L_x|R_x = sha512(determ_seed + x)). determ_seed here is the pwhash result.
however, each L_x, which is essentially the real secret key, gets reused across multiple finds for the duration of the "make new public keys from this seckey" loop, to permit the user to generate from a single original key but with different filters (to ensure that a different filter's find from the same initial L_x wouldn't overshadow the original find).
this means that a wide enough filter set can result in more than one key, and if these are used in different places, a leak of one means it's trivial to find the other keys from this set by just iterating forwards and backwards a bit.
due to the relatively short use of that loop space ((1 << 24) = 16Mi elements), and different threads using different x values, this should be quite unlikely to get hit in practice.
everything else looks kinda fine, except that maybe i should insert reseedright(sk); after the L_x|R_x expansion, to prevent revealing the full sha512 result from the first key (it'll probably be safer in some likely currently-unimportant ways). it will also result in re-generating the R_x parts in non-deterministic ways, but i think that's pretty safe as long as it's sourced from proper entropy.
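in sketch form, that's roughly the following (libsodium calls standing in for the internals, byte order & names made up, clamping and the actual pubkey loop omitted, so not the real code):

```c
/* Simplified sketch of the scheme described above, not the actual mkp224o
 * code; libsodium calls stand in for the internals, and the little-endian
 * "+ x" is just for illustration. Clamping & the pubkey loop are omitted. */
#include <sodium.h>
#include <stdint.h>
#include <string.h>

#define SEEDBYTES 32

/* determ_seed + x, treating the seed as a little-endian number */
static void seed_add(unsigned char out[SEEDBYTES],
                     const unsigned char determ_seed[SEEDBYTES], uint64_t x)
{
	unsigned int carry = 0;
	memcpy(out, determ_seed, SEEDBYTES);
	for (size_t i = 0; i < SEEDBYTES && (x || carry); ++i) {
		unsigned int v = out[i] + (unsigned int)(x & 0xff) + carry;
		out[i] = (unsigned char)v;
		carry = v >> 8;
		x >>= 8;
	}
}

/* L_x|R_x = sha512(determ_seed + x); L_x is the scalar half, R_x the right
 * half that reseedright() replaces with fresh entropy before output */
static void expand_key(unsigned char sk[64],
                       const unsigned char determ_seed[SEEDBYTES], uint64_t x)
{
	unsigned char seed[SEEDBYTES];

	seed_add(seed, determ_seed, x);
	crypto_hash_sha512(sk, seed, SEEDBYTES);
	randombytes_buf(sk + 32, 32); /* roughly what reseedright(sk) does */
	sodium_memzero(seed, sizeof seed);
}
```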

maybe i should add a switch to get a new seed after each key find, ignoring the overshadowing case, and instruct users to always use the same set of filters (or more specific ones) so there's no possibility of missing any results.

incrementing the salt is also an option i guess, but idk if the slowdown would be worth it, and it would need a separate argument or a major version permitting a compat break, plus getting a new seed after each key find again needs the note about filters. it'd be considerably safer though, but idk if that kind of resistance will be of much practical use.
as in, it'll complicate pw guessing / enumeration, which is something that needs to be combined with filters to be usable. it won't mask confirmation of the first password, as it's trivial to roll a few more pwhashes, & a likely attacker will already have the power to generally overpower the original user who generated it. basically, if the attacker gets the password, or wants to guess it, this will multiply their workload n times, where n is how many keys (and thus pwhashes) the user has skipped to roll their fav one. it doesn't really help when the attacker already has the password & wants to confirm it, due to available resources, but it should help against weak guessable passwords, though only if the user waits for more keys to roll, and weak passwords tbh just shouldn't be used in the first place; the warning is not about this, but about a leak of one of the keys potentially influencing the safety of the other keys.
and i believe the current scheme already makes this complicated in the same way: to get the pw from one of the keys, you'd already essentially have to bruteforce, and after the bruteforce you'd need to slide with sha512 a bit. how much slide? depends on which key the user picked. slide too little & you may end up skipping some. ofc here sliding is a lot cheaper than what the user spent on generating keys meanwhile.
I think that doing the pwhash on thread initialization & loop re-initialization would be the most fitting place, but it would eat into perf for each thread used. otherwise, if threads shared a pwhash and rotated it only on a successful find, they could end up missing keys: if 2 threads each return a find from what they were given, only the first result would go in, because one thread gets randomly delayed behind the other.
so essentially, blocking each thread on startup & reinit would get quite costly. maybe we could split the pwhash into 2 parts, one for the initial seed, another for the salted sequence version; that could balance the costs a bit better.
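rough sketch of that split (again libsodium stand-ins, and the MODERATE/MIN limits are just placeholders, not proposed parameters):

```c
/* Sketch of the 2-part split, not actual mkp224o code; libsodium stand-ins,
 * and the MODERATE/MIN limits are placeholders rather than real parameters.
 * Part 1 is the expensive pwhash, done once from the password; part 2 is a
 * much cheaper salted pwhash over the part-1 output, re-done on each reseed
 * (thread init / loop reinit / key find). */
#include <sodium.h>
#include <stddef.h>
#include <stdint.h>

static int initial_seed(unsigned char master[32],
                        const char *pass, size_t passlen)
{
	static const unsigned char salt[crypto_pwhash_SALTBYTES] = {0};

	return crypto_pwhash(master, 32, pass, passlen, salt,
	                     crypto_pwhash_OPSLIMIT_MODERATE,
	                     crypto_pwhash_MEMLIMIT_MODERATE,
	                     crypto_pwhash_ALG_DEFAULT);
}

static int sequence_seed(unsigned char seed[32],
                         const unsigned char master[32], uint64_t n)
{
	unsigned char salt[crypto_pwhash_SALTBYTES] = {0};

	for (size_t i = 0; i < sizeof(n); ++i)
		salt[i] = (unsigned char)(n >> (8 * i));

	/* far lower limits so per-thread / per-find reseeds stay cheap */
	return crypto_pwhash(seed, 32, (const char *)master, 32, salt,
	                     crypto_pwhash_OPSLIMIT_MIN,
	                     crypto_pwhash_MEMLIMIT_MIN,
	                     crypto_pwhash_ALG_DEFAULT);
}
```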
essentially i still think it's an interesting thing to consider, but it protects a different, weak-password use case, & idk if that's worth the cost; it's hard to measure.

if we consider the sha512(determ_seed++) part to be unsafe, due to determ_seed being in sequence, I think masking off the right part of the sha512 result with reseedright(sk); as i mentioned earlier should help; iirc that's what prevents length extension attacks (the sha512/256 variant), but idk if that's relevant here.
ideally this could've been a cascade of sha256 hashes, followed by an additional sha256 hash (with a differentiator) & sha512 to expand, or sha3 could be used too, but eh, it wasn't done this way initially & then there are compatibility concerns.
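for reference, the cascade idea would look something like this (just a sketch; the differentiator byte & exact structure are illustrative, not what would actually ship):

```c
/* Sketch of the cascade idea, not a spec; the 0x00/0x01 differentiator and
 * the exact structure are illustrative only. The chain state advances with
 * sha256, per-key material is split off with a differentiated sha256, and
 * only that branch is expanded to 64 bytes with sha512. */
#include <sodium.h>
#include <string.h>

static void next_key_from_cascade(unsigned char chain[32], unsigned char sk[64])
{
	unsigned char buf[33];
	unsigned char branch[32];

	/* differentiated branch: sha256(0x01 || chain) -> per-key secret */
	buf[0] = 0x01;
	memcpy(buf + 1, chain, 32);
	crypto_hash_sha256(branch, buf, sizeof buf);

	/* expand the branch to L|R with sha512 */
	crypto_hash_sha512(sk, branch, 32);

	/* advance the cascade: sha256(0x00 || chain) -> next chain state */
	buf[0] = 0x00;
	crypto_hash_sha256(chain, buf, sizeof buf);

	sodium_memzero(branch, sizeof branch);
	sodium_memzero(buf, sizeof buf);
}
```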

this is all complicated & hard to think about, but i think i laid it out pretty well. feels like it needs some time to decide on the exact modifications tho.

btw thanks for bringing this up & other community help stuff, it's nice & helps motivation.

actually it seems that reseedright(sk); is called before the key is printed out, so the right side of sha512 is already being masked.

some stuff implemented in 80e1bd0
will probably make skipnear a default on the next major release.