[Enhancement] new mnemonic via "words", and help w/ the last word

new mnemonic manual "words" input + help w/ last word

Currently, there is no option to create a new mnemonic by entering words manually.
However, a user can recover an existing mnemonic this way, and if they don't select the last word, krux will randomly choose a valid one.

This was discussed this morning in the main telegram group.

It may be confusing for users to create a new mnemonic (where they need krux to calculate the checksum and propose one) via the "load mnemonic menu". Further, it is less than ideal to "lead" a user to a particular last word during recovery; it's better to let them enter it and learn that there is a checksum error (so that they verify all the other words too).

It may be better to have two modes: one much like the existing one in "Load Mnemonic", except that it arguably should not offer the option to leave the last word blank...

...and another mode in "New Mnemonic" where the user first selects a 12- or 24-word mnemonic; then, as the 12th or 24th word is manually entered, the list of available bip39 words is reduced to only those that result in a valid mnemonic. This way, we'd be helping users choose a last word ONLY when creating a new mnemonic, and wouldn't be leading them towards one during recovery. Further, they'd be in control of the remaining bits of entropy (by selecting a word) rather than relying on krux's random choice.
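To make that reduction concrete, here is a minimal sketch of the arithmetic. The helper name `last_word_candidate_count` is hypothetical (not part of krux or embit), and it assumes standard BIP39 sizing, where every 3 words carry 32 bits of entropy plus 1 checksum bit:

```python
def last_word_candidate_count(mnemonic_length):
    """Number of valid last words once all other words are fixed.

    Hypothetical helper for illustration; not part of krux or embit.
    """
    if mnemonic_length % 3 != 0 or not 12 <= mnemonic_length <= 24:
        raise ValueError("BIP39 mnemonics have 12, 15, 18, 21 or 24 words")
    entropy_bits = mnemonic_length // 3 * 32   # total entropy in the mnemonic
    known_bits = (mnemonic_length - 1) * 11    # bits fixed by the earlier words
    free_bits = entropy_bits - known_bits      # entropy bits left in the last word
    return 2 ** free_bits                      # one valid word per free-bit value


for length in (12, 24):
    print(length, last_word_candidate_count(length))
```

So a 12-word mnemonic leaves 128 possible last words (7 free bits) and a 24-word mnemonic leaves 8 (3 free bits), out of the 2048 words in the list.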

Odudex referred to this code for what must be done to create the list of valid last-word candidates.

krux/src/krux/pages/tiny_seed.py

Line 396 in 58d6e73

def check_sum(self, tiny_seed_numbers):

I wonder if similar functionality isn't best placed within a method of embit.bip39. I've continued to prototype what I shared previously (and since removed). The original (expanded bits) and an improved version are now 2 comments below this post.

I agree on all points. We should create separate methods (at least from the user's perspective). It would be much more intuitive.

I've played with two prototype versions: one expanding bits, as originally posted above, and another using an accumulator integer. When every candidate word is checked with embit to see whether it creates a valid mnemonic, neither version fails, but both take a long time. That means lots of calls to embit that aren't necessary, since the final result must get checked anyway in order to create a wallet. The accumulator version is slightly faster on a real PC, but significantly faster on K210 hardware, and as long as embit isn't called for each candidate, both seem usable (not dreadfully slow).

from embit.bip39 import WORDLIST
from hashlib import sha256


def last_word_candidates_by_expanded_bits(words, mnemonic_length):
    def bip39words_to_bits(words):
        bits = []
        for word in words:
            assert word in WORDLIST
            bits.append("{:011b}".format(WORDLIST.index(word)))
        return bits

    words = words.split()
    assert mnemonic_length % 3 == 0           # valid mnemonic length?
    assert mnemonic_length == len(words) + 1  # need last word only?

    bits = "".join(bip39words_to_bits(words))

    assert (len(bits) + 11) % 33 == 0         # number of words + 1 divisible by 3

    len_target = (len(bits) + 11) // 33 * 32  # bits of entropy
    len_needed = len_target - len(bits)       # missing entropy bits
    len_cksum = len_target // 32              # additional checksum bits

    candidates = []
    for i in range(2**len_needed):
        last_bits = ("{:0%db}" % len_needed).format(i)
        entropy = int(bits + last_bits, 2).to_bytes(len_target//8, 'big')
        ck_bytes = sha256(entropy).digest()
        ck_bits = "".join(["{:08b}".format(x) for x in ck_bytes])
        last_word_bits = last_bits + ck_bits[:len_cksum]
        last_word = WORDLIST[int(last_word_bits, 2)]
        candidates.append(last_word)

    return candidates


def last_word_candidates_by_accumulator(words, mnemonic_length):
    def bip39words_to_indexes(words):
        indexes = []
        for word in words:
            assert word in WORDLIST
            indexes.append(WORDLIST.index(word))
        return indexes

    words = words.split()
    assert mnemonic_length % 3 == 0           # valid mnemonic length?
    assert mnemonic_length == len(words) + 1  # need last word only?

    accumulator = 0
    for index in bip39words_to_indexes(words):
        # accumulator*2048 OR leftshift 11 bits? which is faster?
        accumulator = (accumulator << 11) + index 

    len_target = (len(words) * 11 + 11) // 33 * 32 # bits of entropy
    len_needed = len_target - (len(words) * 11)    # missing entropy bits
    len_cksum = len_target // 32                   # additional checksum bits 

    candidates = []
    for i in range(2**len_needed):
        entropy = (accumulator << len_needed) + i
        ck_bytes = sha256(entropy.to_bytes(len_target//8, "big")).digest()
        cksum = int.from_bytes(ck_bytes, "big") >> (256 - len_cksum)
        last_word = WORDLIST[(i << len_cksum) + cksum]
        candidates.append(last_word)

    return candidates

if __name__ == '__main__':
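    try:
        from utime import ticks as time  # prefer MicroPython's utime when available
    except ImportError:
        from time import time            # otherwise fall back to the standard library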

    # all but the last word given, here lazily selected as adjacent bip39 words
    i=1024
    partials = {
        12: " ".join(WORDLIST[i:i+11]),
        15: " ".join(WORDLIST[i:i+14]), # not for krux, but this is bip39
        18: " ".join(WORDLIST[i:i+17]), # not for krux, but this is bip39
        21: " ".join(WORDLIST[i:i+20]), # not for krux, but this is bip39
        24: " ".join(WORDLIST[i:i+23]),
    }

    answers = []
    now = time()
    for key in sorted(partials):
        partial = partials[key]
        print("\npartial mnemonic {}:\n  {}".format(key, partial))
        candidates = last_word_candidates_by_expanded_bits(partial, key)
        print("candidates:\n  {}".format(", ".join(candidates)))
        answers.append(candidates)
    time_expanded_bits = time() - now

    answers2 = []
    now = time()
    for key in sorted(partials):
        partial = partials[key]
        candidates = last_word_candidates_by_accumulator(partial, key)
        answers2.append(candidates)
    time_accumulator = time() - now
    assert answers == answers2

    print("\ntime: expanded_bits: {}, accumulator: {}".format(
        time_expanded_bits,
        time_accumulator
    ))
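
Since the final mnemonic must be validated anyway, a single checksum check on the completed mnemonic can stand in for per-candidate validation. The sketch below is only an illustration under stated assumptions, not krux or embit code: it works on word indexes (0 to 2047) instead of words so that it needs no wordlist, and the name `indexes_have_valid_checksum` is hypothetical.

```python
from hashlib import sha256


def indexes_have_valid_checksum(indexes):
    """Return True if a full list of BIP39 word indexes has a correct checksum.

    Hypothetical helper for illustration; not part of krux or embit.
    """
    if len(indexes) % 3 != 0 or not 12 <= len(indexes) <= 24:
        return False
    if any(not 0 <= index < 2048 for index in indexes):
        return False

    # Pack all 11-bit word indexes into one integer, first word highest.
    accumulator = 0
    for index in indexes:
        accumulator = (accumulator << 11) + index

    len_cksum = len(indexes) // 3                 # 1 checksum bit per 3 words
    len_entropy = len(indexes) * 11 - len_cksum   # remaining bits are entropy
    entropy = accumulator >> len_cksum
    cksum = accumulator & ((1 << len_cksum) - 1)

    # The checksum is the leading len_cksum bits of sha256(entropy).
    digest = sha256(entropy.to_bytes(len_entropy // 8, "big")).digest()
    expected = digest[0] >> (8 - len_cksum)
    return cksum == expected


# "abandon" x 11 + "about" corresponds to indexes [0] * 11 + [3]
print(indexes_have_valid_checksum([0] * 11 + [3]))  # True
print(indexes_have_valid_checksum([0] * 12))        # False
```

With a check like this run once on the user's final selection, the candidate list itself can be built from the checksum arithmetic alone, as both prototypes above do.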

I'll work towards more commits to the "new_mnemonic_tweaks" branch atop "wallet_customization":
wallet_customization...jdlcdl:krux:new_mnemonic_tweaks

I believe that this issue can be closed as it was resolved w/ the merge of #371

Sure, thank you Jean!