Rust crate gives different mnemonic derivation to the offical JS bip39 package
CodeForcer opened this issue · 7 comments
Looking into the code the difference seems to be that the JS bip39 package uses:
const mnemonicBuffer = Buffer.from(normalize(mnemonic), 'utf8');
Whereas the Rust package uses the JS packages's conversionToEntropy
logic (in the Rust package this is phrase_to_entropy
) before proceeding with the derivation:
let mut bits = BitWriter::with_capacity(264);
for word in phrase.split(" ") {
bits.push(wordmap.get_bits(&word)?);
}
let mtype = MnemonicType::for_word_count(bits.len() / 11)?;
let mut entropy = bits.into_bytes();
let entropy_bytes = mtype.entropy_bits() / 8;
entropy.truncate(entropy_bytes);
The results of both operations with the same phrase is different.
There can only be one correct implementation of bip39, so which package is incorrect?
Can you link to the JS implementation you are using?
Also, the code you pasted for Rust is not matching the source code for tiny-bip39. NVM, I was looking at wrong method.
The relevant code in the JS package is here:
https://github.com/bitcoinjs/bip39/blob/master/src/index.js#L52
And the logic I was examining in this package was from the function called by the function you just linked:
Line 187 in 6abc65a
The phrase to entropy
logic is in the JS package as mnemonicToEntropy
, but it's not used when creating a seed from mnemonic, unlike Rust which does use it.
So, both tiny-bip39 and the bip39 npm module use the same test vectors. There is no normalization in phrase_to_entropy
since that's handled in from_phrase
. How are you using them that gives you different results?
The
phrase to entropy
logic is in the JS package asmnemonicToEntropy
, but it's not used when creating a seed from mnemonic, unlike Rust which does use it.
I was trying to investigate why the mnemonic derivations into keys using Polkadot give different keys to the derivations using the JS bip39 package. Digging into the code I first noticed the Polkadot JS packages wrap into a Rust library which calls:
https://github.com/polkadot-js/wasm/blob/master/packages/wasm-crypto/src/rs/bip39.rs#L40:
#[wasm_bindgen]
pub fn ext_bip39_to_mini_secret(phrase: &str, password: &str) -> Vec<u8> {
let salt = format!("mnemonic{}", password);
let mnemonic = Mnemonic::from_phrase(phrase, Language::English).unwrap();
let mut result = [0u8; 64];
pbkdf2::<Hmac<Sha512>>(mnemonic.entropy(), salt.as_bytes(), 2048, &mut result);
result[..32].to_vec()
}
I then examined the JS package which performs the same logic (turn mnemonic into bytes --> create salt with "mnemonic"+password, then run the hashing algorithms) but gives a different result:
function mnemonicToSeedSync(mnemonic, password) {
const mnemonicBuffer = Buffer.from(normalize(mnemonic), 'utf8');
const saltBuffer = Buffer.from(salt(normalize(password)), 'utf8');
return pbkdf2_1.pbkdf2Sync(mnemonicBuffer, saltBuffer, 2048, 64, 'sha512');
}
I realised the discrepancy is because the Polkadot derivation is converting the mnemonic to entropy during this process, whereas the JS package is not. This is the reason that deriving private/public pairs from the JS BIP39 package and the Polkadot keyring package gives you different addresses
EDIT: When I modified the JS seed derivation to use the mnemonicToEntropy
method then salt + hash I get the same results as the Rust packages. So I can get the same results but I'm not sure which one is the correct way to implement BIP39
Ok, I see where the confusion comes from.
BIP39 (the spec, not the crate or the npm package) is just defining the dictionaries and how to create mnemonics from entropy. It's BIP32 that does derivation for Bitcoin and Ethereum, which both tiny-bip39
and bip39
package provide for convenience.
However, Polkadot/Substrate does not use BIP32, what you want to use for Polkadot/Substrate is substrate-bip39
.
Awesome thanks for your help @maciejhirsz, you've cleared up all my confusion here