Optimize merkle proof by having the arguments in calldata
hrkrshnn opened this issue · 6 comments
Using calldata
is ideal here and would save a decent chunk of gas.
But this means that the assembly routine would be more complex. Consider the following instead:
- Use a regular for loop to load the value from
calldata
. - Have a subroutine called
cheapKeccak
, which uses scratch space for keccak that would do the same computation.
Saves around 200 gas if there are 5 elements in proof.
We can make a copy of the function just for calldata. Should we?
If @transmissions11 is ok with some code duplication, then let's go.
function merkleVerifyAssemblyCalldata(
bytes32[] calldata proof,
bytes32 root,
bytes32 leaf
) internal pure returns (bool isValid) {
assembly {
let computedHash := leaf
// Get the memory start location of the first element in the proof array.
let data := proof.offset
// Iterate over proof elements to compute root hash.
for {
// Left shift by 5 is equivalent to multiplying by 0x20.
let end := add(data, shl(5, proof.length))
} lt(data, end) {
data := add(data, 0x20)
} {
let loadedData := calldataload(data)
// Slot of `computedHash` in scratch space.
// If the condition is true: 0x20, otherwise: 0x00.
let scratch := shl(5, gt(computedHash, loadedData))
// Store elements to hash contiguously in scratch space.
// Scratch space is 64 bytes (0x00 - 0x3f) and both elements are 32 bytes.
mstore(scratch, computedHash)
mstore(xor(scratch, 0x20), loadedData)
computedHash := keccak256(0x00, 0x40)
}
isValid := eq(computedHash, root)
}
}
tryna think of a case where u need the proof to be memory
? im totally cool with just supporting calldata
but this seems like a really good change, 200 gas is sick
@Vectorized Do you really have to use calldataload(..)
in assembly? Do you have a benchmark for just proof[idx]
?
@hrkrshnn I tried proof[idx]
, but the assembly cannot compile.
Here are some numbers:
[PASS] testValidProofSupplied() (gas: 2464)
[PASS] testValidProofSuppliedCalldata() (gas: 2262)
[PASS] testVerifyEmptyMerkleProofSuppliedLeafAndRootDifferent() (gas: 1733)
[PASS] testVerifyEmptyMerkleProofSuppliedLeafAndRootDifferentCalldata() (gas: 1565)
[PASS] testVerifyEmptyMerkleProofSuppliedLeafAndRootSame() (gas: 1727)
[PASS] testVerifyEmptyMerkleProofSuppliedLeafAndRootSameCalldata() (gas: 1537)
[PASS] testVerifyInvalidProofSupplied() (gas: 2460)
[PASS] testVerifyInvalidProofSuppliedCalldata() (gas: 2304)
For a fair comparison between the two functions, all tests use calls,
which explains why the numbers for the old tests have increased.
Looks like there may be a very niche use case for memory https://github.com/fx-portal/contracts/blob/baed24d22178201bca33140c303e0925661ec0ac/contracts/tunnel/FxBaseRootTunnel.sol
But for their case, I think it's best if they have a custom implementation, which they did.