ChainSafe/Spectre

Missing components in instance commitments

willemolding opened this issue · 4 comments

At the contract/verifier level the only way to know the inputs that statements are being proven about is through the instance commitments.

For integrating with a smart contract this means a contract function takes inputs and hashes these down to an instance commitment. The result of the verification then informs the contract about the relationships between these inputs.

I think that some input constraints are missing from the circuits that prevent the contract from being able to verify what it needs to for the protocol to work.

Rotation Circuit

The instance commitment is currently just a single poseidon commitment to a set of sync committee keys. I think this needs to be hashed with an SSZ root of the committee keys to allow the contract to verify that the two are equivalent. See the following example in telepathy:

SyncStep Circuit

This one seems pretty good but I think this is missing the commitment to the beacon block root. I see the code is there but commented out. Without this it is impossible to link the result of a rotation with the result of a step and so the protocol cannot move forward.

@willemolding do we also need finalizedHeaderRoot to be exposed by rotation circuit?

Gas cost analysis

General references

Instance commitment in rotation circuit

We use instance commitment to reduce the number of public inputs of a given circuit, which is proportional to a number of calldata uint256[n] pubInputs arguments in the verifier contract. We achieve this via hashing multiple public inputs (ones that we want to expose) using SHA256 hash function.

To further reduce the number of actual public inputs, we compose them into a single big integer via fromLittleEndianBytes. Note, BN254 curve scalar field order is 254 bits (not 256) so also we need to truncate it to 253 bits in the circuit, as well as in solidity to be able to compare them.

Given this rationale, we are presented with a choice to whether use this technique in a RotationCircuit which needs to expose uint256 poseidonCommit (31 bytes), uint8[32] committeePubkeysSszRoot (32 bytes total).

  • Option (A): uint256[33] calldata pubInputs where pubInputs[1] is poseidonCommit (32 bytes) and pubInputs[1..32] are u8 (1 byte each)
  • Option (B) uint256[1] calldata pubInputs, calldata uint256 poseidonCommit, calldata uint8[32] sszRoot + commitment opening + truncation

Calculations

poseidonBytes = 32
gasPerNonZeroByte = 16
gasPerZeroByte = 4
sszRootBytes = 32
bytePerU256 = 32
# option (A)
zeroBytes = sszRootBytes * (bytePerU256-8) = 768
(poseidonBytes + sszRootBytes) * gasPerNonZeroByte + zeroBytes * gasPerZeroByte = 4 096

# option (B)
truncatedShaBytes = 32
calldataCostOptionB = (truncatedShaBytes + poseidonBytes + sszRootBytes) * gasPerNonZeroByte = 1 536

shaInputLen = 32 + 32 = 64
shaOpenningCost = 10 + ((shaInputLen + 8)/64 + 1) * 9 = 29
truncationCost = 10 # my guess = 10
totalCostOptionB = calldataCostOptionB +  shaOpenningCost + truncationCost = 1 575

Conclusion

Option (B) seems cheaper. The cost of zero bytes makes a significant difference, otherwise, option (A) would've been cheaper. This cost can potentially be omitted if verifier contract generation is tweaked, by changing a certain range of pubInputs to be uint8 instead of uint256, though this can mess with verification assembly code.

Note, Telepathy does not perform instance commitment for rotation circuit, ie. they use option (A). Their verifier contract also used memory modifier instead of calldata for public inputs.

Instance commitment truncation

As explained earlier, truncation allows to only expose of a single instance, thereby having a single uint256[1] calldata pubInputs argument in the verifier contract - option (A), instead of uint256[32] calldata pubInputs -option (B).

Calculations

## option (A)
truncatedShaBytes = 32 = 32
truncationCost = 10 # my guess = 10

truncatedShaBytes * gasPerNonZeroByte + truncationCost = 522

## option (B)
zeroBytes = 32 * (bytePerU256-8) = 768
32 * gasPerNonZeroByte + zeroBytes * gasPerZeroByte = 3584

Conclusion

Truncation makes sense, duh.

Onion hashing vs. Input concatenation

In zkLightClientStep function Succinct Labs chooses to do "onion" (chained) hashing instead of a single sha256 hashing over concated inputs. Following is the calculation to understand their reasoning.

Given that SHA256 (precompile 0x02): $10 + ((len(input) + 8)/64 + 1) * 9$ (source)

For a single operation in option (a) - input concatenation, the cost is:
$G_a = 10 + \left( \frac{(len(input_1) + len(input_2) + \ldots + len(input_n) + 8)}{64} + 1 \right) \times 9$

For ( n-1 ) operations in option (b) - onion hashing, the cost is:
$G_b = (n-1) \times \left( 10 + \left( \frac{(32 + 8)}{64} + 1 \right) \times 9 \right)$

For $n$ inputs, the breakeven point occurs when the total byte size of the inputs in option (a) equals
$32×(n−1)−8$. If the total byte size of the inputs is less than this value, option (a) is cheaper. Otherwise, option (b) is cheaper, as the cost in option (b) is fixed per concatenation and hashing, while in option (a) the cost increases with the total byte size of the inputs.

Code

bytes32 attestedSlotLE = SSZ.toLittleEndian(update.attestedSlot); // 8 non-zero bytes, 24 zero bytes
bytes32 finalizedSlotLE = SSZ.toLittleEndian(update.finalizedSlot); // 8 non-zero bytes, 24 zero bytes
bytes32 participationLE = SSZ.toLittleEndian(update.participation); // 8 non-zero bytes, 24 zero bytes
uint256 currentPeriod = getSyncCommitteePeriod(update.attestedSlot); // 8 non-zero bytes, 24 zero bytes
bytes32 syncCommitteePoseidon = syncCommitteePoseidons[currentPeriod]; // 32 non-zero bytes

bytes32 h;
h = sha256(bytes.concat(attestedSlotLE, finalizedSlotLE));
h = sha256(bytes.concat(h, update.finalizedHeaderRoot));
h = sha256(bytes.concat(h, participationLE));
h = sha256(bytes.concat(h, update.executionStateRoot)); // 32 non-zerobytes
h = sha256(bytes.concat(h, syncCommitteePoseidon));

Calculation

nInputs = 6
breakevenBytes = 32 x (nInputs-1) - 8 = 152
nonZeroBytes = 8 * 4 + 32 * 2 = 96
zeroBytes = 24 * 4 = 96

Conclusion

  • If considering only nonZeroBytes - 96 < 152 - option (a) input concatenation is cheaper.
  • If, on the other hand, consider both nonZeroBytes and zeroBytes equally - 96 + 96 = 192 < 152 - option (b) onion hashing is cheaper.

Also note, that option (a) is mildly cheaper in the circuit.

Optimisation idea

We don't need to use zero-padded 32 bytes for uint64 values (attestedSlot, finalizedSlot, participation) as it is up to us how to commit to public inputs. It doesn't have to use SSZ-chunked inputs. Thus, if we are hashing without zero-bytes - 96 < 152 - option (a) input concatenation will be cheaper both in-circuit and on-chain.

@willemolding do we also need finalizedHeaderRoot to be exposed by rotation circuit?

I actually don't know the answer to this. I think it depends on exactly what the rotation circuit is doing. If it is only proving the equivalence of a SSZ and Poseidon root then no.. but then we somehow need to link that root to a finalized header which could be done by doing a short SSZ proof in the contract.

My feeling is that this would probably be more expensive so we do want to include the finalizedHeaderRoot as input and do that proof inside the rotation circuit right?

My feeling is that this would probably be more expensive so we do want to include the finalizedHeaderRoot as input and do that proof inside the rotation circuit right?

Yes, it's best to outsource as much computation to the rotation circuit since its proofs will be posted infrequently and we are doing proof compression anyway i.e. inner circuit where committee update logic is done can be increased significantly without increasing verifier work.

Will add that in the #18 next PR