celo-org/optimism

Flaky withdraw e2e test

Opened this issue · 1 comments

The npm e2e test occasionally fails with

 ● execute a withdraw and a deposit in succession

    ContractFunctionExecutionError: The contract function "finalizeWithdrawalTransaction" reverted with the following reason:
    OptimismPortal: proven withdrawal finalization period has not elapsed

    Contract Call:
      address:   0x6509f2a854BA7441039fCE3b959d5bAdd2fFCFCD
      function:  finalizeWithdrawalTransaction((uint256 nonce, address sender, address target, uint256 value, uint256 gasLimit, bytes data))
      args:                                   ({"nonce":"1766847064778384329583297500742918515827483896875618958121606201292619776","sender":"0x976EA74026E726554dB657fA54763abd0C3a0aa9","target":"0x976EA74026E726554dB657fA54763abd0C3a0aa9","value":"1000000000000000000","gasLimit":"21000","data":"0x","withdrawalHash":"0x3c154e4d0dcefb2cb5473ba79c5dbbccd3efd61b79a64c309e06a99f498f7d1c"})

    Docs: https://viem.sh/docs/contract/estimateContractGas
    Version: viem@2.17.4

Example from CI: https://app.circleci.com/pipelines/github/celo-org/optimism/1882/workflows/c8d6dc1b-9f24-4fac-9025-2a27052b9584/jobs/34786

Trying it out locally, it only fails for me if place a sleep between the node startup and the test:

--- a/op-e2e/celo/run_all_tests.sh
+++ b/op-e2e/celo/run_all_tests.sh
@@ -25,6 +25,8 @@ for _ in {1..10}; do
   sleep 0.2
 done
 
+sleep 300
+
 ## Run tests
 echo Start tests
 failures=0

Increasing the l2OutputOracleSubmissionInterval to prevent the L2OO from falling behind does not solve the issue, which is expected since that should not matter when fault proofs are turned on (which is the case for the devnet).

The flakyness started with the celo8 rebase.

@ezdac knows more about how viem deals with the withdrawal finalization period.

Some insight into how viem / our e2e testsuite deals with the withdrawal on L1

The part affected by the error in CI seems to be the withdraw flow on L1:
First the OptimismPrortal.proveWithdrawalTransaction( ) is called:

provenWithdrawals[withdrawalHash] = ProvenWithdrawal({
outputRoot: outputRoot,
timestamp: uint128(block.timestamp),
l2OutputIndex: uint128(_l2OutputIndex)
});

and the block-timestamp at the time of the call is persisted as the proven withdrawals timestamp.

The e2e-testsuite uses viem, which calculates the theoretical wall-time until the L1 block of the withdrawals finalization period is elapsed based on the OptimismPortal.finalizationPeriodSeconds field.
It then waits that amount of time (+ some grace period) and calls the OptimismPortal.finalizeWithdrawal( ) function.

The contract itself checks the passed time based on the block-timestamp and reverts with the condition that the finalization period hasn't in fact elapsed:

// A proven withdrawal must wait at least the finalization period before it can be
// finalized. This waiting period can elapse in parallel with the waiting period for the
// output the withdrawal was proven against. In effect, this means that the minimum
// withdrawal time is proposal submission time + finalization period.
require(
_isFinalizationPeriodElapsed(provenWithdrawal.timestamp),
"OptimismPortal: proven withdrawal finalization period has not elapsed"
);

So to me it seems likely that there is a gradually increasing gap in produced blocks measured in wall-time vs. the theoretical production of L1 blocks when assuming constant block production time.