bluealloy/revm

feat: Introduce ByteCode format

rakita opened this issue · 18 comments

With the restriction that contracts cannot start with 0xEF, and with the possibility that in the future we will have a different ByteCode for the EVM Object Format, it makes sense to introduce support for it into revm. https://eips.ethereum.org/EIPS/eip-3541

The biggest benefit atm would be introducing a ByteCode type that contains the jumptable and gas blocks, so we could skip the analysis the contract currently needs. This is a fairly big speed-up for some use cases noticed in foundry, as fuzzing tests call a lot of contracts with not much execution, and in that case the analysis takes a lot of time.

The proposal is to introduce an enum ByteCode that would, in some form, expose: code, code_size and the jumptable.

Basically replace this input with ByteCode:


and propagate the change wherever needed.
ByteCode should set these fields in the contract:

and do the same for the database struct:

To fully close the circle on the EVM Object Format, CREATE and CREATE2 will need to return the 0xEF-type format in bytes that would get transformed into ByteCode for EVM usage here:

// EIP-3541: Reject new contract code starting with the 0xEF byte
if SPEC::enabled(LONDON) && !code.is_empty() && code.get(0) == Some(&0xEF) {
self.data.subroutine.checkpoint_revert(checkpoint);
return (Return::CreateContractWithEF, ret, interp.gas, b);
}

But this is a consensus rule and out of scope for this feature; we should just continue using legacy ByteCode here. If needed, we could transform it into ByteCode inside the Inspector's create_end.

cc @onbjerg @brockelmore for when we'll want to further speed up fuzz tests

We would need a utility in either Forge or ethers-rs to convert from the old format to the new format given that the Solidity compiler does not support EOF yet

@onbjerg shouldn't the utility be inside revm?

What I thought is for create/create2 to use

ByteCode::Legacy {
  bytes: Bytes, 
} 

And you would call something like ByteCode::Analyse(bytes) inside create_end that would replace the bytecode in the subroutine with

Bytecode::Analysed {
  code: ..,
  size: ..,
  analysis: .., // jumptable and gas blocks
}

Or just directly use it inside the db.

Makes sense; however, I'm not sure what the gains are for Forge here, since we run each test in a database we later discard. The main beneficiary of an optimization here would be fuzz tests, but since we discard the database and cannot be sure that the same bytecode will be at the same address in the next run, we can't really change that behavior. Is there any way we can have a performant-ish bytecode cache based on the bytecode itself instead of addresses, one that we could then reuse across test runs?

@onbjerg hm, so contracts for fuzz tests are created at EVM runtime. Can you walk me through an outline of how fuzzing is done?

Setup phase (this state IS persisted):

  • We deploy the main test contract (might create more than 1 contract)
  • We call setUp (might create more contracts)

Then, for each fuzz run we decide on some parameters using different strategies (not important for this outline) and we:

  • Call testXyz (assuming we're running fuzz tests for testXyz), running on a copy of the database we created in the setup phase
  • We use the new database from the previous step and call failed (implementation specific) to see if the test failed
  • We now know if the test succeeded or not, and in either case we discard the database
  • If the test succeeded, we go through the previous 3 steps again until we either hit our target runs (256 by default), or we encounter a failing test case

In most cases we will probably end up deploying the same contracts; however, we don't enforce that this is the case, so some users might conditionally deploy extra contracts based on the fuzz inputs.

Does that make sense?

This would only help pre-analyse contracts deployed inside the setup phase, since you could save the analysed contracts in the db.
But if the ratio of contracts created by the fuzz tests themselves vs the setup stage is significant, this will not mean much.

I created three types of bytecode:

  • Raw bytecode is bytecode that is not touched.
  • Checked bytecode is extended with padding of 33 zero bytes.
  • Analysed bytecode contains the analysis of jumptable and gas blocks.
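
To make that concrete, here is a rough sketch of what such an enum could look like. The variant contents are illustrative assumptions for this thread, not revm's actual definition:

enum Bytecode {
    // Untouched bytecode, exactly as stored or as returned by CREATE.
    Raw(Vec<u8>),
    // Bytecode extended with 33 zero bytes of padding, as described above;
    // `len` is assumed here to be the original (unpadded) length.
    Checked { code: Vec<u8>, len: usize },
    // Bytecode with the precomputed jumptable and gas blocks, so no
    // per-call analysis is needed. `Vec<bool>` is a stand-in for the
    // real jumptable/gas-block structure.
    Analysed { code: Vec<u8>, len: usize, jumptable: Vec<bool> },
}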

When implementing this and running a small contract 30k times, I got these results:

Raw elapsed time: 517.084111ms
Checked elapsed time: 449.484905ms
Analyzed elapsed time: 253.557062ms

Raw and Checked are usually similar (Checked is slightly better), while Analysed bytecode execution is 1.5-2x faster.

Other than introducing Bytecode, I added some more packed versions of the jumptable and OpCode info, which got us a slight boost but nothing major. Their fields became private, but there should be alternative functions that can be called.

Just to compare it visually: the first flamegraph is raw bytecode only, while the second flamegraph is analysed bytecode.
First (Raw elapsed time: 459.801385ms):
flamegraph
Second (Analysed elapsed time: 261.872689ms):
flamegraph

Once the Bytecode is analyzed, can we somehow cache it across instantiations of the EVM? e.g. in Foundry we instantiate a new EVM each time, so it'd be nice if we can store the analyzed bytecode and initialize the EVM with the already analyzed one


Database trait now returns Bytecode
https://github.com/bluealloy/revm/pull/156/files#diff-2955e45b5fcaef33495479dce9f6bd07c88fbb96b6f633a4b31e9362a12adedbR22
So it becomes the cache that you can move between new EVM instances.
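
To make the caching idea concrete, here is a minimal, self-contained sketch of the pattern: analysed code is keyed by code hash in a map that outlives any single EVM instance, so repeated runs can skip re-analysis. All names and types below are placeholders for this thread, not revm's API.

use std::collections::HashMap;

// Stand-in for revm's analysed Bytecode.
#[derive(Clone)]
struct AnalysedCode {
    code: Vec<u8>,
    // jumptable / gas blocks would live here in the real type
}

// A cache keyed by code hash that can be handed to each new EVM instance.
#[derive(Default)]
struct BytecodeCache {
    by_hash: HashMap<[u8; 32], AnalysedCode>,
}

impl BytecodeCache {
    // Return the cached analysed code, or analyse the raw code once and cache it.
    fn get_or_analyse(
        &mut self,
        code_hash: [u8; 32],
        raw: &[u8],
        analyse: impl Fn(&[u8]) -> AnalysedCode,
    ) -> AnalysedCode {
        self.by_hash
            .entry(code_hash)
            .or_insert_with(|| analyse(raw))
            .clone()
    }
}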

Here is an example of how one analysed contract is called multiple times: https://github.com/bluealloy/revm/pull/156/files#diff-c28df8837c5c77a683c9974376c1b53605fe01730f1b81b25c38771d58ecda6bR52-R59

Does this work for you?

I think that it should work, yeah. An update of REVM in Foundry should probably provide a noticeable speedup for fuzz tests that deploy their contracts in setUp, but making it work for fuzz tests that deploy contracts in the fuzz test itself will need some more work

Will give it a shot soon and report back 😄

Seems to have shaved off a second of solmate's testing time on my machine with no special handling added! (just replaced the Bytes stuff with Bytecode::new_raw where needed)


Left some comments on the PR, but this is great work 😄 Thank you!

(Note: I couldn't believe it so I recompiled master locally to make sure it wasn't my local Rust compiler being smart, but same results - using the new bytecode format is faster)

Okay, I didn't expect it to be this big of a difference. Can you do two flamegraphs of the executions to better see where the execution time went down?

And thank you for the review, I will make those changes before the PR merge.

I needed to use --no-inline to make flamegraph finish, and needed to change every debug = 0 to debug = true inside foundry/Cargo.toml so that debug symbols are present. It is painful to build foundry :) but I managed to create some graphs.

This is running solmate on old foundry:
flamegraph1
As we can see, call_inner -> revm::interpreter::contract::Contract::new takes a lot of time to process.

Sidenote: an additional improvement can potentially be made inside insert_account_info, as it takes a lot of time to hash the bytecode. We can probably precompute that, but that is a task for the future.
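
As an illustration of that sidenote, a hypothetical sketch of carrying a precomputed hash alongside the code so it is computed once rather than on every insert. The keccak256 parameter stands in for whatever hashing helper the codebase already has; nothing here is revm's actual API.

struct BytecodeWithHash {
    code: Vec<u8>,
    hash: [u8; 32],
}

impl BytecodeWithHash {
    fn new(code: Vec<u8>, keccak256: impl Fn(&[u8]) -> [u8; 32]) -> Self {
        // Hash once here; insert_account_info would then reuse `hash`
        // instead of recomputing it every time.
        let hash = keccak256(&code);
        Self { code, hash }
    }
}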

The second flamegraph is with the new revm, and it does not have any Contract::new section. I tried to inline it and change the code calls for it to show up, but I got nothing.
flamegraph

I did optimize to_analysis to be better and more compact, but the results on the real contract test are amazing.

@onbjerg instead of new_raw(bytes) you could use new_raw(bytes).to_checked(); this will speed it up a little bit, and it is a small change.
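
For concreteness, a hedged before/after sketch of that swap at a hypothetical Foundry call site, assuming bytes already holds the deployed code:

// Before: raw bytecode, which still has to be processed at call time.
let code = Bytecode::new_raw(bytes.clone());

// Suggested: pad/check it once up front; to_checked consumes the raw bytecode.
let code = Bytecode::new_raw(bytes).to_checked();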

@rakita For sure! Thanks for the pointer. Are there any other changes you see that we could make for some impact? I noticed you changed some stuff about the OpInfo struct too, so maybe there's something we can use there in coverage or the debugger?

In some places (the debugger and coverage) we also build our own PC -> IC maps and it's not really ideal. Do you think it is possible to reuse the analysis made by REVM for this purpose as well? For context, source maps in Solidity use instruction counters (e.g. the offset of the instruction in the bytecode minus any push bytes), but obviously during execution we get the program counter instead.

I just transformed the internal representation of OpInfo so it is memory-aligned to u32. It has the same functionality as before, it is just a little bit more packed: IS_JUMP (1 bit) | IS_GAS_BLOCK_END (1 bit) | IS_PUSH (1 bit) | gas (29 bits).
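
An illustrative packing and unpacking of that layout into a u32; the exact bit positions and helper names in revm may differ:

const IS_JUMP: u32 = 1 << 31;
const IS_GAS_BLOCK_END: u32 = 1 << 30;
const IS_PUSH: u32 = 1 << 29;
const GAS_MASK: u32 = (1 << 29) - 1;

// Pack the three flags and a 29-bit gas value into a single u32.
fn pack_op_info(is_jump: bool, is_gas_block_end: bool, is_push: bool, gas: u32) -> u32 {
    debug_assert!(gas <= GAS_MASK, "gas must fit into 29 bits");
    (is_jump as u32) * IS_JUMP
        | (is_gas_block_end as u32) * IS_GAS_BLOCK_END
        | (is_push as u32) * IS_PUSH
        | (gas & GAS_MASK)
}

// Read the gas portion back out.
fn op_gas(info: u32) -> u32 {
    info & GAS_MASK
}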

Bytecode::Analysed only contains JUMPDEST and gas block information; it does not contain where the PUSHes are. You can probably check the code in revm and copy parts of it.
And don't use maps here: https://github.com/foundry-rs/foundry/blob/8016f26e160cb574c8c4f6e97d76e1386f956902/evm/src/coverage/visitor.rs#L473
Use a vector if you can.
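
A hedged sketch of building such a PC -> IC table with a flat vector, assuming only the standard EVM encoding where PUSH1..PUSH32 are opcodes 0x60..0x7f followed by their immediate data bytes:

// pc_to_ic[pc] is the instruction counter of the instruction starting at pc;
// push-data bytes map to the PUSH instruction that owns them.
fn pc_to_ic(code: &[u8]) -> Vec<u32> {
    let mut table = vec![0u32; code.len()];
    let mut ic = 0u32;
    let mut pc = 0usize;
    while pc < code.len() {
        let op = code[pc];
        // Immediate data bytes following this opcode (only PUSHes have any).
        let push_bytes = if (0x60..=0x7f).contains(&op) {
            (op - 0x5f) as usize
        } else {
            0
        };
        for offset in 0..=push_bytes {
            if pc + offset < code.len() {
                table[pc + offset] = ic;
            }
        }
        pc += 1 + push_bytes;
        ic += 1;
    }
    table
}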

Just for comparison, on the old revm 1.7 on the same test, I am getting around ~570ms.
That is a boost of around ~10% for raw and ~20% for checked bytecode.

Foundry uses the inspector and OpInfo, so there are probably some more performance gains there, and it still depends a lot on which contract is analysed. I didn't expect the solmate test to be so fast.

It would be fun to see how we can integrate Analysed bytecode; it should be even faster than Checked.
On create I am calling to_analysed, so now it makes sense to me where the speedup is.

gg