feat: Introduce ByteCode format
rakita opened this issue ยท 18 comments
With a restriction that contacts can not start with 0xEF
and with the possibility that in future we will have different ByteCode
for EVM Object Format, it makes sense to introduce support for it into the revm. https://eips.ethereum.org/EIPS/eip-3541
The biggest benefit atm would be introducing ByteCode
type that will contain jumptable and gas blocks and with that we would skip the needed analysis of the contract. This is a fairly big speed-u for some use cases noticed in the foundry as fuzzing tests call a lot of contracts with not much execution and analysis, in that case, is taking a lot of time.
The proposal is to introduce enum ByteCode
that would somehow return: code
, code_size
and jumptable
.
Basically replace this input with ByteCode
:
and propagate change wherever needed.
ByteCode
should set these fields in the contract:
revm/crates/revm/src/interpreter/contract.rs
Lines 11 to 13 in 0852824
and do the same for database struct:
- Return
ByteCode
:revm/crates/revm/src/db/traits.rs
Line 14 in 0852824
- Replace bytes with
BytesCode
:revm/crates/revm/src/models.rs
Line 24 in 0852824
To fully circle the story with EVM Object Format, CREATE and CREATE2 will need to return 0xEF type format in bytes that would get transformed to ByteCode
for evm usage here:
revm/crates/revm/src/evm_impl.rs
Lines 395 to 399 in 0852824
But this is a consensus rule and out of scope for this feat, we should just continue using legacy
ByteCode
here. If needed we could in Inspector create_end
transform that into ByteCode
.cc @onbjerg @brockelmore for when we'll want to further speedup fuzz tests
We would need a utility in either Forge or ethers-rs to convert from the old format to the new format given that the Solidity compiler does not support EOF yet
@onbjerg shoudn't utility be inside revm?
What i thought is to create/create2 use
ByteCode::Legacy {
bytes: Bytes,
}
And you would call something like ByteCode:Analyse(bytes)
inside create_end
that would replace bytecode in subroutine with
Bytecode::Analysed {
code:
size:
analysis:
}
Or just directly use it inside the db.
Makes sense, however I'm not sure what the gains are for Forge here, since we run each test in a database we later discard. The main benefactor of an optimization here would be fuzz tests, but since we discard the database and cannot be sure that the same bytecode will be at the same address in the next run, we can't really change that behavior - is there any way we can have a performant-ish bytecode cache based on the bytecode itself instead of addresses, that we could then reuse across test runs?
@onbjerg hm so contracts for fuzz tests are created at evm runtime. Can you run with me an outline of how fuzzing is done?
Setup phase (this state IS persisted):
- We deploy the main test contract (might create more than 1 contract)
- We call
setUp
(might create more contracts)
Then, for each fuzz run we decide on some parameters using different strategies (not important for this outline) and we:
- Call
testXyz
(assuming we're running fuzz tests fortestXyz
), running on a copy of the database we created in the setup phase - We use the new database from the previous step and call
failed
(implementation specific) to see if the test failed - We now know if the test succeeded or not, and in either case we discard the database
- If the test succeeded, we go through the previous 3 steps again until we either hit our target runs (256 by default), or we encounter a failing test case
In most cases we will probably end up deploying the same contracts, however, we don't enforce that that is the case, so some users might conditionally deploy extra contracts based on the fuzz inputs
Does that make sense?
This would help only to preanalyse contracts deployed inside Setup phase, as you could save analysed contracts in db.
But if the ratio of contracts made by fuzz tests vs setup stage is significant, this will not mean much.
I created three types of bytecode:
- Raw bytecode is bytecode that is not touched.
- Checked bytecode is extended with padding of 33 zero bytes.
- Analysed bytecode contains the analysis of jumptable and gas blocks.
So when implementing this and running a small contract 30k times I got this results:
Raw elapsed time: 517.084111ms
Checked elapsed time: 449.484905ms
Analyzed elapsed time: 253.557062ms
Raw and Checked are usually similar (checked is slightly better), while analysed bytecode execution is x1.5-2 faster.
Other then introducing Bytecode i added some more packed versions of jumptable and OpCode Info that got us a slight boost but nothing major. Fields for them became private but there should be alternative functions that can be called.
Just to visually compare it. First flamegraph is only raw bytecode, while the second flame graph is analysed bytecode.
First (Raw elapsed time: 459.801385ms):
Second ( Analysed elapsed time: 261.872689ms):
Once the Bytecode is analyzed, can we somehow cache it across instantiations of the EVM? e.g. in Foundry we instantiate a new EVM each time, so it'd be nice if we can store the analyzed bytecode and initialize the EVM with the already analyzed one
Once the Bytecode is analyzed, can we somehow cache it across instantiations of the EVM? e.g. in Foundry we instantiate a new EVM each time, so it'd be nice if we can store the analyzed bytecode and initialize the EVM with the already analyzed one
Database
trait now returns Bytecode
https://github.com/bluealloy/revm/pull/156/files#diff-2955e45b5fcaef33495479dce9f6bd07c88fbb96b6f633a4b31e9362a12adedbR22
So it becomes the cache that you can move between new EVM instances.
Here is an example of how one analysed contract is called multiple times: https://github.com/bluealloy/revm/pull/156/files#diff-c28df8837c5c77a683c9974376c1b53605fe01730f1b81b25c38771d58ecda6bR52-R59
Does this work for you?
I think that it should work, yeah. An update of REVM in Foundry should probably provide a noticeable speedup for fuzz tests that deploy their contracts in setUp
, but making it work for fuzz tests that deploy contracts in the fuzz test itself will need some more work
Will give it a shot soon and report back ๐
Seems to have shaved off a second of solmate's testing time on my machine with no special handling added! (just replaced the Bytes
stuff with Bytecode::new_raw
where needed)
Left some comments on the PR, but this is great work ๐ Thank you!
(Note: I couldn't believe it so I recompiled master locally to make sure it wasn't my local Rust compiler being smart, but same results - using the new bytecode format is faster)
Okay, i didnt expect it to be this big of a difference. Can you do two flamegraph between executions to better see where execution went down?
And thank you for review, will make those changes before pr merge.
i needed to use --no-inline
to make flamegraph finish and needed to set all debug=0
to debug=true
inside foundry/Cargo.toml
so that debug symbols are present. it is painful to build the foundry :) but i manage to create some graphs.
This is running solmate on old foundry:
As we can see call_inner-> revm::interpreter::contract::Contract::new
takes a lot of time to process
sidenote: Additional improvement can potentially be done inside insert_account_info
as it takes a lot of time to do hash of bytecode. We can probably precompute that. But this is the task for the future.
Second flamegraph is with the new revm and it does not have any Contact::new
section. I tried to inline it and change code calls for it to show up but i got nothing
I did optimize to_analysis to be better and more compact but results on the real contract test are amazing.
@onbjerg instead of new_raw(bytes)
you could use new_raw(bytes).to_checked()
this will speed it up a little bit and it is a small change.
@rakita For sure! Thanks for the pointer. Are there any other changes you see having some impact we can do? I noticed you changed some stuff about the OpInfo
struct too, so maybe there's something we can use there in coverage or the debugger?
In some places (the debugger and coverage) we also build our own PC -> IC maps and it's not really ideal. Do you think it is possible to reuse the analysis made by REVM for this purpose as well? For context, source maps in Solidity use instruction counters (e.g. the offset of the instruction in the bytecode minus any push bytes), but obviously during execution we get the program counter instead.
I just transformed internal representation of OpInfo
so it is memory aligned to u32, it has same functionality as before it is just a little bit more packed, it is packed as: IS_JUMP (1bit) | IS_GAS_BLOCK_END (1bit) | IS_PUSH (1bit) | gas (29bits)
Bytecode::Analysed
only contains JUMPDEST and gas block it does not contain where PUSH'es are. You can probably check the code in revm and copy parts of it.
And don use Maps here: https://github.com/foundry-rs/foundry/blob/8016f26e160cb574c8c4f6e97d76e1386f956902/evm/src/coverage/visitor.rs#L473
use vector if you can
Just for comparison, on old version of revm 1.7 on same test, i am getting around ~570ms.
That is boost of around ~10% for raw
and ~20%
for checked bytecode.
Foundry uses inspector and OpInfo
, there is probably some more performance gains there and it still depends a lot on what contract is analysed. Didnt expect solmate test to be so fast.
It would be fun to see how we can integrate Analysed bytecode, it should be even faster than Checked.
On create
i am calling to_analysed, now it makes sense for me where the speedup is.
gg