EVM MLIR

An EVM-bytecode to machine-bytecode compiler using MLIR and LLVM.

Progress

Implemented opcodes
  1. (0x00) STOP
  2. (0x01) ADD
  3. (0x02) MUL
  4. (0x03) SUB
  5. (0x04) DIV
  6. (0x05) SDIV
  7. (0x06) MOD
  8. (0x07) SMOD
  9. (0x08) ADDMOD
  10. (0x09) MULMOD
  11. (0x0A) EXP
  12. (0x0B) SIGNEXTEND
  13. (0x10) LT
  14. (0x11) GT
  15. (0x12) SLT
  16. (0x13) SGT
  17. (0x14) EQ
  18. (0x15) ISZERO
  19. (0x16) AND
  20. (0x17) OR
  21. (0x18) XOR
  22. (0x19) NOT
  23. (0x1A) BYTE
  24. (0x1B) SHL
  25. (0x1C) SHR
  26. (0x1D) SAR
  27. (0x20) KECCAK256
  28. (0x30) ADDRESS
  29. (0x31) BALANCE
  30. (0x32) ORIGIN
  31. (0x33) CALLER
  32. (0x34) CALLVALUE
  33. (0x35) CALLDATALOAD
  34. (0x36) CALLDATASIZE
  35. (0x37) CALLDATACOPY
  36. (0x38) CODESIZE
  37. (0x39) CODECOPY
  38. (0x3A) GASPRICE
  39. (0x41) COINBASE
  40. (0x42) TIMESTAMP
  41. (0x43) NUMBER
  42. (0x45) GASLIMIT
  43. (0x46) CHAINID
  44. (0x47) SELFBALANCE
  45. (0x48) BASEFEE
  46. (0x4A) BLOBBASEFEE
  47. (0x50) POP
  48. (0x51) MLOAD
  49. (0x52) MSTORE
  50. (0x53) MSTORE8
  51. (0x54) SLOAD
  52. (0x56) JUMP
  53. (0x57) JUMPI
  54. (0x58) PC
  55. (0x59) MSIZE
  56. (0x5A) GAS
  57. (0x5B) JUMPDEST
  58. (0x5E) MCOPY
  59. (0x5F) PUSH0
  60. (0x60) PUSH1
  61. (0x61) PUSH2
  62. (0x62) PUSH3
  63. (0x63) PUSH4
  64. (0x64) PUSH5
  65. (0x65) PUSH6
  66. (0x66) PUSH7
  67. (0x67) PUSH8
  68. (0x68) PUSH9
  69. (0x69) PUSH10
  70. (0x6A) PUSH11
  71. (0x6B) PUSH12
  72. (0x6C) PUSH13
  73. (0x6D) PUSH14
  74. (0x6E) PUSH15
  75. (0x6F) PUSH16
  76. (0x70) PUSH17
  77. (0x71) PUSH18
  78. (0x72) PUSH19
  79. (0x73) PUSH20
  80. (0x74) PUSH21
  81. (0x75) PUSH22
  82. (0x76) PUSH23
  83. (0x77) PUSH24
  84. (0x78) PUSH25
  85. (0x79) PUSH26
  86. (0x7A) PUSH27
  87. (0x7B) PUSH28
  88. (0x7C) PUSH29
  89. (0x7D) PUSH30
  90. (0x7E) PUSH31
  91. (0x7F) PUSH32
  92. (0x80) DUP1
  93. (0x81) DUP2
  94. (0x82) DUP3
  95. (0x83) DUP4
  96. (0x84) DUP5
  97. (0x85) DUP6
  98. (0x86) DUP7
  99. (0x87) DUP8
  100. (0x88) DUP9
  101. (0x89) DUP10
  102. (0x8A) DUP11
  103. (0x8B) DUP12
  104. (0x8C) DUP13
  105. (0x8D) DUP14
  106. (0x8E) DUP15
  107. (0x8F) DUP16
  108. (0x90) SWAP1
  109. (0x91) SWAP2
  110. (0x92) SWAP3
  111. (0x93) SWAP4
  112. (0x94) SWAP5
  113. (0x95) SWAP6
  114. (0x96) SWAP7
  115. (0x97) SWAP8
  116. (0x98) SWAP9
  117. (0x99) SWAP10
  118. (0x9A) SWAP11
  119. (0x9B) SWAP12
  120. (0x9C) SWAP13
  121. (0x9D) SWAP14
  122. (0x9E) SWAP15
  123. (0x9F) SWAP16
  124. (0xA0) LOG0
  125. (0xA1) LOG1
  126. (0xA2) LOG2
  127. (0xA3) LOG3
  128. (0xA4) LOG4
  129. (0xF3) RETURN
  130. (0xFD) REVERT
Not yet implemented opcodes
  1. (0x3B) EXTCODESIZE
  2. (0x3C) EXTCODECOPY
  3. (0x3D) RETURNDATASIZE
  4. (0x3E) RETURNDATACOPY
  5. (0x3F) EXTCODEHASH
  6. (0x40) BLOCKHASH
  7. (0x44) DIFFICULTY
  8. (0x49) BLOBHASH
  9. (0x55) SSTORE
  10. (0x5C) TLOAD
  11. (0x5D) TSTORE
  12. (0xF0) CREATE
  13. (0xF1) CALL
  14. (0xF2) CALLCODE
  15. (0xF4) DELEGATECALL
  16. (0xF5) CREATE2
  17. (0xFA) STATICCALL
  18. (0xFE) INVALID
  19. (0xFF) SELFDESTRUCT
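
The list above can be turned into a quick pre-check that scans bytecode for opcodes the compiler does not support yet. The following is a minimal sketch and not part of the project: the function name `unsupported_opcodes` is hypothetical, and the table is copied from the list above, so it goes stale as opcodes get implemented. Bytes that appear in neither list are not flagged.

```rust
/// Opcodes listed above as not yet implemented.
const NOT_IMPLEMENTED: [u8; 19] = [
    0x3B, 0x3C, 0x3D, 0x3E, 0x3F, 0x40, 0x44, 0x49, 0x55, 0x5C,
    0x5D, 0xF0, 0xF1, 0xF2, 0xF4, 0xF5, 0xFA, 0xFE, 0xFF,
];

/// Returns `(offset, opcode)` for every not-yet-implemented opcode in `bytecode`.
/// Hypothetical helper, written for illustration only.
fn unsupported_opcodes(bytecode: &[u8]) -> Vec<(usize, u8)> {
    let mut found = Vec::new();
    let mut pc = 0;
    while pc < bytecode.len() {
        let opcode = bytecode[pc];
        if NOT_IMPLEMENTED.contains(&opcode) {
            found.push((pc, opcode));
        }
        // PUSH1 (0x60) to PUSH32 (0x7F) are followed by 1 to 32 immediate
        // bytes. Those bytes are data, so they are skipped, not decoded.
        let immediate_len = if (0x60..=0x7F).contains(&opcode) {
            (opcode - 0x5F) as usize
        } else {
            0
        };
        pc += 1 + immediate_len;
    }
    found
}

fn main() {
    // PUSH1 0x55, PUSH0, SSTORE, STOP.
    // The first 0x55 is a PUSH1 immediate; the second one is SSTORE.
    let bytecode = [0x60, 0x55, 0x5F, 0x55, 0x00];
    for (offset, opcode) in unsupported_opcodes(&bytecode) {
        println!("offset {offset}: opcode 0x{opcode:02X} is not implemented yet");
    }
}
```

Skipping the PUSH immediates matters: without it, the data byte 0x55 at offset 1 would be misreported as SSTORE.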

Getting Started

Dependencies

  • Linux or macOS (aarch64 included); other platforms are not supported for now
  • LLVM 18 with MLIR: on Debian you can use apt.llvm.org, on macOS you can use brew
  • Rust
  • Git

Setup

Run the following make target to install the dependencies. This step applies to both Linux and macOS:

make deps

Linux

Since Linux distributions vary widely, you need to install LLVM 18 via your package manager, compile it from source, or check whether the current release has a Linux binary.

If you are on Debian/Ubuntu, check out the repository at https://apt.llvm.org/. Then you can install with:

sudo apt-get install llvm-18 llvm-18-dev llvm-18-runtime clang-18 clang-tools-18 lld-18 libpolly-18-dev libmlir-18-dev mlir-18-tools

If you decide to build from source, here are some pointers:

Install LLVM from source instructions
# Go to https://github.com/llvm/llvm-project/releases
# Download the latest LLVM 18 release:
# The blob to download is called llvm-project-18.x.x.src.tar.xz

# For example
wget https://github.com/llvm/llvm-project/releases/download/llvmorg-18.1.4/llvm-project-18.1.4.src.tar.xz
tar xf llvm-project-18.1.4.src.tar.xz

cd llvm-project-18.1.4.src
mkdir build
cd build

# The following cmake command configures the build to be installed to /opt/llvm-18
cmake -G Ninja ../llvm \
   -DLLVM_ENABLE_PROJECTS="mlir;clang;clang-tools-extra;lld;polly" \
   -DLLVM_BUILD_EXAMPLES=OFF \
   -DLLVM_TARGETS_TO_BUILD="Native" \
   -DCMAKE_INSTALL_PREFIX=/opt/llvm-18 \
   -DCMAKE_BUILD_TYPE=RelWithDebInfo \
   -DLLVM_PARALLEL_LINK_JOBS=4 \
   -DLLVM_ENABLE_BINDINGS=OFF \
   -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DLLVM_ENABLE_LLD=ON \
   -DLLVM_ENABLE_ASSERTIONS=OFF

ninja install

Set the environment variables MLIR_SYS_180_PREFIX, LLVM_SYS_180_PREFIX and TABLEGEN_180_PREFIX so that they point to the LLVM directory:

# For Debian/Ubuntu using the repository, the path will be /usr/lib/llvm-18
export MLIR_SYS_180_PREFIX=/usr/lib/llvm-18
export LLVM_SYS_180_PREFIX=/usr/lib/llvm-18
export TABLEGEN_180_PREFIX=/usr/lib/llvm-18
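
A build that cannot find LLVM usually fails late, so it can help to check the three variables up front. The sketch below is illustrative and not part of the project. The helper name `missing_vars` is hypothetical, and the lookup is passed in as a closure so the check can be exercised without touching the real environment.

```rust
use std::env;

/// The three prefix variables named in the setup instructions.
const REQUIRED_VARS: [&str; 3] = [
    "MLIR_SYS_180_PREFIX",
    "LLVM_SYS_180_PREFIX",
    "TABLEGEN_180_PREFIX",
];

/// Returns the required variables that `lookup` reports as unset or empty.
/// Hypothetical helper, written for illustration only.
fn missing_vars<F>(lookup: F) -> Vec<&'static str>
where
    F: Fn(&str) -> Option<String>,
{
    REQUIRED_VARS
        .iter()
        .copied()
        .filter(|&name| lookup(name).map_or(true, |value| value.is_empty()))
        .collect()
}

fn main() {
    // Query the real process environment.
    let missing = missing_vars(|name| env::var(name).ok());
    if missing.is_empty() {
        println!("all LLVM/MLIR prefix variables are set");
    } else {
        eprintln!("missing environment variables: {}", missing.join(", "));
    }
}
```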

Run the deps target to install the other dependencies.

make deps

macOS

The Makefile deps target (which you should have run already) installs LLVM 18 with brew for you. Afterwards, execute the env-macos.sh script to set up the environment.

source scripts/env-macos.sh

Running

To run the compiler, call cargo run while passing it a file with the EVM bytecode to compile. There are some example files under programs/, for example:

cargo run programs/push32.bytecode

You can also specify the optimization level:

cargo run programs/push32.bytecode 3  # ranges from 0 to 3
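
The command line shape is a path followed by an optional optimization level from 0 to 3. The following sketch shows one way such arguments could be validated. It is not the project's actual src/main.rs: the `Args` struct and `parse_args` function are hypothetical, and the default level of 0 when none is given is an assumption of this sketch.

```rust
use std::env;

/// Parsed command-line arguments (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
struct Args {
    path: String,
    opt_level: u8,
}

/// Parses `<path> [opt_level]`, rejecting levels outside 0..=3.
fn parse_args<I: Iterator<Item = String>>(mut args: I) -> Result<Args, String> {
    let path = args.next().ok_or("missing path to a bytecode file")?;
    let opt_level = match args.next() {
        // Assumption: fall back to level 0 when no level is given.
        None => 0,
        Some(raw) => match raw.parse::<u8>() {
            Ok(level) if level <= 3 => level,
            _ => return Err(format!("optimization level must be 0 to 3, got {raw:?}")),
        },
    };
    Ok(Args { path, opt_level })
}

fn main() {
    // Skip the program name, then parse the remaining arguments.
    match parse_args(env::args().skip(1)) {
        Ok(args) => println!("would compile {} at level {}", args.path, args.opt_level),
        Err(msg) => eprintln!("error: {msg}"),
    }
}
```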

Testing

To run only the Ethereum Foundation tests, run make test-eth. To run the rest of the tests (those that are not Ethereum Foundation tests), run make test.

Debugging the compiler

Compile a program

To generate the necessary artifacts, you need to run cargo run <filepath>, with <filepath> being the path to a file containing the EVM bytecode to compile.

Writing EVM bytecode directly can be a bit difficult, so you can edit src/main.rs, modifying the program variable with the structure of your EVM program. After that you just run cargo run.

An example edit would look like this:

fn main() {
    let program = vec![
        Operation::Push0,
        Operation::PushN(BigUint::from(42_u8)),
        Operation::Add,
    ];
    // ...
}
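
To see how the structured program relates to raw bytecode, the sketch below encodes the same three operations by hand. It uses a simplified stand-in enum named `Op`, which is hypothetical: the project's `Operation` type takes a `BigUint` and covers many more opcodes. The opcode values come from the list of implemented opcodes.

```rust
/// Simplified stand-in for the project's `Operation` type (hypothetical).
enum Op {
    Push0,
    /// Immediate bytes in big-endian order, 1 to 32 bytes long.
    PushN(Vec<u8>),
    Add,
}

/// Encodes a program into EVM bytecode.
fn encode(program: &[Op]) -> Result<Vec<u8>, String> {
    let mut bytecode = Vec::new();
    for op in program {
        match op {
            Op::Push0 => bytecode.push(0x5F),
            Op::PushN(imm) => {
                if imm.is_empty() || imm.len() > 32 {
                    return Err(format!(
                        "PUSH immediate must be 1 to 32 bytes, got {}",
                        imm.len()
                    ));
                }
                // PUSH1 is 0x60, PUSH2 is 0x61, and so on up to PUSH32 at 0x7F.
                bytecode.push(0x5F + imm.len() as u8);
                bytecode.extend_from_slice(imm);
            }
            Op::Add => bytecode.push(0x01),
        }
    }
    Ok(bytecode)
}

fn main() {
    let program = vec![Op::Push0, Op::PushN(vec![42]), Op::Add];
    let bytecode = encode(&program).expect("valid program");
    let hex: String = bytecode.iter().map(|b| format!("{b:02x}")).collect();
    println!("{hex}"); // prints "5f602a01"
}
```

Reading the output byte by byte: 5f is PUSH0, 60 2a is PUSH1 with the immediate 42, and 01 is ADD.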

Inspecting the artifacts

The most useful ones to inspect are the MLIR-IR (<name>.mlir) and Assembly (<name>.asm) files. The first one has a one-to-one mapping with the operations added in the compiler, while the second one contains the instructions that are executed by your machine.

The other generated artifacts are:

  • Semi-optimized MLIR-IR (<name>.after-pass.mlir)
  • LLVM-IR (<name>.ll)
  • Object file (<name>.o)
  • Executable (<name>)

Running with a debugger

Once we have the executable, we can run it with a debugger (here we use lldb, but you can use others). To run with lldb, use lldb <name>.

To run until we reach our main function, we can use:

br set -n main
run

Running a single step

thread step-inst

Reading registers

All registers: register read

The x0 register (an aarch64 register; use the corresponding register name on other architectures): register read x0

Reading memory

To inspect the memory at <address>: memory read <address>

To inspect the memory at the address given by the register x0: memory read $x0

Reading the EVM stack

To pretty-print the EVM stack at address X: memory read -s32 -fu -c4 X

Reference:

  • The -s32 flag groups the bytes in 32-byte chunks.
  • The -fu flag interprets the chunks as unsigned integers.
  • The -c4 flag includes 4 chunks: the one at the given address plus the three next chunks.
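
The grouping that these flags perform can be reproduced on a raw memory dump. The sketch below splits a byte buffer into 32-byte words and renders each one as a hexadecimal integer. It is illustrative only: the function name `stack_words` is hypothetical, and it assumes each word is stored in little-endian byte order, which is an assumption about the host and not something this document states.

```rust
/// Splits a raw memory dump into 32-byte words and renders each as hex.
/// Assumption: each word is stored in little-endian byte order.
/// Trailing bytes that do not fill a whole word are ignored.
fn stack_words(dump: &[u8]) -> Vec<String> {
    dump.chunks_exact(32)
        .map(|word| {
            // In little-endian order the most significant byte comes last,
            // so the bytes are printed in reverse.
            let hex: String = word.iter().rev().map(|b| format!("{b:02x}")).collect();
            let trimmed = hex.trim_start_matches('0');
            if trimmed.is_empty() {
                "0x0".to_string()
            } else {
                format!("0x{trimmed}")
            }
        })
        .collect()
}

fn main() {
    let mut dump = vec![0u8; 64];
    dump[0] = 42; // first word holds 42 (0x2a)
    dump[33] = 1; // second word holds 256 (0x100)
    for (slot, word) in stack_words(&dump).iter().enumerate() {
        println!("slot {slot}: {word}");
    }
}
```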

Restarting the program

To restart the program, just use run again.