A tiny stack machine for learning Ethereum bytecode (WIP).
I created TinyEVM as a mean to learn about Ethereum Virtual Machine. TinyEVM parses a piece of Ethereum bytecode compiled from smart contract languages like Solidity, convert them to the a vector of opcodes, and use the stack to do computations.
The stack has a word size of 32 bytes (256-bit) and a limit of 1024 items.
The following bytecode 0x6001600101
is equivalent to the following:
bytecode | opcode | description |
---|---|---|
0x60 | PUSH1 | Push the next byte of bytecode on the stack |
0x01 | The bytecode to push | |
0x60 | PUSH1 | Push the next byte on the stack |
0x01 | The bytecode to push | |
0x01 | ADD | Pop the last and second last items, add them up, and push the result back on the stack |
use tinyevm::{
lex_bytecode, eval_opcode, Stack,
types::{to_uint256, Endian},
};
fn main() {
let opcodes = lex_bytecode("0x6001600101").unwrap();
let (mut stack, _storage, _memory) = eval_opcode(opcodes);
let (last, rest) = stack.pop();
// Check that the result is 2.
assert_eq!(last.unwrap(), to_uint256(&[0x02], Endian::Little));
// Check that the stack is now empty.
assert_eq!(*rest, Stack::EMPTY);
}
The bytecode 0x6060604052
can be translated as the following:
bytecode | opcode | description |
---|---|---|
0x60 | PUSH1 | Push the next byte of bytecode onto the stack |
0x60 | The bytecode to push (1 byte) | |
0x60 | PUSH1 | Push the next byte onto the stack |
0x40 | The bytecode to push (1 byte) | |
0x52 | MSTORE | Pop 0x40 and 0x60 and store 0x60 at 0x40 location |
use tinyevm::{
lex_bytecode, eval_opcode,
types:UInt256,
};
fn main() {
let result = lex_bytecode("0x6060604052").unwrap();
let (mut stack, _, memory) = eval_opcode(result);
let (last, _) = stack.pop();
// Check that the stack is empty.
assert!(last.is_none());
// Load the memory at the 0x40 location.
let value: UInt256 = memory.load(0x40_usize);
let mut bytes = [0x00; 32];
value.to_little_endian(&mut bytes);
// Check that the value is 0x60.
assert_eq!(bytes[0], 0x60);
}
The bytecode 0x6101006201302f01
can be translated as the following:
bytecode | opcode | description |
---|---|---|
0x61 | PUSH2 | Push the next 2 bytes of bytecode onto the stack |
0x0100 | The bytecode to push (2 bytes) | |
0x62 | PUSH3 | Push the next 3 bytes onto the stack |
0x01302f | The bytecode to push (3 bytes) | |
0x01 | ADD | Pop last two items, add them, and push the sum back onto the stack |
use tinyevm::{lex_bytecode, eval_opcode}
fn main() {
let opcodes = lex_bytecode("0x6101006201302f01").unwrap();
let (mut stack, _, _) = eval_opcode(opcodes);
let (last, rest) = stack.pop();
// Check that 0x01302f + 0x0100 is equal to 77871 + 256
let expected = &format!("{:x}", 77871 + 256);
assert_eq!(last.unwrap(), UInt256::from_str_radix(expected, 16).unwrap());
// Check that the stack is now empty.
assert_eq!(*rest, Stack::EMPTY);
}
Run the above test case with cargo test test_eval_add
.
To run the test on the bytecode 0x6002600202
(5 bytes) or 2 * 2
, run cargo test test_eval_mul
.
To run test on the bytecode 0x6001600081905550
(8 bytes), run cargo test test_eval_push1_dup2_swap1_sstore_pop
Here is a list of supported opcodes.
Opcode | bytecode |
---|---|
ADD | 0x01 |
MUL | 0x02 |
SUB | 0x03 |
DIV | 0x04 |
ISZERO | 0x15 |
POP | 0x50 |
MSTORE | 0x52 |
SSTORE | 0x55 |
PUSH1 | 0x60 |
PUSH2 | 0x61 |
PUSH3 | 0x61 |
DUP1 | 0x80 |
DUP2 | 0x81 |
SWAP1 | 0x90 |
See also: evm-opcodes and ethervm.io
At the moment this is meant to be used as a Rust library. Here is an example of evaluating a simple piece of bytecode.
use tinyevm::{
lex_bytecode, eval_opcode, Stack,
types::{to_uint256, Endian},
};
fn main() {
let opcodes = lex_bytecode("0x6001600101").unwrap();
let (mut stack, _storage, _memory) = eval_opcode(opcodes);
let (last, rest) = stack.pop();
assert_eq!(last.unwrap(), to_uint256(&[0x02], Endian::Little));
assert_eq!(*rest, Stack::EMPTY);
}
It is also possible to read opcode files generated with Solidity compiler solc --opcodes -o tests
:
use tinyevm::Opcode;
fn main() {
let opcodes = Opcode::from_file("tests/Example.opcode");
let (_stack, _storage, _memory) = eval_opcode(opcodes);
}
Because all opcodes are treated as Opcode(u8)
. Any chunk of bytecodes greater than a byte being pushed to the stack (i.e. with PUSH2
, PUSH3
, and so on) is chopped up byte-by-byte. For example, PUSH3 0x2030ff
will be interpreted as a vector of following opcodes
vec![Opcode(0x61), Opcode(0x20), Opcode(0x30), Opcode(0xff)]
and reassembled during evaluation.