Dasm (.dasm) is a made up assembly language that runs on a made up virtual machine.
The vm is a register machine and is intentionally unsafe, with no
separation of code and data. Code is mutable, and you can memcpy functions
onto your stack and execute them if you want. The instruction pointer is
a writable register: you can move rip [an array of bytes here] to start
executing arbitrary data, etc. The only safety feature is that a shadow
stack is used for return addresses, so you can't do ROP (not that that matters
if you can just overwrite the code...). That was only implemented so that
backtraces would work.
Dasm only works with machine-word sized values.
Dasm has trivial native interop. An adaptor function has to be written that casts
from uintptr_t to the actual type. You currently can't call native functions
with floating point arguments.
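The adaptor idea can be illustrated outside the vm. The sketch below is Python, not the project's own interop code, and every name in it (to_word, native_abs, abs_adaptor) is hypothetical. It assumes size_t is as wide as uintptr_t, which holds on common platforms but is not guaranteed. It shows a wrapper that receives a raw machine word, reinterprets it as the type the native function expects, and turns the result back into a word.

```python
import ctypes

# Assumption: size_t/ssize_t have the same width as uintptr_t on this host.

def to_word(value: int) -> int:
    """Wrap a Python int into an unsigned machine word."""
    return ctypes.c_size_t(value).value

def native_abs(x: int) -> int:
    """Stand-in for a native function that expects a signed integer."""
    return -x if x < 0 else x

def abs_adaptor(arg_word: int) -> int:
    """Hypothetical adaptor: word in, word out, casting in between."""
    signed = ctypes.c_ssize_t(arg_word).value  # reinterpret the word as signed
    return to_word(native_abs(signed))

minus_five = to_word(-5)        # the bit pattern a word-sized register would hold for -5
print(abs_adaptor(minus_five))  # 5
```

The vm side only ever sees words; all knowledge of the real argument types lives in the adaptor.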
The Dasm vm has a debugger built in. You can step through instructions, read registers, etc. It also has a disassembler. It doesn't try to catch faults in native code, but you can run the vm in a native debugger for that. It is not a source debugger yet; it just disassembles the code.
It has regular decimal literals, binary literals and hex literals.
It has pointer-size literals (0p41ad), which are hex literals that
are multiplied by the size of uintptr_t at assembly time.
It has string number literals (0sWhatever), which are integer literals where
each character is converted to its ASCII value and slotted into the resulting
number, with the first character in the lowest byte. 0sWhatever would be
equivalent to 0x7265766574616857 in hex. They can be up to the word size of the host.
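As a cross-check of the two unusual literal forms, here is a small Python sketch. It is not the assembler's code, and the function names are made up for illustration. It assumes the first character of a string number literal lands in the lowest byte, which is what the 0sWhatever example implies, and it takes the host's pointer size from struct.calcsize("P").

```python
import struct

WORD_SIZE = struct.calcsize("P")  # size of a pointer on the host, in bytes

def string_number(text: str, word_size: int = WORD_SIZE) -> int:
    """Value of a 0s literal: ASCII bytes packed with the first char lowest."""
    data = text.encode("ascii")
    if len(data) > word_size:
        raise ValueError("literal does not fit in a machine word")
    return int.from_bytes(data, "little")

def pointer_size_literal(hex_digits: str, word_size: int = WORD_SIZE) -> int:
    """Value of a 0p literal: the hex number times the size of uintptr_t."""
    return int(hex_digits, 16) * word_size

print(hex(string_number("Whatever", 8)))     # 0x7265766574616857
print(hex(pointer_size_literal("41ad", 8)))  # 0x20d68, assuming 8-byte words
```

The word size is passed explicitly in the two calls so the printed values do not depend on the machine running the sketch.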
Code must be contained within functions. Functions have a name, an optional number
indicating the number of arguments to the function, a series of statements and then
an end. The last statement of a function must be one of abort, halt or ret, to
guard against just falling off the end of the function.
Hello world in dasm:
function start 0
    move rarg1 "Hello World"
    call function Puts
    ret
end
Hackerman programming in dasm:
function death
    abort
    move rarg1 "Life after Death"
    call function Puts
    ret
end
function hackerman
    # functions are a pointer to a function object,
    # whose first member is a pointer to the instructions.
    read r0 function death
    # Add a word to that and store in rip, bypassing the
    # abort
    add rip r0 0p1
    # die if we get here, to show we don't return to
    # this function
    abort
end
function start
    call function hackerman
    halt
end
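To see why the ret in death lands back in start rather than in hackerman, it can help to model the control flow. The following Python sketch is a toy, not the real vm: the ToyVM class, its opcodes (set_rip, puts) and the one-slot-per-instruction layout are all assumptions made for illustration. It keeps return addresses on a separate shadow stack and lets the program overwrite rip directly.

```python
class ToyVM:
    """Hypothetical model: mutable code list, writable rip, shadow stack."""

    def __init__(self, code):
        self.code = code      # list of (op, arg) pairs
        self.rip = 0          # instruction pointer
        self.shadow = []      # return addresses only, kept apart from data
        self.out = []         # captured output instead of calling Puts
        self.halted = False

    def step(self):
        op, arg = self.code[self.rip]
        self.rip += 1
        if op == "call":
            self.shadow.append(self.rip)  # return address goes on the shadow stack
            self.rip = arg
        elif op == "ret":
            self.rip = self.shadow.pop()
        elif op == "set_rip":             # stands in for writing the rip register
            self.rip = arg
        elif op == "puts":
            self.out.append(arg)
        elif op == "halt":
            self.halted = True
        elif op == "abort":
            raise RuntimeError("abort at %d" % (self.rip - 1))

    def run(self):
        while not self.halted:
            self.step()
        return self.out

DEATH = 4  # index of death's first instruction in this toy layout
program = [
    ("call", 2),                   # 0 start: call hackerman
    ("halt", None),                # 1
    ("set_rip", DEATH + 1),        # 2 hackerman: skip past death's abort
    ("abort", None),               # 3 never reached
    ("abort", None),               # 4 death: abort
    ("puts", "Life after Death"),  # 5
    ("ret", None),                 # 6 pops start's return address
]
print(ToyVM(program).run())  # ['Life after Death']
```

Because hackerman entered death by writing rip instead of using call, the only return address on the shadow stack is the one pushed by start, so death's ret resumes at the halt.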
Davescript (.ds) is a barely implemented scripting language that can be executed
on the dasm vm.
Currently, this is implemented by compiling a .ds script to .dasm as text,
and then compiling that into the bytecode instructions that run on the actual
vm. This is silly and it should just compile directly to bytecode, but oh well.
The compiler is pretty buggy at the moment; it's not hard to get it to generate wrong code.
Davescript is basically C-like.
Davescript has functions:
import io;
function hello(){
    io.puts("hello");
}
Functions are called in the usual manner, as shown above.
It has declarations, at both file and function scope. Arguments to functions are like declarations.
let x = 3;
function foo(a){
    let b = a + x;
    return b;
}
It has control flow:
import io;
function loops(n){
    for(let x = 0; x < n; x = x + 1){
        io.printf2("x: %zu\n", x);
    }
    let y = 3;
    while(y < n){
        y = y * 2;
    }
    if(y > 10){
        return 3;
    }
    return 4;
}
function start(){
    let x = loops(4);
    io.printf2("loops(4) = %zu\n", x);
}
It has guaranteed tail calls:
function fib(n){
    return fib_inner(0, 1, n);
}
function fib_inner(a, b, n){
    if(n <= 0) return a;
    return fib_inner(b, a+b, n-1);
}
import io;
function start(){
    for(let i = 0; i < 20; i = i + 1)
        io.printf2("fib(%zu) = %zu\n", i, fib(i));
}
Which compiles to:
import io
function fib 1
    move r0 r10
    move r10 0
    move r11 1
    move r12 r0
    tail_call function fib_inner
end
function fib_inner 3
    move r0 r10
    move r1 r11
    move r2 r12
    move r3 r2
    scmp r3 0
    jump gt label L0
    move r3 r0
    move r15 r3
    ret
    label L0
    move r10 r1
    move r11 r0
    move r3 r1
    add r11 r11 r3
    move r12 r2
    sub r12 r12 1
    tail_call function fib_inner
end
function start 0
    move r0 0
    label L0
    move r1 r0
    scmp r1 20
    jump ge label L1
    move r10 "fib(%zu) = %zu\n"
    push r10
    move r11 r0
    push r11
    move r10 r0
    push r0
    call function fib
    pop r0
    move r12 rout1
    pop r11
    pop r10
    push r0
    call function io.printf2
    pop r0
    move r1 r0
    add r1 r1 1
    move r0 r1
    move rip label L0
    label L1
    ret
end
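Read as ordinary control flow, the tail_call in fib_inner behaves like a loop: r10, r11 and r12 are overwritten with the new arguments and control goes back to the top of the function instead of growing the call stack. The Python sketch below illustrates that reading. It is not produced by the compiler, and unlike dasm it uses unbounded integers, so large results do not wrap at the machine word size.

```python
def fib_inner(a: int, b: int, n: int) -> int:
    """Loop form of the tail-recursive fib_inner: rebind the arguments and repeat."""
    while True:
        if n <= 0:
            return a
        # Same effect as tail_call: new arguments replace the old ones in place.
        a, b, n = b, a + b, n - 1

def fib(n: int) -> int:
    return fib_inner(0, 1, n)

print(fib(10))  # 55
print(fib(19))  # 4181
```

Since no frame is added per step, the depth of the recursion in the Davescript source does not affect stack usage, which is the point of the tail calls being guaranteed.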
It has regular decimal literals, binary literals, hex literals, pointer-size literals and string number literals.
start is the entry point for a davescript. You need to define a start function. The arguments to start are implementation defined.
All references to functions and variables are checked. Failure to define one you use will result in a linking error. There are some builtins that you don't have to define.
There are no guaranteed builtin functions. All builtin functions are exposed in code. The current implementation exposes some std c functions.
A collection of sample dasm and davescript files is in the Sample folder.
Dasm currently has way more features than davescript.
- The dasm was originally written in C and then ported to D, more or less the exact same way.
- ddasm should be split into a vm, an assembler, a linker and a default runtime.
- I don't know if it works on Windows.
- I don't use exceptions (in fact I only use -betterC) and so the code is totally exception-unsafe.
- The Makefile is set up for ldc2 on macOS. The only mac-specific thing it does is pass -dead_strip to the linker, so you can do make LDSTRIP= and it should probably build on linux without that. Might need to add libm to the link line.
- The Makefile doesn't work for Windows. It wouldn't be hard to make it work, but I haven't done it.
- There is also a meson build file. It works, but meson passes weird linker flags and the documentation is non-existent.
- You can also just build it by doing ldc2 ddasm.d -i and that'll just work. Throw in -betterC too.
- This only builds with ldc as it uses an ldc simd intrinsic to accelerate escaping characters in strings. It could be made to work without that by wrapping that in a version(){} block.
- I made it build with dub at some point. I don't know what I'm doing as plain dub asserts, but dub build works.