Typechecking - a contextual story
Closed this issue · 1 comments
A Typed Assembly Language is nothing without types (not nothing, but not a TAL, which defeats the entire purpose of being a TAL).
Types in N* serve the purpose of indicating which context is expected at which point in the program. As explicit types are attached to label, one may jump to those labels only if the current context contains at least the correct types for each register expected.
So, if we have this code:
F: { %rax: s64 }
If we were to want to jmp F
, we would need that %rax: s64
in the current context.
But what if we also have %rbx: u64
? Well that's alright. We can treat label type records as extensible row types. So the code above is really
F: forall (r: Row). { %rax: s64 | r }
and the row r
is infered at the jump site. This allows dropping/ignoring registers in the next context.
This however leads to a delicate problem. Consider this simple example:
mov %rax, 1 # %rax: s64
mov %rbx, 2 # %rbx: s64
jmp f
g: { %rax: s64 }
# what's the type of `%rbx` here? `char` or `s64`?
mov %rax, 0
int
f: { %rax: s64 }
mov %rbx, 'c' # %rbx: char
jmp g
When jmp g
is analyzed, it only requires the context to provide a %rax: s64
, but any information about %rbx
is completely lost. What would this mean? That we cannot read from %rbx
unless %rbx
is set before? This could be the way to go. In fact, it would mean that %rbx
does not appear to have valid data (or this cannot be determined, unless you add %rbx
to the label of f
), therefore it is quite normal to disallow reading from it.
I'm not sure if this is the correct way to go, though. This makes a bit of sense in my head at the moment, but there may be a better alternative that I didn't consider.
I was thinking a bit about my example here:
mov %rax, 1 # %rax: s64 mov %rbx, 2 # %rbx: s64 jmp f g: { %rax: s64 } # what's the type of `%rbx` here? `char` or `s64`? mov %rax, 0 int f: { %rax: s64 } mov %rbx, 'c' # %rbx: char jmp g
and realized that this is all deadcode without a main
(or _start
, it will be decided later on) label at the top.
I'm not sure if the first three lines should be completely ignored (for not being after at least one label -- thus having no context) or typechecked accordingly (beginning with an empty context with only label types), even though their types do not really matter as they won't even be executed (stripping them off would be done with any optimisation level greater or equal to 1, so with for example a -On
flag).