Crossing labels: is it illegal?

Question

Crossing labels: is it illegal?

Closed this issue 4 years ago · 6 comments

Our type system for our custom assembly language tries to prevent any runtime issue by adding types. For example, we can prevent missing registers when returning to our caller.
But there are things that may not be quite good to have. Let's consider this example:

f: forall (s: Ts). { %rsp: sptr *{ %rsp: sptr s, %rax: s64 }::s64::s }
    mov -8(%rsp), %rbx
    cmp 2, %rbx
    jz f_prime<s>     # jump if %rbx == 2

f_ret: forall (s: Ts). { %rbx: s64, %rsp: sptr *{ %rsp: sptr s, %rax: s64 }::s }
    mov %rbx, %rax
    ret

f_prime: forall (s: Ts). { %rbx: s64, %rsp: sptr *{ %rsp: sptr s, %rax: s64 }::s }
    mov 0, %rbx
    jmp f_ret

How safe is it to "jump" to f_ret? How can we infer a simple jmp f_ret<s> after jz f_prime<s> so that the jump becomes explicit, but only if needed (and how do we infer the specialization)? And how could we ensure that the context when crossing f_ret from f is correct?

Answer 1 · 2020-09-20T12:12:09.000Z

I got an idea 2 or 3 days ago: maybe we could generate a jump graph, in order to determine which "context" (where the context of a label l is a just some instructions following l until the next label (or end on file)) depends on which one.

This would mean that all jumps have to be explicit (so this code above would not typecheck for some reason (to be specified, because I don't really know how to typecheck this)) and fallthroughs would be forbidden.

Answer 2 · 2020-09-20T15:48:59.000Z

Also, it might make sense to instead of forbidding fallthrough, having it restricted by storing whether jumps in the jump graph are possible (for example on a jz) or with return (calls) or always coming (jmps). That way, if we either only have possible jumps or some that return (while considering ret), we could allow some sort of safe fallthroughs. The problem still being how to infer specialisations as in the last question of the issue.

However, I don't believe this really is that useful considering the possible complexity of the whole problem (not the algorithmic complexity, but the time spent trying to develop this). I'd rather disallow fallthroughs even if it is a bit more time consuming to explicitly put jmps, if it really simplifies a lot the time taken to develop the typechecker and the overall complexity of it.

There may be better alternatives, but can't see any at the moment. This really is a big issue for the typechecker at the moment, and I'm afraid that postponing this issue will break the whole typechecker when it will be considered.

Answer 3 · 2020-09-21T06:16:05.000Z

I found two other alternatives:

we could have untyped labels (just like in TALx86), which would allow fallthroughs, but disallow jumps. The only problem with the example above is that some code needs to be duplicated at some point, to allow the fallthrough (because we cannot jump to f_ret as it would be an untyped label).
we could introduce a scope notion (perhaps by space-sensitive parsing), and restrict jumps only to labels in the current scope (or one of its superscopes). I don't see exactly how this would make things easier to handle, mostly when considering fallthroughs, as there still is a specialization problem anyway.

Answer 4 · 2020-09-21T16:53:15.000Z

The idea behind a jump graph (with fallthrough disallowed) would be something like this graph:

Each nodes would be the label name, and each oriented edge would tell what jump kind it embodies.

Note: This package can be used to represent this graph.

Answer 5 · 2020-09-21T20:11:44.000Z

I was thinking about the scoping method. While this in fact does not have any problem with specialization (because the type variable s would be scoped, so could be reused), it would introduce some notion of scoping which I believe isn't something that assembly languages should have. While this might be far easier to implement than actually creating a control flow graph of the whole program, I don't think that it remains in the spirit of good old assembly languages.

Answer 6 · 2020-09-24T07:14:44.000Z

We might as well make two passes, where the first one is type checking, and the second one is some "branch checking", which determines whether labels fallthrough or not, if each call has its own ret (modulo jmps). It must be slightly easier to handle that way.