angr/pyvex

How to model program termination

Closed this issue · 9 comments

Hi 👋

I’m writing my own lifter for eBPF bytecode and have questions on how to lift an exit instruction (the one that signals exiting from the program):

  1. Should I invoke self.jump(None, ???, JumpKind.Exit)? If yes, I don’t understand what to put in place of ???
  2. The program should exit with the value of one of its registers (r0). How can one model that?

Hi, yes, but it's for BPF, not extended BPF, which has a different ISA (e.g., more registers and a different instruction layout).

If you meant to recommend using it as a reference, I see that the RET instruction implementation for BPF has the code below, but I'm not sure about two things:

  1. Whether I should do the same thing on program termination: it seems to assume that RET just returns from a function, while I want program termination
  2. There is a FIXME, so I'm not sure it's a good reference to look at

self.jump(0, self.constant(MAX_INSTR_ID * 8, Type.int_32),  # TODO: FIXME
          JumpKind.Ret)

The best way to do this is to jump to a specific terminator address and then have the SimOS for your architecture hook that address with a SimProcedure that does an exit.

Hi @rhelmot ,

Thanks for your reply.

Could you please elaborate on what a "terminator address" is?

I tried implementing what you said by subclassing SimOS and setting a hook in the constructor, like so:

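from angr import SimProcedure
from angr.simos import SimOS, register_simos

# Arbitrarily chosen address that the lifted exit instruction jumps to
TERMINATOR_ADDR = 0


class ExitSimProcedure(SimProcedure):
    NO_RET = True  # this procedure never returns to its caller

    def run(self, exit_code):
        # Terminate the program; the state gets no successors
        self.exit(exit_code)


class SimOsEbpf(SimOS):
    SYSCALL_TABLE = {}

    def __init__(self, *args, **kwargs):
        super(SimOsEbpf, self).__init__(*args, name="eBPF", **kwargs)
        # Hook the terminator address so that reaching it ends execution
        self.project.hook(TERMINATOR_ADDR, ExitSimProcedure())


# Use SimOsEbpf for binaries whose loader reports this OS name
register_simos('UNIX - System V', SimOsEbpf)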

Does this make sense?

(note: I noticed that I haven't specified where the argument exit_code to run() is taken from -- I will need to specify a calling convention I guess?)

Yes, you understand perfectly. A terminator address is an arbitrarily chosen address (we usually allocate one in proj.loader.extern_object) which is hooked by a simprocedure that produces no successors, like the one you've set up.

I actually just read through the code again and it turns out you don't even have to set a terminator address - you can just jump wherever you want with a jumpkind of Ijk_Exit and the engine will see that as a cue to stop producing successors.
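
To make the Ijk_Exit behaviour described above concrete, here is a toy model in plain Python. It is not angr or pyvex code: `Block`, `State`, `step`, and `run` are hypothetical names invented for illustration. The only thing taken from the comment above is the assumption that a block ending in Ijk_Exit yields no successors, whatever its jump target is.

```python
from dataclasses import dataclass, field

IJK_BORING = "Ijk_Boring"
IJK_EXIT = "Ijk_Exit"


@dataclass
class Block:
    """Toy lifted block: register writes, a jump target, and a jumpkind."""
    set_regs: dict
    target: int
    jumpkind: str = IJK_BORING


@dataclass
class State:
    """Toy execution state: current address plus register values."""
    addr: int
    regs: dict = field(default_factory=dict)


def step(state, blocks):
    """Execute one block; an Ijk_Exit block produces no successors."""
    block = blocks[state.addr]
    state.regs.update(block.set_regs)
    if block.jumpkind == IJK_EXIT:
        return []  # the jump target is ignored, execution stops here
    return [State(block.target, dict(state.regs))]


def run(entry, blocks):
    """Step until nothing is left to run; return the terminated states."""
    active, deadended = [State(entry)], []
    while active:
        state = active.pop()
        successors = step(state, blocks)
        if successors:
            active.extend(successors)
        else:
            deadended.append(state)
    return deadended


# Hypothetical two-block program: set r0 = 7, then exit.
# The exit block's target (0xDEAD) is never visited.
blocks = {
    0x0: Block({"r0": 7}, target=0x8),
    0x8: Block({}, target=0xDEAD, jumpkind=IJK_EXIT),
}
final_states = run(0x0, blocks)
exit_codes = [s.regs["r0"] for s in final_states]
```

In this model no hook is needed at the target address, which is the point of the suggestion. In a real lifter the counterpart would presumably be the self.jump(None, <some constant>, JumpKind.Exit) call from the original question, with any constant as the target.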

Big thanks! I managed to do this. I have some questions:

  1. I currently use an arbitrary address (zero) for the termination simprocedure. However, I imagine that picking an arbitrary number could lead to wrong execution: e.g., an unintended jump in the user's program would result in a normal program termination, where I would expect a warning instead. Do you have any ideas on how to fix this arbitrary address binding?
  2. What's the common way to extract the executed program's exit code?

  1. You can either a) have the termination address configured in angr to be an unused address, or b) use the technique I talked about in my last post and use the Ijk_Exit jumpkind directly from the IR.
  2. Generally, just read the exit code register after you notice the program has terminated. If you want something more generic, you can do something along the lines of SimProcedure.exit(code), which adds a SimEvent to the state's history containing the code.
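
For option a), one way to avoid colliding with the program is to derive the terminator address from the memory ranges that are actually mapped, rather than hard-coding zero. The following is a minimal sketch in plain Python under stated assumptions: `pick_unused_address` and `is_mapped` are hypothetical helpers, not angr APIs, and the `(start, end)` ranges are made-up values standing in for whatever the loader reports.

```python
def pick_unused_address(mapped_ranges, alignment=8, limit=2**64):
    """Return the lowest aligned address at or above the end of every range.

    mapped_ranges is a list of (start, end) pairs, with end exclusive.
    """
    highest_end = max((end for _, end in mapped_ranges), default=0)
    # Round up to the next multiple of `alignment`.
    candidate = (highest_end + alignment - 1) // alignment * alignment
    if candidate >= limit:
        raise ValueError("no free address above the mapped ranges")
    return candidate


def is_mapped(addr, mapped_ranges):
    """Check whether addr falls inside any (start, end) range."""
    return any(start <= addr < end for start, end in mapped_ranges)


# Hypothetical layout: program code at 0x0-0x40, some data at 0x1000-0x1010.
mapped = [(0x0, 0x40), (0x1000, 0x1010)]
terminator_addr = pick_unused_address(mapped)
```

With this layout, address zero belongs to the program itself, so a stray jump to it could be mistaken for a clean exit, while the computed address lies outside every mapped range.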

This issue has been closed due to inactivity.