angr/pyvex

Need guidance for making some RISCV ISA fixes

eboling opened this issue · 1 comments

Question

We've been working with ANGR on some RISCV code, and have been fixing a few issues with the ANGR support. We added support for MRET, WFI and some CSR instructions. One of the fixes we need some guidance on in order to submit a PR with our changes. RISCV CSR instruction support is currently limited in the VEX lifter to a small number of CSRs related to floating point status/mode CSRs. If the lifter encounters an instruction using another CSR (e.g. mepc), it will fail the decode of the instruction. There are a couple of use cases that are affected by this. If you just want to get a basic CFG, then you don't need a full throated implementation of CSR emulation in VEX, and we used that to hack in a decode that reused an existing register. That won't work for any symbolic execution support, so would not be acceptable as a fix here.

One option is to defined one additional CPU status register, and map all other CSR accesses to that one. Another option is to define individual registers for all the currently defined RISCV CSRs, and support reads and writes to them but do nothing else. This would probably still not really support anything useful with respect to symbolic execution, but it would provide a little more support for someone looking to extend the code later. We've not worked enough with angr to understand the the scope of supporting emulation over the full RISCV CSR set. However, one of these two options (map all to one new register, or create a bunch of new registers) would be good enough to allow better static analysis of RISCV applications that use these instructions, so would be beneficial. If we implement one of these options, is that something that would be an acceptable approach, and would be supported for the purposes of submitting a PR for upstreaming the changes?

If possible we'd like to get this item resolved so that we can complete the changes on our end, and submit a PR sometime in the next two weeks.

A good approach would be using a dirty helper to read and write an arbitrary CSR file. I'm not super familiar with RISC-V, but assuming these are something like x86 MSRs, that would be acceptable.