fox-it/mkYARA

Possible bug for escaping instructions starting with zero byte ("00", add X, X)

danielplohmann opened this issue · 0 comments

Hi!

I was working on my disassembler's instruction wildcarding and also had a look at how you do it in mkYARA.
While revisiting Capstone's internals, I noticed that you use instruction.opcode to determine the number of opcode bytes.
Please note that there are at least two cases where this may yield too few opcode bytes:

  1. Instructions starting with "00" (add ...)
  2. Instructions starting with "0F00" (sldt/lldt/ltr/str/verr/verw ...)

Iterating over Capstone's instruction.opcode gives you a 0 for these bytes (position 0 in the first case, position 1 in the second), even though each of them is an opcode byte.
Since these are very rare instructions, the impact is IMHO negligible, but I thought you might be interested to know about it. :)
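
To make the undercount concrete, here is a minimal sketch. It does not call Capstone; the lists are hand-written stand-ins for the zero-padded 4-byte opcode array that Capstone exposes for x86, and `count_nonzero_fields` is a hypothetical name for the "skip zero entries" approach described above, not mkYARA's actual function.

```python
# Hand-written stand-ins for Capstone's zero-padded 4-byte x86 opcode array
# (assumption: used here so the example runs without Capstone installed).

def count_nonzero_fields(opcode_fields):
    """Count opcode bytes by skipping zero entries (hypothetical helper)."""
    return sum(1 for field in opcode_fields if field != 0)

# (description, stand-in opcode array, actual number of opcode bytes)
CASES = [
    ("add byte ptr [eax], al  (00 00)", [0x00, 0x00, 0x00, 0x00], 1),
    ("sldt word ptr [eax]     (0f 00 00)", [0x0F, 0x00, 0x00, 0x00], 2),
    ("mov eax, ebx            (89 d8)", [0x89, 0x00, 0x00, 0x00], 1),
]

for description, fields, actual in CASES:
    counted = count_nonzero_fields(fields)
    status = "ok" if counted == actual else "too few"
    print(f"{description}: counted={counted}, actual={actual} -> {status}")
```

The first two cases come out one byte short, because a genuine 0x00 opcode byte cannot be told apart from padding; the third case is counted correctly.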

Here's how I decided to handle these special cases now (64-bit aware, in case we have a REX prefix):

        opcode_length = 0
        if cap_ins.rex:
            # we need to add one, because we are apparently in 64bit mode and have a REX prefix
            opcode_length += 1
        if (cap_ins.rex and cleaned[2:].startswith("00")) or cleaned.startswith("00"):
            # this can only be ADD PTR, REG with exactly one opcode byte
            opcode_length += 1
        elif (cap_ins.rex and cleaned[2:].startswith("0f00")) or cleaned.startswith("0f00"):
            # this can only be *LDT/*TR/VER* with exactly two opcode bytes
            opcode_length += 2
        else:
            for field in cap_ins.opcode:
                if field != 0:
                    opcode_length += 1

https://github.com/danielplohmann/smda/blob/4d2f5e4f47436ff2383347d9b303ec189136b3b8/smda/intel/IntelInstructionEscaper.py#L269-L282
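
For experimenting without Capstone, the same logic can be wrapped in a standalone function. This is only a sketch under assumptions: `opcode_length`, `has_rex` and `opcode_fields` are hypothetical names standing in for `cap_ins.rex` and `cap_ins.opcode`, and `cleaned` is assumed to be the lowercase hex string of the instruction bytes starting at the REX prefix (if present) or at the first opcode byte.

```python
def opcode_length(cleaned, opcode_fields, has_rex=False):
    """Return the number of REX + opcode bytes (hypothetical standalone version).

    cleaned       -- lowercase hex string of the instruction bytes (assumption)
    opcode_fields -- stand-in for Capstone's zero-padded opcode array
    has_rex       -- stand-in for a non-zero cap_ins.rex
    """
    length = 0
    body = cleaned
    if has_rex:
        # The REX prefix is counted and skipped before inspecting the opcode.
        length += 1
        body = cleaned[2:]
    if body.startswith("00"):
        # ADD with a single 0x00 opcode byte.
        length += 1
    elif body.startswith("0f00"):
        # SLDT/STR/LLDT/LTR/VERR/VERW with the two opcode bytes 0F 00.
        length += 2
    else:
        # General case: non-zero entries of the opcode array.
        length += sum(1 for field in opcode_fields if field != 0)
    return length


print(opcode_length("0000", [0x00, 0x00, 0x00, 0x00]))                  # add byte ptr [eax], al
print(opcode_length("0f0000", [0x0F, 0x00, 0x00, 0x00]))                # sldt word ptr [eax]
print(opcode_length("480008", [0x00, 0x00, 0x00, 0x00], has_rex=True))  # REX.W + add
print(opcode_length("0fafc3", [0x0F, 0xAF, 0x00, 0x00]))                # imul eax, ebx
```

Stripping the REX byte once up front avoids repeating the `cleaned[2:]` check in both branches; the results should match the snippet above for the inputs it was designed for.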