Remove trapping #BP for handler?
Zero-Tang opened this issue · 2 comments
On the page containing the hook code, you use a CC byte (int3) so that the resulting #BP traps to the VMM. This adds the cost of an unnecessary VM-Exit. Using a jmp instruction as the hook code instead would improve performance by reducing the number of VM-Exits.
On Win64, you may use a push-mov-xchg-ret sequence as the hook code. The machine code looks like the following:
50 48 B8 XX XX XX XX XX XX XX XX 48 87 04 24 C3
These bytes disassemble to:
push rax
mov rax, proxy
xchg [rsp],rax
ret
On Win32, you may use jmp rel32 as hook code (E9 XX XX XX XX).
This hook code may seem long (16 bytes on Win64, 5 bytes on Win32), but that does not matter when hooking functions compiled by Microsoft compilers, because function addresses are normally 16-byte aligned.
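As a rough illustration, the two hook sequences described above could be assembled like this. This is only a sketch: the names build_hook64, build_hook32, HOOK64_SIZE and HOOK32_SIZE are made up for this example and are not taken from NoirVisor or any other project.

```c
#include <stddef.h>
#include <stdint.h>

#define HOOK64_SIZE 16  /* push rax; mov rax, imm64; xchg [rsp], rax; ret */
#define HOOK32_SIZE 5   /* jmp rel32 */

/* Hypothetical helper: emit the 16-byte Win64 hook that transfers to `proxy`. */
static size_t build_hook64(uint8_t out[HOOK64_SIZE], uint64_t proxy)
{
    out[0] = 0x50;                          /* push rax */
    out[1] = 0x48; out[2] = 0xB8;           /* mov rax, imm64 */
    for (int i = 0; i < 8; i++)             /* imm64, little-endian */
        out[3 + i] = (uint8_t)(proxy >> (8 * i));
    out[11] = 0x48; out[12] = 0x87;         /* xchg [rsp], rax */
    out[13] = 0x04; out[14] = 0x24;
    out[15] = 0xC3;                         /* ret */
    return HOOK64_SIZE;
}

/* Hypothetical helper: emit the 5-byte Win32 hook placed at `hook_site`. */
static size_t build_hook32(uint8_t out[HOOK32_SIZE], uint32_t hook_site,
                           uint32_t proxy)
{
    /* rel32 is measured from the end of the jmp instruction. */
    uint32_t rel = proxy - (hook_site + HOOK32_SIZE);
    out[0] = 0xE9;                          /* jmp rel32 */
    for (int i = 0; i < 4; i++)             /* rel32, little-endian */
        out[1 + i] = (uint8_t)(rel >> (8 * i));
    return HOOK32_SIZE;
}
```

The xchg restores the original rax and leaves the proxy address on top of the stack, so the ret jumps to the proxy without clobbering any register.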
This may also apply to EPT-protected stealth hooks on Intel processors. When a read violation occurs, make the page readable but not executable. When an execute violation occurs, make it executable but not readable.
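The permission flip could look roughly like the sketch below. It assumes a leaf EPT entry with the architectural layout (bit 0 read, bit 1 write, bit 2 execute, page frame address from bit 12 up). The stealth_hook structure, the handle_ept_violation function and the idea of swapping between an original page and a hooked page are assumptions made for illustration; they are not copied from any project. Execute-only mappings also require that the processor reports support for them in IA32_VMX_EPT_VPID_CAP.

```c
#include <stdint.h>

#define EPT_READ      (1ULL << 0)
#define EPT_WRITE     (1ULL << 1)
#define EPT_EXEC      (1ULL << 2)
#define EPT_PFN_MASK  0x000FFFFFFFFFF000ULL

/* Hypothetical bookkeeping for one hooked guest page. */
typedef struct {
    uint64_t *pte;         /* leaf EPT entry mapping the hooked guest page */
    uint64_t original_pa;  /* host page with the untouched bytes (shown to readers) */
    uint64_t hooked_pa;    /* host page containing the hook code (executed) */
} stealth_hook;

typedef enum { VIOLATION_READ, VIOLATION_EXECUTE } violation_kind;

/* Flip the mapping according to the kind of EPT violation that occurred. */
static void handle_ept_violation(stealth_hook *h, violation_kind kind)
{
    /* Keep every bit except the access rights and the page frame. */
    uint64_t e = *h->pte & ~(EPT_PFN_MASK | EPT_READ | EPT_WRITE | EPT_EXEC);

    if (kind == VIOLATION_EXECUTE)
        e |= (h->hooked_pa & EPT_PFN_MASK) | EPT_EXEC;    /* executable, unreadable */
    else
        e |= (h->original_pa & EPT_PFN_MASK) | EPT_READ;  /* readable, not executable */

    *h->pte = e;
    /* A real hypervisor would also have to invalidate cached EPT
       translations (INVEPT) before resuming the guest. */
}
```

Write accesses are left out here because the description above only covers read and execute violations.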
Check out my project NoirVisor. I have had this working since January 2019, but only with Intel EPT. I have not ported it to AMD machines yet.
https://github.com/Zero-Tang/NoirVisor/blob/master/src/xpf_core/windows/hooks.c#L136
https://github.com/Zero-Tang/NoirVisor/blob/master/src/svm_core/readme.md#stealth-inline-hook-algorithm
Hi, thanks for the idea.
I consciously decided to use #BP for a few reasons. The primary one is to be able to handle smaller functions (especially on x64). There are not many of them and they may not be interesting after all, but it is "nice" to be able to cover them. The other is that performance is not a focus of the Intel version of this project, DdiMon, and this project was implemented in the same manner as DdiMon where that makes sense.
I agree your idea is better for performance and probably simplicity too, so I might mention that approach and its benefits somewhere.
(I would be interested in how much it speeds up the code. In some use cases, that improvement might be significant.)
Well, my idea is only applicable to functions compiled by MSVC in the normal way. As long as functions are 16-byte aligned, we may assume that every function has at least 16 bytes available before the next one begins. This means we won't cover other functions.
It is true that longer hook code increases the risk, since certain instructions use RIP-relative encoding. The longer the hook code, the more likely it is to overwrite such instructions. Relocating them requires decoding each instruction and fixing up its displacement.
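To make the fix-up concrete, here is a minimal sketch of the two pieces involved once an instruction has already been decoded: recognising RIP-relative addressing from the ModRM byte (mod = 00, r/m = 101 in 64-bit mode) and recomputing the 32-bit displacement for the relocated copy. The function names are hypothetical, and finding the ModRM byte and the instruction length still requires a real length disassembler, which is not shown.

```c
#include <stdbool.h>
#include <stdint.h>

/* In 64-bit mode, ModRM with mod == 00 and r/m == 101 selects [rip + disp32]. */
static bool modrm_is_rip_relative(uint8_t modrm)
{
    return (modrm & 0xC7) == 0x05;
}

/* Recompute disp32 for an instruction moved to a new location.
 * old_next_ip / new_next_ip are the addresses of the byte following the
 * instruction at its original and relocated positions respectively.
 * Returns false if the target is out of disp32 range from the new spot. */
static bool relocate_disp32(uint64_t old_next_ip, uint64_t new_next_ip,
                            int32_t old_disp, int32_t *new_disp)
{
    uint64_t target = old_next_ip + (uint64_t)(int64_t)old_disp;
    int64_t delta = (int64_t)(target - new_next_ip);

    if (delta < INT32_MIN || delta > INT32_MAX)
        return false;
    *new_disp = (int32_t)delta;
    return true;
}
```

If relocate_disp32 fails, the relocated copy is too far from the original target and the instruction would have to be rewritten in another form, which is one reason a shorter hook is less risky.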