Instruction translating for RIP-relative addresses.
ethanporcaro opened this issue · 11 comments
I think functions with RIP-relative instructions at the start are common enough to justify such a feature. I did a bunch of research, and it looks like PolyHook was able to do it by translating the instructions into equivalent ones with absolute addressing.
stevemk14ebr/PolyHook_2_0#119
I tried making some changes in the inline hook creator, specifically ff_hook, but I can't quite figure it out. I've gotten fairly familiar with the library but this assembly stuff is still a challenge I'm trying to learn from and get through. Is this something that could be implemented?
Thanks.
I think functions with RIP-relative instructions at the start are common enough to justify such a feature.
The issue isn't as common as this statement makes it out to be. SafetyHook will already relocate RIP relative instructions as long as it can allocate space for a trampoline nearby (+-2gb). This is usually the case. So in reality the issue is functions that have RIP relative instructions at the start and we can't find space to allocate a trampoline nearby.
I did a bunch of research, and it looks like PolyHook was able to do it by translating the instructions into equivalent ones with absolute addressing.
stevemk14ebr/PolyHook_2_0#119
It's funny because I originally started SafetyHook as just a simpler alternative to the library you mentioned. It's impressive that they support translation of RIP relative instructions, but they lean on asmtk
to do so. I do not want SafetyHook to have any additional dependencies so if any work is done towards this goal it should not introduce an additional dependency.
Is this something that could be implemented?
I think it could be. The tricky part is to keep it simple while not introducing any additional dependencies. Zydis, the library SafetyHook uses for instruction decoding also has the ability to encode instructions. This could probably do a lot of the heavy lifting, but isn't something I've really looked into yet.
Do you have a concrete example of where SafetyHook is failing and where mid-hooking the function in question is not an option?
Yes, assemblers can widen some subsets of instructions to larger versions. Though some instructions will require some variant of:
push reg
movabs reg, address
cmp reg2, [reg]
pop reg
instead of:
cmp reg2, [rip+12345]
I believe Zydis has an instruction encoder which may be of use here.
Do you have a concrete example of where SafetyHook is failing and where mid-hooking the function in question is not an option?
I tried with a mid hook just now, it seems to work most of the time. I say most because I still run into #52 randomly both with freezing and thread trapping. I think it might be a problem with my application. I'll keep looking to see if I can get mid-hooks to be stable.
I've done some initial testing for re-encoding IP relative instructions to be absolute that seems promising. Expect a PR soon for testing.
I personally haven't run into this issue yet, but I looked at zydis and you can only assemble 5 instructions max which seems quite limiting.
I personally haven't run into this issue yet, but I looked at zydis and you can only assemble 5 instructions max which seems quite limiting.
That's the number of operands in a single instruction, not the number of instructions.
We could encode and extend these rip relative instructions with ease, I'm fairly certain. You can make multiple encoding requests: https://github.com/zyantific/zydis/blob/fd3e9a6cc8bdcc617b531feda186699e51664f76/examples/EncodeMov.c#L42-L57
Oh right, never mind then. This seems more clunky than asmjit but I suppose it would work.
Oh right, never mind then. This seems more clunky than asmjit but I suppose it would work.
It's different, but introducing more dependencies is definitely something the library wants to avoid. I guess we're lucky Zydis has encoding now.
Anyone have any ideas on what is causing this or how it might be fixed?
There may be a bug in the Linux version of vm_query
.
Lines 92 to 163 in b046e12
Linux support was added recently and is less thoroughly tested than the other supported platforms, especially 32-bit Linux. A minimal stand-alone example that reproduces the bug would be helpful.
It's likely unrelated to this issue since you mentioned this is a 32-bit problem.
well I looked into vm_query as you suggested and it seems like it is returning nullopt unless I do this
info = VmBasicInfo{};
I don't know if this is caused by another contributor setting the std level to c++17 or if it was always broken, I'm testing it now.
Also maybe it might be good to abort the hook if vm_protect fails like this in the trap_threads function. If it did so then I wouldn't have spent so long debugging this.
Sorry for going off topic on this thread, I'll delete this stuff once I figure it out.
Update
Yeah it appears to be caused by setting std level to C++17.
I believe primary support was for modern C++(23), anything else hasn't been tested much by anyone. There are a lot of differences as you go down.
well I looked into vm_query as you suggested and it seems like it is returning nullopt unless I do this
info = VmBasicInfo{};
I don't know if this is caused by another contributor setting the std level to c++17 or if it was always broken, I'm testing it now.
Also maybe it might be good to abort the hook if vm_protect fails like this in the trap_threads function. If it did so then I wouldn't have spent so long debugging this.
Sorry for going off topic on this thread, I'll delete this stuff once I figure it out.
Update
Yeah it appears to be caused by setting std level to C++17.I believe primary support was for modern C++(23), anything else hasn't been tested much by anyone. There are a lot of differences as you go down.
yeah It wasn't my idea to downport it that much, I wanted to keep the library as is so we could keep it updated with the upstream with minimal hassle but not much I can do since it's not my project.