ionescu007/SimpleVisor

Enabling interrupts in VMEXIT?

Nou4r opened this issue · 2 comments

Nou4r commented

I'm trying to implement a vmcall that reads memory from another process, but I get a BSOD with DRIVER_IRQL_NOT_LESS_OR_EQUAL.

Arg1: 0000023170e8e050, memory referenced
Arg2: 00000000000000ff, IRQL
Arg3: 0000000000000000, value 0 = read operation, 1 = write operation
Arg4: fffff800f1ef1773, address which referenced memory

The bugcheck reports the IRQL as 0xFF, but when I check with KeGetCurrentIrql() it returns 0 (PASSIVE_LEVEL).

The vmcall is made from the usermode app -> causes vmexit -> which executes vmcall handler.

I store the call index (VMCallFuncIndex) in RCX, a usermode pointer to a structure describing the memory I/O request in RDX, and the current process ID in R8 (GetCurrentProcessId() for now, for testing):

	case vmcall_read_memory:
	{
		/*
		KIRQL irql_lvl = KeGetCurrentIrql();
		DbgPrint("IRQL_LVL = %d", (ULONG)irql_lvl); //PASSIVE_LEVEL
		*/
		DbgPrint("Attaching to PID: %d\r\n", VpState->VpRegs->R8);
		PEPROCESS local, remote;
		if (!NT_SUCCESS(PsLookupProcessByProcessId((HANDLE)VpState->VpRegs->R8, &local)))
			local = NULL;
		if (local) {
			sIOReq req;
			KAPC_STATE apc_state;
			KeStackAttachProcess(local, &apc_state);
			RtlCopyMemory(&req, (void*)VpState->VpRegs->Rdx, sizeof(sIOReq));
			KeUnstackDetachProcess(&apc_state);
			VpState->VpRegs->Rax = 0;
			DbgPrint("target pid == %d\r\n", req.remote_pid);
			DbgPrint("success\r\n");
		}
		else
			DbgPrint("local == nullptr\r\n");

The code looks correct to me, so I'm not sure what is wrong.
(1 hour later)
So I opened the crash dump in WinDbg, and the first thing I noticed was:
FAILURE_ID_HASH_STRING: km:disabled_interrupt_fault_stackptr_error_hypervisor!vmxhandlevmcall
Which makes me wonder: are interrupts disabled?
So I searched the existing SimpleVisor issues and found this:
#3

So I decided to try it myself:

switch (VMCallFuncIndex) {
	case vmcall_read_memory:
	{
		/*
		KIRQL irql_lvl = KeGetCurrentIrql();
		DbgPrint("IRQL_LVL = %d", (ULONG)irql_lvl); //PASSIVE_LEVEL
		*/
		KIRQL old_irql = KeRaiseIrqlToDpcLevel();
		_enable();
		DbgPrint("Attaching to PID: %d\r\n", VpState->VpRegs->R8);
		PEPROCESS local, remote;
		if (!NT_SUCCESS(PsLookupProcessByProcessId((HANDLE)VpState->VpRegs->R8, &local)))
			local = NULL;
		if (local) {
			sIOReq req;
			KAPC_STATE apc_state;
			KeStackAttachProcess(local, &apc_state);
			RtlCopyMemory(&req, (void*)VpState->VpRegs->Rdx, sizeof(sIOReq));
			KeUnstackDetachProcess(&apc_state);
			VpState->VpRegs->Rax = 0;
			DbgPrint("target pid == %d\r\n", req.remote_pid);
			DbgPrint("success\r\n");
		}
		else
			DbgPrint("local == nullptr\r\n");
		_disable();
		KeLowerIrql(old_irql);
		return;

But I still get a BSOD. It's the same DRIVER_IRQL_NOT_LESS_OR_EQUAL, but this time the IRQL is reported as 0x2.

Arg1: 0000026f1c1ee0c0, memory referenced
Arg2: 0000000000000002, IRQL
Arg3: 0000000000000000, value 0 = read operation, 1 = write operation
Arg4: fffff800a6511783, address which referenced memory

It shows the faulting IP as being:

hypervisor!VmxHandleVMCall+a3 [c:\users\yuuar\source\repos\vt-x\hypervisor\source.c @ 519]
fffff800`a6511783 0f1001          movups  xmm0,xmmword ptr [rcx]

which seems to be this line:

RtlCopyMemory(&req, (void*)VpState->VpRegs->Rdx, sizeof(sIOReq));

So I'm not sure what's going on.

rianquinn commented

It sounds like you are corrupting the register state when making the vmcall. I know that SimpleVisor doesn't use a lot of assembly to handle entry into the VMM, so I wonder if you are assuming the ABI is being respected here.

Nou4r commented

Actually, it was due to tandasat/HyperPlatform#3 (comment)
Let me quote that in case anyone has the same issue:

Hi Satoshi,

Wow -- I cannot believe you were crazy enough to try page-in from VMM
context! Let me explain to you why VMM context == HIGH_LEVEL :-)

  1. Is it safe for you to be context switched by the OS while in the middle
    of VMM mode? Of course not... So you are at least at DISPATCH_LEVEL. Is it
    safe for you to "wait" on an object while at VMM mode? Of course not -- you
    would be context switched to another thread/idle thread which would now be
    running as VMM Host!!!
  2. Is it safe/OK for you to receive DPCs while in the middle of VMM mode?
    Again, of course not. Another reason why you are at least at
    DISPATCH_LEVEL. Could you receive a DPC, even if you wanted to? Nope --
    receiving a DPC requires an interrupt, and IF is off, so Local APIC will
    never deliver it
  3. Will you receive any Device Interrupts? Nope, because EFLAGS IF is off.
    Would you want to be interrupted in the middle of VMM mode? Also nope. So
    you are at least at MAX_DIRQL.
  4. Will you receive the clock interrupt? Nope (also why you hit a CLOCK
    WATCHDOG BSOD sometimes)... So you are at least at CLOCK_LEVEL.
  5. Will you receive IPIs? Nope, because IF is off, so LAPIC will never send
    them. You also probably don't want to be running IPI while inside VMM
    host... So you are at least at IPI_LEVEL.
  6. Technically because you are not in the middle of handling an IPI, but
    rather you've disabled interrupts completely, you are at IPI_LEVEL + 1, aka
    HIGH_LEVEL.

In other words, if you call, for example, ExAllocatePoolWithTag, and this
is PAGED POOL, you can get unlucky and this will require page-in which
requires blocking your thread, and now, some other thread will run in VMM
host mode... Sure, you can get lucky and control will come back to you, but
this is insane... If you request NON PAGED POOL, it will "appear to
work"... And then in one situation, a TLB flush will be required, which
sends an IPI... Which can't be delivered... And so it will hang. Etc.,
etc...

Hope this makes sense.

Best regards,
Alex Ionescu

On Mon, Jul 4, 2016 at 9:21 PM, Satoshi Tanda notifications@github.com
wrote:

Thank you for the note, Alex.
I do not think I understand why you cannot call API when interrupts are
disabled. The eflags.IF is cleared when VM-exit happened but the IF only
affects hardware interrupt, and exceptions can still occur. I tested that
even page-in was processed fine in the VMM-context if IRQL is
PASSIVE_LEVEL. To my knowledge, the IRQL requirement mostly stems from if
page-in can be processed--in other words, if the process can enter wait
state--, and interrupts are irrelevant.
I bet that I am missing something and appreciate if you could explain a
bit more about why disabling interrupts is technically the same as being
IRQL==HIGH_LEVEL.
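
To tie the quoted explanation back to the two crashes above, here is a small user-mode model in C. It is only a sketch of my understanding, not kernel code: `cpu_state`, `software_irql`, `effective_irql` and `can_touch_pageable` are names made up for illustration, and the rule that a cleared interrupt flag counts as HIGH_LEVEL is taken from Alex's explanation. The IRQL constants are the x64 values from wdm.h.

```c
#include <assert.h>
#include <stdbool.h>

/* x64 IRQL values as defined in wdm.h. */
enum {
    PASSIVE_LEVEL  = 0,
    APC_LEVEL      = 1,
    DISPATCH_LEVEL = 2,
    CLOCK_LEVEL    = 13,
    IPI_LEVEL      = 14,
    HIGH_LEVEL     = 15
};

/* Hypothetical model of the two pieces of CPU state that matter here. */
typedef struct {
    int  cr8;                /* what KeGetCurrentIrql() reads on x64      */
    bool interrupts_enabled; /* RFLAGS.IF; cleared by the CPU on VM exit  */
} cpu_state;

/* What KeGetCurrentIrql() reports: CR8 only, it never looks at RFLAGS.IF. */
int software_irql(const cpu_state *cpu)
{
    return cpu->cr8;
}

/* What the code can actually tolerate: with IF clear, nothing can be
 * delivered, so treat the CPU as if it were at HIGH_LEVEL. */
int effective_irql(const cpu_state *cpu)
{
    return cpu->interrupts_enabled ? cpu->cr8 : HIGH_LEVEL;
}

/* Pageable memory (such as a usermode buffer) may only be touched
 * below DISPATCH_LEVEL, because a page fault may need to wait. */
bool can_touch_pageable(const cpu_state *cpu)
{
    return effective_irql(cpu) < DISPATCH_LEVEL;
}
```

In this model the first attempt is `{ PASSIVE_LEVEL, false }`: `software_irql` says 0, but `can_touch_pageable` is false. The second attempt (KeRaiseIrqlToDpcLevel() followed by _enable()) is `{ DISPATCH_LEVEL, true }`, which is still too high to touch the usermode buffer in RDX.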

So here is what I did to solve it:

I queue the requests into a static array of five "IO_REQUEST" records. The vmcall handler iterates the array looking for a record whose state is 0 (not taken) and claims it by setting the state to 1 (InterlockedCompareExchange(&IO_REQUEST[i].state, 1, 0) == 0). It then fills the record in and sets the state to 2 (using InterlockedExchange, of course). It sets guest RAX = 1 if it found a free record to fill in, or RAX = 0 if it did not. Later on, my system thread (created during DriverEntry) processes every IO_REQUEST whose state is 2 and sets the state back to 0 once it has finished.
This of course means it's limited to 5 requests at any time, but I plan to have only a single-threaded usermode application make these requests through vmcall, so in my use case it's a non-issue. It might not be the best solution, but hey, it works.
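
For anyone who wants to see the slot state machine spelled out, here is a minimal sketch in portable C11. It is not my driver code: it uses `<stdatomic.h>` in place of the Interlocked* routines so it can be compiled in user mode, and `io_request`, its payload fields, `queue_request` and `drain_requests` are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SLOT_COUNT 5

/* Slot states: 0 = free, 1 = being filled in, 2 = ready for the worker. */
enum { SLOT_FREE = 0, SLOT_FILLING = 1, SLOT_READY = 2 };

typedef struct {
    atomic_int state;
    uint64_t   pid;      /* hypothetical payload: raw register values only */
    uint64_t   user_ptr;
} io_request;

/* Static storage is zero-initialized, so every slot starts as SLOT_FREE. */
static io_request g_requests[SLOT_COUNT];

/* Producer side (the vmcall handler). It never blocks or allocates.
 * The return value is what would be written to the guest's RAX. */
bool queue_request(uint64_t pid, uint64_t user_ptr)
{
    for (int i = 0; i < SLOT_COUNT; i++) {
        int expected = SLOT_FREE;
        /* Claim the slot only if it is still free (0 -> 1). */
        if (atomic_compare_exchange_strong(&g_requests[i].state,
                                           &expected, SLOT_FILLING)) {
            g_requests[i].pid      = pid;
            g_requests[i].user_ptr = user_ptr;
            /* Publish the filled record (1 -> 2). */
            atomic_store(&g_requests[i].state, SLOT_READY);
            return true;
        }
    }
    return false; /* all slots are taken */
}

/* Consumer side (the system thread). Handles every ready slot, then
 * releases it (2 -> 0). Returns the number of requests handled. */
int drain_requests(void (*handle)(const io_request *))
{
    int handled = 0;
    for (int i = 0; i < SLOT_COUNT; i++) {
        if (atomic_load(&g_requests[i].state) == SLOT_READY) {
            if (handle != NULL)
                handle(&g_requests[i]);
            atomic_store(&g_requests[i].state, SLOT_FREE);
            handled++;
        }
    }
    return handled;
}
```

The point of the design is that the vmcall handler only touches non-paged memory and never waits, while everything that can fault or block (attaching to the process, reading the usermode buffer) is left to the worker thread running in normal guest context.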

(P.S. @rianquinn, I'm not sure what you're saying. It's late here, so I'll re-read your post tomorrow to see if I can wrap my head around it, but it seems like you're thinking of something else.)
I've borrowed from SimpleVisor quite heavily; in fact, I have not made a single change to shvvmxhvx64.asm.