Microsoft XFG

eXtended Flow Guard Under The Microscope

Microsoft seems to be continuously expanding and evolving its set of security mitigations designed and implemented for Windows 10. In this blog post, we’ll examine an upcoming security feature called eXtended Flow Guard (XFG).

XFG has not yet been released, and will not be part of the upcoming 21H1 version of Windows 10. It is, however, present in the Dev Channel of the Insider Preview. At this time, the only public mention of XFG by Microsoft was in 2019 at Bluehat Shanghai.

Although XFG has not been released yet, it is nevertheless possible to compile applications with XFG using Visual Studio 2019 Preview while targeting the Insider Preview version of Windows 10. A few blog posts on how XFG works have been released, but these mainly consider how XFG can be compiled into a custom application rather than examining common exploitation scenarios in depth.

The aim of this blog post is to cast more light on whether XFG is truly a more secure and hardened version of Control Flow Guard (CFG). We’ll get started with a short recap on how both CFG and XFG work.

Setting The Baseline

CFG was introduced with Windows 10 in 2015 and has undergone several modifications to mitigate vulnerabilities in its implementation. In essence, CFG is a coarse-grained Control Flow Integrity (CFI) solution: it maintains a bitmap that marks which addresses are valid call targets and, when an indirect call is made, checks whether the function being called is marked as valid.

Microsoft has publicly acknowledged that one of the shortcomings of CFG is return address overwrites. This issue will be addressed by Intel CET and the Shadow Stack. XFG has a very similar design to CFG, and thus must also rely on Shadow Stack to protect against return address overwrites.

Since we’re focusing on XFG and not separate security mitigations such as Intel CET and the Shadow Stack, we’ll explore vtable overwrites with valid call targets.

When creating a browser exploit, a common method for obtaining control of the instruction pointer is to overwrite an entry in the object's vtable and invoke the associated method. CFG was actually introduced to mitigate this exact type of exploitation scenario.

Since CFG is a coarse-grained CFI solution, a vtable entry can be replaced with a different function pointer, as long as it's a valid call target. In other words, CFG does not consider the call site, only the call target.
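A toy model may make this limitation concrete. The sketch below is not how CFG is implemented; the addresses and function labels are hypothetical. It only shows that a check keyed on the target alone accepts any registered function, no matter which call site invokes it.

```python
# Toy model of a coarse-grained CFI check (illustrative only, not CFG's real logic).
# All addresses and names below are hypothetical.

VALID_CALL_TARGETS = {
    0x1000: "Shape::draw",
    0x2000: "File::remove",
}

def coarse_check(target: int) -> bool:
    """Accept the call if the target is registered, regardless of who is calling."""
    return target in VALID_CALL_TARGETS

# The call site expects Shape::draw in vtable slot 0.
vtable = [0x1000]
original_ok = coarse_check(vtable[0])

# An attacker swaps the slot for another registered function.
vtable[0] = 0x2000
swapped_ok = coarse_check(vtable[0])   # still accepted: the call site is never consulted

# An address that was never registered is rejected.
arbitrary_ok = coarse_check(0x3000)

print(original_ok, swapped_ok, arbitrary_ok)
```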

To understand this a little better, let’s examine the API NtCreateFile in ntdll.dll and how it is checked by CFG. A CFG check is performed by the function LdrpDispatchUserCallTargetES, which expects to find the address of NtCreateFile in the RAX register.

To simulate this, we can modify RAX and RIP in WinDBG and step through the first part of LdrpDispatchUserCallTargetES:

0:019> r rip = ntdll!LdrpDispatchUserCallTargetES
0:019> r rax = ntdll!NtCreateFile
ntdll!LdrpDispatchUserCallTargetES:
00007ffb`27dd11d0 4c8b1dd1910f00 mov r11,qword ptr [ntdll!LdrSystemDllInitBlock+0xb8 (00007ffb`27eca3a8)] ds:00007ffb`27eca3a8=00007df5f77d0000
0:019> p
ntdll!LdrpDispatchUserCallTargetES+0x7:
00007ffb`27dd11d7 4c8bd0 mov r10,rax
0:019>
ntdll!LdrpDispatchUserCallTargetES+0xa:
00007ffb`27dd11da 49c1ea09 shr r10,9
0:019>
ntdll!LdrpDispatchUserCallTargetES+0xe:
00007ffb`27dd11de 4f8b1cd3 mov r11,qword ptr [r11+r10*8] ds:00007ff5`e41c7868=1111144444444444

Listing – CFG locating bitmap value

LdrpDispatchUserCallTargetES shifts the function address in RAX right by 9 bits, uses the result as an index into the CFG bitmap, and fetches a 64-bit value.
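The address arithmetic can be reproduced outside the debugger. The sketch below copies the register and memory values from the listing above; the variable names are ours. Because the address is shifted right by 9, each 64-bit bitmap entry covers 512 bytes of address space.

```python
# Reproduce the bitmap lookup address from the WinDbg listing.
# Values are copied from the listing; variable names are ours.

MASK64 = (1 << 64) - 1

rax = 0x00007FFB27DE1AB0          # ntdll!NtCreateFile
bitmap_base = 0x00007DF5F77D0000  # qword stored at LdrSystemDllInitBlock+0xb8

r10 = rax >> 9                                     # shr r10,9
entry_address = (bitmap_base + r10 * 8) & MASK64   # mov r11,[r11+r10*8]

print(hex(entry_address))  # matches the ds: address shown in the listing
```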

Next, a bit test is performed to check if the supplied function address is a valid call target. This is done by using the function address again as an index:

ntdll!LdrpDispatchUserCallTargetES+0x12:
00007ffb`27dd11e2 4c8bd0 mov r10,rax
0:019>
ntdll!LdrpDispatchUserCallTargetES+0x15:
00007ffb`27dd11e5 49c1ea03 shr r10,3
0:019>
ntdll!LdrpDispatchUserCallTargetES+0x19:
00007ffb`27dd11e9 a80f test al,0Fh
0:019>
ntdll!LdrpDispatchUserCallTargetES+0x1b:
00007ffb`27dd11eb 7509 jne ntdll!LdrpDispatchUserCallTargetES+0x26 (00007ffb`27dd11f6) [br=0]
0:019>
ntdll!LdrpDispatchUserCallTargetES+0x1d:
00007ffb`27dd11ed 4d0fa3d3 bt r11,r10
0:019>
ntdll!LdrpDispatchUserCallTargetES+0x21:
00007ffb`27dd11f1 731b jae ntdll!LdrpDispatchUserCallTargetES+0x3e (00007ffb`27dd120e) [br=0]
0:019>
ntdll!LdrpDispatchUserCallTargetES+0x23:
00007ffb`27dd11f3 48ffe0 jmp rax {ntdll!NtCreateFile (00007ffb`27de1ab0)}

Listing – Validating the call target

In this case, we found that NtCreateFile is a valid call target, and LdrpDispatchUserCallTargetES dispatches execution to it through a JMP instruction.
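The bit test itself is easy to check by hand. The following sketch uses the bitmap entry and target address from the listings and models only the 16-byte-aligned path that we stepped through; the unaligned branch is not shown above and is left out here.

```python
# Model of the aligned-path bit test, using values from the listings.

rax = 0x00007FFB27DE1AB0   # ntdll!NtCreateFile
r11 = 0x1111144444444444   # 64-bit bitmap entry fetched earlier

aligned = (rax & 0x0F) == 0        # test al,0Fh ; jne is not taken when aligned
bit_index = (rax >> 3) & 0x3F      # bt r11,r10 uses the bit offset modulo 64
is_valid = bool((r11 >> bit_index) & 1)

print(aligned, bit_index, is_valid)
```

With these values the target is aligned, the selected bit is set, and the JAE branch is therefore not taken.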

We can sometimes leverage this when developing exploits by overwriting a vtable entry with the address of a function that is a valid call target. For this to work properly, we would also need to be able to control the arguments for the function, which is out of scope of this blog post.

In summary, CFG is said to be “coarse grained” because it only takes the call target into account, not the call site. This makes it more susceptible to bypasses.

Understanding The Hash

With XFG, Microsoft is attempting to develop and implement a more fine-grained CFI solution, according to the talk at Bluehat Shanghai. XFG is meant to take into account both the call target and the call site.

The general concept is that for each XFG-protected indirect call, the compiler generates a 55-bit hash based on the function name, number of arguments, the type of arguments, and the return type. This hash is embedded in the code just prior to the call into XFG.

Below is a code snippet taken from Chakra.dll which makes use of XFG in the insider preview.

.text:000000000002E8CC mov rcx, [rbx+18h]
.text:000000000002E8D0 and [rsp+48h+var_18], 0
.text:000000000002E8D5 mov rax, [rcx]
.text:000000000002E8D8 mov r10, 0F8D8BEB272D33870h
.text:000000000002E8E2 mov rdx, [rsp+48h+arg_8]
.text:000000000002E8E7 lea r9, [rsp+48h+var_18]
.text:000000000002E8EC mov rax, [rax+18h]
.text:000000000002E8F0 mov r8d, 4
.text:000000000002E8F6 call cs:__guard_xfg_dispatch_ical

Listing – Hash is placed into R10

Just as with CFG, the function address used for the call target is placed in RAX.

The XFG function is called LdrpDispatchUserCallTargetXFG, and we can similarly demonstrate how it works by manually setting RIP to LdrpDispatchUserCallTargetXFG and RAX to NtCreateFile.

0:019> r rip = ntdll!LdrpDispatchUserCallTargetXFG
0:019> r rax = ntdll!NtCreateFile
0:019> p
ntdll!LdrpDispatchUserCallTargetXFG+0x4:
00007ffb`27dd1234 a80f test al,0Fh
0:019>
ntdll!LdrpDispatchUserCallTargetXFG+0x6:
00007ffb`27dd1236 750f jne ntdll!LdrpDispatchUserCallTargetXFG+0x17 (00007ffb`27dd1247) [br=0]
0:019>
ntdll!LdrpDispatchUserCallTargetXFG+0x8:
00007ffb`27dd1238 66a9ff0f test ax,0FFFh
0:019>
ntdll!LdrpDispatchUserCallTargetXFG+0xc:
00007ffb`27dd123c 7409 je ntdll!LdrpDispatchUserCallTargetXFG+0x17 (00007ffb`27dd1247) [br=0]
0:019>
ntdll!LdrpDispatchUserCallTargetXFG+0xe:
00007ffb`27dd123e 4c3b50f8 cmp r10,qword ptr [rax-8] ds:00007ffb`27de1aa8=0000000000841f0f

Listing – Call site is verified

In the last instruction shown above, the hash in R10 is compared to the value 8 bytes prior to the call target. The compiler will insert the generated hash just prior to each function.

Since the call site moves its hash value into R10 prior to invoking LdrpDispatchUserCallTargetXFG, the two values must match in order to allow execution through this path. If the comparison is successful, execution is dispatched through a JMP instruction.
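The hash path stepped through above can be summarized as a small model. This is a sketch under assumptions: the function name xfg_hash_path and the dictionary standing in for process memory are ours, the second target address is hypothetical, and we assume a failed comparison ends up in the same fallback code as the two earlier branches.

```python
# Sketch of the XFG hash path from the listing (names and the second target are ours).

def xfg_hash_path(target: int, call_site_hash: int, memory: dict) -> bool:
    """Return True if the hash path would dispatch, False if it falls through."""
    if target & 0x0F:              # test al,0Fh ; jne +0x17
        return False
    if (target & 0x0FFF) == 0:     # test ax,0FFFh ; je +0x17
        return False
    return memory.get(target - 8) == call_site_hash   # cmp r10,[rax-8]

NT_CREATE_FILE = 0x00007FFB27DE1AB0
CHAKRA_HASH = 0xF8D8BEB272D33870       # hash seen at the Chakra.dll call site
HYPOTHETICAL_TARGET = 0x00007FFB00001230

memory = {
    NT_CREATE_FILE - 8: 0x0000000000841F0F,   # value shown in the listing
    HYPOTHETICAL_TARGET - 8: CHAKRA_HASH,     # hypothetical function compiled with a matching hash
}

mismatch = xfg_hash_path(NT_CREATE_FILE, CHAKRA_HASH, memory)
match = xfg_hash_path(HYPOTHETICAL_TARGET, CHAKRA_HASH, memory)
print(mismatch, match)
```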

This use of compile-time hashes is much harder to bypass than the coarse-grained approach of CFG. Overwriting a vtable entry with a different function pointer seems nearly impossible, as a hash collision is very unlikely given the 55 bits used in the hash.
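To put a rough number on "very unlikely", we can estimate the chance that some other function happens to carry the hash a call site expects. The sketch assumes the 55-bit hashes are uniformly distributed and independent, which is our simplification, and the count of candidate functions is a made-up figure.

```python
import math

HASH_BITS = 55
CANDIDATE_FUNCTIONS = 100_000   # hypothetical number of functions an attacker could point to

p_single = 2.0 ** -HASH_BITS    # chance that one unrelated function has the expected hash

# P(at least one match) = 1 - (1 - p)^n, computed with log1p/expm1 because
# 1 - 2**-55 is not representable as a distinct double.
p_any = -math.expm1(CANDIDATE_FUNCTIONS * math.log1p(-p_single))

print(f"single function: {p_single:.3e}")
print(f"any of {CANDIDATE_FUNCTIONS}: {p_any:.3e}")
```

Even with a generous pool of candidates, the probability stays on the order of one in hundreds of billions under these assumptions.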

Falling Back

At this point, it seems as though XFG has successfully managed to mitigate any attempts at meaningful vtable overwrites and is much more secure than CFG. However, we still need to investigate what happens when the hash comparison fails.

If execution is not dispatched to the call target, the code segment shown below is executed:

ntdll!LdrpDispatchUserCallTargetXFG+0x17:
00007ffb`27dd1247 4c8bd8 mov r11,rax
0:019>
ntdll!LdrpDispatchUserCallTargetXFG+0x1a:
00007ffb`27dd124a 48c1e008 shl rax,8
0:019>
ntdll!LdrpDispatchUserCallTargetXFG+0x1e:
00007ffb`27dd124e 418ac2 mov al,r10b
0:019>
ntdll!LdrpDispatchUserCallTargetXFG+0x21:
00007ffb`27dd1251 48c1c808 ror rax,8
0:019>
ntdll!LdrpDispatchUserCallTargetXFG+0x25:
00007ffb`27dd1255 49c1eb09 shr r11,9
0:019>
ntdll!LdrpDispatchUserCallTargetXFG+0x29:
00007ffb`27dd1259 49c1e303 shl r11,3
0:019>
ntdll!LdrpDispatchUserCallTargetXFG+0x2d:
00007ffb`27dd125d 4c031d44910f00 add r11,qword ptr [ntdll!LdrSystemDllInitBlock+0xb8 (00007ffb`27eca3a8)] ds:00007ffb`27eca3a8=00007df5f77d0000
0:019>
ntdll!LdrpDispatchUserCallTargetXFG+0x34:
00007ffb`27dd1264 4d8b1b mov r11,qword ptr [r11] ds:00007ff5`e41c7868=1111144444444444

Listing – Fetching a bitmap value

We notice that the call target function address is moved into R11 and used as an index into the CFG bitmap. The very same 64-bit CFG bitmap value is moved into R11 at the end of the code segment.
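The register shuffling in this segment is easier to follow when replayed step by step. In the sketch below, the target address and bitmap base come from the listings, while the value in R10 is a stand-in (we reuse the Chakra.dll hash, since no hash was set in our debugging session). The listing does not say why the low byte of R10 is stashed in the top byte of RAX, so the code only reproduces the effect.

```python
# Replay of the first fallback segment with 64-bit register semantics.

MASK64 = (1 << 64) - 1

def ror64(value: int, count: int) -> int:
    """Rotate a 64-bit value right by count bits."""
    return ((value >> count) | (value << (64 - count))) & MASK64

rax = 0x00007FFB27DE1AB0          # ntdll!NtCreateFile
r10 = 0xF8D8BEB272D33870          # stand-in hash value (assumption)
bitmap_base = 0x00007DF5F77D0000  # qword at LdrSystemDllInitBlock+0xb8

r11 = rax                              # mov r11,rax
rax = (rax << 8) & MASK64              # shl rax,8
rax = (rax & ~0xFF) | (r10 & 0xFF)     # mov al,r10b
rax = ror64(rax, 8)                    # ror rax,8 -> low hash byte now sits in bits 56..63
r11 = (r11 >> 9) << 3                  # shr r11,9 ; shl r11,3
r11 = (r11 + bitmap_base) & MASK64     # add r11,[LdrSystemDllInitBlock+0xb8]

print(hex(rax), hex(r11))
```

The resulting R11 is the same bitmap entry address that the CFG routine computed for NtCreateFile.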

This seems confusing at first, but if we continue execution, we’ll also find the code shown below:

ntdll!LdrpDispatchUserCallTargetXFG+0x37:
00007ffb`27dd1267 48c1c803 ror rax,3
0:019>
ntdll!LdrpDispatchUserCallTargetXFG+0x3b:
00007ffb`27dd126b 448ad0 mov r10b,al
0:019>
ntdll!LdrpDispatchUserCallTargetXFG+0x3e:
00007ffb`27dd126e 48c1c003 rol rax,3
0:019>
ntdll!LdrpDispatchUserCallTargetXFG+0x42:
00007ffb`27dd1272 a80f test al,0Fh
0:019>
ntdll!LdrpDispatchUserCallTargetXFG+0x44:
00007ffb`27dd1274 7511 jne ntdll!LdrpDispatchUserCallTargetXFG+0x57 (00007ffb`27dd1287) [br=0]
0:019>
ntdll!LdrpDispatchUserCallTargetXFG+0x46:
00007ffb`27dd1276 4d0fa3d3 bt r11,r10
0:019>
ntdll!LdrpDispatchUserCallTargetXFG+0x4a:
00007ffb`27dd127a 732b jae ntdll!LdrpDispatchUserCallTargetXFG+0x77 (00007ffb`27dd12a7) [br=0]
0:019>
ntdll!LdrpDispatchUserCallTargetXFG+0x4c:
00007ffb`27dd127c 48c1e008 shl rax,8
0:019> p
ntdll!LdrpDispatchUserCallTargetXFG+0x50:
00007ffb`27dd1280 48c1e808 shr rax,8
0:019>
ntdll!LdrpDispatchUserCallTargetXFG+0x54:
00007ffb`27dd1284 48ffe0 jmp rax {ntdll!NtCreateFile (00007ffb`27de1ab0)}

Listing – Dispatching execution based on the bitmap

The call target address is yet again used as an index to check the bitmap value through a bit test. At this point, we should observe that the code is almost identical to that of CFG.

After the bit test, we once again find that NtCreateFile is a valid call target and execution is dispatched to it. This happens even though we did not supply any hash, and the initial hash comparison failed.
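The second segment can be replayed the same way. The sketch starts from a RAX whose top byte holds a stand-in hash byte (0x70, our assumption) and uses the bitmap entry from the listing. It shows that the bit index is derived from the target address alone and that the final shift pair restores the original address before the jump.

```python
# Replay of the second fallback segment (aligned path only).

MASK64 = (1 << 64) - 1

def ror64(value: int, count: int) -> int:
    return ((value >> count) | (value << (64 - count))) & MASK64

def rol64(value: int, count: int) -> int:
    return ((value << count) | (value >> (64 - count))) & MASK64

rax = (0x70 << 56) | 0x00007FFB27DE1AB0   # target with a stand-in hash byte on top
r10 = 0xF8D8BEB272D33870                  # stand-in hash value (assumption)
r11 = 0x1111144444444444                  # bitmap entry from the listing

rax = ror64(rax, 3)                       # ror rax,3
r10 = (r10 & ~0xFF) | (rax & 0xFF)        # mov r10b,al
rax = rol64(rax, 3)                       # rol rax,3
aligned = (rax & 0x0F) == 0               # test al,0Fh
bit_index = r10 & 0x3F                    # bt r11,r10 uses the offset modulo 64
is_valid = bool((r11 >> bit_index) & 1)
rax = (rax << 8) & MASK64                 # shl rax,8 drops the stashed byte
rax >>= 8                                 # shr rax,8

print(aligned, bit_index, is_valid, hex(rax))
```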

In effect, XFG falls back to using CFG when the correct hash is not provided. From the example shown with NtCreateFile, it's clear that XFG does not block us from overwriting a vtable entry with a CFG-valid call target.
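Putting the pieces together, the overall behavior we observed can be expressed as a short model. This is a simplification under assumptions: helper names are ours, only the aligned paths from the listings are modeled, and the memory and bitmap contents are limited to the values seen in the debugger.

```python
# Simplified end-to-end model of the observed XFG dispatch behavior.

BITMAP_BASE = 0x00007DF5F77D0000
NT_CREATE_FILE = 0x00007FFB27DE1AB0
CHAKRA_HASH = 0xF8D8BEB272D33870

memory = {NT_CREATE_FILE - 8: 0x0000000000841F0F}   # qword before NtCreateFile
bitmap = {0x00007FF5E41C7868: 0x1111144444444444}   # entry covering NtCreateFile

def cfg_bit_set(target: int) -> bool:
    entry = bitmap.get(BITMAP_BASE + (target >> 9) * 8, 0)
    return bool((entry >> ((target >> 3) & 0x3F)) & 1)

def xfg_dispatch(target: int, call_site_hash: int) -> str:
    aligned = (target & 0x0F) == 0
    if aligned and (target & 0x0FFF) and memory.get(target - 8) == call_site_hash:
        return "dispatched (hash match)"
    # Fallback: the same bitmap test that CFG performs.
    if aligned and cfg_bit_set(target):
        return "dispatched (CFG fallback)"
    return "rejected"

# Wrong hash, but a CFG-valid target: still dispatched.
print(xfg_dispatch(NT_CREATE_FILE, CHAKRA_HASH))
# Wrong hash and a target whose bit is clear in the same bitmap entry: rejected.
print(xfg_dispatch(NT_CREATE_FILE + 0x10, CHAKRA_HASH))
```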

Conclusions

At face value, XFG appears to be a much more fine-grained CFI solution than CFG and should mitigate most exploitation techniques that attempt to overwrite a vtable.

However, in XFG’s implementation, Microsoft has essentially built in a security downgrade for cases when the hash-based comparison fails. This downgrade means that XFG is not any more secure than CFG, and will be susceptible to the same attacks.

It should be noted that XFG is only available in the Insider Preview, and may thus undergo changes before release. At the time of this writing, its implementation has remained unchanged for over six months.


About the Author

Morten Schenk is a content developer and trainer at Offensive Security with a focus on exploit development and mitigation bypasses on Windows. Morten loves to build exploits against difficult targets and continuously discover new techniques to combat mitigations.