Last year, I found a security hole in Native Client on 64-bit Windows that could be used to escape from the Native Client sandbox. Fortunately I found the hole before Native Client was enabled by default in Chrome. I recently wrote up the document below to explain the bug and how we fixed it.
Native Client currently relies on a small, process-local, in-memory patch to NTDLL on 64-bit versions of Windows in order to prevent a problem that would create a hole in Native Client's x86-64 sandbox.
The problem arises because Windows does not have an equivalent of Unix's sigaltstack() system call.
On Unix, a POSIX signal handler may be registered using signal() or sigaction(). When a process faults (e.g. as a result of a memory access error or an illegal instruction), the kernel passes control to the signal handler that is registered for the process. Normally, the signal handler is run on the stack of the thread that faulted. However, if an "alternate signal stack" has been registered using sigaltstack(), the signal handler will run on that stack instead.
On Windows, "vectored exception handlers" are similar to Unix signal handlers, except that a vectored exception handler always gets run on the thread's current stack. This might have been OK, except that Windows has a different division of responsibility between kernel and userland compared with Unix. Windows does more in userland, and this creates problems for NaCl.
On Unix, sigaction() is a syscall that user code calls that records a user-code signal handler function in a kernel data structure. If no signal handler is registered by user code, then when a fault occurs in a process, the kernel never passes control back to userland code in that process.
On Windows, AddVectoredExceptionHandler() is provided in userland by a DLL rather than being provided by the kernel. AddVectoredExceptionHandler() adds the handler function to a list in userland memory. When a fault occurs in the process, the Windows kernel passes control to a function called KiUserExceptionDispatcher in NTDLL, in userland. KiUserExceptionDispatcher checks whether a handler function was registered via AddVectoredExceptionHandler() (or via various other mechanisms), and if so, it calls the handler; otherwise, it terminates the process. The kernel therefore calls KiUserExceptionDispatcher regardless of whether any vectored exception handlers were registered.
This is a problem for NaCl because when we are running sandboxed code, %rsp points into untrusted address space. If sandboxed code faults on a bad memory access or a HLT instruction (which it is allowed to do), Windows calls KiUserExceptionDispatcher, which then runs on the untrusted stack. Since NaCl allows multi-threading, other sandboxed threads can modify the untrusted stack that KiUserExceptionDispatcher is running on.
Furthermore, KiUserExceptionDispatcher is a complex function which calls other NTDLL functions internally. These functions return by using the RET instruction to pop a return address off the stack, which means that control flow is determined by values stored on the untrusted stack. An attacker running as sandboxed code in another thread can modify the memory location containing a return address and so cause the NTDLL routine to jump to an address of the attacker's choice. This allows the attacker to cause a jump to a non-bundle-aligned address, which allows the attacker to escape from the NaCl sandbox.
The fix: patching NTDLL
Our fix is to patch the KiUserExceptionDispatcher entry point inside NTDLL so that this routine will safely terminate the process.
Our patching is process-local: it does not affect the NTDLL.DLL that other processes use. We only modify the in-memory copy of NTDLL.DLL that is loaded into the NaCl process.
Patching is necessary because the address of KiUserExceptionDispatcher is fixed in the Windows kernel at boot time. We don't have any way to tell the kernel to use a routine at a different address. The kernel randomizes the address of NTDLL.DLL at boot time, but thereafter NTDLL.DLL is loaded at the same address in all processes and the address of KiUserExceptionDispatcher is fixed.
What do we do to terminate the process safely? While we could overwrite the start of KiUserExceptionDispatcher with a HLT instruction, this instruction will simply cause the kernel to re-run KiUserExceptionDispatcher again, because the kernel does not track whether a thread is currently handling an exception. This would cause the kernel to repeatedly push a stack frame containing saved register state and re-call KiUserExceptionDispatcher until the thread runs out of stack. Although this will eventually terminate the process, it would be slower than we would like.
So instead, the original version of our patch overwrites the start of KiUserExceptionDispatcher with:
mov $0, %esp hlt
We set the stack pointer to zero first so that the kernel can't write another stack frame. If a fault occurs when the kernel is unable to write a stack frame to memory, we observe that the kernel does not attempt to call KiUserExceptionDispatcher and instead it just terminates the process.
Breakpad crash reporting
We extended the patch to support crash reporting for crashes in trusted code. (In Chrome, this uses the Breakpad crash reporting system.)
The extended version of the patched KiUserExceptionDispatcher looks at a thread-local variable to check whether the fault occurred inside trusted code or sandboxed code. If trusted code faulted, we know we are safely running on a trusted stack, and we pass control back to the original unpatched version of KiUserExceptionDispatcher (which eventually causes the Breakpad crash handler to run and report the crash). Otherwise, we assume %rsp points to an unsafe untrusted stack, and we terminate the process using the same "mov $0, %esp; hlt" sequence as before.
The problem does not arise on 32-bit versions of Windows because of the way NaCl's x86-32 sandbox uses segment registers.
On 32-bit Windows, when sandboxed code faults, the kernel notices that the %ss register contains a non-default value, and it terminates the process without attempting to write a stack frame or call KiUserExceptionDispatcher.
Our NTDLL patch has a couple of limitations:
- We are not using a stable interface. KiUserExceptionDispatcher is a Windows-internal interface between the Windows kernel and NTDLL, and it might change in future versions of Windows. We are relying on undocumented behaviour.
- We don't control what data the Windows kernel writes to the untrusted stack. This data is momentarily visible to other sandboxed threads before the process is terminated, and it will be easily visible in embeddings of NaCl that support shared memory and multiple processes. We don't think the kernel writes any sensitive data in this stack frame, and seems unlikely that it would do so in the future, but it's still possible.
There are a couple of alternatives to patching NTDLL:
- Using the Windows debugging API: It is possible to catch a fault in
sandboxed code before KiUserExceptionDispatcher is run using the
Windows debugging API. We use this API for implementing untrusted
hardware exception handling.
However, using this API comes at a cost: if a process has a debugger attached, Windows will temporarily suspend the whole process each time it creates or exits a thread. This would slow down thread creation and teardown.
In practice, we wouldn't want to completely replace the NTDLL patch with use of the Windows debugging API. For defence-in-depth, we would want to use both.
- Changing the sandbox model: We could change the sandbox model so that %rsp does not point into untrusted address space. One option is to use a different register as the untrusted stack pointer, and disallow use of CALL, PUSH and POP. Another option would be to have %rsp point to a thread-private stack (outside of the sandbox's usual address space). Both options would be quite invasive changes to the sandbox model.
Windows runs an internal function on the userland stack whenever a hardware exception occurs. This behaviour does not appear to be documented anywhere, which suggests that Windows treats this use of the userland stack as an implementation detail rather than part of the interface. For most applications, this use of the userland stack does not matter. Native Client is unusual in that this stack usage creates a hole in NaCl's sandbox, which we have to work around.
In addition to being undocumented, Windows' use of the userland stack can be surprising to people from a Unix background, given that Unix signal handlers behave differently. However, when one considers the complexity of Windows' exception handling interfaces -- which include Structured Exception Handlers, Vectored Exception Handlers and UnhandledExceptionFilters -- it makes more sense that this complex logic would live in userland.
When it comes to the kernel/userland boundary, what can be an implementation detail for one person can be an important interface detail for another. The lesson is that these details are often critical for the security of a sandbox.
The original sandbox hole: issue 1633 (filed in April 2011 and fixed in June 2011)
Breakpad crash reporting support for 64-bit Windows: issue 2095
Untrusted hardware exception handling for 64-bit Windows: issue 2536
Documentation for AddVectoredExceptionHandler() on MSDN