Notes on Virtualization.Framework

These are my notes from reversing implmentation details of the macOS Virtualization.Framework. This writeup doesn't lead to a notable outcome. It is loosely organized notes on a process with a few takeaways.

Background and Problem

Big Sur on macOS that added the Virtualization.Framework. This is a higher-level API for creating and managing virtual machines than the Hypervisor.Framework which gives you a seemingly thin wrapper around Intel VT-x.

The great thing about Virtualization.Framework is that it provides a fairly high-level API to boot into a Linux Kernel: VZLinuxBootLoader. Booting Linux was possible with Hypervisor.Framework, you would just need some bootstrapping implementation like xhyve.

Recently I was using vftool which provides a thin wrapper CLI around Virtualization.Framework but I was seeing fairly unreliable results with booting various Linux distributions. The workflow outlined in the Github issues vftool by user droidix was helpful in getting an Ubuntu VM setup but following similar workflows didn't quite work for all Linux distributions. I could successfully boot Ubuntu, Fedora, and OpenSUSE using the same rough workflow described here but was less successful in booting Arch and TinyCoreLinux. TinyCoreLinux is notable example distribution because that is a test distribution that xhyve uses. I had assumed (with no clear justification) that Virtualization.Framework was borrowing code from xhyve given that xhyve is used in popular projects like HyperKit. Attempting to boot TinyCoreLinux using vftool lead to a “successful” boot of a VM but as soon as the the VM starts it transitions to a ‘Done’ state within a second or so. There is no output to the serial console during this time.

$ vftool -k tinycorelinux/vmlinuz -i tinycorelinux/core.gz -m 1024 -a 'console=hvc0'
2020-12-29 11:25:19.792 vftool[54377:1257388] vftool (v0.3 10/12/2020) starting
2020-12-29 11:25:19.793 vftool[54377:1257388] +++ kernel at tinycorelinux/vmlinuz, initrd at tinycorelinux/core.gz, cmdline 'console=hvc0', 1 cpus, 1024MB memory
2020-12-29 11:25:19.794 vftool[54377:1257388] +++ fd 3 connected to /dev/ttys008
2020-12-29 11:25:19.794 vftool[54377:1257388] +++ Waiting for connection to:  /dev/ttys008
2020-12-29 11:25:24.905 vftool[54377:1257388] +++ Configuration validated.
2020-12-29 11:25:24.905 vftool[54377:1257388] +++ canStart = 1, vm state 0
2020-12-29 11:25:25.266 vftool[54377:1257486] +++ VM started
2020-12-29 11:25:25.913 vftool[54377:1257388] +++ Done, state = 3
$

It is very possible I'm not using vftool properly, or providing the right format of kernel or initrd to vftool. However, there isn't much of a feedback loop. It is very unclear why this failed to successfully boot. While it is probably not the most scientific approach to debugging, I started to dig more into how Virtualization.Framework was implemented so I could see if I could better understand what I might be doing wrong (Spoiler: I didn't figure this out but I took notes along the way).

Debugging vftool and Virtualization.Framework

My original approach was debugging vftool with lldb and trying to find some indication of something failing. During this time I also used this helpful guide in dumping the dynamic library cache so I could drop {Virtualization,Hypervisor}.framework binaries into Hopper.

This took me the most time (I don't debug Obj-C too often so it took some time to get cozy in the debugger) but I don't believe Virtualization.Framework itself uses much (if any) of Hypervisor.Framework. I didn't comprehensively reverse it but my takeaway is that Hypervisor.Framework is a small client that provides a high level API for sending XPC messages to com.apple.Virtualization.VirtualMachine.xpc which does the hardware emulation and kernel bootstrapping in a sandbox.

As an example see this Hopper disassmebly snippet for VZVirtualMachine's _ensureConnectionWithCompletionHandler which creates an XPC connection to “com.apple.Virtualization.VirtualMachine”:

                     -[VZVirtualMachine _ensureConnectionWithCompletionHandler:]:
00007fff6efb47a0         push       rbp                                         ; Begin of try block
<snip>
00007fff6efb47be         mov        rax, qword [rax]
00007fff6efb47c1         mov        qword [rbp+var_30], rax
00007fff6efb47c5         mov        rdi, rdx
00007fff6efb47c8         call       qword [_objc_retain_7fff8703c418]           ; _objc_retain, _objc_retain_7fff8703c418
00007fff6efb47ce         mov        rbx, rax
00007fff6efb47d1         mov        rsi, qword [r12+0x10]                       ; argument "targetq" for method imp___stubs__xpc_connection_create
00007fff6efb47d6         lea        rdi, qword [aComapplevirtua]                ; End of try block started at 0x7fff6efb47a0, Begin of try block (catch block at 0x7fff6efb4db6), argument "name" for method imp___stubs__xpc_connection_create, "com.apple.Virtualization.VirtualMachine"
00007fff6efb47dd         mov        qword [rbp+var_90], rax
00007fff6efb47e4         call       imp___stubs__xpc_connection_create          ; xpc_connection_create

Reversing com.apple.Virtualization.VirtualMachine.xpc

After sorting out that the Hypervisor.Framework client seems to defer most of the interesting virtualization logic to an XPC service I started reversing that service. Thankfully, the XPC service isn't in some fancy cache so I could just nab it off the filesystem.

$ pwd
/System/Library/Frameworks/Virtualization.framework/XPCServices/com.apple.Virtualization.VirtualMachine.xpc/Contents

$ ls -ltrh
total 16
-rw-r--r--  1 root  wheel   520B Jan  1  2020 version.plist
drwxr-xr-x  3 root  wheel    96B Jan  1  2020 _CodeSignature/
drwxr-xr-x  3 root  wheel    96B Jan  1  2020 Resources/
drwxr-xr-x  3 root  wheel    96B Jan  1  2020 MacOS/
-rw-r--r--  1 root  wheel   1.6K Jan  1  2020 Info.plist

$ file MacOS/com.apple.Virtualization.VirtualMachine
MacOS/com.apple.Virtualization.VirtualMachine: Mach-O universal binary with 2 architectures: [x86_64:Mach-O 64-bit executable x86_64] [arm64e:Mach-O 64-bit executable arm64e]
MacOS/com.apple.Virtualization.VirtualMachine (for architecture x86_64):        Mach-O 64-bit executable x86_64
MacOS/com.apple.Virtualization.VirtualMachine (for architecture arm64e):        Mach-O 64-bit executable arm64e

Running the VirtualMachine binary in a shell just crashes:

$ ./MacOS/com.apple.Virtualization.VirtualMachine
fish: './MacOS/com.apple.Virtualizatio…' terminated by signal SIGILL (Illegal instruction)
$

I didn't sort out why this happens and if it is intentional. I suppose I didn't expect this to work!

The XPC service has a pretty large sandbox profile that I didn't dig into too much:

$ head Resources/com.apple.Virtualization.VirtualMachine.sb
;;; Copyright (c) 2020 Apple Inc.  All Rights reserved.
;;;
;;; WARNING: The sandbox rules in this file currently constitute
;;; Apple System Private Interface and are subject to change at any time and
;;; without notice.
;;;
(version 1)

(disable-full-symbolication)
(deny default)
$ 

(I did notice that (even on successful Ubuntu boots) that some syscalls were rejected in the sandboxed com.apple.Virtualization.VirtualMachine process. You can see this in Console.app.)

And for reference here are the entitlements for the service:

$ codesign -d --entitlements :- ../../com.apple.Virtualization.VirtualMachine.xpc
Executable=/System/Library/Frameworks/Virtualization.framework/Versions/A/XPCServices/com.apple.Virtualization.VirtualMachine.xpc/Contents/MacOS/com.apple.Virtualization.VirtualMachine
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
        <key>com.apple.security.hypervisor</key>
        <true/>
        <key>com.apple.vm.networking</key>
        <true/>
</dict>
</plist>
$

I spent a bit of time statically tracing through the XPC service's subroutines. Based on the hv_* symbol references I found in the binary it seemed like I'd eventually find the bootloader routine. Starting from the entry point of the binary and tracing down was time consuming and wasn't turning up anything interesting.

I also spent some time tracing back from various hv_* and pthread* calls but didn't find anything that was obviously bootstrapping code.

I eventually started searching the binary for magic numbers referenced in this Linux/x86 boot protocol documentation and eventually found checks like (This is using Hopper's asm->C lifting):

loc_10008e0e8:
    LODWORD(rdx) = 0x1000;
    LODWORD(rcx) = 0x0;
    rdx = var_78;
    if (LODWORD(*(int16_t *)(rdx + 0x1fe) & 0xffff) != 0xaa55) goto loc_10008e4d4;

This appears to be checking that offset 0x1fe has the bytes 0xaa55. 0x55aa is a magic number used as a sort of checksum for a boot sector so this seemed promising. A bit further down in this same subroutine I found:

loc_10008e0fe:
    LODWORD(rcx) = 0x0;
    if (*(int32_t *)(rdx + 0x202) != 0x53726448) goto loc_10008e4d4;

0x53726448 is the ASCII string ‘SrdH’ which is yet another magic number from the boot protocol documentation (The 0x202 offset matches too).

The routine here is quite large and I assume it's some kexec implementation (or subroutine of kexec). I did not some variations between Virtualization.Framework's implementation and xhyve's kexec implementation.

For example, xhyve's kexec rejects booting kernels under the following conditions:

	if ((zp->setup_header.setup_sects == 0) ||    /* way way too old */
		(zp->setup_header.boot_flag != 0xaa55) || /* no boot magic */
		(zp->setup_header.header != HDRS) ||      /* way too old */
		(zp->setup_header.version < 0x020a) ||    /* too old */
		(!(zp->setup_header.loadflags & 1)) ||    /* no bzImage */
		(sz < (((zp->setup_header.setup_sects + 1) * 512) +
		(zp->setup_header.syssize * 16))))        /* too small */
	{
		/* we can't boot this kernel */
		fclose(f);
		return -1;
	}

It looks like the Virtualization.Framework implementation supports boot protocol 2.04 and above (Unlike xhvye's supporting 2.10 and above):

loc_10008e10e:
    LODWORD(rcx) = 0x0;
    LODWORD(rax) = *(int16_t *)(rdx + 0x206) & 0xffff;
    if (LODWORD(rax) < 0x204) goto loc_10008e4d4;

At this point, I found one potential difference that could explain the variation between xhyve's behavior and Virtualization.framework. I don't think this specific variation explains the difference in behavior I observed between vftool and xhyve (TinyCoreLinux seems to use the same boot protocol version as the Ubuntu image I was booting). However, I am happy that I did end up finding something that appears to have notable implementation details of the VZLInuxBootLoader.

I haven't yet finished digging here and may spend some more time reading the VirtualMachine XPC implementation. Here are some other miscellaneous obvservations I made when poking around.

Undocumented, potentially not yet implemented APIs

The Virtualization.framework implementation has references to bootloaders that aren't VZLinuxBootloaders such as VZEFIBootloader. These are not documented currently so I assume they are either not stable or not complete.

I'd assume that VZEFIBootLoaders would be more generally useful than the Linux-specific VZLinuxBootLoader.

$ pwd
$DYLIB_CACHE_PATH/System/Library/Frameworks/Virtualization.framework/Versions/A

$ file Virtualization
Virtualization: Mach-O 64-bit dynamically linked shared library x86_64

$ strings Virtualization | grep '_VZ'
_mode == _VZSocketSerialPortAttachmentModeServer
T@"<_VZVirtualMachineConfigurationDecoderDelegate>",W,V_delegate
v32@?0@"_VZGraphicsDeviceConfiguration"8Q16^c24
T@"_VZEFIVariableStore",&,V_variableStore
T@"_VZFramebuffer",&
T@"<_VZVirtualMachineViewDelegate>",W,V_delegate
_VZVirtualMachineGuestTypeLinux

$ nm -gU Virtualization | grep 'VZ.*Boot'
00007fff8d1de2d8 S _OBJC_CLASS_$_VZBootLoader
00007fff8d1deeb8 S _OBJC_CLASS_$_VZLinuxBootLoader
00007fff8d1de558 S _OBJC_CLASS_$__VZEFIBootLoader
00007fff8d1de300 S _OBJC_METACLASS_$_VZBootLoader
00007fff8d1deee0 S _OBJC_METACLASS_$_VZLinuxBootLoader
00007fff8d1de580 S _OBJC_METACLASS_$__VZEFIBootLoader
$ 

Useful next steps

Attaching a debugger to the VirtualMachine XPC service is likely useful as it seems clear that most of the interesting bootstrapping code lives there. However, I believe I'd need to disable SIP to do this and I don't love disabling security features on my hardware. If I revisit this (or someone else does) starting from here seems reasonable.

I also didn't dig too deeply into the XPC messages from the Virtualization.Framework “client” to the VirtualizationMachine.xpc “server”. It's my understanding that XPC is schemaless so I doubt it's easy to find the schema of messages passed over XPC statically. Debugging XPC messages might be a useful next step as I don't need to hook into SIP protected processes (Assuming I just care about the ‘client’ view).

Gotcha in using Hopper's dead code removal

Reversing Objective-C code that makes use of blocks took me some time. It was especially confusing to use Hopper's lifting of asm to C which, by default, would remove “dead” code. This would generate C like:

  if (rbx != 0x0) {
          var_40 = *__NSConcreteStackBlock;
          [rbx retain];
          dispatch_async(*__dispatch_main_q, &var_40);
  // snip

Which is perplexing since var_40 should be a block but is clearly just a pointer to a ‘type’. Looking at the asm directly was more useful here:

000000010001abd1         mov        rax, qword [__NSConcreteStackBlock_1000b4008] ; __NSConcreteStackBlock_1000b4008
000000010001abd8         mov        qword [rbp+var_40], rax
000000010001abdc         mov        eax, 0xc2000000
000000010001abe1         mov        qword [rbp+var_38], rax
000000010001abe5         lea        rax, qword [sub_10001ac60]                  ; sub_10001ac60
000000010001abec         mov        qword [rbp+var_30], rax
000000010001abf0         lea        rax, qword [__ZTINSt3__117bad_function_callE+696] ; 
# <snip>
000000010001ac16         call       imp___stubs__dispatch_async                 ; dispatch_async

I eventually sorted out that that the asm is probably building a block literal struct. Hopper seems to (understandbly) think the sub_10001ac60 reference is unused so it will remove it as dead code. However, I believe this is the invoke field of the block struct which points to the instructions that would be executed by the dispatch_async call.

P-256 in Sage