On a number of platforms, the hypervisor is not controlled by the OS vendor but by the platform vendor. In those cases, how are virtualisation capabilities exposed?
The platform hypervisor
On Qualcomm platforms, this role is fulfilled by Gunyah. On the Arm Realms architecture, this role will be fulfilled by the Realm Management Monitor, which is owned by the platform vendor and has a documented ABI for the reference architecture.
Note that a platform can both have a platform hypervisor at a lateral privilege level (i.e. the Realms model) and allow the Linux kernel to run at EL2, which makes KVM available as well. However, at this point in time, Arm Realms does not yet exist in released silicon. Nested virtualisation could potentially even be used for hypervisor compatibility…
The kernel modules
The Linux kernel in the host OS will have to bridge between the virtual machine monitor and the platform hypervisor.
If this kernel driver doesn’t provide a KVM-compatible ABI, the virtual machine monitor will also need to be adapted to call the proper APIs for the specific hypervisor at play.
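As a rough illustration of what that adaptation can look like on the virtual machine monitor side, the sketch below probes whether a device node answers the KVM_GET_API_VERSION ioctl the way KVM does, and otherwise falls back to a hypervisor-specific backend. The ioctl number and the expected version 12 come from <linux/kvm.h>; the fallback table, the /dev/gunyah path and the function names are assumptions made for this example, not a description of any particular VMM.

```python
import fcntl
import os

# Values from <linux/kvm.h>: KVM_GET_API_VERSION is _IO(0xAE, 0x00),
# and the stable KVM API reports version 12.
KVM_GET_API_VERSION = 0xAE00
KVM_API_VERSION = 12

# Hypothetical fallback table: device node -> VMM backend name.
# The non-KVM node is an assumption for illustration only.
FALLBACK_BACKENDS = {"/dev/gunyah": "gunyah"}


def speaks_kvm_abi(path="/dev/kvm"):
    """Return True if `path` answers KVM_GET_API_VERSION like KVM does."""
    try:
        fd = os.open(path, os.O_RDWR | os.O_CLOEXEC)
    except OSError:
        return False  # node missing or not accessible
    try:
        return fcntl.ioctl(fd, KVM_GET_API_VERSION, 0) == KVM_API_VERSION
    except OSError:
        return False  # node exists but rejects the KVM ioctl
    finally:
        os.close(fd)


def select_backend(kvm_path="/dev/kvm", fallbacks=FALLBACK_BACKENDS):
    """Pick the backend the VMM should use, or None if nothing is usable."""
    if speaks_kvm_abi(kvm_path):
        return "kvm"
    for node, backend in fallbacks.items():
        if os.path.exists(node):
            return backend
    return None
```

Calling select_backend() on a host where the platform driver exposes a KVM-compatible node would return "kvm" and the existing VMM code path can be reused; any other result means the VMM needs a dedicated backend for that hypervisor.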
Opaque virtual machines
One of the reasons to delegate virtualisation to a platform hypervisor can be to have the memory contents and other state of the VM inaccessible from the host OS for security reasons. This can be combined with remote attestation to result in an enclave solution.
Note on Google pKVM and Nitro Enclaves
Stricto sensu, VM enclaves only need a trusted hypervisor; it does not have to be a platform hypervisor.
AWS’s Nitro Enclaves uses that design principle.
Google’s pKVM also leverages that principle: the kernel image is a trusted component. To keep security properties even if the kernel is breached, Arm’s Virtualisation Host Extensions are not used in that scenario: with protected KVM, KVM always runs at its own privilege level.
What are the downsides?
Platform hypervisors are not guaranteed to have a full feature set. Depending on the vendor, they might also be frozen at device release. Compared to an OS-controlled virtualisation stack, the risk is never getting new features, and getting security fixes late, if at all.
The communication channels between the enclave and the host will have to be carefully secured. Lack of hardening might result in the host being able to get code execution in the guest, defeating the point of having an enclave. Special care should also be taken with emulated PCIe devices, with a focus on DMA.
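To make the hardening point more concrete, here is a minimal sketch of how guest-side code might treat a message coming from the host as untrusted input. The wire format (a 16-bit type followed by a 32-bit payload length), the size limit and the accepted message types are all hypothetical and chosen only for the example; a real channel would define its own.

```python
import struct

# Hypothetical wire format: little-endian u16 message type, u32 payload length.
HEADER = struct.Struct("<HI")
MAX_PAYLOAD = 4096      # hypothetical upper bound chosen for the example
KNOWN_TYPES = {1, 2}    # hypothetical message types the guest accepts


def parse_host_frame(buf):
    """Validate and split one frame received from the untrusted host.

    Raises ValueError on anything unexpected instead of trusting
    host-supplied fields.
    """
    # Snapshot first, so a buffer shared with the host cannot change
    # between the checks and the use of the data.
    data = bytes(buf)
    if len(data) < HEADER.size:
        raise ValueError("truncated header")
    msg_type, length = HEADER.unpack_from(data)
    if msg_type not in KNOWN_TYPES:
        raise ValueError("unknown message type")
    if length > MAX_PAYLOAD:
        raise ValueError("payload too large")
    if len(data) - HEADER.size != length:
        raise ValueError("length field does not match frame size")
    return msg_type, data[HEADER.size:]
```

The design choice worth noting is that every field the host controls is checked against a guest-side limit before it is used, and that the data is copied out of any shared buffer before validation.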
What does Android 13’s virtualisation support have?
Android 13’s virtualisation support does not mandate using pKVM or running the kernel at EL2. It also allows for platform hypervisors.