Hello, my name is Dov Murik from IBM. Together with Hubertus Franke, I will present securing Linux VM boot with AMD SEV measurement. First, I'd like to say that this is the work of an entire team: our colleagues from IBM, Tobin, James and Jim, and of course the feedback and helpful reviews from the EDK2 and OVMF community and the QEMU community, all contributed to the work that we present today. Let's set the stage and explain confidential computing. The goal here is to protect the guest from the hypervisor. We assume there's a cloud service provider, which in this case is untrusted, and which deploys several host machines, which are again untrusted. The guest owner, or the customer, wants to run sensitive workloads in guest VMs inside these host machines on that cloud service provider. The solution with SEV is to encrypt the memory of the VMs so that the host, and therefore the cloud provider, cannot read the content of the memory. The problem is that memory encryption is not enough. The memory might be encrypted, but the guest owner has no idea what exactly is running inside their guest. So the guest owner needs to verify that the desired workload is indeed running inside the guest, and not something else which might steal their secrets. If we talk about the hardware solution, AMD SEV is a feature of modern AMD processors, which include a secure processor, also called the PSP. This secure processor handles all the sensitive operations, like storing the keys and so on. The VM memory is encrypted on the fly, so the host cannot read the VM memory. Another feature which is relevant to our story today is the guest launch measurement. This is the hash of the initial guest memory before the VM starts running its first instruction. This hash is signed, together with some other information about the VM, by the secure processor, so it cannot be forged.
Similarly, another idea which is important for our talk today is guest secret injection. Again, in AMD SEV it happens only at launch time, after verifying the measurement: the guest owner has the ability to inject a secret securely into the VM in a way that the host cannot be privy to this information. So let's walk through the normal VM boot process with the -kernel argument to QEMU, or -initrd. Here's an example QEMU command line for initializing such a VM. What happens inside QEMU is that it reads those files into the firmware config device, and then it loads OVMF, which is the firmware that runs inside the VM, into the guest memory. It instructs SEV to measure the memory, and this measurement is reported to the guest owner. The guest owner then approves the measurement and tells the cloud provider, or QEMU, to launch the VM. Then the VM starts running with the OVMF code. OVMF then needs the kernel, so it reads the content of the kernel, the initrd and the kernel command line from the firmware config device, loads them into memory, and then jumps to the kernel to continue the boot process there. Here is the same process in a sequence diagram with all the players. On the left is the guest owner, which is usually outside the cloud provider area. Then we have QEMU, which is untrusted in this setting, and then the guest VM and the AMD secure processor, which takes part in the measurement calculation. So first, again, QEMU reads the kernel file into the firmware config device, loads OVMF into the guest memory, and then uses the AMD secure processor to measure the guest memory. The measurement goes back all the way to the guest owner, which approves the measurement and tells QEMU to launch the VM. The VM starts, and then OVMF inside the guest VM reads the kernel from firmware config, QEMU provides the content of the kernel, and OVMF can then jump into that kernel to continue the boot process.
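The approval step on the guest owner side can be sketched in a few lines. This is only an illustration, assuming the launch measurement layout described in the AMD SEV API specification (an HMAC-SHA256, keyed with the guest owner's transport integrity key, over a fixed prefix byte, the firmware version, the guest policy, the digest of the launched memory, and a nonce). The function names and all the sample values below are made up for the example.

```python
import hashlib
import hmac


def expected_launch_measurement(tik, api_major, api_minor, build, policy,
                                launch_digest, mnonce):
    """Compute the measurement the guest owner expects to see.

    Assumed layout (from the AMD SEV API specification):
    HMAC-SHA256(0x04 || API_MAJOR || API_MINOR || BUILD ||
                POLICY (4 bytes, little endian) || launch digest || MNONCE),
    keyed with the transport integrity key (TIK).
    """
    message = (
        bytes([0x04])
        + bytes([api_major, api_minor, build])
        + policy.to_bytes(4, "little")
        + launch_digest
        + mnonce
    )
    return hmac.new(tik, message, hashlib.sha256).digest()


def approve_launch(reported, expected):
    """Approve the launch only if the reported measurement matches."""
    return hmac.compare_digest(reported, expected)


# Made-up sample values, for illustration only.
ovmf_image = b"placeholder bytes standing in for the OVMF image"
launch_digest = hashlib.sha256(ovmf_image).digest()
tik = bytes(16)      # the real TIK is a secret known to the guest owner
mnonce = bytes(16)   # the real nonce is reported alongside the measurement

expected = expected_launch_measurement(tik, 0, 24, 15, 0x01,
                                       launch_digest, mnonce)

# A host that loads different firmware produces a different launch digest.
other_digest = hashlib.sha256(b"some other firmware").digest()
reported_by_bad_host = expected_launch_measurement(tik, 0, 24, 15, 0x01,
                                                   other_digest, mnonce)
```

Note that in this plain flow the launch digest covers only what was placed in the initial guest memory, which here is OVMF.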
We'll now describe an attack on boot with -kernel. Assume a malicious host which runs QEMU, and when the guest owner asks to run kernel 5.13.0, the host will instead run kernel "malicious 5.13.0", which might include a module to steal memory or steal secrets. QEMU loads that malicious guest kernel into the firmware config device and then, as usual, loads OVMF into guest memory. SEV measures the memory, which includes only OVMF, the guest owner approves this measurement, and the launch continues. OVMF starts running. When OVMF needs the content of the kernel, it reads it from firmware config, but at that point it gets the content of the malicious kernel, which is loaded into memory. OVMF then goes on and jumps to that malicious kernel, which runs and might be able to steal the sensitive information from the VM, because it's running inside the trusted confidential VM. This happens because the secure processor hardware measured OVMF but didn't measure the other code that OVMF loaded, which is the kernel, the initrd and the kernel command line, because they're not part of the initial VM memory; they are loaded later through the firmware config device. Our solution here is basically to extend the measurement. What we do is add a list of hashes to the initial guest memory. These are the hashes of the kernel, the initrd and the command line, and since the list is part of the initial memory, the secure processor will measure both OVMF and this list of hashes. Once OVMF starts running, it reads the kernel, initrd and command line from firmware config, but it then verifies that the hashes of the binaries that it is reading match the list of hashes that appeared during the measurement. Here we see the hashes table that is constructed by QEMU during boot and then read by OVMF to verify the hashes. First we see in yellow the table header, then a length field of the entire table in green, and then we see three entries.
Each one has a GUID which describes the entry, then a length field in green, and then the hash in red, which is 32 bytes and represents the SHA-256 hash of that entry. So we see three entries. The first one here represents the command line, the second one the initrd, and the third one is the hash of the kernel content itself. Then we end with a few bytes of padding, according to the SEV standard for encrypting initial VM memory. As I said, this table is constructed by QEMU and then it is read by OVMF to verify the hashes. The same table should be constructed on the guest owner side to calculate the same measurement and verify that the signed measurement reported during launch is indeed identical to the expected digest. So here's how the solution works from end to end. QEMU loads OVMF into guest memory, and then it loads the hashes of the kernel, initrd and command line into guest memory as well. This is the table that we've just seen in the previous slide. Then SEV measures the entire guest memory, which now includes both OVMF and the hashes, and if the guest owner approves, the VM starts running, starting from OVMF. OVMF reads the kernel from the firmware config device. This is an insecure device, but OVMF then verifies the content that it just read, for example the kernel, against the expected hash that appeared in the hash list that we've seen before, and similarly for the initrd and for the kernel command line. If everything is okay, it loads this content into memory and then starts running the kernel. Let's go over a few possible attacks here. The first is the attack that we started from: the host uses the wrong kernel, initrd or command line. In that case the host will compute the hash, and since the hashes of all three entries are part of the measurement, the measurement won't match and the guest owner will not approve the launch of the VM.
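The table layout described above can be sketched as follows. This is a rough illustration rather than the QEMU code: the GUID values are hypothetical placeholders (the real ones are defined in the QEMU and OVMF sources), and the 2-byte little-endian length fields and the padding to a 16-byte boundary are assumptions made for the example.

```python
import hashlib
import struct
import uuid

# Hypothetical placeholder GUIDs, NOT the values used by QEMU and OVMF.
TABLE_GUID = uuid.UUID("00000000-0000-4000-8000-000000000001")
CMDLINE_GUID = uuid.UUID("00000000-0000-4000-8000-000000000002")
INITRD_GUID = uuid.UUID("00000000-0000-4000-8000-000000000003")
KERNEL_GUID = uuid.UUID("00000000-0000-4000-8000-000000000004")

GUID_SIZE = 16
LEN_SIZE = 2          # assumed: 2-byte little-endian length fields
PAD_ALIGNMENT = 16    # assumed: table padded to a 16-byte boundary


def build_entry(guid, blob):
    """One entry: GUID, length of the whole entry, SHA-256 of the blob."""
    digest = hashlib.sha256(blob).digest()
    entry_len = GUID_SIZE + LEN_SIZE + len(digest)
    return guid.bytes_le + struct.pack("<H", entry_len) + digest


def build_hash_table(cmdline, initrd, kernel):
    """Header GUID, total length, three entries, then zero padding."""
    entries = (
        build_entry(CMDLINE_GUID, cmdline)
        + build_entry(INITRD_GUID, initrd)
        + build_entry(KERNEL_GUID, kernel)
    )
    table_len = GUID_SIZE + LEN_SIZE + len(entries)
    table = TABLE_GUID.bytes_le + struct.pack("<H", table_len) + entries
    padding = (-len(table)) % PAD_ALIGNMENT
    return table + bytes(padding)


# Made-up inputs standing in for the real files.
table = build_hash_table(b"console=ttyS0", b"initrd bytes", b"kernel bytes")
tampered = build_hash_table(b"console=ttyS0", b"initrd bytes", b"evil kernel")
```

Because the table is part of the measured initial memory, any change to one of the three inputs changes the table and therefore the launch measurement. The guest owner would build the same bytes from their own copies of the files.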
A second attack is that the host replaces OVMF with their own version, which doesn't perform the verification of the hashes. In this case, again, OVMF is replaced, so the measurement will not match. Another attack here is that the host does use the expected OVMF and the expected hashes, but when the firmware config device is read, it gives the wrong content. At that point the measurement will be okay, but OVMF will detect this and refuse to load the content, because the content it reads from the firmware config device doesn't match the expected hash that appeared in the measurement. So we have all three of these attacks mitigated by this solution. The caveat for this solution is that the kernel, initrd and command line are all readable by the host, just as OVMF is today. So this should only be used when the kernel and initrd are not confidential, and there are use cases for that. If they are confidential and you want to give them secretly to the VM, then we suggest using encrypted disk boot, and you can hear about it in another KVM Forum talk from our team, from James Bottomley of IBM and Brijesh Singh from AMD. This solution is already implemented. It has two parts: a part in OVMF inside the guest VM, and a part in QEMU when starting a VM. In OVMF, first we need to designate a memory area, discoverable by QEMU, for the hashes list. Then the main functionality inside OVMF is, as content is loaded from the firmware config device, to verify that content against the expected hashes list, specifically the kernel, the initrd and the command line. If anything doesn't match the expected hash, then the boot is aborted, so the VM doesn't start at all. This is all already reviewed and merged into the EDK2 tree, back in July 2021.
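The check that the firmware performs can be illustrated with a small sketch. This is not the OVMF code (which is written in C); it only models the logic, and the names `BootAborted` and `verify_blob` as well as the sample blobs are invented for the example.

```python
import hashlib
import hmac


class BootAborted(Exception):
    """Raised when a loaded blob does not match its measured hash."""


def verify_blob(name, blob, expected_hashes):
    """Return the blob only if its SHA-256 matches the measured hash.

    expected_hashes stands in for the hashes list that was part of the
    measured initial memory; blob is the untrusted content read from
    the firmware config device.
    """
    expected = expected_hashes.get(name)
    if expected is None:
        raise BootAborted(f"no measured hash for {name}")
    actual = hashlib.sha256(blob).digest()
    if not hmac.compare_digest(actual, expected):
        raise BootAborted(f"hash mismatch for {name}")
    return blob


# Made-up content standing in for the files the guest owner intended.
expected_hashes = {
    "kernel": hashlib.sha256(b"good kernel").digest(),
    "initrd": hashlib.sha256(b"good initrd").digest(),
    "cmdline": hashlib.sha256(b"console=ttyS0").digest(),
}

# The host serves the expected kernel: verification passes.
loaded = verify_blob("kernel", b"good kernel", expected_hashes)

# The host serves a different kernel at read time: the boot is aborted.
try:
    verify_blob("kernel", b"malicious kernel", expected_hashes)
    attack_detected = False
except BootAborted:
    attack_detected = True
```

The important design point is that the expected hashes come from measured memory and not from the same untrusted channel as the blobs themselves.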
As for the QEMU part, when launching an SEV VM with a -kernel argument, we add the calculation of the hashes of the kernel, initrd and command line, if they are supplied, and populate the OVMF designated memory area described earlier, so that OVMF can then verify the files as they are loaded. This is already reviewed and we hope it will be included in QEMU 6.2, which is the upcoming version. This solution gives us the ability to use injected secrets inside confidential VMs. As mentioned earlier, once we properly measure a guest, the guest owner can inject secrets into the VM memory in a way that the host cannot read those secrets. OVMF and QEMU already support this functionality, with a reserved area for injected secrets and a QMP command to inject those secrets securely. But once the VM has booted into Linux, there is no easy way to access them in the guest. So we proposed an SEV secret kernel module, which reserves the memory area of the injected secrets so that the kernel doesn't use it for something else, and then exposes the secrets through a file system interface using the securityfs functionality of the kernel. The secrets appear in a directory where each file is a secret. Reading the content of the file reveals the secret, and removing the file erases the secret. This is undergoing review and discussion in the Linux confidential computing mailing list and other relevant mailing lists. Here is an example of the usage of this module. You load the module, and then you have this directory under securityfs, SEV secret, and each file in that directory is one secret, according to the GUID structure of the secrets. You can read those secrets normally as files, and you can also remove a secret, and then it will be wiped from memory so it will not appear again. So you can use a secret and then wipe it, so that further programs in userland cannot access it.
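The read-then-wipe pattern described here could look like this from a userland program. It is only a sketch: the directory is passed in as a parameter because the exact securityfs path depends on the module, the file name used below is a made-up GUID, and the demonstration runs against a temporary directory rather than the real interface.

```python
import tempfile
from pathlib import Path


def list_secrets(secret_dir):
    """Each file in the directory is one secret, named by its GUID."""
    return sorted(p.name for p in Path(secret_dir).iterdir() if p.is_file())


def use_and_wipe_secret(secret_dir, name):
    """Read a secret, then remove its file.

    On the real interface, removing the file is what asks the module to
    wipe the secret from memory; on an ordinary directory (as in the
    demonstration below) it just deletes the file.
    """
    path = Path(secret_dir) / name
    data = path.read_bytes()
    path.unlink()
    return data


# Demonstration against a temporary directory standing in for securityfs.
demo_dir = tempfile.mkdtemp()
demo_name = "00000000-0000-4000-8000-0000000000aa"  # hypothetical GUID
(Path(demo_dir) / demo_name).write_bytes(b"disk-passphrase")

before = list_secrets(demo_dir)
secret = use_and_wipe_secret(demo_dir, demo_name)
after = list_secrets(demo_dir)
```

A program that needs the secret only once, for example to unlock a disk, can follow this pattern so that later processes no longer find the secret in the directory.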
Our plan for the future is to improve the guest owner's experience. Currently, every modification in the kernel or in the kernel command line modifies the measurement, as expected, but this is also an overhead of work for the guest owner, who has to keep track of what's running, so we need a way to improve that. Another work stream for us is to adapt this solution to support newer generations. SEV-ES probably should work with the current code, but we need to improve the measurement calculation on the guest owner side to measure the initial CPU state as expected in SEV-ES, and modify the scheme for usage in SEV-SNP and Intel TDX, which have different measurement and attestation profiles. So thank you for attending this talk about securing Linux VM boot with AMD SEV measurement.