My name is Tom Lendacky, I'm with AMD, and I'm going to talk to you today about confidential guest services within an SVSM. I'll go over an agenda of reviewing SEV very quickly, since I don't have a lot of time in this slot, then talk about a feature of SNP called VM privilege levels and how that will allow the SVSM to operate, and give an overview and some of the benefits of an SVSM.

So, the SEV review. I won't say too much about this. The guests are cryptographically isolated from the hypervisor, as well as from other guests, through encryption. Each guest has its own key, as does the hypervisor.

SEV-ES was the next step in the evolution of SEV, and Michael Roth talked about this a little before, but in this case we protect the guest register state. We start with an initial known state that is encrypted and measured. It doesn't have to be the initial processor state. We can start in 64-bit long mode if need be, as long as it's a known state, so that the measurement can be verified each time. On each VMRUN, there's an integrity check that's performed to make sure that the hypervisor isn't trying to modify some of that state, and the VMRUN would fail if the hypervisor has tried to modify that state. World switches swap all register state. We did this by splitting the VMCB and the VMSA, that is, the control area of the control block and the save area, and we now save additional state in the save area. There are still times when you need to communicate with the hypervisor and have it do things on behalf of the guest, and to accomplish that we came up with the GHCB specification, which allows for communication via a shared page between the guest and the hypervisor. We do that for things like MSR reads and writes, CPUID, and MMIO.

With SNP, we build on the confidentiality aspects of SEV and provide integrity protection. The integrity protection helps prevent replay attacks, corruption attacks, that kind of thing.
It utilizes the reverse map table (RMP), and Michael Roth went into a whole discussion on how that all works, so take a look at his slides. The idea is that we have this table that can track information about each page: whether the guest owns the page or the hypervisor owns the page; the page size that's associated with the 4K page in question, whether it's part of a 2 meg page or a 4K page on its own; and the GPA that has been assigned to it, along with the ASID of the guest, so that you can differentiate the same GPA from multiple guests. It also tracks whether the page is a VMSA page, which can then be used to run an AP, meaning it can be used on a VMRUN instruction, and we'll be using this quite a bit in the SVSM in order to start APs. And also things like validation, right? The guest has to validate all the memory that it's going to use privately, and we will get a #VC in the guest if the hypervisor tries to change things out from underneath us, that is, when we think we should have access to a private page that we've already validated, but the hypervisor has decided to try and mess with the guest.

So a feature of SNP that we're going to use is called virtual machine privilege levels. This allows the guest to divide its address space into up to four levels. We have VMPL 0 through 3, with 0 being the most privileged, right? This allows a higher privileged VMPL to provide secure services to lower privileged VMPLs. The VMPL level is represented in the VMSA of the vCPU that is currently running. And today, in the KVM and Linux SNP patches that we've been sending upstream and working with, everything is VMPL 0. The RMP entry has additional information in there related to the VMPL levels, right? So this is where that information can differentiate the privileges associated with VMPL 0, VMPL 1, and VMPL 2. We have read, write, and execute permissions, and you can only set the permissions for a VMPL that is less privileged than the one you are currently running at.
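As a rough illustration of the permission rule just described, here is a minimal Python sketch. It is a toy model, not the real RMP entry layout or the real instruction semantics: the names `Perm`, `RmpEntry`, and `adjust` are hypothetical, and the assumption that VMPL 0 starts with full access while VMPLs 1 through 3 start with none is a simplification made for the example.

```python
from dataclasses import dataclass, field
from enum import IntFlag


class Perm(IntFlag):
    NONE = 0
    READ = 1
    WRITE = 2
    EXECUTE = 4


def _default_perms():
    # Model assumption: VMPL 0 has full access, VMPLs 1-3 start with none.
    return {
        0: Perm.READ | Perm.WRITE | Perm.EXECUTE,
        1: Perm.NONE,
        2: Perm.NONE,
        3: Perm.NONE,
    }


@dataclass
class RmpEntry:
    """Toy model of the per-page VMPL permission bits (hypothetical layout)."""
    perms: dict = field(default_factory=_default_perms)


def adjust(entry, current_vmpl, target_vmpl, new_perms):
    """Set the permissions of target_vmpl on behalf of code at current_vmpl."""
    if not (0 <= current_vmpl <= 3 and 0 <= target_vmpl <= 3):
        raise ValueError("VMPL must be in the range 0..3")
    # Only a less privileged (numerically higher) VMPL may be adjusted, so a
    # level cannot change its own permissions or those of a more privileged one.
    if target_vmpl <= current_vmpl:
        raise PermissionError("target VMPL is not less privileged than caller")
    entry.perms[target_vmpl] = new_perms


page = RmpEntry()
adjust(page, current_vmpl=0, target_vmpl=1, new_perms=Perm.READ | Perm.WRITE)
```

In this model, code at VMPL 0 can grant VMPL 1 read and write access, while a call such as `adjust(page, 1, 1, Perm.EXECUTE)` raises `PermissionError`.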
So you can't try and change your current access permissions or those of a higher VMPL level.

So that brings us to how we create an SVSM, or Secure VM Service Module. There's a specification I was hoping would be posted already on our developer site. It should be there very soon. It did go out on the SNP mailing list a while ago, and we were asked to take ownership of it at AMD. So we're just working through all that, and it'll be posted shortly. But the concept is that VMPL 0 is used for the SVSM. So we will create the SVSM BSP at that level, and then initiate the booting. VMPL 0 VMSAs are then used for all the APs that we want to start. So we only encrypt and measure the BSP. That should make it a little easier for multi-vCPU guests. You don't have to know how many vCPUs the guest is starting in order to determine your measurements for attestation and things like that. You'll have just the one BSP. The SVSM will then create a VMPL 1 VMSA that will be used to boot, say, the OVMF BIOS and then the Linux kernel. So as you can see, we'll have an SVSM AP for each OS AP. They're just running at different VMPL levels, and I'll talk a little bit about how the OS will contact the SVSM in order to perform certain operations.

Some of the uses we're looking at for an SVSM are live migration and a virtual TPM instance. There are probably others out there, and hopefully once the specification is ready, we'll talk about that and do all the communication over the Linux CoCo mailing list.

In order to do this VMPL transition, we had to create a new GHCB event. That allows the guest running at, say, VMPL 1 to prepare its request, and that request could be anything, like, say, a PVALIDATE instruction that requires input in general purpose registers. The guest then issues a VMGEXIT, resulting in a GHCB request that asks the hypervisor to now run the VMPL 0 level.
The hypervisor loads up the VMPL 0 VMSA and GHCB, the state associated with everything, and issues a VMRUN. So now the SVSM is back in control. It can look for the request from the guest, process the request, take any output that is generated as part of that request, put it back into the guest VMSA, the VMPL 1 guest VMSA, and then request the hypervisor to now run VMPL 1. So now the guest is running at VMPL 1 again, and it can check the results of the request and continue on. The specification will talk about all of the ways that this is done, as far as the protocols and functions that are associated with everything. There's a calling area page that's used to verify that a request was actually made, so that the hypervisor can't just try and run VMPL 0 all the time and trick the SVSM into doing something that's already been done, or things like that.

So, the overall boot flow of how this SVSM will start up. It's very similar to the standard SEV boot today, but instead of loading and measuring the SNP guest at VMPL 0, we're going to load and measure the SVSM binary at VMPL 0, along with its CPUID page and secrets page and other things like that. We'll then load and measure the OVMF BIOS at VMPL 1, and we also include the OVMF BSP contents and measure those. We don't create a VMSA page out of that, we just measure those contents, and it'll be up to the SVSM to copy those contents and create the BIOS BSP. So the initial boot will load the VMPL 0 VMSA, and then KVM will issue the VMRUN. At that point the SVSM will go through its startup and initialization, and it'll accept all the memory that it has available to it. It'll create all the APs that are defined for the guest. It will then locate and prepare the BIOS. Right now we're doing that through QEMU firmware config, but it'll copy everything over as the specification describes, and then create the VMSA for the BIOS and set that to VMPL 1.
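The VMPL 1 to VMPL 0 request round trip described above, including the calling-area check, can be sketched in Python as follows. This is only an illustration of the idea and not the protocol from the specification: the request code, result codes, the `call_pending` field, and the use of `rax`/`rcx` are all assumptions made for the example, and a plain function call stands in for the VMGEXIT and the hypervisor's VMRUN.

```python
from dataclasses import dataclass

REQ_VALIDATE_PAGE = 1   # hypothetical request code
RESULT_OK = 0           # hypothetical result codes
RESULT_UNSUPPORTED = 1


@dataclass
class CallingArea:
    # Set by the guest right before it asks for VMPL 0 to be run.
    call_pending: bool = False


@dataclass
class GuestVmsa:
    rax: int = 0   # request code on entry, result code on return
    rcx: int = 0   # request argument


class Svsm:
    def __init__(self, calling_area, guest_vmsa):
        self.ca = calling_area
        self.vmsa = guest_vmsa
        self.validated = set()

    def on_vmpl0_entry(self):
        """Runs whenever the hypervisor runs the VMPL 0 VMSA."""
        if not self.ca.call_pending:
            # The hypervisor ran VMPL 0 without a guest request: do nothing.
            return False
        # Consume the request so the same call cannot be replayed.
        self.ca.call_pending = False
        if self.vmsa.rax == REQ_VALIDATE_PAGE:
            self.validated.add(self.vmsa.rcx)
            self.vmsa.rax = RESULT_OK
        else:
            self.vmsa.rax = RESULT_UNSUPPORTED
        return True


def guest_call(svsm, request, arg):
    """Guest side: fill in registers, flag the call, switch to VMPL 0."""
    svsm.vmsa.rax, svsm.vmsa.rcx = request, arg
    svsm.ca.call_pending = True
    svsm.on_vmpl0_entry()   # stands in for VMGEXIT -> hypervisor -> VMRUN
    return svsm.vmsa.rax    # result the SVSM wrote back into the guest VMSA


svsm = Svsm(CallingArea(), GuestVmsa())
print(guest_call(svsm, REQ_VALIDATE_PAGE, 0x1000))  # 0
print(svsm.on_vmpl0_entry())                        # False
```

The second call models a hypervisor that runs VMPL 0 with no pending request: because the pending flag was already consumed, the SVSM does not repeat the earlier operation.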
At that point we'll then ask for VMPL 1 to be run by the hypervisor, and the hypervisor will kick that off, and now your BIOS and OS start up as you would normally see, just at VMPL 1.

So what did we need to do to get this all to work? From a VMM and hypervisor point of view, particularly QEMU, we had to load and measure the SVSM at VMPL 0 and then the BIOS, OVMF, at VMPL 1, and then boot the SVSM BSP state instead of the BIOS BSP state. We had to modify KVM to be able to support a VMSA per VMPL per vCPU. So there can now be up to four VMSAs per vCPU if you really wanted to use VMPLs 2 and 3. We're only really working with 0 and 1. There's a new GHCB request to be able to pull all the APIC IDs, so that we know how many vCPUs are available, and we'll use that to create all the APs. We can actually also use this new request in SNP in general, if it's available, and that way we don't have to measure all of the guest APs as we do today in the current SNP hypervisor patches. That will be something that we can look at doing once we get the GHCB specification updated to handle it. And then we just have to be able to switch between the GHCB and VMSA for the VMPL level that we want to run.

In the guest support, the main thing is to be able to detect that we are not running at VMPL 0. VMPL 0 is the highest privilege, and only VMPL 0 is allowed to do a PVALIDATE, and only VMPL 0 can do an RMPADJUST against a page to turn that page into a VMSA page so that it's usable on a VMRUN instruction. So as part of detecting that we're at VMPL 1 or lower, that is, higher numerically, we use that to help detect the presence of an SVSM and then use a new interface through the SVSM to actually PVALIDATE all the memory. So it costs a little bit of extra time in switching back and forth, but we try to do a lot of PVALIDATEs at once and batch them all up.
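To show why batching helps, here is a small Python sketch that groups page addresses so that each switch to the SVSM validates many pages at once. The helper name `batch_requests` and the per-call limit of four entries are hypothetical, chosen only to keep the example readable. The real limit would depend on how the request is laid out.

```python
PAGE_SIZE = 4096


def batch_requests(gpas, max_entries):
    """Split guest physical addresses into batches, one SVSM call per batch."""
    if max_entries < 1:
        raise ValueError("max_entries must be at least 1")
    return [gpas[i:i + max_entries] for i in range(0, len(gpas), max_entries)]


# Ten 4K pages with a (hypothetical) limit of four entries per call.
pages = [n * PAGE_SIZE for n in range(10)]
batches = batch_requests(pages, max_entries=4)
print(len(batches))  # 3
```

In this example, ten pages need three VMPL switches instead of ten, which is the saving the batching is after.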
We use AP create for creating all the APs in the BIOS, and so, as I said before, we're now only measuring a single vCPU but are able to run multiple vCPUs without having to measure all those extra vCPUs. But we will have to make a call to the SVSM in order to turn the page that we create into a VMSA page through the RMPADJUST instruction.

So for live migration, we need this concept of VMPL 0 and VMPL 1 in order to make sure that the hypervisor isn't trying to be malicious during our migration. The hypervisor is going to maintain a list of guest page encryption state. We also know the pages that have been encrypted and the range of pages that are available to the guest. This is more of an overview of what we're going to do, but we have to transform any encrypted pages for transport. The hypervisor would call the SVSM to let us know that we need a page transformed, and when that happens we're going to mark the page read-only in the VMPL permissions. That way, if the guest tries to change that page but the hypervisor never comes back to us, we'll be able to track the state. While the page is read-only, the guest won't really be able to make progress, because it can't update the page until the SVSM changes that permission. On the destination side, the SVSM would be invoked in order to pull the page in and put it into guest memory, and at completion we would then just transfer all the required state over from the source to the destination, terminate the source, and start execution on the destination side.

This is kind of a high-level overview picture of how this would all work. We'll have to work on how we trust the destination, so we're going to have to go through attestation from the SVSM side, where we attest the SVSM on the destination side.
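The source-side bookkeeping described above, where a page is write-protected when it is transformed and the SVSM must be involved before the guest can write to it again, can be sketched like this in Python. The class and method names are hypothetical, and the way the SVSM learns about a guest write is an assumption of the sketch, since the talk only gives an overview.

```python
class MigrationTracker:
    """Toy model of which pages have been sent and which are write-protected."""

    def __init__(self, page_numbers):
        self.all_pages = set(page_numbers)
        self.writable = set(page_numbers)  # pages VMPL 1 may currently write
        self.sent = set()                  # pages transformed for transport

    def transform_for_transport(self, page):
        """Hypervisor asked for a page: write-protect it, then hand it over."""
        if page not in self.all_pages:
            raise KeyError(page)
        self.writable.discard(page)
        self.sent.add(page)

    def guest_write_request(self, page):
        """Guest needs to write a protected page: restore access, mark stale."""
        if page not in self.all_pages:
            raise KeyError(page)
        self.writable.add(page)
        self.sent.discard(page)  # the old copy is stale and must be resent

    def pending(self):
        """Pages that still have to be moved before migration can finish."""
        return self.all_pages - self.sent


tracker = MigrationTracker(range(4))
tracker.transform_for_transport(0)
tracker.transform_for_transport(1)
tracker.guest_write_request(1)    # guest modified page 1 after it was sent
print(sorted(tracker.pending()))  # [1, 2, 3]
```

Because the guest cannot write a protected page without going through the SVSM, the SVSM always knows which transferred copies are stale, even if the hypervisor never asks again.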
At that point, once that's started, we're going to move any state that's initially needed over to the destination side, and then just start moving pages, again marking everything read-only as we move it to ensure that we get everything copied over. Now at the end, because the SVSM has to finalize all of this, if we find that not all the pages have been moved, or there are pages that we thought should have been moved and weren't, then we can terminate the live migration so that we ensure the destination isn't running in the wrong state.

Another service we're looking at providing in the SVSM is a virtual TPM. This is probably going to end up being a new protocol and set of functions within the SVSM specification, but since it would live completely within the SVSM, it would become part of the attestation of the SVSM. Because it's running at VMPL 0 and the BIOS and Linux are running at VMPL 1, we can use secure boot within the guest to verify the booting of the BIOS and the OS. There are still questions on how we're going to, say, maintain persistent state, if we wanted to do persistent state, and how we provide the initial endorsement key. There's talk, and I think there are going to be talks later about some of this, about maybe doing ephemeral TPMs, where you just generate a new EK each time and have to somehow get the appropriate storage root key if you wanted to be able to use the TPM to wrap keys in a persistent manner. But that's all stuff that we hopefully will talk about on the mailing list once we start putting this out there for adding it to the SVSM specification.

So, just kind of a summary of where we are now with the SVSM. We have a proof-of-concept Rust version up and running that has support for most of what are called the protocol 0 versions of the functions. So we're able to boot a multi-vCPU OVMF and Linux guest at VMPL 1.
The guest support requires discovery of the SVSM, page validation through the SVSM, and then vCPU creation, all again through the SVSM. The hypervisor support requires recognition that an SVSM binary is being requested, measuring that at VMPL 0, then switching to measuring the BIOS, OVMF, at VMPL 1, and booting the SVSM BSP instead of the BIOS BSP. We also have to make sure that KVM can handle multiple VMSAs per vCPU, and also switching those out when we need to run different VMPL levels. The code is all available right now on our AMD page, the AMDESE GitHub page. Linux SVSM was released a couple of weeks ago in Rust, and we have preview branches that probably need updating to match the latest SNP hypervisor patches. We're still running on some of the old SNP hypervisor patches. But the previews will allow you to boot and run an SVSM. That's all I had. Are there any questions?

Yes. The question is, why would the guest still trust the SVSM, because the hypervisor is still involved? That goes into the attestation of the SVSM. We're still going to, as part of the SEV launch process, measure the SVSM and measure the vCPU state, and then the SVSM will boot and start to boot the guest, and then the guest will have to go through attestation to verify. Now, we're going to have to make the attestation report available, so we probably need to extend the SVSM specification in that case to make that VMPL 0 attestation report available. The attestation report can only be requested from within the guest, so the hypervisor can't mess with it in that sense. So once the guest attests everything, then it can trust it.

Yes. So the question is, I mentioned the Linux SVSM, is there a vision of adding it for, say, Windows or anything else? Our group personally probably won't be doing anything, but the SVSM specification is designed to be general: how would any guest communicate with any SVSM?
And so it applies to whoever is going to write an SVSM. The SVSM that we wrote is currently geared towards KVM and Linux, but anybody could write an SVSM, and as long as it follows the specification for the calling conventions for those functions, it should be fine.

The question is, is there a calling convention or API specification between the SVSM and KVM? Now, that was specifically kept separate so that you could have multiple SVSMs and multiple hypervisor interactions. So I envision that for the KVM SVSM we're going to extend the GHCB specification to control that interaction. But the guest-to-SVSM interface is what the SVSM specification covers.

So the question is, are we putting any safeguards into the SVSM to try and guard against some of the issues or concerns that SMM has today? Probably. We have to take a look at everything, but hopefully the vTPM instance should only be grabbing input from register state and not necessarily writing to memory. This is all going to be open source, so we should be able to audit it pretty easily and make sure that we're not going to have anything where we're arbitrarily writing to guest memory.

Well, so the SVSM lives outside of any guest memory and has its own page tables. Our current implementation lives up around, I don't know, somewhere around 512 gig or something, just as a proof of concept, and has like 32 meg or 256 meg of available memory to run in.
It can be shrunk down, but the guest doesn't know about that memory, and even if it tried to somehow access it, the VMPL permissions would prevent it from doing so. And in reverse, we have page tables in the SVSM that won't map guest memory unless it needs to be mapped. So, for example, when we're dealing with VMSAs and we have to ensure that a request that came in is not trying to start, say, a VMPL 0 VMSA, we have to map that page temporarily to look at it, make sure everything looks okay, and then unmap it. But in general we don't have the guest memory mapped.

Any other questions? Thank you.