All right, well, thank you, everyone, for coming. So my name is David Kaplan. I'm a security architect with AMD. And today, I'm excited that we're going to be revealing, for the first time publicly, some new x86 security extensions that we've been working on related to virtual machine protection. And before I start, I was asked by the lawyers to put this up, which basically says this is all preliminary information, so it may change, et cetera. So what we're going after here is a problem that is sometimes referred to as confidential computing: basically, the idea that you want to run a workload in an untrusted hosting environment. There are a number of potential applications for this. The most obvious one is public cloud, a case where you want security for your data, but you might want to run it in a remote location or somewhere where you don't have much control. What we find is that this is a pretty easy sell, in general, because the customers, especially in scenarios like public cloud, do want additional security for their data, which can often be sensitive in nature. And the cloud providers often don't really want to see what the customers are doing. And so being able to offer some hardware protection for this is what we've been developing. And this is not our first foray into this space. If you've been at this conference before, you've probably heard me talk about our Secure Encrypted Virtualization technology, which we've been shipping now for a couple of generations in our server product line, and which we first revealed, I think, maybe in 2016 or so. And this has been going through generational improvement. So we started with what we call SEV. And this is supported in upstream Linux, starting around 4.15. We also have a feature called SEV-ES, which I talked about last year, which is SEV with Encrypted State. So that adds protection around CPU registers. And we have some patches on our public GitHub for that.
And they should be getting upstreamed soon-ish. What I'm going to be talking about today is the next generation of this technology, which we call SNP, or Secure Nested Paging. And this is an additional layer of security protection that focuses on memory integrity and a number of other attack vectors that I'll go through. And we feel that this raises the security bar significantly and builds on top of the existing SEV technologies we've previously developed. So in our environment, we typically have these blocks. We have what's called the AMD Secure Processor as one of the trusted components. This is our embedded security subsystem that runs on an ARM core, and it's executing signed firmware. From a software standpoint, it looks like a PCI device with an MMIO interface, and it has a Linux device driver. And this Secure Processor exposes an API related to VM lifecycle management, including launching, migrating, and so on. The hypervisor interfaces with this API and is responsible for not only the lifecycle tasks, but also the scheduling and the allocation of the system resources. Inside of the guest, we have the guest operating system kernel, which is enlightened, and does some things like choosing what memory it wants to keep private versus what memory it wants to share with the rest of the world for things like DMA. And then we have guest applications, which are generally completely unaware of the security going on here. So we don't have to modify the applications in this technology. Rather, all of the special enlightenment is done at the operating system layer. And we've tried to make this technology relatively generic. It is tied to the CPU, to the x86 virtualization instructions. And we've tried to keep the performance overhead down as much as possible. I believe if you were to benchmark some of our current technology, you're talking about low single-digit overhead in most cases.
Now, with the SNP technology, we are taking a stronger threat model approach than we have with the previous technologies. In past years, I would talk about something called a benign but vulnerable hypervisor, which basically means that we think it's working, but there might be bugs, and so we want to have some extra protection against those bugs. In the SNP model, we're taking that a step further. And we're saying: we're not going to trust the hypervisor and related components at all. We're going to assume they can be malicious. They can be conspiring with one another. And we want to protect the guest in all of these cases. So the components that are trusted here are our hardware and our signed firmware, and I'll talk later about how that stuff can be attested. And then the operating system software running inside of your protected VM. Everything else, including BIOS, SMM, other VMs, and whatnot, is untrusted. So let me go into a bit of detail about what exactly we are and are not protecting against in each of these generations. And I'll start with the confidentiality protection. So in our previous features, we have used main memory encryption as a key technology. We have an AES-128 engine in the memory controllers, and that engine is capable of encrypting and decrypting the guest memory at very high speed. In the SEV-ES feature, we added confidentiality protection for the CPU register state. So whenever you switch out of the VM, all of the registers are similarly encrypted in a way the hypervisor can't see them. And we also have DMA protection, so we don't allow devices to get to this memory. So that's what we've had before. The big new thing that we're adding now is what we call integrity protection. And this is specifically around software-based integrity attacks: things like replay attacks, data corruption, memory aliasing, things of that sort.
The way that I try to summarize this is that integrity means that if one of these VMs reads a page of memory, one of its private pages of memory, then it must always read the value it last wrote. Now, there is a catch here, which is: if it's able to read that memory. And I'll talk later about how there may be denial-of-service cases where it is not able to. But either we want to return the correct result, or we want to return an error. And because there may be many things happening without the guest's knowledge, like memory swapping, migration, whatnot, this integrity protection has to hold across all those cases. And so we have support in the architecture for that. From an availability standpoint, we do care about certain types of availability guarantees. In particular, we want to maintain the guarantee that exists today in most virtualization systems that a malicious guest cannot cause a denial of service on the system. And so it should always be possible for a hypervisor to terminate a guest at any time without the consent of that guest. And all of our features do support that. We do not support any availability guarantees in the other direction, because a hypervisor can always choose just not to schedule a guest, and that's its prerogative. And from a physical attack standpoint, we do include protection for what we call offline DRAM attacks, so-called cold boot attacks: because the memory is encrypted, we have protection for that sort of thing. We do not have protection in these generations for what I call here active DRAM corruption, where you're physically on the board. That's a much more difficult attack that is currently beyond our scope. Besides those big ones, we have also added a number of other miscellaneous protections in the SNP generation, some of which are on by default and some of which are optional. We have added more protection against what we call TCB rollback, so basically reverting the firmware that's involved in this feature.
And I'll talk a bit later about how that works. We also have some optional controls related to protection against malicious interrupt injection, certain kinds of speculative side channel attacks that have recently been in the news, CPUID spoofing, and so on. We do not have protection in any of these features for what I like to call architectural side channels. So you still have a cache. You still have a TLB. And if your algorithm is susceptible to side channel attacks in that way, there is nothing extra that we are doing in the hardware here to protect against that. Similarly, we do not have protection in these features for things like performance counter monitoring. The rationale that we've taken here is that we're primarily trying to protect the data inside of virtual machines. We are not as worried about protecting the code. So if someone is able to figure out that you're running a database, OK. But we don't want them to be able to figure out what the data is in that database. That is the higher-priority information. So how do we do all this? Well, let's start with the integrity promise. And this is enforced through a new structure that we call the Reverse Map Table, or RMP. This is a large structure that is allocated at boot time. It is a single global structure. And basically, it contains one entry for every page of memory. And included in each entry are various bits that indicate who that page belongs to. So we can have hypervisor-owned pages, which is the default. We can have pages that are assigned to specific guests. We can also have pages that are reserved for firmware use for various lifecycle tasks. There are new x86 instructions that we've added, which I'll talk through in a bit, that are used to manipulate these entries. So the table cannot be directly manipulated by software. The primary purpose of the RMP is to indicate ownership, and therefore writability. So we do not ever want to allow software to write to pages it is not the proper owner of.
The way that this is enforced is as part of our CPU and IOMMU table walk behavior. So in native, non-virtualized mode, we will translate an address from virtual to physical, and then we use that physical address to index this RMP table. And we will read out the corresponding entry and check who the owner is. And if it does not say that it's a hypervisor page, then you get a page fault with a new bit that says that this was the cause of the error. If you are in the guest, the check is a little bit more complicated. In this case, we have two-level paging. So we translate a guest virtual address to a guest physical address and then to a system physical address. Again, we go and index the table. And in this case, the table should say not only that this page belongs to a certain guest, but also what physical address it was supposed to be mapped at in the guest address space. And so we take that value, and we compare it against what we had just encountered in the walk. And if anything there is not cool, then we generate a fault. This check is done for any writes because, of course, we have to protect against corruption. We check it for reads in some cases, specifically for guests. But we do not check the RMP for reads in general, basically for performance reasons, in part because we already have the memory encryption going on. So if you're in hypervisor mode and you attempt to read memory that belongs to a guest, you're going to see ciphertext anyway, and we're cool with that. So the RMP, by its construction, directly protects against things like data corruption and replay because of this new check. It also protects against memory aliasing, because you can't have a single page mapped to two guests at the same time. There is another aspect here, too, which has to do with memory remapping. And we address this problem through a technique we call page validation. And essentially what this means is that adding a page to the guest address space is a two-step process.
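As a rough mental model of those two checks, here is a tiny Python sketch. This is purely illustrative: the owner names, the dictionary, and the function are all invented for this example, and the real check is done by the hardware table walker, not by software.

```python
HYPERVISOR = "hypervisor"

class RmpEntry:
    """One Reverse Map Table entry, heavily simplified."""
    def __init__(self, owner=HYPERVISOR, gpa=None):
        self.owner = owner  # hypervisor (the default), a guest id, or firmware
        self.gpa = gpa      # guest-physical address the page is assigned at

def write_allowed(rmp, spa, requester, walk_gpa=None):
    """Model of the RMP check at the end of the table walk, for a write."""
    entry = rmp[spa]
    if requester == HYPERVISOR:
        # Native mode: the system-physical page must still be hypervisor-owned.
        return entry.owner == HYPERVISOR
    # Guest mode: the page must belong to this guest, and the RMP must agree
    # with the guest-physical address the nested walk just produced.
    return entry.owner == requester and entry.gpa == walk_gpa
```

In this toy model, a hypervisor write to a donated page fails (blocking corruption and replay), and no single page can ever satisfy the check for two different guests at once (blocking aliasing).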
The first thing is that the hypervisor has to donate the page to the guest, and it does this using a new instruction called RMPUPDATE, which lets it write to an entry inside the RMP. And that puts the page into a state we call Guest-Invalid. And when the page is in that state, it is not usable by the hypervisor, because it has given it away. It's also not usable by the guest, because the guest hasn't accepted it yet. The guest can then execute its own instruction called PVALIDATE, page validate, which will set a bit in the RMP indicating that the page is now in the Guest-Valid state, and it can be read and written by the guest. The key thing is that the guest is expected to only validate each page in its guest physical address space once. And this is required in order to maintain the appropriate security around the mapping between the guest physical space and the system physical space. And this is probably a little bit easier to explain with a picture here. So the idea is that, say, we have a guest physical address A, and it is initially mapped to system physical address X through the standard nested page tables. The guest boots up and says, all right, I want to use address A, so it does PVALIDATE on it. The hardware in turn sets the appropriate Validated bit, and the guest can use it. If the malicious hypervisor at some point tries to remap this page to a different location, it could go and try to create an RMP entry at a new location, let's say for page Y, and it could flip the nested page tables over there. But the catch is that when RMPUPDATE is executed, the hardware clears this Validated bit to zero. And if the guest were to try to access it, the hardware will generate a fault on the guest access. And that fault actually goes to the guest and basically says: you accessed a page that was not validated. And the guest at this point could say, well, that's very strange, because I had previously validated address A, and therefore I think that I'm under attack.
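The donate-then-validate dance and the remapping defense can be sketched as follows. RMPUPDATE and PVALIDATE are the real instruction names, but everything else here (the class, the fields, the fault type) is a toy model invented for illustration:

```python
class Page:
    """A system-physical page together with its (simplified) RMP entry."""
    def __init__(self):
        self.owner = "hypervisor"
        self.gpa = None
        self.validated = False
        self.data = 0

def rmpupdate(page, guest, gpa):
    # Hypervisor donates (or remaps) the page: Guest-Invalid state.
    # Hardware always clears the Validated bit here, which is the whole trick.
    page.owner, page.gpa, page.validated = guest, gpa, False

def pvalidate(page):
    # Guest accepts the page: Guest-Valid state, usable from now on.
    page.validated = True

def guest_read(page, guest, gpa):
    if page.owner == guest and page.gpa == gpa and page.validated:
        return page.data
    # The fault goes to the guest: "you accessed a non-validated page".
    raise RuntimeError("fault: page not validated")
```

Walking the attack through this model: the guest validates page X at address A and can read it; the hypervisor then redirects A onto a fresh page Y with RMPUPDATE, but the cleared Validated bit makes the next access fault, and since the guest knows it only ever validates A once, it can conclude it is under attack.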
And so this is what I mentioned earlier about how the guest may not always be able to read a page of memory. This is one case where that could happen. In this case, we get a fault saying that the hardware wasn't able to figure out what the correct data was, so we're just going to tell you that we failed. So I mentioned there's a Guest-Valid and a Guest-Invalid state. We actually have about eight total page states in this architecture. I'm not going to go through all of these, but they are used at various points in the guest's lifecycle. We also have certain pages that are used for metadata when you go to swap pages to disk and things like that, in order to maintain all the integrity guarantees. So pages change state in three ways, as you can see here. There are the RMPUPDATE and PVALIDATE x86 instructions that we just mentioned. And then all the rest of the green arrows here are due to various API calls to that AMD Secure Processor. Another feature that we have added into this architecture that I think is pretty interesting is something we call Virtual Machine Privilege Levels, or VMPLs. This is an optional feature, but it allows for dividing a guest address space into up to four different levels. So we have what we call VMPL0, which is the highest privilege, and VMPL3, which is the lowest privilege. And each vCPU of the guest may be associated with a VMPL, and that's a static definition for that vCPU. Each entry in the RMP, in turn, is extended with permission bits, various read/write/execute bits, for each page at each level. And so the idea is that you can have different vCPUs with different access to parts of the guest address space. And the guest can change these permission bits using a new x86 instruction we call RMPADJUST. So why would you want to do this? There are a few architectures that we have in mind, and you may think of some others as well.
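Before the use cases, the per-level permission bits just described might behave something like this hypothetical model (the actual RMPADJUST semantics are defined by the architecture; this only illustrates the idea that a level can hand out permissions downward but never upward):

```python
VMPL0, VMPL1, VMPL2, VMPL3 = range(4)

def new_page_perms():
    # Per-page permissions for each privilege level; in this toy model
    # VMPL0 starts with full access and the lower levels with none.
    return {VMPL0: {"r", "w", "x"}, VMPL1: set(), VMPL2: set(), VMPL3: set()}

def rmpadjust(perms, caller_vmpl, target_vmpl, new_perms):
    # A vCPU may only adjust permissions for its own level or a less
    # privileged (numerically greater) one, never for a more privileged one.
    if target_vmpl < caller_vmpl:
        raise PermissionError("cannot adjust a more privileged level")
    perms[target_vmpl] = set(new_perms)
```

So a VMPL0 security layer could, say, hand a kernel-code page to the rich OS at VMPL3 as read+execute but not write.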
One case here is that you may want to have a security enforcement layer that exists on top of a rich OS. So on a standard computer, you could do this using virtualization, using the hypervisor. But once you move to a cloud environment, the hypervisor is already there; you can't control that. So in this model, VMPL0 can do a lot of that same security enforcement by marking certain pages read-only or non-executable in a trusted manner on top of the rich OS. So in a sense, it's like a form of nested virtualization. There are a couple of other use cases that we see this as being interesting for. One of them is around interrupt protection. So it is likely that Linux and other operating systems make assumptions about interrupt behavior based on bare metal systems. You may assume that you don't get interrupts when interrupts are masked, or that you don't get a #UD exception on an ADD instruction, because that wouldn't make any sense. All of those behaviors, however, are possible under a malicious hypervisor, and they could take a guest outside of, let's say, its design space. So what we have defined is a couple of new modes that are designed to be used in conjunction with this VMPL architecture. And the idea is that you have the VMPL0 layer, which again is this security layer, which can communicate with the hypervisor through some sort of a para-virtualized interface using a doorbell. So instead of the hypervisor injecting interrupts directly, it would say, using some memory: I've got some events for you. And it would deliver a doorbell using a new exception vector we call #HV. In turn, VMPL0 could then go and actually inject those interrupts, when it deems appropriate, into the rich OS. So if you imagine that you have some VMPL3 vCPUs, the VMPL0 vCPU could go and set some bits in a protected area that say: inject a certain interrupt, a certain exception, and so on. So combined together, this basically turns into secure APIC emulation.
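A toy model of that doorbell flow (the class and method names are invented; the real mechanism uses the #HV exception vector and a shared memory area defined by the architecture):

```python
from collections import deque

class SnpGuest:
    """Toy #HV doorbell model: the hypervisor can only *post* events;
    the trusted VMPL0 layer decides when they are actually injected
    into the rich OS running at a lower privilege level."""
    def __init__(self):
        self.pending = deque()          # shared area written by the hypervisor
        self.interrupts_masked = False  # guest state the hypervisor can't force
        self.delivered = []

    def hv_doorbell(self, vector):
        # Hypervisor side: note the event and ring the #HV doorbell.
        self.pending.append(vector)

    def vmpl0_dispatch(self):
        # Secure APIC emulation at VMPL0: inject only when appropriate.
        while self.pending and not self.interrupts_masked:
            self.delivered.append(self.pending.popleft())
```

The point of the design is visible in the model: even a malicious hypervisor can only queue events, so the bare-metal assumption "no interrupts arrive while interrupts are masked" is preserved by the trusted VMPL0 layer.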
We've moved the APIC emulation inside of the guest, which is now inside of the trust boundary. And so if this is a security concern for a particular use case, then this can help address that. Another interesting capability of the VMPL architecture is that it enables better support for unenlightened guests. So earlier I mentioned that the guest operating system has to be enlightened in order to work with this SEV technology. But of course, there are a lot of workloads out there for which that's not practical. And with the VMPL architecture, we can basically use a higher privilege level, like VMPL0, as a shim. And whenever an event occurs at a lower privilege level, we can actually exit. The hypervisor can go and invoke the shim and say: figure out what went on here. And that shim can take care of doing all the appropriate instruction emulation and instruction cracking and anything else necessary. So this won't be as fast as if you had native enlightenment, but we think it's an interesting approach to possibly extending this type of security protection to workloads that otherwise wouldn't be able to benefit from it. And the final thing here I want to mention on the hardware capabilities is what we call trusted platform information. So traditionally, CPUID is the x86 instruction that's used to discover capabilities of the platform. And it's typically intercepted by the hypervisor, and the hypervisor may choose to provide different information from what the native system supports. In the SNP architecture, we've added a capability we call CPUID filtering that can allow us to make sure that the hypervisor is not supplying data that could cause a security problem. So in particular, there are certain CPUID leaves that specify things like floating point save areas, which directly turn into buffer allocations in software, and so those really need to be correct.
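The heart of that filtering can be reduced to almost one line. This is a toy model over boolean feature flags only; the real filtering is done by the Secure Processor and also covers value fields like those save-area sizes:

```python
def filter_cpuid(platform_features, reported_features):
    # The guest only ever sees the intersection: the hypervisor may
    # restrict features (for migration compatibility, say), but any
    # feature the platform does not truly support is stripped out.
    return set(reported_features) & set(platform_features)
```

Restricting the feature set is allowed; inventing features is not, which is exactly the asymmetry described here.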
There are also instructions and capabilities that, if CPUID says they're there, the guest will try to use, and it might get very confused if they're not actually there. So in the SNP architecture, we basically added a way for the Secure Processor to filter this. And this can either be done at boot time, in a pre-filled block of information, or at runtime as the VM discovers capabilities. And after this data gets filtered, it will be more secure. It will not allow the hypervisor to lie about the capabilities of a platform. It will allow the hypervisor to restrict the capabilities of a platform, in case it wants to for migration compatibility, but it can't go the other direction. So let me finish up here by talking a bit about lifecycle management. So to start off with, how do we launch one of these guests? Well, it's a three-step process, so it's very simple. It's basically the same as in our previous technologies. So we start with an unencrypted image. And first, the hypervisor would ask the Secure Processor to create a context and create a random encryption key. Then it provides the unencrypted image, which gets encrypted by the Secure Processor and measured. And then, at the end, the hypervisor closes the context. A new thing that we've added here is that we allow the association of what's called an ID block. An ID block is a piece of information that is signed by the owner of that VM. So that might be, say, the customer in a public cloud environment. The ID block contains things like what measurement you expect to get. It contains various information about who you are based on the public key and whatnot, policy information, and so on. And assuming that this stuff passes at the time of launch, then this ID information is bound to that guest and will be useful in attestation. Now, before we get all the way to attestation, there's another key component here, which is the TCB versioning that I alluded to earlier.
So we have things like the CPU microcode patch. We have firmware that runs on the Secure Processor. These are all mutable components. They can all be upgraded in the case of a security vulnerability. And an enhancement that we've made in this generation is something we call a Versioned Chip Endorsement Key, or VCEK, where basically we take the authority of AMD, which is fused into the chips, and we combine that, using a cryptographic method, with the specific version numbers of all these components. And we produce a Versioned Chip Endorsement Key. And this is done in such a way that it is irreversible. So a compromised component cannot lie about being a newer version of that component. So putting this all together: after a VM boots, we can then do attestation. This is a different attestation flow than what we've had before, but based on feedback, we've wanted to provide more flexibility with our attestation model. At runtime, the guest machine can ask for an attestation report directly from the Secure Processor. And they're able to talk using a set of communication keys that we provisioned at the time that the guest started. When the guest asks for this attestation report, it supplies some amount of arbitrary data, and that data will be included in the report it gets back. So a typical use case might be that the VM would generate a public/private key pair. It would publish the public key. And it would create an attestation report with a hash of the public key. And that attestation report contains all that identity information from launch. It contains TCB information and the supplied data. And it's all signed with this versioned key. And the idea is that you send this to a remote party, who is able to check everything here and determine if they trust the versions of the TCB that this thing is running at. And then it knows that, OK, this key actually is associated with this particular VM, and now I can communicate with it by encrypting stuff with this public key.
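On the irreversibility of the versioned key: one classic construction with exactly that property is a reverse hash chain over the security version number. To be clear, this is an illustrative sketch, not AMD's actual VCEK derivation; MAX_SVN and the seed are made up for the example.

```python
import hashlib

MAX_SVN = 16  # hypothetical ceiling on the security version number

def versioned_key(chip_secret: bytes, svn: int) -> bytes:
    # The key for version `svn` is the fused chip secret hashed
    # (MAX_SVN - svn) times. Hashing the key for version s once more
    # yields the key for version s - 1, so newer firmware can still
    # derive older keys, but producing a *newer* version's key from an
    # older one would require inverting SHA-256.
    key = chip_secret
    for _ in range(MAX_SVN - svn):
        key = hashlib.sha256(key).digest()
    return key
```

With a construction like this, a compromised component running at version 4 holds only the version-4 key and has no feasible way to sign an attestation report as version 5.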
And this attestation report can be regenerated whenever desired. And it could be countersigned by the cloud, or anything else. So that's kind of the startup and attestation flow. Attestation is very important because, of course, you need to supply secrets into one of these guests, because they start up unencrypted. So things like disk decryption keys and whatnot. Once the guest is running, of course, you may want to migrate it. And my colleague Tom gave a talk at KVM Forum yesterday with more details about migration, but I'll just summarize here. There are two primary aspects to migration, what we call authentication and data movement. So authentication is: can the destination machine receive my VM, is it in compliance with policy? And then, of course, you actually have to move the data. A new concept in the SNP architecture is what we call a migration agent, which is a separate virtual machine that runs under all the same protections and is responsible for the first part of this: the authentication, the policy enforcement. Previously, we've had relatively simple migration policies that kind of say migrate or don't migrate, and a couple of other things. The migration agent, being a piece of code, can enforce an arbitrarily complex policy. So that gives more flexibility. The migration agent is bound to the virtual machine at creation time, and information about it is included in that attestation report, because it is considered part of the TCB. And the migration agent itself does not migrate. So the idea is that you start up one of these things on both machines. They communicate and figure out if everything is OK. And if so, then the migration agent on the source machine will send the relevant secret information to the one on the destination. The data movement piece we can handle through the Secure Processor. Or there's also this concept called a migration helper, which we talked about at KVM Forum yesterday.
And that's some in-guest code to make migration go faster. And that's something that can work with this architecture as well. And the final thing I want to mention here is just a little bit about side channels, because this is a hot topic. AMD has added support for mitigations against some of the various Spectre side channels, especially Spectre v2, which is one of the ones that does affect AMD products. This includes things like the SPEC_CTRL MSR, which has IBRS and STIBP and bits like that. That MSR is fully virtualized in our architecture, so the guest can choose its own policies for that. But that's not always enough. So in this model where the hypervisor is not trusted, we are also concerned about the hypervisor poisoning the BTB of the guest. We don't want it to be able to launch a speculative execution attack against the guest. And so we have an optional protection for this where we can track whose predictions are in the BTB, the branch target buffer. And if we determine that they do not belong to the current guest that we're about to enter, then we will do a flush in hardware. So we'll guarantee that you never use predictions that are not yours. And this requires a little bit of care from the hypervisor side to avoid being a huge performance issue. But we do have some support for this, and this is something guests can choose to opt into. The final thing we have related to side channels is a policy bit that we've added in the architecture around SMT. So if there is a guest that decides it is not comfortable with SMT for any reason, we will enforce in hardware that it can only be run on a machine with SMT disabled. So it's a big new feature. There's a lot of information here. I hope I was able to make some sense. We do have a white paper that I'd hoped to get out today, but it's probably going to be Monday. That is going to be posted on our developer site: developer.amd.com/sev.
And that white paper goes into more information about the material that I presented here and about the new capabilities of the SNP architecture. We're pretty excited about it, in that we think it offers significantly stronger security protection as well as more flexibility for different use cases. And that's in the form of the integrity protection, the VMPL architecture, more support around attestation and memory overcommit, things like that. I did want to say also that the reason we're talking about this here is that we want to begin engaging at this point with the open source community around Linux support for some of these capabilities. And in particular, figuring out if and how it makes sense for Linux to take advantage of some of these capabilities, things like the VMPL architecture. So if that's something that anyone here is interested in or has opinions about, I'd love it if you could come find me, maybe at the break. I'm trying to gather the stakeholders and figure out who wants to be involved in the discussion. So I'd really appreciate it if you are interested in this space. I was also asked to say, by our Linux manager, on a somewhat related note, that we're also hiring. So if you're interested in that, talk to me as well. And with that, I think we have a couple of minutes for questions. Actually, I can start with one question. What can you tell us about the overhead? About what? Overhead, performance overhead. There's nothing that we're prepared to say at this point about overhead, other than that we believe performance is important in the presence of security features. And if it's too much, then people won't turn it on. And so we'll certainly do everything we can to keep the overhead minimal. So, questions? Thank you very much for the nice presentation. Could I ask you to turn back to slide number 14, I think? Yep. So actually, Mike Rapoport, I think, had a talk about isolating Linux namespaces with different page tables.
Do you think that could help implement different page tables in a hardware-assisted manner in order to isolate Linux namespaces? It's an idea. It's not a direction that we have really pursued too much, in part because in this model, we basically try not to trust the page tables. We don't have any way of protecting them, and it's very difficult to do so, given their structure. So that's why we've basically done all the protection at the very end. We said: however you want to get to a physical address, that's fine, but at the end of it, we're going to make sure that it's safe for you to use. We haven't tried to control the page tables as much. I don't know if that quite answers your question, but OK. OK, and another question. When will those CPUs be available on the market? Any idea? You're asking when this is going to be available on the market? Yeah, we're not prepared at this point to discuss that. We're just disclosing the technology so we can start working with the community. But more information will come in the future. OK, thank you. A question. So in what situations do you consider the hypervisor to be malicious? So that's not necessary for us to judge. We're trying to just offer the protection where, as a cloud provider, you can basically offer the promise that the hardware prevents us from looking at your data. So it's not necessarily that we expect the hypervisor to be malicious. We just want to have that protection, especially in the case where, say, there is a software bug or some sort of compromise, and it has now become malicious because someone has taken it over. So we're wanting to add an additional layer of protection for that. I have a question. Thanks for your talk. There is some research about the Intel architecture which allows using JTAG and other debugging technologies on the Intel platform. And I guess AMD has the same, some JTAG or something like that.
And did you work on separating these security mitigations on the AMD platform from the debugging capabilities? Yeah, so let me answer that a couple of ways. So we do have JTAG debugging capabilities on our platforms, like all chip vendors. However, all the parts that we ship for production have those disabled. And so that is not a capability that customers outside of our labs are able to utilize. So, and I'm not sure how other companies work with that, but at least for us, JTAG is not a customer-visible debug feature. Instead, we do have a debug policy bit that guests that run in this mode can choose to use. And when they opt into that, then we allow the Secure Processor to decrypt and encrypt their data, basically at the hypervisor's discretion, and provide it back. And so if you want to run GDB or something like that on one of these VMs, you can, as long as the guest has opted into allowing that debugging. Does that make sense? I think so. Other questions? How does this interact with Rowhammer and things along those lines? It seems like it would be hard to attest to the goodness of your DRAM. Yeah, so this was not designed specifically for Rowhammer, but encrypting the data in DRAM does help in some regards. The encryption does happen before things like ECC protection, so you don't lose out on your ECC by using memory encryption. And we do have other mitigations for Rowhammer in our memory controllers, which I'm not an expert in, but I know that there are some new capabilities in DDR4 that help with that. Based on some papers that I have seen that have dealt with Rowhammer attacks, memory encryption, which, by the way, is available today in our hardware, can help in many cases, because of the fact that if you do flip one bit, you have effectively flipped 128 bits. But it was not designed specifically for that. You mentioned that the host is able to read the ciphertext of the VM at any time. Is there any temporal protection to prevent fingerprinting attacks?
For example, if the host reads a memory location repeatedly, seeing whether or not that value has changed, or whether it changes back to a previous value, is there any temporal protection? We don't have any fingerprinting protection in this generation. It's certainly something that we're thinking about. The encryption algorithm we use is tweaked, so the same plaintext at different locations will appear different. But if you're looking at one location and just watching whether it changes or not, that's not something in our current scope. Any other questions? If not, let's thank David for the talk.