Hello, my name is James Bottomley, and today I'm going to talk to you about encrypted virtual machine images for confidential computing. This is a joint demonstration done by me, of IBM, and Brijesh Singh of AMD. We would have presented this jointly in person but, owing to pandemic conditions, we have to make video recordings for you instead, and since we're both a long way from each other, one or other of us had to do the recording. Brijesh already has another session, so I get to do this one. Before I start, I should say a little bit about both of us, because if you have any questions about this presentation after the fact, you can contact us both by the email addresses you see on the screen. Alternatively, you can theoretically contact me by my Twitter handle. I have to confess that Twitter for me is pretty much write-only, so if you want to contact me, my Twitter biography has a Matrix handle that you can use for direct messaging. I'm actually responsive on Matrix, whereas I log on to Twitter about once every other month. Before we begin, I should also apologize for the screen behind me. What I've learned in the past 18 months of pandemic meetings is that people are much more apt to concentrate on the interesting background behind you than on the actual content of what you're saying, so in order to let you concentrate on the content, I decided to screen the background from you. So with that, let's start with the presentation.
First of all, I want to get into what confidential computing is. This really is going to be a 10,000-foot overview, because there are several other presentations at KVM Forum that get into the nitty-gritty of what confidential computing is, and all I'll really be talking about is what I am defining confidential computing to be.
And for me, for the purpose of this presentation, confidential computing is all about x86 and bringing up encrypted virtual machines. Now, before those of you with other architectures leap down my throat, I have to point out that this isn't the only way of doing confidential computing. For instance, an encrypted envelope isn't necessary if you have an isolated envelope to protect the data, and there are architectures like IBM's S390 and ARM's TrustZone which do isolation instead of encryption to protect the confidential envelope. However, for the purposes of talking about encrypted virtual machine images in confidential computing, I'm concentrating exclusively on the x86 case. That means AMD SEV, the Secure Encrypted Virtualization technology, and Intel TDX, the Trust Domain Extensions technology. Effectively, for both of those technologies, it means bringing up a virtual machine inside an encryption envelope. Once you've done this, the host system, and particularly the administrator of that host physical system, is incapable of capturing the contents of that virtual machine because they're encrypted against them. Only execution happening within the virtual machine sees unencrypted content.
Both Intel and AMD implement this as encryption in main memory. What that means is that things like the first- and second-level caches are not encrypted, and there are lots of interesting attacks that you can go off and read about in academic papers that use the unencrypted cache to exfiltrate the data. So you shouldn't think of encryption as being the be-all and end-all of confidential computing. I'm fairly certain this is the beginning of a journey rather than the end of the road for all of this technology. So if you look at how we implement confidential computing in KVM, the simplest method is what we already have today.
So KVM and QEMU already have a mechanism for handling encrypted images. It's called the LUKS qcow2 format, and it's mediated by QEMU in user space. What it means is this: you have to hand off a key to QEMU/KVM, and it will use that key to encrypt and decrypt the contents of an encrypted disk before it goes into the virtual machine. The problem is that if this is an encrypted virtual machine with a confidential envelope, all of that encryption and decryption is happening outside the confidential envelope. Obviously, if that confidential envelope, the encryption of the virtual machine, is supposed to protect your secrets, it's not doing a very good job. For that reason, we unfortunately can't use the LUKS qcow2 format mediated by QEMU for doing this encryption; we have to look at something else.
So one simple solution would be to move this LUKS encryption into the image itself, so the encryption is executing within the confidential environment, which would solve most of the problems. And the good thing about such images is that they exist today. It's how most encrypted laptops work. If any of you have a laptop that you protect while you go through airports or customs zones, or that protects your employer's secrets, chances are you're using encryption technology that works exactly like this: you'll have a LUKS-encrypted partition or a LUKS-encrypted entire physical disk that you use to protect those secrets, meaning your laptop won't boot up unless you give it a password. So if we look at how this would work inside the confidential envelope, instead of being inside QEMU/KVM, the decryption itself is happening within the virtual machine. That means that all QEMU/KVM, the virtual machine monitor, is doing is passing encrypted disks through so that they can be decrypted within the virtual machine.
What this means is that the virtual machine monitor has no access either to the unencrypted data or to the key that is being used to decrypt and re-encrypt that data as the virtual machine is executing, which would therefore seem to give the complete solution for confidential computing virtual machine images. The untrusted VMM no longer sees the data or the key, which is the goal. The problem now is how to place the decryption key securely into the guest such that it's not subject to interception by the untrusted virtual machine monitor.
The first thing we have to talk about here is something called attestation. Attestation basically means measuring something by a secure hash and validating that measurement. SEV and SEV-ES, which are currently the only facilities available in the marketplace today, only do something called pre-attestation. What pre-attestation means is that you verify the pieces of the virtual machine that are placed into the image before it's started. Currently, the OVMF firmware is the only thing that is placed into the virtual machine before it's started, and therefore in pre-attestation, that is the only thing you can attest to for a booting virtual machine. We can't attest the virtual machine image this way because, remember from the previous slide, the image is not placed into the guest virtual machine; the virtual machine boots from an encrypted disk and reads it into memory as it runs. But even if we could attest to the encrypted disk image, we wouldn't want to, because it mutates as the virtual machine executes. The thing about attestation is that it's the SHA-256 sum of the contents of something, and if you want to validate that SHA-256 sum, you can't have it changing from moment to moment. You need it to be reasonably static. And the virtual machine image, by virtue of the fact that it mutates, doesn't provide the static nature necessary to allow you to attest to it.
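To make the point about static measurements concrete, here is a minimal Python sketch. It is not the PSP's actual measurement procedure; the byte strings standing in for the firmware and the disk, and the helper names `measure` and `attest`, are hypothetical. It only shows why a SHA-256 measurement works for something static and breaks for something that mutates.

```python
import hashlib


def measure(blob: bytes) -> str:
    """Return the SHA-256 digest of a blob as hex (the 'measurement')."""
    return hashlib.sha256(blob).hexdigest()


def attest(blob: bytes, expected: str) -> bool:
    """Validate a blob against a previously recorded measurement."""
    return measure(blob) == expected


# Hypothetical stand-ins: a static firmware image and a mutable disk image.
firmware = b"firmware volume placed in the guest before launch"
disk = bytearray(b"encrypted disk contents that the guest rewrites")

# The guest owner records both measurements ahead of time.
expected_firmware = measure(firmware)
expected_disk = measure(bytes(disk))

# The guest runs and writes to its disk; the firmware is untouched.
disk[0] ^= 0xFF

firmware_ok = attest(firmware, expected_firmware)  # still matches
disk_ok = attest(bytes(disk), expected_disk)       # no longer matches
```

After a single byte of the disk changes, the recorded digest is useless, which is why only the static firmware is a sensible thing to measure.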
This means that for an encrypted virtual machine, what you really want is for the encryption envelope itself to protect the mutable bits of the virtual machine, and you want to attest to something that doesn't mutate. With pre-attestation, we can use QAPI, a QEMU API, to retrieve the measurement of the pflash pages placed into the guest. The measurement has to be delivered by a trusted element. On AMD, this trusted element is called the Platform Security Processor, and on Intel it's called the quoting enclave. All current technology, by the way, requires UEFI boot with OVMF. There are no encrypted virtual machine technologies today that will work with SeaBIOS, so everything from this point forwards is about booting with OVMF.
Current encrypted UEFI boot on a physical system would go via an unencrypted EFI partition, which is where you would store GRUB. GRUB would then receive a password that you type in, which it uses to decrypt the encrypted partition. So this dual unencrypted and encrypted piece is what you use to bring the system up. The reason you can trust GRUB on a physical system is usually because it's verified by secure boot, or at least I hope it is. But this technology is not available to us in a virtual machine system, so we have to think of some other way of protecting GRUB. Obviously, for key release to work securely, every unencrypted piece must also be attested, and that has to include GRUB. So we have to find a way of attesting GRUB. However, only elements in pflash zero, which is where OVMF resides, can be attested. So the simplest solution to all of this is to pull GRUB inside of OVMF, which is what we do today by introducing a new AMD SEV OVMF package which includes GRUB. The patches that do this are already upstream in OVMF today, so if you look at the EDK2 GitHub repository, you'll see this AMD SEV package.
Once we have attestation, we can then get on to secret injection. The AMD PSP can inject a page of data into a confidential virtual machine before it launches. The guest virtual machine has to find it and use it. This causes another huge problem, because you can inject it at any guest physical address (GPA) designated by the guest virtual machine. So how do you find it again? Particularly since, if you look at main memory, the pflash image that you inserted into the virtual machine and attested is now going to be uncompressed all over main memory, and you have to ensure that it doesn't trample all over the secret you're going to inject. So the first problem is to find a GPA at which you can inject the secret. The way we do this is that in OVMF we set aside a region of the MEMFD, just one page, for storing that secret. If we use that page, it is then known to OVMF. The big problem is how we make it known to QEMU. The way we solve this is to declare this page in OVMF inside a labeled part of the reset vector by what's called a GUID, which is really just a 128-bit random number. QEMU can now scan the pflash image to find this GUID at the location where we've placed it in the reset vector. Once it has this GUID, it can extract the parameters for the GPA to inject the secret at. This means that the GPA value is known to both OVMF and QEMU.
All tables in the solution are GUID-described, so the secret table itself contains a GUID description. All that basically means is a random GUID, a 128-bit number, then a length, and then something that's described by that GUID within that length. All tables for OVMF are constructed like this. It's actually very simple. In order to be sure you're handling the correct table, all you have to do is find the GUID, and because 128 bits is rather large, randomly choosing a GUID is usually good enough. The next problem is to thread the secret handling through OVMF and up into GRUB.
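The GUID-described layout discussed above (a GUID, a length, then the data that GUID describes) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the exact OVMF or QEMU table format: the little-endian 32-bit length field, the convention that the length covers the header as well as the payload, the GUID values, and the helper names `pack_entry` and `find_entry` are all hypothetical.

```python
import struct
import uuid
from typing import Optional

# Assumed header layout: 16-byte GUID followed by a 32-bit little-endian
# length that counts the header plus the payload.
HEADER = struct.Struct("<16sI")


def pack_entry(guid: uuid.UUID, payload: bytes) -> bytes:
    """Build one GUID-described entry: GUID, total length, payload."""
    return HEADER.pack(guid.bytes_le, HEADER.size + len(payload)) + payload


def find_entry(table: bytes, guid: uuid.UUID) -> Optional[bytes]:
    """Walk the entries in a table and return the payload for a GUID."""
    offset = 0
    while offset + HEADER.size <= len(table):
        raw_guid, length = HEADER.unpack_from(table, offset)
        if length < HEADER.size or offset + length > len(table):
            return None  # malformed entry: stop rather than guess
        if raw_guid == guid.bytes_le:
            return table[offset + HEADER.size:offset + length]
        offset += length  # skip entries whose GUID we do not recognise
    return None


# Hypothetical GUIDs chosen only for this example.
DISK_KEY_GUID = uuid.UUID("11111111-2222-3333-4444-555555555555")
OTHER_GUID = uuid.UUID("aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee")

table = pack_entry(OTHER_GUID, b"unrelated") + pack_entry(DISK_KEY_GUID, b"passphrase")
disk_key = find_entry(table, DISK_KEY_GUID)
```

Because each entry carries its own length, a consumer can skip entries it does not understand and still find the one it is looking for.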
That way, GRUB itself can use the secret you injected into the virtual machine, picked up by OVMF, to perform the decryption. Since the secret is installed at a known MEMFD location in OVMF, OVMF will cover it with what's called a boot-time HOB to protect it. HOB stands for hand-off block. A boot-time HOB really means that this block will remain protected in OVMF until you call ExitBootServices, which isn't done until you get to the kernel's EFI boot stub. So what we can do is add a UEFI configuration table entry to point to the secret area once OVMF pulls it out. This configuration table is handed up to GRUB. GRUB itself does not call ExitBootServices, the kernel does, so the secret is still available to GRUB at the time it's handed off. GRUB can now retrieve this configuration table and obtain the secret. Once GRUB has the secret, it can use it to decrypt the encrypted volume and boot the virtual machine.
To get all of this to work, OVMF has to be stripped down to its essentials and must hard fail if boot fails. That's so it can't be tricked into revealing the secret, because remember, the executing virtual machine is notionally under the control of the cloud service provider, the untrusted element. Even if they can't see inside the confidential computing envelope, they can alter various configuration parameters, and if they can trick the boot into failing and giving up the secret, everything is lost, because this OVMF and GRUB combination is the only thing protecting the injected secret.
So at this point, I would have loved to give you a demo. Unfortunately, when I made this video, all of the confidential computing machines, the AMD machines we have in the IBM Yorktown labs, were shut down for maintenance. So instead of giving you a real-life demo, what I have to do, because that shutdown window occurred when I was making this video, is tell you what would have happened.
But hopefully, after the fact, I'll be able to run through this demo again when the machines are up, record it, and post a copy somewhere on the web so you can see proof that we got it working with a real-life virtual machine.
So if we look at the demo, what you can see above me is the guest owner proxy, and then to my right, sorry, to my left, is the encrypted virtual machine. This consists of the physical system, which may or may not be trusted, but which contains a trusted element called the PSP; you can see it marked in green. On top of this physical system, the cloud service provider will bring up an untrusted virtual machine monitor, QEMU/KVM, marked in red, and on top of that you hope that it will bring up an encrypted virtual machine, which you will verify by attestation. The first thing that happens is that the guest owner proxy receives a Diffie-Hellman certificate from the PSP. This Diffie-Hellman certificate can be verified with AMD to prove that you're talking to a genuine PSP. Once you have this certificate, you can use the Diffie-Hellman key embedded in it to formulate a shared secret which will only be known to the guest owner proxy, that is you, and the PSP itself. Once you have this guest owner secret, you transfer it back to the untrusted virtual machine monitor as part of a launch bundle. This launch bundle contains a whole load of encrypted things, including a transfer key and an integrity key, and it's given to the untrusted element, but because it is all encrypted, the untrusted element can't tamper with it. So they will either launch the virtual machine with this, in which case you'll get attestation and proof, or they won't, in which case you won't.
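The key agreement and tamper evidence described above can be illustrated with a toy Python sketch. It is emphatically not the PSP's real protocol: it uses textbook finite-field Diffie-Hellman with a deliberately small prime, derives one integrity key with HMAC-SHA256, and omits encryption of the bundle entirely (the Python standard library has no block cipher). The parameters `P` and `G` and the helpers `keypair`, `shared_key`, `seal` and `unseal` are all hypothetical names for this illustration.

```python
import hashlib
import hmac
import secrets

# Toy parameters for illustration only: a 127-bit Mersenne prime is far too
# small for real use, and the real exchange is not reproduced here.
P = 2**127 - 1
G = 3


def keypair():
    """Generate a (private, public) Diffie-Hellman pair."""
    private = secrets.randbelow(P - 3) + 2  # uniform in [2, P - 2]
    return private, pow(G, private, P)


def shared_key(private: int, peer_public: int, label: bytes) -> bytes:
    """Derive a 32-byte key from the shared Diffie-Hellman secret."""
    secret = pow(peer_public, private, P)
    return hmac.new(secret.to_bytes(16, "big"), label, hashlib.sha256).digest()


def seal(key: bytes, bundle: bytes) -> bytes:
    """Append an HMAC-SHA256 tag so that tampering can be detected."""
    return bundle + hmac.new(key, bundle, hashlib.sha256).digest()


def unseal(key: bytes, sealed: bytes) -> bytes:
    """Check the tag and return the bundle, or raise if it was altered."""
    bundle, tag = sealed[:-32], sealed[-32:]
    expected = hmac.new(key, bundle, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("bundle was tampered with")
    return bundle


# The guest owner and the trusted element each hold a key pair; only the
# public halves cross the untrusted virtual machine monitor.
owner_private, owner_public = keypair()
psp_private, psp_public = keypair()

owner_key = shared_key(owner_private, psp_public, b"integrity")
psp_key = shared_key(psp_private, owner_public, b"integrity")

sealed = seal(owner_key, b"launch bundle contents")
recovered = unseal(psp_key, sealed)
```

The untrusted party in the middle sees both public values and the sealed bundle, but without either private value it cannot compute the key, so any modification it makes is rejected by `unseal`.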
So the next thing that happens to the encrypted virtual machine is that the untrusted element places this AMD SEV package, the OVMF and GRUB combined, inside it. Obviously, at this point it is untrusted, because it was put in by the untrusted element. But the next thing that happens is that the PSP performs a measurement of these two components and hands you, over a trusted channel, the measurement itself and a random nonce, which you then use to formulate your secret bundle. As long as you agree with the measurement, you move on to wrapping the key inside an envelope which is encrypted with the encryption key but is also integrity protected with the nonce and the integrity protection key that you encrypted in the launch bundle. Once this is handed to the PSP, the PSP will verify the integrity. If it doesn't match, if the nonce doesn't match, it will reject that key, but if everything matches, it will perform the decryption and place the now-unencrypted key into OVMF at the location you both agreed on using the scanning technique we discussed earlier. Once this happens, OVMF hands that key up to GRUB using the configuration table, and GRUB uses that key to decrypt the encrypted volume. This means that all of the volume decryption and encryption is happening inside the encrypted envelope. So effectively we have a successful launch of an encrypted virtual machine such that the untrusted element, the red piece if you look at it, wasn't able to obtain any part of either the bits comprising that encrypted virtual machine or the encryption key.
So what we need to do now is to analyze the security of this envelope, and the first thing you can see is that the security rests on an attested piece and an encrypted piece.
The encrypted piece isn't attested because it's mutable, so only the first piece that the cloud service provider places into the machine is attested. This means that the encrypted piece is vulnerable to substitution attacks. That's where you take ciphertext from one incarnation of an execution and place it into another incarnation of an execution. You may be able to trick the virtual machine, as it's executing, into putting known plaintext into a given place and then substitute the resulting ciphertext elsewhere; this is a cipher substitution attack. These are very dangerous. We could fix this in the image by adding integrity, so we would use dm-crypt plus dm-integrity to do it. The reason I haven't done it in the solution today is that the technology itself doesn't have memory integrity. It will in a future incarnation, which is SEV-SNP and obviously Intel TDX, but they're not available in the marketplace today, which is why the current solution is based on dm-crypt only, not dm-crypt plus dm-integrity.
And with that, we move on to the conclusions. This works with current technology available today, and it's mostly upstream: the handling for this is upstream in OVMF and upstream in QEMU, but it's not quite upstream in GRUB, because GRUB was having a feature freeze over the last year when this technology was developed. I'm hoping to get it upstream into GRUB very soon. It relies on pre-attestation only, which means it's compatible with the current SEV and SEV-ES. As future work, we will think about how we apply this to post-attestation. Other future work includes getting the GRUB piece of this upstream, and extending the secret handling to other secret elements which can be passed on to, say, the kernel, for things like Kata Containers and for other uses that we haven't yet thought of that would require secrets to be injected into the operating system itself.
With that, if you like this presentation, I have to tell you that it was not done in Prezi; it was done in impress.js by Bartek Szopka. That unfortunately does make me a web developer, which in kernel terms puts me even below Rust developers. And with that, I will say thank you and call for questions.