Hi. My name is Ross Lagerwall. I work for Citrix on Xen and XenServer, and I'm going to be talking about implementing Secure Boot.

There is no audio. So you need to actually keep quiet, and then you will be able to listen. The mic is only for the live streaming.

So I'm going to be talking about Secure Boot on Xen. I'll briefly go into some background about it and why it's useful. Secure Boot is basically a way of preventing malware from running at boot. The worst thing that can happen is that malware infects the bootloader or operating system kernel image, and once it's got to that point, it can basically own the system. This could happen from a rogue update, for example. There are a number of ways that it could happen, but one way of preventing it is to use Secure Boot. The firmware basically has a way of working out whether the image is trusted or not, and prevents you from booting untrusted images. This works well if you've got real hardware, but if you've got a VM in the cloud, for example running on Xen, unfortunately it doesn't support Secure Boot at the moment. So what can we do about it?

So just on some background, Secure Boot is actually part of the UEFI specification, added in version 2.3.1. UEFI is basically a replacement for the BIOS: the firmware starts the operating system and provides some services to it. So the first thing to notice is that if we want to use Secure Boot, then we need to be using UEFI guests. Luckily Xen does support UEFI guests already, so there's nothing for us to do there. It makes use of OVMF, which is a build of the TianoCore open source UEFI implementation tailored for virtual machines.

Now the way that UEFI starts a kernel or bootloader is kind of different if you're used to BIOS. With BIOS, it basically chooses a disk to boot and starts executing from the MBR at the beginning of the disk.
With UEFI, the firmware knows how to understand GPT partition tables and FAT file systems. And so it's configured to boot a particular file on a particular file system, and this is quite a lot like how an operating system would start some executable that it knows about. Now, when you turn on Secure Boot, there's an extra step: before executing that file, it verifies whether the file is trusted and can be executed.

So how does this work? Hardware has NVRAM, which is non-volatile storage separate from the main disk. And this stores UEFI variables, which are kind of like key-value pairs. Some of them contain certificates. There are a number of them and I won't go into the details of what they mean, but essentially the bootloader or kernel that you're executing needs to have been signed by one of these so-called trusted certificates. And therefore, if the kernel is replaced with something else, the firmware will refuse to start it because it hasn't been signed properly.

The certificate databases themselves can be populated in a number of ways. One of them, which is probably the most common, is that at factory install the certificates are loaded into the NVRAM and it just works, because the laptop or whatever comes preloaded with an operating system. So if you're running Windows, it'll come with the Microsoft certificates. Secondly, if you want, you can update the certificates in the database. These databases are called authenticated variables, and to update one you make a runtime call called SetVariable. But this update needs to be signed by one of the certificates that's already in the database. And so malware can't trick itself into being trusted by inserting its own certificates in the database unless it can sign the update, in which case it could sign the bootloader anyway. The third way is that typically there's a platform-specific reset method.
So on real hardware, you would, say, press F2 during boot and, with physical access, reset the certificates or clear them all. For a VM, the way I implemented it on Xen was just in the hypervisor, or specifically dom0: there'd be some button that you can press or command to run, and you can clear the database for a VM.

So if you look at how this is implemented on real hardware, the most important thing is that the SetVariable code in the firmware, which handles updating those certificate databases, needs to be protected from being interfered with. Because if malware could interfere with this code and somehow circumvent the checks, then it would be possible to just insert your own certificates; or if it could write directly to the flash, then that's a problem. So there needs to be a way of protecting the code that's running, but it runs on the same CPU that's executing the other code, which could be some sort of rogue device driver that's running as part of the operating system. So there needs to be a way of defining an extra level of privilege or execution context. And on x86 processors this already exists in the form of something called System Management Mode, or SMM for short. This is something that you can jump into and execute code from a special section of RAM called SMRAM, which is hidden from the rest of the system. So the security-sensitive part of the firmware is placed in that SMRAM, and the NVRAM is configured in a way that it can only be written from within SMM. Making a variable update then requires triggering an SMI, which traps into System Management Mode, and it's secure, at least in theory.

And so KVM has implemented Secure Boot. The approach that they took virtualizes what real hardware does.
So KVM emulates some flash memory, which is the NVRAM, and KVM emulates System Management Mode for guests, and it reuses parts of the TianoCore firmware for the SMI handling and the way that it jumps into the SMM part of the firmware. There's an interesting talk by Paolo about implementing this on KVM, so I won't get into too many details because I'll probably get them wrong.

So how should this be implemented on Xen? There are lots of vulnerabilities against SMM because it's tricky the way that it's implemented, and so there are lots of ways of attacking it, all sorts of cache attacks, and the talk I mentioned about implementing it on KVM also details some of these attacks. In addition, Xen does not have any support for emulating SMM, so implementing it could introduce more bugs, at least until they're ironed out. Thirdly, using emulated flash limits the flexibility of how variables are stored, because the code that writes the flash is part of the firmware, which runs inside the guest. This is okay for regular hardware, but for VMs we want to be more flexible, because you want to be able to import VMs, export VMs, and migrate them to different hosts. So it'd be useful if you could have something that was a bit more flexible.

With virtualization there are already two distinct privilege levels or execution contexts, broadly speaking: the guest and the hypervisor. So SMM is not really needed to create this boundary or separation. So what we propose is to run a daemon in dom0, which for Xen is essentially part of the hypervisor, that implements the variable services outside of the guest itself, and then add a new module to OVMF which implements the variable services by essentially proxying them to the daemon that's running in dom0. There are about four or five different variable services that it does this for.
So this means that the guest does not have direct access to the code, so it doesn't need to make use of the special SMRAM. It doesn't have direct access to the storage, so there isn't anything specifically needed for flash emulation. And it means that the variable storage can easily be abstracted into different backends. So you could use an SQLite database or the XAPI database or flat files, whatever you need to get the job done.

So I'll talk about how this works in a little more detail with an example. Let's suppose that the operating system wants to make an update to one of the certificate databases by adding a new certificate. It would do this by calling SetVariable, which is a runtime service. It's a bit like a system call to the firmware, or an indirect function call. And it would send the new data and a signed authentication descriptor, which needs to be signed by one of the existing certificates in the database. This goes into the proxy module in OVMF that we wrote, which we call XenVariable, and it reaches the SetVariable handler there. That has some memory which it has set aside, and it serializes the function call parameters into this memory and then makes an I/O port write to a well-known port number with the address of that memory. This causes the daemon in dom0, which we called varstored, to wake up, and it handles the I/O port write from the guest. It maps the memory from the guest, deserializes the contents, and works out which command to run. In this case it's SetVariable, so it calls the SetVariable function, and this then proceeds with the regular behavior of SetVariable and all the various authentication checks. And if it's a successful call, which in this case let's say it is, then it stores the variable in the XAPI database, which could be anywhere; it could even be on another host.
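
The request path just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual XenVariable or varstored code: the wire layout, the command and status numbers, and the names `serialize_set_variable`, `handle_request`, `InMemoryBackend` and `is_authorized` are all hypothetical. A `bytearray` stands in for the guest memory that the daemon maps, the I/O port write is replaced by a plain function call, and the authentication checks are reduced to a caller-supplied placeholder.

```python
import struct

# All constants and the wire layout below are hypothetical, for illustration only.
CMD_SET_VARIABLE = 1
EFI_SUCCESS = 0
STATUS_INVALID_PARAMETER = 2    # placeholder value, not the real EFI encoding
STATUS_SECURITY_VIOLATION = 26  # placeholder value, not the real EFI encoding

HEADER = struct.Struct("<IIII")  # command, attributes, name length, data length


def serialize_set_variable(name: str, attrs: int, data: bytes) -> bytes:
    """Guest side: flatten the call parameters into one contiguous request."""
    name_bytes = name.encode("utf-16-le")  # UEFI variable names are 16-bit strings
    return HEADER.pack(CMD_SET_VARIABLE, attrs, len(name_bytes), len(data)) + name_bytes + data


class InMemoryBackend:
    """Stand-in for a storage backend (a database, flat files, ...)."""

    def __init__(self):
        self.variables = {}

    def store(self, name, attrs, data):
        self.variables[name] = (attrs, data)


def handle_request(shared: bytearray, backend, is_authorized) -> int:
    """Daemon side: copy the request out of shared memory, then parse and dispatch."""
    request = bytes(shared)  # private copy, so the guest cannot change it mid-parse
    if len(request) < HEADER.size:
        return STATUS_INVALID_PARAMETER
    cmd, attrs, name_len, data_len = HEADER.unpack_from(request, 0)
    end = HEADER.size + name_len + data_len
    if cmd != CMD_SET_VARIABLE or name_len % 2 or end > len(request):
        return STATUS_INVALID_PARAMETER
    try:
        name = request[HEADER.size:HEADER.size + name_len].decode("utf-16-le")
    except UnicodeDecodeError:
        return STATUS_INVALID_PARAMETER
    data = request[HEADER.size + name_len:end]
    if not is_authorized(name, data):  # placeholder for the real signature checks
        return STATUS_SECURITY_VIOLATION
    backend.store(name, attrs, data)
    return EFI_SUCCESS


# Usage: the "guest" fills the shared buffer, then the "daemon" handles it.
shared = bytearray(4096)
request = serialize_set_variable("db", 0x27, b"new-certificate")
shared[:len(request)] = request
backend = InMemoryBackend()
status = handle_request(shared, backend, lambda name, data: True)
```

Keeping the parsing and the checks entirely on the daemon side is what lets the storage backend be swapped without touching anything that runs in the guest.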
Once this has happened, the response is then written back into the memory-mapped buffer. So it would be EFI_SUCCESS, just a simple status code for this. The I/O port write then completes, which causes the guest to continue executing, and the XenVariable module deserializes the response and returns it back to the operating system. So there's a clear separation between what happens in the hypervisor and what happens in the guest, which makes it quite easy to analyze from a security perspective.

So just to go over that: we wrote a daemon called varstored which implements this, and at the moment there's a single backend, which is the XAPI database used on XenServer, but it's written in a way that makes it easy to use other backends. And then there's an OVMF module called XenVariable which implements this proxy. We've got it working on Xen at the moment, so you can enable Secure Boot and test it with both Linux and Windows guests. I believe it could be used with KVM without too much difficulty, because nothing in it is really Xen specific. So this could be an alternative approach to SMM. It's also not really a platform-specific implementation: SMM is tied to the x86 platform as far as I understand, whereas this, okay, it uses an I/O port write, but the same sort of approach could be used on any platform.

So I've got a demo video; I didn't want to do it live since it seemed a bit risky. What I'm going to do here is start a VM. This is running on XenServer and it's booting up in UEFI, and at the bottom right is the console log for varstored, which is logging the runtime service requests that varstored is handling on behalf of the guest. You can see various get and set variable commands. I'm just going to check that the kernel reports that it has been booted securely. So Linux reports that Secure Boot is enabled. I think it's a bit small to see at the back.
The video is online if you want to look at it afterwards. So what I'm going to try here is modifying the bootloader that the firmware executes, in a way that would not ordinarily stop it from booting, but because the signature is different, the firmware should refuse to boot it. So I've just written "FOSDEM 2019" over the very first string in the program. The VM is then restarted, and if all goes well the firmware will stop and not boot the operating system. This is a trivial example, but it shows what would happen if malware tried to somehow patch the kernel. It's not very clear, but that's the UEFI way of showing that it didn't boot, which is not exactly the greatest user experience. But it's possible to then execute the bootloader manually, and you can see it says "Security Violation", which means that Secure Boot is doing its job.

So that just leaves: when will it be available? Unfortunately it's not yet been publicly released. We intend to release it shortly and it'll be announced on the Xen mailing list, so if you're interested please look out for it there. And that's the end of what I've got to say. Are there any questions?

The question was how does it relate to QEMU, and how does it work with QEMU in a stub domain? At least as we've implemented it, it's a daemon that's completely separate from QEMU. Xen has support for what they call IOREQ servers, and you can have them in separate programs, essentially, so it's not related at all. And if QEMU is in a stub domain, then the daemon could be anywhere else, including in a stub domain as well.

Yes, good. Yes, so the question was do I have working code for the certificate and authentication stuff? The answer is yes. The demo that I showed was all working.

The talk is not over yet, please close the door. You will have time actually to enter the room.

Sorry, can you repeat what you said? Yeah.
Yeah, that's implemented. I didn't have a demo of that because there's not really much to demo there.

You mean the implementation time? It's a few thousand lines. I don't think it's super intense to write. We also developed a fairly extensive test suite for it, because it is quite complicated. Yeah, yeah. So that is the disadvantage of this approach: it has to duplicate some of the code that's already been implemented in OVMF. But it's a few thousand lines of C code, so it's not terrible.

Well, it maps the memory but then copies the data out of the mapped memory before it uses it. So I don't know what you mean; how else would it get it if it didn't map the guest memory?

And with SMM you can use them to bypass Secure Boot. But with this, if you actually find a window and you can bypass Secure Boot, you escape to dom0. So the difference is that with SMM the bugs stay in the guest. Here the bugs go to dom0. I agree that it's probably not hard to hold it for things that are as trivial as the objective of them. Yeah. Still, it seems to me that a good design would at least eliminate the possibility of them happening.

So the question is how is the security of varstored handled, because it's running in dom0. It's an extra attack surface. At least for how we've implemented it on XenServer specifically, it runs with no privileges in a sort of container environment, and so even if you could escape into dom0 it should be contained, and you can't really escape out of that extra jail.

There's a question at the back. So the question was can you run the dom0 bit in a stub domain? The answer is yes, that would be another approach to reducing or removing any potential security risk of running it. So yeah, either within a jail or some kind of stub domain.
Is there one varstored for all VMs or is there one for every VM? It's a separate instance. Is there one varstored for every domain? The answer is yes. It's a separate instance, kind of like the way you get a separate QEMU for each domain.

So time is up. You can find me afterwards if you want to ask any other questions. Thank you.