Hi everyone. I'm Elena, and today I'm going to present a project to which the many people listed on this slide have greatly contributed, as well as many who are not even listed. So this is really joint work between many great people. The talk is titled "Hardening Linux Guest for Confidential Cloud Computing". What we're going to cover is, first, what is the protected guest concept we're talking about? What is so special about the confidential cloud computing threat model? And why do we need to do separate security hardening for it? Then we're going to go into an overview and some details of our hardening strategy. One thing I'd like to say upfront is that this work is not finished; it is still work in progress. So some things might still change, especially with regard to our last direction, fuzzing, where we're still trying different approaches and seeing which one performs better. The goal of this talk is both to present what we've been doing and to gather suggestions on how it can be improved, and also, towards the end of the talk, to discuss with the community how we can do better together, and why we think this should be a joint community effort rather than a one-company effort that we do just for one of our projects. OK, so let's start. Why do we need to harden, and what are we trying to harden? Take a look at this picture, where we show the legacy VM stack. I'll try to be as generic as possible, but the setup we're using is a KVM-based setup, so the components listed here are the ones for the KVM virtualization stack. In principle this can be applied to any other hypervisor as well.
Traditionally, for the last decades, the trusted computing base (TCB) for a VM guest has included the host and the VMM, in this case KVM and QEMU. KVM is in full control of the VM guest it executes: it can access the guest's memory, register state and execution context, inspect any registers upon guest exit, and so on. This is the classical legacy scenario, and in this setup the VM guest is fully dependent on KVM and the host, which are included in its TCB. What has changed with the protected VM guest concept and the introduction of confidential cloud computing is that over the past years we have started to see a number of hardware CPU technologies from different CPU vendors, like AMD and Intel. The aim of these technologies is to take the VMM out of the TCB for the protected guest. We would like to run guests that are protected from attacks by the hypervisor. How this is done for each technology is out of scope for this talk; if you're interested in learning more, I encourage you to check the relevant talks. There's a lot of information on AMD SEV and a lot of information on Intel TDX, so go read or watch it. The main thing that is relevant for us is that the virtual machine monitor, KVM, is no longer able to inspect the state of the guest: the memory is protected and the guest is separated.
For example, in the Intel case, skipping a lot of details just to give one abstraction level: we have this component called the TDX module, which is a software component. It plays the role of a trusted shim between the guest and the host. Together with other technologies like memory encryption, it prevents a potentially malicious host VMM from directly inspecting guest memory or guest register state. If the guest needs to exit, the exit first goes to this trusted TDX module, and so on. We're going to take the protections that these CPU technologies offer for granted during this presentation. So one might ask: what is there to harden, if these technologies already give us a way of separating the guest TCB and no longer including the host in it? Unfortunately, what we still have are the various communication channels between the VMM and the guest. These are existing channels that have been used for a long time. For example, if a guest needs to read an MSR or perform a port I/O or MMIO access, it needs to request this from the hypervisor, from the host. If we talk concretely about Intel TDX, the guest uses a TDX-specific hypercall, TDVMCALL, to do this, and in practice the call goes through the TDX module. But the input that gets consumed, the value of the MSR being read, is filled in by the VMM and the host.
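To make this data flow concrete, here is a minimal user-space sketch in C. It is not kernel or TDX code: the callback type `host_msr_read_fn`, the function `mode_name_from_msr` and the `mode_names` table are all hypothetical names invented for illustration. The callback stands in for the host filling in the result of an emulated MSR read, and the guest-side function treats that result as untrusted before using it as a table index.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the host side of an emulated MSR read.
 * In a real TDX guest the request would travel via a TDVMCALL; here the
 * "host" is just a callback so the sketch runs in user space. */
typedef uint64_t (*host_msr_read_fn)(uint32_t msr);

#define NUM_MODES 4

/* Hypothetical lookup table indexed by a value the host reports. */
static const char *const mode_names[NUM_MODES] = { "off", "low", "mid", "high" };

/* Guest-side consumer. The value returned by read_msr() is chosen by the
 * host, so it is bounds-checked before being used as an index.
 * Returns NULL if the host-supplied value is out of range. */
const char *mode_name_from_msr(host_msr_read_fn read_msr, uint32_t msr)
{
    uint64_t raw = read_msr(msr);   /* host-controlled value */

    if (raw >= NUM_MODES)           /* reject anything outside the table */
        return NULL;

    return mode_names[raw];
}
```

With a callback that returns 2, the function returns "mid"; with a callback that returns an out-of-range value, it returns NULL instead of reading past the end of the table.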
If we now assume, as in this new confidential cloud computing threat model, that the host and the VMM are malicious actors, then we also have to assume that all these paravirt inputs can be malicious. The guest software stack that consumes them has to be hardened to withstand potential attacks through these attack vectors. In addition to the paravirt inputs, we also have what is called shared memory: memory pages shared between the protected guest and the host, where the host and the VMM have full access to modify them and do whatever they want with them. It's used for a number of purposes. In our case one big user is virtio, because we use virtio as the main communication channel from the protected guest to the outside world: networking goes through it, the console, and so on. So the guest has to be aware that it is consuming input which can be malicious, and again handle this input gracefully. One thing that is important to mention here is that the guest software stack can look very different. This is just one example: here we have some virtual firmware at the bottom, then the kernel (the Linux kernel, we're obviously talking about Linux here), and user space on top. You can have bootloaders in between, or maybe you run a very minimal virtual firmware, and so on. The work I'm presenting now is specifically about the guest Linux kernel, but hardening actually has to be applied to all components of the stack that are able to receive this malicious input from the host and the VMM. For this talk we're going to focus on the kernel, as it is one of the major pieces in the stack.
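One way consuming shared-memory input gracefully can look is sketched below in user-space C. This is an illustration under stated assumptions, not virtio or kernel code: `struct shared_desc`, `BUF_SIZE` and `copy_from_shared` are hypothetical names. The idea shown is to fetch a host-writable length field exactly once into private memory, validate that private copy, and only then use it, so the host cannot change the value between the check and the use.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUF_SIZE 64

/* Hypothetical descriptor living in a page shared with the host.
 * The host can rewrite any field at any time. */
struct shared_desc {
    volatile uint32_t len;      /* length claimed by the host */
    uint8_t data[BUF_SIZE];     /* payload written by the host */
};

/* Copy the payload into guest-private memory.
 * Returns the number of bytes copied, or -1 if the claimed length is invalid. */
int copy_from_shared(const struct shared_desc *desc, uint8_t *dst, size_t dst_size)
{
    /* Single fetch: take a private snapshot of the length. Checking
     * desc->len and then reading it again for the copy would let the
     * host change it in between. */
    uint32_t len = desc->len;

    if (len > BUF_SIZE || len > dst_size)
        return -1;

    /* The payload bytes are still untrusted content; only the bounds of
     * the copy are guaranteed here. */
    memcpy(dst, desc->data, len);
    return (int)len;
}
```

A descriptor claiming 4 bytes is copied and 4 is returned; a descriptor claiming 1000 bytes, or more bytes than the destination can hold, is rejected with -1.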
So back to the attack surfaces I've been discussing, shown here in red. One aspect is that this is a very distributed attack surface. I have numbers here for 5.11: if you take a 5.11 kernel with a standard Ubuntu config, you will have over 26,000 different code paths in the guest kernel that handle these paravirt inputs. And this does not even include shared memory; it's just the different paravirt inputs, such as reads of MSRs, port I/O and MMIO. Of course, about 90 percent of this is in drivers, and it is port I/O and MMIO rather than MSRs. But this is a very big surface, so we have to really think about how to harden it. The important part is that, with the exception of probably networking drivers, and maybe some other drivers that already went through some hardening for different use cases, most of this code was never written with the idea in mind that the inputs we read from an MSR, port I/O or MMIO can actually be malicious. Because the code was never written with that in mind, we might see actual problems there. The complexity of how these inputs are handled can also be very different, as we saw from looking into the code. It can be simple: you read an MSR, mask a couple of bits and write it back, so there is really nothing much to exploit or misuse. Or it can be very complicated logic involving things like addresses obtained from these MSR, port I/O and MMIO inputs. So the attack surface is big and you don't know where the problems are. You might guess where some problems might be, but there's no way of knowing for sure, apart from developing a methodology to harden this attack surface, and this is what we have been doing in this project. So before going into
the hardening strategy itself, let me very quickly go over the high-level requirements that we either put upon ourselves or that were simply requested of us. Of course, ideally what we'd love to do is say: let's cut all these channels and enforce trusted channels only. But we can't do it. Each of these channels provides functionality which is needed in the guest, and we can't just say that we are going to disable that functionality for the sake of security. So we have to figure out how to secure it. Of course, whatever technology or methodology we develop has to work with any custom kernel. For our project we are doing it for a particular kernel, but any CSP or any Linux kernel vendor has to be able to efficiently repeat this work for their own kernels, so we have to keep this in mind. The work is big, so we would like to automate it as much as possible and minimize the amount of code instrumentation we have to do. This is particularly important for fuzzing: you can fuzz successfully if you target very narrow pieces of code and put very fine-grained harnesses around them, but we're talking about a big amount of code that we potentially need to reach, so we can't really follow that approach. In general we'd like to use readily available open source tools, because that's the only way we can freely publish what we're doing and the tools we're using, hope for feedback, and let people take it and develop it further. A hardware-independent setup is also important, because not everyone will have access to the hardware very soon, and having a setup that people can replicate without dedicated hardware is important in our opinion. So what is
our hardening strategy? At a high level you can think of it as three activities, outlined here in this loop. They're really not step one, step two, step three activities; it's an iterative approach, where the results of one activity feed into and help improve the next one, or maybe modify the targets of the next activity, and so on. We keep going around this circle while fine-tuning all our tools and finding the best possible setup. The first one is that we try to minimize the amount of code that gets executed in the guest kernel. Without that, it wouldn't be possible to do the other two activities: it would be too much to look at all that code manually, or even to fuzz all that code. So we try to disable as much as we can; I'll show on the next slide how we try to do this. Now, for the code which we can't disable, the code we actually need for our functionality (and again, we're only concerned here with code which can take this malicious input from the host), we have to manually audit the code, and we have a methodology I'm also going to explain the kind of form