All right, let's go ahead and get started. I know it's early and people will probably trickle in, but we really only have 20 or 30 minutes anyway, so we can start. This is a weird talk for this conference, just so you know. I didn't quite expect it to be accepted, because it's much, much lower down the stack than most of the things we talk about in infrastructure. But it has a huge impact on the things we work with in open infrastructure every day. It's in the realm of "really good to know": it may be a while before it affects your day-to-day work, but it's really good to know.

I've been working in the industry since the 90s on large-scale server deployments. We didn't call it cloud and containers back then, but it was a lot of the same technology. A few years ago I took a break from working at Hewlett-Packard Enterprise, in their cloud BU, and started a PhD at the University of Cambridge. So this is a combination of large-scale server deployments, and security, all the way down to the microarchitecture level. It's a weird span.

There are really three things I want to talk about. One is looking at these particular vulnerabilities: how they started, and how they affect us in cloud and containers. I'll talk a little about the PhD research I did, because that produced some practical solutions. And I also want to point out some RISC-V work that you might be interested in. So, what was I studying? Originally, what I was studying was virtualization security.
That was my PhD topic. I started the PhD in January of 2018, when the Spectre and Meltdown vulnerabilities were first revealed. They'd been known about privately for about a year before that, but that's when they were first published. I pretty quickly realized that the kinds of virtualization security I was trying to work on were completely undermined by these vulnerabilities, in some very weird, scary, action-at-a-distance ways that you wouldn't expect.

In the past four years, quite a number of different variants have been reported, but they all boil down to two simple classes of vulnerabilities, and once you understand those, you pretty much understand the whole set. The first class is the Spectre class. What this set of vulnerabilities does is mistrain predictors in the microarchitecture, down below assembly language, at the very, very bottom level of the machine. They can either leak privileged data, or manipulate control flow, or manipulate control flow in order to leak privileged data. That's the scary thing about them.

The Meltdown class is somewhat similar, but it takes advantage of the way the microarchitecture works at a very low level: you can have an exception, like "you shouldn't have access to that memory in that page table," but the exception isn't made visible until a point in time when the machine is certain it should happen. Again, it's about these weird predictors and doing things that may or may not be real. Meltdown actually turns out to be pretty easy to mitigate, so I'm mostly going to talk about Spectre. The trick for Meltdown is that you have to do those permission checks, for what memory you should be able to access, right away. What happens otherwise is the hardware just loads the value, say, kernel memory.
You shouldn't have access to it, but the way it works underneath, it just goes ahead and loads the value and says, okay, you can't see this, so don't worry about it, it's not really here, and I'll eventually erase it when it's time to tell you it's not real. It turns out you can poke holes at it behind the scenes. So those are the two classes we're looking at. That probably makes very little sense right now, so let me dive down a little deeper.

The way most of us software developers think about the machine at the very lowest level is that we write our code, it gets translated down to machine code, and then the machine just goes chunk, chunk, chunk: it fetches an instruction, executes it, gives us the result. That's not an inaccurate model, but it is a very simple model. Here's a little more of what's really going on.

We have these things called predictors at a very, very low level of the hardware. Say you have an if statement: "if password is valid, do this stuff." The first time through, the machine really does fetch, execute, and give the result: it gets the branch instruction, checks the condition, and if the condition is true, it does the other stuff; if not, it doesn't. But then it trains a predictor that says: when I hit this branch condition, it tends to be true, it's often true. So to speed things up, just go ahead and execute as if it's true. Just go ahead and do it, and you enter what's called a transient execution state.
It's possibly true, possibly not true, but we're going to go ahead and execute the code anyway, with all the side effects of executing the code. At the end, when the machine finally manages to evaluate the condition, it either says, oh, that did turn out to be true, so keep running ahead, or, no, that didn't turn out to be true, so erase, well, in theory erase, all that stuff you did. It's not always very good at erasing, and it turns out you can actually poke holes in it even while it's in that transient state.

Here's the worst part: it's actually more complicated than that, because you don't just have a single instruction stream using this predictor. You have a whole bunch of instruction streams. When you're running a massive number of cores, and you're oversubscribing your vCPUs on your CPUs, you have a large number of instruction streams running through the same cores, using the same predictors. All of them are training the same predictors, using virtual addresses for the branch instructions.

And here's where it gets bad: if you have a malicious instruction stream, it can start feeding bad predictions into that predictor, and then the victim instruction stream starts doing what the predictor tells it, just executing those instructions. If the attacker trains it to believe the password is valid, it just starts executing that "if password is valid" path, and doing all these extra things that cause side effects. It's a little mind-blowing when you first settle into this concept: the machine is doing a lot of work that it shouldn't be doing, and this has performance benefits.
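To make that shared-predictor mistraining concrete, here's a toy functional model, purely illustrative: the two-bit saturating counter scheme, the addresses, and every name here are my own assumptions for the sketch, not real hardware internals, and no actual speculation or timing is modeled.

```python
# Toy model of a shared conditional-branch predictor: one 2-bit saturating
# counter per branch virtual address, shared by every instruction stream
# on the core. Counter values 2 and 3 mean "predict taken".

class TwoBitPredictor:
    def __init__(self):
        self.counters = {}  # branch virtual address -> counter in 0..3

    def predict(self, branch_addr):
        return self.counters.get(branch_addr, 0) >= 2

    def update(self, branch_addr, taken):
        c = self.counters.get(branch_addr, 0)
        self.counters[branch_addr] = min(3, c + 1) if taken else max(0, c - 1)

predictor = TwoBitPredictor()   # one predictor, shared across streams
BRANCH = 0x400123               # virtual address of "if password_valid" (made up)

# Attacker stream: repeatedly runs a branch at a congruent virtual address
# with its condition true, training the shared counter toward "taken".
for _ in range(4):
    predictor.update(BRANCH, taken=True)

# Victim stream: the password check is actually false, but the shared
# predictor says "taken", so the machine would transiently execute the
# protected path anyway.
password_valid = False
speculated_down_protected_path = predictor.predict(BRANCH)
print(speculated_down_protected_path)  # prints True
```

The point of the model is only that the counter is keyed by virtual address and shared: nothing distinguishes the attacker's training from the victim's own history.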
It has huge performance benefits. Now, that diagram is too much information in terms of what the machine is really doing. Getting slightly more realistic, you don't need to see all of that; this is what really matters for this particular set of vulnerabilities. A branch instruction goes through a series of stages: it's fetched from memory, a reorder buffer holds it for a while, it goes into reservation stations that queue it up for execution, and then there's that big red zone. That's the zone where you might be executing, but executing speculatively, in the transient state, where you might just throw it all away afterwards, if you're good enough to actually clean it up. You could almost call it a superposition of there and not there. It's not really reality yet; it hasn't gelled into reality yet. But that zone right there is the zone of risk for these particular vulnerabilities, and it is the only zone of risk for them. It's actually very small.

So the good news is there are some really simple techniques to mitigate these vulnerabilities. The bad news is they all have huge performance costs, because that technique of predicting and just going ahead and executing things you're not quite sure you're going to need yet is a really good way to increase instruction-level parallelism. It's a really good way to run a whole lot of code through fast. But it's that sharing of the predictors: as the predictions get better, trained across a large set of code, the code runs faster, so the machine is faster. Sharing those predictions is good for performance; sharing those predictions is bad for security. It's that hard, old security-versus-performance trade-off, but in a really brass-tacks way. So one of
the techniques is isolation, which is to say: don't share those predictions everywhere. Each VM has its own set of predictions and can't share them, or the kernel has its own predictions and can't share them. That's one really good way to isolate. It does have big performance penalties, because you're not sharing anymore. Another is to flush: you just erase predictions at points where you think you might have risk. That's not a very good solution, and it's a very expensive one, but it is one of the techniques that hardware is actually shipping today. And another technique is to disable speculation for certain contexts: to say, my VM is a confidential VM and I just don't want speculation on it at all, don't do the predictions at all. So that's another set of techniques.

Very briefly: it's actually a fairly small number of predictors that we have in the system, but they have a huge impact. The first one handles direct and indirect branches. There's a second one that feeds a little more information into direct and indirect branch prediction. There's another that handles conditional branches, and another that handles returns: it predicts where your return address is going to be, to speed things up. And another is the memory disambiguator, which predicts whether you've had extra stores to an address since the last load. It predicts whether it's safe to reuse the same value, and it can be wrong; if it predicted wrong, it goes back and reloads. There are way too many variants to go into in detail, so I just want to give you some sample ideas.
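Alongside the isolate, flush, and disable techniques just described, it's worth knowing one widely deployed software-side mitigation for the conditional-branch variants: index masking, the idea behind the Linux kernel's `array_index_nospec()`. The sketch below models only the branchless arithmetic in Python (emulating 64-bit values, and assuming index and size stay below 2^63); it does not model the speculation itself.

```python
# Branchless index clamping: even if the bounds-check branch is
# mispredicted and the load runs transiently, the index it uses has
# already been forced in-bounds by arithmetic that contains no branch
# for a predictor to mistrain.

MASK64 = (1 << 64) - 1  # emulate 64-bit two's-complement words

def index_mask(index, size):
    """All-ones if index < size, else zero (assumes both below 2**63)."""
    diff = (index - size) & MASK64   # sign bit set exactly when index < size
    return (diff >> 63) * MASK64     # smear the sign bit across the word

TABLE = [10, 20, 30, 40]

def load_nospec(index):
    # Out-of-bounds indexes are forced to slot 0 before the load,
    # so a transient access can never reach attacker-chosen memory.
    safe = index & index_mask(index, len(TABLE))
    return TABLE[safe]
```

The design choice is the same one the talk keeps returning to: rather than trusting cleanup after the fact, never create a state in which the dangerous access can happen.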
This one is called Spectre-BTB. It mistrains direct or indirect branch predictions, which means it basically convinces your machine that the address it's supposed to jump to for a branch is different from the actual address. It mistrains that branch and then starts telling it: no, jump to this other, completely different address. Which means it can tell your machine to execute completely arbitrary code that it shouldn't be executing at all. And what's worse is that it can do this across security domains. For example, an unprivileged guest, in user space in a VM, can mistrain your branches, and then, when it makes a system call down to the kernel, if it's using the same branch (you have to be a little smart about the mistraining to actually hit the right one), it can make the kernel jump to and execute code the kernel never should have been executing. But now that code has kernel privileges, because it convinced the kernel to jump there and start executing it. Machines should just not do this.
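Here's the same kind of toy functional model for the branch target buffer (BTB) side of this; all addresses and names are invented for illustration, and no real transient execution or privilege boundary exists in this sketch.

```python
# Toy model of Spectre-BTB-style mistraining: the BTB maps a branch
# virtual address to a predicted *target* address, and it is shared
# across security domains. Purely illustrative.

btb = {}  # branch virtual address -> predicted jump target

def train(branch_addr, target):
    """Executing an indirect branch records its target in the shared BTB."""
    btb[branch_addr] = target

def predict_target(branch_addr, architectural_target):
    """The core transiently jumps to the predicted target; the real target
    only takes over once the branch actually resolves."""
    return btb.get(branch_addr, architectural_target)

SYSCALL_BRANCH = 0x8000_4AD0  # kernel's indirect branch (made-up address)
LEGIT_HANDLER  = 0x8000_5000  # where the kernel really wants to jump
GADGET         = 0x4141_4141  # attacker-chosen code address

# Attacker (unprivileged guest) runs its own branch at a congruent virtual
# address, steering the shared BTB entry toward the gadget.
train(SYSCALL_BRANCH, GADGET)

# Victim (kernel) executes its indirect branch: the predicted target is the
# attacker's gadget, which would run transiently with kernel privileges
# before the misprediction is discovered.
transient_target = predict_target(SYSCALL_BRANCH, LEGIT_HANDLER)
```

Compare this with the conditional-branch model earlier: there the predictor leaks a taken/not-taken decision; here it hands the attacker a whole jump target.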
They just shouldn't. Some variants on that one: SGX-Spectre exposes trusted execution environment, secure enclave, secret data, things like provisioning keys, sealing keys, and attestation keys. These are the keys to the kingdom that make your confidential computing safe, and it leaks them all over the system, completely undermining those protections. There's another variant that bypasses certain mitigations around flushing or partitioning. And there's another variant we found: Intel launched a supposed fix for the first vulnerability, Spectre-BTB, but it was later discovered that the isolation on the branch target buffer doesn't actually hold, because there's an extra source of data feeding into it, the branch history, which still wasn't protected. You can mistrain just that other source of data, which then manages to mistrain the main predictor, and you end up with the same vulnerability. So Intel has protections shipping in production that actually don't do anything, unfortunately. That last one is public, but it's scheduled for a conference in August. So stuff is still actively happening on this, even four years later.
We're still finding new stuff. As I said on the last slide, mistraining can be in-place or out-of-place, and this is where that scary action-at-a-distance stuff starts happening. You can mistrain a branch from within the same process, or from a different process. You can mistrain from the same address space, or from a congruent virtual address in a different address space, because the predictors go by virtual addresses. You can do the training in one address space and trick the machine into taking a completely different branch instruction, in a completely different address space, the wrong way, because you mistrained the shared predictors.

So I started my PhD work with two questions. One was: how the heck did we get here? We've been doing this speculative execution stuff since the 70s. How did we get to a point where our hardware so completely undermines our software security, and nobody noticed? That doesn't make any sense. And the next piece: is there any way I'll ever trust multi-tenant computing again, cloud and containers, now that I know this? It's kind of like being the cook: you know what went into the food, and sometimes you don't want to know what went into the food. My work was design-space exploration, mainly around those disable-style mitigation techniques. They aren't the only techniques.
They're just one tool in the toolbox. I prototyped three different variations of a speculative RISC-V core that is known to be vulnerable, to explore the possibilities here, and I ran those simulations on Amazon FPGAs, which is actually a pretty cool use of FPGAs. You can compile RISC-V cores and full SoCs, synthesize them, and then run them on the cloud. It's like the way we would think of doing CI for cloud or containers, but you're doing it for hardware designs. It's pretty cool, and there's a lot of room there that we're not really making use of yet. There are also some tools: FireMarshal builds your full Linux images, with your workloads, for the Amazon FPGAs, and FireSim does orchestration around spinning up thousands of FPGAs at a time running RISC-V cores and SoCs. Some pretty cool tools out there.

My work was specifically focused on a space very familiar to you: cloud and containers, multi-tenant infrastructure where you have a host OS on a machine and a series of guest OSes, VMs or containers. It doesn't matter which for this particular domain; the security problems are the same. In this context, there are many channels of information flow that we want to allow. We want to allow guests to talk to each other, the host to talk to the guests, connections out to the internet and back in. These are all good. But for every channel of information we want to allow, there are channels we don't want to allow, and that's where secure isolation comes in. It's not so much about blocking off everything; it's about choosing which channels are allowed and which are not. So how did we get here?
How did we get to this point, where there's a fundamental flaw in the hardware that we didn't know about, and still aren't quite sure how to fix? I spent a lot of time looking through computer history; there's a big section in my dissertation on that. The really quick answer is that in the early days, we used to co-design hardware and software. That slide shows an example of a machine that was co-designed for security: the team who worked on it built the hardware, the operating system, the applications, everything. So there were people who had a good understanding of the full stack, the really full stack: not just front end and back end, but all the way down to the lowest level of the hardware.

It turns out that's not a very efficient way to design things, and over time, as systems got more complex, we went for more modular and recombinable designs: architectural stratification and standardization. We have reusable CPUs, memory, and storage that we combine into machines. We have reusable kernels, system utilities, operating systems, and applications that we combine to build our host and guest workloads. This is a wonderful thing for ease of maintenance and development, because you can focus on your one area of the stack, and you don't have to redesign everything from scratch every time. The hard side is that, over the decades, fewer and fewer people actually understand the full stack, and that causes problems both ways. On one side, software developers don't necessarily understand how the hardware works. You don't really need to think about speculation in your day-to-day work on cloud computing. It's not a thing that comes to mind.
There's no reason it should; it's just a waste of brain space. But on the other hand, the microarchitecture designers do not understand the software running on their machines, and they don't know what they're designing. This comes up over and over again when I'm talking to them: they say "we do this, we do this," and that's just not how people actually use servers. So this is how we got here: we're not talking to each other, we're not communicating across the full stack. Nobody at the low level realized what the high level was doing; nobody at the high level realized what the low level was doing. And here we are.

There was an assumption the microarchitecture designers made: that it's fine to create transient state in that zone of speculation. It's fine to store passwords, or data that's kernel memory you shouldn't have access to, as long as it's cleaned up and never visible at the architectural level, never visible at the high level, in theory. It turns out that once it's there, you can find sideways back doors to get at it. So that was not a good assumption, that it was safe to just randomly create state and hope it'll be cleaned up eventually. There was a guy in 2005 who published a paper that hinted at some of the risks of combining these sets of features, but even he didn't quite see the potential, and it wasn't until 2017-2018 that people really started to realize what this meant.

So, the question of whether I'll ever trust multi-tenant computing again. There are a few angles on trust, and you've probably heard quite a number of them this week in various talks. Where I started was: do I trust that isolation between VMs?
Do I trust that anymore, now that it's been clearly demonstrated it can be violated? But there are other aspects. You'll see "trusted computing" floating around, and that tends to be about things like attestation and cryptographically signed software. "Confidential computing" tends to be about things like encrypted memory or trusted execution environments. They're related concepts, but technically they're slightly different. The hard fact is that they're all undermined by speculation. So this really does affect our day-to-day lives.

How do I trust it now? Speculation has performance benefits; restricting it has security benefits. If you share fewer of those predictions, you improve security; if you share nothing, you get even better security. So my look at it was: is there some way we can combine speculation and no speculation, and actually give system software developers the ability to choose? Is this a confidential computing environment where I absolutely, desperately need this protection, or is this game high scores, where it really doesn't matter and it's not a big deal? If we could combine them, and give system software developers the power to choose, how would that work?
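One way such a choice might be surfaced to system software is sketched below. To be clear, every name, flag, and default here is invented for illustration; this is not an existing API, just a picture of what "the power to choose" could look like at VM-launch time if future hardware exposed per-domain speculation control.

```python
# Hypothetical sketch: per-security-domain speculation control, as a
# combined speculative/non-speculative core might expose it upward to
# system software. Nothing here is a real API.

from dataclasses import dataclass

@dataclass
class VmConfig:
    name: str
    vcpus: int
    speculation: bool = True  # default: full-speed speculative execution

def launch_vm(name, vcpus=2, confidential=False):
    # A confidential VM trades performance for immunity: on its vCPUs the
    # core would never create the predictor state these attacks mistrain.
    return VmConfig(name, vcpus, speculation=not confidential)

fleet = [
    launch_vm("game-high-scores"),                    # speed matters, data doesn't
    launch_vm("patient-records", confidential=True),  # data matters, pay the cost
]
```

The point of the sketch is where the knob lives: next to vCPUs and RAM in the launch configuration, decided by the people who know what the workload's data is worth.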
So, very quickly; talk to me if you want more details. I made one prototype that just took a speculative core and a non-speculative core and stuck them together, heterogeneous computing: if you want secure workloads, you run them on the non-speculative core. The non-speculative core has the advantage that it can't mistrain other workloads. So if you have code you're not really sure you trust, you can run it there, and it can't mistrain anything on the system. And code on the non-speculative core can't itself be mistrained. That's the protect-the-confidential side: nothing outside your host OS, no other VMs, nothing can mistrain that particular body of code.

I learned that the performance is not very interesting: if you run a workload on the non-speculative core, it runs as well as a single non-speculative core; if you run it on the speculative core, it runs as well as a single speculative core. No surprise. But it's not viable for cloud computing, because with heterogeneous cores you have to decide in advance how many speculative and how many non-speculative cores you have, and that is a really horrible resource-allocation problem to decide in advance for your servers. So it's not viable for that, though it could be viable for tablets or laptops or something like that. I made another variant; the names are all physics jokes, and a tachyon violates the laws of causality.
So it's impossible. I made one that's entirely non-speculative, which does protect against all known variants, and all unknown future variants, of speculative execution vulnerabilities. I expected the performance to be absolutely terrible. It turned out it was not all that bad: a 30 to 60 percent performance penalty, which sounds horrible until you realize that for current mitigations, the penalty for mitigating just one variant can be 30 percent in many cases, and for flushing techniques it can be as bad as a 200 percent penalty. So 30 to 60 percent for complete protection, I mean, it's not great, but it's actually not all that bad compared to the other mitigations out there. I also learned that you can improve performance by increasing parallelism in other ways. There are these low-level structures, you could almost call them data structures in the hardware, that manage the flow of instructions, and if you increase parallelism in those, without using speculation, you can increase performance and still not take on the risk of speculation.

The last one, which is probably the most interesting, combines the two on a single core, so that you can say, for example: this VM is a confidential VM, I don't want it to do any speculation at all, but the rest of the system I really don't care about. Or you could say: I never want the kernel to speculate, because I really must protect it, but you can speculate in other places. It gives you a bit more power over where you speculate and where you don't. I prototyped it as an ISA extension that added non-speculative instructions, but that's not the way I would do it in production.
In production, I would associate it with a VM or a process, some logical, existing security domain. I learned that performance is determined by how much you use non-speculative regions. If you use very little non-speculative code, performance can be as good as, or almost as good as, a straight-up speculative core. If you use a lot of non-speculative code, performance can be as bad as, or almost as bad as, a completely non-speculative core. But it gives you more control.

So this is my food for thought for you: if the hardware of the future gave you the ability to say, in your cloud panel where you're launching your VM, or on your command line, "I want this VM to be more secure," would you use such a feature? I think in some ways container users are more familiar with this concept, because you do decide what layers of security protection you want to wrap around your containers; people are used to that. With VMs, people are less used to it: they're used to configuring how many vCPUs, how much RAM, how much storage, but not to configuring how much security.

And if you think it might be interesting, this is how you get involved. We have a group in RISC-V called the Microarchitecture Side Channels security group, where we're actually designing these protection features for future hardware. And if you're interested in that whole idea, that we should have more people involved in, and talking across, the full stack, there are some other groups in RISC-V you might be interested in.
There's the Trusted Computing SIG, which is implementing trusted execution environments; there's a Reliability, Availability, Serviceability SIG; and there's a Quality of Service SIG. They're really designing the features that will make RISC-V hardware work for the cloud. Also, as I'm sure you've heard a dozen times from various talks all week long, we are hiring, and I will say virtualization and Rust are some interesting hiring needs right now. So that is it. Thank you. We might have time for one or two questions, or catch me afterwards.

Yeah. So that's one of the hard truths of this: it can leak your memory after you've decrypted it, and it can also leak your encryption keys, which then allows other VMs on the system to access that data. So it does undermine those features. I mean, it's still a good idea to encrypt your memory for your confidential VMs, that is good for confidential computing, but these particular vulnerabilities do undermine those kinds of features as well.

Really? Yeah: Intel, AMD, Arm, everybody's vulnerable to it. There are mitigations in place now for some things; they can mitigate some things and not others, it's kind of a patchwork. And some of it is features you have to turn on, and not all cloud providers do, because of the performance penalties. Whatever your cloud provider decided to turn on or not in the hardware is what you're stuck with; you don't have any choices.

Yeah, sorry, someone came in halfway through your question, but I think the question is: if you have speculative and non-speculative code running on the same machine, how do you protect against
speculative code attacking the protected, non-speculative code? The reason I call this technique ghosting: think of it like, okay, someone sent you a text and you just didn't reply. Part of the way the disabling techniques work is that they just don't create the microarchitectural state that the vulnerabilities leak. They never create it, so it can't be leaked; that's their game. They never create those predictions, they don't use those predictions, so they can't be influenced or manipulated by them. That's why it works. It does mean the speculative code is still more vulnerable, but not everything is so affected by leaking via speculation. Like game high scores: nobody really cares if someone leaked your game high score. They really care if you took an action that was supposed to be password-protected, ignored the check for whether it was password-protected, and then leaked data that was never supposed to be accessed at all. Those are the things they care about; there are other things they don't. The idea is to give people, specifically the software developers themselves, the option to choose, because they're the ones who know: is this game high scores, or is this hospital patient data? Is this something that really needs to be protected from leaking, or is it not actually important? Great. Thank you, everyone.