Well, maybe I'll just do this without slides, because my slides actually are not that important. They are online, though, so you can follow along with them, and they're somewhat content-free because I really took it seriously that when I presented this topic it would be a discussion. I saw that there was this nice, shorter format where you could get input from everybody. So while I do have some ideas I want to present to you, I'm really hoping to get your opinion on some of the things I'm going to talk about.

The first thing I want to do is introduce myself. Some of you know me, but a lot of you don't. My name is Kristen Accardi and I work for Intel's Open Source Technology Center. I have worked in the kernel for almost 20 years now, all over the place: I started out in networking and then moved through various other subsystems over the years. What's notably lacking from my previous kernel experience is security, so this is my first try at it. For the previous five years I've been working on power management. The thing that's really exciting about working on security is that it shares a lot of similarities with power management: it's a system-level problem that takes user space and kernel cooperation to solve. We also had a lot of the same issues with tradeoffs; most of the time people would decide they didn't want your feature because it impacted performance, or you'd put it in and nobody would turn it on. So I'm very familiar with a lot of the issues here. The other exciting thing is that now that I've joined the security team, I have lots of people who are really excited to help me review my slides, so when you're going through them, you can see that they've had a lot of attention put into them. When I started on the security team in February, it was not entirely clear what I was going to work on.
One thing that was clear, though, was that the world was going to be changing a little bit for the next few years. We had recently had the Spectre and Meltdown attacks made public, and the theory is that these attacks, being exciting and new, are going to be something we'll be hearing a lot about for at least the next few years. I think CPU microarchitectural side channel attacks are something researchers are going to continue to work on, so we as a kernel community need to anticipate that they're going to keep coming. What can we do to get ourselves out of this cycle where we're constantly trying to fight the latest thing? How can we move to a place where side channel attacks are not quite as destructive as they might be? That's where I come in. As people went through the Spectre and Meltdown mitigation work last year, which started over the summer, they came up with all kinds of ideas for things they wished had been done better, but they didn't have time to work on them. When you're in the middle of working on a mitigation for a specific exploit, you don't have time to go back and do the long-term stuff. So this is really my mission: to find out how we can be more proactive about preventing future side channel attacks, and harden the kernel against other potential exploits. It's really important to note, not just because legal wants me to say this but also because it's true, that we're not trying to address specific CVEs or security gaps. This is important because we don't expect anything we're working on to be a 100% solution to any sort of problem. What we're really trying to do is create speed bumps, things that will slow attackers down. It's not going to be perfect security.
As a result, there is a lot of overlap with the work that Kees is doing on KSPP, because a lot of the things we came up with that would make these side channel attacks harder are just general kernel hardening things. So what I'm going to do today is walk through some of the project ideas that came out of the Spectre work last year and that are now at the top of my team's priority list. I should let you know that we're a very small team: including myself, there's two and a half people, if you count Kees as half a person. So we don't have a lot of people on our team. If any of these projects sound super-duper exciting and you think, hey, I wish I could work on that, then come meet me and maybe we can connect. And as I'm going through this list, if any of these projects seem totally ridiculous, I would really like to know about it, especially some of the longer-term ones, because I certainly don't want to spend a year of my life on some of these things only to get a resounding no way in hell.

All right. The first project I want to talk to you about is kernel address randomization. Right now, we have KASLR implemented in the kernel. It's a little bit weak as it is, and some of the side-channel exploits we had would take advantage of being able to easily get a kernel address. The other problem with the existing implementation is that once you find one address, you've found everything. So one idea that's been thrown out, and I guess this is the scariest thing on my list from my point of view, is that maybe every time you boot the kernel, you should rearrange it. Basically, this would relink the kernel on every boot, and the way we would do that is by incorporating a linker inside the kernel. This isn't as weird as it sounds, in that we sort of already have a linker in the kernel in the module loader.
It's a pretty brain-dead linker, and it doesn't do what we would need it to do to achieve this, but you could use it as a foundation for the work. The idea would be that we break the kernel up into, effectively, modules. I mean, modules are really just relocatable object files with a little bit of gunk thrown in to make them easier to load as modules, so you could modify the module loader to be able to take plain .o files. We may be able to leverage more random vmalloc allocation algorithms that we could write, and we might be able to take all of these .o files and put them into a new section in the binary. Right now, when the kernel boots up, it takes the compressed binary, uncompresses it, and then does its loading. We would insert into that process a step where we take the kernel binary, read the special section we've added, take all of our .o files, rearrange them according to a random algorithm, and then load them into different places in memory. That's the idea as it stands now.

It has some benefits, obviously, otherwise we wouldn't be considering it. It really does increase the difficulty of side channel attacks, in that now when you find one address, that's all you really get; you're not going to be able to derive a second address from it. It also strengthens what we've already got as far as KASLR. The challenges are that even fine-grained KASLR can be worked around, and a single info leak might still be sufficient to find the rest of the kernel. This certainly is going to increase the complexity of the kernel. It's going to make it harder to reproduce bugs, because now you might have issues where your kernel is arranged in a certain fashion and you only have a bug that happens when it's laid out in that particular way.
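To make the rearranging step concrete, here is a very rough user-space sketch. Everything in it, the `ko_section` struct, the `randomize_layout` function, and the xorshift generator standing in for a real boot-time entropy source, is invented for illustration; the real implementation would run during kernel decompression and perform actual relocations.

```c
#include <stdint.h>

/* Hypothetical descriptor for one relocatable chunk of the kernel. */
struct ko_section {
    const char *name;
    uint64_t size;      /* bytes, assumed already aligned for simplicity */
    uint64_t load_addr; /* chosen at "boot" */
};

/* Tiny xorshift PRNG standing in for the boot-time entropy source. */
static uint64_t prng_state;
static uint64_t prng_next(void)
{
    prng_state ^= prng_state << 13;
    prng_state ^= prng_state >> 7;
    prng_state ^= prng_state << 17;
    return prng_state;
}

/* Fisher-Yates shuffle of the section order, then lay the sections
 * out back to back at their new, randomized positions. */
static void randomize_layout(struct ko_section *s, int n, uint64_t base,
                             uint64_t seed)
{
    prng_state = seed ? seed : 1;
    for (int i = n - 1; i > 0; i--) {
        int j = (int)(prng_next() % (uint64_t)(i + 1));
        struct ko_section tmp = s[i];
        s[i] = s[j];
        s[j] = tmp;
    }
    uint64_t addr = base;
    for (int i = 0; i < n; i++) {
        s[i].load_addr = addr;
        addr += s[i].size;
    }
}
```

Making the seed explicit like this also hints at one way to handle the reproducibility problem: a recorded seed could regenerate a given boot's layout for debugging.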
We would probably have to make some allowances, maybe have a seed that we could export or whatever, to try to reproduce images. But the big thing is that customers of distros are all going to be running different kernel images. Instead of everybody running the same Ubuntu image or Fedora image, they've got the same code, but it's all rearranged. So it's going to make it difficult for distros to assume they know exactly what the customer's got. This project is still in the research phase, so I'm interested if anybody has any opinions. There's an opinion over there. Do we have a microphone?

Have you all thought about how this would impact live patching? Because there are a number of environments that really rely on that for rolling out emergency bug fixes, especially when they're security related. I have wondered about live patching; it just seems to me this might make life much more difficult.

Yeah, I'm pretty sure it would. I wasn't actually sure how prevalent it was to actually use live patching, because when I was researching it, it seemed that many people considered it still fairly out there as far as what people do.

I am not an expert, and I do not speak for Red Hat on this, but I think when it happens, it's really important. It doesn't happen very often, but when it does, you really, really want to make it happen. So putting huge roadblocks in its way without sorting that out, and I know we're not the only vendor to do it, is likely to be an issue.

Okay, so don't break live patching.

Yeah, I think it's a bit of a chicken-and-egg problem. People don't adopt live patching because there are things that break it really badly.

There's one right over there.
Because we're randomizing things at boot time, I wanted to ask about entropy, because entropy can be a problem at boot time, especially, from what I remember, on virtual machines and things like that.

So, I mean, you currently have issues with entropy already. There are already requirements that you have a certain amount of entropy when you boot a VM, for example. And I think that, at the moment, we would have to rely on the existing entropy sources for what we're trying to do. That's a good question, though. I should be taking notes, since no one can see what I'm doing.

More of a meta comment, but OpenBSD implemented something along these lines a number of years ago. They called it Kernel Address Randomized Link, or KARL.

KARL, yes. I know all about KARL. The main difference between what we're doing and KARL is that KARL actually recompiles the entire kernel at boot time. It boots into a little mini user space, recompiles the kernel, and then boots into that. I actually considered this while we were looking at the architecture, and it seemed to me that this was more complex. I know that sounds weird, because I'm talking about adding a linker into the kernel, but it did seem more complex than trying to do it without the user space portion. The other thing was adoption: I felt it would be really hard to coordinate the image part, the mini user space and everything, with distros, and that just seemed hard.

Does it impact the signature validation measurements? Since you're relocating the...

For secure boot? I don't know. I guess it would depend on what we wound up actually implementing and where you put those measurements. I'm going to write that down, though: what about secure boot?
I wonder if, in your research, you found any precedent for this in any microkernel approaches, where you might have a much smaller footprint. You may still have the same problem at the core, but at least you've pushed everything out of the core itself, and then your recompile is a much smaller surface area. Did you come across anything similar with microkernels?

No, I can't say that I looked at any microkernel implementations. I did consider the difference between what we're doing and just stripping the kernel down to a really small core and building everything else as a module. What I'm proposing is very similar to that idea, with the difference being that you don't have all the dependencies between modules, because you're really still making a static monolithic binary at the end of the day once you load it into memory. So you don't have quite the same environment, and also no dependency on the file system. Okay, so no one was throwing tomatoes too badly, I guess. I guess you're waiting for the first patch. Well, that'll take a while.

Okay, next project. This one's easier to talk about because it's already in progress, and it was also conceptually less scary. The idea here is to apply randomization not just to the kernel text section but also to the module text section. These patches have already been posted on the mailing list; some of you might have seen them, but I'll run through the design. Currently, the module text range is a gigabyte-ish, and apparently there have been proposals for doing this in the past, where people were concerned about fragmentation of the memory area. We try to solve this by splitting the existing module text range into two sections: the first section is where randomized text goes, and the second section is linearly allocated.
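A toy user-space model of that two-section split might look like the following. The region sizes, the `MAX_TRIES` retry count, and the function names are all invented for illustration and are not taken from the posted patches; the real allocator works on vmalloc space, not a flat occupancy map.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define RAND_REGION_PAGES   1024  /* toy-sized randomized section */
#define LINEAR_REGION_PAGES 1024  /* linear fallback section */
#define MAX_TRIES 100

static bool rand_used[RAND_REGION_PAGES]; /* page-granular occupancy map */
static uint64_t linear_next;              /* bump pointer for the fallback */

static bool range_free(uint64_t page, uint64_t npages)
{
    if (page + npages > RAND_REGION_PAGES)
        return false;
    for (uint64_t i = 0; i < npages; i++)
        if (rand_used[page + i])
            return false;
    return true;
}

/* Returns a page index; randomized-section results are < RAND_REGION_PAGES,
 * linear fallback results are offset by RAND_REGION_PAGES. ~0 on failure. */
static uint64_t module_alloc_pages(uint64_t npages)
{
    /* Try random positions in the first section. */
    for (int t = 0; t < MAX_TRIES; t++) {
        uint64_t page = (uint64_t)rand() % RAND_REGION_PAGES;
        if (range_free(page, npages)) {
            for (uint64_t i = 0; i < npages; i++)
                rand_used[page + i] = true;
            return page;
        }
    }
    /* Fall back to old-style linear allocation in the second section. */
    if (linear_next + npages > LINEAR_REGION_PAGES)
        return ~0ULL;
    uint64_t page = RAND_REGION_PAGES + linear_next;
    linear_next += npages;
    return page;
}
```

The point of the structure is visible even in the toy: a leak of one module's position tells you nothing about its neighbors in the randomized section, and the linear section only ever fills up after the randomized one gets crowded.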
If you fail to find a sufficiently large space inside the randomized section, you can fall back to the old way of doing it in the linear section. In practice, we found that it takes loading a large number of modules before you ever touch the linear section. I don't have the numbers with me on how many it takes, but Rick has done a lot of testing after his first round of feedback, and it's looking pretty good. A side effect is that this new algorithm is actually a lot faster than the linear allocator, because now you're basically bisecting the space and randomly fitting, so your module load time actually improves. The other benefit is increased randomization: we now get about 17 bits of entropy, and you can now leak an address in one module without automatically knowing where all the other modules are.

This one has the same sort of challenges as the other project, in that it still might not be good enough; one address leak could still be sufficient. We have slightly increased memory usage, due to the increase in page table entries we have to have. And again, we wind up with increased complexity, just because now you don't know where modules might be. But at the same time, this is, to me anyway, somewhat of a no-brainer thing to do. If you do have severe issues with this one, you can still comment on the mailing list, since we're not merged yet, or tell me right now. But I feel like this one's somewhat non-controversial. Just correct me if I'm wrong. Nope, okay. But are you guys realizing that lunchtime is in five minutes? Yeah. Okay. Here's another idea. This is a POC that we have; it's under development. So we're pretty excited about this one, to the point where we've actually got things that we're working on.
The idea is that we need to start protecting pages that have secrets in them. Our thought was that we should allow user space to tell the kernel that a page contains a secret that needs special protection. We think maybe you could add a new flag to the mlock2 system call, and this flag would be used to apply mitigations to the memory area, and maybe also to the process that's mapping it. These mitigations might include things like making the page not dumpable, not copying it on fork, or even disabling caching on that particular page. The only downside I see is, of course, that extending the system call comes with all the maintenance overhead of a new ABI, but I wasn't seeing any other downsides. I'm curious what people think. There's a comment back there.

If you share pages with other processes, you might trick them into accessing a page with properties they don't expect, and then you might actually create side channel attacks with this system call instead of fixing them.

Could you repeat that? I'm not sure I followed.

You're now potentially changing the properties of user space page tables, so if you somehow find a way to trick another process, you might effectively create new issues rather than fix side channel attacks with this system call.

I see what you're saying. Thank you. I'm writing it down. Okay, I think this will be the last one, since it looks like we're almost out of time. The last project I want to talk about today is removing cache breadcrumbs, the data that might be left behind by system calls. This one's not addressing a particular attack per se; I don't know of an exploit that does this. But just thinking about it: if you call a system call and you don't really have permission to do what you asked the system call to do, you might still impact the cache by doing so.
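Going back to the secret-pages idea for a moment, this is roughly how the proposed interface might look from user space. `MLOCK_SECRET` and its value are purely hypothetical; only `MLOCK_ONFAULT` exists today, so current kernels reject the unknown flag with EINVAL, which the sketch uses as a feature-detection fallback.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical new flag: not in any kernel today; the value is made up. */
#define MLOCK_SECRET 0x02

static int protect_secret(void *buf, size_t len)
{
    /* A real MLOCK_SECRET would additionally mark the pages
     * not-dumpable, not-copied-on-fork, and possibly uncached.
     * Today the kernel rejects unknown mlock2() flags with EINVAL. */
    long ret = syscall(SYS_mlock2, buf, len, MLOCK_SECRET);
    if (ret == -1 && errno == EINVAL) {
        /* Kernel without the feature: fall back to plain mlock(),
         * which at least keeps the secret out of swap. */
        return mlock(buf, len);
    }
    return (int)ret;
}
```

A key-handling library, for example, would call `protect_secret(key_buf, key_len)` right after allocating its key material and before writing the secret into it.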
There might be things left behind as a result, even though eventually you got an -EPERM or whatever. The thought was that the error path is likely not performance critical, so what if, every time we return an error, we randomly perturb the cache a bit before returning to user space? For example, on any error that's not a sort of fake error, like try-again, you might rewrite the cache. There's also a new MSR you can use on Intel processors, thanks to the L1TF issue; we now have an MSR we can use to wipe out parts of the cache. Otherwise, you can do it the old-fashioned way, by just reading random junk into it. This would definitely impact performance in the error path, but it's really easy to implement and very simple to understand. It also would definitely make cache contents harder to guess on errors. At least with the POC I've done so far, when you're not able to use the MSR, it does come with increased memory consumption, because of the junk area you have to set aside. I currently implemented it with sort of a per-CPU data area that's just twice as big as my L1 cache, and I just randomly go out and touch it and mess things up, and I do mess things up pretty good. So anyway, I'm curious what people think about this idea.

It seems like a good idea to me, especially given that many of the cache probing attacks involve sending some wildly out-of-bounds index into some system call. So if the system call is returning EINVAL, like, what the hell are you doing, having it do something that slows down the attacker is actually kind of a good thing.

I think this is along the lines of possibly being able to do some sort of intrusion detection by monitoring the error rates of system calls in processes, right? I think there's a lot of exploration that could be done there. Yeah, so we could be more intelligent about applying it.
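A crude user-space sketch of that old-fashioned fallback: a junk area twice the L1 size that gets touched in random order before an error return. The cache and line sizes, the -EAGAIN exemption, and all the names are assumptions for illustration; a real implementation would use a genuinely per-CPU area and hook the syscall exit path.

```c
#include <stddef.h>
#include <stdlib.h>

#define L1D_SIZE     (32 * 1024)     /* assumed 32 KiB L1 data cache */
#define LINE_SIZE    64              /* assumed cache line size */
#define POLLUTE_SIZE (2 * L1D_SIZE)  /* junk area, 2x L1, as in the POC */

/* Stand-in for the per-CPU scratch area the POC describes. */
static unsigned char pollute_area[POLLUTE_SIZE];

/* Touch cache lines in a random order so any lines the attacker
 * primed get evicted; writing (not just reading) dirties them too. */
static void perturb_cache(void)
{
    size_t nlines = POLLUTE_SIZE / LINE_SIZE;
    for (size_t i = 0; i < nlines; i++) {
        size_t line = (size_t)rand() % nlines;
        pollute_area[line * LINE_SIZE] ^= (unsigned char)i;
    }
}

/* Hypothetical error-path hook: perturb only on "real" errors,
 * skipping benign ones like -EAGAIN (-11 on Linux). */
static long syscall_return(long ret)
{
    if (ret < 0 && ret != -11)
        perturb_cache();
    return ret;
}
```

Note that the random-order walk only lowers the signal-to-noise ratio rather than eliminating the signal, which is exactly the objection raised in the discussion below about full wipes versus random perturbation.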
If you know that this is going to be slowing things down, could you use it as an attack as well? If you know this is going to generate a performance hit, could you possibly use this as an attack?

An attack? A DoS attack, yeah.

Oh, a DoS attack. The performance hit is not that bad.

I'd be curious to get numbers on how much it affects the existing cache attacks, like the Meltdown side channels, for example: actually take this mitigation, apply it to an unmitigated CPU that's vulnerable to Meltdown, and see what it does. I imagine it would break all of the existing reproducers, but then try to update a reproducer to bypass this: okay, now we need to sample a thousand times more, or a million times more, to gauge how much this would affect it. Because if you clear all of the cache, then you have no signal, but if you're doing random updates, it would be good to get a sense of how much it actually helps.

Yeah, that's the part I haven't got sorted out yet. Right now, in my POC, I just clear the entire cache, but I feel like that's probably overkill and you don't really need to do that.

Some time ago I heard about Intel cache allocation technology. Maybe we can think not about scrubbing the cache but about separating, isolating caches for various groups of processes.

Cache allocation technology is not my area of expertise, but my understanding is that it applies to the L3 cache and not the L1 cache, so I think we'd still have some issues.
I want to ask about the implementation of such cache modifications, because to me it seems like it would require a lot of manual work. For example, with something like an ioctl, we sometimes have a lot of return-error-this, return-error-that. Is there a plan for some kind of implementation that would not require thousands of repetitive code insertions, some macro or something like that?

We have the MSR in microcode that can be used on Intel platforms. I don't know what's available on other architectures, so I have been doing it the hard way for the case where that MSR is not supported.

Cache probing techniques like prime-and-probe and flush-and-reload typically rely on knowing something about the cache geometry; you're probing a particular cache line or an aliasing cache line. So applying random cache perturbation might lower the signal-to-noise ratio, but there is still probably going to be a reasonable signal-to-noise ratio left over for those styles of techniques.

So you advocate wiping the entire cache?

I think if you were trying to do this, wiping the entire cache would be what you'd have to do; otherwise you're just lowering the signal-to-noise ratio, and I suspect these things are quite resilient to random perturbation. There are lots of papers on how they're resilient to timing perturbation.

I guess it would depend on how random you are, right?

I suspect the thing it's critically dependent on is how many samples you can make over time, and how resilient these things have been to timing perturbation in the past. I would strongly suspect they would be resilient to random cache maintenance.

Well, it's certainly easier to just blow everything away. Okay. Thank you very much. If you want to talk about any of this further, I'm here.