Welcome, you rockin' audience. I'm Nolan Karpinski, product lead at Immutable Systems, and for the last year or so of my life I've been living and breathing this idea of immutability as a way to secure cloud workloads. I'm excited to be at Cloud Foundry in particular, because the idea of immutable infrastructure is alive and well within these walls. I'll be talking about how immutability, typically known as a DevOps concept, can also have a positive influence on security. And I won't say symbiotic or symbiosis, because the buzzword bells will go off, but there is an opportunity for ops and security to work together toward a shared goal here with immutability. I'll also show you a new piece of technology we've built at Immutable Systems that actively implements this level of immutability for security buyers and security teams. So let's talk a little about immutability and what it means. We define an immutable system, and the play on words never ceases to amaze me when it shows up in written collateral, "Immutable Systems defines an immutable system", as a system that cannot be tampered with or modified at runtime. There are plenty of other ways to define immutability. Someone came up to me at the booth and asked, do you mean immutability in the functional programming sense? No, but I'm glad you know what immutability is. The other sense people sometimes mean is immutable data: once something is written to disk, it's read-only and can't be overwritten, and if you do need to change that data, you have a scratch disk or a delta disk so you can revert back to a known good state. And I want to key in on that idea: the known good state.
That place in time when you want to begin to say: yes, this is immutable going forward. At Immutable Systems, we define that as the time a system goes into production. It's really important to define when immutability starts, because at that point the system is known, it's good, it's blessed. The problem, forever and ever, has been that this point in time keeps moving when you have developers, app teams, and operations teams on the box in real time, caring for and maintaining long-lived servers over, honestly, years. If an ops person gets into that system and starts modifying system files, the known good is now ambiguous. But the exciting piece, exciting for Cloud Foundry as well, is that the assumption that a system entering production is in a known good state is an assumption we can finally begin to make. The way we deploy infrastructure, especially in cloud environments (and I'll count VMware as a cloud environment here, because I feel like I have to), it's no longer about super-long-lived servers. You don't have enterprise applications that have been up for five, six, seven, ten years running on the same server. We don't need huge instances, because we're running microservices architectures. And upgrades and updates happen through infrastructure as code, by redeploying containers, as opposed to getting on the box live and saying, hey, I want to apply this security patch right now. So we have an opportunity, and I'm going to get into what this is, to build security on the knowledge that the known good state is the moment you went into production. These next few slides are just to prove that it's not just a me thing; other people think this too, and it's typically in a DevOps context. I ran across this data point a month or so ago: 46% of all cloud developers are using immutable architectures.
I think in order to qualify for this, you'd have to be using immutable architectures as a goal. The idea that your ops team will never have SSH access into whatever container environment you're using is a goal, not necessarily somewhere we're at yet. That's an important caveat to this assumption around immutable infrastructure, and I'll come back to it when I talk about the solution. The second point here is from the Netflix engineering team, which has been using immutable architectures for years now. They've also been at the forefront of AWS adoption; at one point they were the biggest users of AWS infrastructure, and I assume they still are. They say that modifying instances in production should be highly discouraged to reduce, and this is important, configuration drift, and to ensure repeatability, which of course is an operations value proposition, not a security value proposition. But we get even closer in a Martin Fowler blog post, a well-known DevOps source, where he says any change to a running system introduces risk. In the context of that statement he means operational risk: oh, what if this has a dependency I didn't know about? But this is getting really close to the security value proposition we're talking about. I'd call this the new, modern DevOps workflow. What it really means, if you're using immutable architectures, is that once a system goes into production, we know it's known and good, and it shouldn't change. And now we have an opportunity to rethink how we implement security in that context, when we know things shouldn't be changing in production. So this is the mutual value proposition. I want to go through a couple of examples of an operations person saying, my executables can't be modified, or no new code can be installed, and why that's a good thing for a security person.
So let's think about an attack chain and a threat model here, and let's focus on the downloaded-toolkit example. An attacker gets access to a container, downloads code to that container, and then executes that code. In that example, the attacker needs to do two things: download the code, which means writing it somewhere, and then execute it. If we decouple those two things and never allow both in the same place, we can thwart that attack vector. If you cannot download code and then execute it, that threat vector just goes away. So if you have an executable, it can't be modified; if you have a writable directory, you can't execute out of that writable directory. It's a pretty easy thing to describe, but more difficult to implement. The next one is that random scripts can't run: no interactive shells of any kind. Again, that's an immutable-architecture value proposition, but security likes it too. Immutable Systems is working with an education startup right now that is so prescriptive about what runs in their containers that they have one process per container image, and they know exactly what it is. And if we know what that is, why should anything else ever be running in that container? If we have a way to enforce that no process besides the one associated with this container image can run, that is a security value proposition, because it means attackers cannot run additional code on your system. I'll get into the next example a little more, because I actually have a demo of a privilege escalation against stock Ubuntu 16.04 with a mainline kernel, as of its February 12th release. The idea that privileges can never change, that my capability struct can never change, is a good thing from both an operational and a security standpoint, because then you can't do privilege escalations.
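The write-versus-execute decoupling above can be sketched as a tiny policy check. This is just the logic, not MetaVisor's actual policy format; the paths and the policy shape are illustrative assumptions. Every path prefix is either writable or executable, never both, and anything the policy doesn't cover is denied:

```python
# Hypothetical write-xor-execute policy: each path prefix is either
# writable or executable, never both. Paths here are illustrative.
POLICY = {
    "/usr/bin":  {"exec"},   # executables are immutable but runnable
    "/var/data": {"write"},  # scratch space is writable but never runnable
}

def allowed(path: str, op: str) -> bool:
    """Check an operation ('write' or 'exec') against the policy.
    Anything not covered by a policy entry is denied by default."""
    for prefix, ops in POLICY.items():
        if path == prefix or path.startswith(prefix + "/"):
            return op in ops
    return False
```

Under a policy like this, an attacker who lands in `/var/data` can write a payload there but never execute it, and can execute `/usr/bin` binaries but never overwrite them, which breaks the download-then-run chain.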
And then finally, no one can hide in my OS. You can't install additional kernel modules, you can't update my kernel live. That's also a good thing, because we know that's how Linux kernel-module rootkits insert themselves. So make sure your debug registers are immutable; make sure your system call table is immutable. Got all that. So security people, this should be exactly what you want, right? As long as you have a way to implement it. But if you look at, say, a Diego cell in Cloud Foundry specifically, implementing this kind of intelligent immutability, focusing on core components of the runtime and making sure they don't change, is actually really hard. There are three things a system enforcing immutability needs to be able to do. First, it has to be very specific: this file, this directory, this specific set of privileges, this kind of process. Second, it has to be dynamic, and here I'll go back to the caveat I mentioned: immutability is a goal. If you go to an ops person and say, yep, you never get SSH access into any of your systems ever again, they're going to tell you to screw off. So it has to be dynamic; you can't just lock down a system and never let it change. It has to be able to be toggled on and off. And the final piece is that it has to be hardened. What that really means, especially in the Linux world, is that you can get on your Linux box, go into /lib, and just delete everything, and your Linux OS will say, sure, no problem, happy to totally brick your machine. That's the blessing and the curse of root access as a Linux user: you can do whatever you want. So in order to implement immutability, we need a way to prevent those kinds of changes.
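The one-process-per-container idea from a moment ago reduces to a very small check. Here's a user-space sketch of just the logic (the talk's enforcement happens below the OS, and these names are illustrative):

```python
def unexpected_processes(running: dict, allowed: set) -> list:
    """Given {pid: command_name} for the processes inside a container
    and the allowlist for that container image, return everything that
    should not be running, sorted by pid."""
    return sorted(
        (pid, name)
        for pid, name in running.items()
        if name not in allowed
    )
```

For an image whose allowlist is just `{"node"}`, a shell spawned at pid 412 shows up immediately: `unexpected_processes({1: "node", 412: "bash"}, {"node"})` flags `(412, "bash")`.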
Because any barrier we throw up that can be circumvented, by either an ops person or a malicious attacker, isn't worth putting up. And that's the problem with pretty much any agent-based approach. So here's the cool piece of technology we built at Immutable. It's called MetaVisor, and it is, quote unquote, super root. I know that sounds like a marketing term, but the idea is that it sits truly underneath your operating system. It's a very lightweight virtualization layer. There are a million ways I could explain this, and half of you will understand it one way and half another, so I'll try a couple. It's truly outside the operating system: if you grep around inside your Diego cell, or inside a regular Ubuntu OS, you will not see us. We're going to look like a Xen hypervisor, because we're actually virtualizing your virtual machine. Let me go into how we deploy it, because this can trip people up. A typical system has a stemcell root disk. As you know, with stemcell-light you can just pull down the recent AMI version; every stemcell you build is just an Amazon AMI, which is just a pointer to a disk snapshot. A lot of bits, right? Typically you just boot that, you have your ring 0 and your ring 3, all good to go, and it turns into a Diego cell or whatever else it needs to be. When you boot a system that has MetaVisor, there's an additional disk on that AMI. The root disk for your stemcell is completely unmodified, and I can't stress that enough: bit for bit, exactly the same. We're not doing anything at all to the root disk. But we're adding an additional disk and making it the boot device. So when BOSH says, OK, go spin up another Diego cell over there, what actually boots first is the MetaVisor runtime.
And then it chain-boots whatever guest OS is on top. Your guest operating system still has its virtual ring 3 and virtual ring 0. It doesn't know it's being virtualized, but it is. What this enables is that super-root privilege. Any time we see file access, any time we see privilege escalation, any time we see kernel memory being mapped or unmapped (we're also managing your memory for you), any time we see new ports being connected to or bound to, we not only see it, we also have the ability to say yes or no in a way that even the mandatory access control on your Linux system, SELinux or AppArmor, can't. So that is MetaVisor, and this is how we're actually implementing this idea of immutability. Here's the ecosystem of what it looks like. It has one outbound port to a management control plane; I talked about the need for this to be dynamic, for you to be able to flip it on and off really easily. So we have what we call the MetaVisor director, which can update the policy roughly every 10 seconds. You can say, yep, OK, /etc can be written to now; yes, you can install kernel modules if you really need to. But once you lock it back down, it can't be changed. This has a lot of benefits, and I've laid a few of them out, but I want to go over them and then show you a demo of this working live. The demo, as I mentioned up front, is a stock Ubuntu OS being exploited with a kernel vulnerability that was released on February 12th. It's actually an exploit we took from a public POC and modified a little; I don't know if there are legal ramifications of that. So there are four Cloud Foundry-specific benefits. The first is that you get container-specific visibility and enforcement. We're still working on this in the lab.
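The alert-versus-prevent toggle the director pushes out can be modeled as a small policy lookup. The roughly-10-second refresh is from the talk; the event names and the policy document shape are my assumptions, for illustration only:

```python
# Hypothetical policy document, as refreshed from the director every ~10s.
policy = {
    "kernel_module_insert": "prevent",  # revert/kill in line
    "write:/etc":           "alert",    # log it, let it through
}

def action_for(event: str, policy: dict) -> str:
    """Resolve an observed event to its configured action. Events not
    named in the policy default to 'alert' so nothing fails silently."""
    return policy.get(event, "alert")
```

Flipping a guard from alert to prevent is then just editing one value in the document and waiting for the next poll, which matches the toggle shown later in the demo.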
But basically, we listen to the Docker socket on the VM. With that, we can look at the API calls the Docker daemon is making and associate namespaces with a Docker image. We're working on doing the same thing with Cloud Foundry, potentially even with the new Garden implementation. So you get that visibility out of the box. You also get zero-day protection, because there's a lot we can lock down knowing it's bad. We know that interactive shells spawned from web processes are a bad thing. We know that privilege escalations, i.e. container escapes, are a bad thing: your capability struct should simply never change. We know that Linux rootkits and OS-level persistence are bad things: you shouldn't be installing additional code or inserting additional binaries into your operating system. The third point is also really cool. We're inline with the network traffic; we have our own network stack. What that means, and this is also still in the lab, unfortunately, is that we can redirect traffic to third-party appliances. Especially in an east-west configuration with a lot of microservices, it's difficult to insert, say, a Palo Alto IPS box between two microservices, because you'd have to hairpin every connection. But if you can do it selectively and transparently to the application, you can say, OK, I thought this virtual machine or this container needed to talk to that container; actually, we're going to reroute that through a third-party appliance, completely transparently to the application, because we own the network stack and can rewrite headers. The fourth point, and this is especially important for host-level runtime security, is that Cloud Foundry environments don't play well with agents.
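The container attribution described above hinges on one mapping: from a container's init PID (whose namespaces host-level events carry) to the image it runs. Assuming documents shaped like `docker inspect` output (which exposes `State.Pid` and `Config.Image`), a minimal sketch:

```python
def image_by_init_pid(inspect_docs: list) -> dict:
    """Map each running container's init PID to its image name, which
    is the association needed to attribute a host-level event (seen as
    a PID in a namespace) back to a container image. Input documents
    follow the `docker inspect` shape: State.Pid and Config.Image."""
    return {
        doc["State"]["Pid"]: doc["Config"]["Image"]
        for doc in inspect_docs
        if doc["State"]["Pid"] != 0  # stopped containers report Pid 0
    }
```

With that map in hand, a kernel event observed for pid 4242 can be reported as "in container image myapp:1.0" rather than just a bare host PID.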
They really do not play well with agents, and that means it's tough to install things post-deployment. BOSH can do some of this, but the nicer way we insert, of course, is as part of the AMI, so you never have to worry about it. Once BOSH does its thing and spins up a new instance, that's it: you've added a disk to your AMI, and that's the only insertion mechanism you need. It automatically installs and deploys. So at this point, I'm going to show you a demo of this working live, and hopefully it goes well; I have a video if it doesn't. I have a stock Ubuntu 16.04 Linux OS, and here it is. Let me start over from scratch; this is always so scary to do live. Oh, it's not showing. OK, sorry, can I just mirror? Sorry, y'all. Now I can't see my own screen, but let's try mirroring real quick. So, a Linux AMI; I connect to it just like I normally would. I have a bunch of exploits on this AMI, but again, it's completely stock Ubuntu 16.04 from February 12th. Docker build, docker run, I'm in the container. As we all know, the capabilities of a root user in a container are not true host root. It looks like I'm root, but if I actually print my capabilities and search for the ability to insert kernel modules, I see that I don't have it. So if I try to insert my kernel module, it's not going to work; Linux says, no, you don't have those capabilities. But I have an exploit here, against completely stock Ubuntu 16.04, that targets capabilities. What it does is overwrite the place in memory that holds my capability struct, and then it spawns a shell. So this is actually a new shell you're seeing here, and now I have those capabilities. If I run id again, it looks like I'm root, same as before, but if I print my capabilities: full capabilities. I'm now truly root on the host.
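The capability check in the demo boils down to reading the `CapEff` line from `/proc/self/status` and testing one bit. `CAP_SYS_MODULE`, the right to insert kernel modules, is bit 16 in the kernel's capability numbering; the two hex values below are the stock full-root effective set and Docker's default container set:

```python
CAP_SYS_MODULE = 16  # kernel capability bit for inserting modules

def has_capability(cap_eff_hex: str, cap_bit: int) -> bool:
    """Test one capability bit in a CapEff value as printed in
    /proc/self/status (a hexadecimal bitmask)."""
    return bool(int(cap_eff_hex, 16) & (1 << cap_bit))
```

Host root on a 16.04-era kernel reports `CapEff: 0000003fffffffff`, which has the bit set; a default Docker container reports `CapEff: 00000000a80425fb`, which does not, and that's exactly why the unexploited container's insmod fails.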
And now insmod works. So if I go to my management control plane, the MetaVisor director I was talking about, CF demo... great, well, it worked. And now I'm going to show you it not working. What this would have shown, if I had done it on the right instance, is an alert: hey, capabilities are being exploited. Right now this is in alert mode; if I go here, my kernel guard actions are all set to alert. So what I can do is change everything to prevent. Again, that updates within about 10 seconds. Then I can go into that instance again, a different instance this time, and do the same thing. So, 0045; I'm not going to screw it up this time. Let me make sure it's got the policies already. If I go to events... the timestamps are right. I actually spawned a shell there; I had to exit twice. I go here... All right, I'll show you the video instead. Here's what it's going to look like. I do the exact same thing; this is saying the MetaVisor always knows, and this is what a capabilities escalation looks like. Then I go back and do the exact same thing, but first I change the policy to prevent mode, which means that when MetaVisor sees the escalation, it reverts those capabilities inline, so the container cannot escalate its privileges. When I go back and run the exact same thing, once again I don't have the capability, I try to run my exploit, and it fails. MetaVisor is preventing that action in real time, and I don't have the capability to insert a kernel module. And when I look at the events, I see that MetaVisor killed the process when that happened, and I see the prevent action recorded. Anyway, that is the demo of the container escalation.
And I think the important point to harp on here, if I go back to my slides and stop sharing, is that this is attack agnostic. Although I showed one vulnerability, CVE-2017-16995, that's just one of many that this immutable-privileges protection could stop. It also stopped waitid on day zero, I don't know if you've heard of that one, and timeoutpwn on day zero. There's a bunch of these exploits from the last year or two that this stops on day zero, regardless of how the vulnerability works or what the exploit does, because your container simply can't escalate its privileges. The same goes for any kind of kernel rootkit: it doesn't matter what it is, it matters what it does. You just can't change your system call table; you can't hook those functions, regardless of what the actual rootkit is. And there are a lot of protections like this. It doesn't matter that some specific vulnerability allowed some specific rootkit to overwrite a binary on your system; we just don't allow that, regardless. So it ends up being attack agnostic with respect to the vulnerabilities we stop. Broad coverage: I want to touch on this too, because the vulnerability I just showed you was a kernel-level vulnerability. I don't know of any other product in the world that can actually stop that; SELinux can come close. The fact that we're protecting from the kernel level all the way up to the container is what I'd call broad coverage. And the final thing, which is just such a cool thing specifically for Cloud Foundry environments, is that it inserts not as an agent but as part of the AMI itself. So you're deploying security without having to install additional agents or anything else on that host.
Once it goes into production, you know that when you deploy, that is when you're saying: yes, this is known and good, it's in production, therefore it can't change. And that is security through immutability. I'm working on it every day, trying to make this a thing. So thank you all for listening. Any questions? Yeah. Totally. If you took what I said in one direction, you could say, well, he's running a hypervisor in a hypervisor; that sounds crazy. But we don't do a bunch of the things a hypervisor does. For some context, and I'll get into the numbers: we don't do binary translation, and we pass a lot of things through. From a memory perspective, we just carve out about 500 megabytes at the beginning of the runtime. When it boots, the guest simply cannot access that memory, regardless of how big the instance is; do the math against however much memory you have, but the memory footprint is not an issue. CPU does have some overhead, because we're proxying your system calls. If you run a web server test against us, those systems aren't CPU-bound, so you'll see, honestly, 0 to 5%. If you run a kernel build against us, you'll see 15 to 30%, because that's hammering system calls the entire time. So mileage may vary, I guess. But I wouldn't tell people today, maybe tomorrow, to run this under a DB server, for instance. For one, those aren't the kinds of attacks we're really stopping; we're not a DLP. And most of the vulnerabilities you see are going to be in the app tier and the web tier, so that's more relevant to us and fits our performance profile better. From a response-time perspective, it's around two microseconds, so especially in the cloud with Amazon, that latency is going to be in the noise. Anything else? So, we deploy it with a Concourse pipeline.
You'll have a step in there that sends an API call and gets the latest AMI in whatever region you're in. Instead of deploying that directly, it takes it and adds the disk I talked about: it inserts the publicly available MetaVisor snapshot as the boot device, and uses the resulting AMI from there on for each of the subsequent jobs within that Concourse pipeline. Other than that, BOSH just does its thing; it's not a big deal. I actually have a demo environment running this under PCF. There are certain things that work really well, and certain protections that, at least initially, we're still ironing out, especially trying to quantify every directory: there's this base OS that has to support every container it's ever going to run, so the number of directories is large, and we had to put in a lot of exceptions. The kernel protections, and the rule that your container can't spawn an interactive shell from a web-facing process, still work. There's some work left for us to iron out, but it deploys and it does actually work; I can show you. Especially for the kernel stuff, you have to be able to tell an ops person: don't worry, you don't have to run this in prevent mode. In most cases we actually mimic the underlying mandatory access control. So with SELinux, to the app it's going to appear that SELinux is doing whatever it's doing. There's a cool demo where, as root, you try to run rm -r * in /lib, and what that actually does is give you a ton of "permission denied", which is exactly what you'd get if you ran it as a standard Ubuntu user. So we're mimicking the same response. Cool? Thanks, guys. Thank y'all.