I'm hopeful that we'll start here, I guess probably now, since I just interrupted you all. So my name is Tycho Andersen. I'm here with James Page. I'm on the LXD development team. I've worked for Canonical for about two and a half years, and for about two of those years I've focused on working on containers, mostly in user space, although more recently in the kernel.

Hi, so I'm James Page. I'm technical architect of the OpenStack team at Canonical. I've been with Canonical about five years, most of which I've spent working on OpenStack. I'm a Debian developer, an Ubuntu core developer, and an OpenStack contributor. My team is responsible for all the packaging, charms, and QA around OpenStack that we do on Ubuntu.

So today we want to talk a little bit more about something Mark mentioned, which is LXD, the pure-container hypervisor, and in particular what sort of benefits it offers if you use it as part of OpenStack. One number Mark mentioned was the 14x density; we'll talk a little bit more about other numbers later. But first, I'd like to give you a short introduction to LXD itself.

One of LXD's core tenets is that it's very simple: you can use it as a building block for any project that you're involved in. The way to think of it is as an API for containers. It has properties like being secure by default, but the core of LXD is really the API itself. It offers a very simple way to access, create, and destroy containers, and to do all the hypervisor-like operations you'd like to do with them, as a nicely designed experience. This API was not grown organically; both the command line and the API were designed up front so that they're nice and easy to use.

The second thing is that LXD is fast. James is going to talk a little bit more about some benchmarks that we prepared for this summit to give you an idea of just how fast, so I don't want to steal all of his thunder; I'll just leave it at that and say that LXD is very fast.

And finally, LXD is secure. We use all of the available kernel security primitives today to make sure that your containers are isolated from each other and that they can't attack the host or do anything nefarious. The point here is simply that we use everything the upstream community knows how to do today to make containers secure.

The last thing that I'd like to do is position LXD. When people think of containers, they typically think of application containers like rkt or Docker, and we at Ubuntu really enjoy using these tools; the innovations they bring, in particular around image management, are fantastic. But there's also another set of things here: the more traditional hypervisors you think of, VMware, Hyper-V, those sorts of things, which do full hardware virtualization. LXD fits in with these. Although it doesn't do full hardware virtualization, and in fact that's exactly what gives it the performance advantages it has, the API and the toolset you use look and feel like a hypervisor. That enables you to do hypervisor things: you can take snapshots, you can do live migration, and the other things you expect to be able to do with a hypervisor. This is a traditional virtual machine experience; the thing looks like a machine. James, in his demo later, is going to log in and look at the syslog and do all the other things you expect to be able to do with a machine. So your users are not confused; they perhaps don't even know that they're using containers, but they're getting all of the performance benefits that we're going to tell you about.
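Since the core claim is that LXD is really an API, here is a rough sketch of what driving that API looks like from a client's point of view, speaking plain HTTP to the daemon over its local unix socket. The socket path, endpoint paths, and request body follow LXD's published REST API but should be treated as illustrative rather than exact, and the image alias is a placeholder.

```python
# Minimal sketch: talking to the LXD REST API over its local unix socket.
import http.client
import json
import socket

LXD_SOCKET = "/var/lib/lxd/unix.socket"  # typical location; may differ per install

class UnixHTTPConnection(http.client.HTTPConnection):
    """http.client connection that speaks HTTP over a unix domain socket."""
    def __init__(self, path):
        super().__init__("localhost")
        self._path = path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.connect(self._path)
        self.sock = sock

def lxd(method, path, body=None):
    conn = UnixHTTPConnection(LXD_SOCKET)
    headers = {"Content-Type": "application/json"} if body else {}
    conn.request(method, path, json.dumps(body) if body else None, headers)
    return json.loads(conn.getresponse().read())

# List the containers the daemon knows about.
print(lxd("GET", "/1.0/containers"))

# Ask LXD to create a container from an image alias (an asynchronous
# operation that the daemon tracks for you). The alias is a placeholder.
print(lxd("POST", "/1.0/containers", {
    "name": "demo",
    "source": {"type": "image", "alias": "ubuntu"},
}))
```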
And with that, thanks, Tycho. We're now going to take a look at how we've used LXD in the context of OpenStack: LXD as a hypervisor for OpenStack.

First things first, this is a complementing technology. We're not saying you have to run your entire cloud on LXD-based containers; this should be part of your cloud story in a hybrid cloud. Alongside KVM, VMware, and Hyper-V, LXD provides an alternative way to provide machine resources as part of your cloud, and LXD fits into OpenStack just like libvirt and KVM do in an OpenStack deployment.

So what does that look like? Well, it looks just like using the Nova API or Horizon, which we'll look at in a minute as well. You've got all the standard operations: boot, reboot, stop, delete. You can associate floating IPs. You can take snapshots. You can do resize operations. You can do migration operations. So these LXD containers, when they're part of an OpenStack deployment, feel just like a KVM instance from an end-user perspective. It should be a familiar experience, and as a result, all the tooling you already have that works with your KVM virtual machines should also work with an LXD container as part of an OpenStack cloud.

We're also able to manage resources. In the same way that when you boot a particular flavor under KVM you get a fixed number of CPUs and a fixed amount of memory and disk, you get exactly the same with LXD. We can constrain the containers to ensure that they have the resources of the flavor they've been booted as, and that's exposed directly down into the container. So the processes running inside the container can tell they've got 2 GB and a core, or 16 GB and 8 cores, to consume, and they can configure themselves appropriately. We'll take a look at exactly how that works in a bit, in the demo.

You can also migrate containers; I've touched on this before. So we can shift our workloads around: if we do feel that a container needs moving off a very highly contended host, it's possible to migrate it off to something that's a little less contended. That gives us the migration features you typically find in a KVM-based cloud in an LXD-based cloud as well, so we can move those workloads around if need be.
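To make the "same tooling" point concrete, here is a sketch of driving an LXD-backed cloud with the standard python-novaclient. The client constructor arguments vary between releases (this follows the Liberty-era form), and the flavor name, image name, and credentials are placeholders.

```python
# Sketch: the Nova calls are identical whether the compute nodes run KVM or LXD.
import os
from novaclient import client as nova_client

nova = nova_client.Client(
    "2",
    os.environ["OS_USERNAME"],
    os.environ["OS_PASSWORD"],
    os.environ["OS_TENANT_NAME"],
    os.environ["OS_AUTH_URL"],
)

flavor = nova.flavors.find(name="m1.small")     # placeholder flavor
image = nova.images.find(name="ubuntu-15.10")   # placeholder image name

# Booting on a nova-lxd compute node produces a machine container, but the
# call is exactly what you would issue for a KVM instance.
server = nova.servers.create(name="lxd-demo", image=image, flavor=flavor)

# The usual lifecycle operations look the same too.
server.reboot()
nova.servers.create_image(server, "lxd-demo-snapshot")
```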
So let's dive into that demo and see how this goes. OK, so we've got an LXD OpenStack cloud deployed. It's the one that we did our benchmarking on, which I'll talk about in a minute; it's just a small four-node cloud. And you can see, let me just dive in here, it's now going to want me to log in. I logged in too early, apologies. OK, so we can see we've got three LXD-based hypervisors in this cloud. They've all got 24 cores, so they're reasonably capable machines. We have deployed a workload onto this cloud, and we've done that using Juju, our service modeling tool. So we've used the OpenStack provider for Juju, and we've deployed one of our reference big data bundles, which includes Hadoop, Spark, and Zeppelin, onto that cloud in machine containers.

If we drill into this a little bit, we can see that we have multiple services, all related together, all running on individual machine containers, which we'll dig into in a little bit. If we pop back to our Horizon dashboard, we can see all of those same services, and we can see the instances that are actually running them as well. These are the actual machines, and you can see the specifications of the machines running each service.

So let's hop on to the terminal and take a look at that as well. This is the Juju command line. It shows you exactly the same information as the GUI, but obviously in a command-line-consumable format. We can see here that we've got multiple compute slaves in our big data deployment, and they're all ready to go. You can see that they've all been deployed, they're booted, and they're ready to start running workloads, and you can see all the machines that are supporting that. And again, we can focus in; I've got a little bit of lag here, as this cloud is on the Isle of Man, so it's a fair way geographically from here. We can see all the Nova instances running, and we can do the kind of operations you'd expect to be able to do on those things. We can pull the console log and see the log from when the machine booted; we can do all the standard operations. And we can also log into these machines. Just to demonstrate that this feels like a full machine container: we can see this is a 4 GB machine, and you can see that it's got its own process tree. We can see the processes that are running for this machine container: it has init, it has syslog, it has cron, it has all the features you'd expect from a full system. And just to prove the point a bit as well, it's got a couple of cores, this machine. So we're able to restrict those resources to allow the container to run in a very specific configuration.
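As a small illustration of what the demo shows from inside the container, a process can check the resources it has been handed. The cgroup path below assumes the cgroup-v1 layout used on Ubuntu 15.10 and is illustrative only.

```python
# Sketch: how a process inside one of these machine containers sees its limits.
import os

# CPUs actually usable by this container (reflects any cpuset restriction
# applied for the flavor it was booted as).
print("usable cpus:", len(os.sched_getaffinity(0)))

# Memory limit applied through the memory cgroup.
with open("/sys/fs/cgroup/memory/memory.limit_in_bytes") as f:
    limit_bytes = int(f.read().strip())
print("memory limit (GiB): %.1f" % (limit_bytes / 2**30))
```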
OK, and just to prove that it's all functional, I'll also just run a quick sample job on there. This deployment is running Spark, so we're just running the SparkPi sample application, which uses the big data cluster to crunch the value of Pi very quickly, to not a very large number of digits. It does that in a randomized way, by looking at circles and throwing darts and stuff I don't particularly understand. But anyway, there it is: it's calculated Pi, it's used the resources in the cluster underneath, and it's given us the result back. So that's the end of my demo. If anybody wants to try this, we've got the bundle up on the Juju charm store. It needs about four physical machines, gives you three compute nodes and a control node, and you can try this stuff out for yourself.

OK, so that's LXD and nova-lxd. That's the positioning piece, that's how it all plugs together, that's the vision. Let's look at the reality of what a workload looks like from a performance perspective between LXD and KVM.

So let's just talk about the reference platform a bit: OpenStack Liberty on Ubuntu 15.10. It's Ubuntu 15.10, not Ubuntu 14.04, because that's our primary development focus for LXD right now. LXD is coming to 14.04; that's work in progress and should be completed in the next couple of weeks, at which point you'll also be able to consume it on our current LTS release. They're four-unit clouds: one control node and three compute nodes, each with 24 cores and 48 GB of RAM, with KVM or LXD as the hypervisor. Two clouds running at the same time, otherwise completely identical.

So the first benchmark we looked at was the TeraSort benchmark. TeraSort is a pretty industry-standard benchmark for big data workloads: it generates a number of rows of random data and then sorts them into order, and it's a fairly standard measure for evaluating big data processing performance. We looked at that in three contexts, to evaluate how the workload would perform both when there was nothing else running on the cloud and when there was contention happening on the cloud. When there's nothing else happening on the cloud, the performance between KVM and LXD is very comparable; the bars are not particularly different. But what's interesting is how LXD performs compared to KVM when you start to get more workload happening on the same underlying physical hardware and get contention between those things. You can see from the graph I've got up that when we added more underlying units onto the cloud and added extra work going on in the background alongside the TeraSort, we had a pretty consistent performance story from LXD: it took pretty much the same time however much extra load we pushed onto the cloud. But with KVM, we did see some degradation. The reason for that is that when you're running on LXD, all of the processes are running on a single kernel, with a single scheduler and the same file system cache, so you get all the optimization we can deliver by not having multiple kernels and full virtual machines running. Under KVM, you obviously do have all of that overhead, and the cost of switching KVM machines on and off the processor is relatively expensive. So when you get a lot of that going on at any given point in time, you do see this type of performance degradation. So it's a great story for LXD in terms of how it manages that contention, and I think that emphasizes that if you've got a busy cloud with lots of busy workloads, then LXD is a way to squeeze more out of the same physical resources without seeing the same levels of contention.

The second benchmark we looked at for this talk was a Cassandra benchmark. There's a tool called cassandra-stress that allows you to exercise a Cassandra cluster in various different ways, and we looked at write performance specifically. The configuration we used was 200 threads running at any given point in time, performing about 10,000 operations on a three-unit cluster, so just a single unit on each of the underlying hypervisors. And that produced some interesting results. I didn't believe them to start off with, so I tore the clouds down, redeployed them, revalidated everything we'd set up, and ran it again. This is where we do see a big performance difference between LXD and KVM. The first graph is the latency of every write operation: the average latency under LXD was about 30 milliseconds per write, and under KVM it was about 105 milliseconds per write. This workload is lots and lots of small I/O from multiple different sources all at the same time, and under LXD we see much, much better performance. When you have this high number of context switches happening on the network and disk all the time, we see this type of latency difference, and that translates into a very big difference in throughput: maybe 3,000 rows a second under KVM and almost 16,500 rows a second under LXD. So there's a great story there for workloads that are doing lots and lots of small I/O, whether it be disk or network, with lots and lots of writers. There's a real big difference there.
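For reference, the workload described above corresponds roughly to a cassandra-stress invocation like the following: a write workload with 200 client threads against the three-node cluster. The flag syntax is that of the stress tool shipped with Cassandra 2.1 and later, and the node addresses are placeholders; the exact invocation used for the benchmark isn't shown in the talk.

```python
# Sketch: launching a cassandra-stress write workload like the one described.
import subprocess

nodes = "10.0.0.11,10.0.0.12,10.0.0.13"  # placeholder instance addresses

subprocess.run([
    "cassandra-stress", "write", "n=10000",  # roughly 10,000 write operations
    "-rate", "threads=200",                  # 200 concurrent writers
    "-node", nodes,
], check=True)
```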
So that gives you a bit of an idea of how LXD stacks up from a performance perspective. So what's up next? As Mark detailed before, we're working towards Ubuntu 16.04, which is our next long-term support release. The plan is that LXD and nova-lxd, our driver for OpenStack, will be production grade in about six months' time. They're consumable now and recommended for testing and evaluation; we want people to start consuming them now and give us feedback. But we've got a few things to finish off. Although we can do resource management right now, it's a little bit crude, and we want to improve that and make it a much richer experience so that we can expose more of the resource control semantics of OpenStack down into LXD containers. That includes improved CPU control, storage control, and network controls as well. We've got some additional options coming along for underlying container storage. Right now we support LVM and Btrfs as backends for the container root file systems, and that allows us to use the features of those storage technologies to do things like fast cloning of machines. You may have seen announcements that Ubuntu 16.04 will include ZFS, so we will be adopting and testing with that as well. And there are a few more pieces to do around live migration, which Tycho can talk to in a little bit of detail, to allow that to work fully and securely across the board.

OK, so that's our presentation. We've left a good amount of time for questions; we thought there would probably be a few from you. So if any of you have any questions, please fire away. Yes: what tools do we use for benchmarking? So the way we deploy the workloads is with Juju, and Juju exposes a feature in charms called actions, which allows us to run operational maintenance commands but also to do things like benchmarking. For example, the big data bundle includes actions to run TeraSort, so it's relatively easy to execute the benchmark and then grab the metrics back. Marco Ceppi, who's one of my teammates, is going to be talking about our benchmarking story and how we're using that across both private and public clouds, at 11:50 today. So we've got a story on how we do that now, and we have a framework for pulling all that data together, and that's broadly how we've done this benchmarking as well. Other questions? Yes.

Sorry. No, those guys are still hard at work. This is, I think, just a different take on how to do machine containers and that sort of thing. So really the idea, as I mentioned earlier, is that this is a particular implementation using LXC and Linux containers, but the core of LXD is really the API, the experience you have while using it; that's what we want you to take away from this. So in principle, it would be possible to implement an LXD backend using OpenVZ containers.

So, yes. Yes, LXD is developed upstream, at github.com/lxc/lxd. The way to think of it is that LXC is basically a C API with a few command line tools that call into that C API, but they expose a fairly low-level view of it. What LXD does is add a daemon; it's written in Go, but it uses the go-lxc binding to call that C API. So LXC, the LXC that you've been using for years, is still the thing that's actually doing the container management: managing the cgroups and setting up all the bind mounts and the veths and everything. And LXD is really just a thin layer on top that exports a nice API and calls down into that binding.
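To contrast with the REST sketch earlier, this is roughly the low-level view that LXC itself exposes, shown here through its Python binding (python3-lxc) purely as an illustration; LXD itself uses the equivalent Go binding (go-lxc) to drive the same C API. The container name and template arguments are placeholders.

```python
# Sketch: the low-level LXC API that LXD wraps, via the python3-lxc binding.
import lxc

c = lxc.Container("demo")
if not c.defined:
    # Build a root filesystem with the "download" template
    # (dist/release/arch values are illustrative).
    c.create("download", 0,
             {"dist": "ubuntu", "release": "trusty", "arch": "amd64"})

c.start()
print(c.state)   # e.g. "RUNNING"
c.stop()
```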
There's also a reason to have a daemon: to do things like live migration and a few other bells and whistles that you can add. There are commands in LXC, like the lxc-checkpoint command, so you can kind of fake it, but LXD is really there to give you a nice view of that. Yes? Yeah, other question?

Yep, yes, exactly. So the LXD resource management story will be much improved come 16.04. Right now we have very basic controls for CPUs; I actually don't think you can do pinning today. We have basic controls for memory, but it's all implemented under the hood through cgroups. The idea, though, is that the way it's expressed in LXD is platform independent. So if something else comes along, for example if you were to do an OpenVZ implementation, where user beancounters are the underlying mechanism, you could take the language that LXD uses and translate it into that format. So yes, the way we implement it today is with cgroups, and like I say, there's a specification; I think we've done an internal review and it's a go, so I think we'll be posting that shortly. There's also a mailing list discussion thread from about four months ago asking for user feedback about what sort of limiting you would like to do. So if you're interested in that at all, we'd love to hear from you. So, yes?

So, is that a question? Yes, so you're asking: can we expose a block device to a container? So this is tricky; there's a problem. The way the kernel does a mount is that it reads the first however many bytes of the block device and then tries to parse the superblock and so on. The superblock parsers in the kernel have not historically been considered safe against arbitrary input, so there's an issue where, if you write some garbage to a superblock and the kernel tries to parse it and there's an exploit, bam, you have a kernel root. So it's kind of tricky. The good news is that Ted Ts'o, who is the maintainer of the ext4 filesystem, has said he would consider any security exploit like this to be an actual bug in the parser, and he would fix it, which is great. Not all filesystem maintainers in the kernel have said that. There's also a patch set by a guy on our kernel team to enable mounting of ext4 filesystems from within user namespaces. So this is ongoing work, but the story is still progressing. Does that answer your question? Okay, great.

Just to take a slightly different angle on that: in the same way as we have local LVM and Btrfs storage options for container root file systems now, we have talked about the concept of using remote Ceph block devices and the features of thin copy-on-write clones to provide that. That's thought and conversation right now rather than reality. Yeah, it's possible, but not arbitrary mounting of a block device into a container; that's the piece that Tycho's talking about. But we could, through LXD, have a storage backend that was Ceph-based to provide those root file systems. And I guess one final note is that we've also talked about having some API where users could potentially ask LXD: hey, I trust this block device, can you mount it into this container at this location? We have a feature where you can pass mounts into a container, so it would be something like that. The real fix, I think, is the superblock parsers in the kernel, but that is a process.
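As a sketch of the two things just mentioned, the resource limits and passing a host mount into a container, here is what they look like when driving the lxc command line from Python. The limits.* key names reflect the limits specification as it later landed in LXD; at the time of this talk that specification was still being finalized, so treat the exact keys and values as assumptions, and the container name and paths as placeholders.

```python
# Sketch: setting container limits and passing a host directory in as a mount.
import subprocess

def lxc(*args):
    subprocess.run(["lxc", *args], check=True)

# Constrain the container roughly the way a Nova flavor would.
lxc("config", "set", "demo", "limits.cpu", "2")
lxc("config", "set", "demo", "limits.memory", "4GB")

# Pass a host directory into the container as a "disk" device, rather than
# handing it a raw block device to mount itself.
lxc("config", "device", "add", "demo", "data", "disk",
    "source=/srv/data", "path=/mnt/data")
```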
Yes? Yes, so I hear, yeah. I intend to have a conversation with Sage this week about that; he's mentioned it, he's talking about that. Yeah, and exposure of storage via something like Manila, via the mechanism that Tycho just talked about, so being able to present a file system and then have LXD present that to the container, rather than it being presented directly in the container, is something that we are looking at. Unfortunately, the Nova API doesn't have semantics for that yet, but that's a different challenge. Other questions? Going once? Twice? All right, thank you very much.