Okay. Well, let's get underway, I guess. Welcome back, everyone, for this afternoon. I'm sure we're going to have some additional folks joining us as soon as they get out of the food lines. This afternoon's featured talk is from Greg KH, who as you know stands head and shoulders above his peers, both figuratively and literally. This of course highlights one of the major problems we're having in open source right now: we can't seem to field a good basketball team. So you need to recruit some of your tall neighbors to come and work with us, okay? But in the interim, here is Greg, and I'm sure it's going to be interesting.
Okay. This is easy. So please heckle, ask questions. Make this fun. Otherwise, I'll just burn through these slides and make everybody feel bad that they missed it afterwards. Okay. So first off, I want to say, Daniel, yes, thank you. Thank you very much for this, and you'll see why later. Thank you. And Akanu, however you pronounce that, thank them. They sponsored Daniel to do this work a long time ago. Actually, when was this written? When did you do that work? Okay. One year ago. But it was run on a kernel that was four years old. Or more. Okay. So here's this talk. A lot of people are like, why are you working on Xen? And I'm like, good question. The problem I ran into with a lot of people is this: with the cloud providers you guys have, if you're a DomU user on EC2 or whatever else, it's really, really hard to change your kernel. That's not true. Okay. I'll show you how it is hard for you. Okay. I will. I'll show you what I want to do and you can tell me how to do it right. Because that's how I got involved. I was showing people a really easy way to update their kernel quickly. And then all of a sudden it was like, oops, it just didn't work on Amazon. So I want to be able to boot the latest kernel that I came out with yesterday, or that came out with the stable kernel releases. And I really want to be able to boot.
If that kernel fails, I want to fall back to a safe kernel. And I want to do it in a cloud-provider-independent way. Yeah. I also want to do it on hardware, and I want to do it in a hypervisor-vendor-neutral way. And the way people do it today, like you will say, you can use PV-GRUB, you can have config files, and you can make custom images. EC2 lets you make a custom image, then you've got to make a custom image for Rackspace, you've got to make a custom image for Linode, a custom image for DreamHost. Anyway, you've got to make custom images for your OS. It's hard, or you've got to mess with the package manager. With the package manager, you can have hooks to go and select these properly if you update using RPM or using Debian packages. But there are a lot of people these days that aren't even using package managers. So how does that work? You have to hit the card. So the best thing to do, what I always pointed people at, was: hey, just use kexec. Kexec is great, and I'll talk about what it does. I think in a minute. Yes, kexec lets you run Linux from Linux, if you don't know what it is. So with kexec, basically, user space says: here's a chunk, a binary blob, run it, and get out of the way. And that's really cool. So it lets Linux boot Linux, you can put it in a kernel, and Linux becomes a bootloader. On a lot of hardware platforms, kexec is used as a bootloader. On the PS3, kexec was the bootloader. Linux was the bootloader. Coreboot, I thought they started out messing with this, and they ended up making it smaller, but it isn't that way. So we got it working on Xen. No, I got it to work on KVM and a few other things until things died. But all the enterprise distros use kexec for another reason. When the kernel panics, kexec can be told: go run this binary blob, which is another Linux kernel, which then dumps all of memory.
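As a rough illustration of the two uses of kexec just described (booting another kernel, and staging a kernel to run on panic), here is a minimal sketch that only builds the command lines for the kexec-tools `kexec` utility. It does not run them. The function names and the file paths in the usage note are hypothetical; the flags (`-l` to load, `-p` to load a panic kernel, `-e` to execute, `--initrd=`, `--command-line=`) are the standard kexec-tools options.

```python
def kexec_load_argv(kernel, initrd=None, cmdline=None, on_panic=False):
    """Build the argv that stages a kernel with kexec-tools.

    on_panic=False -> 'kexec -l': stage a kernel to jump into later.
    on_panic=True  -> 'kexec -p': stage a kernel that runs only if the
                      current kernel panics (the kdump case).
    This function only builds the list; it does not execute anything.
    """
    argv = ["kexec", "-p" if on_panic else "-l", kernel]
    if initrd is not None:
        argv.append("--initrd=" + initrd)
    if cmdline is not None:
        argv.append("--command-line=" + cmdline)
    return argv


def kexec_exec_argv():
    """Build the argv that jumps into the previously staged kernel."""
    return ["kexec", "-e"]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(kexec_load_argv("/boot/vmlinuz-new", "/boot/initrd-new", "root=/dev/xvda1"))
    print(kexec_exec_argv())
```

On a real system these lists would be passed to something like `subprocess.run(argv, check=True)` as root. Note that `kexec -e` jumps straight into the new kernel without a clean shutdown, so a distro would normally go through its own reboot path instead.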
So kexec is cool in that it doesn't free memory, it doesn't clean it up, it just instantly transfers ownership over and boots and goes. So kdump works really, really nicely. When the machine boots up for the first time, you pass a binary blob to kexec that says, if any panic happens, run this code. And that code just scans all of memory, dumps it to a disk file, and then reboots itself again into another known-good kernel image. All the distros use it, and they can send off core dumps to people. It's very, very powerful, and that's a reason a lot of people use it. This is the reason why you wanted to do this for EC2: we wanted to get kdump working. You can't do kdump today on EC2. I would like to use kdump. A lot of people want to use kexec. So the real cool thing is this, this is why I'm doing this. I told people, hey, sure, use kexec, it'll work great. And then they said no, it doesn't work on Xen, and I go, crap. Wait, wait, wait, geez. All right, here we go. I will cover your pedantics. In Dom0, kexec will work today. Here's the way kexec will work today. You can kexec Dom0 to Dom0, you can go Dom0 to Linux, so you can blow away all of Xen, or you can go Linux to Dom0. You cannot do DomU to DomU, which is what all the cloud providers give you, because you don't have access to Dom0. Unless they're running HVM, but then that's KVM, right? Well, no. Well, no, how does that work? So with HVM, you're not DomU, you are DomU, but I could not get it to work, it just fails. Linux to Linux in a DomU will not work. It should. Which version of Xen? Will that work? Three or four? Shouldn't? It should be. It shouldn't. Okay. Oh. The HVM itself will work, okay, I'll have to play with that. Maybe that's the solution, and I won't have to get all this code upstream. So anyway, DomU to DomU would be nice, right? I mean, it would be a good goal. You could get Red Hat on a tiny instance, or something on EC2, to actually get a crash dump out of it.
Things like that, that would be cool. And thankfully, you did all the work. And also I want Linux to be the bootloader. Linux is a really nice bootloader to use. You can do some really cool and powerful things. Within the Linux kernel, you can pack in a whole CPIO image of your initramfs, and then you can do fun things like sign that in a cryptographically secure way, so you know the initramfs and the code that you're starting up is signed. So UEFI can verify this. Your bootloader, your hardware above that, can verify this. You can do fun things in that, and then you can only transfer ownership to another signed image beyond that. And I'll show you some examples of how you can do that. A lot of people wanna know that they're really running the code they think they're running. And if you sign entire partitions, if you sign entire disk images, you can ensure that, if your bootloader was signed in the first place. So using PV-GRUB, you can't. And a lot of people in the cloud care about this these days. It's a valid thing to care about. You wanna know what you're running. So here's how I talk about this. Chrome OS does this today on Chromebooks. It's a really, really powerful method. They do the booting of the kernel in hardware. They have their own OS, they have their own BIOS, and they can do that. But this is what I wanna do in the cloud. You wanna boot a kernel that's signed, it's self-contained, it has its little initramfs. And then when it boots up, it looks: I have two system images. Tiny system images, base core operating system only. And they're signed, they're secure. And then I pick one to boot, because they both start off the same. I boot it, everything's happy. And then all my stateful data can be overlaid on top of that, and that's accessed on my other, bigger disks, whatnot. But then when I wanna update my distro, when you wanna update things, you write not to your image, you write to the other image. And you write it all in one big block.
You can do diffs here, you can do binary diffs. The Chrome OS guys have a really, really good incremental diff for binary files that can push out tiny bits of data, and you update everything. It's not like a package manager, where you update some files and not other files. So you want to have a whole system image updated at once because, especially with RPMs, you don't wanna stop halfway: it gets upset or something happens with the network, and then you reboot and your system is in an unknown state. So you update your other system, you flip a bit in the partition table and you reboot. Your boot kernel never changes. This is an old, stable, tiny kernel. It says, oh, I need to read this one and boot this one. If this one fails, it reboots again or you reboot the system, and it knows that it failed because the partition table wasn't marked, and it goes back to your other good one. So you have two known-good images, or a good one and a bad one, and then you just ping-pong between the two. Chrome OS does this, and the CoreOS guys do this today as well. It's a really nice way to update your system and keep things secure. Boot kernel, that stateful data down there. And this overlays on top using AUFS, and then you put Docker on here with containers and you have all sorts of fun things. That's why I'm doing this work. That was too fast, yes. They know our room was so big. Okay. It's being recorded. I'm curious, oh, that, that's fine. Did you try fallback and savedefault in PV-GRUB? Did we try fallback and savedefault? No, we did not, because I wanna be able to do this for any type of cloud provider. I don't wanna have to write a new image. In any case, a cloud provider should be providing PV-GRUB. Yes, anyone should provide PV-GRUB, but then I can't check the signature to be sure it's secure. And there's a lot of people out there that wanna know that this really is what it says it is. And that's a very good thing to do.
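The ping-pong between two system images described above can be modeled in a few lines. This is a simplified sketch under stated assumptions, not how Chrome OS or CoreOS actually store their state: the `Slot` class, its `good` and `tried` flags, and the function names are all hypothetical stand-ins for the "bit in the partition table" mentioned in the talk.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    """One of the two system images (hypothetical model)."""
    name: str
    good: bool = False   # set once a boot from this image was confirmed healthy
    tried: bool = False  # set by the boot kernel when it attempts this image


def install_update(target):
    """Write a new image into the inactive slot: it is untested again."""
    target.good = False
    target.tried = False


def choose_slot(preferred, fallback):
    """Decision the small, never-changing boot kernel makes on every boot."""
    if preferred.good:
        return preferred          # already proven, just boot it
    if not preferred.tried:
        preferred.tried = True    # record the attempt before jumping
        return preferred          # trial boot of the freshly written image
    return fallback               # tried but never marked good: it failed


def confirm_boot(slot):
    """Called by the booted system once it is up and healthy."""
    slot.good = True


if __name__ == "__main__":
    a, b = Slot("A", good=True), Slot("B")
    install_update(b)
    print(choose_slot(b, a).name)  # trial boot of B
    print(choose_slot(b, a).name)  # B was never confirmed, so back to A
```

The point of recording the attempt before jumping is that a crash during the trial boot leaves evidence behind: on the next boot the image is "tried but not good", so the boot kernel falls back without any help from the failed system.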
A lot of people care more about security these days, and they wanna know that, especially running in a cloud environment. PV-GRUB has support for measuring using a vTPM now. Oh, what? A vTPM. Yeah. In PV-GRUB, I didn't see that. So then who controls the TPM there? Well, you're in the cloud. I know, so that's why, yeah, that's the problem. So then you have to trust the TPM there. If I just know that this signature matches the signature I was okay with, then you're good. Because we have one of the DM modules. It'll check your partition to make sure that it's valid, and it actually starts reading from it and executing from it while it's checking it. So speed-wise it's much faster. And then if it finds out that it's bad, it just instantly reboots. Again, the Chrome OS guys did this. So yeah, but I don't wanna mess with PV-GRUB. I want Linux to be here and Linux there. Because if I do this type of thing, I can run anywhere. I can run on native hardware. And what people want to do is become cloud-provider agnostic. They wanna take their images and be able to put them anywhere. Not just be able to move between different zones of Amazon, which is great, but be able to move to Rackspace, be able to move to OpenStack, be able to move to their own thing, or be able to take their existing images and migrate them to real hardware. And I'm seeing that happen a lot. A lot of people wanna move to their own hardware because when they get so big, they're tired of writing million-dollar checks to Amazon, and then they wanna move to their own stuff and hire their own people. So I wanna give people flexibility. So I want kexec to work in a Xen DomU. That's why I did this work. And spoiler, it doesn't work. Yet. I got distracted. The kexec people were like, hey, we want signed kexec to work, and then the interface is nasty, and why don't you do this? And I got messed up with that.
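The verify-while-reading idea from the exchange above (start using the partition immediately, check each piece as it is read, bail out on a mismatch) can be sketched as follows. This is a deliberately simplified, single-level hash list using SHA-256. It is an assumption-laden illustration, not the on-disk format of any real device-mapper module, and `VerifiedImage`, `VerificationError`, and `BLOCK` are hypothetical names.

```python
import hashlib

BLOCK = 4096  # block size in bytes (assumed for this sketch)


class VerificationError(Exception):
    """Raised when data does not match the trusted hashes."""


def block_hashes(image):
    """Hash every fixed-size block of the image."""
    return [hashlib.sha256(image[i:i + BLOCK]).digest()
            for i in range(0, len(image), BLOCK)]


def root_hash(hashes):
    """Single value that commits to all block hashes; this is what gets signed."""
    return hashlib.sha256(b"".join(hashes)).digest()


class VerifiedImage:
    """Checks each block lazily, at read time, instead of up front."""

    def __init__(self, image, hashes, trusted_root):
        # The hash list itself is cheap to check once at open time.
        if root_hash(hashes) != trusted_root:
            raise VerificationError("hash list does not match trusted root")
        self._image = image
        self._hashes = hashes

    def read_block(self, index):
        data = self._image[index * BLOCK:(index + 1) * BLOCK]
        if hashlib.sha256(data).digest() != self._hashes[index]:
            # A real system would reboot into the other image here.
            raise VerificationError("block %d failed verification" % index)
        return data


if __name__ == "__main__":
    image = bytes(range(256)) * 40          # 10240 bytes, three blocks
    hashes = block_hashes(image)
    verified = VerifiedImage(image, hashes, root_hash(hashes))
    print(len(verified.read_block(0)))
```

Because only the blocks that are actually read get hashed, boot does not have to wait for a full scan of the image, which matches the speed argument made in the talk.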
And then the libraries: it was written for, I think, Xen 2 and a heavily modified 2.6.16 kernel's patches. And the Xen library is very weird. So I'm messing with that. So I wanna have the code up and running. It's barely working. I wanna have it working properly by the developer summit, which is in Edinburgh, right? So I'd like to show something working with it. That's it. That was fast. Does that make sense? Why we wanna do this? And why I wanna do it in a neutral way? I mean, do you have any objections to having DomU work with kexec? Because then crash dump works. And crash dump is really good. And I wanna, hey, yeah. So all the distros care about that. And a lot of people care about that. That would be cool. Well, how do you get the crash dump image out? Only the second kernel boots. Yeah, you're right. You're right. You're right. But where does it write the image to? Where does crash dump write the image to? Wherever it's configured to. Okay. So it can write to a block device. You can write to a block device. You can write over the network. You can stream it over SSH. It's completely scriptable. Oh, cool. But this idea of using a self-contained kernel with this file system image is a really powerful thing. Right now the kernel's kinda hard to build that way. We have some, you have to build the kernel twice. Which is kind of a pain. Or you build all the drivers into the kernel. Which is. Are you okay with still using PV-GRUB to get to the first boot kernel? Yes, that's fine. I know I have to do that. How else can I do that? I mean, yeah. So, when it comes to measuring that first boot kernel, you're kind of still trusting PV-GRUB. I still have to trust PV-GRUB, yes. That's fine. And the fun thing is, also, when I'm doing these reboots from here to here to here, I don't have to go all the way back out to EC2 or to Xen and reboot the whole thing again. So I can do much, much faster boots. I can do this whole thing running on KVM in three seconds.
Two kernel boots. Whole file system up and running. On my laptop running in QEMU, I think it's five seconds. And I got some really big hardware to test this on, and it was like two seconds. For two full kernel boots: load the file system, verify, checksum everything and go again. Fast, fast boots. The BIOS on those servers takes half an hour. You don't want to mess with that. You never want to reboot. And then, yeah, you don't want to have to worry about the latency involved in your cloud provider doing the reboot as well. Any other questions? That was too fast. I'm sorry. Oh, I was allocated 20 minutes. I did 18. So that gives you two minutes to sing a song. No, I don't sing. I don't think I have anything else. Oh, yeah, it works. Oh, I have more slides. And I know, yes, thank you. I didn't want to start from the beginning. And the obligatory penguin picture. I talked to Karen, or Kate, she is one of the kernel developers. She gave me this penguin picture years ago and she said she'd give me new ones now. She took these down in, that's in Chile, but it was on her way to Antarctica. So those are actually a Linux kernel developer's penguin pictures. Not mine. Cool. Any questions? Anything? So, each, I want to try that on a big instance to see how well it will work. I'll mess with that later today.
Let's have a good hand for Greg, okay? All right, thank you.