Okay, we're going to get started with our next session. We've entered the longer-session part of our day, so these sessions are now going to be 40 minutes until later this afternoon. So without any further ado, I'm pleased to present one more speaker from Red Hat, who will be talking about the merger of VMs and containers. Thank you, Roman.

Hi, everyone. Michael, maybe you could open the door. Glad to see you all here. I'm Roman Mohr, I'm working for Red Hat as a software engineer, and today I want to talk to you a little bit about how we can bring pet VMs and containers together. For all of those who don't know what pet VMs are: pet VMs are basically the opposite of cattle VMs. Whenever you have an application inside a VM and you think a lot about how you can migrate it or keep it up without shutting down the VM, then you're basically talking about a pet VM. On the other hand, you have containers, or cloud VMs, where you don't care that much about the state of the VM; it's more about spawning the VM again if it goes down.

But now let's start with the actual topic. First, I want to talk a bit about how we can arrange pet VMs and containers together with different management solutions. Then I want to tell you a little bit about why we even care about it. After that, I want to show you how typical virtualization and container stacks look on the host. Then, finally, how one could try to bring VMs into Kubernetes, followed by the more detailed question of how we can bring a VM into a container if we want to. And finally, I want to give a small overview of KubeVirt, the project I'm working on, which tries to combine all the insights I want to present to you first.

So let's start with a little story about how we can manage VMs and containers together. Suppose you're an administrator in a medium-sized company, you're used to your oVirt or VMware virtualization stack, and an engineer comes to you and says: I have this new containerized, scalable application I want to host with Kubernetes. Could you give me, I don't know, maybe ten VMs, make sure they're running on different hosts, and I will do the rest. The guy goes away, and a few days later someone says: hey, the application isn't up, can you take a look? You look inside all the containers and VMs and you see the guy really knew what he was doing: he created his Kubernetes deployment in there, made it highly available, made sure that his application can scale up and down. But what this engineer is doing looks pretty much like what you're doing on the host level already with your virtualization stack: he's collecting metrics, he makes sure that specific components are always running, he's monitoring all that. So you start thinking that maybe you, as the infrastructure person, should also care a little bit about what's going on there, especially if something isn't right, and you start looking more closely at how you could manage that. One thing which is obvious there is that you have different layers of services in there, and they're all managed by different schedulers.
For instance, Kubernetes has its own scheduler, oVirt has its own scheduler, VMware has its own schedulers, and even the applications inside of Kubernetes do their own kind of scheduling when they scale up or down based on traffic. And when we're doing virtualization like we have it here, with containers or a container management framework inside, it's pretty hard for, say, the DevOps team to just care about the container infrastructure. You basically have to go back to the host level and the VM level to make sure that your containers are properly placed across the stack.

So as an infrastructure person you start thinking about removing that layering to get more out of your VMs, more performance, to make the placement of the services and containers easier, and you might think about placing them next to each other on the same host. And there you can run into a lot of problems. Most of the daemons of the different management solutions want to completely take over the host, so even just trying to place two different management solutions on the same host is very, very hard.

Another thing you could do is split your data center. You keep the traditional data center where you care about all your VMs and only put them on one kind of host, and you put all the containers on another type of host where you manage everything with, let's say, Kubernetes. That solves at least some of the problems we mentioned before, but you still have all the duplication around monitoring everything and collecting the metrics, and you have two different systems which both need to be kept highly available: oVirt or VMware needs to be HA, and Kubernetes needs to be HA. You can do that, and it solves some problems, especially regarding host utilization within each part of your data center. But if, for instance, you have a lot of spare resources on the VM part of the data center, you can't just move containers there. So it's still not optimal.

Another approach would be: let's start running VMs inside containers. The advantage of this approach is that containers make far fewer assumptions about lifespan than pet VMs; if one goes down, you can just start it again. So containers are a kind of lower layer which all the services could use, and a VM which, for instance, needs to migrate can be migrated out of its container before the container is actually removed. That could fit pretty nicely into one management solution. Or you could go even a little bit further, but that is more the cloud approach: you could then start putting containers inside those VMs again. For instance, Google Compute Engine is doing that, if I remember correctly. I think they're using Borg, which creates containers in which they start VMs, and these are the VMs you can rent. And inside those, people then deploy Kubernetes on top. There it completely makes sense, because the assumptions are different: you're just talking about cloud VMs, and there is also the separation of concerns. Google as a cloud provider doesn't care what the application does, but you as an administrator in a small company definitely have to care about it. So that is probably not the right approach here, but it's a nice thing you could do.
So to sum it up a little bit, here are some of the reasons why we really care about bringing all that together. For one, there are applications which have specific needs. There are those VMs and services which need really high performance, and if you have a lot of them, virtualization overhead might matter. Then you have your monolithic applications, I don't like calling them legacy, which in most cases are still the cash cows of your company, so you can't just leave them behind. And then you have the microservice-based applications: they don't care much about performance, but they are already cloud-ready. And you have your DevOps teams, which just want to scale up and down however it pleases them. They don't want to ask administrators for new VMs, or whether a container is allowed to run on the same host or the same VM, and so on. They just want to deploy everything and run it.

Then you have management needs like hyperconvergence. As you've seen with the different approaches presented before, when you have a lot of different frameworks, you have a lot of different components which are very hard to administer, especially when you run them on the same hosts. You would end up with a lot of small clusters which are there for different things: one cluster to manage your VM infrastructure and make it HA, another little cluster just to give you the network provider for that manager, then a network provider for your Kubernetes stuff, and so on. It's really hard to get all that together when you try to run two management layers inside each other.

Another thing, I've already mentioned it, is high availability of all the services. Kubernetes, for instance, has a very nice concept of high availability built in. So instead of trying to teach your cloud-ready applications to work with your virtualization environment, you could rewrite your virtualization infrastructure so that you can just deploy it as containers and make use of the HA concepts which you get from Kubernetes out of the box. And when we can achieve the goal of combining those two, it saves a lot of money: fewer components to manage means fewer things which can go wrong, and it makes everything easier.

But that's not the only kind of benefit we can get. Another observation is that when you look at the host stack of a virtualization infrastructure and of a container infrastructure, they do similar things in slightly different ways. In the oVirt stack, for instance, you have VDSM, the host daemon which does monitoring of VMs, storage mounting, and disk provisioning, and there are network providers hooked into VDSM. In Kubernetes you have the kubelet, which is the host daemon of Kubernetes, and it does pretty much the same things: it does monitoring, it does storage, it makes sure that the network is there. Of course, in the container world you don't care that much about layer-two networking, but it also doesn't hurt if it happens to be layer two, as long as it still provides what the containers need. So you could, for instance, use one networking stack for both, for VMs and for Kubernetes. When you go down a little bit further there are also some differences, of course. On the virtualization stack you have something like libvirt or QEMU.
Libvirt is very VM-centric, and there's a lot of knowledge in it about how you keep VMs alive and how you can migrate them. On the Kubernetes stack you have something completely different: the container runtime implementation. But if we can somehow manage to keep the knowledge which libvirt provides in the picture, and still run everything in pods, we might end up with a nice solution.

So we actually tried a few different approaches, or rather, we didn't try all of them, but we at least played through a few approaches for combining Kubernetes and pet VMs. One very obvious approach would be to just try to extend Kubernetes to support everything a VM would also need. In addition to pods it would then also be able to start VMs; you could say, okay, I want migration control in there, and hot plugging, and so on. We could at least try to add all of that to Kubernetes. But with that we would basically destroy Kubernetes. Kubernetes is so nice because it doesn't have to care about all this; it's a very nice low-level sandbox which is really well suited to the cloud and to providing basic rollout of services.

Another approach would be, instead of teaching Kubernetes everything about VMs, to kind of mutate pods into VMs by just replacing the host daemon of Kubernetes, that is, the kubelet. There are two projects out there which are trying to do that: one is Hyper and the other one is virtlet. They both replace the kubelet, and instead of starting a pod you're basically starting a VM; in the case of Hyper you then run the pod inside that VM, and in the case of virtlet you just get a plain VM where you can do whatever you want with it. That kind of works, but you don't have any real way of expressing your virtualization needs at the cluster level. You kind of have to fit everything into the pod metadata and extract it again on the other side, and it feels hacky. It gets even harder when you think about NUMA support, migrations, and things like that; you can't easily pull that back up to the cluster layer for scheduling and represent it nicely.

Another approach you could try is to keep the kubelet as it is and just have one container inside the pod which spawns a QEMU process in the end. A VM is nothing more than a QEMU process; of course you have all the migration possibilities and everything, but QEMU does all that for you, it's just a process in the end. But then you're still left with the problem: I have the QEMU process in the container, but how do I manage it? I want to do a hot plug now, so I have to connect to it, so you have to put management infrastructure around it. The simplest thing, of course, would be to also put the management logic and everything inside a container next to the pod or into the pod. But that's kind of hard to manage, and again, you don't get any benefit from Kubernetes for managing all that.

So what we finally came up with is trying to build VM management on top of Kubernetes. And Kubernetes already helps us there, because the Kubernetes developers have seen that there are so many people trying to do their own things with Kubernetes that they can't integrate all of it into Kubernetes itself. So they are decomposing it, and one thing they introduced, for instance, is third-party resources.
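To make the mechanism a bit more concrete, here is a minimal sketch, in Go, of posting a custom VM object to the Kubernetes API server and watching it for changes, the same way pods are posted and watched. The API group, the resource path, and the VM fields here are hypothetical and only illustrate the idea, not the actual KubeVirt API.

```go
// Sketch: posting and watching a custom "VM" object through the Kubernetes
// API, the way a third-party resource is used. Group/version/resource names
// and the VM fields are made up for illustration.
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// VM is a simplified custom object, analogous to a pod definition.
type VM struct {
	APIVersion string            `json:"apiVersion"`
	Kind       string            `json:"kind"`
	Metadata   map[string]string `json:"metadata"`
	Spec       struct {
		Memory   string `json:"memory"`
		Cores    int    `json:"cores"`
		NodeName string `json:"nodeName,omitempty"` // filled in once the pod is scheduled
	} `json:"spec"`
}

const apiServer = "http://localhost:8080" // assumes an unauthenticated local API server

func main() {
	vm := VM{APIVersion: "kubevirt.io/v1alpha1", Kind: "VM",
		Metadata: map[string]string{"name": "testvm", "namespace": "default"}}
	vm.Spec.Memory = "512Mi"
	vm.Spec.Cores = 1

	// Post the VM definition, just as kubectl posts a pod definition.
	body, _ := json.Marshal(vm)
	resp, err := http.Post(apiServer+"/apis/kubevirt.io/v1alpha1/namespaces/default/vms",
		"application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	resp.Body.Close()

	// Watch for changes on VM objects; a controller would react to each event.
	watch, err := http.Get(apiServer + "/apis/kubevirt.io/v1alpha1/namespaces/default/vms?watch=true")
	if err != nil {
		panic(err)
	}
	defer watch.Body.Close()
	scanner := bufio.NewScanner(watch.Body)
	for scanner.Scan() { // one JSON event per line
		fmt.Println("event:", scanner.Text())
	}
}
```

Every component described below follows this same listen-and-react pattern against the API server.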
Third-party resources mean that, just like with a pod, you can register your own object kinds in the API server. You can post them, and you can listen for changes on those objects as well. We started doing that with a VM object. Even better, Kubernetes extracted the core code for doing that into a kind of SDK, so we can really reuse the core Kubernetes code for expressing all this. And then we can also create our own host daemon, which works by the same principles as the kubelet but only has to care about VMs. I have some slides later on where you can see how we're actually doing the placement. Since we have our own daemon now, we can let it drive libvirt to do the actual virtualization work. So we can basically build our own pipeline next to the Kubernetes pipeline, kicking in after a pod has been scheduled onto a node, and we can reuse the complete Kubernetes infrastructure for that.

As I said, the big question is how we place the VM inside the running pod, because that's where you end up in the end. You let Kubernetes do what it does best: you just let it start a pod. When the pod is running, there's still no VM inside. Only once the pod is running is your host daemon for the VMs notified that there is now a pod, and it tries to start the VM. So you somehow have to bring the VM inside the pod. And there you can ask yourself a few interesting questions. First of all, does it even make sense to have a VM inside a pod? That was a very hot topic when we started with Kubernetes, whether it's even possible and whether it makes sense.

To answer these questions, we need to have a look at what containers actually are. From my perspective, containers are two things. One, they're a tool for resource isolation, for security reasons, for controlling resource usage. The other thing is that you can group your application out of different containers, so they're also a logical view on your application. When you look at the technical details, containers are just cgroups and namespaces. So the next question is: what does Kubernetes really care about? Does it care about the technical side, or about the logical representation? And it's kind of both. Of course, in the end you have containers, you have namespaces and cgroups, but Kubernetes doesn't care that much about how isolated your containers actually are. You can, of course, use hostPID with your pods; you can use hostNetwork if you want to. They would still be containers and still be pods, and for Kubernetes that's fine.

That might not sound like much, but for us it's a very, very important fact, because it gives us some freedom in how we decide to bring VMs into containers and how far we want to bring them in. For instance, when I think about device hot-plugging, we can create our own cgroups just for devices, where we allow specific things; it doesn't necessarily have to be the devices cgroup of the pod. On the other hand, the PID namespace of the pod can be very important for our VM, just to ensure that when the pod goes down, the VM also goes down. So we can pick the best of both worlds here, whatever we want. And furthermore, when we go back to the VMs, libvirt already puts VMs halfway into containers. So it's not only possible, it's what we kind of have already anyway, which is nice at first glance.
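To illustrate the device hot-plugging point from a moment ago, here is a rough sketch of giving a QEMU process its own devices cgroup, separate from the pod's cgroup. It assumes cgroup v1 with the devices controller mounted at /sys/fs/cgroup/devices; the paths, the device numbers, and the PID are purely illustrative.

```go
// Sketch: a dedicated devices cgroup for one QEMU process, independent of the
// pod's cgroup, so device access can be granted for just this VM.
// Assumes cgroup v1; all concrete values below are illustrative only.
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

func main() {
	qemuPid := 12345 // hypothetical PID of the already running QEMU process
	cg := "/sys/fs/cgroup/devices/kubevirt/testvm"

	// Create a dedicated cgroup for this VM only.
	if err := os.MkdirAll(cg, 0755); err != nil {
		panic(err)
	}

	// Allow access to /dev/kvm (char device 10:232).
	if err := os.WriteFile(filepath.Join(cg, "devices.allow"), []byte("c 10:232 rwm"), 0644); err != nil {
		panic(err)
	}

	// Move the QEMU process into this cgroup; it stays in the pod's other
	// cgroups and namespaces, only device access is governed separately.
	if err := os.WriteFile(filepath.Join(cg, "cgroup.procs"), []byte(fmt.Sprint(qemuPid)), 0644); err != nil {
		panic(err)
	}
}
```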
At the same time, libvirt's own cgroup handling is a bit of a problem for us if we want to decide freely where to put the VMs, so we are very happy that libvirt is an extremely flexible project: you can disable its attempts to manage the cgroups itself. Libvirt has even more hooks. For instance, you can tell libvirt: when I want to start a VM, please don't call QEMU directly, call my wrapper binary instead, so it can do additional things first. And that's the final piece we need to bring our VMs into the pod, because now we can freely decide where to start our QEMU processes. In which cgroups and in which namespaces they run is completely up to us, and libvirt still provides all the high-level virtualization functionality we need.

All of that combined finally leads to KubeVirt, the project where we tried to put these pieces together. Before we look at how we start a VM in KubeVirt, it's probably good, for everyone who doesn't know it, to have a look at how plain Kubernetes starts a pod. There is not so much going on, actually. You have a tool called kubectl. With kubectl, you just post the pod definition to the API server. Once that's posted, the scheduler sees: oh, there is a new pod definition, I have to react to it and schedule it somewhere. After it has done that, the kubelet sees from the API server: oh, now there is a pod scheduled to my node, I have to take care of it and start it. And that's basically all. That's one of the things I like very much about Kubernetes: at first glance it looks very complex, what it can do and how it works, but in the end this is already one of the core things Kubernetes does for you, and it's really simple.

In KubeVirt, since as I said we're just trying to build on top of Kubernetes, we only added a few things there, mainly two components: the virt-controller and the virt-handler. The virt-controller is a cluster-wide daemon which watches for new VM specifications being posted. The virt-handler is our host daemon, which sits next to the kubelet. When you now try to start a VM, you're not posting a pod, you're posting a VM definition, which we implemented based on third-party resources. That's what the virt-controller sees in its VM watch loop. The virt-controller creates a pod out of it and posts that pod to the API server. That's what the scheduler sees, and it schedules the pod. The virt-controller then sees that the pod is scheduled and running, and updates the VM definition to record that it was assigned to a specific pod on a specific node. Now the virt-handler on that node says: oh, this is a VM which is meant for me.

So we're basically following the same principles Kubernetes is already using for pods, we're just extending them so that we can schedule something on top of the pods. The benefit is that we let every component do what we think it does best. Kubernetes just has to take care of pods. Our virt-controller does the translation between the two layers, Kubernetes pods and our VMs. The virt-handler itself doesn't even have to know anything about pods; it just watches the API server for VMs which appear and are meant to be assigned to it, and it can then pass everything on to libvirt to start the VM. It doesn't have to know anything about cgroups or namespaces or even pods, really just plain VM management.
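As an illustration of the translation step the virt-controller performs, here is a deliberately simplified sketch. The types are stripped down and the launcher image name and arguments are made up; the real controller builds a complete Kubernetes pod specification through the API.

```go
// Sketch: the kind of translation the cluster-wide controller performs.
// Types are simplified stand-ins; the image name and arguments are hypothetical.
package main

import "fmt"

// VM is the custom object posted by the user.
type VM struct {
	Name   string
	Memory string
	Cores  int
}

// Pod is a stripped-down stand-in for a Kubernetes pod definition.
type Pod struct {
	Name   string
	Labels map[string]string
	Image  string
	Args   []string
}

// vmToPod wraps a VM definition into a pod that Kubernetes can schedule.
// The pod only carries an infrastructure ("launcher") container; the actual
// QEMU process is placed into it later by the host daemon.
func vmToPod(vm VM) Pod {
	return Pod{
		Name:   "virt-launcher-" + vm.Name,
		Labels: map[string]string{"kubevirt.io/vm": vm.Name},
		Image:  "kubevirt/virt-launcher", // hypothetical image name
		Args:   []string{"--vm", vm.Name, "--memory", vm.Memory, "--cores", fmt.Sprint(vm.Cores)},
	}
}

func main() {
	pod := vmToPod(VM{Name: "testvm", Memory: "512Mi", Cores: 1})
	fmt.Printf("%+v\n", pod)
	// The controller would now post this pod to the API server, wait until the
	// scheduler assigns it to a node, and then record that node in the VM object
	// so the host daemon on that node knows the VM is meant for it.
}
```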
And libvirt itself just does what it always does: it starts the VM or updates the VM. But since we are using this emulator wrapper in libvirt, before QEMU is actually called we can take all the QEMU arguments, pick the namespaces and cgroups on the host we want to use, completely isolated from all other components, and start the QEMU process there.

I think I was way too fast, because I'm already at the end. We have a KubeVirt demo up and running on the internet. You can just try the one-line demo; it will spawn a VM for you, and you can play around with our Kubernetes experiment. You can also visit us on GitHub, and we'd be happy if you contribute, or try to understand what we're doing, and give us feedback on whether you like it or not. Thank you, and questions.

So, are there any questions? Yes, please. [Audience question, partly inaudible.] Sorry, I can't hear you. So the question was: since we're using the namespaces of the pod for the QEMU process, what happens if the pod dies or the VM dies, what's going on there? In the pod where we move the VM, we have a shim process, which is basically just an infrastructure process that keeps the pod alive for us, and it also detects when the QEMU process is started there. When Kubernetes wants to stop the pod, it first sends a SIGTERM; you can react to that signal within a grace period you can specify, and we forward that SIGTERM to the QEMU process. So the QEMU process gets the normal shutdown signal it's used to reacting to. Of course, if it doesn't shut down fast enough, it will get a SIGKILL, or rather the shim process will get a SIGKILL, and since the PID namespace is then gone, the VM gets killed too.

[Another audience question, inaudible.] The question was what happens to the libvirt XML: in the libvirt XML, when libvirt manages the cgroups, you can specify how much CPU and memory the VM can get, so what happens when we disable that? In that case, nothing; it just gets ignored, because we disabled it in the configuration. Kubernetes itself provides the same mechanism: you can add the resource limits to the pod, and everything from the VM definition gets mapped onto the pod, so the pod gets them too.

Okay, I think that's it. Thank you for your attention.