Hi everybody, I'm Stu Gott and this is a practical guide to KubeVirt — hopefully a guide for the rest of us. Just to state the current state of the world: containers are increasingly becoming the de facto standard for how we package applications, and Kubernetes and OpenShift are becoming the de facto way we run them. But that's for new applications. When you start talking about virtual machines, I've heard some people say they're going away. Well, no, they're not. For business reasons it's hard to rewrite some applications, and for technical reasons it may be impossible. For instance, if you need Windows in a machine, or if you attended the internals talk yesterday, that's not something you would put directly in a container.

In today's world we traditionally have separate management infrastructures for these two stacks, which unfortunately means underutilizing some hardware, because you just can't mix them. That's where KubeVirt comes in. It's a technology that unifies these two infrastructures so that you can build, modify, and deploy virtualized applications alongside your containers — in other words, put virtual machines right under your Kubernetes projects.

We do this by using a custom resource definition that we drop into existing Kubernetes clusters. This is really important to say: one of the requirements we set for ourselves is that we do not require modification of the Kubernetes cluster before we deploy. In other words, we can't change container runtimes, we can't add system accounts, or what have you — it all has to happen as part of our deployment or it can't be done. By doing this we extend the Kubernetes infrastructure in as Kubernetes-native a way as possible, and the virtual machines actually run inside a container. Some solutions out there, such as Kata Containers and, I believe, Virtlet, actually modify the container runtime. That's something we're explicitly trying not to do, because we don't want to be modifying the cluster ahead of time. In the future that restriction might be lifted, since dynamic container runtimes may come to Kubernetes, but for now it's a hard and fast rule and one of the reasons we're doing it the way we are. And by leveraging the existing ecosystem, teams can use the proper tool for their solution — whether that's a virtual machine or a container — and put it right into their CI/CD pipelines.

As for how we implement this, we use a custom resource definition. I've got an example of one over on the right. It's basically just a YAML file, for those who haven't seen this before; for those who have seen Kubernetes constructs, it should look pretty familiar. The only thing special about it is that the kind is VirtualMachineInstance — virtual machines have their own kind. This gives us the ability to express all the common virtual machine parameters such as memory, CPU, and the like. And because we implement this as a custom resource definition, we also inherit RBAC rules, so users are only allowed to modify things in the namespaces they're authorized for, and what have you. So here's a little bit of the workflow. This is a busy slide, so let me take a minute to explain it.
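Before walking through that slide, here's a minimal sketch of the kind of VirtualMachineInstance definition just described. It roughly follows the kubevirt.io v1alpha2 API that was current around the time of this talk; exact field names have shifted between KubeVirt releases, and the disk and claim names are placeholders.

```yaml
apiVersion: kubevirt.io/v1alpha2
kind: VirtualMachineInstance
metadata:
  name: vmi-demo
spec:
  domain:
    resources:
      requests:
        memory: 2Gi          # common VM parameters live under spec.domain
    devices:
      disks:
      - name: rootdisk       # attach the volume below as a virtio disk
        disk:
          bus: virtio
  volumes:
  - name: rootdisk
    persistentVolumeClaim:
      claimName: demo-pvc    # placeholder PVC holding the VM's disk image
```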
When the user posts this custom resource to the system — a VirtualMachineInstance — it's really just a record in the etcd cluster. We have a controller, virt-controller, which is monitoring for changes to these custom resources, to VirtualMachineInstances in this case. When it sees one, it schedules a pod, and that's all it does at this point — just schedule a pod. You can see that in the third step here. virt-controller is a cluster-level resource, so its only job is to schedule pods. Then on each of the individual nodes, virt-handler is running. That's another controller we have. It looks for pods that carry a special label, so it knows it owns that pod, and it then takes care of starting the virtual machine inside of it. Now, there's a little bit of hand-waving there, of course, because I said we start a virtual machine in a container that's already running. What we're actually doing is running a daemon called virt-launcher inside that pod, and it's the one doing that work — just some full disclosure there.

As far as scheduling goes, as I said, virt-controller schedules a pod. That means we're literally using a pod, and pod rules, to decide where virtual machines end up being placed. Affinity, anti-affinity, labels, selectors — all of the constraints you can put on a Kubernetes pod still work, and you can even use a custom scheduler if you need to. And because the applications inside the virtual machines are leveraging a pod, all existing Kubernetes constructs such as Services and Routes still work — we'll get a bit more into what those are later. We use labels on the Service itself to designate which pod the Service belongs to, or where to route the packets, basically.

So virtual machines live in pods. That's transparent to higher-level management systems, and technically it's no worse than things were before this project. Because virtual machines leverage pods, any labels on a new virtual machine record are translated over to the pod — we need that for scheduling, of course, and to match things like Services. CPU and memory resources are matched from the virtual machine's definition to the pod, so we're not over-allocating what we request. And affinity and anti-affinity work as I described; there's a small sketch of this below.

As far as storage goes, that's where the rubber really meets the road. We're using persistent volumes for this at the production level. What that basically means is that any existing storage backend you already have for your Kubernetes cluster, we can take advantage of, and there's a one-to-one mapping between a persistent volume and a virtual machine disk. By doing this, of course, we benefit from all the existing ecosystems that are out there — and there are a lot. For getting the disks in place, there's a sister project — this is actually not a KubeVirt project itself — the Containerized Data Importer. I'm going to take advantage of it in the demo in a few minutes, because it's just that awesome. It's something that allows you to take an existing virtual machine image, a raw disk or the like, and import it directly into your Kubernetes cluster on the fly.
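Before we get to the importer, here's a small sketch of how those pod-level scheduling hints and resource requests might appear on a VirtualMachineInstance. Again this assumes the v1alpha2-era API, and the label, node name, and claim name are placeholders.

```yaml
apiVersion: kubevirt.io/v1alpha2
kind: VirtualMachineInstance
metadata:
  name: vmi-pinned
  labels:
    tier: database                     # copied onto the virt-launcher pod, so Services can select it
spec:
  nodeSelector:
    kubernetes.io/hostname: node01     # ordinary pod scheduling constraints apply to the VM's pod
  domain:
    resources:
      requests:
        cpu: "2"
        memory: 2Gi                    # matched onto the pod so the VM isn't over-allocated
    devices:
      disks:
      - name: rootdisk
        disk:
          bus: virtio
  volumes:
  - name: rootdisk
    persistentVolumeClaim:
      claimName: demo-pvc
```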
The importer is something we're obviously going to need if we're going to pull this off, because getting images in has been complicated for us in the past. Like KubeVirt, it's a declarative Kubernetes utility: you have controllers and operators monitoring for resources, and they take action when those resources show up. There are two use cases here: you can designate an HTTP URL to download an image from, or — a second use case that I won't be showing — you can use a read-only namespace within Kubernetes to copy golden images into your user's namespace. That way users don't modify the original, with all the goodness that comes with that.

As far as the network goes, we're using the pod network for the virtual machine. That's both good and bad. The good, of course, is that the virtual machine can communicate with any existing container resource as it currently exists, and we can expose services from the virtual machine using Services and Routes, as I've mentioned, to expose specific ports on the virtual machine to the outside world. We're looking at alternative networking options such as multiple networks and other variants, but right now what we're using is just a tap device inside the pod. The unfortunate part is that we lose the ability to do live migration. In the beginning we actually had libvirt outside of our pod — one libvirt per node — and that allowed us to migrate a virtual machine between different nodes in your Kubernetes cluster. The trouble with that was a bit of a rabbit hole: we had issues with PID namespaces and the like, we were violating assumptions, and it just wasn't a good model. So instead we run one libvirt per pod — libvirt actually lives inside the pod we deploy our virtual machine in. What that unfortunately means is that libvirt has no network access to the rest of the cluster or to other nodes, so we lose the ability to do live migration for now. Once we implement other networking options, we can reintroduce it.

Looking at the virtual machine client tool, this is virtctl. One of the things I've glossed over so far is virtual machine instances versus virtual machines. These are two different kinds of records. A VirtualMachine is essentially a static template for a VirtualMachineInstance. The point being: in the Kubernetes world, if you start or stop a pod, you're basically creating or deleting a resource, and that's what our VirtualMachineInstance is — it's the analogue of that. But we recognize that this doesn't translate well to the traditional virtualization world of RHEV, oVirt, and the like. For people coming into this ecosystem, or the tools we're trying to translate to the system, that doesn't really work well. So we created the VirtualMachine object, and that's what I mean when I talk about starting and stopping: you issue a virtctl start command on a VirtualMachine and it kicks off an instance. virtctl also allows us to connect to the console or use VNC in order to interface with your virtual machine and get a snapshot of what's going on, because you're obviously going to need that. There are two ways to use the client: either as a standalone command, which is what I'll be using, or as a kubectl plugin.
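As a rough illustration of that VirtualMachine-versus-VirtualMachineInstance split, here's what a VirtualMachine "template" object might look like in the v1alpha2-era API; names are placeholders and exact fields vary by release. Something like `virtctl start demo-vm` would then flip the running flag so the controller creates the corresponding instance, and `virtctl console demo-vm` or `virtctl vnc demo-vm` would attach to it.

```yaml
apiVersion: kubevirt.io/v1alpha2
kind: VirtualMachine
metadata:
  name: demo-vm
spec:
  running: false                 # virtctl start/stop toggles this; the controller creates or
                                 # deletes the VirtualMachineInstance to match
  template:
    metadata:
      labels:
        kubevirt.io/vm: demo-vm
    spec:
      domain:
        resources:
          requests:
            memory: 2Gi
        devices:
          disks:
          - name: rootdisk
            disk:
              bus: virtio
      volumes:
      - name: rootdisk
        persistentVolumeClaim:
          claimName: demo-pvc
```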
As a plugin it comes straight off of kubectl. And now it's time for a demo. Real quick, before I do that, I'd like to explain what my system looks like. This slide, other than being an example of something complicated, is what my development environment looks like. Inside the physical machine we're actually running — and I'm sorry, Dan, I have to say it — a Docker cluster. Nearly got away without saying it, but it's on the slide. Inside of the D word, Docker, we're running a Vagrant instance. The reason we're doing this is to streamline development, so that everybody's machine looks the same and we get consistent builds and the like. Unfortunately, it adds a little bit of complexity that I can't get around when showing this as a demo. We'll be using kubectl commands directly from the physical machine. There's a little sleight of hand where we proxy those calls through the different layers here and down to node01. But when I start working on the networking, the edge of the light gray box — node01 — is where your node ports actually terminate, and I can't reach them from the physical machine. So I had to explain that before we get into this.

So if we start this off — I think I can just hit... I'm actually just running a QEMU instance here. Let's see if I can make this bigger. I don't want to do that. So I'm just booting a QEMU instance that has two gigs of RAM and just a network device — a standard CentOS machine. It's logging in to show what the environment looks like: a standard, run-of-the-mill CentOS machine. All I did was take a 10 gig image dd'd from /dev/zero and run a QEMU install onto it, and of course re-skin it with the DevConf logo so that we'd have something recognizable. So, killing that off. I'm now going to start a simple HTTP server here using just Python — which I wouldn't really recommend, but it works great for a demo. I'm going to use port 9090. When you run this, it exposes all the files in this directory, one of which is disk.img, over port 9090 as a web server. That will become important in a minute when we start using the Containerized Data Importer.

So here, this is the Containerized Data Importer. What just happened is that I'm using the git tree, as you can see. There's a little bit of cruft there from the video — ignore the last part of the argument. All it is is a pointer to my git repo of the Containerized Data Importer, and all I've done at this point is run make manifests. So it's just a straight git tree you can check out and run directly, and all I did here was deploy these different pieces: a service account, the cluster roles that are needed to actually perform these actions, and of course the controller that is monitoring for persistent volume claims that match — they're actually annotated; I'll show that in a minute. So here's what the persistent volume claim looks like. We're using annotations, and that's all the Containerized Data Importer needs in order to recognize the persistent volume claims it's supposed to take action on. As you can see, I've got port 9090 and disk.img. This contrived IP address actually points back to my bare metal machine when I ran this demo. And of course the key-value pair: the key is the kubevirt.io storage import endpoint annotation, telling the Containerized Data Importer where to go to fetch this image.
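Here's a sketch of what such an annotated claim might look like. The annotation key shown follows the early CDI convention the talk refers to, but it has changed across CDI releases, and the IP address is a placeholder for the demo machine.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: devconf-pvc
  annotations:
    # Tells the CDI controller where to fetch the disk image from.
    # (The exact annotation key has varied across CDI releases.)
    kubevirt.io/storage.import.endpoint: "http://192.0.2.10:9090/disk.img"
    # kubevirt.io/storage.import.secretName: my-secret   # only needed for authenticated endpoints
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```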
The commented-out secret name I obviously don't need, because I'm just using Python's SimpleHTTPServer — it will serve up whatever file it sees. So I'm going to go ahead and create that. That's created the persistent volume claim already, and we'll look at the pods here real quick to show that's the case. Now, there ends up being a little bit of lag here, and that's an unfortunate side effect of our current implementation — and as you can see, it's now running. We're running the upload for the Containerized Data Importer directly through the Kube API server. In this case that's a 10 gigabyte image, so we're moving 10 gigabytes through the Kube API server, and that's causing lag. I'm sorry about that; it is what it is. In the future we're going to re-implement that as its own service endpoint and use different solutions for authentication, so that we don't have that particular issue. Checking the logs here, you see the import has begun.

So let's look at the virtual machine instance itself. This is what's actually going to use the persistent volume claim we're creating right now. This is basically the bare minimum I would really want to define in the first place: I've got two gigs of RAM, and I've got a persistent volume claim, which in this case maps back to devconf-pvc — the persistent volume claim we're creating with the Containerized Data Importer as we go. So, highlighting some of that here: two gigs of RAM — sorry, I shouldn't have paused at that point; I'm basically repeating what it's doing. So quit out of there, and then we'll check on the persistent volume claim again to see if it's up yet. Still not. But for now, let's look at the services. We don't need to wait for the persistent volume claim to be instantiated in order to instantiate the service.

So this is what it looks like. At this point we're basically just going to use port 22 as our SSH service, because SSH is already running on the CentOS box, and we're going to expose port 30000 as a NodePort, which means that on the outer edge of the light gray box we'll be using port 30000 for SSH. Over here in devconf.yaml — I'm highlighting the incorrect thing; what I actually wanted to show is the label here, devconf.us: demo. That's just an arbitrary label that I put on. The selector on the service, devconf.us: demo — that key-value pair is what indicates that this service should match that virtual machine. That's all you have to do. So we can go ahead and create that, and it's up. I'll show that: we've got a cluster IP of 10.99.134.244, and the port we're exposing is 30000. And we can check the endpoints real quick — services always have an endpoint, which is where they map to on the other end. At this point, of course, the endpoint we're mapping to is none, because we haven't created the virtual machine that maps to this yet.

So let's check again, and it looks like the import has now completed. Let's check the logs real quick to make sure everything went okay — and it did, right there: import complete, down near the bottom. We don't need to worry about the warning about the file, because we didn't use that; we used HTTP. And here is the persistent volume itself that we're bound to, and it is mapped to the devconf-pvc.
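A sketch of the kind of Service being described here: a NodePort service whose selector carries the same label that was put on the virtual machine instance. The names are placeholders.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: devconf-ssh
spec:
  type: NodePort
  selector:
    devconf.us: demo        # matches the label on the VMI, and therefore its virt-launcher pod
  ports:
  - name: ssh
    protocol: TCP
    port: 22                # sshd already listening inside the CentOS guest
    targetPort: 22
    nodePort: 30000         # reachable on the node's IP from outside the cluster
```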
We don't need to worry about the persistent volume's name, because that will be looked up automatically. So let's go ahead and create the virtual machine instance now, and show that we have a new pod for virt-launcher. It is running — it's been up for three seconds. And here's the virtual machine instance that goes with it. So let's check the VNC console here, and log in through VNC. And... the screensaver kicked on. Don't know how to get out. Actually, no — we ran out of power on the laptop. We never plugged it in. So yeah, the power ran out on the laptop. Here's a plug; we never plugged it in.

So at that point, we were showing the virtual machine — you just missed it right there. VNC was going to come up and show that we were using the exact same image that we had started with on bare metal — well, we've got a charge now, but it would take a minute for the machine to boot at this point. That was basically the only other thing I wanted to show: the service, and how we exposed that service from the virtual machine. The endpoint that I showed you a moment ago was mapped to the pod for that virtual machine, and because the pod's IP maps to the virtual machine, the connection terminated at the machine itself. So we were then able to SSH in from the node IP — something outside the cluster could SSH into that box, which, if you're going to have a cloud-based virtual machine, is obviously an essential point.

From here — yes, I was just going to talk about the next steps. One of the things I glossed over was that I was using local storage for this, because it's a single-instance machine. You can use other backends; however, I chose not to because of the complexity of actually doing that on a single node. One of the things we'd like to work on in the future, of course, is making that a little easier to do, and multiple networks is another thing I mentioned that is high on our wish list. But from there, I guess it's time to take questions. We're bringing a mic, one second.

All right. Can you guys hear me? All right. I want to make sure I understood the Containerized Data Importer — is that what it was called? Yes. Okay. It's basically just a utility that'll go grab a disk image over HTTP, right? Yes. But it runs as a container — is that why it's called containerized? It is a container that runs as a run-once container — as you saw, it said Completed when it finished doing the import. That pod's job was simply to mount the local storage on one end, connect over HTTP on the other, and just move the bits. Move the bits into the PV. What's that? It just dumps the bits into the PV? Exactly. Okay. And it does it in the namespace that you want, on purpose — whatever namespace you mentioned in your virtual machine is where the Containerized Data Importer will start its pod, so we're not running into permission issues by doing it that way. And is that the only utility you have today? You don't have anything that'll pull — I was talking to Steve Gordon here, and I thought there was some way to pull an OCI image that might be embedded with a kernel and all the things you want. So, pull an OCI image that's already got all the bits you want?
Yeah, I mean, it's a nasty abuse of OCI, but it also would make it nice in that all the infrastructure would be the same, right? I think what you're talking about in that case is using a different — excuse me — container runtime? No? No, no, no. I'm saying, like, dump a VM image in an OCI image, stuff it in a registry, and pull that in instead of just a random disk image. Or another option would be to just pull straight from Cinder. I always wondered why. So, we actually do have registry disks as a possible option here, so you absolutely could do that — there's a sketch of that a bit further down. I think that in a production environment, the persistent volume claim would probably have more universal appeal, but yes, we certainly could put it into a container registry as well. We're actually doing that with our dev images right now. When we instantiate the development environment, we create a Docker registry — sorry, a container registry — and stock it with an Alpine image, a Cirros image, and a couple of others, just so we have base images to run all our testing infrastructure against. So yes, we put those directly into a container registry and use those images inside KubeVirt as well. In that case there was actually one other pod sitting there when I was listing the pods; that's what that one extra pod was doing.

Any other questions? So this project's been going on for a couple of years. Are there examples of people using this primarily for security, or are they just migrating existing virtual machines to make their existing workflow easier? And if the former, can you give an example of where it's better than just — I'm getting a lot of echo; I couldn't hear the question, I'm sorry. So, one of the benefits of running virtual machines on top of Kubernetes is that you get more than just LXC and cgroups to do sandboxing between different environments. And I was curious whether there are either customers of Red Hat, or Red Hat itself, using it in an area where they've pen tested direct containers and they weren't secure enough, and this was secure enough — can you talk about that a little bit? So, if I understand the question: were security concerns something we were trying to address when we set this up? Yes, it is more secure, but that wasn't necessarily our stated goal in terms of something we were setting out to accomplish. You're right that there's much stronger process isolation when you run services inside virtual machines like this. However, I would still point out that between the other containers there's no isolation at that level anyway. So if you're looking at untrusted workloads in the virtual machine, yeah, this is great. If you're looking at not trusting anything on the cluster, you're going to need stronger guarantees. Does that answer the question? So the question was, how is it different in a virtual machine versus just in a container — and maybe who cares about that? Okay, so in terms of the difference in level of security between a virtual machine and a container — the level of process isolation? I still can't hear you, I'm sorry. The level of process isolation? Oh, the level of process isolation. Yes, that is one of the reasons you would run a virtual machine: you get stronger process isolation than you do with a container.
With containers you have cgroups, namespaces, SELinux — those are strong guarantees. But in theory you might be able to do something, I don't know. So yes, the virtual machine is a stronger guarantee, but that's not what we set out to do when we set this up. Thank you. Anybody else?

If you don't mind, I might try to answer this question — I think I understand it. Okay. So I think you're thinking about it in reverse, like you're thinking about it from a security perspective. He's saying it exactly right: don't think of it from a security perspective. But we should tell him what to think about instead. Think about it as a tool to pull a VM into an application, similar to what you'd do in a Kubernetes YAML file. That's phenomenal, right? Because now I've got a database living in a VM, I've got front ends living in containers, and it's a way to scope the entire application with a single application definition. That's the beauty of KubeVirt. On the other side of that, I would say something like Kata Containers is where you're running a container in a VM. It still pulls the container image, and it still uses all of the container constructs you care about, so it's still a packaging format. In that scenario you're adding extra isolation around a container. That's the way to think about it in a security context: now it's an isolated container, but I still get all the packaging-format advantages. This is the opposite: you don't get the packaging-format advantages, you get the advantage of bringing in old stuff that lives in a VM — essentially the converse. Thank you.

First, you owe me about 50 cents — can you spot me? So, one of the interesting things I would like to see with KubeVirt is handling something like the Kata Containers use case, in that Kubernetes is running along and an application figures out that it needs additional resources, so it calls into KubeVirt and launches additional VMs, and then Kubernetes could take advantage of those new VMs to launch more containers inside of them. Has that been considered? So, the big limitation on what we're setting out to do right now is that we're attempting not to modify the Kubernetes cluster — not to require it to be modified ahead of time. Running a virtual machine as the container in that way requires a different container runtime than the default, and we can't ask an administrator to necessarily do that, because that's going to increase the friction. In the future, if Kubernetes does allow dynamic container runtimes to be injected on the fly, then we can explore running virtual machines directly, or using them in the Kata Containers sort of approach where the point is process isolation. Does that answer the question? I mean, I'm not necessarily looking for the isolation to be great, just being able to launch more — at a certain point applications run out of resources inside the existing VMs and need to launch more VMs. Usually Kubernetes breaks at that point and has to fall back to OpenStack or some other tool for launching more VMs. I just thought that KubeVirt could be a way to actually do that. Oh, so resource overrun. Right, right. I hadn't thought about that; that's an interesting angle. More questions? We have about three more minutes, maybe less.
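Stepping back to the registry disk option mentioned a couple of exchanges earlier, here's a rough sketch of how a disk pulled from a container registry might be attached to a VirtualMachineInstance. This assumes the v1alpha2-era API, where the volume source was called registryDisk (later releases renamed it containerDisk), and the image name is just a demo placeholder.

```yaml
apiVersion: kubevirt.io/v1alpha2
kind: VirtualMachineInstance
metadata:
  name: vmi-registry-disk
spec:
  domain:
    resources:
      requests:
        memory: 1Gi
    devices:
      disks:
      - name: rootdisk
        disk:
          bus: virtio
  volumes:
  - name: rootdisk
    registryDisk:                                  # renamed containerDisk in later KubeVirt releases
      image: kubevirt/cirros-registry-disk-demo   # demo image carrying the VM disk
```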
Is there a way to pass something like a cloud config as part of the virtual machine instance object? Oh, cloud-init data? Yeah. Yes — I did not show that. You can pass that data into the virtual machine, and cloud-init is its own standard, so we're not doing anything special there; we're just injecting that data and making it available. You can usually consume it through its own volume mount — there's a sketch of that below. So just to confirm, KubeVirt has no access to the VM itself, right? It just brings up the VM and you're expected to know how to make use of it. That's a loaded question. No, what I mean is, for example, there's no injection in the cloud config of, say, an SSH key. Well, you can do cloud-init data — you can inject an SSH key that way, absolutely. Right, right. But I mean, that's up to the user. It's up to the user to do what they set out to do; presumably you would know that you wanted to do that, however. So yes, we're exposing the ability to do that, and you can use it, but you have to know to do it. Right, what I just mean is that KubeVirt just brings it up and that's it — there's no management inside of the VM itself. As of August 2018, the answer is yes, that's true. We are looking at other possibilities in terms of a monitoring application that would be available, but of course, how do you do that in a generalized fashion if you're bringing up a generalized virtual machine? Suddenly you're building your own VMware sort of infrastructure. So yes, we're looking into possibilities for limited cases. Last question, maybe? Okay, out of time. Thanks everybody. Thank you.
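For completeness, here's a rough sketch of the cloud-init injection mentioned in that last exchange: a cloudInitNoCloud volume attached as a second disk, carrying an SSH public key in its user data. As with the earlier examples, this assumes the v1alpha2-era API, and the key and claim name are placeholders.

```yaml
apiVersion: kubevirt.io/v1alpha2
kind: VirtualMachineInstance
metadata:
  name: vmi-cloudinit
spec:
  domain:
    resources:
      requests:
        memory: 1Gi
    devices:
      disks:
      - name: rootdisk
        disk:
          bus: virtio
      - name: cloudinitdisk        # cloud-init data exposed to the guest as its own disk
        disk:
          bus: virtio
  volumes:
  - name: rootdisk
    persistentVolumeClaim:
      claimName: devconf-pvc
  - name: cloudinitdisk
    cloudInitNoCloud:
      userData: |
        #cloud-config
        ssh_authorized_keys:
        - ssh-rsa AAAA...placeholder-public-key... user@example.com
```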