Hello and welcome to my session. I appreciate that you stayed long enough to attend. I'm Jan, I'm with SAP, based in Walldorf, working in the Cloud Foundry dojo, mainly on BOSH and the BOSH OpenStack CPI. What I'm going to talk about is kind of a side project, so take that as a disclaimer: it's very experimental, and I had only a limited amount of time to work on it. That thing is a Kubernetes CPI for BOSH, so that you can deploy BOSH releases onto Kubernetes.

What I want to bring across is that running Cloud Foundry on Kubernetes is an interesting problem, and that a BOSH CPI for Kubernetes is a possible solution to that problem. Furthermore, many of the changes that would be required in BOSH to make this really usable and closer to productive use are beneficial for virtual machines as well. So even if the experiment turns out to be a bad idea, there is something to gain. That's what I'm going to do today: show where I come from (why I started with this), what I did and how I did it, what the state of the thing is, and what I think you can do with it, or where we can go with it.

So, where I come from. Cloud Foundry has been running apps in containers since its inception, since the very beginning. However, BOSH runs Cloud Foundry on virtual machines. So my basic thought was: why not run Cloud Foundry itself in containers? Shouldn't that be possible? If containers are cool, why not use them for running Cloud Foundry itself? And there's a container scheduler that is getting more and more popular: Kubernetes. Why not use that to run Cloud Foundry on top? There are data services that deploy to Kubernetes quite well, where the providers of these persistent services increasingly target container runtimes. So if you need a Kubernetes cluster anyway, to run those services alongside your Cloud Foundry (which, without persistent services, is not really a complete product), then why not run Cloud Foundry there as well? That would remove your dependency on whatever IaaS you're running on.

I'm not the first one interested in running Cloud Foundry on Kubernetes. HPE, now SUSE, has a project called Fissile that generates container images out of BOSH releases. Why I think that's not the solution: for one, you're binding yourself to Kubernetes, and you don't really know yet whether it will turn out to be the container scheduler to go with. And whenever BOSH gets features that releases use, like links, you have to keep up and re-implement them for that solution. So it boils down to: where is BOSH in this game? Because I think BOSH is the best tool I've seen so far for what it does. So what about writing a Kubernetes CPI? That was my thought process.

So let's have a look: what's in a BOSH CPI? There are basically three entities you deal with: stemcells (base OS images), VMs, and disks. And there are a couple of methods a CPI has to implement that work on these entities: creating and deleting stemcells, the same for VMs, and for disks additionally attaching them to and detaching them from VMs. There are a few more methods, but they are rather minor, like listing disks, so I left them out for brevity. This is the main thing you have to do if you implement a BOSH CPI.
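To make that surface concrete, here is a minimal sketch of those entities and methods as a Go interface. The real CPI contract is an RPC-style protocol that the director speaks to a CPI executable; the names and signatures below are an illustration of the shape described above, not the actual wire format.

```go
// A hypothetical Go rendering of the CPI surface: three entities
// (stemcells, VMs, disks) and the handful of methods that work on them.
package cpi

type StemcellID string
type VMID string
type DiskID string

type CPI interface {
	// Stemcells: base OS images.
	CreateStemcell(imagePath string, cloudProps map[string]interface{}) (StemcellID, error)
	DeleteStemcell(id StemcellID) error

	// VMs: the instances BOSH places jobs on.
	CreateVM(agentID string, stemcell StemcellID, cloudProps, networks map[string]interface{}) (VMID, error)
	DeleteVM(id VMID) error

	// Disks: persistent storage, attached to and detached from VMs.
	CreateDisk(sizeMB int, cloudProps map[string]interface{}) (DiskID, error)
	DeleteDisk(id DiskID) error
	AttachDisk(vm VMID, disk DiskID) error
	DetachDisk(vm VMID, disk DiskID) error
}
```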
So if we look now at a rather naive mapping of concepts from the BOSH world into the Kubernetes world: you have VMs, as I just said, on the BOSH side, and the thing that, from my point of view, best matches on the Kubernetes side is a pod, which is basically a collection of containers that share some namespaces, in particular the networking namespace. So a pod maps to a VM that runs processes. The disk maps pretty well to a persistent volume claim on the Kubernetes side, which is basically a handle to a disk that exists somewhere, which you can attach to pods and mount into containers. And a stemcell is basically a container image.

But let's look a little deeper into that mapping, and you'll see why I call it naive. Let's start with the VM-to-pod mapping. What does BOSH do? It starts a VM that has an agent process running on it, and then remote-controls that VM: it compiles packages on a compilation VM, and on a runtime VM it downloads packages and rendered templates (the configuration), then starts up the jobs using Monit, which also does the health monitoring. On the Kubernetes side, you create a pod that already has multiple containers based on existing images, the configuration is already mounted, and you monitor the containers directly; there's no additional Monit. I'll come to what the problems are here in a bit. On the disk side: in BOSH you create a disk, you create a VM, and then you attach that disk. On the Kubernetes side, you create a persistent volume claim, and you create pods that already have that persistent volume claim attached to them. So the disconnect in the CPI API is problematic, as you'll see right now.

So what does the naive Kubernetes CPI do? For create_vm, it creates a pod with a single container, which then, via the agent, downloads things, installs them in the right place, and starts up lots of processes in that one tiny container. That's not optimal when you look at it from the Kubernetes side, where you would have basically one container per job. create_disk is not a big deal: you create a persistent volume claim that you can use later. But attach_disk is the problem. In my current implementation I actually have to delete the pod I just created; and if you look into the BOSH director code, attaching the disk is just about the next line after creating the VM, or at least very close. So on create_vm I spin up a pod, and a couple of seconds later I delete that pod and start it up again with the disk attached. That actually caused a problem: when I started these efforts, the director wasn't waiting very long for the agent to respond, because nobody expected the agent to go away in the middle of attach_disk. So attach_disk failed, because there were no heartbeats for however long it takes to bootstrap the agent. Another thing that does not work is migrating disks, because the agent bootstrap doesn't like it very much if you bootstrap it in the middle of a disk migration, which is an agent-local operation, at least right now. So these are the problems you have with a naive Kubernetes CPI. But it does work, so you can get something running.
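To make the disk handling concrete, here is a minimal sketch assuming client-go. The function shapes, the naming scheme, and the /var/vcap/store mount path are illustrative; this is not the actual CPI's code.

```go
// A sketch of the naive disk handling: create_disk maps cleanly to a
// PersistentVolumeClaim, attach_disk has to delete and recreate the pod.
package cpi

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// CreateDisk is not a big deal: a disk is just a PVC.
func CreateDisk(ctx context.Context, client kubernetes.Interface, ns, diskID string, sizeMB int) error {
	pvc := &corev1.PersistentVolumeClaim{
		ObjectMeta: metav1.ObjectMeta{Name: diskID},
		Spec: corev1.PersistentVolumeClaimSpec{
			AccessModes: []corev1.PersistentVolumeAccessMode{corev1.ReadWriteOnce},
			Resources: corev1.ResourceRequirements{
				Requests: corev1.ResourceList{
					corev1.ResourceStorage: resource.MustParse(fmt.Sprintf("%dMi", sizeMB)),
				},
			},
		},
	}
	_, err := client.CoreV1().PersistentVolumeClaims(ns).Create(ctx, pvc, metav1.CreateOptions{})
	return err
}

// AttachDisk is the ugly part: a running pod's volume list cannot be
// changed in place, so the pod is deleted and recreated with the claim
// mounted, and the agent has to bootstrap a second time in the new pod.
func AttachDisk(ctx context.Context, client kubernetes.Interface, ns, vmID, diskID string) error {
	pods := client.CoreV1().Pods(ns)
	old, err := pods.Get(ctx, vmID, metav1.GetOptions{})
	if err != nil {
		return err
	}
	if err := pods.Delete(ctx, vmID, metav1.DeleteOptions{}); err != nil {
		return err
	}
	fresh := old.DeepCopy()
	// Strip server-populated metadata and state so the pod can be re-created.
	fresh.ObjectMeta = metav1.ObjectMeta{Name: old.Name, Labels: old.Labels}
	fresh.Status = corev1.PodStatus{}
	fresh.Spec.NodeName = "" // let the scheduler place the new pod
	fresh.Spec.Volumes = append(fresh.Spec.Volumes, corev1.Volume{
		Name: diskID,
		VolumeSource: corev1.VolumeSource{
			PersistentVolumeClaim: &corev1.PersistentVolumeClaimVolumeSource{ClaimName: diskID},
		},
	})
	// Mount into the single container at BOSH's persistent-disk path.
	fresh.Spec.Containers[0].VolumeMounts = append(fresh.Spec.Containers[0].VolumeMounts,
		corev1.VolumeMount{Name: diskID, MountPath: "/var/vcap/store"})
	_, err = pods.Create(ctx, fresh, metav1.CreateOptions{})
	return err
}
```

The window between the Delete and the Create here is exactly where the director initially gave up waiting for agent heartbeats.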
So what I'm going to do now, if everything works fine, is deploy a ZooKeeper release, a ZooKeeper cluster, and run the smoke tests that are provided as a BOSH errand against it. Let me mirror my display. This is too small, I would say. Why is that showing the history of that session? Is that big enough? What we have here is a BOSH director in a recent version, with a Kubernetes CPI deployed next to it, and it's rather empty: no VMs running. What we do have is the ZooKeeper release already uploaded, so we don't have to wait for it to compile and things like that. We're going to deploy this little manifest. Now it's creating the missing VMs, so we might want to watch that. If we look at the VMs, some are already running, and it's already updating them. We can also look at the Kubernetes side of the world: there are pods running, five of them actually. Two of them are the director we are working with and the director that deployed the director we are using. The other three pods are the ZooKeeper instances. Maybe dragging that... well, it doesn't work, then don't drag it down. It's taking some time, so let's start watching these VMs and these pods; we don't need that much space here. One is already done, the next one is there. So this is done.

Now comes the hacky part of my demo: I have to immediately recreate this cluster, because of what I suspect is a bug in BOSH DNS, which is bleeding edge and which I'm using. I spent half an hour with Dmitriy looking into it, but we didn't get to a solution; I think I've piqued his curiosity enough to get it fixed, though. BOSH with local DNS puts the DNS entries into /etc/hosts, and I don't know why that doesn't happen the first time. It might be related to the CPI, or to Minikube, which I'm running on, but I suspect it's something with local DNS, which is work in progress. The worst part is that this takes a minute, and I already ran out of words to entertain you while it runs.

What I'm going to do next is run the smoke tests. There's an errand job in the same manifest that talks to the ZooKeeper cluster: sets values, deletes them, tests for latency, things like that. That's what I'll run as soon as this is done.

[Inaudible audience question] Yes, I did use the BOSH CLI v2, and I created the director with it, using this same CPI; there's nothing special about that for this CPI. Why I did that: it was my first experiment. Once I had the CPI basically running, deploying BOSH itself was the first use case I wanted to see working. I chose ZooKeeper for this demo because it has an errand job that shows something is really running there. Theoretically, the fact that I'm using an inner BOSH to deploy things already shows that the deployment done with my CPI worked, but I thought this was more tangible.

So this is now creating a missing VM to run the errand job. You should see that both in the bosh vms watch and in the kubectl get pods watch. Now it's updating it, putting packages and the rendered templates for the control scripts on there. Now it's running the errand. Oh, it's done.
The formatting of the test output is not very nice, but you can see that it's talking to the ZooKeeper cluster: putting values in, watching them, getting them, deleting them. There are, I think, three test suites in there; I didn't look very deeply into what exactly they test. It's just a showcase that the cluster really is running. You basically have everything you would expect; for instance, you can bosh ssh into these things. There you go. So you can use it like any other CPI. Let's get back to the slides. Here we go.

So I hope I've shown that a BOSH CPI for Kubernetes works, kind of. There are some problems, some of which I already mentioned: it's really just a single container running multiple processes, which is not nice, and attaching a disk is really ugly. There are some other things too. Kubernetes out of the box has no manual networking, which is why I use BOSH DNS, which is in an early stage, or at least not finished yet. You could do something on the Kubernetes side; with Calico I think you could fiddle something together, but that would pose requirements on the Kubernetes cluster. If you get a provided cluster that doesn't have Calico configured for that, you're out of luck, so I didn't go that way. It runs privileged containers, which is probably not nice, but the agent does things like mounting disks, so it's problematic not to run privileged; that's a big issue there. And I had a problem with nested containers. I actually wanted to show something like "cf push, look, here's Cloud Foundry", but Garden doesn't like to run on an overlay-on-ext4 file system. There might be solutions, I already talked to Julz, but I don't know yet. I'll keep working on that; nested containers is another problem I see here.

So how would a more native Kubernetes CPI look? I talked about that a lot already: a container per job, no Monit to start things, and unprivileged containers. I put the last item in brackets because it's not really my focus, but you could also try to move some lifecycle capabilities out of BOSH and into Kubernetes. I'm not sure that's worth it, but it's something I considered while thinking about how to get there. A sketch of the container-per-job idea follows below.

And now let's look at how BOSH could help: it could run jobs as containers. First of all, that would give us releases that work under container isolation, which is not the case right now. If you look into it, jobs are not isolated from each other at all. They all run as vcap, if they do even that; otherwise they run as root. And they can all write into each other's storage, ephemeral and persistent, so there's basically no isolation in BOSH. Changing that would help move the Kubernetes CPI forward. Also, I don't think anybody likes Monit, especially since we are locked on the version we are locked on, so we would get rid of it, or rather have to get rid of it. And we would run jobs with fewer privileges. If I look at that list, none of these items is something I wouldn't want in the VM world as well: jobs that are really isolated and can't step on each other's toes, getting rid of Monit, and so on. So I think work on this side is beneficial either way. There's actually a project already going in that direction on the Pivotal side, I think it's called Crucible, so you'll probably hear about that.
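As a sketch of what "a container per job" could look like, assuming client-go types: one container per BOSH job, rendered configuration mounted per job, and Kubernetes' restart policy taking over Monit's supervision role. The job-to-image mapping and the ConfigMap naming are hypothetical; producing such per-job images is exactly the Fissile-like part that the naive CPI does not have.

```go
// A hypothetical PodSpec for a more native CPI: one container per BOSH
// job instead of one container running Monit plus all jobs.
package cpi

import corev1 "k8s.io/api/core/v1"

// podSpecForInstance takes a (hypothetical) job-name-to-image mapping
// and builds a pod where Kubernetes itself restarts failed containers,
// taking over Monit's role.
func podSpecForInstance(jobs map[string]string) corev1.PodSpec {
	var containers []corev1.Container
	var volumes []corev1.Volume
	for job, image := range jobs {
		containers = append(containers, corev1.Container{
			Name:  job,
			Image: image,
			// Rendered templates are mounted per job instead of being
			// fetched and installed by an agent at runtime.
			VolumeMounts: []corev1.VolumeMount{
				{Name: job + "-config", MountPath: "/var/vcap/jobs/" + job},
			},
			// Unprivileged: no agent mounting disks inside the container.
			SecurityContext: &corev1.SecurityContext{Privileged: boolPtr(false)},
		})
		volumes = append(volumes, corev1.Volume{
			Name: job + "-config",
			VolumeSource: corev1.VolumeSource{
				ConfigMap: &corev1.ConfigMapVolumeSource{
					LocalObjectReference: corev1.LocalObjectReference{Name: job + "-config"},
				},
			},
		})
	}
	return corev1.PodSpec{
		Containers:    containers,
		Volumes:       volumes,
		RestartPolicy: corev1.RestartPolicyAlways,
	}
}

func boolPtr(b bool) *bool { return &b }
```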
Anyway, on the other hand: how could Kubernetes help? The problems were basically that I can't modify things like a pod's list of containers or its list of disks at runtime. Being able to do that would really help to get to a better implementation; there's a small sketch of what the API rejects today after the Q&A below. I didn't dig very deep into it, but there are apparently discussions in the Kubernetes community, coming from the use case "I want to debug a running pod and attach debugging tools to it". The two options being discussed are allowing a sidecar container with those tools to be attached, or at least a volume. Either would help a lot: you could then think about an agent container that spins up sibling containers, or something like that, if the Kubernetes side allowed it.

So, concluding: a BOSH Kubernetes CPI is possible, but it takes quite some work to make it less naive than what I have now, which is pretty naive. And that work is valuable for the VM world as well. If you wonder what's next on my agenda: I'm going to see how far we can get with the naive approach, building it up, using more things. At the same time, I'll try to influence the BOSH and Kubernetes communities to get to a better mapping. The last point basically depends on the community. The project is on GitHub in the SAP org. I'm completely happy to incubate it in the Cloud Foundry community, but that depends on you: if there's no interest, I'm not going to do the paperwork for it. I have something to push to and it's open. So ping me wherever you want, Slack or Twitter, and if you're convincing, I'm really happy to do it. That's basically it. One more thing: we are hiring, so if you're interested in working on the SAP Cloud Platform, no problem. And with that, thank you very much for being here with me. My contact data is on the slide. We have time for questions now, I'll be around till the end of the day, and I'm on Slack. Thank you. If there are questions, I'd prefer, for the video, that you use the microphone. Anyone?

[Audience] Can you describe a little more what happened when you tried to deploy this? You said there were problems with containers running inside of containers.

I was using cf-deployment, which is going to be the next big thing for how to deploy Cloud Foundry, and Garden didn't start up: it didn't find a driver for the graph storage. That was basically because the file system was overlay on ext4, and Garden starts AUFS, which doesn't like to run on an overlay file system. Ideas from Julz that I'm going to try out include using GrootFS or similar, although I heard from the BOSH team that they experimented with GrootFS and didn't get very far with it.

[Audience] If that weren't a problem, do you think you'd be able to run it with the amount of memory you have on your system?

Yeah, I think so. You can use BOSH Lite and run Cloud Foundry on it, and that should have similar memory requirements. But it did put my system under pressure. Running Minikube with eight gigabytes of RAM, don't expect to use other programs on that machine at the same time, especially while something is compiling.
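To make the runtime-mutation limitation mentioned above concrete, a minimal sketch assuming client-go: trying to add a container (or a volume) to a running pod is rejected by the API server, because pod updates may only change a handful of fields, such as container images. That is exactly why the naive attach_disk has to delete and recreate the pod.

```go
// Demonstrates the pod-immutability limitation: against a real API
// server, this update is rejected with a validation error.
package main

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

func tryAddSidecar(client kubernetes.Interface, ns, name string) error {
	pods := client.CoreV1().Pods(ns)
	pod, err := pods.Get(context.Background(), name, metav1.GetOptions{})
	if err != nil {
		return err
	}
	pod.Spec.Containers = append(pod.Spec.Containers, corev1.Container{
		Name:  "debug-tools", // hypothetical sidecar from the discussion above
		Image: "busybox",
	})
	// The API server only allows changes to a few pod spec fields
	// (e.g. container images), so this Update fails.
	if _, err := pods.Update(context.Background(), pod, metav1.UpdateOptions{}); err != nil {
		return fmt.Errorf("pod spec is immutable here, as expected: %w", err)
	}
	return nil
}
```

The community discussions mentioned above are about carving out exceptions for exactly this kind of update.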
[Audience] Let me ask you: if your objective is running Cloud Foundry on Kubernetes, do you think BOSH is the right tool for that? Have you seen a better tool for running distributed systems in clouds?

I haven't yet. I like BOSH; specifically for this, I like BOSH.

[Audience] But if you're running something like Kubo, that is, running Kubernetes with BOSH: why not just give all the parameters to Kubernetes in a YAML file and say, hey, run Cloud Foundry?

Well, I don't like BOSH manifests very much, although they've gotten quite a bit better, but what you describe would be even worse, I guess: even longer, even more YAML. I don't know, is there anything worse than an enterprise YAML architect? And since the community installs Cloud Foundry with BOSH, I think it's problematic to try other ways; you lose the community.

[Audience] How does this relate to Kubo? What does it unlock?

Well, a toy thing I would have loved to present is the ultimate inception: installing Kubo with BOSH onto Kubo. But that would just be a toy, right? I don't know if it unlocks anything. I could imagine Kubo getting to a stage where more and more parts of Kubernetes are self-hosted. But I'm not sure, because I need at least an API server, and then you're already pretty far along. If you look at what the Kubernetes community is doing around bootstrapping, they're thinking of starting a kubelet that then brings up the API server as something that looks like a pod. Those are things that won't be possible with the CPI, I guess.

If there are no more questions: again, thank you very much for attending. Have a good evening.