There are three of us who are going to be talking. Unfortunately, two of us are called Jools, which is going to be a problem. So: I'm going to talk about what this is and why we did it. I'm Dr Jools. Andrew, who is thankfully not Jools, is going to talk about how we did it, and then Jools from Germany is going to show the demo.

So, what is this talk about? This talk is about a little elephant named Kubernetes, which is in the room. Has anyone heard of Kubernetes? Ha! And the elephant is a little bit scary, if we're honest. It's a big elephant in the room. So what we've done is given it a nickname to make it sound less scary, and that nickname is CFCR. But it's still quite a scary elephant, actually, in terms of what Cloud Foundry should and shouldn't be.

Why is that? Let's talk about what Cloud Foundry is. I think Cloud Foundry is two things. One, which I love, is a developer experience. It's a developer experience about stateless apps: cf push, cf bind-service, and don't push Mongo to the cloud. That's the cf push journey. It's also, of course, a container orchestrator: it's Diego and Garden and all the stuff about how that actually happens. So it's a developer experience, and it's how we happen to implement that developer experience today. And those are two roles: one role that hopefully doesn't even see the underlying stuff, and another role that sees the underlying stuff all the time.

So, Kubernetes: what is that? Well, it also is a developer experience. It's a developer experience about Deployments and ReplicaSets and nodes and taints and annotations and all of that. And it has a container orchestrator; it is a container orchestrator. I would argue the roles overlap a bit more in Kubernetes: the people who use Deployments and ReplicaSets might be developers, and they might be operators. But there are higher-level tools, such as Helm and Skaffold and Draft, that are more focused on the developer role.

So let's put them side by side, right? That's our current solution: put them side by side, call one CFAR and the other CFCR, and everything's solved, right? Yeah, kind of. But you still have these two very separate systems, two sets of nodes, and an operator has to manage both of them in pretty different ways. And actually, it's a bit sad: the communities are still kind of separate. Even though they're interacting better, they're not really integrated as much as you would hope they might be. And as an app developer, you're not getting any of the features of Kubernetes: your apps are in a walled garden, and you can't take advantage of custom scheduling or anything like that in the Kubernetes ecosystem. So we've had various ideas for trying to do this better or differently.
One, which we spent a lot of time working on and actually gave a talk about last year, was: why don't we just write a BOSH CPI that deploys to Kubernetes? BOSH is great. BOSH has this CPI abstraction, which is why you can deploy Cloud Foundry to any cloud that you like: you can deploy it to Amazon or GCP or SoftLayer or IBM or wherever, and it doesn't matter. So why don't we just treat Kubernetes as a modern IaaS and deploy to that? You get some advantages from this, but your app developer still doesn't get any of the advantages of Kubernetes; those are still sitting higher up. And you still have to operate both of them: your operator still has to operate Diego and Garden as well as your Kubernetes stuff. So you're not really getting the benefits; you've just put one of them on top of the other.

Then we looked at the Fissile approach, which was originally SUSE's approach, which is to take the BOSH releases, convert them into containers, and then put those onto Kubernetes. This is nice because you don't have both BOSH and Kubernetes fighting over your set of nodes. You have a natively containerized Cloud Foundry, which means you can use more of the Kubernetes features and actually get something out of the fact that you put this layer in there. But you've still got Diego, you've still got Kubernetes, your app developer still can't use any of the Kubernetes features, and your operator still needs to know about both of these systems. There's a side option, which the Garden team are working on at the moment, which is: why don't we have Garden use containerd? Then at least we'd be able to use the same container engine and maybe share some nodes. But that still doesn't really solve the problem.

So why do all of these solutions keep failing? What's the fundamental problem we keep running into? I think it's this: it's layers. If you keep adding layers, no matter how great each of the layers is, your life doesn't get simpler. You only get a benefit when you replace something; you have to take something away. If you keep adding really good things, you still end up with increasingly complicated things. I summarized it a few months ago by saying: now you have two problems. When you put one of these things on top of the other, you have two problems, two things to manage. And actually, for a lot of these solutions, instead of N problems from Cloud Foundry Application Runtime plus N problems from Cloud Foundry Container Runtime, when you put them on top of each other you actually have N times N problems, because each one can explode at the other and you've no idea what's going on.

So why don't we do option five? I think this is the obvious place this has been leading. Option five, obviously, is: hey, Cloud Foundry is a developer experience. It's cf push, cf bind-service, don't push Mongo to the cloud. Kubernetes is an operator experience and a scheduler, a really great scheduler that clearly does a lot of what we need. Why don't we just use Kube as a CF scheduler and get the best of both worlds? That way your app developer is happy, your operator is happy, everyone is happy. Why don't we all be happy, right? Well, there are reasons. There are reasons we didn't do this in the past. Why didn't we? Because Kube is a lot bigger than Diego, and that used to be a problem.
It's a much bigger thing for us to maintain than a small scheduler where we can move quickly, and you need to move fast sometimes. It's important to really ask of every dependency what value it is bringing, because there's an opportunity cost to pulling it in that could be spent on other stuff. And frankly, who cares? As a Cloud Foundry user, what does it matter what the scheduler is? I'm going to do a cf push and I'm not going to see it.

So why do we want to do this, if all of that is true? Lots of bullet points, but what it comes down to is this: Kubernetes is a great scheduler, scheduling has been commoditized, and it turns out that so many Cloud Foundry customers now have Kubernetes, for Cloud Foundry Container Runtime or to run their functions, that they're already having to operate it. We're already having to figure out how to operate it and how to make it something that works for users, via things like CFCR. So given we're already doing that, wouldn't it be nice to let those users reuse it for their app runtime? That's basically what it comes down to. And because it's commoditized, you've got Kube as a service, you've got ops teams that already know how to use it, and there's a big community with lots of available skills. Frankly, it lets us focus on the really important stuff, in my opinion, which is not pushing Mongo to the cloud. Because if we move the scheduler piece out, then we can stop having the conversation about which scheduler is better, or how Diego and Kubernetes compare, and start having the real conversation, which is about cf push and not pushing Mongo.

So what did we build? Let's talk about what this actually looks like, and Andrew will do that.

Okay, so what did we build? There we go. First we'll talk about the four main things that we've built, and then a little about what more we need to do and how things are going. The four things are OPI, Sync, Registry, and Stagernetes, and I'll talk about each of them individually. Stagernetes: I think that's what we're going to call it. It's a good name. Good name; I think it's a good name.

So, OPI. OPI is the main thrust of this proposal and this work. It's to provide an interface, which we've called the Orchestrator Provider Interface, that is an abstraction over what goes from CF to the scheduler. It's inspired by Diego's LRPs and tasks, and by the CPI model. It's the API we use to communicate from CF to either Diego or Kubernetes, so that you can pick and choose.

The next thing is Sync, or NSync, or another reason Jools is really bad at naming things. Originally the Diego component was called NSync, and I think that's because they really like Justin Timberlake, but it's really the convergence loop that checks what's in CF, decides what should be in Diego, or in the orchestration layer, and then makes sure the two are in sync. So it downloads the staged app as a Docker image, and it now creates Kubernetes Deployments inside Kubernetes. There is another piece to NSync, which is an API layer where Cloud Controller sends requests to start and stop tasks and LRPs; that's one of the items still to be done. Right now we are just using the sync loop, but once we have that API layer it will be easy to implement the other two pieces.
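To make OPI and the sync loop concrete, here is a minimal Go sketch of what such an interface and convergence loop might look like. The names (LRP, Orchestrator, Sync) and method signatures are illustrative assumptions modelled on the description above, not the project's actual API.

```go
package opi

import "fmt"

// LRP is a long-running process as Cloud Foundry sees it: a staged app with
// a desired number of instances, independent of any particular scheduler.
type LRP struct {
	Name      string
	Image     string // OCI image built from the staged droplet
	Instances int
}

// Orchestrator is the contract CF would talk to; Diego and Kubernetes would
// each implement it, just as each IaaS implements a BOSH CPI.
type Orchestrator interface {
	Desire(lrp LRP) error   // make the app exist in the scheduler
	List() ([]LRP, error)   // report what is actually running
	Stop(name string) error // remove the app
}

// Sync is the convergence loop described above: compare what Cloud
// Controller wants with what the orchestrator reports, and reconcile.
func Sync(desired []LRP, orch Orchestrator) error {
	actual, err := orch.List()
	if err != nil {
		return err
	}
	running := map[string]bool{}
	for _, lrp := range actual {
		running[lrp.Name] = true
	}
	for _, lrp := range desired {
		if !running[lrp.Name] {
			if err := orch.Desire(lrp); err != nil {
				return err
			}
			fmt.Println("converged:", lrp.Name)
		}
	}
	return nil
}
```

The point of the interface is the same as the CPI's: the CF-facing code depends only on Orchestrator, so a Diego-backed and a Kubernetes-backed implementation stay interchangeable.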
Okay, so the next thing is the Registry. We implemented a registry, an OCI registry, that vends images built from the CF droplets on the cflinuxfs2 base. It still uses droplets: we take the droplets that CF generates, create an OCI image from each one, and run that inside Kubernetes.

And now, our favorite name: Stagernetes. It implements staging inside Kubernetes by running a Job. Just like what happens in Diego, where we run a single one-off task to do the staging and upload the bits, here it runs as a one-off staging task. It does full buildpack detection, et cetera, exactly as it's done in CF on Diego, and then uploads the droplet through the Cloud Controller API.

So that's what we have built so far. What's next? Well, we still need to build the route emitter to register the routes, TPS, log streaming, and a bunch of other stuff. Oh wait, we already built the route emitter: yeah, we did that about a week ago. So in the demo that Julian's going to show, the routes are already registered and in the Gorouter, but we still have TPS, log streaming, et cetera, to be done. So now let's do the demo, because that's way cooler.

Okay, yeah. We're switching monitors now; hopefully it shows up. That would be hard to hold; it needs both hands. Okay, so let's do the demo. First I'll explain what you will see here in these four panes. In the two panes on the right, I'm running a watch on the Kubernetes pods and a watch on the Kubernetes jobs, so we can see that there are no resources yet. And I'll make it larger. Is that better? That's good. Okay. In the upper-left pane we'll see the cf push, and in the lower-left pane I'll show the staging logs and also the Deployments appearing in Kubernetes.

So let's just start: cf push dora, to Kube. Let's see what happens. The first thing you see is it starts creating the app, binding the routes, uploading the app bits. After the upload, it starts the staging, and you immediately see that a Job and a pod are running on Kubernetes, doing the staging. We'll now take a look at the logs, if I'm fast enough: kubectl logs, let's see, with a -f. And now we can see... okay, it's almost through already. It's done the staging; staging is completed. The next thing we'll see is the Deployment appearing: kubectl get deployments. There it is already. And the pod is also running. At that point, the app is deployed to Kubernetes. So it took less than a minute, and we pushed an app to Kubernetes with a CF experience. Now let's just curl that app. That was my test from before; this one. "Hi, I'm Dora!" And yeah, that's about it: we just pushed the app to Kubernetes.

Nice. So I think we have zero more slides, because how would you follow that? But we're very happy to take any questions.

Do you plan to add any integration with the TCP router, or just the Gorouter? So, at the moment we're just focusing on the regular cf push, and then we'll start to add features like TCP routing and things like that.

And, stating the somewhat obvious, but I think it's worth stating: this is an MVP. This is not complete, and there's a decent amount of work to make it complete. But it does show you that you really can get an end-to-end cf push using this approach. So it's a proof of concept, and it does prove the concept, but there are going to be a lot of edges we need to handle to cover the whole of what Diego does for you today.
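To make the Stagernetes flow concrete, here is a hedged Go sketch of staging as a one-off Kubernetes Job using client-go. The namespace, image name, environment variables, and kubeconfig path are all illustrative assumptions; only the general mechanics (a Job whose pod stages once, uploads the droplet, and never restarts) come from the description above.

```go
package main

import (
	"context"
	"log"

	batchv1 "k8s.io/api/batch/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Load cluster credentials from a kubeconfig; the path is an assumption.
	config, err := clientcmd.BuildConfigFromFlags("", "/home/me/.kube/config")
	if err != nil {
		log.Fatal(err)
	}
	client, err := kubernetes.NewForConfig(config)
	if err != nil {
		log.Fatal(err)
	}

	// A staging task is a one-off Job: run buildpack detection and
	// compilation once, upload the droplet, then finish.
	job := &batchv1.Job{
		ObjectMeta: metav1.ObjectMeta{Name: "stage-dora"},
		Spec: batchv1.JobSpec{
			Template: corev1.PodTemplateSpec{
				Spec: corev1.PodSpec{
					// Never restart: a failed staging run should surface as
					// a staging error, not retry forever.
					RestartPolicy: corev1.RestartPolicyNever,
					Containers: []corev1.Container{{
						Name:  "stager",
						Image: "example/stager", // hypothetical staging image
						Env: []corev1.EnvVar{
							// Hypothetical knobs: where to fetch app bits and
							// where to upload the finished droplet.
							{Name: "APP_BITS_URL", Value: "https://cc.example.com/bits/dora"},
							{Name: "DROPLET_UPLOAD_URL", Value: "https://cc.example.com/droplets/dora"},
						},
					}},
				},
			},
		},
	}

	// client-go signatures vary by version; this is the context-based form.
	if _, err := client.BatchV1().Jobs("cf-staging").Create(
		context.TODO(), job, metav1.CreateOptions{},
	); err != nil {
		log.Fatal(err)
	}
}
```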
Yeah, so this is a lot of information to digest. Some of the approaches you mentioned earlier, trying to fit the big elephant into the room: we tried that. But this approach, if I understand it correctly, and I just want to make sure I do: you're saying we're basically going to use Kubernetes to replace Garden and the Garden containers? Is that it?

Both Garden and Diego. And not replace: it would be an option. This is an approach we've entirely stolen from Docker. Lots of people like Swarm, lots of people like Kubernetes, so Docker said: we'll let you pick. If you've got an investment in Kube, we'll let you use Kube; if you like the features Swarm gives you, you can use Swarm. In the same way, I think a lot of people are going to want to stay with Diego, because it gives you this nice, integrated, all-in-one experience; it's very simple, and it's tailored for stateless apps. But on the other hand, a lot of people have investments and skills and existing Kubernetes deployments, either for CFCR or for functions, which are often run on a Kubernetes deployment, and for those people it might well make sense to plug in Kubernetes. So it's not an either-or; it is an abstraction. And I think this is the benefit of scheduling now being commoditized: we can give people a choice about which one to use without really slowing down the higher levels, because there's broad agreement on what scheduling looks like.

One thing I wanted to add: we know the Kubernetes developer experience isn't great; it takes a long time to get an app up and running. So this is also a bonus for the Kubernetes community: a better, cf push-style experience on Kubernetes. Maybe that's something for people who are already running Kubernetes and want a better way to do it. Yep, they will have it.

So, Cloud Foundry is kind of opinionated about how you run apps: you can't have persistence and things like that. Do you think this will allow apps running on Cloud Foundry to use persistence, or UDP routing, or other things that Kubernetes allows?

Yeah. One thing this does, and one thing we like about this approach versus some other ways people have tried to do this, is that it creates really first-class Kube objects. What you end up with is a real Deployment object that isn't special in any way. The images are just images; there's no machinery to stream a droplet in or anything like that. We've just created an image that is your latest staged app and given it to Kube as a Deployment. Now, it is running in kind of a special namespace, a Kube namespace, and it's certainly not part of the MVP for you to be able to play around in there. But when you want that escape hatch, when you've pushed the app and now want to start working with it directly, it should be quite easy, because you've actually got a real, native Kube object. So if you want to pull the escape hatch, move it into a different namespace and work on it yourself, or even ask Kube to dump the object model for it, or create a new one, because you've got the image, it should be a lot easier to do that if you want to. But adding those features directly to CF is certainly not part of our initial MVP.
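To illustrate what "a real, first-class Kube object" means here, this sketch builds the kind of plain Deployment such a sync loop could create. The cf-apps namespace and the label scheme are assumptions; the essential point is that the container image is an ordinary OCI image built from the staged droplet, with no droplet-streaming machinery at runtime.

```go
package apps

import (
	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// DeploymentFor builds a plain Kubernetes Deployment for a pushed app.
// Nothing here is CF-specific at runtime, so kubectl sees a normal
// Deployment that can be inspected, moved, or copied like any other.
func DeploymentFor(appName, image string, instances int32) *appsv1.Deployment {
	labels := map[string]string{"cf-app": appName} // hypothetical label scheme
	return &appsv1.Deployment{
		ObjectMeta: metav1.ObjectMeta{
			Name:      appName,
			Namespace: "cf-apps", // apps live in a dedicated namespace
			Labels:    labels,
		},
		Spec: appsv1.DeploymentSpec{
			Replicas: &instances,
			Selector: &metav1.LabelSelector{MatchLabels: labels},
			Template: corev1.PodTemplateSpec{
				ObjectMeta: metav1.ObjectMeta{Labels: labels},
				Spec: corev1.PodSpec{
					Containers: []corev1.Container{{
						Name:  appName,
						Image: image, // the registry's droplet-based OCI image
					}},
				},
			},
		},
	}
}
```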
And personally, I'd be hesitant to add those escape-hatch features to the CF user experience, because I think the CF user experience is about not having those features. But the escape hatch is probably nice.

Can you show us which components are running in pods? Which Cloud Foundry components? Everything is containerized, right?

So, just for ease, most of the components at the moment are deployed as BOSH jobs, simply because BOSH gives you easy ways of linking things like the usernames and passwords that something like Cloud Controller needs to grab the droplets. All the Kube-facing stuff runs in CF and then talks to a Kubernetes to do the deployments. As we do this more for real, I think it might make a lot of sense to use CRDs and API aggregation for some of these things, that is, to use Kubernetes itself for some of these components. But at the moment we run them as BOSH jobs.

So you're kind of replacing the CPI as well? The BOSH CPI, which talks to the provider?

At the moment, the actual Cloud Foundry jobs are deployed by BOSH. So you have options. You can continue to deploy Cloud Foundry however you like and have it talk to a Kubernetes, which may be CFCR, maybe something else. Or you can use one of the projects like the Kubernetes CPI, or like Fissile, so that your Cloud Foundry jobs are also in Kubernetes. That's kind of an orthogonal problem, right? How do you get the Cloud Foundry bits onto Kubernetes, and how do you get the apps Cloud Foundry creates onto Kubernetes? We're solving the second problem; there are quite a few approaches to the first problem, but that's out of scope for us. That's not something we're worried about. That's great, cool.

So this is cool, but I'll do a free plug, I guess. Dimitri and I have been working on a CPI. The problems that Jools mentioned: we've solved those, confidently. So come see that; if you're coming to KubeCon, we'll make it public. It's not necessarily a replacement for this; I think we'll see what the community says. But my issue, and I guess my question here, is this: earlier you criticized the CPI approach because you'd have two problems. I fail to see how you have only one problem here, because say Kube has an update: you'll have to update Kube and keep everything in sync. Say CF has an update: same thing. So you still have those two problems. And then, more importantly, you have another problem, which is that you've defined a layer, an API, between the two, and unless Kubernetes and CF are working together, one or the other is going to break that abstraction. And obviously, since you work in CF, you can probably convince Eric and the rest, maybe by being nice to them, not to break the layer. But how are you going to convince Kubernetes?

This is a good question, and there are two things here. First is the CPI-versus-this part, and these are totally different problems. The new CPI is going to be awesome for "how do I get my Cloud Foundry stuff onto Kubernetes", because it's going to be a nice approach: you do a BOSH deploy, and now your Cloud Foundry, and if you're not using this, your Diego, are running on top of Kube. This is solving a completely different problem, which is: the apps you create with Cloud Foundry, we want to run those in Kubernetes as well, and that's just a different layer. The second part of the question is how you deal with the fact that you still have Kubernetes to manage.
So when I say you only have two problems, I'm not suggesting that once you do this you'll never have a problem again. You will have problems, but instead of having all of Diego and all of Kubernetes to manage, you will only have Kubernetes to manage as a scheduler. Absolutely. And sorry, just one thing on the API part, on how you manage the fact that Kubernetes will keep shipping new versions: we manage it exactly how BOSH manages it. BOSH has the CPI, which gives you a small abstraction so you're not exposed to the whole thing, and so you can keep it working. We're doing the exact same trick. It's an OPI: the same trick BOSH does, but at the orchestrator level instead of at the IaaS level.

Right, but the point I'm making, so, two points besides my plug, okay? I won't plug again. The point I'm making is that it's not just about getting it running the first time. It's keeping it running, and keeping it running while you're deploying and doing updates. Why? Because there are people trying to hack into the system, or filing CVEs, and so on. That's point number one. And the second point is that you've defined an abstraction that, unless the two communities agree on it, they're going to break. One of them may not break it, because you can convince them, but the other one might, because it might add something new or change things in a way that doesn't work.

The Kube interfaces are pretty stable with regard to the bits we're using; they're just Deployments. And you are right: you do have to worry about how you keep that Kube maintained and running. That's what CFCR solves. What this is saying is: for people who don't want to worry about Kube at all, hey, use Diego. But if you already need to get that Kube running and patched and updated, for your functions, for your Kube workloads, for your stateful workloads, then you've already had to solve that problem, and you probably don't want to also solve the secondary problem of doing it all again with Diego. I love Diego, but it's an additive problem. If you only have the Diego problem to solve, great, solve only that; but many people can't only solve the Diego problem, they have to solve the Kube problem too, and for them it's nice to have the option of reusing that investment to also power their apps. And this is a problem most people already face for things like functions: whether you're using OpenWhisk or Riff or any of these approaches, you're probably sitting on Kubernetes in terms of what gets spun up, so you already have to worry about what happens if Kubernetes changes underneath you. But actually the APIs are reasonably stable, and Kube has been pretty good, I think, about keeping things working across releases.

So, it sounds like... I saw you run kubectl commands, so obviously the Kubernetes cluster is also exposed as an API endpoint. How is that set up, and do you expect that users would potentially hit Kubernetes directly in addition to cf push? What's the thinking there?

I think that's a decision different people are going to make differently. Personally, I'm pretty opinionated: I think the vast majority of developers should never see Kubernetes. I think they should be doing cf push, binding a service, and reusing services for stateful stuff. So they probably shouldn't ever need to do that, and the more you can make that true, the happier your life is.
But on the other hand, very clearly many people disagree with that view and do want all the power and flexibility of being able to go and use that escape hatch when they want it. So it is something the project is agnostic about; we want to give people both options. And the fact that you do have both options is nice. There would be alternative approaches where you build all this stuff into Kubernetes and really integrate it with the Kube APIs, but in that case it's very hard not to expose your app developers to all of that. This approach says: here is the full cf push experience, which hides all that stuff; but also, if you want to look under the hood, you've got a really Kube-native thing being created.

And then also, I guess, one of the things Garden helps manage today is the Windows abstraction. So I assume the thinking would be to let the Kubernetes community catch up there, and at some point that would become something you could manage there as well?

Exactly that. Because we've got this abstraction, we're not coupling ourselves to Kube. We're not saying we're going to let all these Kube abstractions leak upwards, because then things like Windows become really hard; then you are locked to Kubernetes, because you need all the Kubernetes primitives, you need etcd and all that stuff, if you represent them in your API. We're not doing that. We're just saying: if you already have a Kube investment, use it. In the same way, and I guess it's a weird analogy: today, if you use cf-deployment, by default we'll deploy a Postgres database for you. But if you've already got a Postgres or a MySQL or RDS or some other managed database, you just give us the URL as a property in cf-deployment and we'll use that. This is very similar. There's a built-in scheduler, which for us is Diego, like Swarm is for Docker. But if you've already got CFCR, or you've already got some Kube-as-a-service, just give us the URL, give us the API endpoint and the credentials, and we can use that instead. And so on Windows, you just carry on happily using what you have today, and as soon as Kube catches up on Windows, you have the option of using that there too. Make it good.

Just quickly, you mentioned binding services. Is that supported now? And maybe a broader question: if I have questions about whether this thing supports X, do you have a public backlog somewhere?

It is public. It's currently at github.com/julz, which happens to be this Jools, slash cube, and you spell it with a C: C-U-B-E. I know this is a problem, so we are going to have to rename it; we will come up with a better name. We hope to submit it as a proposal to the runtime PMC soon, at which point it will move, hopefully, assuming everyone's on board with putting this in. But at the moment it's there, and we have some issues filed there. I think a lot of service binding will just work, because it's just a case of mapping the bindings into environment variables, but there are a lot of rough edges, so I doubt it works right now. Contributions welcome. Yeah, well, I sort of immediately started to think about volume services and how that would work. Contributions welcome. And, by the way, is this spelled with a K? That was... contributions.
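As a footnote to the service-binding answer above: in Cloud Foundry, bound services reach the app as a VCAP_SERVICES JSON environment variable, so "mapping the bindings into environment variables" could look roughly like the following sketch. The Binding type and the grouping scheme are simplified assumptions, not the project's actual code.

```go
package bindings

import (
	"encoding/json"

	corev1 "k8s.io/api/core/v1"
)

// Binding is a simplified stand-in for a CF service binding.
type Binding struct {
	Name        string            `json:"name"`
	Label       string            `json:"label"`
	Credentials map[string]string `json:"credentials"`
}

// EnvForBindings renders bindings as a VCAP_SERVICES env var, grouped by
// service label, which is roughly the shape CF buildpack apps expect.
func EnvForBindings(bindings []Binding) ([]corev1.EnvVar, error) {
	grouped := map[string][]Binding{}
	for _, b := range bindings {
		grouped[b.Label] = append(grouped[b.Label], b)
	}
	blob, err := json.Marshal(grouped)
	if err != nil {
		return nil, err
	}
	// The resulting EnvVar slice can be appended to the app container's Env
	// in the Deployment sketch shown earlier.
	return []corev1.EnvVar{{Name: "VCAP_SERVICES", Value: string(blob)}}, nil
}
```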