Hey, everyone. Welcome back to the next half of KubeDay India 2023. I'm Nabarun, and I work at Broadcom, where I build SaaS control planes. Apart from that, I have been contributing to Kubernetes for quite some time now, maybe four and a half years or so. I'm a Kubernetes maintainer and a Steering Committee member. I have also chaired Kubernetes SIG Contributor Experience, and I've recently been onboarded as a CNCF ambassador.

So what are we going to talk about today? I'm going to talk about Kubernetes as a universal control plane. But before that, I want to cover something very important. If you are a Kubernetes consumer, if you have been consuming upstream binaries and upstream Kubernetes Linux packages, do note that the legacy apt.kubernetes.io and yum.kubernetes.io repositories are deprecated and are going to be removed next month. You are encouraged to move to the new package repositories, because they are on community infrastructure and maintained by the community directly, not on third-party resources.

So the question is, why would you even need to build a universal control plane? Let's go through some of the factors that might bring you to do that. These factors are in no particular order of importance or precedence; they're just some things you might look at when building platforms or deploying your apps, or if you are an end user who uses the cloud. First, you might be on hybrid cloud. You might want to deploy apps to both public cloud and private cloud. To do that, you need something that can universally place apps onto those cloud platforms. But every cloud provider has separate APIs; they have different ways of doing things and different terminologies. How do you manage them? You need some ease of use baked into however you are deploying those apps.
Once you have navigated that problem, the next problem is: how do you go from 1x to 100x? How do you do whatever you're doing at scale? What if you want to manage 100,000 deployments? That's going to be an issue when you are on hybrid cloud. And when you run such a large-scale, inbound-traffic-facing infrastructure, you also have to think about security.

Now, all of that aside, how is this even related to Kubernetes, and why is Kubernetes so good for this? You might know Kubernetes as something that can run apps: run your deployments, run your databases, run your ML workloads, run your batch applications. What makes it so great is, number one, the declarative state management engine that comes with Kubernetes. You can just apply some manifests and expect Kubernetes to reconcile them reliably, act on them, and create things for you. That is the beauty of Kubernetes. And Kubernetes does it at really good scale. By good scale, I mean you can run a single Kubernetes cluster up to 5,000 nodes; that's the scale we test for and guarantee. You can obviously try more, at the risk of losing that guarantee.

Now, what is the essence of this talk? What I'm trying to put out is: how do you build a universal control plane? How do you eventually get to build one, and what are the technologies you can look at to build something like that, which can talk to multiple cloud providers at scale, give you the same ease of management as Kubernetes, and at the same time be secure enough for your use cases? I'm going to show a few key CNCF projects in the ecosystem right now. By no means are they the only ones, but they are doing a really great job at building something universally acceptable to everyone.

The first project: Crossplane. So what does Crossplane do? Think of Kubernetes and a Kubernetes cluster.
When you run Crossplane on it, it creates a set of CRDs and runs a set of controllers for those CRDs. But what does that even mean? That's just like any other Kubernetes CRD and its associated controller. The point is that, as a user, when you talk to a Crossplane-enabled Kubernetes cluster, you can talk to several cloud providers, or, as a matter of fact, to anything that has an API. For example, it would be counterproductive, but if you have a kind cluster, you can even create another cluster through a Crossplane-enabled Kubernetes cluster, just as a proof of concept.

But what if you want to have some fun? You can even order a pizza. What if Domino's had an API to order a farmhouse pizza for you? You could do kubectl apply -f farmhouse.yaml and get a pizza. But since we are in Bangalore, why not get an idli instead? So Crossplane enables you to order something from a cloud provider, and with the whole Kubernetes reconciliation idea and declarative state management, it will get you that resource in that cloud provider, or, as I said, in anything that has an API. People have crazy proofs of concept on Twitter if you search for Crossplane use cases; that's where I got the pizza use case. You can just order a Margherita or a farmhouse, whatever you fancy. Or if Rameshwaram Cafe or A2B started having an API for their orders, or maybe Swiggy started having a public endpoint, you could even start ordering food using Kubernetes.

That aside, what do you end up with when you use Crossplane? You end up with something that has Kubernetes-style declarative configuration, where everything is stored in a single source of truth, and you can do GitOps-style deployments, where you might have one Git repo storing all the manifests and GitHub Actions running on it.
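To make that concrete, here is a hedged sketch of the kind of manifest you might keep in such a Git repo. It assumes the Upbound AWS provider for Crossplane is installed; the bucket name and ProviderConfig name are hypothetical, and the exact group/version depends on which provider and provider version you run.

```yaml
# Hypothetical Crossplane managed resource: an S3 bucket ordered
# declaratively, assuming the Upbound AWS provider is installed.
# Group/version and fields vary by provider and provider version.
apiVersion: s3.aws.upbound.io/v1beta1
kind: Bucket
metadata:
  name: my-demo-bucket            # hypothetical name
spec:
  forProvider:
    region: ap-south-1            # Mumbai
  providerConfigRef:
    name: default                 # points at your cloud credentials
```

You apply this with kubectl (or let your CI do it on git push), and the provider's controller reconciles the real bucket into existence; delete the object, and the controller cleans the bucket up again.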
Whenever it sees a change somewhere, it just talks to that Kubernetes cluster and applies the YAML, and then it's Crossplane's job to create that thing for you.

Extensibility: as we know, Kubernetes is highly extensible through the whole concept of custom resource definitions. Similarly, if you want to build your own APIs, all you need to do is implement a provider, and the lifecycle of your resources is run for you. Unification: what we are essentially talking about here is infrastructure and apps being defined in the same place. You have all of those things in the same Kubernetes cluster; you define your infrastructure by writing YAMLs, and also your apps and how your apps look. Automation: since we talked about GitOps and how Kubernetes reconciles things, you get a highly automated workflow. All you need to do is write a YAML and git push, and everything is done for you at the end. Now, this might seem a little contradictory after I talked about unification, but there is also separation, at the policy and app level. You can decide what the policy looks like, who can create stuff on your Kubernetes clusters and, essentially, on your cloud providers, just by doing Kubernetes-style RBAC.

Now, enough about Crossplane. Actually, the next talk in this very same room is about Crossplane, so if you want to learn more about it, please do stay here and listen to the next talk.

Cool, we talked about one thing. Now, what is the next project from the CNCF that you can use? The other one is KCP. There's a very popular misconception that KCP stands for Kubernetes control plane. It doesn't. KCP is actually described as a Kubernetes-like control plane, but the acronym doesn't mean anything; that's from the authors. What KCP gives you is a Kubernetes control plane that doesn't have anything related to workloads.
A KCP cluster, or a KCP control plane, will only have CRDs, namespaces, events, some RBAC resources, essentially any resource types that are required for CRDs and namespaces to work, and a few other great things, which I'll talk about. Under the hood, KCP also has multi-tenant capabilities, and that's what makes KCP a really good deal if you want to build platforms for multiple orgs in your company, or if you're a SaaS provider and want to build a SaaS. KCP has the capability to behave like multiple Kubernetes clusters. They call these logical clusters, and each of them is independent; they are completely new Kubernetes clusters. KCP also exposes them through a user-facing concept called workspaces, where a workspace is essentially an isolated Kubernetes cluster, just without any workload APIs.

Along with this, it's not just about resources and multi-tenancy. What is also good is that KCP bundles a lot of workspace-aware controllers, and it bundles a workspace-aware controller-runtime. So if you are using controller-runtime to reconcile your CRDs, KCP ships a version where you can talk to multiple workspaces at once. That's the beauty of the tooling that KCP gives you.

Now, let's talk about the API service provider model. I see that some portion of the screen is cut; I just noticed, so I'll try to speak about those portions. Let's say you have a company called Acme Corp, you have a platform team inside the company, and you have multiple other teams, a dev team and ML teams. Maybe you have more teams, like a data science team doing inference, and the ML team using those models and running them in production. Now, as a company, you might not want everything to be available to every team. Here, the service provider model of KCP really comes to the rescue.
What you can do, and I'm not sure if you can see everything with this contrast, is that the platform team can define CRDs in their workspace and create an export for other people to use. As a platform admin, you can define bindings in specific team-level workspaces, so teams can basically subscribe to those exports and get those CRDs in their own workspaces. To give a very simple example, the platform team can define an ingress gateway, and can define something like a deployment. In KCP, there is nothing called a deployment; remember, I mentioned that no workload-related APIs are present. So the platform team can define a deployment, an ingress gateway, and a load balancer, and the dev team can bind only to those APIs. But the ML team may need batch workloads; they may need jobs to work. The platform team can define those too, but that binding would only be for jobs. In that way, there is a service provider, which is the platform team, and there are service consumers, the separate teams, and they can consume different kinds of APIs depending on their use case. All of this is tightly controlled by KCP's RBAC, which is based on Kubernetes RBAC, but with a concept that works across workspaces as well as across namespaces. One more thing: when I talk about workspaces and independent Kubernetes clusters, inside each workspace you get separate namespaces as well. These are basically separate, independent Kubernetes clusters. It is all Kubernetes-conformant; it just doesn't have the workload APIs.

The next bit of KCP, which is very important, is resource syncing. What do I mean by resource syncing? The diagram here is not cut; it's deliberately scoped and zoomed to this portion. When I talked about a dev team having access to a deployment, remember that KCP doesn't have any compute capability. You have to attach compute to it.
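As a rough sketch of the two ideas here, the export/binding model just described, and the way compute gets attached, this is approximately what the objects looked like in KCP's experimental APIs around the time of this talk. These APIs have changed across KCP versions, so treat the groups, paths, and names below as hypothetical illustrations, not a definitive reference.

```yaml
# Provider side: the platform team's workspace exports an API.
apiVersion: apis.kcp.io/v1alpha1
kind: APIExport
metadata:
  name: platform-apis
spec:
  latestResourceSchemas:
    - v1.deployments.platform.acme.corp   # hypothetical schema name
---
# Consumer side: a team workspace binds to that export and gets
# the API surfaced in its own workspace.
apiVersion: apis.kcp.io/v1alpha1
kind: APIBinding
metadata:
  name: platform-apis
spec:
  reference:
    export:
      path: root:acme:platform            # hypothetical workspace path
      name: platform-apis
---
# Compute side: a SyncTarget represents a physical cluster that runs
# a syncer agent, which is how workloads actually get placed.
apiVersion: workload.kcp.io/v1alpha1
kind: SyncTarget
metadata:
  name: aws-mumbai                        # hypothetical cluster
  labels:
    region: ap-south-1
```

The syncer itself runs on the physical cluster and is pointed at the KCP workspace; from then on, it pulls resources down and pushes status back.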
The two clusters below, one running in Mumbai and one in Hyderabad on AWS, one in the ap-south-1 region and one in ap-south-2, can run workloads. They are proper Kubernetes clusters with compute capabilities. In the KCP world, there is an agent called the syncer. You run a syncer on each Kubernetes cluster that you want to run your apps on. When you run a syncer, you define the target cluster as well: you give it the location of the KCP cluster and the workspace that it needs to subscribe to. KCP then sees, oh hey, there's a syncer available, and there is a placement strategy. You can say, for this app, be available as three replicas in each Kubernetes cluster that is defined. In that way, you can do very novel placement strategies.

Along with placement strategies, you can also handle failover scenarios. In this case, if one cluster goes down, say you are managing your Kubernetes clusters using EKS, and a new cluster comes up, you can define in your templates: run a syncer again when the cluster comes up. When it comes back up, KCP knows there is a new syncer there and will target the workload again. So there are self-healing capabilities in KCP as well, and KCP can help you with multi-cluster capabilities.

With Crossplane, we saw something that acts as a generic API provider plane, and now we are seeing something that does an API service provider and consumer model with multi-tenancy and resource syncing. Just think about those concepts together, and then we can practically build a control plane based on all of this. We have talked about two CNCF projects here. But what about Kubernetes itself? Kubernetes has an API server right now, but the API server has everything bundled together.
Workload capabilities, all the CRD mechanisms, API aggregation, the controllers running in the controller manager. The Kubernetes community is trying to decouple those. They are trying to abstract out one more concept, called a generic control plane, where all the generic bits of the Kubernetes API server will live. You can then essentially publish it and consume it as your own control plane. I'll come to why that is very important to KCP. The reason we are abstracting out the code but keeping it in the same place is to keep a unified code base. Kubernetes exists as a monorepo, with a publishing mechanism for consumers to use the things we publish as Go modules.

Let me focus on this graph a little. The vision here, and this is a very complicated scenario of dependencies right now, is that the green bit, the generic control plane, is something that does not exist in Kubernetes yet; it's being built. What will happen in the future, maybe two or three Kubernetes releases down the line, is that the kube-apiserver, the exact thing you consume as a consumer, will use the code from the generic control plane, and the generic control plane will build on the Kubernetes API and the Kubernetes API server. So basically, we are splitting kube-apiserver into a generic bit and something that is very Kubernetes-specific. How this helps KCP is that KCP tries to be Kubernetes-conformant with multi-tenant capabilities, but it also needs to change a lot of code to make that work. Once we have the generic control plane, consumers like KCP can build on top of it and build their own Kubernetes-like server. It won't be Kubernetes, but it will come with all the same things: state management, reconcilers, the same controller model.

So what will be in and what will be out? In the generic control plane, CRDs and namespaces will be bundled.
So you see a lot of synergy between how KCP is dreamt of and how the generic control plane is being written. Things like secrets, config maps, RBAC, service accounts, admission webhooks, quota controllers, and aggregated API services will be optional. In Kubernetes right now, you can write your own Kubernetes-style API server in whatever framework you want, and then, in the Kubernetes cluster, define it as an APIService. KCP doesn't support that today, but with the generic control plane coming in, it will also be a tunable that you set in Go code. I don't mean that as a consumer you can tune a running API server; it will all be in code, but when you start the API server, you will get all these options.

That is functionality-wise. What about controllers? I just took three very core controllers of the Kubernetes control plane: resource quota, garbage collection, and namespace deletion. They are enabled by default, even in KCP, even in the Kubernetes API server. But the vision is that when I talk about the generic control plane, it's two things: a generic API server and a generic controller manager, or generic controllers. You will also be able to turn on and off the controllers that are required for Kubernetes to work. Remember, I'm not talking about controllers defined in the controller manager, like the job controller, which you can turn off anyway, and all that happens is you can't use the Job type anymore. The vision is that the core tenets of Kubernetes are being questioned, and in order to build something that only has Kubernetes-style APIs and reconciliation loops, we want everything to be tunable.

There's a very beautiful demo of this feature from this KubeCon. When I share the slides, you can go to the link, or you can search for the API Machinery session from KubeCon + CloudNativeCon North America; you can see the recording and the specific timestamp. There's a demo of this in action. This is still in flux; the feature is still being built.
And remember, this is not something that will be published as a Kubernetes feature. It is mostly a code reorganization. At the end, what people will see is two things: Kubernetes code split, in the same place, into two specific bits, and a sample implementation of a generic API server along with a sample implementation of generic controllers. That's the end goal of this effort. This way, you can build any API server that you want, with any kind of resources that you want.

Now, we're almost at the end of the talk. It has been very dense. To conclude, when you start going down this path, what are the things you gain? There are a lot of challenges, a lot of problems you will solve eventually, but the things you get are:

Portability. The universal control plane is supposed to be deployable anywhere. It doesn't depend on cloud providers; it may even run on my machine. The generic control plane, or KCP, can run maybe on Netlify, I guess, or anywhere you can run a persistent process with a backing DB; KCP can also start with an in-memory DB.

Consistent interface. You have Kubernetes and its versioned APIs. If you're on the same version, your APIs are the same; there is no inconsistency. And when you talk to cloud provider APIs, you are just talking Kubernetes. You are speaking the Kubernetes language, not AWS, GCP, Azure, or vSphere.

Centralized API. When we talked about platform teams building things, you have optimum control at the org level to define policies and the APIs that are available to your teams. And all of this is configured by YAMLs again; everything can be GitOps.

Avoiding vendor lock-in. When I talk about portability, you also avoid locking yourself into a particular vendor.
If your app model is dynamic enough to abstract out compute, storage, and network, which are the three pillars of cloud, you can run anywhere.

Deployment strategies. We talked about how you set strategies for actively deploying to each data center or each cluster that you define. Your organizational hierarchy can be anything; your cluster hierarchy or your failure domains can be based on zones, regions, or countries, and you can have strategies based on that, right at the core of the infrastructure.

Day-2 operations. It is very well known that Kubernetes upgrades are tough. It's not that tough if you're using vanilla Kubernetes, but in a lot of surveys we have seen that people are reluctant to upgrade Kubernetes; it's a very challenging proposition. And if upgrades fail, you also need to roll back, or mitigate those kinds of scenarios somehow. When you're doing things GitOps-style, you don't need to think as much, and you have resiliency as well.

With that, thank you so much for attending the talk. Questions? We have a lot of time for questions. No questions, anyone?

[Audience] We talked about KCP and the generic control plane. What's the guidance in general if I want to embark on something now? Should I wait for the generic control plane and start playing around with that, or start with KCP?

You're talking about the future? What you can use?

[Audience] If I had to work on a project now.

Now? So, the generic control plane is a vision, not completed. I'll just repeat the question once: if somebody wants to build a universal control plane now, do they use KCP or do they use the generic control plane? The thing is, the generic control plane doesn't exist now as a consumable, but it will be available in one or two Kubernetes releases. We hope it would be available in 1.30.
I would pin my hopes more once I see the reference implementation and the reference binary circulating. If you want to build something right now, KCP is the thing you can use, but do note that KCP is very experimental, so when you use it, you have to take care of some gotchas as well. Any more questions? All good? I'll be in the hallway; you have five more minutes to enjoy. The slides will be available on my Speaker Deck at speakerdeck.com/palnabarun. I'll upload them right now so that you can view the slides and the links right away. And thank you, everyone, for attending.