Welcome everyone. This is Hasan Turkan. I'm a software engineer living in Turkey and working remotely for Upbound. I have been working as a back-end engineer, mainly focusing on our hosting solution for Crossplane, which is an open-source CNCF project that I will talk about soon. In this talk, we are going to discuss the minimal control plane components needed when we want to use Kubernetes as a general-purpose control plane. Building on top of that, I will present our solution for running multiple isolated control plane instances as tenants on a single Kubernetes cluster. At the end, there will also be a quick demo that provides a practical example of the proposed solution.

Okay, let me start with the term "Kubernetes as a general-purpose control plane". Kubernetes is the most popular container orchestration tool today. Its API and control plane are primarily built for managing containers. However, thanks to its great extensibility story, Kubernetes is getting more and more popular for managing resources other than just containers. When we say Kubernetes as a general-purpose control plane, we mean using the Kubernetes API and its control plane to manage resources that are not living inside the cluster but are external to it. Crossplane is a very good example of this, and the underlying reason that triggered us to think about the solution I will present in this talk. Crossplane is an open-source CNCF project. Crossplane itself is also extensible with providers, for example the AWS provider or the GCP provider, which allows you to install only the controllers you need, or to implement your own provider to manage whatever resource you are interested in with Crossplane.

But why the Kubernetes API? Why could it make sense to use Kubernetes to manage something that is not actually living inside the cluster? I believe the biggest advantage of using the Kubernetes API is being able to use the same API that we are using to deploy our applications, which enables easy and native integration between our applications and whatever resources we want to manage. For example, this way we will be using the same API both to provision our infrastructure and to deploy our applications. Crossplane already leverages this one-API-for-all concept by building models around infrastructure and applications. A very good example of this is Crossplane being an implementation of the Open Application Model, also known as OAM, which is a team-centric standard for building cloud-native applications, focusing on separation of concerns between app developers and infra operators.

The Kubernetes API is easily extensible with custom resource definitions, and there is existing tooling to easily implement our controllers without dealing with common needs like queuing, caching, and even initial project scaffolding. With its declarative API, Kubernetes allows you to define desired states, leave it to the controllers to bring the system to those desired states, and combine them together with GitOps-style pipelines. By using the Kubernetes API, we will also be able to use existing machinery like namespaces, RBAC for access control, garbage collection with finalizers, and so on. With well-defined API CRUD semantics, we can focus on just the reconciliation logic we really need in our controller.
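To make the extensibility point concrete, here is a minimal sketch of a custom resource definition; the group, kind, and schema are hypothetical examples, not anything from the talk:

```yaml
# A minimal CustomResourceDefinition extending the Kubernetes API with a
# new type; group, kind, and fields here are illustrative assumptions.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: buckets.example.org
spec:
  group: example.org
  scope: Namespaced
  names:
    plural: buckets
    singular: bucket
    kind: Bucket
  versions:
    - name: v1alpha1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                region:        # desired state: where to create the bucket
                  type: string
            status:
              type: object
              properties:
                ready:         # observed state, set by the controller
                  type: boolean
```

A controller then watches Bucket objects and reconciles the external resource toward the declared spec, exactly the pattern Crossplane providers follow.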
After this introduction, I will now give a quick definition of the problem that we are trying to solve in this talk, and then I will continue with the solution and the challenges we faced while implementing it. Let's start with a simple question: how do we typically run isolated instances of Kubernetes control planes? The answer is simple: we run a dedicated Kubernetes cluster, which comes with its own control plane. However, what if we need to run these control planes at scale? For example, imagine we need to run millions of them. Then running a dedicated full-fledged Kubernetes cluster does not sound like a good idea, since we are only interested in its API and control plane. So let's define our problem as running isolated instances of the Kubernetes API and control plane at scale.

Here you can see a diagram of a Kubernetes cluster with all of its components. There are workload components running on worker nodes and control plane components running on master nodes. Do we need all of these components when we want to use Kubernetes to manage external resources? No. Since we are not going to use Kubernetes as a container orchestration tool, we won't need worker nodes. If there are no containers and no workers, there is nothing to schedule, so we won't need the scheduler either. Similarly, there won't be a need for networking between containers, which means we can remove kube-proxy and kube-dns. The cloud controller manager is the component responsible for cloud-specific logic when your Kubernetes cluster runs on a cloud platform, for example bootstrapping a VM and adding it as a node, exposing service endpoints, or provisioning volumes for your applications. All of these functionalities are only needed for containers; if there is no container, then there is nothing to attach a volume to, so we can remove it as well.

Okay, this makes sense: we just need the Kubernetes API. So we have the API server, and the API server requires etcd for persistence. But what about the controller manager? Do we need it as well? The controller manager contains the core controllers that are shipped with Kubernetes. Let's have a closer look to decide whether we need them or not. Here you can see all the controllers available in the controller manager as of Kubernetes 1.18. There are controllers like DaemonSet, Job, garbage collector, and service account token, and for our solution we won't need the ones that are related to container orchestration. After removing those, we are left with the controllers required to use the Kubernetes API and its control plane for managing external resources while still leveraging the existing machinery that is not directly related to container orchestration. The controller manager binary has a flag to set the controllers that we want to activate, and we pass this reduced set of controllers for our solution.

Coming back to our diagram, these are the components we really need to use Kubernetes as a general-purpose control plane, which looks much simpler than a full-fledged Kubernetes cluster. But we have a problem: how do we run our operators, which typically run inside the cluster as containers? Remember, we want to extend the Kubernetes API with custom resource definitions and run our controllers to reconcile those custom resources to manage something external to the cluster. Just to make it clear: when I say operators or controllers, I am using these terms interchangeably, referring to the same thing. So let's also add our operators to the set of components that we need to run. Okay, these are the components that we need to run for our control plane.
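As a concrete illustration of the --controllers flag mentioned a moment ago, a tenant's controller manager could be started roughly like this; the exact controller list, image tag, and file paths are assumptions, not the speaker's verbatim configuration:

```yaml
# A rough sketch of a tenant kube-controller-manager with only the
# controllers that matter without container workloads; the controller
# list and paths are illustrative assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: kube-controller-manager
  namespace: tenant-1   # hypothetical tenant namespace
spec:
  containers:
    - name: kube-controller-manager
      image: k8s.gcr.io/kube-controller-manager:v1.18.0
      command:
        - kube-controller-manager
        - --kubeconfig=/etc/kubernetes/tenant.kubeconfig
        # Needed by the serviceaccount-token controller to sign tokens.
        - --service-account-private-key-file=/etc/kubernetes/pki/sa.key
        # Enable only controllers unrelated to container orchestration.
        - --controllers=namespace,serviceaccount,serviceaccount-token,garbagecollector,resourcequota,clusterrole-aggregation,csrsigning,csrapproving
```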
Remember, we want to run isolated instances of the control plane, so let's put them into containers as the first step of isolation. We also want to run them at scale, for example targeting millions of instances. I guess we will need a container orchestration tool to manage that amount of containers, and let's use the most popular container orchestration tool for this purpose, which is again Kubernetes.

Before diving into the challenges we faced with this solution, I want to spend some time on this slide to make two definitions that we are going to refer to in the following slides. The first one is the host cluster: the Kubernetes cluster that is hosting our control plane instances. The second one is the tenant namespace: the dedicated namespace into which we deploy the components of a control plane instance. And when we say tenant Kubernetes, we refer to the Kubernetes API of our control plane.

Let's talk about the challenges. Kubernetes operators, or controllers, typically run inside the cluster that they need to interact with. However, in our case, the operators run on the host cluster but need to watch the tenant Kubernetes API servers. This introduces three problems that we need to solve: connectivity, authentication and authorization, and packaging.

To deal with the first two, let's have a look at how this is handled in a regular setup, where the operator is deployed into the same cluster it needs to interact with. Kubernetes provides the necessary mechanisms to connect, authenticate, and authorize a process running inside the cluster against its API server, which is called the in-cluster config. On the right-hand side, you see the source code from the Kubernetes Go client for initializing the in-cluster config. The two environment variables, KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT, allow the operator to find the Kubernetes API server endpoint. For authentication and authorization, Kubernetes generates a token for the service account of the operator and automatically mounts that token into the pod at a predefined path.

We don't have to do anything for a regular deployment, but in our case we need to adjust these to work against the tenant API servers. We need to generate a service account token for the operator's service account on the tenant API server and make it available to the operator pod running on the host cluster. To achieve this, first we deploy the service account into the tenant API server; then the service account token controller in the controller manager, which is one of the controllers we kept enabled, generates a token secret, but inside the tenant API server. Finally, we copy this token secret from tenant to host and mount it into the operator pod at the predefined path expected by the in-cluster config. Once we have the token secret available on the host cluster, we can adapt the manifest of our operator pod as shown on the right-hand side. For connectivity, we override the two environment variables, KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT, to point at the tenant API server. For authentication and authorization, we set automountServiceAccountToken to false in the pod spec, instructing that we want to opt out of the automatic token mount, and mount the token secret for the tenant API server instead.
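Put together, a minimal sketch of such an adapted operator pod could look like the following; the namespace, image, service, and secret names are hypothetical stand-ins for whatever the real setup uses:

```yaml
# Sketch of an operator pod on the host cluster, pointed at a tenant
# API server; names like tenant-1 and tenant-apiserver are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: my-operator
  namespace: tenant-1
spec:
  # Opt out of the host cluster's automatic token mount.
  automountServiceAccountToken: false
  containers:
    - name: operator
      image: example.com/my-operator:latest   # hypothetical image
      env:
        # Point the in-cluster config at the tenant API server instead
        # of the host cluster's API server.
        - name: KUBERNETES_SERVICE_HOST
          value: tenant-apiserver   # in-namespace service name
        - name: KUBERNETES_SERVICE_PORT
          value: "6443"
      volumeMounts:
        # Mount the copied tenant token secret at the path where the
        # Go client's in-cluster config expects it.
        - name: tenant-token
          mountPath: /var/run/secrets/kubernetes.io/serviceaccount
          readOnly: true
  volumes:
    - name: tenant-token
      secret:
        secretName: tenant-operator-token   # copied from the tenant API server
```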
Another problem related to running on the host cluster while watching the tenant API server is packaging. A typical deployment package of a Kubernetes operator, let's say a Helm chart, contains custom resource definitions, RBAC rules for accessing those custom resources, and the deployment manifest of the operator itself. However, here we need to deploy those manifests into different API servers: we want to extend the tenant API server with the custom resource definitions, but we want to run our controller pods on the host cluster. So the solution is packaging controllers and types separately. On the right-hand side, you can see Crossplane packaging as an example, where we have two different Helm charts, namely crossplane-controllers and crossplane-types. Once we have separate packages like this, we can deploy the controllers chart to the host cluster and the types chart into the tenant API server.

The next challenge is running etcd for the tenant API servers. etcd, as the only persistent component of our control planes, requires some special care, which introduces some trade-offs that we need to think about. We have three options to satisfy the API server's etcd dependency. The first one is simple and straightforward: run a dedicated etcd instance per tenant, or an etcd cluster in case of high availability requirements. Maintaining and operating etcd clusters for production requires some operational knowledge and work, and with this option the maintenance cost will be much higher, since we will have one per tenant. Another concern is that etcd writes data to disk and recommends SSDs in production, which would mean a higher cost per tenant, especially when multiple replicas are needed.

The second option is running one shared etcd cluster on the host cluster and isolating tenants via etcd prefixes, users, and roles. This provides a cost-efficient solution and requires less maintenance compared to the first option, but it comes with its own problems. First, etcd is not horizontally scalable, which means that after some number of tenants, etcd will be the limit for scalability. The second problem is that etcd does not have any mechanism to isolate resource usage between users, breaking isolation between tenants in terms of resource consumption. This is also known as the noisy neighbor problem, where a resource-hungry tenant could affect other tenants on the same host cluster.

The last option is using Kine together with an external, horizontally scalable database. Kine is an etcd shim that translates the etcd API to other databases like MySQL or Postgres. Kine is an open-source project by Rancher, mainly used by K3s, which is a lightweight Kubernetes distro. With this option, we would need an external database, and we would still need to prove that this setup works well in production.

Another issue we need to deal with is accessing the tenant API servers from outside the cluster. This is actually not a real challenge, but still worth mentioning. For users accessing from outside, we deployed an ingress controller on the host cluster and created an ingress rule in each tenant namespace. By configuring the ingress as SSL passthrough, we could keep traffic encrypted all the way down to the targeted tenant API server, as in the sketch below.
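Here is a rough sketch of such a per-tenant rule, assuming the ingress-nginx controller (which supports passthrough when started with --enable-ssl-passthrough); the hostname and service name are made up:

```yaml
# A per-tenant ingress rule with SSL passthrough; assumes ingress-nginx
# and a hypothetical hostname/service.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: tenant-apiserver
  namespace: tenant-1
  annotations:
    # Forward the TLS stream untouched, so traffic stays encrypted
    # end-to-end down to the tenant API server.
    nginx.ingress.kubernetes.io/ssl-passthrough: "true"
spec:
  rules:
    - host: tenant-1.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: tenant-apiserver
                port:
                  number: 6443
```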
I also want to talk about security and isolation a bit, since we ended up with a multi-tenant solution on the same host cluster. Since our host is already a Kubernetes cluster, we could leverage what Kubernetes provides for isolation, like putting tenants into dedicated namespaces, defining network policies to allow only the required traffic inside the tenant, and defining pod security policies to enforce pod security. With limit ranges and resource quotas, we could define resource constraints for tenants. We have configured mutual TLS between all system components, and we are also running sandboxed containers with gVisor, which limits the host kernel surface accessible to the application by providing an application kernel.

A host cluster can run multiple instances of our control planes. However, this is still limited, because there are limits to the workloads that a single Kubernetes cluster can handle. Since we are targeting millions of control planes, we need multiple host clusters, which means we need another entity that manages the host clusters and is responsible for scheduling control planes. That is another Kubernetes cluster, which we call the scheduling cluster. All of these processes need to be automated, since we are talking about the hosting solution for our as-a-service platform, and we automated them via Kubernetes operators. Remember, the Crossplane project allows you to provision infrastructure directly from the Kubernetes API, and we are also using this to create the required infrastructure for our host clusters, which plays very well with our provisioning operators.

The last topic I want to talk about is taking backups of our control planes. Backups are a little different from the other day-two operations we are dealing with, which I will not cover here, because they prove our approach in a way: the backup controller operates on our tenant control plane just like on a regular Kubernetes cluster. We are using Velero, an open-source project for backup and restore of Kubernetes clusters, which also extends the Kubernetes API with custom resource definitions and has a controller acting on those resources. So with the setup shown on the left-hand side, we can take backups of our control planes. Being able to easily take backups of our control planes also enables migration between host clusters, which lets us consider host clusters as ephemeral resources and simplifies operational work.

All I have presented so far is already implemented and running in production, which I will quickly show as a demo soon. However, there are some future work plans as well; I have tried to list the relevant ones here. First, we are using etcd and are really interested in using Kine instead, since it could help scalability and provide operational simplification if it could be used with a managed database. We did some proof-of-concept work and it looks quite promising, but we still need to evaluate it for production use. For our control plane, we are running three components: the Kubernetes API server, the controller manager, and the Crossplane operators, and we want to investigate whether we can combine them into a single binary at compile time, with further optimizations that could help in terms of resource consumption. The last item is using Crossplane composition for provisioning our host clusters. Crossplane composition is a recently introduced, powerful concept that allows us to define composite resources by combining multiple resources into a new type. As mentioned, we are already using Crossplane to provision our infrastructure, but currently we have a dedicated operator that provisions host clusters by orchestrating infra resources and deploying applications. As future work, we want to replace this with a Crossplane composite resource, which could simplify our code base.
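As a rough illustration of the Kine option from the future-work list, a tenant's etcd endpoint could be served by something like the following; the image tag, DSN, and flags are assumptions to check against the Kine README:

```yaml
# Sketch: Kine as the etcd shim for a tenant API server, backed by an
# external Postgres; tag, DSN, and flag values are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kine
  namespace: tenant-1
spec:
  replicas: 1
  selector:
    matchLabels: { app: kine }
  template:
    metadata:
      labels: { app: kine }
    spec:
      containers:
        - name: kine
          image: rancher/kine:v0.9.3   # hypothetical tag
          args:
            # Translate etcd API calls into SQL against Postgres.
            - --endpoint=postgres://kine:password@postgres.example.com:5432/tenant1
            - --listen-address=0.0.0.0:2379
          ports:
            - containerPort: 2379
# The tenant kube-apiserver would then point at Kine via
# --etcd-servers=http://kine.tenant-1.svc:2379
```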
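And to make the earlier resource-isolation point concrete, per-tenant constraints with resource quotas and limit ranges could look roughly like this; the numbers are purely illustrative:

```yaml
# Sketch of per-tenant resource constraints; all values illustrative.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: tenant-quota
  namespace: tenant-1
spec:
  hard:
    requests.cpu: "1"       # tenant may request at most one core in total
    requests.memory: 2Gi
    limits.cpu: "1"
    limits.memory: 2Gi
---
apiVersion: v1
kind: LimitRange
metadata:
  name: tenant-limits
  namespace: tenant-1
spec:
  limits:
    - type: Container
      default:              # applied when a container sets no limits
        cpu: 200m
        memory: 256Mi
      defaultRequest:
        cpu: 100m
        memory: 128Mi
```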
With that, it's time for the demo. I have two quick demos. First, from the terminal, I will provision my local kind cluster as a host cluster and create two control planes as tenants. Second, I will show things in action by creating a dedicated control plane from the UI of our service.

So now I want to share my screen. Let's see. I need to give permission. I need to reconnect, I guess, since I'm already connected from my phone. Okay, I'm trying to log into the platform again. Okay, I think I'm back. Now, can you see my screen? Can someone please confirm? Okay, cool. So here you should see the terminal, and there is a PNG file which shows the flow of the demo. Right now I don't have any kind clusters, and now I will call setuphost.sh, which does the provisioning shown at the top of the PNG. What it does is create a kind cluster and then deploy the host cluster components: cert-manager, for provisioning certificates for etcd and the Kubernetes control plane components, and then a shared etcd on the host cluster. So now it is provisioning the kind cluster; it should be ready soon. Okay, the kind cluster is ready. Now it is pulling the required images for cert-manager, just to speed things up, and now deploying cert-manager in the cert-manager namespace and waiting until it is ready. This could take a couple of seconds, because the kind control plane sometimes takes a while to become ready. Maybe I can quickly talk about the next step. Once cert-manager is ready, etcd will be deployed, and then I will run another command, create-tenant.sh, and give it a tenant ID, let's say one. It will create a dedicated namespace on my cluster and deploy the control plane components for me. I'm also deploying a debugging component just for simplicity, and with that debugging component I will connect to the tenant API server and show that I actually have a dedicated, isolated control plane. But let me quickly check what's wrong here. Let's see how cert-manager is doing. Okay, it just started working. Okay, now that cert-manager is ready, it is deploying etcd, and once that is also ready, I will continue with the next step.

Okay, so the host cluster is provisioned. Now I will go ahead and create two tenants. This is the first one, with ID one, and then I will create another tenant. So let's see: here you can see the dedicated namespaces for the tenants, and let's just check one of them. We have two control plane components and one just for debugging. Let's wait until they are ready. Okay, now I will exec into the kube-debug pod, which is nothing but a simple container with kubectl in it, configured against the tenant API server. Now I have a dedicated cluster view here, and this cluster, which is actually not a real cluster, does not have any nodes. Let's say I create a namespace: here you can see I am in tenant one and I created a dedicated namespace, and if I switch to the other tenant, I will see that I have a completely different cluster view, dedicated to this tenant.

So yeah, this is actually the first demo that I planned to do, and the second one is showing things in action. This is the platform that we have built; I would highly suggest you go and check it out. It is in community preview and you can try it freely. So I will go ahead and create the control plane here. Actually, we call it an environment, and what I have talked about so far is what is going on behind the scenes.
So when I create this environment, behind the scenes our operator goes and provisions those dedicated control planes and also deploys Crossplane, so that I can just start creating infrastructure resources directly from this platform, which also comes with some providers, like the ones here. You can see different providers, so once my environment is ready, I can just click one of them and extend my control plane to support the infrastructure resources of that provider. It usually takes around one minute, so I will just wait. Once it is ready, what I will show is deploying a provider, which will extend my control plane, for example for AWS, and then I will connect to my environment by following the instructions on the screen and verify that those custom resource definitions are actually available in my control plane.

So yeah, now my environment is ready, so let's go ahead and deploy a provider. I said AWS, so let's just deploy the AWS provider from here. While it is being deployed, let's connect to our environment. I have already set my access token before the session. Okay, so now I should see my cluster view; the first kubectl request is taking a bit longer. Okay, so this is my brand-new Kubernetes control plane, and let's check if it can really deploy AWS resources; let me list the API resources. So yep, as you can see, my control plane has just been extended to support these brand-new AWS custom resource definitions. I can just go ahead and create an S3 bucket right from my control plane. And let's also see that cool badge here: the provider shows as available. Yeah, so that's all. Again, I would highly recommend you to just sign up; it is live in production, it is now in community preview, and you can just go and check it out. So with that, this is the end of my demo.

So now I will check the questions. Okay, one of the questions: is this demo available to try on GitHub? Right now it is not, but this is definitely something that I have planned, so I will do that after the session, in a couple of days. You can just check my GitHub profile for that. Okay, another one: can cert-manager and etcd be in the same namespace? Yes, they can; there is no reason to run them in different namespaces. I just followed some guides and ended up having them in two different namespaces. Okay: can we change cloud providers in a tenant? I guess you mean the support for a cloud provider; this is how I understand it. Yeah, as I showed, you can support multiple cloud providers by deploying multiple packages, which means you can have as many cloud providers as you want to support with your control plane. So if I got the question right, this should be the answer. Another one: oh, I see, it is about provisioning cloud resources, things that cannot run in containers. Yeah, if I got this question correctly: we are provisioning cloud resources from Kubernetes, and you can consume those cloud resources with your applications. You can check out the Crossplane documentation for more information. But of course, those provisioned cloud resources are ready for any application to consume. Okay, one Turkish friend is saying hi; hi! And: is this easier or better than moving the tenant workloads into containers? The question is: "this is brilliant, but stepping back, is this easier or better than moving the tenant workloads into containers?" I am not sure I got the question correctly; it would be great if you could rephrase, and I can check it again.
Another question: how do you constrain the resources used by different tenants on host clusters? Yeah, as I said, we are using the built-in Kubernetes mechanisms for this: there are limit ranges and resource quotas, and you can just define a resource quota saying that this tenant cannot use, let's say, more than one core in total and more than two gigabytes of memory. Kubernetes has these mechanisms already, and we are just using them. Okay, I think with that I have answered all the questions that I could. So this is the end of my session, and thanks all for joining. You can reach me on Slack and we can continue the discussion there. Yeah, thanks all, and see you.