KubeCon + CloudNativeCon North America 2022. Hello, my name is Joshua Packer and I'm a developer at Red Hat. I'm excited to be here with you today at KubeCon 2022 to talk about the open-cluster-management.io community. I've been involved in this community since its inception through to its present status as a CNCF sandbox project. Its core focus is on multi-cluster fleet scenarios while remaining cloud vendor agnostic. At its core, this project enables you to inventory your clusters and distribute work while staying vendor neutral. Again, it works with almost any CNCF-conformant distribution.

With this inventory of your fleet, you can extend your day-2 management using the configuration policy and application add-ons that are also part of the community. You can also build your own add-ons and distribute those out into your fleet as needed. There will be a separate presentation covering both of these areas, policy and application, in the KubeCon virtual presentation catalog. But in the next 30 minutes, I'm going to do a demo of how to get started with open-cluster-management and talk about some of the future work that's being done.

Lastly, this is a community of many and our governance board reflects that, with new developers and contributing companies joining all the time. So I encourage you to come visit us at open-cluster-management.io, which is the page you see now in the presentation, or in our GitHub organization where you can actually contribute. Again, open-cluster-management.io is the organization. There's the OCM repo with project details, the community repo where you can find the schedule for our weekly calls, the API repo with the definitions for that inventory and control, and the enhancements repo if you'd like to submit an enhancement or otherwise contribute to the project. So over the next 25 minutes we will dive into open-cluster-management and take a look.

All right, let's dig a little deeper and look at how open-cluster-management works with your fleet of clusters. When we look at this diagram, we see the layout of how open-cluster-management is deployed within your fleet. We start with what is called a hub cluster. This is a single entry point where you're able to manage your fleet, which could be one, two, three, or many clusters. On that hub cluster, we run the registration controller, which takes care of maintaining the inventory. We also run controllers like placement and work; these allow for targeting of clusters as well as putting manifests, or payloads of manifests, onto clusters. This is also where the policy and application add-on controllers run, if you use our community configuration policy or application subscription add-ons, and there's a proxy add-on that can be used as well. All of these controllers run on the hub to facilitate the add-on capability within the fleet.

Then, on each of your clusters that is going to be managed by the hub, which we call managed clusters (literally, there is a resource kind called ManagedCluster that you create on the hub side), we have a klusterlet. This is our name for the agent technology, a pull-based agent that reaches out to the hub. So you define the installation of the klusterlet, which is done via the CLI or Kubernetes resource kinds.
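To make that inventory record concrete, here is a minimal sketch of what a ManagedCluster resource can look like on the hub; the cluster name and labels are just examples for illustration.

```yaml
# Minimal illustrative ManagedCluster record on the hub (name and labels are examples).
apiVersion: cluster.open-cluster-management.io/v1
kind: ManagedCluster
metadata:
  name: cluster1
  labels:
    environment: dev          # arbitrary labels you can later select on with Placement
spec:
  hubAcceptsClient: true      # the administrator's signal that the hub should accept this cluster
```

Once the klusterlet on the managed side registers and is accepted, the hub fills in status conditions on this same resource, which is what you'll see us query later in the demo.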
As that klusterlet deployment, or controller, comes online, it reaches out to the hub and tries to register, and once accepted it will start to look for work as well as report on inventory. So it's a pull model: the agent pulls in the desired work or instructions it needs to apply, and it pushes back status on that work and from the different add-ons that may or may not be deployed, in this case the add-on agents. So in this diagram, we have one hub controlling three managed clusters, each of them able to do work, report status, and run agents if they're defined on the hub. What this gives you is a centralized, single management point, through an API in Kubernetes, with which you can manage the entire fleet. So now let's try doing a demo with this and see it in action.

The demo that you're about to see is an automated version of this quick start guide. The quick start guide uses kind clusters, which are CNCF-conformant, to demonstrate provisioning a hub as well as a managed cluster, or target, onto which we can deploy a piece of work, that is, Kubernetes manifests. Let's get started.

First things first, we're going to set up the kind environments. We're going to have a cluster for our hub, as well as a cluster for our managed target. In this case we're going to use one managed target, but you could have a number of them; under standard operating procedures you would likely have three, four, or five clusters that you're trying to manage, although, again, it can work with just one or two clusters. Once the two clusters are deployed, we will activate the hub technology, which is the controllers we mentioned in the earlier diagram that take care of registration, the inventory, as well as work, being able to apply Kubernetes resources into that fleet. The hub is now created and we see that the managed cluster is being set up. Once that's done, we will grab the CLI and start to install open-cluster-management in the fleet.

All right, we're going to set some environment variables that we can use for both the cluster name and the hub name, and pull the CLI tool and install it. The CLI tool simply collapses the creation of a number of Kubernetes resources into a single terminal command. So first things first, we're going to deploy the hub cluster components, the controllers for registration and inventory; they're also the controllers that manage placement and work. These are being activated, being pulled in from our open source repositories. Once they're up, we'll take a look and see that the actual pods for the controllers are running. Here we see the main cluster manager pod has been set up, and the rest of the control plane is now present: the placement, like I mentioned, the registration, the work, and the webhook, which allows you to apply manifests into the fleet. We also get a command that we can use to join clusters, where you paste the command and then add the cluster name and any additional parameters you might need. There are also installation details that can be found right on the ClusterManager resource. I tailed the output to 15 lines, but the status of each of the CRDs, as well as the deployments, is described on this resource.

So let's get started. We're going to register the cluster. This is cluster1, which is going to be our managed cluster. So we select the command, paste it, and give the name of the cluster, which will be cluster1.
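As a rough sketch of what the automation is doing, the quick start boils down to commands along these lines; the cluster names and contexts are from this demo, and the exact flags may have shifted as the clusteradm CLI evolves, so treat this as illustrative rather than copy-paste exact.

```bash
# Illustrative quick start flow (names, contexts, and flags may differ slightly).
kind create cluster --name hub
kind create cluster --name cluster1

export CTX_HUB_CLUSTER=kind-hub
export CTX_MANAGED_CLUSTER=kind-cluster1

# Install the clusteradm CLI.
curl -L https://raw.githubusercontent.com/open-cluster-management-io/clusteradm/main/install.sh | bash

# Bring up the hub control plane (registration, placement, work, webhook).
clusteradm init --wait --context ${CTX_HUB_CLUSTER}
# init prints a 'clusteradm join ...' command that includes the hub token and API server URL.

# Run that join against the managed cluster; the lookup flag helps because these are kind clusters.
clusteradm join --hub-token <token-from-init> --hub-apiserver <hub-apiserver-url> \
  --cluster-name cluster1 --force-internal-endpoint-lookup \
  --context ${CTX_MANAGED_CLUSTER}
```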
We set the context for where this runs, so we run the command on the managed cluster, and we use the force-internal-endpoint-lookup flag for the DNS because we're running these as kind clusters. So we execute that. What this means is that some CRDs are being created, as well as some deployments, which will bring the agent controllers online. Once the agent controllers are online, they will reach back to the hub and register a join request. That join request is done via a certificate signing request (CSR) resource. That resource then needs to be approved on the hub, and once it's signed, registration continues, at which point you'll be able to see inventory as well as apply work.

So let's verify that the controllers are online. Here we see the cluster registration agent; this is running on the managed cluster. We see that it was deployed and is running. Now we look for that certificate signing request, which we see was created but has not been signed yet on the hub. Again, here we see the context of the hub. So let's do that signing. We're going to accept the request, which approves the CSR; we did that in the context of the hub. Now that cluster1's request has been signed, the cluster will be imported. We can validate on the hub, looking at the ManagedCluster resource, that it has been joined successfully. We see joined is set to true, the status for the agent is available, it's been there for 60 seconds or so, and the hub has accepted the cluster, meaning the administrator has admitted that cluster into the infrastructure for management.

So now we're going to execute some work, which is to apply a manifest, and that can be a custom resource definition or a custom resource, onto the managed cluster that is now present in the hub. For each managed cluster, a namespace is created on the hub, and any work or add-on related tooling is created as Kubernetes resources in that namespace. We see here there's the cluster1 namespace that we're going to be using. We're going to create a ManifestWork called work-test-cm (cm for ConfigMap), and inside the payload for that ManifestWork is a ConfigMap, also called work-test-cm, that has a key-value pair of "put here by manifest work". So now we're going to apply this ManifestWork. It is applied on the hub in the cluster1 namespace. Here we see the cluster1 namespace, the YAML that we just had a visual of above, and the context is the hub, and we see it was created. And now we'll check whether the ConfigMap was in fact created in the default namespace on cluster1. Here we have the default namespace specified, as well as the context of the cluster, and we see that 22 seconds ago the work-test-cm ConfigMap, which matches the name in the definition, was created by that ManifestWork.

Now, if we had deployed something like a Deployment, we could look for status back on the pods, and the same goes for ReplicaSets, Services, or even custom CRs; you can pass that status back into the ManifestWork. So by connecting only to the hub, we can apply work to the clusters in our fleet and also receive status back from those clusters.
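For reference, the accept-and-deploy portion of the demo looks roughly like the following; the resource names mirror what was shown on screen, but read them as illustrative.

```bash
# Accept cluster1 on the hub, which approves its CSR and completes registration.
clusteradm accept --clusters cluster1 --context ${CTX_HUB_CLUSTER}
kubectl get managedcluster --context ${CTX_HUB_CLUSTER}

# Deliver a ConfigMap to cluster1 by creating a ManifestWork in its namespace on the hub.
cat <<EOF | kubectl apply --context ${CTX_HUB_CLUSTER} -f -
apiVersion: work.open-cluster-management.io/v1
kind: ManifestWork
metadata:
  namespace: cluster1              # each managed cluster gets its own namespace on the hub
  name: work-test-cm
spec:
  workload:
    manifests:
      - apiVersion: v1
        kind: ConfigMap
        metadata:
          namespace: default
          name: work-test-cm
        data:
          test: "put here by manifest work"
EOF

# Verify on the managed cluster that the work agent applied the ConfigMap.
kubectl get configmap work-test-cm -n default --context ${CTX_MANAGED_CLUSTER}
```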
All right, let's dive a little deeper into open-cluster-management. This is a hub-and-spoke architecture where you have both a hub, the central point of management, and managed clusters, the fleet of clusters that you want to work against, collect inventory on, or use with extended add-ons.

One of the key kinds that we find here is the ManagedCluster kind. This exists on the hub and is a representation of the cluster. It is what the klusterlet, the agent we saw in the demo, connects to, and it is where the klusterlet provides status on how the system is doing. This is a cluster-scoped object and is a clear representation of the fleet that can be queried against in Kubernetes.

ManagedClusterSet is a kind we haven't spoken of yet, but you can think of it as groups of clusters. You use a ManagedClusterSet to group a number of clusters, and then you can grant users access to that set, or you can bind the set to namespaces, and any resources in those namespaces that are cluster-set aware will be able to access the clusters. This is a way of managing access to your clusters at scale. Instead of having to individually apply roles and role bindings to each user, service account, and so on whenever you want to work with clusters, you only have to do that once to the cluster set, via a binding to its namespace, and then all clusters present in that cluster set can be accessed. This gives you an easy way to group systems: production, dev, test, by region, by data center. Pretty much the sky's the limit, and the labeling that's available on those clusters can even be used to bind them to a specific set.

What this grouping and access control via cluster sets then allows for is placement, which subdivides it further. Placement is the idea of finding clusters based on specific labels, claims, and taints and tolerations. Cluster claims are resources created on the cluster, either by the klusterlet or by other add-ons, while taints and tolerations, much as they do for nodes, describe information about the clusters and where things should run. There's also the newest piece, the extensible prioritizers, that we'll talk about in a second as well.

But to go back to how placement works: placement is about filtering within the concept of the cluster set. So you have a number of clusters, say test clusters, that are available in a cluster set. You can apply your app to one or more of those clusters using labels, labels such as North America, a region like US East, or even more specific labels such as UAT or, since I'm the presenter, Josh's cluster. The same label-selector concept that you find in Kubernetes Deployments, StatefulSets, and so on applies to Placement resources. So you create a custom resource that is a Placement, you put a label selector on it, and it produces a PlacementDecision, and that PlacementDecision has the list of clusters that match the labels, the taints, or the prioritizers.

The extended multi-cluster scheduling capabilities mean that you can add to the scoping and the types of decisions that are made. What you're able to do is look at different aspects of the managed cluster, be it from the hub cluster or directly on the managed clusters themselves in the form of a controller, and return those as a placement score. Placement is then able to consume those scores and produce a list, or at least an ordering for that list, of clusters that can then be consumed by an add-on or other controller that needs to do work. The extensibility of scheduling means that the sky's the limit for how you group and filter your clusters within your cluster sets.
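Putting those kinds together, the cluster-set-plus-placement chain looks roughly like this; the set, namespace, and label names are made up for illustration, and the API versions reflect the v1beta1 APIs as of this talk, so check the current API repo before relying on them.

```yaml
# Illustrative chain: group clusters, bind the group into a namespace, then place within it.
apiVersion: cluster.open-cluster-management.io/v1beta1
kind: ManagedClusterSet
metadata:
  name: dev-clusters
---
# Binding created in the namespace whose resources should be allowed to target the set.
apiVersion: cluster.open-cluster-management.io/v1beta1
kind: ManagedClusterSetBinding
metadata:
  namespace: team-a
  name: dev-clusters
spec:
  clusterSet: dev-clusters
---
# Placement filters within the bound set and emits a PlacementDecision listing the matches.
apiVersion: cluster.open-cluster-management.io/v1beta1
kind: Placement
metadata:
  namespace: team-a
  name: placement-us-east
spec:
  clusterSets:
    - dev-clusters
  predicates:
    - requiredClusterSelector:
        labelSelector:
          matchLabels:
            region: us-east
```

An add-on or controller watching the resulting PlacementDecision then knows exactly which clusters to act on.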
Next, we have proxy. Proxy is just as it sounds: it is an add-on that's available for running on the managed clusters, and it exposes the API server for that cluster on the hub. What this allows you to do is traverse difficult network situations and expose just that cluster's API server to the individual who needs it. This is based on the API server network proxy that is available in Kubernetes, but extends it so that there is a hub-side entry point as well as an agent-side add-on where the proxy exits and connects to the API server.

The last piece we'll talk about is the managed service account. The managed service account add-on gives you the capability to define a custom resource on the hub side that results in a service account being created on the managed cluster side. Once this service account is created, its token is made available on the hub side, giving you a means to connect to that cluster. You can set the expiration for this service account, and you can also use ManifestWork to deliver a Role and RoleBinding for that service account as well. This ties in well with the proxy I just mentioned, in the sense that you can use the managed service account add-on to create a service account, a ManifestWork to deliver the Role and RoleBindings for that service account, and then have your controller make a connection from the hub to the cluster using that service account token via the proxy. All of these put together give you a perfect conduit for connecting to and managing within your fleet of clusters.
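Before moving on to the new feature work, here is a hedged sketch of the managed service account flow I just described; the group, version, and field names are from memory and may not match the add-on's current API exactly, so check the add-on repository before relying on them.

```yaml
# Hedged sketch: a ManagedServiceAccount on the hub, in the managed cluster's namespace.
# The apiVersion and field names are illustrative and may differ from the actual add-on API.
apiVersion: authentication.open-cluster-management.io/v1alpha1
kind: ManagedServiceAccount
metadata:
  namespace: cluster1            # the managed cluster's namespace on the hub
  name: my-controller-sa         # hypothetical name for the service account to create
spec:
  rotation:
    enabled: true                # rotate the projected token
    validity: 720h               # expiration window for the token
```

The add-on then creates the corresponding ServiceAccount on cluster1 and reports its token back into a Secret in the cluster1 namespace on the hub, while a ManifestWork can deliver the Role and RoleBinding that give that account its permissions.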
Now let's talk about some of the new feature work that's underway. A prototype that was presented on August 19th, 2022 by one of our community contributors, Yu-Chen, is a standalone OCM control plane. Today the control plane lives on the hub as pods running as controllers that manage the inventory we've talked about earlier in this presentation, as well as the add-ons and the placement capabilities. What maybe wasn't made quite clear is that the data plane, what stores the inventory data, has always been the Kubernetes API server of the cluster where OCM, the registration controllers, are running. What this prototype does is take that data plane, the hosting cluster, and move it into a virtual API server. So instead of using the API server and etcd of the hub cluster to store artifacts like the ManagedCluster, the policy add-ons, and so on, we use a virtual API server running on a cluster, with an etcd backing that is also running on a cluster. This was made possible by changes to the registration controllers and the placement controllers, as well as changes, being contributed as I present this, to policy and application, to run in what we call hosted mode. What hosted mode gives you is the ability for the controller to run on one cluster but point its API input at another URL. This allows the data plane for these controllers to be a virtualized API server. That could be a completely virtual API server pod and etcd, like there is in this example if you go and watch the video demonstration. It also means that other virtualized control plane pieces can be serviced by the controllers as well. This is exciting work and represents, to a degree, where we think some of the control plane pieces will be going over the next 6, 12, and 18 months. Hosted mode is a new term you're going to start to hear more and more in open-cluster-management.

Hosted mode applies to both the hub-side controllers and the agent-side controllers. Hosted mode is the idea that a pod may be running in one Kubernetes distribution, but its data plane, the API server it's pointing to, is different from the Kubernetes where it's running. This could be a virtualized API server or, in theory, just another Kubernetes distribution running with a different API server URL. Hosted mode for the controllers on the hub fits into scenarios such as the standalone mode we just saw, where you had a virtualized API server, as well as some investigations we've been doing using kcp as a data plane for our controllers.

Hosted mode also exists on the agent side, and hosted mode for the agents is important because up until this point we've talked about how the agent is pull technology by default: the agent controller runs on the managed cluster, connects back to the hub, looks for work, and pulls that down to its Kubernetes distribution and applies it. What hosted mode does is flip this on its head: you can run the klusterlet agent on the Kubernetes cluster that is hosting the control plane, the hub itself, and it will connect to an API server that is different from that cluster, which in this case would be the managed or remote cluster. So in this type of scenario there are no actual agent or klusterlet pods running on the managed cluster itself. They are running either on the hub, or maybe even on an intermediary Kubernetes distribution, and those pods are connecting out to the final managed cluster and doing their work. It still means that the CRDs for ManifestWork and so on are applied to the system, but the actual pods doing the execution are running outside of that cluster. This works well for special cases where you might need to push information, or where you've got specialized control planes where you can co-locate the agent pods as well as control plane pods in certain forms of virtualization. So hosted mode, on both the agent and the hub side, is going to become more and more of a prevalent topic, and one we'll be diving into more deeply over the next few months.

One last use case for hosted mode is to support multiple hubs. What I mean by that is you can have more than one hub running its klusterlet for a managed cluster in hosted mode, and those klusterlets, which run on those two or three individual hubs, can all point to the same managed cluster. Now, you need to apply role bindings and role definitions to keep the klusterlet agents from stepping on one another, but if they are all running the same version, the CRDs applied to the managed cluster will work, and each of those agents will be able to apply work and other add-on pieces, assuming they are designed for the same push model, from different hubs to the same managed cluster. This again is a prototype that's underway, but it is one of the possibilities when you run hosted mode for agents as well.
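To make the agent side of this a bit more concrete, here is a hedged sketch of a klusterlet declared in hosted mode; the field names follow the operator API as I understand it and may differ in detail, so treat it as illustrative.

```yaml
# Hedged sketch: a Klusterlet whose agent pods run on the hosting cluster (e.g. the hub)
# but act against an external managed cluster. Field names are illustrative.
apiVersion: operator.open-cluster-management.io/v1
kind: Klusterlet
metadata:
  name: klusterlet-cluster1
spec:
  clusterName: cluster1
  namespace: open-cluster-management-cluster1
  deployOption:
    mode: Hosted        # agent pods live here; the managed cluster runs no agent pods
```

In hosted mode the agent is pointed at the external managed cluster's API server via a kubeconfig supplied as a secret on the hosting cluster, rather than talking to the API server it happens to be running on.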
Another integration we've been working on is with kcp-dev. Let's take a look, for those of you not familiar with what kcp is. It's a Kubernetes-like control plane focusing on independent, isolated "clusters" known as workspaces. These are virtualized API servers, and they enable an API service provider that is contained, I guess we'd say, to the minimum set of APIs required, which lets you abstract away the underlying clusters. This is key for things like development, or security controls with tenancy, where you want to expose just a small set of APIs to a user via a Kube API while keeping the underlying cluster or clusters separate.

So what does that mean for OCM? In OCM we did a prototype integration that's available today for folks to try. Since OCM is aware of all of these clusters in the fleet, we made a way to install the syncer. The syncer is the key piece that connects these virtualized API workspaces in kcp with your actual clusters under the covers. And again, since OCM is about fleet management and has knowledge of all these clusters, it only makes sense for us to do an implementation that helps connect these virtualized workspaces to the fleet of clusters that we know about.

So on the OCM hub we create a cluster set, bind a number of clusters to it, and then bind that cluster set to a specific workspace in kcp. What happens then, on the OCM side, is that when the administrator places clusters into that cluster set, the kcp-ocm integration controller makes sure the syncer is provisioned on those clusters. The syncer actually points back to a location workspace in kcp, which then makes the cluster available as a sync target for workloads in those virtualized workspaces.

To see a little more detail of how that works: you have the kcp-ocm controller, which is watching the workspace; through a ClusterManagementAddOn it gives you a ManagedClusterAddOn that is responsible for putting the syncer down on the managed cluster with a ManifestWork, and it also connects to the managed cluster set with a binding to a namespace. It's that managed cluster set that's actually the key: having it linked to a specific workspace is what allows the different clusters to be chosen as syncer targets. So any clusters in those managed cluster sets with the appropriate labeling on them will receive the kcp syncer, which points back to the location workspace and is able to surface the virtualized API control plane, and therefore any CRs or other resources that are activated and exposed in kcp.

This is a cross-community implementation and something that we use internally already and have verified as working. Now, this project is changing constantly, mainly because kcp is under heavy development, so every so often it is possible that something breaks, but we have been actively maintaining it, and as of the last two releases of kcp we've had this capability functional and have actually been using it to build and test kcp development environments for that community as well. So it's a nice example of some cross-community work that's going on.
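To close with one small, concrete illustration of the administrator step in that integration: moving a cluster into the set that is bound to a kcp workspace is just a label on the ManagedCluster. The set name here is hypothetical, and this assumes the standard cluster set membership label.

```bash
# Hypothetical example: add cluster1 to the cluster set that the kcp-ocm controller watches;
# the integration then delivers the syncer to that cluster via a ManifestWork.
kubectl label managedcluster cluster1 \
  cluster.open-cluster-management.io/clusterset=kcp-workspace-dev \
  --context ${CTX_HUB_CLUSTER}
```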