Hi, everyone. Welcome to the SIG Multicluster intro and deep dive. The three of us are here today to give this presentation. For me, I'm Laura Lorenz. I'm from Google. I'm Paul, and my name appears second in the list. And I'm Jeremy Olmsted-Thompson. I'm also from Google. All right. Hello. Let's talk about what we're going to talk about. So as Laura has just introduced, this is an intro and deep dive to SIG Multicluster. So we're going to cover what the SIG is about. We're going to talk about the current state of play and the current activity within the SIG. We're going to touch on ClusterSets and namespace sameness. We're going to talk about cluster ID and ClusterSet membership. We're going to talk about the Multi-Cluster Services API and multi-cluster DNS, and more than that, as you will see in the coming moments. We're going to do a deep-ish dive and we're going to have some demos. We're going to talk about the About API, also known as cluster ID or ClusterProperty. We're going to see a deep-ish dive and demo on multi-cluster DNS. And we're going to talk about how to contribute to this SIG. Okay. So what are we about? Multi-cluster is still a new space. We're seeing more and more multi-cluster deployments in the community, and it's clearly where the community is going. But we still don't have the best practices or tools that we have in single cluster. If you want to know how to do something in a single cluster today, there's tons of documentation and tons of best practices out there. We just don't have that yet for multi-cluster. As SIG Multicluster, we're trying to work on building the tools, primitives, and workflows that you'll need to safely deploy your applications across multiple clusters. We want to work with you to help figure out how to connect services between clusters, how to replicate workloads across clusters, and how to manage the rollout of deployments across your multi-cluster group.
We've made some great progress in areas that we're going to talk about here shortly. But we really need your input. Real-world use cases help us pick what to focus on next. Many of our projects are still fairly early stage, and there's room to steer them in new directions. We're looking to figure out what the next set of projects should be, and we want you to come join us. Tell us what you're working on. Tell us what problems you have so we can help you solve them. So, really important to the SIG Multicluster group is our approach. And this has been learned over the years of just trying to make sure that we're focusing on problems that we can really solve right now, the use cases that people have today, and set ourselves up well for the future. So just a couple of points about our approach as a SIG: it's important to us to avoid premature standardization, or solving lots of optional problems. Multi-cluster is so new, and there are so many new variables that come into play when you're dealing with multi-cluster deployments. It's really important for us to focus in on what problems people have today. And we really try to focus on specific functionality that we want to build, some of which we'll be demoing for you later. And then if we can work backwards from these specific problems into something bigger, then that's the direction we want to take. That gives us something that's really applicable up front, but it can also give us insight and information that allows us to work backwards into the larger abstractions. So let's talk a little bit about the current activity and the projects that we are working on today. The first conceptual basis and foundation for a lot of the multi-cluster work that we're doing is this term called ClusterSet. And ClusterSet is a word to represent a pattern of use that we see all over the place for people with multi-cluster deployments. It's some group of clusters that are governed by a single authority.
And importantly, they have a really high degree of trust within that set. And this is something that we can leverage when we're describing different multi-cluster tooling or APIs that we want to come up with. We can focus on this ClusterSet basis, this foundation, as this group of clusters that have a high degree of trust and are governed by a single authority. So one important side effect of ClusterSets that's really relevant to all the multi-cluster work we've been doing is the idea of namespace sameness. Once we get into multi-cluster deployments, we need to say something about what happens if the same namespace is in two different clusters: what are the properties of both of them now that these clusters are working together? And the link in here is for the position statement from SIG Multicluster about what namespace sameness is. But basically, the short version is that for a given namespace in different clusters, they're considered to be the same. Their permissions and characteristics should be consistent. And even though the namespaces are in a ClusterSet, not every namespace has to exist in every cluster in the ClusterSet. If they do, they should behave the same across the ClusterSet. And this is what gives us the ability to operate on a multi-cluster deployment in some of the same ways as a single-cluster deployment. And a new update for this round of KubeCon is that a cluster's ClusterSet membership is now stored in an about.k8s.io ClusterProperty called clusterset.k8s.io. We're going to talk about it on the next slide and deep dive later, but this is where the actual property recording ClusterSet membership will be stored. So when we think about sets of clusters, one of the things that we need to establish that concept effectively is to talk about the coordinates of clusters within that set. The name that we have given to that concept is cluster ID.
And if you want the details, you can look at KEP-2149, or you can look at the sigs.k8s.io/about-api link, which will send you to the Git repo where the About API is coming together. So, about: if you haven't guessed, about is the name of the API group. There's one resource within this API group called ClusterProperty. It's a cluster-scoped CRD. And there are two special cluster properties that are meant to allow discoverability, within the cluster, of information about the ClusterSet and the cluster's identity within the ClusterSet. The first one is id.k8s.io, which allows a cluster to identify itself. The next one is clusterset.k8s.io, which allows a cluster to identify the ClusterSet that it is part of. The cluster ID uniquely identifies a cluster within a ClusterSet for the lifetime of its membership in that ClusterSet, and it provides a reference for multi-cluster tooling to build on within a ClusterSet. So, for example, it allows you to disambiguate backends for headless services between clusters. It's a coordinate to use for scheduling work, and it's a possible annotation for metrics and logs associated with that cluster. Now I'm going to talk about the Multi-Cluster Services API, or MCS, which some of you may have heard us talk about before. The MCS API was the first foray into this new approach that Laura mentioned before, where we just focus on specific problems. And here, services seemed like a logical place to start. You have services deployed in multiple clusters. You want to connect them together. This is the basis for a multi-cluster deployment. And it's a specific problem that had real use cases for us to focus on. So the MCS API builds on the concept of namespace sameness that Laura mentioned and lets you consume a service just by knowing its name across multiple clusters. And we actually use the service name and the namespace to link services together.
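To make the two well-known properties concrete, here's a sketch of what those ClusterProperty resources look like, based on KEP-2149 and the sigs.k8s.io/about-api repo; the UID and ClusterSet name values below are purely illustrative:

```yaml
# Sketch per KEP-2149: the cluster's own identity within the ClusterSet.
apiVersion: about.k8s.io/v1alpha1
kind: ClusterProperty
metadata:
  name: id.k8s.io
spec:
  # The KEP recommends the kube-system namespace UID as a unique value;
  # any value meeting the KEP's uniqueness properties is allowed.
  value: 721ab723-13bc-11e5-aec2-42010af0021e
---
# Sketch: the ClusterSet this cluster is a member of.
apiVersion: about.k8s.io/v1alpha1
kind: ClusterProperty
metadata:
  name: clusterset.k8s.io
spec:
  value: environ-1
```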
So you can even deploy a service with the same name in the same namespace in multiple clusters and consume them as a single service with distributed backends. The MCS API focuses only on the API and common behavior. We don't have a de facto implementation, and this is by design. We wanted to allow room for various implementations. Different platforms and different environments have different needs. Some have flat networks. Some have many networks that need to be stitched together. We wanted to leave room for implementers to do what works best in their area, but provide a common API that you can use to describe how you want your services to be exposed and how you want to consume them. This leaves room for centralized or decentralized control planes depending on what you need. But the idea is that a consumer of the service only ever has to rely on local data. So they get a familiar interface, and they don't need to understand the topology of all the clusters. The result is that we have an extension of ClusterIP and headless services that basically just work as you would expect across clusters, now with a new name: a ClusterSet name instead of a cluster-local name. And Laura is going to dig into DNS in a little bit here. We have some other initiatives going on as well. We are continuing to work with MCS to figure out how to extend the Multi-Cluster Services API to work with network policy in a multi-cluster environment, with policy applied across clusters. We are working on figuring out how to address the multi-network scenario, where you are stitching together clusters on different networks, and whether or not that even needs to be something that gets represented in the API, or whether it can be left up to implementations. Again, we are trying to focus on the core problems that need solving and not be too prescriptive about how exactly the problems need to be solved, so we don't shut any doors unnecessarily.
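As a rough sketch of how that export/import flow looks in practice (based on the KEP-1645 MCS API; the service name, namespace, ports, and VIP here are all illustrative assumptions, not from the talk):

```yaml
# You export a service by creating a ServiceExport with the same
# name/namespace as the Service you want to share across the ClusterSet.
apiVersion: multicluster.k8s.io/v1alpha1
kind: ServiceExport
metadata:
  name: my-svc      # hypothetical service name
  namespace: test
---
# The MCS implementation (not you) then materializes a ServiceImport
# in consuming clusters, carrying the ClusterSet-wide VIP and ports.
apiVersion: multicluster.k8s.io/v1alpha1
kind: ServiceImport
metadata:
  name: my-svc
  namespace: test
spec:
  type: ClusterSetIP
  ips:
    - 10.42.42.42   # illustrative ClusterSet IP
  ports:
    - port: 80
      protocol: TCP
```

Consumers then just resolve `my-svc.test.svc.clusterset.local` from local data, without knowing the cluster topology.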
We are also looking into what it would take to build multi-cluster controllers with replicas in multiple clusters. And I think the big thing that has come out here is that we need some form of distributed leader election. This is something that is very new as an idea, but we are looking for help building it. So come join us. Come help us figure out what to do here. We have the Work API, which is focused on spreading groups of resources to different clusters so you can roll out your deployments. And, of course, we have KubeFed. KubeFed has been around for a while. It has some users who have been interested in it for a long time, but we have seen it go a little bit into maintenance mode, and we are considering archival as we have some new alternatives emerging. So come join us. Help us figure out what those alternatives should look like. All right. So I'm going to take us through two deep dives and demos of some of the work that's been going on. We're going to talk about the About API and show you what that looks like, and multi-cluster DNS, and give you a demo of that as well. So first off, the About API. As mentioned before, the source for the entire design of the About API is in KEP-2149. And it's now available at sigs.k8s.io/about-api, which is what I'm going to demo for you today. And again, we mentioned it's a cluster-scoped ClusterProperty CRD of just name and value, used for the purposes of uniquely identifying clusters and their membership in a ClusterSet. And more concretely, over here I have some examples of what those About API resources may look like. So again, they're in this about.k8s.io API group. They're a ClusterProperty kind, and they just have names and values, right? Names and values. And also down here, names and values. And the two special names that we've been talking about are id.k8s.io, which these two examples up here are referencing, and clusterset.k8s.io.
And id.k8s.io represents the name of the cluster, whereas clusterset.k8s.io represents the ClusterSet membership of that cluster. The exact specification of these values is actually quite flexible. So you can see there are actually two examples up here of what an id.k8s.io resource's value might look like. We recommend the kube-system namespace's UID for uniqueness purposes, which is described in some more detail in the KEP. But as long as the values meet the properties laid out in the KEP regarding uniqueness for the length of the membership in the ClusterSet, then it can technically be any value that meets those properties, even though this one is a likely candidate. What I also want to mention here is that this ClusterProperty kind is just name and value, where the value is very resource-specific, and only those two, id.k8s.io and clusterset.k8s.io, have any specific properties defined in the KEP. It's important to point out that the KEP actually allows other uses for this API. So you could store any arbitrary properties that you want in this ClusterProperty CRD, as long as the names follow the KEP guidelines, which are basically not to conflict with any of the well-known properties, of which again we have those two, id.k8s.io and clusterset.k8s.io; they must use a suffix, and cannot use the reserved k8s.io or kubernetes.io suffixes. But this means that you could use this to store any arbitrary properties that you want about your cluster in sort of a centralized CRD and make it a little bit easier for you to access them using the About API. Some ideas that were mentioned over the course of the KEP were, for example, if you wanted some sort of fingerprint for some specific company or implementation: the important part is just to name it whatever-you-want dot your-suffix, and then you can put whatever you want in the value field.
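For instance, a custom property might look something like this (a hypothetical example; the property name, suffix, and value are made up to illustrate the naming rules, and only the suffix restriction comes from the KEP):

```yaml
# Hypothetical vendor-specific property: any name is allowed as long as
# it uses a suffix that is not the reserved k8s.io or kubernetes.io.
apiVersion: about.k8s.io/v1alpha1
kind: ClusterProperty
metadata:
  name: fingerprint.example.com
spec:
  value: my-vendor-specific-fingerprint
```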
Or, under discussion right now regarding multi-network, there's an idea of potentially storing network information. So this is just a potential way that might look in the CRD. So any sort of structured use in here is possible, and we're definitely interested to hear what people might want to leverage ClusterProperty for, and happy to brainstorm with folks about that, both for single-cluster purposes and of course for multi-cluster purposes. All right, so I want to give a quick demo of the API right now, and a major shout-out to Ishmit, who did a lot of the work on this implementation. I am just the messenger. So let me change my screen. All right, so to demonstrate this, I am using a kind cluster here that doesn't have any CRDs installed at all. And right now I'm in a local copy of the About API. I've cloned the About API from sigs.k8s.io here, and I'm going to run this Makefile to install the CRD into this cluster. So this CRD was built with Kubebuilder, so for folks who have used Kubebuilder before, this will probably look very familiar, along with all the generated boilerplate that's inside this GitHub repo. But if you haven't used it before: once you get into this clusterproperty directory, you have this Makefile that you can run, make install. As long as you have controller-gen and kustomize in the right places, the Makefile will apply the CRD. So we can see now that the CRD is here, clusterproperties.about.k8s.io. And again, it's just that name and value. So I actually have the source here for the custom resource definition in this repo, with everything being just these two values, right? Or these two fields, I should say. Okay. So what I want to do next is to switch over to another terminal where I'm in the samples directory, because in here I have this example ClusterProperty, which I also have opened up here, which is again, from the slides, very simple: just this name and value.
And in this case, I'm using a value that looks something like a kube-system namespace UID. So if I go ahead and apply that in here, now we have this ClusterProperty in the about.k8s.io API group that's called id.k8s.io. So the whole thing here is that now I can get my ClusterProperty called id.k8s, got to spell it right though, id.k8s.io. And I can give myself a look at it all here. And this is how I can directly access this value right through this ClusterProperty API now. So I'm doing this here from kubectl, but you could of course do this from any Kubernetes API client. And this is how you can just access whatever ID resource that you or another implementation has put in the CRD. So that's the idea behind the About API, simple and sweet. This is a demo with, again, id.k8s.io, which needs to follow certain properties. But again, you could put any cluster property you want in here and use it in your controllers as you like. All right. So next I want to talk a little bit about multi-cluster DNS. So I have some slides here that show two of the several parts of the specification. We have the ClusterSetIP case that Jeremy mentioned earlier for the MCS API and, switching to the next slide really quick, the multi-cluster headless case. So I've just pulled up the A and AAAA record definitions, but the specification also has information about SRV records. But the idea here is that for a ClusterSetIP service, one that you have exported with the MCS API: so that, for example, over here in cluster A, if you have some pods that we're calling blue, and over here in cluster B, we have some more pods that we are calling blue, we want to be able to treat them like they're the same service. From cluster B, we should be able to get blue backends from cluster A; from cluster A, we should be able to get blue backends from cluster B, etc.
And from a DNS perspective, how this works for the ClusterSetIP case is that we have a DNS name that looks very similar to the DNS names that you're used to for single cluster, blue.test.svc, but you'll see the change here is that it ends in clusterset.local instead of cluster.local. So the idea here is that this is very similar to what you are already used to with single-cluster deployments, and you can just sort of seamlessly drop in this clusterset.local zone at the end, and the MCS API and the MCS controller will route this DNS name to the VIP that will go to any of the associated backends in any cluster. Same idea for yellow: if we have yellow backends over here and yellow backends over here, one DNS name to rule them all, right? We can get to the cluster A backends or the cluster B backends with just yellow.test.svc.clusterset.local. And on the multi-cluster headless side, for the aggregate DNS names, same type of story: you can get all of the information from cluster A and cluster B. How headless A and AAAA records work is that you get back all of the individual IPs. So this works the same for multi-cluster headless: if you query blue.test.svc.clusterset.local for backends that are all headless services, then you'll get back all of the IPs, the pod IP for cluster A's blue-1 here, and the pod IPs for cluster B's blue-1, blue-2, and blue-3, right? You'll get all four of them so that you can do what you want with them. Same thing for yellow.test.svc.clusterset.local. And then the example down here showcases how we handle pod DNS. So, to access the IP for just an individual pod: in normal single-cluster headless services, the hostname plus the rest of the normal aggregate DNS name, the service name, namespace, svc.cluster.local, is how you can disambiguate between the different backends by DNS in a single cluster. But in the case of two clusters or more clusters, right, multi-cluster, we need one more piece of information to disambiguate.
We need to know which cluster's pod we want. So you can kind of see the problem here: if cluster A has blue-1 and cluster B has blue-1, then just saying blue-1.blue.test.svc.clusterset.local is not specific enough. So you just throw that cluster name in there too. So this single IP gets pulled back from blue-1.cluster-a.blue.test.svc.clusterset.local, whereas for this pod IP over here in cluster B, even though it has the same hostname, we get just its IP from blue-1.cluster-b.blue.test.svc.clusterset.local. So this really clearly showcases the use of that cluster ID: the ability for a cluster to know what its own name is, so that we can put the cluster name into the DNS records for the purposes of multi-cluster headless services. All right, now I would like to do a quick demo of the multi-cluster DNS plugin for CoreDNS, and a major shout-out to Jeremy, who did all the work on this one. So over here I have the same kind cluster, and I have a version of CoreDNS deployed that has the multi-cluster DNS plugin installed, and I have it configured, you can see here, to forward the clusterset.local requests to the multicluster plugin. So with this configured, and with the CoreDNS installed on my cluster here having this plugin in it, any request for a DNS name that ends in clusterset.local is going to be sent along to the multi-cluster DNS plugin, and what it will do is base the IP that it responds with on the ServiceImports that are present in this cluster. So I also have a ServiceImport that I have dropped in here, and actually I think up here, yep, I have an example of what that ServiceImport looks like. So my-service in the demo namespace: it's type ClusterSetIP, and it has this fake IP attached to it.
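A rough sketch of what that demo setup might look like. This is an assumption based on the CoreDNS multicluster plugin, not the exact config from the demo; the Corefile layout, ServiceImport name, and fake IP are illustrative:

```yaml
# Illustrative CoreDNS config: the clusterset.local zone is handled by the
# multicluster plugin, which answers from ServiceImports in this cluster.
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    clusterset.local:53 {
        multicluster clusterset.local
    }
    .:53 {
        errors
        kubernetes cluster.local in-addr.arpa ip6.arpa
        forward . /etc/resolv.conf
    }
---
# Illustrative ServiceImport the plugin resolves against: queries for
# my-service.demo.svc.clusterset.local return the ClusterSet IP below.
apiVersion: multicluster.k8s.io/v1alpha1
kind: ServiceImport
metadata:
  name: my-service
  namespace: demo
spec:
  type: ClusterSetIP
  ips:
    - 10.0.0.42   # fake VIP for the demo
```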
But with the combination of our multi-cluster DNS plugin being able to see the ServiceImport that has this name and namespace, and knowing that this is the VIP for that ServiceImport, this is what we're going to get the response back from for a DNS name of the form my-service.demo.svc.clusterset.local. So we can go ahead and actually try that out, because I also have already deployed this little dnsutils pod in my demo namespace, and it has nslookup installed. And so I'm going to request that very DNS name, my-service.demo.svc.clusterset.local, fingers crossed over here. And we'll see that the response we get back is the address that's in the ServiceImport. So this is again a property of the multi-cluster DNS plugin in the CoreDNS that is deployed on this cluster, which is using that plugin to resolve any DNS names that end in clusterset.local. So that is what that looks like. All right, let's talk about how you can get involved in SIG Multicluster. This is the part of the intro and the deep dive where I tell you that one of the most important things you can do, which has high value and should be fairly low effort, is to share your use cases, your problems, your ideas. Note that use cases and problems are super valuable on their own. If you don't have any ideas and you just have a nice big shopping cart full of use cases and problems, we'd still love to hear about it. So I want to just make a personal request to you: if you're seeing this and you've thought about this functional area that we're talking about, we'd love to know your use cases, your problems, your ideas. Check out our homepage. It's on the community site under SIG Multicluster. Give us a ping in Slack. The channel name is, I hope you're sitting down for this, sig-multicluster. And you can hit us up on the mailing list, which is called kubernetes-sig-multicluster. We'd love to have you join the meetings. They're bi-weekly on Tuesdays at 12:30 Eastern, 9:30 Pacific, 16:30 UTC.
If you have something that you'd like to present, feel free to put it on the agenda and just come and talk to us about it. We'd love to see you there. Hope to see you soon. Thanks for joining us today. Thanks, everybody. Thank you. Have a great KubeCon experience.