All right, thanks everyone. It's Friday, the end of KubeCon, the last session, so thank you all for joining us. It's been a great KubeCon, and thanks for coming to hear about multi-cluster and the things we're doing in the SIG. I'm Jeremy Olmsted-Thompson, I work on Google Kubernetes Engine, and I'm one of the co-chairs of SIG Multi-cluster. And I'm Stephen Kitt, I'm an engineer at Red Hat, where I work on lots of things multi-cluster.

Today we're going to be talking about what this SIG is about, as usual for a SIG intro talk, and telling you about our current activity and our next problem spaces. Current activity covers a bunch of things: the SIG MC website, cluster sets and namespace sameness (that one isn't really current activity, it's one of the staples of our SIG), the About API, the multi-cluster Services API, and some new building blocks for orchestration, the inventory API. And of course a call to action at the end: how to contribute.

So what is the SIG about? It's about Kubernetes. It's about many Kubernetes. No, really, it's multi-clusters everywhere, right? You're all here because you have an interest in multi-cluster, and you know that, increasingly over time, we see more and more of it. Multi-cluster is becoming important to basically everyone in the Kubernetes community, and to those of us who build platforms for our customers, for so many reasons. The obvious one is fault tolerance: you want to deploy across multiple regions, multiple data centers, or even multiple clouds. But we're seeing more and more reasons as well. Policy is a big one: you've got data locality and need to deploy workloads where your data lives, or you've got some other policy that prevents you from moving something out of an on-prem data center into the cloud, or otherwise makes you split your workloads. And Kubernetes wasn't really built to stretch across massive geographic distances, so multiple clusters are the way to do this. Especially now, as many of us are getting into AI/ML and starting to focus more on accelerators, another thing popping up is capacity. GPUs are scarce; you want to be able to deploy your workloads wherever you can find capacity. That's another really important reason for getting into multi-cluster, so that you can spread those things around.

This is hard, and the reason we have this SIG, is that Kubernetes was built to manage everything itself. A cluster really is kind of the end of the universe: everything that you deploy, everything that your workloads depend on, has historically been contained within a cluster. Anything outside the cluster is, to Kubernetes natively, a black box: opaque IPs, you have to build your own connectivity, and you lose the metadata that you have for those workloads. We've made a lot of progress in breaking down these walls, and we'll talk about that, but there's still a long way to go. And I think most importantly, this SIG is about all of you and your needs. Everything we do, we've been trying to build based on real, concrete needs, not trying to build these ultimately flexible systems. We want to focus on concrete use cases and stitch those together to build the flexibility we really need.
So we need all of you to come tell us, today after this talk, at our meetings, on the forum, on Slack, about your needs, your real use cases, the things you're building, the things you want to build or don't know how to build, and the things we're doing that don't really fit your needs, where we need to make some changes.

A little bit about our approach. We had a bit of a reset a few years ago, and we really want to focus on APIs. What we realized is that, especially when you're stretching across platforms (on-prem, cloud, multiple clouds), it's a very diverse ecosystem. There are a lot of different pieces that go into a deployment on each cloud, and not everybody does things the same way. By focusing on APIs, we leave room underneath for customization, so that implementations can tweak things in the way that works best for them. Sometimes you've got a central controller and you want to push configuration out; sometimes, in some clouds or environments, you might want to pull configuration in. We want to support all of these models.

We want to avoid solving optional problems. "This could be useful" has always been a bit of a fly trap: I want to build the solution that solves all the problems, but sometimes you end up getting caught up on things that don't really matter. You can spend a lot of time debating some feature that isn't really core to the problem you're trying to solve. We started out trying to solve maybe too broad a problem space, and in the last few years we've really been trying to focus on the minimum set of APIs we have to define to make things work. Consistency with existing APIs is important because nobody starts with multi-cluster: you learn Kubernetes, you get your deployment working, and then you want to spread it out. Making sure that when you do that it's not starting over, that things work the way you'd expect, is really important. And lastly, composable building blocks. We don't all have the exact same problems, so we want to make it so that you can take the pieces that work for you, put them together into a bigger system, and not have to worry about everything else.

One of the things we've added in recent years is the SIG MC website, to try and make our work more visible. You can find it at multicluster.sigs.k8s.io, and it provides higher-level documentation. Again, as Jeremy said, we focus on APIs, and APIs that are really for end users, so the site is supposed to provide higher-level documentation for end users. Also project status updates, although we haven't really been very good at that. Ultimately we'd like it to connect implementers to our APIs and tooling, and also to catalog implementations for end users, because there are lots of implementations of all the different building blocks, with different levels of support and so on, and we'd like that to be visible. And obviously we'd also like to know what we're missing: what you, if you're trying to do multi-cluster, would like to find on our website, whether that's good patterns and practices for multi-cluster deployments, how you actually use this stuff, and open questions for future development. And real quick, as Stephen said, we haven't always been good about the updates.
So we are looking for someone to maintain the website; if anybody's interested, talk to us or join us online. We'd love somebody to take ownership, help us drive it, and make it more regular.

With that, I'll shift into the stuff we've actually been building. Starting off, I guess this first one isn't something we've built, but I'll talk about the cluster set. A cluster set isn't an API, it's a concept, and it's the concept we've based our building blocks on. It's basically a set of clusters that are working together, with some important characteristics. First, the set of clusters needs to be governed by some kind of common authority. This could be a team, a person, or an organization, but some entity that has the authority to make strong statements about all the clusters that work together, so that we can make assumptions when talking to these clusters. There needs to be some degree of trust between these clusters. If you never want these clusters to have any knowledge that the other clusters exist, they're probably not in a cluster set. If it would be a major security breach for an application in one cluster to find out about the existence of some service in another cluster, even if it couldn't talk to it, then maybe those aren't the best clusters to combine in the same cluster set. This doesn't mean all your services need to talk to each other; it just means it's all within that organization.

The most important piece of the cluster set is namespace sameness. This is the principle that a namespace with a given name in one cluster is the same namespace as a namespace with the same name in another cluster. That means it's used for a common purpose, with common access: basically, if I can talk to a namespace in one cluster, I can talk to that namespace in another cluster, and they're not used for wildly different purposes. So we can make some strong assumptions about how they're used, and this helps us elevate the identity that comes with namespaces above the cluster, so that clusters can come and go and those assumptions still hold. A namespace doesn't need to exist in every cluster; you don't need a namespace foo in every cluster in the set. It also doesn't mean that I need to be able to talk to every cluster in the set or have access to that namespace. But if a namespace exists in a cluster, it should behave the same as it does in any other.

With that, we just need to figure out how to identify these clusters, because Kubernetes doesn't have any built-in way to actually name a cluster. Yeah, in fact, more than that, as Jeremy said earlier, since in Kubernetes the cluster is the universe, there isn't actually a representation of a cluster as such. The About API is a way to start changing that. It's a simple CRD, just a key-value store with some well-known keys, and it lets you identify that a specific cluster has a given name. So here we have clusters A, B, C and D; the cluster name is a well-known property that is supposed to be present in every cluster with the About API. We're also told which cluster set each cluster belongs to; in this case they all belong to the same one. This means that, in theory, all these clusters can talk to each other, and therefore you can also expect that the namespaces that are available across different clusters will be the same and provide access to the same services. And you can also use the About API to add other fields that are of interest.
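As a rough illustration of what that might look like inside one of those clusters, here's a hedged sketch. The group/version shown (about.k8s.io/v1alpha1) may differ by release; the cluster and cluster set names, and the extra environment property, are just placeholders.

```yaml
# Well-known key: this cluster's unique name within the cluster set.
apiVersion: about.k8s.io/v1alpha1   # group/version may differ by release
kind: ClusterProperty
metadata:
  name: cluster.clusterset.k8s.io
spec:
  value: cluster-a
---
# Well-known key: the cluster set this cluster belongs to.
apiVersion: about.k8s.io/v1alpha1
kind: ClusterProperty
metadata:
  name: clusterset.k8s.io
spec:
  value: example-set                # placeholder cluster set name
---
# An extra, implementation- or user-defined property, a bit like a label.
apiVersion: about.k8s.io/v1alpha1
kind: ClusterProperty
metadata:
  name: environment                 # hypothetical custom key
spec:
  value: production
```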
So those extra fields could be according to the implementer's requirements, or just for your own use cases, a bit like labels in a way, I suppose; like the examples here: environment, network foo, and compliance secure.

In a little more detail, this is KEP-2149, and it's currently in beta. It's a cluster-scoped ClusterProperty CRD; like I said, just a very simple key-value store. We have predefined well-known keys for the cluster ID and the cluster set it belongs to, with uniqueness requirements: a given cluster name can only exist once inside a cluster set, and a given cluster can only be part of one cluster set. It's a very simple building block for higher-level orchestration; that's all it is.

On top of that, we can build other things, starting with multi-cluster Services. This was actually the first of the new round of APIs we built around the cluster set, but it does build on the About API; we started with multi-cluster Services and then worked back to identity. It is, as you'd expect, a tool for connecting services across clusters in a flexible way, or even deploying a single service with backends in multiple clusters. The goal here is to separate the service producer from the service consumer, so that consumers and producers don't really need to care about where each other is located. They could be in the same cluster or in different clusters; that flexibility is up to you and your needs, and it can change over time without having to coordinate.

How it works is really simple. You have a service, a normal Kubernetes Service, headless or ClusterIP, and you create a ServiceExport. The ServiceExport has the same name as the service you want to expose, in the same namespace. Now that service becomes accessible from other clusters in basically the same way that you'd consume a service within the local cluster.

As we talked about with our approach, we started with a focus on the API. Looking back, I think we originally started over-specifying things; we tried to pin down too much detail and got caught in that fly trap. Over time we've walked that back a little and really focused on the core pieces that matter. So it ends up being a very, very simple API, but you can use a variety of implementations (some of them are listed here) and connect services between clusters very easily. One of the neat traits is that we've built it such that consumers only ever need to rely on local data, so we minimize the amount of cross-cluster coordination you need to do. The end result is that ClusterIP and headless services across clusters basically just work like they do in a single cluster.

One of the really interesting ways you can use this is that, if you think you might grow into multi-cluster or need some additional degree of fault tolerance, you can actually use multi-cluster Services in a single cluster. Then, as your service grows or your needs change, you can spin up additional backends in a different cluster. Because if two services have the same name, like we talked about with sameness, they're part of the same service, right? So the backends can be shared, or you could even move the service to another cluster, and you never need to tell your consumer what's going on; you don't need to build that coordination. That makes it a lot more flexible. And with that flexibility, we've actually partnered with SIG Network and the new Gateway API.
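Before getting into Gateway, here's a minimal sketch of the export flow just described. It assumes an MCS-compatible implementation has installed the multicluster.x-k8s.io CRDs; the service name, namespace, and port are hypothetical, and the CRD version (v1alpha1 here) may differ between implementations and releases.

```yaml
# A normal Kubernetes Service (ClusterIP or headless) in the exporting cluster.
apiVersion: v1
kind: Service
metadata:
  name: checkout          # hypothetical service name
  namespace: store        # hypothetical namespace (same namespace everywhere, per sameness)
spec:
  selector:
    app: checkout
  ports:
    - port: 8080
---
# Exporting it to the cluster set: same name, same namespace as the Service.
apiVersion: multicluster.x-k8s.io/v1alpha1
kind: ServiceExport
metadata:
  name: checkout
  namespace: store
```

Once the export exists, the implementation surfaces a corresponding ServiceImport (and EndpointSlices) in the consuming clusters, which is how consumers keep working off local data.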
For those not familiar, Gateway is basically the evolution of Ingress, and it lets you define routes for how to connect to services from outside a cluster. If you have a compatible implementation, you can actually point a Gateway at a multi-cluster service and have multi-cluster backends for your service, in a way that works exactly like you'd expect with a single-cluster deployment.

The next step from there, as you might imagine, is getting into orchestration. So far, everything we've said has really just been objects inside a single cluster, and you only know about that single cluster. The About API tells you what your cluster is called and whether it's part of a cluster set or not. The MCS API allows you to export services, and there's something, somewhere, that allows you to transparently consume services from other clusters, but you don't actually know how that happens. We're working on a new API that will provide more visibility over all the clusters in the cluster set, with lots of use cases in mind, orchestration for example.

This has come up before; it's another one of those areas where we've done things in the past that haven't really worked out, and we've walked them back. First there was the Cluster Registry, then there was KubeFed, which actually got some use, but it was too ambitious and development stalled. And there is real demand for all this stuff, so how do we solve it? We're working on a simpler (hopefully) API called the cluster inventory API, driven by a number of projects and groups who are involved in the SIG. This is still at the stage where we're sketching things out and working out the details. So now is the opportunity, now is the time, if you're interested in this stuff, to get involved and help us avoid making more mistakes, and most importantly, help us avoid setting something in stone that doesn't really meet your requirements. Obviously we hear a lot from people in the SIG who are already interested in this field and who are aligned with the way the SIG works. What we'd really like to hear from is those of you who've maybe looked at this work and decided it's not for you because it doesn't quite match; if you don't tell us why, we can't fix it.

In a little more detail, the cluster inventory API is a way for clusters, or things running inside a cluster, or things with access to information about a cluster, to discover which clusters are in the cluster set and properties about them. That could be, for example, their API endpoints. This KubeCon we've had discussions around credentials as well, so that you could take the cluster inventory and use it to actually access all the clusters. It could be information about the sizes of the clusters or the equipment available in them. Jeremy mentioned AI workloads, for example: you could say that a specific cluster has lots of GPUs, and an orchestration system would move workloads that want those there. But like I said, we're still discussing this a lot, so we're really interested in possible uses for it. So, what's next in general for the SIG?
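Before getting to what's next, here's a purely illustrative sketch of the kind of object an inventory entry might be. The inventory API was still being designed at the time of this talk, so the kind, group/version, and every field below are hypothetical; they're only meant to show the sort of information discussed above (identity tied back to the About API, plus capacity-style properties an orchestrator could use).

```yaml
apiVersion: multicluster.x-k8s.io/v1alpha1   # hypothetical group/version
kind: ClusterProfile                          # hypothetical kind under discussion
metadata:
  name: cluster-a
  namespace: fleet-inventory                  # hypothetical inventory namespace
spec:
  displayName: cluster-a
  clusterManager:
    name: example-fleet-manager               # hypothetical manager registering the cluster
status:
  properties:
    - name: cluster.clusterset.k8s.io         # ties back to the About API cluster ID
      value: cluster-a
    - name: gpu.example.io/count              # hypothetical capacity property
      value: "64"
```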
Canonical patterns: we mentioned that we'd like to describe those on the website. We've got all these building blocks, and they're fairly basic so far, and what we'd like to know is how you use them, how they should be used, or to try to figure out how they should be used, and as a result of that, whether there are patterns and workflows we could highlight as best practices, that other people coming into the field, dipping their toes into the multi-cluster waters, can use to avoid making the same mistakes.

Then there's one that's been on our minds for a while, and now that we're starting to have more of these building blocks I think it's becoming increasingly important: things like leader election. If you're going to build controllers that span multiple clusters, you need some way to coordinate. We're very interested in anyone in the community who wants to come help us design this as well, especially now that we're starting to shape up things like the cluster inventory and have some discovery building blocks; we want to use them and take them to the next level.

And then, as we keep saying, this is about you and your use cases: what else do we need? What's missing? There's a lot here. In the past we've had people come talk about storage and what that can mean in a multi-cluster context. We've had lots of conversations about identity and extending that to authorization, so there are lots of areas. Basically, every part of Kubernetes is at some point going to need some awareness of multi-cluster operation, and we need your help to figure out what's next.

So, please come get involved. Be part of the SIG. You've probably heard that from lots of SIGs; all SIGs are thirsty for more participation, to find out what things are actually of use. We'd like to know what you're working on, what tools you're building or have built, because we know there are people doing multi-cluster on their own, and they've built their own tools. We'd like to find out what those look like. You don't have to reveal all your secrets, just what you've learned and how you go about it. And then of course the tools you're missing, the problems that you and your customers have, and whether you have specific needs that aren't addressed at all, or only partially. And whether you need help, or want to help. We've had a number of people come and just do demos: come with a deck or a video, tell us what you're working on, and share your learnings.

Some specific ways we know anyone could help the SIG, without a huge barrier to entry: one is just adding information to the website. The website is so basic that, for example, the About API, which is a very simple API, has an empty web page. So if you feel like helping out with that, that would be an easy one. Another big one is contributing to the test suite, because we've got a number of APIs in progress that we've been discussing, but none of them are GA yet, not even the About API, and we'd actually like that to change. One of the big missing pieces is a conformance suite, so we'd like help on the test suite, and that's also where implementations can be checked for conformance.
Because once we have a conformance suite, that bakes in the hidden assumptions behind the API, so if you want your solution to be conformant, it's best to get involved while the conformance suite is being developed. So yeah, come share your use cases. I think on that previous slide we also probably deleted a couple of bullets accidentally, but yeah: come give a talk, send us links to project pages. Here you can find all the information to contact the SIG; we meet every other week at 9:30 Pacific time, 16:30 UTC. Come join us in person and chat with us. We hope to see you there. And now we'd love to actually hear from you in real life. Yeah, there's a microphone in the middle there at the front for people who have questions.

Hi, thank you for the talk. I had one question, sorry. You talked, on one of the slides at the beginning, about consumers only ever relying on local data. Can you describe what you mean by that? Are you saying that if, for example, I'm in cluster A, I don't need to know, for example, the endpoints of cluster B?

You do need to know the endpoints, but the way we've designed this is under the assumption that those endpoints will be imported by the implementation into the cluster that is consuming. So when you actually talk to those endpoints, that list is local; you're not trying to do an on-demand fan-out.

Thanks very much for the update. The multi-cluster Services API, I think, is really quite an elegant solution, and I was wondering what its status is. Is the API likely to change while you're still figuring it out, or is it just the conformance testing you were outlining earlier? What's the status, and when could we possibly expect GA?

Yeah, I'm sorry that it's not GA, I'll start there. We will get there. There may be some small changes we've been discussing, but small; the core isn't going to change. The general workflow, create the ServiceExport, discover the ServiceImport, is not likely to change at this point. In fact, some of the implementations listed here, including GKE's, which I work on, are already GA despite the state of the API. So the likelihood of drastic change is very, very low, but we do need to work on getting it officially GA in the community, and we have been discussing that a lot while we're here. So, soon.

Yeah, maybe a little more specifically, it's safe to assume that the ServiceExport object is going to stay as is. All you need to know as a user consuming this API is that if you create a ServiceExport object, at some point in the near future you will be able to access the service using clusterset.local; that's how you create it and that's how you consume it. That's not going to change. The ServiceImport mechanism is probably not going to change either, the EndpointSlices and so on. It's probably going to be more details on where IP addresses are required in the objects and what the guarantees are on those, because different implementations have different reliance on virtual IPs, that sort of thing. Yeah.

Great, thanks so much. Absolutely.
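To make that consume side concrete, here's a hedged sketch of a consumer using the clusterset.local name for the hypothetical export shown earlier; the Deployment, image, and port are placeholders, and it assumes a conformant MCS implementation with cluster set DNS is in place.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend                    # hypothetical consumer workload
  namespace: store
spec:
  replicas: 1
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
        - name: frontend
          image: registry.example.com/frontend:latest   # placeholder image
          env:
            # Multi-cluster DNS name: <service>.<namespace>.svc.clusterset.local.
            # Backends may live in this cluster or any other cluster in the set.
            - name: CHECKOUT_URL
              value: http://checkout.store.svc.clusterset.local:8080
```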
So, with the namespace sameness position statement, I've heard from customers and vendors that, while it's really nice in an ideal state, there are often folks who need some way to carve out an exception for one or two namespaces that maybe should not be replicated, and not have that strong guarantee even though they might have the same name in different clusters. I have concerns about doing that in a loose, unstructured way, because then, particularly for a machine implementation, you lose a lot of the strong guarantees you can build on. But I'm curious: over in the Network Policy working group there is an NPEP proposal for a tenancy group object that would be able to group namespaces in a way that seems like it might be useful for this problem. I'm curious if you've seen that and have any thoughts on it.

So I haven't actually seen that yet, so now I will definitely check it out. But I want to call out that I missed something important about namespace sameness: saying "this namespace should not be the same in any other cluster" is also a valid kind of sameness statement, basically that a namespace should be considered cluster-local and not have any broad exposure. And then there's actually another area that we've faced when implementing it, which is the transition phase. If you're coming from a disparate group of single clusters and you want to expose things multi-cluster, you need some way to gradually opt in. The goal is that that's not a steady state. I think it's fine if you want to have your dev namespace in every cluster, and it actually belongs to different people, as long as it's never accidentally treated as a global namespace and you have a way to have your implementation exclude it. But ideally, long term, you don't end up in a situation where namespace foo in five clusters is the canonical namespace foo, and then there's this other cluster that just has a namespace foo that happens to be named the same way but works totally differently, because in many ways that's a risk waiting to happen.

I also want to add that the worst possible outcome, I think, would be to have one service whose name was foo in one cluster and bar in a different one, and to map those to each other; that would be a no. Right, exactly. Names are really important, so I think you want to maintain that as best as possible. Maybe another way of thinking about it, for people who are not entirely familiar with all this discussion, is that the namespace sameness idea is about the philosophy of multi-cluster, which is that we extend things across clusters. Lots of people tend to think of multi-cluster as a way to do point-to-point services, to make service foo on cluster A available in cluster B as maybe something else; that's not what multi-cluster is about. You do end up with point-to-point connectivity for services, but it's a result of the philosophy of multi-cluster.

So my question is about digging more into the service-to-service communication between clusters: does the communication go through the kube API itself, and if so, how are the configuration and the secrets for that cluster shared, and how exactly is that solved?

So the actual mechanism is implementation-dependent, but most of our implementations tend to work in a way where it all continues to use the existing data plane.
So they end up using the existing data plane, or several of our implementations do; I won't say most, because we have mesh, which of course is a different connectivity path. But the idea is that this is more about service connectivity. I think if you want authentication between services, things like that, that's higher level; that's where the mesh might take over.

Traditional services have a problem with secondary networking, and I think that's being addressed as we speak. When you do multi-cluster services, do you also include secondary networks and dual stack? We don't yet, because that's being tackled now, but we did design it with the hope of supporting multiple addresses behind the services, in case you wanted to be able to do that, or dual stack. So the hope was to be flexible, but I believe right now, I'm not sure if any of the implementations have multi-network support. But it's part of the plan. Absolutely, yes.

So, we're doing our clusters as cattle, not pets, so that means we have many smaller clusters, but it also means that when we want to, say, take a broken cluster out of a cluster set... we're using Istio, so that makes it a little bit easier. Right now, what we have to do is duplicate everything. We don't do any federation, we don't do any mesh connectivity; we just copy everything over to the other cluster, then you switch clusters manually and turn up the other cluster. This would essentially eliminate all of that just by using clusterset.local, essentially, because then it doesn't matter how many clusters there are; in theory they could just use the mesh, because it would be multi-cluster-service aware, as the implementation, I suppose.

Right. I mean, you still need to get your configuration into each of those clusters, of course, but the idea is that if your mesh supports multi-cluster services, and optionally with Gateway, you can bring up a new cluster, expose the service in that cluster, and it's just added to the backends for that service. Then you tear down your existing cluster if you want, and the service has moved; or you leave them both up for distribution. But you don't need to do that kind of rolling update; there's machinery that can handle it for you. Yeah, so as a use case this sounds like a good match; that's good to know.

And that pretty much rolls into the next one: what about version compatibility? Is there anything about that, or is it just going to be based on, say, the CRD revision? So this really is the Kubernetes service layer, and there's no real versioning built into that either. I think for versioning, the best solution right now is probably either to use Gateway to shift your traffic, or, if you're using a mesh and it has capabilities there, that can be an option as well. Oh yeah, I was also thinking along the lines of, say, the version of the control plane, so you have a 1.29 and a 1.30, for example. Got it, yeah. Actually, most implementations I'm aware of, if not all of them, don't really care about the control plane version, so you can bring up a cluster on a new version. This is actually a really great way to do blue-green upgrades: you bring up a cluster on the new version, add the backends, and when you're happy that everything's working, tear down the old one, and your service has just moved. Basically, you've upgraded without downtime. That's exactly what I was after.
Blue-green, but for your entire cluster, or multiple clusters, or multiple cluster sets; you never know where you end up. Yep. Don't forget the cluster set set. All right. Yeah, there's always another layer. That was actually something I was thinking while I was up there talking: we've added a representation of the cluster, but we still don't have a representation of the cluster set. And then that's going to be the next universe at some point. Yeah, you know it. Thank you. Thank you.

So, before multi-cluster services, service mesh was another way of doing multi-cluster connectivity between services, and, I guess this isn't really your problem, but I'm curious: now that there's a Kubernetes-native way of doing this, will service mesh implementations align with that API, or will it be totally separate, or is there work to get a standard way of expressing that intent?

That's exactly the hope, and you'll see Istio is one of the named implementations, so we've got compatibility there. I think, yeah, the hope is that this will make it easier to gradually adopt a mesh as well, right? Because if you start using Kubernetes-native services, or Kubernetes-native multi-cluster services, and you want to adopt a service mesh, then, keeping with that consistency, it should be something you can adopt without starting over on your configuration. And then I guess the next step is GAMMA and Gateway, and then everything just becomes the one API to rule them all. Yeah, exactly. For networking, anyway.

I want to throw down a harder challenge: when is storage coming? I really want storage to come. I won't throw that back to you specifically, I'll throw it back to all of you: I think we need help understanding what multi-cluster storage means to you. We actually had a demo in the SIG; if you look back in the history, someone came and showed a demo of what it would be like to move storage between clusters. It was really, really exciting, but there's more to it than that, and so we need help. I think we have one minute, so go for it.

So, I'm mostly involved in the Cilium project. We've heard a lot about it. It seems to me the concept of the cluster set is very similar to Cluster Mesh. There's already an issue about supporting multi-cluster services, which I think is great. If I remember correctly, they're mostly hesitant, and maybe worried about scaling issues, with exactly what I asked in the first question: if I somehow need to copy all the endpoint information from cluster B to cluster A, and there's high pod churn and, say, hundreds of clusters, I'm afraid it's going to hit the API server hard.

So this is actually really exciting, because we had a lot of conversations about this, and we actually don't require that you copy the endpoints; we just require that there's connectivity. So if an implementation found that it was better, for scalability reasons, not to have real EndpointSlices, that's okay. Yeah, that's exciting. You don't have to copy all the endpoints. Yes, yeah. There can be horizons: if there are a lot of them, you can copy some of them, enough to satisfy the condition, and then you can adjust over time. Right, yeah, there's a lot of flexibility here.
And that's also assuming that you have an underlay where your pods actually talk to each other across clusters; if you don't have that, you could instead expose something like an east-west load balancer as the endpoint and accomplish the same thing without having to copy everything and take the scalability hit. Yeah, we wanted to leave flexibility for all of those choices. Yeah, but we've really run out of time, so we're expecting all of you at the next SIG call to continue the discussion. Thank you so much. Thanks.