Hello everybody. I'm Paul Morie, and this is the SIG Multi-cluster intro for KubeCon EU 2021. I've got with me Jeremy Olmsted-Thompson. Hello, I'm Jeremy. I'm a software engineer and I work on GKE at Google. Today we're going to talk about what SIG Multi-cluster is focused on and what we're all about. We'll go through our current activity and the projects we're working on, and then we'll get to the most important part: how you can contribute, and please do.

So what is this SIG about? Multi-cluster is an extremely broad topic that means different things to different people depending on who you ask, but our best definition so far is simply making multiple clusters work together somehow. We touch many different functional areas, but we're still trying to figure out what the best, most durable primitives are. We want to start with the basic pieces that all multi-cluster deployments need to function.

We really need your input. We're looking for real user stories and use cases, and for feedback on the things we've developed: how they work for you, or maybe don't. Many of our projects are still at an early stage and still malleable, so your feedback can really shape the direction we take. If we standardize around best practices, or try to push best practices, you can help shape them. We're also finding that as we develop these new tools, we expose new needs: every time we make something easier to do, it leads to a whole batch of new questions that need answers.

Our approach in this SIG, which comes from years of lessons learned in previous attempts to solve problems in this space, is centered around a few keystone values. The first is that we want to avoid premature standardization: it actively works against progress if we create new standards before they're battle-tested and widely adopted.
We also want to avoid solving optional problems at this point. As Jeremy said, the multi-cluster area is so broad and has so many different facets that if we allow ourselves to be distracted by shiny optional problems, we won't be able to solve the core, fundamental ones as well as we need to. We also want to focus on specific functionality we want to build, working backwards from specific problems into something bigger, if that makes sense.

Let's talk about the Cluster Registry as one of the areas we have progress to report on. The Cluster Registry has historical roots as the single point of agreement in the community after Federation v1: it makes sense to track multiple clusters when you're dealing with multiple clusters. But it was developed without a clear vision for what problems were being solved. One piece of wisdom we gained looking backwards is that the Cluster Registry was attractive because, intuitively, it seemed like it should be easy. Spoiler alert: it wasn't. It was so not easy that it was actually very hard to accomplish something meaningful and durable at the time, with what we knew then, and the project eventually began to wither on the vine. Just a couple of weeks prior to the recording of this talk, the Cluster Registry was officially retired, because we believe it has been superseded as a concept by some of the things we'll talk about later.

That last one wasn't an active area, so the first of the many active areas we're going to talk about today is called KubeFed. You may have heard of this before; you may have also heard the term Kubernetes Federation. The TL;DR is that KubeFed is the second attempt at building something that spreads resources from one copy out to multiple clusters.
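To make that concrete, here's a rough sketch of what a KubeFed federated resource looks like. This is illustrative only: the API group and version can differ across KubeFed releases, and the cluster names `cluster1` and `cluster2` are hypothetical placeholders for registered member clusters.

```yaml
# A federated Deployment: one copy in the host cluster,
# propagated to the clusters listed under placement.
apiVersion: types.kubefed.io/v1beta1
kind: FederatedDeployment
metadata:
  name: hello
  namespace: demo
spec:
  template:                # the ordinary Deployment to propagate
    metadata:
      labels:
        app: hello
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: hello
      template:
        metadata:
          labels:
            app: hello
        spec:
          containers:
          - name: hello
            image: nginx
  placement:               # which member clusters receive it
    clusters:
    - name: cluster1
    - name: cluster2
  overrides:               # per-cluster differences
  - clusterName: cluster2
    clusterOverrides:
    - path: "/spec/replicas"
      value: 5
```

The template, placement, and overrides sections here correspond directly to the three pieces of the federated API surface described next.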
The KubeFed model is built around new API surfaces that are distinct from the ordinary Kubernetes API and from any ordinary custom resources or aggregated APIs you may have. A fundamental primitive of KubeFed is to generate a new API surface, the so-called federated API, that encompasses a template definition; overrides, by which we mean a way to differentiate certain parts of a resource when it lands on a particular cluster; and placement information. This model works well for some users. In fact, KubeFed is on its way to beta, and may even be beta by the time you see this. KubeFed is also currently considering how to add pull reconciliation: the initial model implemented in KubeFed was a push model, and they're looking at how to add pull. This touches the cluster registry and cluster-tracking kind of concept, in the sense of: how does a pull reconciler running in a remote cluster gain access to observe the cluster where the KubeFed API surface exists?

The next few areas I'm going to talk about are all related, and they start with a new concept we've developed called the ClusterSet. Since its conception sometime last year, the ClusterSet doesn't really correspond to an API, but it is a concept we're building around. As we mentioned before, multi-cluster means different things to different people, and what we wanted to do is create a well-understood, bounded definition of the multi-cluster space that we're at least going to start trying to address. This is where the ClusterSet comes in. It's a pattern of use we see in the field, which gives us something concrete to work with: a group of clusters governed by a single authority. That authority could be you as a user, a company, or a team, but it's somebody who has formal authority to make strong statements about all of the clusters within that set, with a high degree
of trust within the set. We're looking for clusters that work together in much the same way that workloads within a cluster do, where communication is generally allowed (maybe not between every service and every namespace, but certainly within a subset). If two clusters should never talk to each other, they probably don't belong in the same ClusterSet. Expected access patterns are similar: the same service is governed by the same user or team, in generally the same way. It's a domain where, again, you can make strong statements about how those clusters work together, with some degree of consistency across them.

That's where namespace sameness comes in, and it's the biggest piece here: the concept that a namespace, and the names within it, have a consistent meaning in all of the clusters within a ClusterSet. A given user's permission to access a namespace is consistent across clusters: if I have namespace foo and can access it in clusters A and B, and namespace foo is present in cluster C, I can also access it in C. I don't have to think about which cluster I'm dealing with; I can just think about namespaces. A namespace doesn't necessarily have to exist in every cluster, but it should behave the same in the clusters where it does. We'll get to why this is important a little later.

The first piece I want to talk about here is some new work called cluster ID. This is a new KEP; it's on its way to alpha right now and will hopefully be implemented by the time you're watching this, or soon after. Basically, we're introducing a new cluster-scoped ClusterProperty CRD that is essentially a name-value pair. What we want to do is create self-awareness. Kubernetes clusters are very introspective by nature; for the first time, we want a way for
clusters to self-identify who they are in the broader context. The first way we do this is by introducing a well-known resource: a ClusterProperty instance named id.k8s.io, which corresponds to a unique identifier for that cluster, one that is unique within some well-understood time frame: basically, as long as that cluster belongs to a ClusterSet. The other piece is a way for a cluster to identify the ClusterSet to which it belongs. A cluster will have a clusterset.k8s.io ClusterProperty; this property could be the name of the ClusterSet, or some kind of mapping to that cluster's membership, but it's some way for the cluster to tell which group it belongs to. The point is to give us a way to uniquely identify clusters within a ClusterSet for the lifetime of their membership. You could absolutely use the ID beyond the membership, even for the lifetime of the cluster, but we really want to make sure there's a way to uniquely identify clusters as long as they're in a ClusterSet, so we can start building on that.

This gets into the most important part: we now have a reference point for multi-cluster tooling to build on. You could use this to disambiguate backends for headless services across clusters (which we'll get to in a minute), as a coordinate for scheduling work, or even as a way to annotate metrics and logs. If you're feeding in a bunch of data from your ClusterSet, you have a tag to trace back where that data is coming from, which helps with root-cause analysis and the like. We think this is a really important building block. I just want to shout out to Laura for driving this work; it's been awesome to see the progress here over the last few months, and this is really exciting.

So let's talk about how we're using this stuff: multi-cluster services. This
has been an ongoing project in the SIG for a bit over a year now, and it has been making great progress. Services are a multi-cluster building block, of course: if you have multiple clusters, you have services, and you want to consume them across clusters; that's how we build applications on Kubernetes. Multi-cluster services builds on the concept of namespace sameness and allows a single service to span, or be consumed by, multiple clusters as though it were local to each cluster. For example, if I have a service named foo in namespace bar, and I've deployed it across multiple clusters using the MCS API, I can tie those together into one global bar/foo service that can be consumed from anywhere, as though it were a local service in that cluster; the consumer really doesn't have to care.

We've tried to focus only on the API and the common behavior that's necessary in all platforms (really, the experience of consuming multi-cluster services), and to leave as much room as possible for various implementations. We've already started to see some of these spring up: Submariner has a great open-source implementation, there's a managed offering with Google Kubernetes Engine, and service meshes like Istio have begun to adopt the API as well and are currently working on an implementation.

We've left as many decisions as we can up to the implementation. A big one is that, the way we've designed this, a control plane can be centralized or decentralized, but consumers only ever rely on local data. In one implementation it might make sense to have one controller connect to all the clusters and manage things; in another, it might make sense for each cluster to reach out to the other clusters and pull the information it needs. But the experience for the consumer is always the same: there's
a consistent resource, which we call ServiceImport, available within each cluster, and it helps you discover the endpoint information you need to connect to services in other clusters. So the consumer gets the same behavior no matter what the implementation does. The end goal is that you can take the ClusterIP and headless services you have today in a single cluster, turn on the multi-cluster services API with an implementation, and get a multi-cluster equivalent service that spans all of your clusters but can be consumed in exactly the same way as a cluster-local service. They work exactly as expected from a consumer's perspective.

Another area where we have ongoing work is called the Work API. The pattern we described previously in federation has a one-to-many mapping from a resource in the control plane to resources placed on different clusters, and ships individual resources as the unit. In contrast, the Work API takes a different approach: the unit being shipped around is a collection of resources rather than an individual resource. This is currently in a pre-alpha state, with a KEP-like document coming together and a little bit of code going into the kubernetes-sigs Work API repo. This is another area where we're working backwards, in the sense that we have an initial concentration on finding the right API surface for a single cluster to have work applied within it, before we do higher-level things. The next steps also approach the registration concept: how does the cluster where work is defined know about the clusters where it is applied, and how do the clusters that are supposed to have work applied to them know about the cluster where the work is defined? No pun intended, but very much appreciated: it's still a work in progress, and we need some input. So this is one area where you can
contribute. As you can see, one challenge that's absolutely present in this space, one that touches a lot of different areas and comes back to that first slide, is that tracking clusters is a palpable concept: it's easy to talk about conceptually, but hard to materialize into functional software. It's also a strange attractor, tempting to build first simply because it seems easy. We're avoiding characterizing a registry as a first step; you may have sensed that as a theme running through this presentation, and we're very careful to avoid doing that. We're also avoiding gravitating towards standardization on a registry, because we think it's important that there be room for many different schemes that best fit particular situations.

So now that we have multi-cluster primitives, what's next? That's the big question we're left with, and that's where you come in. We need your input. We'd love for you to share your use cases, problems, and ideas with us, and you can see here we've got the coordinates for the home page, the Slack channel, and the mailing list. Our meetings are Tuesdays at 12:30 Eastern, 9:30 Pacific, and 16:30 UTC. Thanks a lot for coming, everybody. Thanks, everyone; I appreciate your attention and your time. Come help us define multi-cluster Kubernetes. Yes, we'd love to have you, and we look forward to seeing you at the next meeting of Kubernetes SIG Multi-cluster. Have a great day, everybody. Thanks, everyone.
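As a postscript to the multi-cluster services discussion above, here is a rough sketch of the MCS API flow: exporting a Service makes it available to the ClusterSet, and the implementation materializes a ServiceImport in each cluster for consumers. The `foo`/`bar` names and the port are hypothetical, and the API group and version shown here are from the early v1alpha1 of the mcs-api project; they may differ in the version you use.

```yaml
# In a cluster that runs the service: exporting an existing
# Service named foo makes it available to the ClusterSet.
apiVersion: multicluster.x-k8s.io/v1alpha1
kind: ServiceExport
metadata:
  name: foo        # must match an existing Service in this namespace
  namespace: bar
---
# In each consuming cluster, the MCS implementation (not the user)
# creates a ServiceImport that local workloads can discover,
# e.g. via a clusterset-scoped DNS name like foo.bar.svc.clusterset.local.
apiVersion: multicluster.x-k8s.io/v1alpha1
kind: ServiceImport
metadata:
  name: foo
  namespace: bar
spec:
  type: ClusterSetIP
  ports:
  - port: 80
    protocol: TCP
```

Note that the user only ever writes the ServiceExport; the ServiceImport is derived state, which is what lets consumers rely purely on local data regardless of whether the implementation's control plane is centralized or decentralized.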