Good afternoon, everybody. My name is Tom Nadeau, and I'm going to be giving a talk today, together with my friend Anil Vishnoi, about multi-cluster federation. Specifically, we'll be asking whether networking should impact a multi-cluster federation solution, in the context of container networking. On the agenda today, we'll start with a problem statement. I'll give you the overview and the problem statement, then Anil will take over and get into the details of the various solutions that exist today, along with the challenges those solutions face. We'll wrap things up at the end by checking whether they satisfy the requirements we set out at the beginning. We'll also be doing a live Q&A at the end of the session, so please hang around after the presentation. So, to start things off: when we looked at this problem space, we found a variety of solutions and a series of use cases. Talking not only to folks in the communities working on the different solutions, but also to customers who need these solutions to solve real problems, we broke things down into a few buckets of concerns that seem to be common across all of them. Each of the use cases we'll discuss today defines these parameters in different ways, and that leads to the conclusions we draw at the end of the presentation about which solutions are better or more optimal in which situations. We've broken this down into three primary areas, the first of which is location: the location of both the workloads and the clusters serving those workloads. Location breaks down into three concerns, the first being latency.
For example, when I, speaking as a user, run an application, is that application close enough to me to provide the user experience I expect? Is the service I'm consuming providing the right environment, for example low-latency gaming? In other words, are the workloads positioned close enough to the user to deliver the experience they're after? The second concern is jurisdiction. When we talk about multi-component solutions, the pieces of a microservice, if you will, may run in different geographic jurisdictions. Here we're concerned with issues such as GDPR: is data being stored and handled in a way that's appropriate and legal in that jurisdiction? As you'd imagine, the answer differs by country, sometimes by locality within a country, and by the kind of data involved. Data gravity is the third concern, and it's also related to location: does the data exist in one service provider or another? Large databases, for machine learning applications for example, might exist across multiple providers, and you have to ask whether you have the access permissions and so on to move them around. The next large bucket of concerns is isolation, of both the information and the operation of the system. Here we're interested in several kinds of cross-workload interactions. To begin with, there's the environment the workloads run in: development, testing, and production. Can these environments span multiple geographic locations, for example, and still function as expected? Then there's performance isolation. Here we have a situation many of you are familiar with: a data center with a Kubernetes cluster that is shared by two customers running two different sets of workloads. How does one workload impact the other?
For example, if one workload exhausts all of the memory available on a machine, the other workloads are going to get starved. Security isolation is another important concern. It's related to location, but security matters everywhere, particularly for sensitive data and untrusted code, and it also touches issues like GDPR, in the sense of preventing inappropriate access to the data wherever it exists. That leads to organizational isolation, which is the case I mentioned: customer A and customer B are tenants in the same data center. You want to keep their sensitive data apart, and you obviously don't want one tenant to be able to access the other. Even within a single organization, you may need to isolate different access and management domains, potentially for regulatory reasons. For example, a firm may have both an investment bank and a retail banking operation, and in the US, regulatory restrictions require that those be kept separate even though they're under one roof. Finally in this area there's cost isolation, which again relates to workloads impacting other workloads: you don't want the cost of one workload to have a significant impact on another, unrelated workload. For example, Kubernetes has a shared key-value store, a database that lives on a disk somewhere, on an SSD or whatnot. You don't want the cost of that actual equipment to be disproportionately consumed by one workload or organization over another without anyone expecting it. The final grouping of requirements we looked at is reliability. As Anil likes to say, what is the blast radius of the interaction between different workloads on the same clusters or in the same network?
Again, this is very similar to what I just described under isolation, but here in terms of performance and reliability. You don't want one workload to be able to hog the network resources and cause packet loss for another workload, and you don't want one workload to be able to crash the cluster, which is another common failure many of us have experienced. So these concerns relate to the two previous areas, but put a sharper point on them. A few more important points in this area. First, infrastructure diversity: the underlying zones, regions, and providers where things run. You all know that AWS, for example, has zones and regions. That allows you to do regional load balancing, but it also ties back into things like latency and jurisdiction, and that flexibility is something many of the solutions we've looked at need. That gets into scale: can the microservices fit on a single cluster? In some cases they can't; you have databases so massive they're not going to fit on a single cluster. Or, for the reasons we discussed earlier, you may want to spread them around as part of a high-availability strategy. And then there's upgrade scope, which is very much an operational concern: being able to upgrade without shutting down or pausing the service running on the clusters. You need a high-availability strategy that, say, lets you upgrade one cluster while another cluster continues to serve users, maybe at a lower level of performance, but it's on and it works. These are all very important things to consider when looking at the various solutions.
Another way to look at the different solutions we've considered here, and that Anil is going to get into in a second, is what I like to call the spectrum, or continuum, of size and scale. It's very much a multi-dimensional way to look at the problem, but it really comes down to the min and the max of these solutions. Kubernetes these days has moved to support very, very massive deployments, but in a lot of cases it has actually forgotten about the small use cases, or it has made the smaller use cases incur cost, performance, and operational penalties in order to work in a very small and constrained form factor. Cluster size is one example. We obviously have gigantic, geographically spanning clusters as part of the solution space we need to address. But how about a very small cluster, a single cluster with a single application running on it? Is that still cost effective? Is it operationally reasonable? Does it meet those other requirements about location and distribution? Can it actually work? Is it a viable base case, as I like to call it? Similarly, deployment location: can I deploy multiple small clusters in the same data center to begin with, and then later span them across multiple geographies or disparate data centers that are farther away? And does that impact things like security, access, performance, and cost? These questions underlie all the main areas. Then there's network connectivity. Some of the solutions rely on a flat layer-2 network. Some rely on a more sophisticated layer-3 network, or a VPN-like closed user group network. And some clusters span layer-3 boundaries and use gateway-type connectivity to bridge multiple areas.
But if you're introducing a gateway component, is it worth it, in the cost-benefit analysis, to deploy and maintain that as part of the solution? Then there are application-specific requirements. Different applications, as you'd imagine, have different needs. A database, for example, will have a requirement for the maximum time it can take to service a minimum number of requests per second. To scale that database up, you may have to spawn another cluster, or, for GDPR, you may need multiple clusters while restricting which data sits on any one of them. These are things you really have to consider. Then, what is the connectivity between those clusters? The application expects a certain level of performance or security; does the underlying network connectivity support that? Going back to the discussion about workloads interfering with each other, you can imagine a case where one workload is somehow able to use a disproportionate amount of the underlying network and adversely impacts another. A couple more interesting areas: the application administrative domain. This gets into security, but also scale. Can I have a single application that appears to run across multiple clusters and scales automatically? Or do I need to deploy multiple instances of the application, and then have more things to manage? And seamless operation, which follows from the previous point: if I deploy multiple application instances, that's two things I have to manage instead of one. Is it worth it? Does it work? Does the strategy scale, not just in performance, which is what we usually mean by scale, but operationally?
And then finally, the developer experience. This gets back to something I mentioned at the beginning: the developer friendliness of the environment itself. Getting any open source solution to be well adopted and well maintained over time requires a very accommodating developer experience. If it's hard for developers to develop on, they're just not going to contribute to the project. So with that in mind, I'm going to hand it over to Anil to get into some of the solutions. Sure, thanks, Tom. It looks like we lost Tom for a moment, so let me quickly review this slide. Given the problem statement Tom explained, with that wide range of problems and issues we need to resolve to come up with a good, consumable end-to-end solution, we tried to divide it into the broad challenges that have to be solved, and we came up with four areas to focus on. The first is multi-cluster lifecycle management, because the cluster is now a resource to the overall solution. In a single cluster you see the node as a resource, but in a multi-cluster solution the cluster is your primary resource. So you need to make sure its lifecycle management is seamless and provides a full range of features. Consistent application deployment is another critical feature. Whether you're dealing with a single cluster or multiple clusters, to a user they are just deploying into a cluster, so the application deployment lifecycle should be consistent regardless of whether it's one cluster or many.
That brings us to the next part, seamless service connectivity, because if your application deployment lifecycle is consistent, it should give you seamless service connectivity as well. For a user, the service is what matters; it does not matter whether it's running on a single cluster or on multiple clusters across jurisdictions. They want to say, this is my service and these are my service-related policies, and deploy it. For them it needs to be seamless, irrespective of the complexity in the backend. And once you provide that kind of seamlessness, the traffic coming to the services, or between the pods across clusters, needs to be seamlessly routed. If that is something the user has to take care of, the whole solution becomes very complex; it might still work, but it won't be as consumable as we want, given that we're talking about deployments that go to planet scale. So these are the challenges we definitely need to address. But whatever solution we come up with also has non-functional requirements. The solution needs to scale, since we can imagine planet-scale deployments with many clusters. It needs to perform: if you run a service in a single cluster and then deploy the same service across multiple clusters, the performance should not take a significant hit. The service deployment time should be reasonable, for example, and the same goes for any other resource you deploy on the cluster; otherwise it defeats the purpose of having a multi-cluster solution. High availability and disaster recovery go without saying, because any production-level solution needs them. And it needs to be debuggable.
Debuggability is critical in most of the use cases, I would say, because customers need five-nines kinds of availability and they have SLAs to match. If a system at this large a scale is not easily debuggable, your downtime can increase significantly, and that is something nobody likes. So it needs to be a highly debuggable solution. It also needs observability, because if you want to increase debuggability, you obviously need observability in the system: visibility into every corner, from the packet level up to the user-level APIs. If you look at these, they are really the basic requirements for any good product you want to deploy in production. Maintainability is extremely critical as well, and I would put it at the top level of requirements. Because of the size of the solution, it needs to be maintainable; it should not turn into a nightmare for the operations team managing it. If it's extremely hard to manage, it's very hard to bring customers onto that kind of solution and give them the service you want to give them. Then there's the consumption side. Now we are talking to the user: you have these big clusters whose resources you can provide for deployment, and the user consumes them. If your interfaces are heavily impacted by what kind of backend networking, backend storage, or backend compute nodes you have, consumption becomes very hard, because the user has to keep a lot of that in mind when they want to deploy their application. So we need very simplified consumer interfaces for these large-scale clusters. And finally, deployability, which is a day-one and day-two problem, given that a cluster's lifespan can be very short or very long.
For short-lived clusters, you need to make sure a cluster can be deployed very fast; you can't wait a day or two to configure the backend networking and bring it up. So when we look at an overall solution that addresses all the functional aspects, we still need to qualify it against all these criteria for a production deployment. Now, for any solution we might start writing, we're going to focus on two aspects here: the API definition level and the network level, because we want to orient this session toward networking. Obviously there are various other aspects to take care of, but for this session we're focusing on these two. At the API definition level: given that you have multiple clusters, you need to expose the constructs you have in a single cluster at the multi-cluster level, and you need to define what those constructs mean at the multi-cluster level and what they mean at the individual cluster level. Once you define these constructs, there will be data behind them, so you have to decide how you're going to share and maintain that data across the clusters, with all the usual problems that come with distributing state. And you need to preserve that data; you can't lose it, because in a multi-cluster solution this metadata is even more critical: the state of your planetary-scale solution lives in it. So at the API definition level, you need to think through these questions and provide manageable APIs and interfaces for these constructs and their metadata at the multi-cluster level. The second aspect we're going to focus on is the networking level: the connectivity between the clusters.
So these are the two aspects we're going to use. We'll go through some of the existing solutions and see how much each one does against these individual criteria, and how they solve the broad challenges I discussed on the previous slide. The first solution, which is pretty popular and which I think a lot of people dealing with multi-cluster already know, is the Istio service mesh. At the API definition level, Istio defines its own constructs, such as the ServiceEntry, which declares a service that needs to be exposed across multiple clusters. It has a DNS resolver so that service names can be resolved across clusters. It has a cluster registry, which covers part of the cluster management story, because you have multiple clusters to track. And it has the Istio gateway: an Envoy-based standalone container that acts as a gateway for an individual cluster and can enable L3 routing between your clusters. If your services are deployed across multiple clusters, the traffic goes through these gateways, and the gateways take care of the networking aspect. At the networking level, Istio is not proposing one specific solution; it doesn't bake in a particular network design. Instead, it supports multiple deployment models and shows how, using these networking models, you can deploy Istio in multiple ways to serve your specific use case. It then comes back to the user to decide: given the way my data center network is configured, this networking model will probably work, and I'm going to use it. Underneath, these models use VPN and gateway connectivity between the clusters.
The main core component doing the routing is still the Istio gateway, which is Envoy-based, and it does all the magic for you in any of these network models. At the service mesh level, Istio provides services that span multiple clusters, and there are multiple variants of this: you can have a single service mesh spanning multiple clusters, or multiple service meshes across multiple clusters, and Istio provides the user APIs to configure your services either way. Then there are the control plane configurations. With multiple clusters, you can have a shared control plane or a replicated control plane, in the sense of running a control plane per cluster versus one shared control plane that manages all of the Istio deployments or service meshes. As of now, I think the current state is that they support three deployment models, and there is a lot of good recent work that simplifies the networking aspect here as well. The three deployment models they propose are: replicated control planes, with an Istio control plane on every cluster; a shared control plane with a single network, where your data centers are connected by flat networking and you use the shared control plane to connect those data centers and deploy services across them; and a shared control plane with multiple networks, where your data centers or clusters run totally different networks across L3 boundaries, and that's where the Istio gateway is used to connect them.
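To make the API definition side a bit more concrete, here is a rough sketch of what exposing a remote service looks like in the replicated-control-planes model, based on the Istio multi-cluster documentation of that era. The `httpbin.bar.global` host, the port numbers, and the placeholder gateway address are illustrative assumptions, and the exact fields vary by Istio release.

```yaml
# Hedged sketch: a ServiceEntry in cluster A that makes httpbin, which actually
# runs in cluster B, reachable through cluster B's Istio ingress gateway.
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: httpbin-remote
spec:
  hosts:
  - httpbin.bar.global          # the .global suffix is resolved by Istio's CoreDNS plugin
  location: MESH_INTERNAL
  ports:
  - number: 8000
    name: http
    protocol: HTTP
  resolution: DNS
  addresses:
  - 240.0.0.2                   # virtual IP, only meaningful inside the mesh
  endpoints:
  - address: CLUSTER_B_GATEWAY_IP   # placeholder: remote cluster's gateway address
    ports:
      http: 15443               # SNI-aware port on the remote ingress gateway
```

The point of the sketch is the division of labor the talk describes: the ServiceEntry is the multi-cluster construct at the API definition level, while the Envoy-based gateways on each side carry the actual cross-cluster traffic.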
So if you look at this slide, starting from the top item: Istio does some multi-cluster lifecycle management, but that's not its focus. Because it provides a service mesh across multiple clusters, it has to take care of some lifecycle aspects, like connecting or onboarding a cluster so that the Istio control plane knows there is one more cluster where the user wants to deploy services. But its main focus is consistent application deployment across clusters and seamless service connectivity; those are fully supported. For multi-cluster lifecycle management and seamless packet routing, routing L3 requests or routing requests through the gateway, Istio gives you multiple ways of doing it. You can use the gateway and deploy it in various ways, but it does not give you a fully seamless solution, in the sense that a user can't just say, I want to deploy this, and have all their problems solved. So Istio is not addressing every aspect, but it is definitely addressing some of them. On the slide, anything in orange means partial support, and anything in blue means full support. The next solution is Submariner, which is focused on network connectivity: it solves the network connectivity aspect. But you can't solve network connectivity in a multi-cluster federation without knowing about your multiple clusters and what resources are on them.
So the Submariner project created some subprojects, called Admiral and Lighthouse, which do the metadata collection for the clusters. That gives Submariner awareness that these clusters exist in the system and networking needs to be enabled between them, and it covers the service discovery aspect: you have services deployed across clusters and you want to connect them, so you definitely need to discover them. There is no existing functionality in Kubernetes as of now that does cross-cluster service discovery, so they built those projects around it. There is also work happening in the Kubernetes multi-cluster group to define a multi-cluster Service API, addressing the problem of connecting services, which is one part of this overall solution. Submariner is planning to implement that, because it gives them the service discovery and metadata sharing in a standard way defined through those APIs. At the networking level, Submariner enables overlay-based connectivity between the clusters, using IPsec and VXLAN, with gateway pods handling the NATing between clusters. If traffic wants to go from one cluster to another, it is directed toward the gateway, encapsulated there, and then routed to the gateway of the destination cluster. It provides network policy enforcement, which is work currently in progress. The only downside is that, as of now, it does not support overlapping IP addresses, but supporting that use case is on the plan.
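To give a feel for the consumption model, here is a hedged sketch of what exporting a service looks like once Lighthouse adopts the proposed multi-cluster Service API. The `nginx` service and `demo` namespace are made-up names, and the API group and version reflect the early proposal, so they may differ in later releases.

```yaml
# Hedged sketch: in the cluster that owns the Service, mark it for export
# to the other connected clusters. The ServiceExport name must match an
# existing Service in the same namespace.
apiVersion: multicluster.x-k8s.io/v1alpha1
kind: ServiceExport
metadata:
  name: nginx
  namespace: demo
```

Once exported, consumers in the other connected clusters resolve the service through Lighthouse DNS at a cluster-set scoped name, along the lines of `nginx.demo.svc.clusterset.local`, while Submariner's gateways carry the actual packets between clusters.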
Now, if you put this solution against the four broad challenges we listed, Submariner does not do multi-cluster lifecycle management or consistent application deployment across clusters, because those are simply not the focus of the project. But it provides seamless service connectivity for some of the use cases, depending on the cluster and the service you have, and it provides seamless packet routing, because that is the main focus of the project. The next one is Razee, which sits at the other end of the spectrum of these broad challenges. What it does is multi-cluster lifecycle management, and it provides consistent application deployment across clusters, but it doesn't do any seamless service connectivity or seamless packet routing. It relies on your L3 networking, with the assumption that your clusters are connected through L3 networking or are reachable through IP addresses, so you definitely need gateways or reachable public IP addresses, and it deploys on top of them. That restricts the solution from multiple angles, in terms of, say, what we described for Istio: stitching multiple services across multiple clusters. There are limitations there, but the focus of this project is multi-cluster lifecycle management, providing clusters as a resource to your users so they can deploy applications across them. Red Hat has a solution called Advanced Cluster Management, which also focuses on multi-cluster lifecycle management and also provides a very consistent application deployment model across clusters.
It does partial seamless service connectivity, because it exposes some more constructs, such as managed clusters, channels, subscriptions, and placement rules, and it exposes APIs to deploy resources across multiple clusters. Using these, users have a bit more control in defining service connectivity, but it does not address all the use cases yet. Seamless packet routing is also not in the scope of this project; there is no dedicated networking solution doing the magic in the background, which is why you need a complementary solution alongside it, such as Submariner. Skupper is another one, and we added this project to the list because it addresses L7 networking. Can a user deploy their solution across two clusters and stitch a service across them? At a small scale it definitely can be done, because Skupper does that, and you don't really need a dedicated networking solution, since it handles all the traffic at layer 7. That means you can deploy two services across clusters, and when you send HTTP traffic, Skupper routes it through L7 routers and gateways. So you're dealing with L7 networking, not even L2 and L3. Their position is: whatever basic networking you have, as long as the endpoints are reachable, we can take care of the traffic routing and service stitching at the L7 level. That is a very interesting solution that pushes the networking part out to the L7 boundary. And I quickly want to touch on the MCS API, the multi-cluster Service API proposed by the Kubernetes Multi-Cluster SIG. The intention there is to go through the implementations that are out there and come up with APIs that are more generic, so that most of these solutions can adhere to them.
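To make the Advanced Cluster Management constructs mentioned above a bit more concrete, here is a hedged sketch of its application model: a Channel pointing at a repository of manifests, a PlacementRule selecting target clusters by label, and a Subscription tying the two together. The repository URL, namespace, and labels are illustrative assumptions, and exact API versions vary by release.

```yaml
# Channel: where the application manifests come from (a Git repo here).
apiVersion: apps.open-cluster-management.io/v1
kind: Channel
metadata:
  name: app-repo
  namespace: demo-app
spec:
  type: Git
  pathname: https://github.com/example/app-manifests.git   # illustrative URL
---
# PlacementRule: which managed clusters the application should land on.
apiVersion: apps.open-cluster-management.io/v1
kind: PlacementRule
metadata:
  name: eu-clusters
  namespace: demo-app
spec:
  clusterSelector:
    matchLabels:
      region: eu            # only clusters labeled region=eu
---
# Subscription: deploy what the Channel holds to wherever the rule places it.
apiVersion: apps.open-cluster-management.io/v1
kind: Subscription
metadata:
  name: app-sub
  namespace: demo-app
spec:
  channel: demo-app/app-repo
  placement:
    placementRef:
      kind: PlacementRule
      name: eu-clusters
```

Note how this model stops at placing workloads: nothing in it configures the network path between the clusters, which is exactly why a complementary connectivity layer is needed underneath.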
That way it becomes easy to maintain any kind of multi-cluster lifecycle management on top of it, because you have generic APIs. Take Kubernetes as an example: the networking APIs, storage APIs, and node APIs are all generic, so any solution written against them can stay backward compatible. The SIG is coming up with those kinds of APIs to address these challenges, and there are some implementations of these APIs today. One is from Cisco, a kube-proxy-based implementation of the multi-cluster Service API that operates around the boundary of L2 and L3. The Submariner project is also implementing the MCS APIs. And there is another project, MC robot, which is not literally implementing these APIs but is aligning along the same path. So there's a lot of interesting work going on in the community in this field, and some of it is worth keeping an eye on. And that brings us back to our original question: should networking impact the solution? Tom, I think you can give the answer. Yeah, the answer is: theoretically, no; but practically, as Anil has shown you, yes. Unfortunately, today we have solutions that are a little too glued to the networking, and practically speaking, Anil talked about some of the work that is moving us past that so we're not bound by the issue. So really, our conclusion in this talk is that we need to focus on solutions that, irrespective of the network layer, are simple to consume, maintain, and operate. If we follow these guidelines in whichever of these solutions ends up being the predominant one at the end of the day, including by modifying the ones that exist today, we can definitely hit this goal.
So with that in mind, on behalf of Anil and myself, I want to thank you all for taking the time to listen to our talk today. We hope you have some interesting questions and feedback. Thank you very much.