Welcome to this session. In this session, we're going to talk about breaking through cluster boundaries to autoscale workloads across them at scale. This is Xinyan Jiang, a software engineer at DaoCloud, and my name is Ying Zhang, a software engineer at Zendesk. We're going to cover HPA pod autoscaling, the benefits of autoscaling across multiple clusters, an introduction to Karmada, and the Karmada features FederatedHPA and CronFederatedHPA. Then we'll do a live demo and leave three to five minutes for Q&A.

I'm assuming everyone is familiar with HPA, but let me give a quick refresher. HPA, the Horizontal Pod Autoscaler, is a mechanism for automatically scaling the number of pod replicas in a Kubernetes cluster, so that an application is dynamically scaled based on current metrics. During each sync period, the HPA controller in the controller manager calculates resource utilization based on the metrics specified in the HPA definition. The controller identifies the target resource defined by the scale target reference, selects the pods using the selector labels from the target resource, and retrieves the metrics from either the resource metrics API or the custom metrics API.

Here is an example of an HPA. The number of replicas of the php-apache deployment will be increased or decreased to maintain an average of 50% CPU utilization, which means the replicas will be scaled up or down if the average CPU utilization is above or below 50%. (A minimal manifest of this example is shown below.)

HPA works perfectly within a single cluster. But when we think about multiple clusters, how can we handle autoscaling across them? Here are some of the benefits. We can have unified management of autoscaling operations across clusters, which reduces operational redundancy, and we can break through the resource limitations of a single cluster, for example when massive requests come in over a short period of time and we run out of instances. Also, to meet different scenarios, we would like a variety of strategies for scaling workloads across multiple clusters, not only by cluster name. The scenarios can be things like disaster recovery (cluster-level autoscaling for disaster recovery), and you can also think about scenarios like cost efficiency.

But how can we achieve that? Karmada can help with that. Let me take a moment to introduce Karmada. It is a Kubernetes management system that enables running cloud-native applications across multiple Kubernetes clusters and clouds. It has a number of capabilities, listed here. I'll start from the first one: it is Kubernetes-native-API compatible, so it speaks the Kubernetes native API directly, which means you don't have to make any changes to your existing applications. It is an open-source tool; a couple of months ago it moved from a CNCF sandbox project to incubation. It avoids vendor lock-in, with integrations for the majority of cloud providers. It also provides built-in policies for scenarios like active-active, remote disaster recovery, and geo-redundancy.
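A minimal sketch of the HPA example described above, assuming the standard php-apache walkthrough workload; the name and the min/max replica counts are illustrative, and only the 50% CPU target comes from the talk:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache               # illustrative name, matching the example described above
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache             # the deployment whose replicas are scaled
  minReplicas: 1                 # illustrative bounds
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50   # keep average CPU utilization around 50%
```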
It has a variety of workload scheduling policies, like cluster affinity, multi-cluster splitting and rebalancing, and multi-dimensional high availability across regions, availability zones, clusters, and cloud providers. It also offers centralized management. With that, let's move to the next slide, which shows what Karmada looks like.

On one side it shows the Karmada control plane, which looks very much like the Kubernetes native control plane. It has a Karmada API server, a set of controllers managed by the Karmada controller manager, the Karmada scheduler, and etcd as the data store, where all the objects are stored. The member clusters, the individual Kubernetes clusters, can be joined to this control plane in either push mode or pull mode.

On the other side it shows the Karmada workflow. As I mentioned before, Karmada is Kubernetes-native-API compatible, so it uses a resource template that is exactly the same as the Kubernetes API. Moving down a bit, with propagation policies the user defines how the workload should be propagated to the member clusters. The policy controller binds those propagation policies with the resources accordingly. After the resource bindings are generated, the binding controller takes the override policy, applies a JSON patch to the resource binding, and generates the work objects in the cluster-specific namespaces. The propagation controller then carries out the actual propagation to the member clusters.

This slide is more like a before and after. Before using Karmada, a user or some process manually deploys each deployment to each member cluster, which means a lot of engineering toil and duplicate operations. After adopting Karmada, there is a unified, centralized control plane, so all those workloads are propagated to the right member clusters based on the propagation policy. It's one step instead of redundant operations.

You may say, "I already have a perfect CI/CD pipeline that handles all of that work for me; I'm not deploying this stuff manually." But Karmada also offers advanced workload propagation policies. Scheduling is not only based on cluster names; it can be based on labels, fields, taints and tolerations, and also topology and available resources. Let's take an example of topology and available resources. On one side of the slide, most of the workload is scheduled to zone X and less workload to zone Y, and you can define all of that in your propagation policy. On the other side, more of the workload is scheduled on member cluster 2 in the middle because it has more available CPU. That makes sense, right? In AWS, for example, we have reserved instances and we want to make full use of them, so we schedule more workload there, and on member cluster 1 and member cluster 3, which have less CPU available, we schedule less workload. That's better for cost efficiency.

I've been talking a lot about propagation policies. What does one look like? Here is an example. It shows how we propagate an nginx deployment. You can see there is a cluster affinity: this deployment is going to be propagated to member cluster 1 and member cluster 2, with the replica division preference set to Weighted in the last two lines. So the replicas can be statically divided, with static weights such as one and two. (A sketch of such a policy is shown below.)
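A sketch of the propagation policy described above, based on the Karmada documentation; the member cluster names and the nginx deployment come from the talk, while the policy name and the exact weights are illustrative:

```yaml
apiVersion: policy.karmada.io/v1alpha1
kind: PropagationPolicy
metadata:
  name: nginx-propagation                     # illustrative name
spec:
  resourceSelectors:
  - apiVersion: apps/v1
    kind: Deployment
    name: nginx                               # the deployment to propagate
  placement:
    clusterAffinity:
      clusterNames:                           # propagate only to these member clusters
      - member1
      - member2
    replicaScheduling:
      replicaSchedulingType: Divided          # split replicas across clusters (vs. Duplicated)
      replicaDivisionPreference: Weighted     # divide according to static weights
      weightPreference:
        staticWeightList:
        - targetCluster:
            clusterNames: [member1]
          weight: 1
        - targetCluster:
            clusterNames: [member2]
          weight: 2
```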
It also offers dynamic weighting, so you can define the division based on the available resources. Under that, the last line is the replica scheduling type. There are currently two types available: one is Duplicated, the other is Divided. In the demo we're going to use Divided, but Duplicated is useful, for example, when you're doing a blue-green migration and, during the migration, you want the workload to exist on both clusters. I don't think I've covered all of the policies; there is a URL at the bottom that takes you to the Karmada website, where there is more detailed information.

The other scenario I would like to mention is cross-cluster application failover. On the right, there are three member clusters registered to the control plane, and one of them is in trouble. Maybe the API connection was cut off, or something else you can imagine; all of a sudden member cluster 1 is unavailable and not ready. That is detected by the cluster controller as a discovery failure, and the workload on member cluster 1 is automatically rescheduled to the other available clusters. So it migrates gracefully and ensures uninterrupted service.

All right, I've covered the first topics, and now I'm going to hand over to Xinyan, who will walk you through the Karmada features FederatedHPA and CronFederatedHPA.

Okay, thank you. As we all know, when we want to use a resource in Kubernetes, we should first look at the API definition of that resource. So now I will introduce FederatedHPA. As you can see, FederatedHPA is consistent with the Kubernetes native HPA. For example, the spec field is very similar to the native HPA's: they both have fields such as minReplicas, maxReplicas, metrics, behavior, and so on. You can also see that the metrics field here references the field types of the native HPA.

Okay, so how can we use FederatedHPA? Let's take a close look at how to migrate. As I described earlier, FederatedHPA is very similar to the native HPA, so if you have any experience using the native HPA, you can easily work with FederatedHPA as well. For example, on the left is the native HPA and on the right is the FederatedHPA. You can see that the two examples only differ in apiVersion and kind. Therefore, you don't need to make any changes to your existing HPA's spec field. (A sketch of a FederatedHPA manifest is shown below.)

This is very important. So let's see how it works. The FederatedHPA controller on the Karmada control plane obtains metrics for the deployment through the Karmada metrics adapter and dynamically scales the number of replicas of the deployment, and then the Karmada scheduler schedules the newly added replicas to the different member clusters, for example member 1 and member 2, based on the scheduling policies specified by the user in the propagation policies. Ying has already introduced the propagation policy. This enables cross-cluster autoscaling.

The FederatedHPA controller is located in the Karmada controller manager component. It queries metrics from the Karmada API server and scales the replicas of deployments accordingly. The Karmada API server queries the Karmada metrics adapter, and the Karmada metrics adapter queries the metrics servers of the member clusters to obtain resource metrics or custom metrics, aggregates them, and finally returns them to the Karmada API server.
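A sketch of a FederatedHPA, assuming the autoscaling.karmada.io/v1alpha1 API group from the Karmada documentation; the workload name and the numbers are illustrative, and the point is that only apiVersion and kind differ from the native HPA:

```yaml
apiVersion: autoscaling.karmada.io/v1alpha1   # only apiVersion and kind differ from the native HPA
kind: FederatedHPA
metadata:
  name: nginx-fhpa               # illustrative name
spec:
  scaleTargetRef:                # same spec fields as the native HPA
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
```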
So you can see that FederatedHPA emulates the mechanism of the native HPA. With the latest Karmada release there is a new API, CronFederatedHPA. It is mainly used for scenarios like this: if there is a traffic spike every day at, say, 9 a.m., I would like to proactively scale up the related services ahead of time, for example 30 minutes before, to handle the peak load and ensure availability. Another scenario is scheduling database migration jobs on weekends. In general, CronFederatedHPA is used for regular, scheduled autoscaling actions. It can scale workloads that have a scale subresource, or it can scale a FederatedHPA.

Let's take a look at two examples of CronFederatedHPA. On the left, we have an example of scheduled scaling for a FederatedHPA: it includes a schedule field for writing crontab expressions and targets the minReplicas field. On the right, we have an example of scheduled scaling for a deployment; it also includes a schedule field for writing crontab expressions and a target replicas field. When the specified time arrives, the CronFederatedHPA controller triggers a scale-up or scale-down action, allowing you to proactively scale your deployment up or down. (A sketch is shown below, before the demo.)

Karmada implements CronFederatedHPA as a control loop that proactively checks the cron schedule time. If the scheduled time is reached, it scales the workload's replicas, or the FederatedHPA's minReplicas or maxReplicas. So you can see that the CronFederatedHPA working mechanism is very simple.

As I described earlier, CronFederatedHPA can scale workloads with a scale subresource and it can scale a FederatedHPA. So let's take a look at the considerations for usage. In general, it is important to ensure that the scaling operations performed by CronFederatedHPA do not conflict with any other ongoing scaling operations. So it is recommended to use CronFederatedHPA to scale the FederatedHPA, and then let the FederatedHPA scale the workloads based on their metrics.

Now let's look at the advantages of and notes about CronFederatedHPA and FederatedHPA. Advantages: the API is almost identical to the Kubernetes native HPA, with the same user experience and a low migration cost. Notes: FederatedHPA is a centralized multi-cluster HPA, so when concurrent scaling on a large scale is required, storing and computing the corresponding data consumes a lot of CPU and memory on the control plane.

In the face of these shortcomings of FederatedHPA, the Karmada community is exploring a new API, distributed HPA, which introduces an agent in the member clusters. The agent lists and watches the distributed HPA and calculates intermediate states, and those intermediate states are updated to the status fields of the distributed HPA. Karmada then calculates the final results and adjusts the number of replicas based on the intermediate data reported by the agents in the member clusters. Currently this design is under discussion, so you are welcome to join us.

All right, let's do a quick demo before we move to the takeaways. In the demo, we're going to show you how we use Karmada to propagate workloads to the member clusters. We're going to have three Kubernetes clusters locally. One is going to be the host, where the Karmada control plane lives.
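A sketch of a CronFederatedHPA that pre-scales a FederatedHPA's minReplicas on a schedule, with field names assumed from the Karmada documentation; the cron expressions, replica counts, and names are illustrative:

```yaml
apiVersion: autoscaling.karmada.io/v1alpha1
kind: CronFederatedHPA
metadata:
  name: nginx-cron-fhpa                # illustrative name
spec:
  scaleTargetRef:
    apiVersion: autoscaling.karmada.io/v1alpha1
    kind: FederatedHPA                 # scale the FederatedHPA rather than the workload directly
    name: nginx-fhpa                   # illustrative, matching the FederatedHPA sketch above
  rules:
  - name: scale-up-before-peak
    schedule: "30 8 * * *"             # crontab expression: 08:30 every day, ahead of the 9 a.m. peak
    targetMinReplicas: 10              # raise minReplicas before the traffic spike
  - name: scale-down-after-peak
    schedule: "0 12 * * *"             # relax again at noon
    targetMinReplicas: 1
```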
So these are the terminals. One is the host cluster, and the other two are separate, individual Kubernetes clusters that will be joined as member clusters. Currently those member clusters are not joined yet, but as a prerequisite, the metrics server has been deployed to them, and we have set up Karmada on the host Kubernetes cluster.

All right, let's do the join using karmadactl. You can see both member clusters have been joined to Karmada successfully, and when we get the clusters, it shows their status: they are Ready, which means they are ready to receive propagated workloads. Then we deploy a simple nginx application; you can see the replicas are set to two. We apply this deployment to Karmada, and it is created. But there is no workload on the member clusters yet, because we don't have the propagation policy set up.

Let's take a look at what the propagation policy looks like. It defines how we schedule the replicas: weighted, one for each member cluster, and divided. So we expect to see one replica in each member cluster. Let's apply this, and you can see on the right that the replicas are there.

Then we apply the FederatedHPA. Let's take a look at what it looks like. It defines the minimum replicas as one and the maximum replicas as ten, with the scale-down and scale-up windows set to ten seconds, and the target CPU utilization set to 10%. (A sketch of this manifest is included below, just before the Q&A.) So we expect one of the replicas to disappear, because there is no actual load or traffic going to the nginx application, and there it goes.

We've logged in to one of the member clusters and we generate load using ab, Apache Bench. The command shows that over about 20 seconds we're going to send roughly 10,000 concurrent requests. We expect to see more replicas spin up in the member clusters, roughly evenly, maybe one and one, or two and two. Okay, three and three. Is it three and two? The difference is because the split is automatically calculated based on the available resources; maybe member cluster 1 has more available resources, although all three clusters were created by kind and everything is local. The load generator has completed, and we should see the replicas scaled back down within ten or twenty seconds. There we go. All right, that is the end of the demo. Let's move back to our slides.

I want to take a quick moment to recap the key takeaways. The regular HPA cannot break through cluster boundaries to autoscale workloads, but FederatedHPA and CronFederatedHPA can help you with that. And Karmada is an open-source tool that enables you to run your cloud-native applications across multiple Kubernetes clusters and clouds. So I would encourage everyone to give Karmada a try and see how it fits your use cases or scenarios. Here are the links: the Karmada docs, the GitHub repo, and we have a Slack channel named karmada; you are more than welcome to join us. Thank you so much.

All right, thank you so much for listening, and I just want to give a shout-out to the Karmada contributors and maintainers. You're awesome. We appreciate your feedback; please scan the QR code above to leave feedback for us. And we still have a couple of minutes.
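For reference, a sketch of what the FederatedHPA applied in the demo might look like, with the values as narrated (min 1, max 10, 10% CPU, 10-second windows) and field names assumed from the Karmada documentation:

```yaml
apiVersion: autoscaling.karmada.io/v1alpha1
kind: FederatedHPA
metadata:
  name: nginx-demo-fhpa                # illustrative name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 1
  maxReplicas: 10
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 10   # react quickly for the demo
    scaleDown:
      stabilizationWindowSeconds: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 10         # deliberately low so the ab load triggers scale-up
```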
We're happy to answer any questions you may have. Yes? Oh, the question was whether DaoCloud is currently using Karmada. I'm going to let Xinyan answer that. Okay, we can discuss offline. Yeah, the short answer is yes, but we can discuss the details offline. Cool, awesome. The gentleman over there, your question?

Yes, so we saw how you were able to scale your workloads across multiple clusters as traffic came in. But my question is, how are you handling ingress? Because once the workload scales out to another cluster, your ingress endpoint is still on cluster A, but now you've got a workload running on cluster B with no traffic necessarily going to it. So how are you handling that problem with Karmada?

If I understand it correctly, the question is: the traffic comes in through the ingress of a single cluster, but we've scheduled the workload onto the other member cluster, is that correct? Right, right, and without something like GSLB, I'm wondering how that effectively scales across multiple clusters. Can you answer that? It's more about the underlying networking setup. Right, right, there has to be a networking part to it. Okay, so as far as I know, for the underlying network the best practice is to use Submariner. All the member clusters use Submariner, so they can talk to each other and handle traffic across clusters. That's more of an assumption on our side. But if you have more questions, we can discuss offline and look at your use case, or you can open an issue for us in the repo. Yeah, I'll hit y'all up in the Slack channel, thanks. Okay, thank you.

The next gentleman. So in your examples, you're using the default namespace? Yes. If we have custom namespaces with custom resource quotas, limits, and policies, do we have to create that namespace with all of our resource quotas and everything across all the clusters? Or do we do that at your top level, in Karmada, and it creates that namespace across all the clusters?

I got you. If I understand your question correctly, it's about how Karmada generates the namespaces in the control plane? Yes. When we apply the deployment, it is created in the default namespace on the Karmada control plane. And after we apply the propagation policy, Karmada generates the cluster-specific namespaces in the control plane, starting with karmada-es-, and that's where all the work objects live. The format is karmada-es- followed by the cluster name, holding the work objects for the namespace you specify in your YAML file. Gotcha. We can show you the real thing.

And will everything else be carried across all the member clusters that it propagates to? Yes, and it will create the namespace in all those member clusters; I think you could see that from there. Gotcha, gotcha. We should have shown that in our demo, right? Okay, thank you. Okay, cool. I hope that answers your question. Thank you for your question.

Yes? Sorry, I cannot hear you. I have one question related to FederatedHPA: does it support external metrics, like Datadog or any other Prometheus metrics? Sorry, I still didn't hear you. FederatedHPA, does it support external metrics? External metrics? Yeah, right there. I think we offer external metrics; it can be Prometheus. Like, have you used it with Prometheus? Yeah, we can offer those metrics for monitoring. Yes.
There is a section on the website that shows how to do monitoring and alerting based on those metrics, not only for HPA or FederatedHPA; in general, the metrics are available. Yeah, thanks for your question.

I have the same question, but what happens in a network partition, where your control cluster doesn't have access to the other clusters anymore? So the question was, what happens if the network is in trouble? There's an outage in the network between your control cluster that's managing the FederatedHPAs and the remote clusters. What happens to the HPAs running in the federated clusters? Oh, the question was, what if one of the member clusters loses its connection? What if the control cluster can't talk to the remote clusters? Oh, I got you. So the question was, sorry about that, about how we handle it if the Karmada control plane is unavailable. So if the control plane can't connect to the remote clusters, they're happily doing their thing, the remote clusters are working? Yes. But the control plane can no longer talk to them. We can't give a complete answer at the moment, but the assumption is that all those workloads keep running happily on each member cluster. It would just stop scaling? Yes, it stops scaling. So it would no longer scale in or scale out, it would just run at a steady state? Yes, and if you have an HPA set up in each member cluster, those HPAs are going to handle scaling within each member cluster. I hope that answers your question; sorry if it doesn't. Feel free to open an issue.

I was specifically asking about the distributed ones. Distributed? Because it looked like, in the example you just ran, if the control cluster couldn't connect to cluster 2 anymore, would it just not schedule any more workloads there? Oh, to be clear, the member cluster is still registered as a member cluster with the Karmada control plane, right? Right. Because if we un-join it, that's going to delete all the workloads from member 2. But not an un-join, just the network going down. A network issue, an undersea cable gets cut or something. The workload is going to keep running inside the single member cluster, but centrally, the FederatedHPA cannot work; it cannot schedule those workloads across the clusters. It has lost control. Right. But it doesn't affect the workload; the workload keeps running, it just won't scale in or out? Right, it won't be scheduled across the clusters, but it keeps running in the single cluster. Yes, yes. OK.

You had a slide that showed that when your control cluster lost connection to one of the member clusters, it would migrate the workload to one of the working clusters. How would you be able to migrate the workload if the control cluster has lost the connection to that member cluster? OK, the question is, there is a slide showing that we lose the connectivity between the... But to be clear, it's not the connection between the control plane and each of the member clusters, just one of the member clusters. One of them, and it showed you would migrate that workload to a new cluster. Yes. But I'm curious how you would do the migration, because you don't have access to that cluster anymore. So how would you shut down that workload on the cluster you can't reach, to create it on the new cluster, on the working cluster? OK, I got your question. So the question is more about the whole cluster, the workloads you're running on the bad cluster.
And we just reschedule the workloads to the other member clusters; how do we handle that situation? What we're trying to explain is that that cluster is in trouble, so from the control plane's perspective the workload has disappeared, and based on the failover feature gate, which enables failover, it is automatically rescheduled to the other member clusters. OK, so it may schedule some new workloads while the workloads on the cluster you can't reach anymore might still be in use? We're just assuming that cluster is down and its workloads can't handle any traffic. I see. OK, thank you very much. Thank you for your question.