Hi everyone, welcome to IstioCon virtual. I'm Shivanshu, and I'm joined by Jimmy to talk about demystifying load balancing in Istio. The agenda covers understanding what load balancing is and how Istio's out-of-the-box load balancing configurations can be used. What are the different configurations? For example, when we configure simple LB, what do random, round robin, and least request mean, and how do we configure them? What is consistent hashing and how do we configure it in Istio? What are the locality load balancing settings, including failover settings, and how do we actually achieve locality load balancing in Istio? We'll also dive deep into how to use this knowledge to configure load balancing in a multi-cluster setup, and then we have a demo of some of the common use cases.

For example, on my left I have a single-cluster setup, where all the traffic coming from the ingress gateway is distributed between different instances of app1. I may want to distribute this traffic based on some set of rules, and that's where load balancing in Istio comes into the picture. On my right I have a multi-cluster setup with cluster 1 and cluster 2, and a multi-cluster gateway distributing the traffic between them. How would I configure load balancing in this scenario, both for the traffic distributed to cluster 1 and cluster 2, and for the traffic distributed between different instances within a given cluster? The idea is to first understand a single-cluster setup and then extend that knowledge to a multi-cluster setup.

So what exactly is load balancing? In simple terms, if I have a service A with its Istio sidecar proxy, a service B, again with its sidecar proxy, and multiple instances of service B, then the traffic coming from an instance of service A needs, one way or another, to be load balanced between the different instances of service B. The power and flexibility to define the rules for that distribution is what we call a load balancing algorithm, and we'll discuss the out-of-the-box load balancing settings that can be configured in Istio. There could be a requirement to split traffic between, say, a staging environment and a production environment; I can configure that, and then distribute the traffic within the production environment using some load balancing setting. There could be a use case of load balancing traffic differently for different ports: for example, on port 80 I may want traffic distributed in a round robin fashion among the instances of a service, while on port 443 I may want my traffic distributed by consistent hashing. I may also want to load balance my traffic based on the locality of the service itself: if an instance is in the same region and same zone, I may want to prioritize it over an instance running in a different zone or region. And I can configure weight-based distribution for a multi-region, multi-zone environment: for example, for traffic originating from region 1, zone 1, I may want 70% of the traffic to remain in the same zone, 20% to go to another zone, and 10% to go to another region and another zone.
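To make that concrete before we go deeper, here is a minimal DestinationRule sketch of a per-port load balancing policy. The host, port, and header name are illustrative placeholders, not values from the talk:

```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: app1
spec:
  host: app1.default.svc.cluster.local  # illustrative service name
  trafficPolicy:
    loadBalancer:
      simple: ROUND_ROBIN               # default policy for this host
    portLevelSettings:
    - port:
        number: 443                     # override only for port 443
      loadBalancer:
        consistentHash:
          httpHeaderName: x-user        # hypothetical hash key header
```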
There could also be a requirement to set up failover settings: for example, if a particular service in a given zone goes down, which service should my traffic flow to next? We'll take a look at what all of these configurations mean and how to actually achieve them in Istio.

First of all, let's discuss random, which can be configured using simple LB in Istio. The idea is that for a given service A with multiple instances, where at the moment only instances one, three, and four are healthy, configuring random will pick a healthy endpoint at random and send the traffic to it. If I configure round robin, the traffic again only goes to healthy endpoints, but in a round robin fashion: request one may go to service A instance one, the next request goes to the next healthy endpoint, instance three, the next to instance four, and then, round robin, the next request comes back to instance one. I can also configure least request. For example, say at a given time service A instance one has served 801 requests, instance two has served 1012 requests and then went unhealthy, instance three has served 900 requests, and so on. If I want my traffic to go to the instance that has served the fewest requests so far, I can do that by configuring least request under simple LB in Istio.

So what are the best practices for choosing among these load balancing settings? It's preferred to use least request over round robin, because least request distributes traffic based on how loaded each endpoint actually is, so it serves as a better replacement for round robin. And it has been found experimentally, as mentioned in the Istio docs, that random performs better than round robin, so if we use random instead of round robin, it's generally more performant.

Now let's talk about consistent hashing and how it works in Istio. The idea of consistent hashing is that you have a hash function that hashes incoming traffic based on some hash key, the servers are hashed as well, and each request goes to the nearest server on the ring. For example, if I'm hashing based on a header name and the hash value turns out to be one, the request goes to service instance one; if the hash value turns out to be two, it goes to service instance two, and so on and so forth. In Istio, I can achieve consistent hashing using the HTTP header name itself. I can also use an HTTP cookie to route my traffic to different instances of the service. I can also use the source IP: consider a case where I want to send traffic to specific instances running in a specific zone and I want consistent hashing based on the source IP, so that traffic from a specific region or a specific set of users lands on a specific instance. I can also use an HTTP query parameter to route my traffic to different instances using the consistent hash. And I can use the famous ring hash and Maglev algorithms in Istio out of the box to achieve consistent hashing, as sketched below.
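Here is a minimal sketch of those consistent hashing options in a DestinationRule. The cookie name, TTL, and table size are illustrative; the hash keys are mutually exclusive, so pick exactly one, and selecting the hash algorithm (ring hash or Maglev) explicitly requires a reasonably recent Istio release:

```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: app1-hashing
spec:
  host: app1.default.svc.cluster.local
  trafficPolicy:
    loadBalancer:
      consistentHash:
        httpCookie:                  # hash key: pick one of the four
          name: user-session         # illustrative cookie name
          ttl: 3600s                 # Istio generates the cookie if absent
        # httpHeaderName: x-user
        # useSourceIp: true
        # httpQueryParameterName: user-id
        maglev:                      # hash algorithm; ringHash is the alternative
          tableSize: 65537           # must be a prime number
```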
Now let's also talk about how the locality load balancing settings can be configured. One way is to use weighted distribution: if I'm in a particular region, us-west, and a particular zone, zone 1, I can route my traffic such that 80% of the traffic stays in the same zone and 20% goes to a different zone. This is a weighted distribution of traffic based on locality. Similarly, I can have traffic from zone 2 mostly remain in the same zone, with 80% staying there and 20% going to another zone, in this case zone 1.

We can also configure locality failover, meaning: what happens if a particular service in a particular zone or region goes down? I can configure different priorities for different zones and regions to take care of failover scenarios. In a simple case, I can just configure failover from the us-east region to us-west, meaning that if the services in the us-east region go down for some reason, I can still serve my traffic by sending it to the services in the us-west region. For example, let's take a closer look at this particular diagram: I have helloworld running in region 1, zone 1, and in region 1, zone 2, with the same application distributed among multiple other regions and zones. Say I configure failover settings from region 1 to region 2 and from region 1 to region 3. This is how Istio decides the priority for failover: the highest priority is the same region and same zone; endpoints with no locality configured are given the second priority; services running in a different zone but in the same region are given the next priority; and after that the failover settings apply, so the next priority goes to zone 3 in region 2, because I have configured failover from region 1 to region 2. Similarly, because of the failover setting from region 1 to region 3, traffic can flow from region 1, zone 1 to region 3, zone 4. So this is a very high-level overview of how we can configure it.
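A minimal DestinationRule sketch of both styles is below. The region and zone names and thresholds are illustrative; note that distribute and failover are mutually exclusive within one localityLbSetting, and outlier detection has to be configured for locality load balancing to take effect at all:

```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: helloworld
spec:
  host: helloworld.default.svc.cluster.local
  trafficPolicy:
    loadBalancer:
      localityLbSetting:
        enabled: true
        distribute:                  # weighted distribution by origin locality
        - from: region1/zone1/*
          to:
            region1/zone1/*: 80
            region1/zone2/*: 20
        # ...or, instead of distribute, priority-based failover:
        # failover:
        # - from: region1
        #   to: region2
    outlierDetection:                # required for locality LB to kick in
      consecutive5xxErrors: 5
      interval: 30s
      baseEjectionTime: 2m
```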
With that, I'll hand over to Jimmy to take it from here.

Hi everyone, I'm Jimmy Song, developer advocate at Tetrate. Let me show you how to deal with multi-cluster load balancing with an Istio service mesh. So far everything is fine in a single cluster: you can choose whatever load balancer type you like. But the multi-cluster scenario is another story. Consider the following multi-cluster setup, where we have two clusters from different vendors. Your first thought might be to create a gateway for each cluster and then create another one elsewhere to connect them, right? Here come the two tiers of ingress gateways. For the two meshes to communicate, each must have an ingress gateway. In addition, there needs to be a unique entry point for user access, so we create a new gateway, typically deployed on a separate cluster. It can be in the same cluster, but we don't recommend that. To allow the clusters to find each other's endpoints, we need to create service entries for them. For each cluster to discover and route services, we must create virtual services and destination rules in each Istio deployment. Meanwhile, cluster 0 also has Istio deployed, so we need to create these Istio resource objects in each cluster.

Now let's take a look at the demo. This is the deployment architecture of our demo: I created three Kubernetes clusters on GKE and deployed Istio in each cluster. The three clusters are located in different regions; you can see the zone names, us-central1-1, us-west1-1, and us-west1-2. I also deployed the Bookinfo application in the two tier-2 clusters: one cluster has only the productpage service deployed, and the other has the entire set of services. Next, I will demonstrate how to achieve multi-cluster routing and load balancing; only by solving the multi-cluster routing problem can we achieve more advanced functions such as load balancing and failover. Because the tier-1 cluster also has Istio deployed, you can apply the load balancing methods introduced by Shivanshu earlier to this gateway as well. As you can see from the diagram, Istio is deployed in the tier-1 cluster and in the two tier-2 clusters. You need to create virtual services, destination rules, and service entries in each cluster, and the service entries need to include the entry points, the ingress gateways, of each cluster.

I have switched to cluster 0; let's check out the Istio resources we have created. This is the ingress gateway, the tier-1 gateway, and these are its hosts. Let's check out the virtual service for Bookinfo. Here is the virtual service; you can see the matches on the x-cluster selector header and the HTTP paths, and the traffic will be routed to different subsets. Let's check out those subsets. Here is the destination rule from cluster 0, and there are several different subsets; see the subset names. We'll just look at this one, the bookinfo-external subset. It selects the entry points whose labels match the cluster label, which corresponds to the service entry for the cluster gke-jimmy-us-west1-2. And this is the service entry created for each cluster: you can see the endpoints with different labels. Here is the label, which lines up with the one on the destination rule subset. This is how the tier-1 gateway finds the endpoints, the ingress gateways of the different tier-2 clusters.

Let's take a look at how users' requests are routed. The effect in this demo is that when the user requests this URL, with or without the x-cluster selector header, the request is routed to a different tier-2 gateway, a different tier-2 cluster. Use this command to retrieve the IP address of the tier-1 gateway. Let's run the test. First, get the gateway IP from the tier-1 cluster, cluster 0. Then request the URL without the HTTP header; to easily view the results, we write the output directly to an HTML file and then view it in a browser. Oops, some services, reviews and details, are unavailable. Next, we request the URL with the HTTP header, this time specifying the header value cluster1; the result is the same as the last request. This time we specify the header value cluster2, and now it works, everything works fine. This is just what we planned. This demo shows how to route based on the HTTP header and path. You can add the endpoints of the tier-2 clusters to a subset and configure load balancing in the destination rule: here is the subset, and here is the load balancing setting. This is what I mean by multi-cluster load balancing. After sending some test requests, let's look at the service topology. When the user requests cluster 2, all services work properly, but when the user requests cluster 1, the reviews, ratings, and details services are unavailable. You can solve this failover issue by configuring the ingress gateway in the tier-2 clusters as an east-west gateway.
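To make the moving parts concrete, here is a hedged sketch of what the three resources on the tier-1 cluster could look like. The host name, header name, gateway name, labels, and addresses are all illustrative placeholders rather than the demo's exact values:

```yaml
# ServiceEntry: teaches the tier-1 mesh about the tier-2 ingress gateways.
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: bookinfo-external
spec:
  hosts:
  - bookinfo.external                # placeholder host name
  ports:
  - number: 80
    name: http
    protocol: HTTP
  resolution: STATIC
  endpoints:
  - address: 203.0.113.10            # cluster 1 ingress gateway IP (placeholder)
    labels:
      cluster: cluster1
  - address: 203.0.113.20            # cluster 2 ingress gateway IP (placeholder)
    labels:
      cluster: cluster2
---
# DestinationRule: one subset per tier-2 cluster, selected by endpoint label.
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: bookinfo-external
spec:
  host: bookinfo.external
  subsets:
  - name: cluster1
    labels:
      cluster: cluster1
  - name: cluster2
    labels:
      cluster: cluster2
---
# VirtualService: route on the x-cluster header; spread traffic when absent.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: bookinfo
spec:
  hosts:
  - "*"
  gateways:
  - tier1-gateway                    # placeholder gateway name
  http:
  - match:
    - headers:
        x-cluster:
          exact: cluster1
    route:
    - destination:
        host: bookinfo.external
        subset: cluster1
  - match:
    - headers:
        x-cluster:
          exact: cluster2
    route:
    - destination:
        host: bookinfo.external
        subset: cluster2
  - route:                           # no header: load balance across clusters
    - destination:
        host: bookinfo.external
        subset: cluster1
      weight: 50
    - destination:
        host: bookinfo.external
        subset: cluster2
      weight: 50
```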
Finally, let's summarize. Istio supports a variety of load balancing types. But if you use Istio alone and create the various resource objects manually, you need to switch back and forth between multiple clusters repeatedly, which is extremely error-prone. Routing and load balancing across multiple clusters need to be further automated, probably with an abstraction layer on top of Istio. That's what Tetrate did: the Istio clusters in this demo were deployed with TSB, a product from Tetrate that is compatible with upstream Istio. You can learn more about it on the Tetrate website. Here are some reference links. And if you want to fine-tune the load balancing of Istio beyond what's built in, you can use an EnvoyFilter. I hope this sharing is helpful to you. Thank you.