Welcome to the March 1st OpenShift Commons briefing. My name is Paul Mori, I'll be your host today. I have with me Yuri Tsarev from ABSA Group, and he is going to talk to us today about the Kubernetes Global Balancer, k8gb. So why don't you take it away, Yuri?

Hi, thanks a lot, Paul. So I'm Yuri, I'm a Principal Engineer at ABSA, and I'm part of the platform engineering team; we are mostly focused on building advanced automation on top of Kubernetes. And one of the open source projects that we produced in 2020 is the Kubernetes Global Balancer, k8gb. So roughly around the concept... guys, can you hear me well?

Yeah, we can hear you.

Yeah, yeah, sorry, there was some... Go ahead.

Yeah, thank you. So this project originated out of the need for a global balancer that is Kubernetes-native and pretty much cloud-native. We tried multiple vendors, none of them worked well for us, so we decided to develop a global balancer from scratch. But this global balancer is not a standard approach to load balancing: it doesn't pass the traffic through itself. It actually modifies DNS responses on the fly, and it monitors the Kubernetes primitives from inside the cluster. It is developed following the operator pattern, as coined by CoreOS back in the day, and it doesn't have any single point of failure. A controller is installed on top of the target clusters where our workloads are running, and there is no control cluster, so there is no single point of failure in that regard. And everything is built on top of standard Kubernetes primitives: Ingresses, Services, endpoints, down to liveness and readiness probes. The core of the operation is the DNS protocol, and as it's running the internet, it's pretty reliable. It obviously has its own limitations, but for the global balancing scenario it works pretty well. And we tried to build k8gb this way so it's as independent of the environment as possible.
So meaning that we rely on the environment DNS, which we call edge DNS — it's Route 53, for example, or Infoblox in our on-prem scenario, or NS1, as we have another integration. We configure only the DNS zone delegation on the edge DNS, and the rest of the DNS responses are served dynamically by k8gb itself. From an implementation standpoint, we used the Operator SDK; it worked very nicely for us and allowed us to bootstrap the project pretty quickly. Another integral part of k8gb is CoreDNS — that's exactly the tiny part that serves the DNS responses dynamically and basically steers traffic to the desired clusters according to the load balancing strategy. ExternalDNS takes care of the communication with the edge DNS providers, as I mentioned: Route 53, NS1, Infoblox, and maybe something else in the future. These three are already very well tested. So this part takes care of the automated zone delegation. And we also used to run a dedicated etcd cluster with the associated etcd operator to populate a local etcd database dynamically: ExternalDNS would read information from the dynamically populated DNSEndpoint CRD and create the etcd entries locally, and eventually CoreDNS would read from this etcd with the so-called SkyDNS backend. It worked more or less out of the box, since we could reuse existing community components, but it had quite an amount of reliability and maintainability problems. For example, the etcd operator was dropped by the community completely, and etcd itself wasn't working reliably enough — on long-running clusters we were finding etcd in a degraded state from time to time. So it was a problem for the reliability of k8gb itself. That's why in a recent version we dropped etcd and the etcd operator completely, and we replaced them with CoreDNS built with a custom Kubernetes CRD plugin that we developed recently.
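The legacy etcd-backed setup described above might have looked roughly like the following Corefile sketch (illustrative only — the zone name and etcd endpoint are assumptions, not the project's actual configuration):

```
# Hypothetical Corefile: the delegated zone served from the dedicated
# etcd cluster via CoreDNS's etcd plugin (SkyDNS-style data model).
test.gslb.io:53 {
    etcd {
        path /skydns
        endpoint http://etcd-cluster:2379
    }
    log
    errors
}
```

Replacing this with the custom CRD plugin removes the etcd hop entirely: CoreDNS reads the records straight from the Kubernetes API.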
So CoreDNS in our case can read information directly from the DNSEndpoint — I will show in the demo how it all works. So now we can create these dynamically morphed DNS responses out of the DNSEndpoint CRD directly, with CoreDNS talking to the Kubernetes API and bypassing the unnecessary layer of etcd and the etcd operator, and it makes the whole setup much more reliable. And everything is driven by a single CRD of the Gslb type. We have quite an amount of integrations on the edge DNS provider side, as I mentioned. Infoblox we are not just testing but actually using in the bank — that was the first provider we implemented, as it fulfills our business needs. Route 53 we use in the AWS reference setup that I will show you today, on top of two geographically dispersed EKS clusters. And NS1 is our newest integration — we are in very close collaboration with these guys, they are amazing. And we have very nice open source cooperation with the Admiralty project, which does multi-cluster scheduling, and it works very nicely: Admiralty schedules workloads on top of multiple clusters and k8gb basically enables load balancing for them. We have an associated tutorial for that on the pages of both projects. So we can quickly go to the demo. Just quickly, the project website — it's k8gb.io — and the associated GitHub, obviously. So just to provide context for the demo: we will run two Kubernetes clusters.

Yuri, it might help if you just bump the font size up or zoom in on that picture a little bit.

I'll try. Is it somehow better?

There we go. There we go. Oh, cool. Thank you for that.

Yeah. So we will run a demo setup similar to the picture here: two geographically dispersed clusters, one in Europe, in Dublin, and another one in Africa, in Cape Town. On top of both clusters k8gb is already pre-deployed, there are some simple workloads, and we will work with the Gslb Custom Resource Definition.
And the Custom Resource Definition looks pretty much like this — it's on the index page. So we have kind: Gslb, obviously with its own API group, and we have an embedded ingress resource type. Basically it's the same ingress spec as the standard Kubernetes one — behind the scenes, under the hood, it's exactly the same Golang type, and we embed it into the Gslb. And we specify, as is pretty standard, the host and the backend service, and the controller will be monitoring its healthiness. On top of that, we're adding the specific load balancing strategy. So let's go straight to the demo, I guess. Is my console visible? All good?

It's visible to me, yeah.

Cool. All right, so what we're running here: two clusters, as I mentioned. We'll just check the geographical locality with kubectl get nodes. Okay — and I'm surprisingly logged out. Sorry, the token just expired; wasn't very lucky. So yeah, we are in Europe. The second cluster is in Cape Town — you can see it by the node names. And on the right pane, we are just continuously running a demo script which polls the associated FQDN — it's just `while true`, right — and it grabs the message, because the sample application returns a GeoTag to actually demonstrate where it's located. It's super simple. So we continuously poll it; currently, everything is healthy and located in Europe, as we can see by the GeoTag here. So let's investigate the testing setup. First of all, how the k8gb setup looks: the controller itself, CoreDNS, and ExternalDNS, which takes care of the edge DNS side of the configuration, effectively the zone delegation. So, as you can see, a pretty minimal footprint as of now, especially after we got rid of the etcd operator and the dedicated etcd cluster.
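The Gslb resource described above might look roughly like this — a sketch based on the demo; the exact API version and field names are assumptions and may differ between k8gb releases:

```yaml
apiVersion: k8gb.absa.oss/v1beta1
kind: Gslb
metadata:
  name: test-gslb-failover
  namespace: test-gslb
spec:
  ingress:                      # embedded standard ingress spec
    rules:
      - host: failover.test.gslb.io
        http:
          paths:
            - path: /
              backend:
                serviceName: frontend-podinfo
                servicePort: http
  strategy:                     # the k8gb-specific part
    type: failover              # or roundRobin
    primaryGeoTag: eu           # pin the European cluster as primary
```

The controller watches the health of the referenced service and steers DNS answers according to the strategy block.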
So for the testing workload, we are running in the test-gslb namespace a special workload — basically standard podinfo — which returns exactly this response with the GeoTag. And we have the Gslbs: this special failover Gslb, which is deployed out of a spec very similar to the one that I showed on the index page. So again, failover.test.gslb.io as the host, a backend service which we are running in the test-gslb namespace, and the failover strategy. And we are pinning the European cluster as the primary one before the failure occurs; in case of some malfunction, it should fail over to the African one, to Cape Town. At runtime it looks like this, now filled out with the status. So it detects that the service is healthy — the one that is running in the test-gslb namespace, the frontend-podinfo one, the one that is referenced in the embedded ingress spec, right? And it populates the DNSEndpoint with the IP addresses of the load balancer associated with the workload's ingress. So if we get the ingresses, and specifically the test-gslb failover one, we have — in the AWS setup that we are running — an associated NLB. If we dig this NLB, you'll see these addresses, and those are exactly the ones that get populated in DNS by k8gb. So we can pretty much dig right now, and they are identical. Basically, currently our podinfo is running and we are pointing to the European cluster. If we go to Cape Town — just to first verify that we are in Cape Town, in Africa — you'll see that we are running exactly the same setup, the same spec without any modification. And both clusters are returning consistent responses: as you can see, this African cluster also returns a DNS response which consists of the IP addresses of the European cluster, where the workload is healthy and which is labeled as the primary one.

Yuri, a question that I thought of looking at this: how does the GeoTag field of the status get computed?
Yeah, so the GeoTag is specified at the very beginning, at deployment of the cluster. For example, if you look at the European k8gb configuration — it's basically Helm values — we specify the GeoTag and we specify the GeoTags to talk to. And in the African configuration we're doing exactly the same, vice versa: the GeoTag is the one that we label the cluster with, and the external GeoTag to talk to is the European one. So we cross-reference them, and the rest is created by convention: zone delegation, NS server names — they are all derived consistently from the GeoTag and the zone. Basically, out of this geo-tagging, the clusters know how to contact each other through these conventional FQDNs, and that's how they share information with each other, also over the DNS protocol.

Got it. Thank you.

So we can actually go to the European cluster and try to emulate the failure. We can do it simply, as we usually do, by scaling the deployment — the testing deployment for the podinfo — down to zero. So what is about to happen: the Gslb should detect the unhealthiness. There is a reconciliation loop and, as you know, a limitation of the DNS protocol: we're running a DNS TTL of 30 seconds, and the reconciliation loop on top, so there will be some delay during the failover. Let's see how fast it happens. So the reconciliation loop is already done — it's basically already returning African IP addresses, and the failover has already happened. That was pretty quick. What's important — to pretty much repeat, but in the failover scenario — is that again we have an exact mirror resource in Africa, and it also returns a uniform response, right? Now it took over, and both the European and African clusters are returning the African, Cape Town IP addresses of the Gslb balancer, and basically both of them are steering traffic to Africa, because the workload in Europe is dead. So everything looks great and as expected: we're getting the African GeoTag as a response.
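The per-cluster Helm values just mentioned might look roughly like this — a sketch only; the key names are assumptions that should be checked against the actual k8gb chart:

```yaml
# European cluster (sketch)
k8gb:
  dnsZone: "test.gslb.io"        # delegated zone served by k8gb
  edgeDNSZone: "gslb.io"         # parent zone on the edge DNS provider
  clusterGeoTag: "eu"            # this cluster's own tag
  extGslbClustersGeoTags: "za"   # peer clusters to talk to

# African cluster (sketch): identical values with the tags crossed,
# i.e. clusterGeoTag: "za" and extGslbClustersGeoTags: "eu"
```

Everything else — the zone delegation and the NS server FQDNs — is derived from these tags by convention, which is how the clusters find each other over DNS.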
So we can pretty much scale it back up in Europe and see how it returns back to the primary cluster — on the next reconciliation loop, it should pick it up.

While we're waiting for that to happen, here's another question: what is the highest number of clusters that you've used the global balancer with so far?

Well, we always use it in pairs, right? I even have a ticket to test it with more than two, but so far all the testing was done with one pair. We have multiple pairs — we're running around 122 clusters — but they are always in these pairs. So we never properly tested it with, for example, three or four clusters. Not yet — we definitely have it in the backlog, yeah.

Yeah, understood. I think this is a pretty new project, right? How long have you been working on this?

Well, it started in December 2019, so it's slightly more than one year.

Okay.

Yeah, and it is already heavily used within ABSA. We're running it in production for several projects, and more and more teams are adopting it. So, yeah, obviously we're finding new issues and challenges along the way. For example, currently we are thinking about how to actually handle multi-tenancy. Usually we control our clusters in a centralized way, so there is no problem to spin up a new pair of clusters for a team and they will fully own it. But sometimes we need multi-tenancy, and k8gb is actually not ready for it yet, right? By the way, yeah, it failed back over — switched back to the main cluster. So now we are in Europe, everything as expected. Yeah, and if we go to the GitHub, we have pretty good activity. We use GitHub issues not just for reporting bugs but also for milestone planning, so we have quite an amount of plans outstanding — for example, how to implement more complex strategies. Currently it's failover and round robin. I can quickly show you round robin.
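The two strategies just mentioned boil down to fairly simple answer-set logic. Here is a rough Python sketch of the idea — not the actual Go implementation, just an illustration of how failover and round robin decide which IPs go into the DNS response:

```python
def failover_answers(primary_tag, healthy_ips_by_cluster):
    """Return the primary cluster's ingress IPs while it is healthy,
    otherwise the IPs of any healthy secondary cluster."""
    primary = healthy_ips_by_cluster.get(primary_tag, [])
    if primary:
        return primary
    # Primary unhealthy: fail over to the first healthy secondary.
    for tag, ips in sorted(healthy_ips_by_cluster.items()):
        if tag != primary_tag and ips:
            return ips
    return []  # nothing healthy anywhere


def round_robin_answers(healthy_ips_by_cluster):
    """Merge every healthy cluster's ingress IPs into one answer set;
    standard DNS round robin then rotates between them."""
    answers = []
    for tag in sorted(healthy_ips_by_cluster):
        answers.extend(healthy_ips_by_cluster[tag])
    return answers
```

For example, with the European cluster healthy, failover to `eu` returns only the European IPs; once the `eu` list becomes empty, the same call returns the African ones — which is exactly the behavior shown in the demo.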
But yeah, it's basically pretty much the same spec. In this example, we just have several services in the backend, right — just to demonstrate the statuses: NotFound, Unhealthy, and the frontend, which is Healthy already — and the roundRobin strategy. Yeah, these are additional controls over specific things like the DNS TTL — if you want to make it shorter or longer, it's all available. And I can actually show round robin at runtime — we should have this Gslb around. Yeah, so round robin basically merges the IP arrays of both clusters, right, and makes a standard DNS round robin out of the two clusters. So as you can see — and again, if you scale down, if you emulate yet another failure, it should return only half of the array and then come back. So while it's converging: yeah, the next step for us would be to figure out how to implement more advanced load balancing strategies. We don't have an urgent business need for that — these two reliable strategies are pretty much enough for us as of now — but from a community standpoint, we definitely want to implement something more interesting, like a geographical strategy that returns the DNS response closest to the requester, and all that stuff. So we are thinking about the idea of writing some advanced custom CoreDNS plugin which is more aware of the situation and can modify DNS responses on the fly. Because currently it's nice, it does its job, but — if you look at these DNSEndpoints, that's how it works in the backend: the controller dynamically populates these DNSEndpoints, this kind, and CoreDNS reads them and, according to the strategy, populates the responses with the specific IP addresses. And obviously it has its limitations.
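A DNSEndpoint populated by the controller might look roughly like this — a sketch using the ExternalDNS CRD; the record values are placeholders, not real addresses:

```yaml
apiVersion: externaldns.k8s.io/v1alpha1
kind: DNSEndpoint
metadata:
  name: test-gslb-failover
  namespace: test-gslb
spec:
  endpoints:
    - dnsName: failover.test.gslb.io
      recordType: A
      recordTTL: 30            # matches the short TTL used for failover
      targets:                 # placeholder ingress load balancer IPs,
        - 52.0.0.10            # chosen according to the strategy
        - 52.0.0.11
```

Since the targets list is static until the next reconciliation, this model suits failover and round robin but not per-query decisions like weighting or requester locality — hence the interest in a smarter CoreDNS plugin.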
It's enough for the basic load balancing strategies like the ones I demonstrated, but it's not enough for something more advanced like weighted load balancing or, again, geographical locality, because it's not dynamic enough here, right? So.

Is there a particular strategy that you've heard requests for many times from community folks?

Well, not yet. We are still a little bit under the radar, right, so there is no direct response yet. But it's very nice that Red Hat actually initiated some conversations and is willing to contribute. So Rafael — you probably know him, a very nice person — approached us with very technical questions regarding k8gb, and it looks like we are going to have a very nice collaboration regarding integration into OpenShift. So yeah, it looks very promising. And we definitely have some plans regarding strategies — you can see them in the issues: geo topology, manual weights, and a consistent round robin, because currently it's pretty much random. So yeah, we have these plans in mind, but so far we've worked heavily on API stabilization and the overall reliability of the solution itself, and on many enhancements for easy adoption by teams. For example, we implemented the ability to create the Gslb with its strategy from simple ingress annotations. I think it's better to show it in the documentation. So if we go there — yeah, one of the main goals of k8gb is actually to give development teams power over load balancing, right? So instead of standard HTTP checks, we're utilizing the pod probes, which are defined by the application teams. And the strategy is describable by a simple CRD. But sometimes it's a little bit of overhead to add yet another CRD into the Helm charts, as we have multiple teams. That's why — it was also a community request, from the Admiralty project — we added the ability to put specific k8gb annotations on top of a standard ingress. The controller uses the same information as we'd specify in the spec.
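The annotation-driven setup just described might look roughly like this — a sketch; the exact annotation keys are assumptions that should be checked against the k8gb documentation:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: frontend-podinfo
  namespace: test-gslb
  annotations:
    k8gb.io/strategy: "failover"      # assumed annotation keys
    k8gb.io/primary-geotag: "eu"
spec:
  rules:
    - host: failover.test.gslb.io
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: frontend-podinfo
                port:
                  name: http
```

From two annotations on an otherwise standard ingress, the controller generates the corresponding Gslb resource itself.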
So with the strategy type and, in the case of failover, the geo tag, the controller would pick it up, create a Gslb automatically for this specific ingress, reference it, and close the loop this way. So basically application teams are not even required to manage yet another CRD: they can pretty much use the standard ingress with a couple of additional annotations, and global load balancing will work for them. The controller will take care of it.

And there's a question in the chat about whether k8gb has been submitted at all to CNCF, or is that on the radar? Is that on the roadmap?

It's totally on the radar. Yeah, I really want to submit it to the sandbox as soon as possible — it's a really good question. We have pretty direct plans; we're just getting ready, so some stabilization first. I think it's pretty well tested currently, both internally and by the community, and the project is in pretty good shape, so I think it's ready for the sandbox.

If folks want to get involved, I'm sensing that the GitHub repos are the best place. Is that accurate?

Yeah, totally. So we do everything in GitHub for k8gb. Again, we use the issues not just as issues but also for feature requests, roadmap planning, and anything else — and we're happy to see pull requests; come and shoot an idea. Yeah. And for chat, we are hanging out in SIG Multicluster in the Kubernetes Slack, so you can find us there as well.

Diane, I see you have a question in the chat. Why don't we take that one offline?

Absolutely. Yeah, I was trying to figure out who's working on this besides ABSA — or is this just coming out of ABSA, with mostly ABSA contributing to this project?

Yeah. So it's coming out of ABSA, but we are trying to gather a community. And as I mentioned, we already have a very nice conversation with Red Hat, and it looks like Red Hat will join, and we are very happy about that fact.

And there's another question, yeah. Any other questions?
Yeah, Vipin is asking: how is this different from F5 CIS? Because they're using F5 CIS.

I'm not sure I'm familiar with CIS, but generally, here's how it's different from any kind of standard load balancer — two things. It doesn't pass the traffic through itself: it works purely over DNS. And it is aware of internal cluster resources: it doesn't employ any standard HTTP checks, it uses the pods' liveness and readiness checks to make the balancing and steering decisions. And it's totally open source. Hopefully that somehow answers the question.

I know we have a built-in load balancer, I believe, for OpenShift now. So would this replace that, Paul?

I think you're referring to the router. Yeah, I don't think that I have enough information about what's been discussed to comment on that. I look forward to seeing it in the sandbox, getting some more folks and more Red Hatters working on it, and seeing it again in action and integrated into OpenShift — and a demo of that sometime soon, that would be awesome.

That sounds great. Yeah, maybe we'll have a sequel to this one sometime soon.

That would be great. All right, let's see — any other questions in the chat? If not, what I'd have you do, Yuri, is go back to your home page there for your project, for k8gb. And it's so close to KGB — I'm gonna have to keep myself from saying that.

Yeah, yeah, yeah, that's a bad problem. That's not bad.

And then I would say this is where people can go as well to find more information. Or, as you said, Yuri, the SIG cluster — the CNCF SIG cluster — would be a great place to find you all. And I look forward to seeing it used at scale in production and getting some more feedback on this. I think that's gonna be a great addiction — I almost said — addition to the CNCF and just to the open source communities. This is really a very interesting project, so we'll definitely be following it along closely. Thank you so much. Thanks for joining us, Yuri.
And just one thing — I think that was Kubernetes SIG Multicluster.

Yeah, it was. All right, great.

Yeah, so basically I was gonna let that one go, but thanks for the correction.

Yeah, that's okay. That's good. That's what it's there for. That's perfect.

All right, well, it's great to hear from the ABSA Group, and we're looking forward to hearing more great things. And this is the beauty of open source. So thanks, Yuri and Paul — thanks for having us today.

All right, thanks. Thank you so much. Talk to everybody soon, bye-bye.

Bye-bye. Thanks a lot, bye.