Up next is Joaquim and Iago talking about taming the thundering GitOps herd with update policies. This is particularly interesting for us large cloud providers, who have problems when configuration management goes wrong. So I welcome this talk. Have at it.

Thank you. Hi everyone, I'm Iago and this is Joaquim, and we're part of the Kinvolk team at Microsoft Azure. We're going to talk about update policies. I guess I don't have to sell you GitOps here: GitOps is great. You describe the system declaratively, you have your state in Git, there's an audit trail, you know who made each change, and in general you get a really nice development experience. The problem is that GitOps applies all changes at once. You make your change, you commit it, and then the operators or controllers in all your clusters start pulling the changes. So what happens when something goes wrong? And while changes are tracked in Git, the rollout is not: you cannot just say in Git, "only roll out to 20% of the clusters". That's something you have to do somewhere else. The other issue is that you don't have a global view. If you have several clusters, it's usually one controller per cluster, and having a global view of where your update is going would be great.

We explored existing solutions, of course, like Flux and Flagger. Flux is nice because it's simple and very flexible, but we haven't seen rate limiting, you don't get a global view of how updates are going, and it's usually for a single cluster. Flagger is a bit more complicated and web-app oriented. It does have progressive releases, canary deployments, and things like that, but again there's no global view of updates, it's a single-cluster tool, and it's somewhat complex.
So our proposal is that, instead of having Flux pull the Git changes directly, you put an update manager in the middle. This update manager determines not only what commit to apply to the cluster, but when to do it, and it should be the single source of truth for which version changes are rolled out to the clusters. It should also have a global understanding of each application's update, no matter which cluster it's rolling out to, so you get that global view.

So let me first tell you about Nebraska. We have several projects involved here, and the first one is Nebraska. This is an update manager, and I'll describe it very quickly. In this case it's passive, so it doesn't have to connect to your instances or applications; it's the opposite — the applications connect to it, usually to get update information. Besides serving update information, it also gives you the ability to monitor, and to release based on policies. As mentioned before, policies means things like rate limiting, or restricting updates to happen only during certain hours. Of course, since Nebraska is passive, when you have an application using GitOps on a Kubernetes cluster, the two don't know about each other. So we're missing a part, and this is that part: the update agent, or Nuwa for short. Nuwa runs in-cluster, and it's in charge of checking for updates from Nebraska, applying the configuration for Flux, as you will see, and then checking the Flux state and reporting it back to Nebraska. So you have this sort of control loop.
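The check–apply–report loop described above can be sketched roughly like this. Everything here is an illustration of the flow, not Nuwa's real code: the client objects, method names, and field names are all hypothetical.

```python
# Rough sketch of the Nuwa agent loop: ask Nebraska for an update,
# point Flux at the commit it hands out, report the result back.
# All method names and fields are hypothetical, for illustration only.

def reconcile_once(nebraska, flux, current_commit):
    """One pass of the agent's control loop. Returns the commit in use."""
    update = nebraska.check_for_update(current_commit)  # hypothetical call
    if update is None:
        # Nothing new; just report that we're current.
        nebraska.report(status="up-to-date", commit=current_commit)
        return current_commit
    # Point Flux at the commit Nebraska decided this cluster should run.
    flux.set_git_ref(update["commit"])                  # hypothetical call
    # From here Flux reconciles as usual; we only read its state back.
    state = flux.get_sync_state()                       # hypothetical call
    nebraska.report(status=state, commit=update["commit"])
    return update["commit"]
```

The point of the sketch is the division of labor: Nebraska decides *when* and *what*, Flux does the actual reconciliation, and the agent only shuttles the ref and the status between the two.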
So essentially, instead of letting Flux pick everything up from Git directly and applying it — where, if something goes wrong and you have several clusters looking at the same repo, you're going to have the same issue in every one of them — we're putting Nebraska in front. We're putting Nebraska in charge in this case. Nuwa, as you can see, pulls the data from Nebraska, and Nebraska says: OK, there's an update. The update carries a Git ref; Nuwa configures Flux with that ref, and from that moment Flux operates as usual. All Nuwa has to do is pick up the status and communicate it back to Nebraska — that's the reporting part.

Now I'm going to show a quick demo; it will fit within the time we have. Thanks to Santosh and Suraj, our colleagues who did the heavy lifting on this one and couldn't be here. So, let's go. In this case we have a cluster, and this cluster is running a web app — in this case nginx, as you will see. As you can see, version 1.17 is running. Now we're going to show Nebraska. Nebraska is also running; it has nothing to do with Kubernetes, it's just running elsewhere, if you will. This is just to show you that there is this nice UI — it could be nicer, but you know. You have the policies part, which you can control this way as well. And then when you go to the instances, or to the update status, you can see how many instances there are. Usually this is much more interesting when you have several instances, but you can see that there's one instance, and there's the version.
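Conceptually, the piece Nuwa rewrites is the Git ref in the Flux source object. A Flux `GitRepository` pinned to an exact commit looks something like this (the names, repo URL, and commit are made up for illustration):

```yaml
# Illustrative Flux GitRepository. Nuwa's job is essentially to keep
# spec.ref.commit in sync with whatever commit Nebraska has decided
# this cluster should run, instead of tracking a branch head.
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: demo-app
  namespace: flux-system
spec:
  interval: 1m
  url: https://github.com/example/gitops-repo   # hypothetical repo
  ref:
    commit: 4f2a9c1deadbeef                     # pinned by the agent
```

Pinning to a commit rather than a branch is what makes the rollout controllable: pushing to the branch no longer changes anything until the update manager hands out the new ref.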
Now, this version is different from the nginx version, and I'll explain why later. Moving on quickly, you can also check the history of the updates — what happened throughout. So far there was one update, but we'll see the process of updating. This here is the desired state of your application — what you want the state to be. This is tracked in your regular GitOps repo, right? This is the stuff that Flux would pull from, and actually does pull from. So when we want to update the version, we just say: OK, now I want version 1.21. And then the regular spiel — you commit and push. Until now, everything is normal, right? You push it, but now, instead of having Flux just pull it — and if it's broken, chaos — what we do instead is pick the commit ref. You could go to the Nebraska UI and set it there, but since some people don't like UIs, we also have this Terraform provider where you can change things instead. So we go there — and this is why we had a different version in the UI: this is the notion of a package in Nebraska. It carries a lot of information, because it can mean more than just one app. So you bump the version there, and then you replace the commit that you want Flux to look at. And there are all the other configurations you can provide to it — this is all Flux stuff, as you can see. So this is the diff; we create a PR in this case. Just to go very fast — hopefully you can still track what's going on — we open a PR, and then we have some actions to perform the Terraform magic, as you can see here.
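The Terraform change in the demo amounts to bumping a package version and swapping in the new commit ref. Very roughly, it looks like the following — the resource type and attribute names here are guesses at the shape of such a provider, not its real schema, and the values are invented:

```hcl
# Illustrative only: resource and attribute names are assumptions
# about the Nebraska Terraform provider, not its actual schema.
resource "nebraska_package" "demo_app" {
  version = "1.21.0"

  # The package's extra metadata carries the Git ref that Flux
  # should reconcile to — the commit pushed in the previous step.
  flux_git_ref = "4f2a9c1deadbeef"
}
```

The value of doing this through Terraform is exactly what the demo shows next: the ref change itself goes through a PR, CI plan, and merge, so the *rollout decision* gets the same review trail as the application change.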
So this is what you get on the PR when the action runs. Everything looks fine — if you're familiar with Terraform, you know how to read this — then we press merge, and it will eventually change the Nebraska configuration. There you go, you get some feedback, and that's it: Nebraska got configured. It says: OK, I have a new update; whenever instances come and ask me for an update, I have new stuff to deliver. And that's what happens, because Nuwa, the agent in this case, will have communicated. Let's see if it happened... Oh, surprise, it happened. And as you can see in the back, Nuwa checked for new update information, got this new update saying there's a new ref that Flux needs to pull, changed the CR that describes it, and from there Flux takes over. So it already reconciled and everything. And I talked about the reporting part: if we go back to the Nebraska UI, once things refresh, you can see it's already tracking the new version. Again, this is not spectacular because it's just one instance. But if you had a hundred clusters doing this, you could apply a policy like: if one reports an error, abort all the updates. So only the instances updated before the first error are affected — hopefully just that first one. And you can see that you also have the history here. And that's it. Well, we have 15 seconds left — I don't know if we do, but that's it. Thanks.
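That abort-on-first-error idea can be stated as a tiny policy function — this is a sketch of the concept only, not Nebraska's actual policy engine:

```python
# Sketch of the "halt rollout on first failure" policy mentioned above.
# Nebraska stops handing out the update as soon as any instance that
# already received it reports an error; instances that haven't asked
# yet simply never get the bad version.

def should_continue_rollout(instance_reports):
    """Decide whether to keep serving the update.

    instance_reports: iterable of (instance_id, status) pairs, where
    status is e.g. "updated", "updating", or "error".
    """
    return all(status != "error" for _, status in instance_reports)
```

Combined with rate limiting, this bounds the blast radius: at most one rate-limit window's worth of clusters sees a broken change, instead of all of them at once.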