Bonjour. Hello. Parlez-vous anglais? ("Do you speak English?") I hope so, because I don't speak French. But welcome, and thank you for coming. I hope you're all at the right talk; we're here for the rendered manifest pattern. I'm going to start this off by asking you a question: what represents the desired state for Kubernetes? Just what's the first thing that comes to mind? Simple question. When you thought about it, did anyone here think of Helm or Kustomize? Anybody? Maybe a little bit? Okay. Well, that's wrong. Helm and Kustomize are not the desired state for Kubernetes. It's the manifests rendered by these config management tools that actually represent the desired state for a Kubernetes cluster.

So let's break that down a little bit. Helm and Kustomize are common abstractions over plain Kubernetes manifests. They both address the need for a standard set of manifests and the ability to alter them based on the specific instance or environment you're going to deploy those manifests into. These abstractions are extremely convenient for keeping your configuration DRY, not needlessly repeating yourself — although there's a whole argument on LinkedIn about how DRY something should be and whether there's a certain level of complexity at which it makes everything worse. But that's a different talk. These abstractions are extremely convenient. Ultimately, though, they render out the plain manifests that the Kubernetes API server uses to determine the desired state for that infrastructure: which pods should run where, and in what configuration. In the context of Argo CD, the Helm and Kustomize abstractions are typically referenced directly by the Argo CD Application to determine the desired state. When Argo CD is deployed, it includes a version of Helm and Kustomize alongside it.
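Concretely, an Argo CD Application that references a Helm chart directly might look something like this — a minimal sketch, where the repo URL, chart path, and values file are all hypothetical:

```yaml
# Illustrative Application: Argo CD itself runs `helm template` on this
# chart at render time to produce the plain manifests it reconciles.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: frontend
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/gitops-repo.git
    targetRevision: main          # Argo CD watches the abstraction, not plain YAML
    path: charts/frontend         # a Helm chart, rendered at runtime
    helm:
      valueFiles:
        - values-prod.yaml
  destination:
    server: https://kubernetes.default.svc
    namespace: frontend
  syncPolicy:
    automated: {}
```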
Then Argo CD uses these tools to render out — or "hydrate," another term you might hear — the plain Kubernetes manifests from these abstractions. Ultimately, Argo CD effectively (there are nuances to it) just runs kubectl apply to reconcile the live state of the cluster with the desired state rendered by these abstractions. Argo CD does not tightly couple itself to this tooling. For example, Argo CD does not manage Helm installations or Helm releases; it simply uses helm template to get to the plain manifests that it then uses to reconcile the live and desired state.

This is what I would call the runtime rendering pattern. I, the developer, the GitOps user, push a change to my Helm chart or my kustomization; Argo CD is watching the main branch where I host these abstractions; it uses these tools to generate the plain manifests; and then it applies them against the Kubernetes API server. However, there are problems with this approach — notably, three risks. The first is probably the most apparent to users of the GitOps repository: the abstraction away from the plain manifests. It's a double-edged sword. The second is the risk introduced by the tooling used to determine the true desired state during reconciliation with the live state. And the third is the performance impact on Argo CD caused by rendering these manifests.

Let's start with the obfuscation. People might be familiar with this example. This is a Helm umbrella chart with a dependency on the Prometheus community chart, and I, as the GitOps user, have opened a pull request saying I want to update my dependency to a new major version. When approving this change in a pull request in Git, this is what you're going to see. Frankly, this is a big jump in major versions because I'm trying to make a point here. So we can assume that a lot is going to change.
At least 10 breaking changes, if not potentially many more. The problem is that you couldn't tell me anything more than that. You couldn't tell me what broke, which manifests are changing, and what's changing relative to the previous desired state — we're not even talking live state at this point, just the previous desired state — when looking at and approving this change in Git. Frankly, I've had experiences where just a minor version update to a chart that I didn't maintain, but that I was a consumer of, unexpectedly deleted the service account that the deployment relied on, breaking that deployment. Sure, this is an example of poor release management on the part of that chart, but it ultimately affects me as the end user. And I couldn't tell that it was going to happen, because I didn't know what manifests were going to be rendered until they were applied to the cluster.

A common workaround is to disable auto-sync right before you make a change, make sure the diff looks right in Argo CD, and then re-enable auto-sync — kind of defeating the point of that functionality. And the only way to truly be sure of the resulting manifests, without following that anti-pattern, is to run — in this example — helm template yourself. But even then, this is what the helm template command looks like in the Argo CD repo server, and it adds something like 100 flags. Granted, most of those are the --api-versions flags used to mimic what helm install does, which is to make sure the correct API versions are used for resources when applying them to the cluster. But you also still have to take into account the Helm options specified in the Application manifest when you run helm template. And you have to run helm template for both the existing desired state and the change you want to make to it. All locally, yourself, manually, every time.
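To make that manual workaround concrete, here's a hedged sketch of what doing it by hand looks like — the chart path and values file are placeholders, and the real repo-server invocation carries far more flags (release name, --api-versions, --namespace, and so on) that you'd have to mirror exactly:

```shell
# Render the current desired state (main) and the proposed change,
# then diff the two. You must replicate the Application's Helm
# options by hand, both times, every time.
git worktree add /tmp/main origin/main
helm dependency build /tmp/main/charts/frontend
helm template frontend /tmp/main/charts/frontend \
  --values /tmp/main/charts/frontend/values-prod.yaml > /tmp/old.yaml

helm dependency build charts/frontend
helm template frontend charts/frontend \
  --values charts/frontend/values-prod.yaml > /tmp/new.yaml

diff -u /tmp/old.yaml /tmp/new.yaml   # what will actually change on the cluster?
```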
And honestly, I can't count how many times I've committed a new change to my kustomization, Argo CD tries to sync it, and only then do I realize that I, you know, didn't remove a resource listed in my kustomization — and it breaks the build, and it's a whole thing. That feedback loop is broken, because I don't know what manifests are getting rendered until after Argo CD tries to render them.

Speaking of which, the version of Helm or Kustomize used to render these manifests is determined by the version of Argo CD you run in your environment. There's an asterisk there for Kustomize, where you can specify the version of Kustomize you want to use — but again, tangent. It's included alongside Argo CD. So when an upgrade to the version of Argo CD looks like this, can you really be certain of the changes to the manifests rendered for every single application by that Argo CD instance? I'm not confident. And there have notoriously been upgrades to Kustomize in the past that broke the rendering of manifests, which ultimately affects all of the applications managed by that Argo CD instance. Actually, in this example, we're going from version 5.42 of the Argo CD Helm chart to 5.43. That upgrades the version of Argo CD, which upgrades the version of Helm, which may or may not affect the rendering of those manifests. It's hard to be sure.

The toolchain risk comes from two perspectives. The first is that the version of Helm or Kustomize can influence the resulting manifests, meaning that an upgrade to Argo CD could actually be an upgrade to the desired state. The second is that when your GitOps tooling is responsible for generating the manifests, issues won't be caught until after the changes have been approved, merged into main, and an attempt has been made to apply them to the cluster. That's a fairly long feedback loop. I feel like we could catch that earlier. Finally, the performance of Argo CD is affected by the rendering of the manifests.
So Argo CD has to render manifests in three cases. First, the fairly obvious one: there's a change to the desired state, to the source that the Application references. The problem is that if you have 100 applications — an arbitrary number — referencing that source and revision, the repo server has to regenerate the manifests for each one of those applications. Even if they're all exactly the same — they all reference the same Helm chart, they all reference the same path — it's going to rerun Helm or Kustomize to generate the real desired state it uses to interact with Kubernetes. Relative to running kubectl apply, Helm and Kustomize are pretty damn expensive. We have worked with customers whose repo server takes up a 200 GB RAM node on their EKS cluster — a node literally dedicated to the repo server, because they have so many applications, and Helm and Kustomize are so expensive to run for that many applications. Actually, stepping back: for that first case, you can limit the impact using a Git webhook and the manifest-generate-paths annotation. It reduces the number of applications that have to have their manifests re-rendered by saying: only re-render if this revision, at this source, on this path changes. But you could still have hundreds of applications referencing that specific path that then have to get their manifests regenerated. The second case is that the manifest cache expires. By default, every 24 hours, the cache for the manifests rendered by Argo CD expires, and the manifests need to be regenerated to determine whether there's any difference between desired and live state. And the third is any time a user requests a hard refresh for an application. In any of those cases, Argo CD is running Helm or Kustomize — or whatever config management plugin you're using — to get to the real desired state and determine if there's any difference.
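The webhook optimization mentioned a moment ago is the `argocd.argoproj.io/manifest-generate-paths` annotation on the Application; a minimal sketch (the application name is illustrative):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: frontend
  namespace: argocd
  annotations:
    # On a webhook event, only re-render this app's manifests if the
    # changed files fall under this path ("." means the app's own
    # source path).
    argocd.argoproj.io/manifest-generate-paths: .
```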
So you might be asking at this point: what's the solution? It was the title of this talk, so I've kind of already given it away. It's the rendered manifest pattern. The desired state stored and approved in Git — or really whatever your desired state store is, but again, a topic for another talk — should contain no abstractions from what will be applied during reconciliation. The idea here is that I, as a GitOps user, push my change to the main branch. The CI engine sees that change and renders out the manifests using Helm or Kustomize, just like the Argo CD repo server would. It then stores those plain Kubernetes manifests — the true desired state the Kubernetes API server is expecting — in environment-specific branches. Argo CD monitors those branches instead of the main branch and applies them directly to the Kubernetes API server, meaning there's no tooling in between the desired state and the reconciliation of the live state.

The idea is that your desired state should be treated like a container image: immutable, and applied as-is to the cluster. Having your GitOps tooling run Helm or Kustomize is kind of like running apt-get install when your container starts up. If you've got that in your entrypoint script right now, there are other talks from previous KubeCons that you should probably go check out. The whole point of containers is that you have artifacts that represent exactly what you want to deploy, every time you deploy it. Similarly, running the latest tag is bad practice because you don't really know what you're going to get when you apply it. So the idea here is that your main branch is essentially the source code for your GitOps repo, and these environment-specific branches — I know what you're probably thinking; we'll get to it in the next slide — are artifacts. They represent the desired state.
And that is what Argo CD references to determine if there is any difference. Now, you might be thinking: isn't that Gitflow? No, we are not talking about Gitflow here, but I see why you might have gotten there. It's a common adage that you shouldn't use branches for environments anymore; we've moved on as an industry. I know there are probably people in the room still stuck with Gitflow. I'm sorry — that's unfortunate. But more precisely, you shouldn't use branches for environments when you're merging them into each other. In this pattern, you can still use trunk-based development. These environment-specific branches are rendered automatically, and they are never merged into each other to perform a promotion. We're not talking about promotion between environments in this talk; if you want to look into that, there are plenty of other talks on environment promotion. Check out Kargo, an open source tool that does that now, and there's also some talk about doing it in the Argo CD project. The idea here is that you can continue to use short-lived feature branches or trunk-based development, and the maintenance of the environment branches is fully automated, just like your container image builds are automated. If you're pulling down container images you actually rely on, making a change by hand, and pushing them back up, that's bad practice — just like merging these branches into each other would be. You should consider the contents of these branches kind of like a release bundle: they represent the desired state, as-is.

Okay. So let me show you what this would actually look like in practice. We'll go over to the demo. I've got my GitHub repo here. It's called render manifest example — pretty self-explanatory. I'm going to be demonstrating a tool called Kargo Render, which is a sub-project of the Kargo project.
Basically, the whole purpose of this tool is to do the rendered manifest pattern for you. We'll get into why you might want to use a tool instead of doing it yourself in a moment, but I want to demonstrate what the difference in user experience is like when using this pattern. I've got my main branch here that contains the manifests for a front-end application, which is a Helm chart, and also a back-end application, which uses Kustomize. We're going to focus on the front-end chart for this example. This front-end chart contains some templates — a deployment and a service — and it also contains Redis as a dependency of the chart. In this pull request, I'm going to be updating that Redis dependency. And I'll be honest, I actually don't know what the result of this dependency update is. I'm not sure. In reality, assume you've probably read the release notes, you're pretty confident about why you're upgrading this chart, you've put it into your PR description — or maybe you have a bot that monitors the dependency and generates this pull request. But this is what I, as a developer, would have to go in, review, and approve: okay, I see why you want to update that dependency, I agree with the intent of this change, let's go ahead and merge it. So I merge this pull request, and it kicks off a new GitHub Action. In this example, this Helm chart is used by both the dev and production environments. So without the rendered manifest pattern — without the approach we're going to take — both environments would be upgraded at the same time right now, if they're using the auto-sync policy. But let's go and see, based on our pull request here, what actually changed. If we look here: 13 files changed, 100 lines added, 37 removed.
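As a rough illustration of what a GitHub Action like that could be doing behind the scenes — this is a generic sketch using plain `helm template` rather than Kargo Render's actual action, and every name, path, and branch here is hypothetical and heavily simplified:

```yaml
# .github/workflows/render.yaml (illustrative only)
name: render-manifests
on:
  push:
    branches: [main]
jobs:
  render:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Render plain manifests for prod
        run: |
          helm dependency build charts/frontend
          helm template frontend charts/frontend \
            --values charts/frontend/values-prod.yaml > rendered.yaml
      - name: Publish to the environment branch
        run: |
          git config user.name "render-bot"
          git config user.email "render-bot@example.com"
          git checkout -B env/prod
          mkdir -p frontend && mv rendered.yaml frontend/
          git add frontend && git commit -m "render: ${GITHUB_SHA}"
          # For protected environments, open a PR instead of pushing.
          git push -f origin env/prod
```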
So this is now a pull request that is changing the desired state — it's presenting a new artifact for the desired state, saying: if you merge this in, you're confident that this is what you want deployed into the cluster. If you want to know exactly what the changes are, you can go through and check. You can see that, oh, there are now resource limits added by default to Redis — even though Helm kind of recommends against that, it's probably good to know, right? Because that affects the scheduling of those pods. Maybe I don't have ephemeral storage to give to Redis, or maybe my Redis needs more than 100 millicores of CPU, and I didn't know they were adding resource requests, because that's not something we'd done in our organization up to this point. Then you see: oh, there's a network policy too, I didn't know that. Where did this network policy come from? Maybe I had different labels, or I had messed with the values — and I, as the one who understands manifests, can come in here and understand what the impact of this is going to be. I'm happy with this, so we're going to approve it, submit it, and squash and merge it. And now I've updated the desired state. This is what Argo CD is going to be referencing. So if I go to this branch, I will see the prod environment branch, and I can look at precisely what the diff is for every change to the true desired state of my front-end application, and I can see the full history of it. I'll briefly look at the Kargo Render configuration just to show roughly how this is set up: Kargo Render is made aware of which config management tool we're using, the configuration for that tool, and the path we want to output to. And I think it would help if I showed what the results are.
So the result is that you have a branch containing every single manifest, separated by name and resource type, and you can go in and see exactly what it is without having to go into the live state of your cluster. You can be sure that this is the desired state, so if you're troubleshooting something, you have a reference. Because of the amount of time I have left, I'm going to go back to the slides and finish up by talking about the advantages of this.

To sum it up, the advantages are: first, we're eliminating the obfuscation created by the abstraction tools we use. Helm and Kustomize are fantastic for managing a set of manifests that you know you're going to deploy multiple times to different locations with slight — or even serious — variations. But now you can actually see the impact of changes to those abstractions on the desired state that Kubernetes understands. And the cool part of having the rendered manifests is that you can lint them before they change the desired state. If you want to run Kyverno policies in a GitHub Action on that pull request to production, you can. You can say: no, no, you're using imagePullPolicy: IfNotPresent with the latest image tag — that's going to cause problems, because the image won't update unless the pod gets scheduled onto a new node. You can lint for that now. Second, you reduce the tooling risk introduced by having an abstraction as the desired state, with Argo CD rendering out my manifests during reconciliation: if the version of Helm or Kustomize changes, that can change the result of your desired state. The big part here is that you render manifests once, creating an artifact.
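Circling back to the linting point: a Kyverno policy along these lines could run in CI against the rendered manifests (for instance via the Kyverno CLI's `kyverno apply`) to reject unpinned tags — a minimal sketch modeled on the well-known disallow-latest-tag policy:

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-latest-tag
spec:
  validationFailureAction: Enforce
  rules:
    - name: require-pinned-image-tag
      match:
        any:
          - resources:
              kinds: [Pod]   # Kyverno auto-generates rules for controllers
      validate:
        message: "Using the 'latest' tag defeats an immutable desired state."
        pattern:
          spec:
            containers:
              - image: "!*:latest"
```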
So Argo CD no longer has to re-render those manifests on a regular basis — when the cache expires, when the source changes — for all of those applications. You can do it once, in CI, on demand, and not waste those resources rerunning it, or have a repo server that takes up an entire 200 GB node in EKS. And the reason we use different branches for different environments is that you can set policies per environment: on production, we want pull requests so that changes to the desired state are reviewed before they go in; but on dev, I don't really care — push whatever you want, break it, that's the whole point of having that environment. That's why it's there. And I really want to stress the idea of creating artifacts for your desired state. Every one of those commits is now an artifact, no different from a container image: it represents exactly what you want to deploy, and you can go back to that point in time and get exactly what existed then. We're enhancing performance: Argo CD is essentially just running kubectl apply now, because it's working with plain Kubernetes manifests. And seriously, Helm and Kustomize are really expensive relative to running kubectl apply; it adds up quickly at scale. You have different policies for different environments, so you can utilize the extensive branch-based protection policies to determine when, and by whom, those changes can be made.

I will note there is a disadvantage: you're now shifting the complexity of the repo server into your CI engine. And I don't want to understate that, because a significant amount of work has gone into the repo server to have it reliably render those manifests, and you're now shifting that into your CI and taking on that responsibility. That's why Akuity wrote the Kargo Render tool: it essentially uses the repo server to render those manifests, but shifts where that rendering happens. We're not putting the onus all back on you.
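To tie the Argo CD side together: under this pattern, the Application simply points at an environment branch of plain YAML, with no Helm or Kustomize configuration at all — a sketch with hypothetical repo, branch, and path names:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: frontend-prod
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/gitops-repo.git
    targetRevision: env/prod   # CI-rendered branch of plain manifests
    path: .                    # no helm: or kustomize: stanza needed
  destination:
    server: https://kubernetes.default.svc
    namespace: frontend
  syncPolicy:
    automated: {}
```

Sync is then a near-direct kubectl apply of exactly what was reviewed in the pull request to that branch.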
The tooling still requires some growth; the tooling to implement this solution isn't perfect yet. The one I used in this example is still pre-alpha — we're just working on release candidates for 0.1.0. I'm not saying it's the definitive solution, but the pattern is the important part here. And that's the end. I think I'm officially out of time. Thank you for coming. Find me on LinkedIn or afterwards.