All right, 2:15, let's get started. Hi, everyone. My name is Michael Crenshaw. I'm on the Argo CD team at Intuit as a back-end software engineer. I've been there for about a year now, and my team supports our internal Argo CD users, such as the folks who do builds and deployments for us. What I'm going to talk about today is what I call the bootstrap attack. Argo CD, as many of you probably already know, has the ability to bootstrap itself, to install itself. The way this works is you start with kubectl apply, and you apply your initial manifests for Argo CD. Once it's up and running, Argo CD is just Kubernetes resources. So you can create a new application, point it to your manifests, and now you're using GitOps to deploy Argo CD, or to bootstrap it. The fact that Argo CD is just Kubernetes resources means that all of its security config is stored in Kubernetes. So this attack is about a user taking advantage of Argo CD's power to maliciously change that configuration and escalate their privileges.
To understand this mode of attack, it's useful to understand the layers in the structure of Argo CD. The thing that most developers are familiar with is the applications layer. Each application is a custom resource which references a source of truth for your manifests, and then a destination, which is a Kubernetes cluster and one or more namespaces that those resources will be deployed to. Each application must be grouped under a project, and a project is just another custom resource defined and maintained by Argoproj. You can think of a project as a group of rules about what the sources of truth are allowed to be for a set of applications, what the allowed destinations are, and a number of other rules that restrict your developers to doing with their applications only what they're supposed to be allowed to do.
And then the very bottom layer of the Argo CD structure is the actual resources that you deploy. These are the things that you see when you click on an application in the Argo CD UI: config maps, deployments, et cetera, all those things that were represented as manifests and are now deployed and monitored in your Kubernetes cluster.
It's useful to see what a bootstrapped Argo CD looks like visually in Argo CD if we're going to attack it. This is a screenshot of the application view of a bootstrapped Argo CD instance, and I filtered it down to just a few types of resources. You've got config maps, stateful sets and secrets. The reason I filtered it to those is that they're some of the juiciest targets for an attacker, the things that we can modify to potentially escalate our privileges. I've listed a few of the first things that I would attack, but for this talk we're going to attack the Argo CD RBAC config map. That is the config map that stores the CSV file which describes how users are allowed to use the API, i.e. the CLI and the UI, to make changes in Argo CD. By attacking this we can potentially escalate our privileges and do more interesting things.
The RBAC config map matters because it basically governs a second source of truth for your manifests. You've got Git, you've got Helm, that's one source of truth. But depending on how you configure your RBAC, it becomes the gatekeeper for the second source of truth: the UI, the CLI, the API. The rules are defined in CSV like this, so we have different types of resources. In this case we're restricting the dev group's access to the applications resource. I've given this group access to get or to sync applications that are in the dev project. That's the dev/*. Every application which is grouped under dev, you're allowed to access.
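
To make the rule format above concrete, here is a minimal Python sketch of how a policy in that CSV shape could be evaluated. It is a simplified model for illustration, not Argo CD's actual implementation; the group name `dev-group` and the application names are assumptions, and `fnmatch`-style globbing stands in for whatever matcher Argo CD really uses.

```python
import csv
from fnmatch import fnmatchcase

# Hypothetical policy in the shape described in the talk: the dev role may
# get and sync applications in the dev project, and nothing else.
POLICY_CSV = """\
p, role:dev, applications, get, dev/*, allow
p, role:dev, applications, sync, dev/*, allow
g, dev-group, role:dev
"""


def parse_policy(text):
    """Split policy text into permission rules (p) and group bindings (g)."""
    rules, bindings = [], []
    for row in csv.reader(text.splitlines(), skipinitialspace=True):
        fields = [field.strip() for field in row]
        if len(fields) == 6 and fields[0] == "p":
            rules.append(tuple(fields[1:]))
        elif len(fields) == 3 and fields[0] == "g":
            bindings.append((fields[1], fields[2]))
    return rules, bindings


def is_allowed(text, subject, resource, action, obj):
    """Return True if an allow rule matches and no deny rule matches."""
    rules, bindings = parse_policy(text)
    # The subject acts as itself plus every role it is bound to.
    subjects = {subject} | {role for who, role in bindings if who == subject}
    allowed = False
    for sub, res, act, pattern, effect in rules:
        if (sub in subjects and fnmatchcase(resource, res)
                and fnmatchcase(action, act) and fnmatchcase(obj, pattern)):
            if effect == "deny":
                return False
            allowed = True
    return allowed
```

With this policy, a member of `dev-group` can sync `dev/orders-api` but cannot touch an application in another project, and cannot update anything, because no rule grants it.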
What that ends up looking like in the UI, and it would look the same if you used the CLI or the API, is that when an admin logs into Argo CD, they see all of the applications. On the left side you see that bootstrapped Argo CD application. But when a developer logs in, because they're under that dev role, they only see applications that are grouped under the dev project.
The attack, if actually performed, would look like this. On the left side you're an attacker. You've set up an application, or suppose an admin has set up an application for you. What you're supposed to be deploying is those bottom two resources, the guestbook UI service and the guestbook UI deployment. But the nasty thing there is the config map, which is the top resource, argocd-rbac-cm. When you deploy an application, you're allowed to specify in your resource what namespace you want to deploy to. So I can set in this argocd-rbac-cm that I want to deploy it to the Argo CD namespace, where all the good stuff lives. As an attacker, I see the left side. I've synced it. My changes to the config map have been applied.
On the right side, as an admin, I'm scared to death because I see that Argo CD is out of sync, and there's a warning indicating that some other application is managing the same resource. If I want to figure out what exactly happened, I can click the argocd-rbac-cm resource, and I'd see this. On the left side is what's currently deployed, which gives everyone with the dev role full access to do anything; that's what all the stars mean. What I should have deployed is on the right side, pretty restricted RBAC. I can hit sync, and then the attacker could go back and hit sync, and we could go back and forth like that. There's no way at this point that I could completely stop them from just managing their own RBAC. As a matter of fact, at this point they could kick me out of the system and take over the whole instance.
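
The shape of the attack described above can be sketched in a few lines of Python, with manifests modeled as plain dicts. The resource names follow the talk; the namespace name `argocd`, the application namespace `guestbook`, and the helper function are assumptions made for illustration only.

```python
# What an attacker's repository might render: two legitimate guestbook
# resources plus a ConfigMap that names the Argo CD namespace explicitly.
# The policy line is the wide-open "all stars" rule from the talk.
RENDERED_MANIFESTS = [
    {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": "argocd-rbac-cm", "namespace": "argocd"},
        "data": {"policy.csv": "p, role:dev, *, *, *, allow\n"},
    },
    {"apiVersion": "v1", "kind": "Service",
     "metadata": {"name": "guestbook-ui"}},
    {"apiVersion": "apps/v1", "kind": "Deployment",
     "metadata": {"name": "guestbook-ui"}},
]


def targets_sensitive_namespace(manifests, app_namespace, sensitive="argocd"):
    """List resources whose effective namespace is the sensitive one.

    A resource without metadata.namespace falls back to the application's
    destination namespace; one that sets it explicitly overrides that.
    """
    hits = []
    for manifest in manifests:
        metadata = manifest.get("metadata", {})
        namespace = metadata.get("namespace") or app_namespace
        if namespace == sensitive:
            hits.append(f"{manifest['kind']}/{metadata['name']}")
    return hits
```

Called with the application's own namespace, the helper singles out the ConfigMap: the two guestbook resources inherit the application's destination, while the ConfigMap has redirected itself into the namespace where Argo CD's own configuration lives.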
I really started thinking hard about this attack in March when the Argo CD team got an email about a CVE. Some organization had rules on their projects preventing users from deploying namespaces, but one of their users added a namespace manifest to their manifest repository, did a refresh in the Argo CD UI, and saw the namespace, which worried them a little bit. Then they clicked the triple dots and realized they were now allowed to delete that namespace. The bug was basically that Argo CD let this resource into its internal representation of what is being managed by this application. Nine days later, we fixed the bug. We kicked those resources out of that internal resource tree, so today, if a user tries to sync this, or tries to use the API, UI, or CLI to modify or delete that resource, they're prevented from doing that.
But this attack was really interesting to me because I realized it started out as a fairly simple attack. Okay, oh no, someone can delete a namespace. That's one problem. But if a user can also use the API or the UI to modify the live manifest of something that they've just added to their repository, then they could edit the live manifest of Argo CD's own resources in the Argo CD namespace and escalate their privileges. So it took something like a seven-point-something CVSS vulnerability to 9.9, which I believe is what we eventually settled on.
So I started wondering, are there ways that people are misconfiguring Argo CD that could let this stuff happen even when a bug doesn't exist? To avoid those types of misconfigurations, we have a few lines of defense. First, we can use Git-based restrictions. For example, in your source control management provider, like GitHub or GitLab, you can set some restrictions, and you can also do some RBAC restrictions in Argo CD to complement those. For the second type of defense, you can use Argo CD's own project resource, which we talked about as sort of a list of rules.
There are rules that can mitigate and even prevent this attack. And then finally, I'm gonna spend a couple of slides talking about the app of apps pattern and the app of projects pattern, which are a little bit more advanced and require a little bit more effort to secure against this type of attack in particular.
So let's start with Git-based defenses, which is basically restricting one of the two parts of an application, source of truth and destination. This is about restricting the source of truth. I'm gonna ask you to imagine a hypothetical Argo CD setup, and we're gonna define three protections against attackers. First, our developers are only allowed to deploy from GitHub repos that require admin approval to push. So as a developer, you can't push straight to any branch on your deployment manifest repo. You have to fork it, you open a PR, and then an Argo CD admin reviews it. That's rule number one, pretty easy to do in GitHub. Rule number two is developers are gonna be given their own RBAC role. For this hypothetical, I'm gonna give them full access to any of the actions related to applications. It's pretty common to want to give developers access to their applications that is as broad as possible, restricted, for example, to the dev project, because you want developers to be flexible and able to respond to issues. So we'll start there. The third part is we're going to configure the dev project, the one devs are allowed to access, to allow any source repo and any destination. Some of you who are Argo CD admins are already thinking, that's a bad idea. It is, but we're gonna start here and poke holes in it, and then figure out where the holes are so that we can make sure they're always patched.
So the first attack that we can initiate against that hypothetical setup is we can change the app's source repo. Remember, we have restrictions to prevent users from just pushing to that source repo.
However, I can use the CLI or even the UI to patch this orders API application and just point it to a different repo that I control completely. I can add, for example, an Argo CD RBAC manifest to that repository, push it, and now I've taken over.
It's pretty easy to mitigate. There are two different options I wanna give you. First, you can restrict the project sources. In the app project CRD, there's a sourceRepos field and it accepts a glob pattern. In this case, I'm assuming your organization name is example and all of your deployment repos end with -deploy. So this would be a good pattern. Keep in mind, if users can create repositories that match this pattern and they have control over the push settings for those repositories, they can just bypass this. They just match the pattern and they're in with whatever sources they want. So make sure that, for example, in your GitHub organization, users are restricted from creating arbitrary repositories with arbitrary names and control over their settings. If we have this rule in place and we try that exact attack that we just showed, the user's gonna be stopped. They're gonna see an error from the CLI saying that the source is not allowed, or the same error from the UI. So we've defended against that attack in one way.
Here's another way. You could restrict the RBAC for applications, which, remember, we had a star for because we want developers to have flexible access. Now we're gonna remove a few things. The main one to focus on is update, because that's what we were doing. We were updating the application manifest to use a malicious source. We just wanna stop them from updating it. Instead, our Argo CD admins will create the application for them, pointed at something that we know is safe, and then it just stays there. I also went ahead and got rid of the delete and create permissions. I removed create because they could create an application with any source repo.
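
The source-repo glob restriction described above can be sketched as a small Python check. This is an illustrative model under stated assumptions, not Argo CD's matcher: the GitHub URL prefix and the repository names are hypothetical, and treating `*` as unable to cross a `/` is a conservative assumption rather than confirmed Argo CD behavior.

```python
import re

# Hypothetical sourceRepos list for the dev project: only repositories in
# the "example" organization whose names end in "-deploy".
SOURCE_REPOS = ["https://github.com/example/*-deploy"]


def glob_to_regex(pattern):
    """Translate a glob into a regex where '*' never crosses a '/'."""
    return re.compile(re.escape(pattern).replace(r"\*", "[^/]*"))


def source_allowed(repo_url, patterns):
    """Return True if the repo URL fully matches at least one pattern."""
    return any(glob_to_regex(p).fullmatch(repo_url) for p in patterns)
```

The check rejects a repository under an attacker-controlled organization, but it accepts any repository in the `example` organization whose name ends in `-deploy`, which is exactly the bypass the talk warns about if users can create repositories with arbitrary names.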
I removed delete because, if I'm not giving them update, I probably don't want them just deleting applications anyway. They're gonna hit a similar roadblock when they try to attack with this in place, because they're going to run argocd app patch and they're gonna get an error. This time it's slightly different. It's not an invalid source. Instead, you're just not allowed to update. And they would see the same error in the user interface. So either of those two defenses works. You might wanna use both of them in conjunction, for example, if you're worried you can't come up with a pattern that is sufficient to match and prevent malicious repos, but either one should be able to work.
But there's still an attack available. We've done all this. We're protected against those attacks. This is still vulnerable to a bootstrap attack, because the Argo CD CLI has a flag that I think kinda gets overlooked. It is --local. What this does, when you run argocd app sync --local, is build the manifests locally on your computer and shoot those manifests up to Argo CD, and that's actually what gets synced to the cluster. This is awesome for developers who just wanna test something out real quick. They make a quick change locally. They do a local sync up to the cluster, look at it in the UI, and it looks great. So now they do git commit, fire it off, put up a PR. I like this tool for admin-type people. I don't love it for average developers, because there's a reason we like GitOps: you can trace what happens and see the entire history.
This flag is controlled by the override RBAC action. You remember where we had applications, *, et cetera. Where that star is is where the override action lives and where we can restrict it. So here's your defense. This is one of the four remaining actions that are allowed. We just delete the override action. When someone tries to do a local sync, this is what they'll get.
Permission denied, you're not allowed to override. I was chatting with someone yesterday who had never heard of this particular action that's available in RBAC. So I think it gets overlooked, and it's definitely worth looking at your RBAC and asking, one, do you need that? Two, if not, can you eliminate it?
So that's Git-based defenses. Restrict pushes to your manifest repositories, make sure users can't create repositories that match your glob patterns, and then restrict RBAC, at the very least to prevent override, but also probably to prevent creates, updates, and deletes on the application if you're not sure that you can secure the sources with just a glob pattern.
The problem with Git-based defenses is that all of it relies on Argo CD admins' time. I don't know about y'all's organizations, but a lot of orgs only have two or three people who really have in-depth knowledge of Argo CD, if even that. So taking those folks' time just to review PRs to make sure no one's submitting anything nasty, that's gonna be a lot of time. Intuit has 16,000 applications. If the team of eight-ish people who know Argo CD the best had to review all those pull requests, we'd never do anything else. So we need something besides just reviewing PRs, and we're gonna do that with the app project CRD.
First, we'll start with an initial setup, only two rules this time. We don't need the Git restrictions, because those are gonna take too much time. Instead, we're gonna give people full access, again, over all of the applications in the dev project, and we're gonna give the dev project full permissions to pull from any source and deploy to any destination. We'll leave it nice and open. We'll poke some holes in it and see how much we can leave open and still prevent a bootstrap attack. So the attack at this point is the simple one we saw earlier.
We have no way of preventing someone from adding the Argo CD RBAC config map or some other sensitive resource to their manifests, pushing it to a repo, hitting sync, and taking everything over. The way to stop this attack using an app project manifest is actually really simple. You edit the destinations field to prevent people from deploying to sensitive destinations. In this case, I've said, okay, I know I only want these resources going to the default namespace. If you have a very dynamic set of namespaces that you want to allow people to deploy to, it could get really cumbersome to have to always update this. So we accept multiple destinations, and we accept a glob pattern in that namespace field. So you could do *-dev. Starting with 2.5, thanks to a pull request from Blake Pettersson, we now have the ability to add an exclamation point to the beginning of the glob, and that just means don't allow anything that matches this glob. I would love to see people, as soon as 2.5 comes out, start adding !argocd to all of their app projects, because this absolutely prevents anyone from managing resources in that namespace. There's no way for them to take over your Argo CD instance and start doing bad things.
So this is what it looks like with that rule in place when someone tries to attack your cluster. You can see the config map in the Argo CD UI, but that is it. If you hit sync, the sync will fail with the error that the namespace isn't allowed. And if you hit the three little dots and try to delete, or you go to edit the live manifest, all of that will be prevented. You get errors for each of those attempts. So that's a nice and easy way to prevent those attacks using app projects.
Now I wanna talk about some advanced use cases. I'd like to see how many people have used or are interested in using apps of apps or apps of projects. Okay, a lot of folks.
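
The destination rules described above, including the negated glob arriving in 2.5, can be modeled with a short Python sketch. It assumes a destination is permitted when at least one allow rule matches and no `!` rule matches, and it only handles negation on the namespace field; the namespace name `argocd`, the server URL, and the `fnmatch` globbing are illustrative assumptions, not Argo CD's implementation.

```python
from fnmatch import fnmatchcase

# Two hypothetical destination lists for an app project.
OPEN_BUT_GUARDED = [
    {"server": "*", "namespace": "*"},        # allow everything...
    {"server": "*", "namespace": "!argocd"},  # ...except the Argo CD namespace
]
DEV_ONLY = [{"server": "*", "namespace": "*-dev"}]


def destination_permitted(destinations, server, namespace):
    """Permit when some allow rule matches and no negated rule matches."""
    allowed = False
    for dest in destinations:
        if not fnmatchcase(server, dest["server"]):
            continue
        pattern = dest["namespace"]
        if pattern.startswith("!"):
            # A negated glob is a deny rule: any match rejects outright.
            if fnmatchcase(namespace, pattern[1:]):
                return False
        elif fnmatchcase(namespace, pattern):
            allowed = True
    return allowed
```

Under `OPEN_BUT_GUARDED`, a deployment to `default` is permitted while one to `argocd` is rejected, even though the first rule alone would have allowed it. Under `DEV_ONLY`, only namespaces ending in `-dev` pass, so `argocd` is excluded without needing a negated rule.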
So this isn't the sort of niche thing you'd expect from a more advanced use case; this is something that people are actually using a lot, which I love. But it is more difficult to secure, and you'll see why. For folks who aren't familiar with apps of apps, it's kind of just what it sounds like. You have an application, and the source of truth contains manifests for other applications. When you look at it in the UI, you could have this main app of apps application, and I'm deploying a couple of other applications using that root app.
The difficult thing about an app of apps is that project field. If my users are allowed to deploy arbitrary application manifests, they pick their project. And those projects are the rules that keep people from taking over your Argo CD cluster. So we have to protect that field, and we'll talk about a couple of different ways to protect it. In Argo CD 2.4, so the current release and prior, the only way to secure that project field is using the review pattern. You need to do everything that we discussed earlier under Git-based defenses. You need to have Argo CD admins, people who understand the importance of that project field, reviewing all the PRs for apps of apps. And you need to make sure that the sources of truth are limited to just the places that are being reviewed by those Argo CD admins. You also need to make sure you disable the update permission, because, let me go back and show you what this would look like. If you don't disable update, users can click one of these applications. They can click the orders API, go to the live manifest, and just edit it and change the project to whatever they want. So for app of apps in 2.4 and prior, you gotta have people reviewing, and you have to disable the update permission.
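
As a rough illustration of what a reviewer is looking for in an app of apps pull request, here is a Python sketch that flags child Application manifests whose project field falls outside an allowed set. It is a hypothetical pre-merge helper, not an Argo CD feature; the application names, the allowed project set, and the function name are all assumptions.

```python
# Child Application manifests as an app of apps repository might contain
# them, modeled as plain dicts. The second one quietly picks a different
# project than the one developers are meant to use.
CHILD_APPS = [
    {"kind": "Application",
     "metadata": {"name": "orders-api"},
     "spec": {"project": "dev"}},
    {"kind": "Application",
     "metadata": {"name": "payments-api"},
     "spec": {"project": "default"}},
    {"kind": "ConfigMap",
     "metadata": {"name": "unrelated-config"}},
]


def disallowed_projects(manifests, allowed_projects):
    """Return (application name, project) pairs outside the allowed set."""
    violations = []
    for manifest in manifests:
        if manifest.get("kind") != "Application":
            continue
        project = manifest.get("spec", {}).get("project", "")
        if project not in allowed_projects:
            violations.append((manifest["metadata"]["name"], project))
    return violations
```

With only `dev` allowed, the helper reports the second application, which has moved itself out from under the dev project's rules. A check like this can support human review, but it does nothing about live edits through the UI, which is why the update permission still has to be disabled.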
Starting with Argo CD 2.5, there's a new feature called applications in any namespace that provides a model to allow self-service applications and restrict applications to particular projects that you define, in a way that's safe. I'm not gonna go in depth on that. This is a brand new feature and it needs a lot of testing. So when the RC comes out, if you're interested in this pattern, get in touch with me. We'd love for you to test the RC and see if you can secure this pattern.
Apps of projects: don't do self-service apps of projects. Argo CD admins are gonna have to review every pull request for an app of projects. And just to be clear, an app of projects is just what it sounds like. It's an application that deploys projects. There is currently no mechanism to prevent users from, one, creating new projects that allow them to do things they shouldn't be allowed to do, or two, potentially just editing existing projects to allow them to do things that they previously weren't allowed to do. So make sure your Argo CD admins are reviewing all those PRs and you've used all the Git-based restrictions. There are discussions of making projects self-service, but those haven't resulted in a really awesome proposal yet. I tried to write one, but it's still a work in progress. So I'd love to see this happen down the road.
So to sum it up, I've got just the basics. If you're trying to secure your Argo CD instance, this is what you need to do. A bootstrap attack is when you use Argo CD's power to destroy Argo CD's defenses by taking over sensitive resources. You've got a couple of different ways you can defend. One, use restrictions in Git and make sure that someone reviews all the changes and confirms they're safe. Two, use the app project manifest to restrict people to only deploy to places that you know are safe. And finally, there are a few extra mechanisms to defend apps of apps and apps of projects. So that's the bootstrap attack.
I hope people have found something useful, something that you can take back to your setups and analyze to see if you're secure. So thank you for listening. Are there questions? How can I attack mine, Ed? Yeah, let's see if we can find a mic.
Shouldn't we exclude the Argo CD namespace by default as a valid target?
So we ship Argo CD with a default project, which is wide open. We do that because we want folks to be able to test quickly on their local machines and see if Argo CD is good for them. I mentioned in the workshop yesterday that someone, Denilson Nastacio, wrote an article about how you should restrict that default project down. The first thing you should do is try to limit it down to only what you need, or even just completely restrict it and then use new projects. Excluding the Argo CD namespace by default is something I would want to look at for 3.0. I think that's probably a good starting point. It would make onboarding more difficult for people, but.
We could include a demo namespace, right, that is.
We could include a demo namespace. Yep, we could point the default project to that and know that people can't bootstrap-attack your Argo CD instance by default.
Okay, well thank you and a great job of making a complicated topic understandable.
Thank you, I hope it was clear. Other questions? And just to expand on that, I think there are a number of things where, in a major release like 3.0, we'd want to change defaults, maybe in ways that slow folks down a little bit, but that would help people take care of their setups a little bit more easily. Yes, let's see if we can get a mic over to you.
Pardon me if it's kind of a weird question. If you're good. Are you guys in connection with the rest of the industry, other vendors also providing technology for security and governance? How is this research happening as an ecosystem?
Sorry, can you repeat just the last bit?
Yeah, what's the relationship of the ecosystem of different vendors creating their own, because I was exposed before to Starboard from Aqua and I always thought that they were kind of doing this in a canonical way, but now that I'm seeing the bigger picture in DevOps, I see different vendors creating their own thing. So for me as a newbie it's kind of difficult to understand where the one place is that you can look for research and feel like this is it, or whether it's always this fragmented thing.
That's a good question. I mean, Argo CD is unique in a lot of ways. We do have our own RBAC system. We came up with our own names for the resources and the actions. As far as I know, the app project concept is kind of a unique thing in the particular set of rules that we include in an app project. So it's kind of difficult to generalize those structures. I think the closest thing would be, I know there are probably really excellent guides on how to secure Kubernetes, because it has its own model of here's what resources are, here's what namespaces are, and here's our RBAC model and how to keep yourself safe. I think that thinking about Argo CD in relation to how Kubernetes asks people to secure things is probably a good way to approach it. As far as good ways in general to understand security with CI/CD tech, a lot of it is very, very new stuff and very, very quickly evolving stuff. I highly recommend thinking about how you would attack your cluster. Even if you're just a dev and you just use Argo CD to bump the tag on one of your applications, think about what you would do that's malicious, and talk to your admins about whether there's anything that they need to tighten up. And then write a blog post and send me a link, because I'd love to know about it and know how to make Argo CD move towards those more secure practices by default once we understand the use cases. Hopefully that was helpful. Kind of touched on a few things.
Other questions? Yes, OPA rules. Okay, so the question is have we explored integrating OPA rules for RBAC? The last part I'm not sure I understand though. Oh, for Argo users, kind of. So Alex from Akuity has an issue up, and I think it's in the 2.6 milestone, but it's basically a field in the app project that is a jq query. For any resource under this project that you attempt to apply, that jq query is gonna run, and if it returns a truthy result, it's allowed; if it returns a false result, it's not allowed. This was gonna be particularly useful for restricting which custom resource definitions people could deploy, because in that case the name is super important. As far as integrating OPA rules by default, I think it's fine right now to layer OPA on top of Argo CD, and I think Kyverno has some Argo CD or even Argo Workflows specific rules that you can apply to try to keep things safer. I would like to see how the space evolves and what OPA rules people define for themselves and find useful, and then try to integrate that as a first-class feature in Argo CD. That's kind of where I'd like to go. Okay, I've got 30 seconds according to this clock. Anything else? Yeah, Raymond.
Was there any attack that you saw with /tmp and the repo cache?
Sorry, what with the repo?
In the repo server, /tmp is where all the repository information gets stored, and I know you were trying to protect that mount point.
Are we talking about the directory traversal problems in the repo server?
Yeah, so was there any sort of attack that people were doing with?
So I don't know of any attacks in the wild. We have been plagued with issues with symlinks and directory traversal in the repo server. Jake from Cobalt put up a PR. Starting with 2.5, we're gonna default to disallowing symlinks that point anywhere outside of your repo. So I think that'll help a lot.
I'm not sure if that's the issue you're talking about but that's a huge improvement in 2.5 that I'm excited about. Cool, thank you all so much.