Hi everybody, it's great to be here. My name is Elan, I'm the CTO and co-founder of Argon, and today I want to talk about the software supply chain. So this is me, coming to you today from Tel Aviv. And just a word about Argon: Argon's mission is to help companies release software securely, meaning we help protect each and every phase of their software supply chain, and we'll dive into that in just a few slides.

So what are we going to talk about today? The intro of me and Argon is done. Next, just a few words about the software supply chain, what it is and how it comes into play. We are going to speak about a few of the recent attacks, and we'll dive into two of them specifically. After that we're going to have a live demo of an attack and what it looks like in the wild, and we'll see how Argon comes into play in all of that. And that's it for today.

So, a little bit about the software supply chain. The software supply chain is actually a term borrowed from the physical world, where it is used to describe the path physical products or services take from the moment they are composed in the factory from all those different ingredients, throughout the entire chain of supply, up until the time they are delivered to my doorstep. In the software world the concepts are pretty similar, as we can see here in the five different phases of the software supply chain. It starts from the code itself and the source code management platforms that manage it, and continues through the materials, which is another way of saying dependencies: open source dependencies and CI pipeline dependencies; we'll see what those look like later on. Then we have the build phase: CI pipelines taking all the lines of source code, compiling them and finally producing a final form of artifact.
Those need to be managed as well, so we have all those different package registries for artifact management. The final phase is the deployment phase, where you take the final Docker image or npm package and deploy it to the relevant environment, let's say your Kubernetes cluster in production. So here we have a relatively simple example, I guess, of this type of chain.

The technological change that was introduced here in just the last few years is very dramatic. It didn't look anything like that just a couple of years ago. Companies would release software completely differently: you would have this quarterly release night, everything would go according to some project plan, with a lot of human intervention in the process. Today this is not the case. Software companies release software tens of times a day, with full automation and without any human intervention, and the technological change was very quick to come into play.

Here we can see some of the major events of just the last few years, to understand how quickly this has come. In 2015 GitLab was just a 10-employee company. In 2016 Google Cloud Build was only released in beta; today it is one of the more popular build platforms out there. Bitbucket Pipelines replaced Bamboo just in late 2017. Azure DevOps cloud was introduced in 2018; granted, Microsoft did have the TFS platform, but some would argue that it is different.
GitHub Actions, I guess, is the more interesting part: GitHub Actions is the third most used CI today, and it was only released in general availability in 2019, so just two years ago. And GitHub Container Registry became production ready just last year. So we can see how a lot of the services and products, from a lot of the companies that we use today and almost take for granted, were just introduced in the last couple of years.

So Argon conducted a survey to ask companies, to ask security leaders, how this change has affected them. We can see the results here, but we don't have to look at all the numbers now, this is just for show. I'll tell you about three main data points that I find interesting. The first one, which is kind of obvious I guess: over 90% of companies today use full CI/CD automation to deploy software to production. I guess today this is not a big surprise. However, only 23% of those same companies feel that they have confidence in those processes, meaning the rest feel they don't really know what goes on there and don't have proper control over it. And when we asked those security leaders, those CISOs and application security teams, what was the number one challenge in securing the software supply chain today, the number one answer that we got was collaboration between DevOps and security teams. I find that extremely interesting, because it has nothing to do with technology and automation and everything to do with people and processes.

So let's see some of the recent events that took place involving the software supply chain. These are just some of the high-profile attacks; I won't go into each and every one of them, we'll just do a quick overview. Mercedes had a major source code leak through their GitLab on-prem server: roughly 600 code repositories were leaked online, including very sensitive code projects. Later on Codecov, an automation company, was breached, and an attacker was able to modify the software
basically affecting all of Codecov's customers running that same software in their CI environments, and cause the leakage of sensitive data directly from those environments. The third one that we see here is the one known as dependency confusion: a white hat hacker demonstrated how easy it is to trick those package registries, those artifact managers, into pulling the wrong code packages from the outside world instead of the private ones developed in-house. And the fourth and final one probably doesn't need a lot of explanation: SolarWinds was a major build-time code manipulation attack. Basically, an attacker gained access to the build environment of the Orion app and was able to modify its source code, affecting thousands of its users.

And just last week we had another major incident involving this action, the check-spelling action. We can see here a screenshot of the security advisory explaining that workflows using it were vulnerable to leakage of their GitHub token, and we'll see a live demo of what this looks like in the real world. And just yesterday, less than 24 hours ago, Travis CI published this notice, noting that all public repositories using Travis as their CI were in danger of leaking secrets directly from those repositories. We can see that the issue is valid only for public repositories, which I guess should be kind of reassuring in a way.

So let's see what one of those attacks looks like when we dive a little deeper. Taking the check-spelling incident for example, we can see here the lines of installation. It's pretty amazing to consider the level of trust that we put in them. We can see here how a workflow could add the check-spelling action directly onto the GitHub repository using it, and we can see what the app store for actions looks like. It's no longer just your application that contains dependencies today, but rather your workflows, your CI pipelines. This means that when you release software, someone else's code running as part
of this same process is taking place, and it has access to your repository, to its secrets and to the code itself. Now, this is not a dependency of your application, and it is not listed in your manifest file, in your package.json file. This is other people's code running as part of your CI process.

So what happened with check-spelling? We can see here that it's a GitHub action used by a lot of repositories to, well, check the spelling of each pull request submitted against them. Now, its workflow was, say, very improperly configured. We can see a screenshot from their official page with the recommendation of how to set it up, and essentially each repository using it allowed an attacker to gain write access to it just by submitting a carefully crafted pull request. This is a screenshot from the security advisory that was published. We can see that for repositories using the check-spelling action which enabled the trigger on pull_request_target, an attacker was able to send a pull request and cause the GitHub access token to be leaked. With that token you can do a bunch of stuff, including read from and write to the target repository.

So this is kind of what a workflow using the check-spelling action would look like. You can see here that the event is on pull_request_target, and when you put that together with a checkout of the specific pull request, this is when things become a little messy. For those of you who haven't used GitHub Actions before, don't worry, we're going to show live how this workflow can allow an attacker to gain access to your repository.

So, a couple of workarounds were suggested. The first one would be to simply disable the workflow, which I guess is not that much of a fix but rather a kind of first aid kit. The other would be to change the way that you allow specific actions to run: you can allow only actions created by GitHub, or only actions verified by GitHub, for example, and I'll show you what this looks like in just a few slides. Plus, you can change the GitHub
token's level of access, its scope; by default it is granted something other than simply read. So these are just the quick fixes.

So I created a completely new GitHub repository and added just an example workflow to see how those settings are configured by default. We can see here that the workflow permission is granted read and write access to my repository. These permissions are reflected in the GitHub token itself, so an attacker gaining access to this GitHub token actually has, by default, read and write permissions to my repository containing this workflow. Another security configuration that I've also checked here is the default actions permission. Basically, if you're managing a GitHub organization you can control which actions you're allowed to run. The default, as we can see here, is allow all. This means I can run anybody else's code that is wrapped in a GitHub action, and it is granted access to my repository, to my source code, as part of my workflow.

Now, the solution itself is a little trickier, because you would have to not only fix the main branch of your repository and update it to the latest version of check-spelling, but also fix each and every branch containing a copy of this workflow. The way this works is that each branch has its own workflow file, and if it is still using the affected version, then any pull request submitted against this branch could also trigger the check-spelling action with the vulnerable version, allowing the leakage of the GitHub token.

So let's see what this looks like in the real world. Okay, great. So I have here this Roketo app, which is just a simple Node.js public repository. We can see here it's completely standard, with an innocent "This is the Roketo app" README file. Plus, I've added a very standard Node starter workflow, which I've copied directly from the example at GitHub. We can see here that it really doesn't do much, it's just a standard workflow for CI. Plus, I've altered the event we
are listening to, to be the pull_request_target event, together with a checkout of the pull request itself. Now, this is the combination that could put you in a lot of trouble, and we're going to see exactly how in just one second.

So here is another GitHub profile which I have, my alter ego. We can see here that it doesn't have any access to the Roketo organization or the Roketo app. It can find it, but just because this is a public GitHub repository. In order to make changes, this other user would have to fork the repository and suggest modifications using a pull request, so this is what we're going to do now. I'm going to fork the Roketo app repository. I'll give it a few seconds. Here we can see that now I'm working on this fork.

So now I can suggest modifications to the source code. Specifically, I'm going to suggest a modification to the package.json file. I'm going to use this scripts section in order to run my arbitrary code. Now, there are all sorts of different ways to run arbitrary code; specifically, running npm test is something a lot of CI platforms would do, so I'm going to use this one. We can see here the file with the instructions of my arbitrary code and what it is going to do. It's divided into three phases. The first one would be to obtain the GitHub token. This is accessible because my code would run in the context of the GitHub runner, and it does have write access, which is enabled by default. So I'm going to extract this GitHub token using git config, just to see where it is stored. Once I have this token, I can make a request to GitHub to modify the source code of the Roketo app; I would obviously have to use this token. And I will also send this token back, so we can see it together. So I'm taking all those lines and compiling them into one long line, just to be placed in the scripts section. We can copy and paste this one here and suggest the modification. So now the test script contains my arbitrary code. I can commit the changes to the main
branch. This is only on my forked copy of the repository; however, now that I have some modifications, I can offer them in the form of a pull request. So I'll open up a pull request targeting the Roketo app. Great. Once I do that, the Roketo app, which has my workflow, will start validating, checking my pull request. While this is running, we can take a look here at the README file. We'll give it a few seconds for the workflow to be done. Maybe we'll see what it does in the meantime: it's running npm test, which is part of my workflow. Now, this would run my arbitrary code, as we'll see in a second. There it is. So if I go here, this is the Roketo app, and I guess if I refresh the page this would no longer be the case. So you can see: "I am in". Now, this modification to one of the files in my repository, the README file of the Roketo app, was made by a user that doesn't have any access to this GitHub repository. So we can see that the workflow itself has run this cleverly put arbitrary code with the changes, and also sent back the access token for me to see. We can see here that the access token was sent: I have this secret eater listening for any secrets being sent from the GitHub workflow, and we can see the GitHub token here.

Okay, now what are we going to do about this workflow? The obvious thing would be to change the event, right? So that we no longer listen to the pull_request_target event. We can do that and just listen to the pull_request event. Plus, we will remove this one, because it isn't really needed here. Great, so now this workflow would only be triggered by the pull_request event. So we can go back here to the Actions tab and try the same thing again. I go to my fork, which is located here, I'm going to fetch the updates, and I'll make another change. This time maybe just change one of the lines in this JavaScript file, just so that we can open a pull request. Great, so now I have other modifications to suggest. I'm going to open a pull request with them, and we
should see the workflow running here. So we can see that it was triggered, and we immediately see a different result: the workflow was triggered but it didn't run, it is waiting for approval. So firstly, this is obviously a much better result than the previous code injection. So I'll let this workflow run, I'll give it the manual approval it needs. Great, and let's see what this looks like. We'll give it a few seconds to set up. So it's still processing my pull request, however it is doing it in a different context: on the pull_request event instead of pull_request_target. We can see that npm test has run successfully, so my injected code is still running as part of this workflow. However, now I'm getting this error message saying "Resource not accessible by integration". This is GitHub letting me know that the access token I've used here to try and alter the README file of the Roketo app again doesn't allow me to do that. This access token only has read access; it isn't able to change files in the target repository.

And what about the other code, the one where I send the token back home? We can see that another one was added here, so my secret eater did get it, and this is the GitHub token generated by this workflow runner. So the case is a little better: I'm not able to alter the repository, however I did manage to send the access token back home. Now, the reason I was able to do that is that there is another important thing to notice when using this workflow, and this is this step, the checkout step. The checkout step has another property, enabled by default, which we need to pay attention to. It is called persist-credentials. So if we suggest a modification here, using the with keyword, with persist-credentials set to false, which is not the default, then we should expect to see another flow of events. So let's target this workflow, sorry, one more time. We're going to go to the fork, the copy of it, fetch the
updates and again suggest a modification, let's say just add an empty line. So now we are able to open a pull request and trigger this workflow remotely. We can see that just now this workflow has been triggered. I'm going to approve and run it. Great. And keep in mind, here we are waiting for the secret to be populated. So we can see the workflow started to run; it will run my arbitrary code and it will fail. Just a second, and we'll take a look at the failure. So the reason for the error message here is that the GitHub access token was not available to it, and we can see that here we didn't get any new lines, so the secret eater that is listening for GitHub tokens did not manage to get it. So I hope this clarifies the difference between the workflow events, and the nuance of using persist-credentials: false together with the checkout action, because otherwise the default would be true, meaning that the rest of the workflow would have full access to the GitHub token and would be able to send it out to whichever location it wants.

Great. So going back to recent attacks, another attack that is also around the issue of workflows is Codecov. We touched on that in the overview, and we can see here the lines that are required for you to be targeted by this attack: if you are using the Codecov action, you have potentially leaked information directly from your CI environment. Codecov is an extremely popular test coverage tool. You use it as part of your CI, either with a dedicated GitHub action or simply by placing a curl command to download it, and it was hacked pretty badly: an attacker was able to modify one of the automations, basically affecting all of the Codecov customers using it at the time. To be a little more specific, we don't really know how Codecov was hacked, however it was noted that its Google Cloud access key was leaked in one of the Docker images, which it turns out is an extremely easy thing to
do: when constructing a Docker image, engineers can leave sensitive data behind. We can see here an example of a docker history command, which can be very revealing by itself.

Now, the Bash Uploader, the utility that was hacked, is a small utility that looks kind of like this, and it is responsible for uploading the test coverage results back to the Codecov platform. All the attacker did, once he was able to modify the file, was add this one new line. It's a pretty simple line, and most of you probably understand what it does just by looking at it: it prints out the environment variables that are accessible in this context, which is something running in my CI environment, and sends them out to a remote server. So we can expect to find data like access tokens, user credentials and API keys.

The results of this hack were pretty massive. A lot of high-profile companies publicly announced they were affected, but not only private companies: a lot of open source projects use it as part of their CI, and some of them are extremely popular. We can see Argo CD, Webpack, Ansible, even Kubernetes using it. In fact, if you go to GitHub and search for the command that downloads the affected file, you will get a few hundred thousand results, references to it being used today.

So let's see what this looks like. We jump back to this workflow, and this time we can look at this one here. This is another workflow, a pretty standard one, again with the pull_request_target line; however, I've added this extra step, copied directly from the official instructions on the official website for how to upload my test coverage results. You can see that I'm running the curl command to download this automation script as part of my release process. Now, a lot of companies today have a lot of CI pipelines, we're talking about tens of pipelines, hundreds of pipelines, and it's extremely hard to keep track of everything that runs as part of them. So one of the tools that Argon has built to help
mitigate these types of risks is this Argon CLI. It accepts CI workflows, goes over them and deconstructs the instructions. It doesn't matter if they are written as a GitHub workflow, an Azure pipeline, a Bitbucket pipeline or even GitLab CI: anything you use, you can deconstruct and apply custom logic on it. So it helps you avoid these security issues, those misconfigurations, those untrusted user inputs.

We can run it here. Okay, so this is what it looks like, this is the utility. You can see that it scanned one workflow file, this is the one on the left, and it has all those insights on it, starting from unpinned actions, which leave flexibility as to which code you're actually running, to a suspicious use of environment variables, and even the pull_request_target event which we referenced earlier. We can see here that it lets me know that elevated access could be granted to a pull request, and that we need to take a look at this specific line in my workflow. When combined with the checkout action, this is definitely something that we saw could put me at risk. Another thing we can see here is that credentials are stored on disk. This means I'm not setting the persist-credentials flag to false, so the GitHub token is kept on disk and is available for any script out there to consume.

Now, the relevant line here is Codecov. You can see that the Argon pipeline scanner has actually noticed that there is an unverified external dependency, so someone else's code running in my CI, actively running on every release, and it lets me know on which line it has been detected. So we can see one file scanned with seven findings, one critical and the others medium. And it also allows me to take action on it, so not only to find the relevant security issues but also to act on them. We can see here a pull request: when the workflow contains an unverified step, Argon helps you add a verification layer on top of that. It is added as a pinned GitHub action, which helps you use checksum-level verification
that anything running as part of your CI is in fact what should be running. The same goes for any of those gotchas, those small but very dangerous misconfigurations that could be easily avoided, so we can do it in a kind of automatic flow. And let's go back here.

Great. So now that we've seen what some of those scenarios look like in the wild, an important thing to realize about the software supply chain is that we often think of it as just one long unified process, however it is composed of five different layers or phases: securing your source code, then the dependencies, the pipelines themselves that build it, the artifacts that are being managed, and up until the time of deployment. In order to fully protect this chain we need to secure each and every link in it. If we fail to do so, then we get incidents like Codecov, which was a failure in securing one of the dependencies, or SolarWinds, which we already talked about earlier, a failure to secure the build environment, or dependency confusion, which could have been avoided with proper control over the package registries, the artifact servers themselves. And the list goes on: the CNCF on GitHub can easily show you a list of the software supply chain incidents that took place recently. You can find more details by checking that out.

So, a little bit about the solution, what to do about the security of the software supply chain. I'm going to present this framework. It correlates very well with the five different phases, and it offers for each one the set of controls that need to be put in place in order to fully secure the software supply chain. We can see some of them here, and if and only if those controls are in place can we hope to avoid incidents like the ones that we mentioned. We can see how each one of them targeted a different phase of the supply chain, and how those control gates could help mitigate them. Now, obviously, what we just saw with the
Argon CLI and the pipeline scanner is only a small portion of the solution. In order to fully protect the supply chain we need a unified security solution, so protecting each phase of the process helps you create a governance layer on top of it and, obviously, prevent any affected versions of your product from being released to production.

So I'm going to end with one of the more interesting quotes that I saw just recently, from GitHub Security Lab: "Any modern build orchestration is complex enough to have multiple code injection points." I think this reflects well the fact that the technology has changed. Build orchestration is now so complex that even GitHub themselves let you know that multiple code injection points can definitely be found there.

So that's it for today. I really appreciate your time. If you have any questions or any thoughts on that, please feel free to get in touch. Thank you, bye bye.
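
The kinds of findings shown in the scanner demo (unpinned actions, the pull_request_target event combined with a checkout, credentials persisted on disk, and a remotely downloaded script) can be roughly approximated with a few text checks. The Python sketch below is only an illustration under stated assumptions and is not how the Argon CLI works: the function name scan_workflow, the sample workflow and the example.com URL are all made up for this example, and the checks are deliberately crude line-based heuristics rather than a real YAML analysis.

```python
import re

# A full 40-character commit SHA is the only immutable way to reference an action.
FULL_SHA = re.compile(r"^[0-9a-f]{40}$")
# Matches "uses: owner/repo@ref", with or without a leading list dash.
USES = re.compile(r"^\s*(?:-\s*)?uses:\s*['\"]?([^\s'\"#]+)")
# Matches "curl ... | bash" and "bash <(curl ...)" style downloads.
REMOTE_SCRIPT = re.compile(
    r"\b(?:curl|wget)\b[^|\n]*\|\s*(?:sudo\s+)?(?:ba|z)?sh\b"
    r"|\b(?:ba|z)?sh\s+<\(\s*(?:curl|wget)\b"
)
PERSIST_FALSE = re.compile(r"persist-credentials:\s*['\"]?false")


def scan_workflow(text):
    """Return (line_number, message) findings for one workflow file's text.

    Hypothetical helper for illustration; the file-level checks are heuristics.
    """
    lines = text.splitlines()
    # File-level facts: is the risky trigger used, is the token kept off disk?
    uses_target = any("pull_request_target" in l.split("#")[0] for l in lines)
    persist_disabled = any(PERSIST_FALSE.search(l) for l in lines)

    findings = []
    for number, line in enumerate(lines, start=1):
        match = USES.match(line)
        if match:
            action, _, ref = match.group(1).partition("@")
            is_local = action.startswith("./") or action.startswith("docker://")
            if not is_local and not FULL_SHA.match(ref):
                findings.append((number, f"unpinned action: {match.group(1)}"))
            if action == "actions/checkout":
                if not persist_disabled:
                    findings.append((number, "token persisted on disk after checkout"))
                if uses_target:
                    findings.append((number, "checkout under pull_request_target"))
        if REMOTE_SCRIPT.search(line):
            findings.append((number, "remote script downloaded and executed"))
    return findings


# Sample workflow, made up for this sketch (example.com is a placeholder URL).
SAMPLE = """\
on: pull_request_target
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
        with:
          ref: ${{ github.event.pull_request.head.sha }}
      - run: npm test
      - run: bash <(curl -s https://example.com/uploader.sh)
"""

if __name__ == "__main__":
    for number, message in scan_workflow(SAMPLE):
        print(f"line {number}: {message}")
```

A workflow that listens to pull_request instead, references the checkout action by a full commit SHA, and sets persist-credentials to false produces no findings with this sketch, which mirrors the fixes applied step by step in the demo.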