Hi, okay. My name is Danny Nebenzahl, and I'm the CTO of Scribe Security. Scribe Security is focused on securing supply chains, and through our work at Scribe I had the chance to work with design partners implementing the ideas of SLSA and other frameworks. That is what I want to share with you in this lecture. This is my book, available only to a Hebrew-speaking audience, but you can see the translation of its name. In this book I make ideas of cybersecurity accessible through stories, and I wish to use this method here today as well. So instead of telling you about this setting and that setting and this design customer and whatever, I'll wrap up all the lessons we have learned in a fictional setting. First I will set up the story, then discuss why automating the evaluation could be useful and why it is important, then take you through our implementation journey along the SLSA levels, and finally take a few minutes to discuss the takeaways. What I want you to learn from this lecture is how one would go about implementing SLSA in a large organization and how compliance could be evaluated. Okay, so I put on my DevSecOps hat; my name in the story is DevSecOps, and I work at Imaginary.org. You know, DevSecOps gets the worst of all worlds: he's the DevOps guy, so he gets woken up in the middle of the night, and he also has "Sec" in his name, so he can't be too friendly with the developers. So one day I get a call from my CISO: supply chain attacks are on the rise, we are getting customer questions, you are responsible, and I want you, the DevSecOps guy, to sign off every release that our company ships. After I get past the shock, I put on my hacker hat and consider our pipelines. We have many diverse pipelines, thousands of programmers, and numerous attack vectors, so I would never be able to sign off every build. So I return with my answer to my CISO, but you know these guys are experienced, and he gives me the CISO answer.
Okay, you don't want to go the risk route, go the compliance route: go do SLSA. So I said, what's this SLSA thing? He said, I don't know, it's good for Google, so it's also good for us; go ahead and do it. Okay, so I roll up my sleeves and try to understand what this SLSA is. SLSA in two minutes: it's a framework that originated at Google, is intended to prevent tampering and secure artifacts, and focuses on the integrity of the artifact's build. The way SLSA works is you implement a checklist of security controls that you should have in your pipelines, and these controls fall into domains such as the source control systems and the sources, the build system, and the dependencies that are brought into your software project. Like many other frameworks, SLSA lays down four levels of compliance. You could go for SLSA level one, which is the easiest but of course has less security value, or try to get to level four, which gives the highest security but of course requires much more, a longer list of requirements. One of the important ideas in the philosophy of SLSA is the idea of provenance. Provenance is a document that serves as a sort of evidence about the build of the artifact. We use many other provenance documents in daily life, like the food label we see on our food: we know who made it and what ingredients went in. The SLSA provenance document similarly requires noting who the builder is, and I should use artifacts only from trusted builders, what recipe was used, what ingredients were used, and so on. Since it's evidence, it should be treated as evidence, and as you go up the SLSA levels you need to better protect the evidence itself: each level puts more requirements on the generation and protection of the provenance document. After understanding a little bit about SLSA, I started to consider our pipelines.
As a large organization, we have many diverse pipeline technologies. We have source control management of various kinds: Azure DevOps, SVN, whoever still remembers Jazz, and that's only a partial list. As build systems we also have many: Jenkins over Kubernetes as the main tool, but we still have Jenkins over in-house servers and other pipeline technologies, either from old projects or ones that came to our company through acquisitions along with their technology stacks, and of course also legacy projects with no CI. So as a DevSecOps guy who got a project, the first step was to focus, and I decided to focus on two main pipelines. Looking forward, Jenkins over Kubernetes is our main heavy-duty in-house tool, and we use GitHub and GitHub workflows as an emerging technology, one that is convenient for letting subcontractors and small projects manage themselves before we integrate them into our main Jenkins-over-Kubernetes pipelines. As source control I would focus on GitHub, and I would also focus on Dockerized applications, as they are more and more common in our imaginary company. When I considered how I would sign off on the assurance of the artifacts' builds in the future, I remembered that our pipelines are dynamic. Pipelines are not static; you don't build them once and they stay. They are changed by the programmers and the DevOps guys, due to features that are needed, new settings and new technologies that are incorporated, or the need to temporarily bypass tests, whatever makes things easier just to get things out. So if I needed to sign off every artifact, I would need to track all these changes and make sure I could handle it, and I knew I couldn't.
So the answer was to automate the evaluation. I would not go take a pipeline and make it compliant; I would measure whether the pipeline is compliant, and such a measurement would give us the value of a gap report, which could then be used to go ahead and manually fix whatever was needed. So with these ideas, focusing on two main pipelines and automating the evaluation, I set out on the journey of the SLSA levels. Okay, so the first step of course is SLSA level one. SLSA level one is easy to adopt and creates some visibility, so let's jump into it. There are two requirements for SLSA level one: the build should be scripted, and the provenance should be available. When I think about implementing it, there is nothing to implement, since the pipelines I've chosen are pipelines and so they are scripted. Of course it could have value for the legacy non-scripted pipelines, but that's not the mainstream of projects in our company. And when I look into the SLSA standard at the exact requirements of the provenance document, I see that the required fields boil down to two: the builder ID and the build type. These are of course available from the build systems themselves, so the provenance is available; it's there. We know it's Jenkins, and we can also know the pipeline run, like the job run or its equivalent in GitHub workflows. So the provenance is available, and since it's so easy, I can jump right into SLSA level two. SLSA level two adds additional requirements: the sources should be version controlled, the build should be done by a service, and the provenance should now be authenticated and service generated. Before jumping into implementing this, I put my hacker hat back on and tried to understand the value of these requirements, and I think it boils down to separating the build from the programmer's environment.
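To make the level-one provenance concrete, here is a minimal sketch in Python of a provenance document carrying only the two required fields, the builder ID and the build type. The field layout loosely follows the in-toto/SLSA statement shape, but treat the exact schema and the example identifiers as illustrative assumptions, not the official format.

```python
import json

def minimal_provenance(builder_id: str, build_type: str) -> str:
    """Build a minimal SLSA-style provenance statement carrying only
    the two fields required at level one: builder ID and build type.
    The surrounding envelope loosely follows the in-toto Statement /
    SLSA provenance layout; treat it as a sketch of the idea."""
    statement = {
        "_type": "https://in-toto.io/Statement/v0.1",
        "predicateType": "https://slsa.dev/provenance/v0.2",
        "predicate": {
            "builder": {"id": builder_id},
            "buildType": build_type,
        },
    }
    return json.dumps(statement, indent=2)

# Hypothetical identifiers for the imaginary Jenkins pipeline.
doc = minimal_provenance(
    builder_id="https://jenkins.imaginary.org/job/release/1234",
    build_type="jenkins-pipeline",
)
print(doc)
```

Both values would come straight from the build system's own data (the job URL and pipeline kind), which is why level one is essentially free for scripted pipelines.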
If the sources are version controlled, the sources reside somewhere other than the developer's machine, and this serves as a forensic tool and as a deterrence mechanism: if you're an attacker and you know everything is recorded in version control, you need to be more careful. So it should deter the programmer, or an attacker who compromised the programmer's machine. Building by a service takes the power away from me as the programmer and puts it in some other service, and when the provenance is service generated, again it's not me as a programmer writing down some provenance document, it's generated somewhere else, and so it's supposed to be better than giving this authority to the programmer himself. Okay, so let's move to implementing it. Source version control: again, once we're talking about modern projects, sources are version controlled, not much to do. Modern projects are also built by a service, so we're left with the provenance that should be authenticated and service generated. The natural answer of the security expert to "what is authenticated" is signed: you sign and verify the signature. But if we open up the SLSA documents, they wrote, and truthfully so, that the provenance should be authenticated, and while it could be signed using public-key cryptography, it could also be authenticated in other ways. So in my opinion there were two options to implement provenance authenticity. One is to consider the build service data as authenticated, and there is logic behind it: I trust the build system to do the build, and SLSA itself assumes that the build system is trusted. So even if I gave the build system the ability to sign, I would be relying on the same build system to do the signing, so I wouldn't gain much from the signature. The other option would be to go ahead and implement provenance signing.
My decision was to go with both, though perhaps not in the way the SLSA authors originally meant. In this story I wear two hats. As a consumer I will need to sign off, so I need to trust the artifacts that come to me; but we might also need to prove to other people that our artifacts are trusted. So I myself, as the DevSecOps guy, would accept the build service data, like logs and API data, as trusted, and I would sign the provenance myself. I could use any signing technology, and what I would gain is that it would be easy: I would do it myself at my workstation when I need to sign off something, and I would not need to go and integrate signing into all the pipelines, or at least all the pipeline mechanisms. The value would be long-term provenance protection: once it's signed, I don't have to worry about someone tampering with this piece of evidence later. And it would go over more smoothly with customers, since I would not have to convince them why I trust the build systems; they would get the default answer to authenticity, a signature.
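As an illustration of signing the provenance at my own workstation, here is a minimal sketch. It uses an HMAC from the Python standard library as a stand-in for a real signature scheme; an actual deployment would use public-key signatures (for example via Sigstore) so consumers can verify without sharing the key. The key and document here are made-up examples.

```python
import hashlib
import hmac
import json

def sign_provenance(provenance: bytes, key: bytes) -> str:
    # HMAC-SHA256 as a minimal stand-in for a signature; real
    # deployments would use public-key signing so that verification
    # does not require sharing the secret key.
    return hmac.new(key, provenance, hashlib.sha256).hexdigest()

def verify_provenance(provenance: bytes, key: bytes, signature: str) -> bool:
    # Constant-time comparison to avoid timing side channels.
    return hmac.compare_digest(sign_provenance(provenance, key), signature)

# Hypothetical provenance document and signing key.
doc = json.dumps({"builder": {"id": "jenkins"}, "buildType": "pipeline"}).encode()
key = b"devsecops-workstation-key"
sig = sign_provenance(doc, key)
```

The point of the sketch is the property gained, not the mechanism: once signed, any later tampering with the evidence is detectable.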
So how would I go about automating the evaluation of SLSA compliance? We can collect data from the log files; if we want, we can optionally convert this data to a provenance document in the format that SLSA suggests, and we can actually, as I said, sign the provenance document. If we take a look at the log files, and we don't have to read all the lines here, we have most of the data, at least the minimal data required, and even much more. But care should be taken: not all logs are created equal. If we look at the GitHub UI log and at the GitHub CLI tool logs, we get different results, and what is missing is the most important part, the yellow part. The part above it will be displayed in the GitHub UI but would not be recorded when you use the CLI tool, and what's written there is which build script was used and so on. So care should be taken here. A similar anecdote is the mutable reference. When we said that the sources should be version controlled, the exact requirement is that we should use immutable references, a reference that points exactly to the artifact, not a tag. When we tag a version, the tag can be moved to point at any artifact, and that's what usually happens: the tag of the main branch points to something else every day. So caution should be taken to collect the right data, but the data is there. If we can't get the data from the GitHub workflow log, we can get it from the GitHub workflow API and know exactly the immutable reference needed. Okay, so evaluation automated. The automation is really easy, collecting data from log files and APIs, as we said before, so I was really happy and updated my CISO that we had accomplished SLSA levels one and two.
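The mutable-reference pitfall can be guarded against with a trivial check: accept only a full commit hash as a source reference, and reject tags and branch names, which can be moved to point elsewhere. A sketch, assuming Git's 40-character hexadecimal SHA-1 form:

```python
import re

# A full Git commit hash: exactly 40 lowercase hex characters.
COMMIT_SHA = re.compile(r"[0-9a-f]{40}")

def is_immutable_ref(ref: str) -> bool:
    """Return True only for a full commit hash, which pins exactly one
    snapshot of the sources. Tags and branch names ('main', 'v2.1.0')
    are mutable: they can be re-pointed at a different commit."""
    return COMMIT_SHA.fullmatch(ref) is not None
```

In practice the full hash would come from the workflow API rather than from a human-written reference, which is exactly why collecting it there is worth the trouble.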
What have we learned so far? In modern environments it's a given: once you use version control and a build system, you are in fact at SLSA level two. Another lesson is that when things are done compliance-driven, we should expect the minimal result, and that's why I went with the minimal provenance document containing only the builder ID and the build type. Why is that? Because as a DevSecOps guy with a lot of tasks on my plate, I would not have the time to excel at protecting my supply chain; I would do whatever was required if I were measured on compliance. And if I am a consumer of software, I must assume that's what my supplier has done until proven otherwise. So when we do a compliance-driven implementation of a framework, we should be aware that we might get minimal value, and we should take care, maybe negotiate with our suppliers and so on. As you saw, I went about building the provenance and doing everything from APIs and log files, because in a large organization with many pipelines it's not realistic to go ahead and say: for this SLSA thing, I need to open up all the pipelines. That would not work. Signing is still an issue, and that's why I went looking into the authenticity of the build system logs themselves. It is true that one can use Sigstore, SPIRE, and so on, and that's what I would use to sign these provenance documents myself at my workstation, but going ahead and installing it in many places, and handling the ramifications of key management such as revocation, dealing with key leakage, and all the prices that come with key systems, is still an issue. Of course Sigstore holds a big promise here. So after describing all this to my CISO, he answered with the obvious question: what did we actually get from all this SLSA work? I understood the hint and went on to the next step in my SLSA journey.
Okay, so level three promises much more. Let's see the requirements. Here we get a long list, like seven requirements. Before diving into them, I again put on my hacker hat and asked myself what we gain here. The protection from the developer workstation gets better, a little broader, and we get protection from adjacent build systems and build runs, and from the history of the build runs: the ephemeral requirement would save us from attacks that left something on the build system that would influence the next build run, and the isolation requirement would prevent two pipelines from influencing each other. And the provenance here is non-falsifiable: we want better evidence, better-protected evidence. Okay, so verified source. What exactly is the requirement? Every change in the revision history has to be attributed to at least one strongly authenticated actor, and per the SLSA requirements, strongly authenticated means two-factor or multi-factor authentication. As a modern organization we would have multi-factor authentication deployed across the organization, but the exact requirement, every change, is quite an absolute requirement, and it would be tough to comply with. Imagine that one day some subcontractor loses his phone and needs to work; the natural solution would be to let him work for one day with single-factor authentication. From a risk-management view that is a reasonable solution: he's not working his whole life with single-factor authentication, only a few hours, and it would be okay. But I would lose my SLSA accreditation.
Another challenge is strong authentication versus the Git reality. The natural solution for verifying source is using signed commits, but signed commits are not two-factor authenticated; they are based on keys stored on your machine. And if we use other tools for automating code changes that work over the API, API keys are also not multi-factor authentication. Let's go to the next requirement. The next source requirement is that the sources should be retained indefinitely, and again it's quite an absolute requirement. It should be noted that here the SLSA authors suggest a solution, but it is hard to implement: you could change the history, but the change should have two-party approval, like a multi-sig solution, and this is not really a feature or tool that is available. So if I wanted to implement it, again what would happen is that I would lose my accreditation, because I had to do some history rewrite and I would not have the tools in place for two-party approval of deleting the history. Why would I delete history? Sometimes it's just to clean things up, and sometimes someone committed a secret by mistake. Of course, the formal solution for a secret leakage is to go ahead and revoke the secret so it is no longer useful, but that could take time, it could be a secret that other systems also use, so the quick solution, maybe not the best solution, would be to delete it from the repo as fast as I can. So again, this is an absolute requirement that could be a problem when implementing SLSA. So, what are my options for implementing, and also automating the evaluation of, these requirements?
I could use signed commits, even though they are not two-factor authenticated, or I could continuously verify that two-factor authentication was enforced while the project was built, at least at the time of the commit. Happily, in the discussions around the SLSA community, signed commits appear to be a legitimate solution for this requirement, so this would be the easy track for implementation. Regarding retained history, I could evaluate the branch protection rules through the APIs, or monitor the commit history and verify that it is continuous, that is, once I have seen the hash of a commit, I can always see it in the future. For such purposes, just last week we released an open source GitHub project named gitgat, which generates a report about the security posture of your GitHub account, and running it continuously could bring us close to these requirements. Okay, so we have two more challenges in front of us for SLSA level three: the build requirements and the provenance. Let's look at these build requirements. As we said before, the ephemeral and isolation requirements are supposed to make sure that pipelines don't influence each other, but there are natural shared resources between pipelines that we cannot disconnect. One is the build system itself, which could be accessed by the build runs, for example if we want the build run to be part of the version labeling of the software itself: when you vendor a version of the software, you might want to know its build run. Version bumping is a technique in which the build system pushes up the version number of the built software. Image repositories are another shared resource, because they are used for base images, the parent images of the images built by the organization itself, and for build images, images used as build machines. These are also shared resources.
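The commit-history continuity check can be sketched as a snapshot comparison: every commit hash seen in a previous scan must still be present in the current one, otherwise history was rewritten. How the snapshots are collected (for example via the GitHub API) is left out here; the sets of hashes are illustrative.

```python
def history_retained(previous: set[str], current: set[str]) -> set[str]:
    """Compare two snapshots of a repository's commit hashes and return
    the commits that disappeared since the last scan. An empty result
    means the history is still append-only; anything else signals a
    rewrite that should trip an alert (or require two-party approval)."""
    return previous - current
```

Run continuously, for instance as part of a gitgat-style posture scan, this approximates the "sources retained indefinitely" requirement without needing any feature from the source control platform itself.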
So how would I approach this challenge? Let's go one by one. Regarding the source control, what I could do is verify that isolation was not exploited. I would not verify that the build is isolated; I would verify that even if it's not isolated, the gap was not exploited, and that's quite easy: I would review the Git commits and see that no commit was made during the period of the build. The build service: we said before that we need to trust someone, and that someone is the build system itself, so we assume the build system is trusted. There is also the build infrastructure. Suppose I do trust Jenkins, but if I run Jenkins over Kubernetes and give the runners high-permission access to the build machine, then of course builds could influence each other. This would require an internal checkup: if in our organization we are aware of high-permission risks, not running Docker-in-Docker and so on, then we have our policies in place and we would need to check them. Regarding the image repository, we could also verify its configuration: everyone may have read access to the parent images and build images, but only certain pipelines would have write access. With this configuration we would not care that it's a shared resource, because only one party could write to it. Regarding other resources, these usually fall into the category of dependencies, and happily, dependencies are a requirement only at SLSA level four, so I could leave them for the next stage. So how would I go about evaluating this? I could verify that builds are done on fresh pods, which you can see in the Jenkins logs. On GitHub workflows, I would verify that I'm working on hosted runners, and once it's a hosted runner, GitHub promises that it's a fresh machine.
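The "isolation was not exploited" check on the source side can be sketched as follows: flag any commit whose timestamp falls inside the build window. The timestamps are illustrative; in practice they would come from the Git history and the build system's run metadata.

```python
from datetime import datetime

def commits_during_build(commit_times, build_start, build_end):
    """Return commits whose timestamps fall inside the build window,
    evidence that a (possibly non-isolated) build could have been
    influenced mid-run. An empty list means no overlap was observed."""
    return [t for t in commit_times if build_start <= t <= build_end]

# Illustrative data: one commit before the build, one during it.
commits = [datetime(2023, 5, 1, 9, 0), datetime(2023, 5, 1, 10, 30)]
suspicious = commits_during_build(
    commits,
    build_start=datetime(2023, 5, 1, 10, 0),
    build_end=datetime(2023, 5, 1, 11, 0),
)
```

Note the direction of the argument: this does not prove isolation, it only shows that the window in which isolation could have been exploited was empty.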
And I would need to verify the implementation of our security practices, that no high-permission Docker containers are run on the build machines and so on. This would be a solution tailored to our pipelines, not a generic evaluation method. Another thing I would need to take into account is caching: there are caching techniques in both GitHub and Jenkins, and I would verify that no cache is used in these cases. Okay, so we're left with the non-falsifiable provenance. The requirement is that the provenance cannot be falsified by the build service's users. I remind you, what was in the minds of the SLSA creators was to protect from developers who could influence what's built, or from other builds that could in turn influence the artifact at hand. So since I wanted to rely on log files: are the log files falsifiable? Here I went and made a quick experiment. I created my own version of Git, the one you see on the slide, that just prints out a Git version string, "git version 2.18", a bad version. And I tried to slip it into a standard GitHub workflow where I'm using the official GitHub checkout action, as you see in the bottom line over there. What happens when we run this build? We see that we can get quite far down the run of the pipeline. Of course, you can't get too far with an implementation of Git that only prints out a version string, but if I had a malicious Git, I could go down the whole pipeline, implement whatever was needed of Git, run bad actions, and report whatever I want. Here I report that I'm Git version 2.18, so I get past a large part of the pipeline. So what we see here is that we should again be cautious about which data we are willing to use from the pipeline. For non-falsifiable provenance, what we understand is that we should use only service-generated data, and the whole log file should not be considered service-generated.
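The lesson about logs can be turned into a filter: keep only the lines the build service itself emitted, and discard tool output, which a compromised tool like the fake Git above could have fabricated. Here I assume, as in Jenkins pipeline logs, that service step markers carry a "[Pipeline]" prefix; the marker would have to be adapted to your build system.

```python
SERVICE_MARKER = "[Pipeline]"  # Jenkins-style step marker; adapt per build system.

def service_generated_lines(log_lines):
    """Keep only the lines the build service itself emitted. Anything
    else is tool stdout, which a tampered tool (like a fake git) could
    have fabricated, so it must not feed the provenance."""
    return [line for line in log_lines if line.startswith(SERVICE_MARKER)]

# Illustrative log mixing service markers with tool output.
log = [
    "[Pipeline] checkout",
    "git version 2.18",   # printed by a tool; could be our fake git
    "[Pipeline] sh",
]
trusted = service_generated_lines(log)
```

This is deliberately a whitelist, not a blacklist: the default stance is that a log line is untrusted until the service vouches for it.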
Of course, the log files are a combination of data generated by the service and data generated by tools that the developer decided to run, and which he could also inject into the pipeline, as in the example you saw a minute ago. And again, we could also sign it if needed. This was much tougher, but I could report to my CISO that we had accomplished it. What did we learn here? SLSA level three is achievable in some platform settings, but not all, and if the infrastructure security is not handled, I would not be able to declare, for example, that the pipelines are isolated. Losing the SLSA accreditation due to the absolute requirements is a challenge, and I think it's a limitation; there is a need for some exception handling through which someone could be authorized to sign off and say, okay, even though this particular check is not passed, it's okay. When we come to implement such requirements at another organization and we explain that SLSA level three will protect you better from the programmer, one of the questions asked is: okay, but we invest a lot of effort in protecting the programmer's environment, we give him many tools on his machine, and we have security systems that track everything, so are you sure we need these additional requirements?
Again, SLSA does not give the space to do this risk-management practice. I think we all know that the world of standards has two flavors, the checklist flavor and the risk-management flavor, and this is one of the downsides of choosing the checklist. And we saw that non-falsifiable provenance should rely solely on service-generated data, and care should be taken about that. Okay, so after reporting this, I got the natural answer from my CISO: if you accomplished level three, go ahead and do level four. So I open up level four. There are many more requirements, but what's new here? We get better protection from the programmer, from the developer: the two-person review requirement puts human eyes on the code, so we know the programmer is not pushing malicious code; the build is better protected from other builds; and we see a first treatment of the dependencies. Since the build requirements of SLSA level three were the hard part, I jump straight into the build requirements of SLSA level four. Here is the tough requirement: hermetic. All transitive build steps, sources, and dependencies must be fully declared up front with immutable references. That means, for example, that in your package.json you cannot use a caret to float to any minor version; if you don't care about the exact version, too bad. You need to say: I want to use, I don't know, colors, and this is the hash of its commit. Here I was defeated. It would require re-engineering all the pipelines; we would need to go into every pipeline, pin all the dependencies, and find their immutable references somewhere, along with those of the tools. It might be implemented for specific projects: if we have a customer willing to pay the additional cost needed to do such a thing, we would craft a specialized pipeline for him. But to go ahead as an organization, to push forward and better protect our supply chain, we would not go for SLSA level four. Okay, so a few words about what we
have learned here. SLSA is an emerging, interesting standard, and as an emerging standard it on one hand puts many requirements on the table that we could go and implement, and on the other hand there are still things that need to be improved. One concern is that much depends on the developer and his decisions, and as consumers we must pay attention to what we are getting. Other concerns are the absoluteness of some requirements and the lack of some tools. And we saw that many parts can be evaluated automatically; from SLSA level three on, the evaluation needs to be tailored to the organization. Thank you. Do we have time for questions? One question, who's second? Thank you very much. So what you would need to do with logs is hand-pick which parts of the logs you are willing to take. To begin with, if you look at a Jenkins log, it's quite clear what was created by Jenkins and what are the logs of the tools that were run by Jenkins; that would be the starting point. Okay, do we have time for another one? I'm sorry, I didn't get it. So the question was: in a threat-modeling view, did SLSA address the threats that I would anticipate need to be handled? I would say not all of them, and the emphasis is that we should listen to the SLSA language, which speaks about integrity; requirements about vulnerability management and the dependencies wait only for level four, and I'm not sure that prioritization is right in the broader view. SLSA states that it's focused on integrity, so integrity is what you get. Okay, so thank you very much, goodbye.