Good afternoon, my name is James Holland. As I already mentioned, I'm from Citi. My colleague from Control Plane will be joining us shortly; there's no rush at the moment, he's going to come in and give us a demonstration of what we are doing. So, a bit of a provocative title: how is your supply chain, with your InfoSec and your OSS ingestion? I'll give you a background of the why, a bit of history of what I've been doing over 20 years in supply chain with people like the Department of Defence, which is super interesting, some of the tooling we've seen in the past, and what we're hoping to open source to the community with Control Plane and Citi. This is how we are looking at the landscape from the work we're doing in the marketplace. At Citi we've got quite an extensive software supply chain programme under John Meadows; he heads up a few of the working groups, in the CNCF supply chain working group and also the OpenSSF as well. Before I get into this, can I have a show of hands? Who here is using a package manager or something like Artifactory in their organisation? Do you ingest software directly into your pipelines straight off GitHub? Some nervous-looking people — hands up? Anybody else pulling straight into your pipelines without checking where it's coming from or whether you should use it or not? Another show of hands: how many people are actually doing an assessment of whether you should be using that library or not? One. Excellent, I'll come and speak to you later.
I think the point I'm trying to emphasise is: you can use this library, but should you? Is it fit for purpose? Is what it says it's doing the right thing? Now, as you can see in this landscape, you can't have a secure supply chain build if you have no idea where you're getting your libraries from and you don't know the maturity levels and all the other signals that might go around that. If we look at this tweet, I think it sums up the problem pretty well. Look at all these package managers — very flexible, super helpful, but a lot of them run pre- and post-install scripts as soon as you've installed the piece of software into your development environment. We know from previous experience that in npm, 2.2% of all the libraries have install scripts that run. Then, if we look further — we've analysed some of these — 94% of them are pretty insecure or malicious. That's a huge amount. If you look at npm alone, 2% of the libraries there are doing malicious activity. That's not great, especially as you've got no way of checking that at installation time at the present moment. There is some commercial software that does this, but there's nothing really out there for anybody else. If we start looking at what signals we want to look at, there's never a clean yes or no; there's always a gray area in between. We're always going to take libraries from Google's Assured Open Source, or the Alpha-Omega project from the OpenSSF, because they've gone through a whole series of checks and we can rely on them. That's the level of trust we have with those organisations — but you still have to ingest them, and you still have to check the signatures that came from that source. Then there are the ones you can deny: we have a signal that this library is being worked on by somebody with a dodgy background.
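The install-script risk described above can be checked mechanically by inspecting a package's manifest before anything runs. A minimal sketch, assuming you have the raw `package.json` as a string; the lifecycle hook names are npm's real install-time events, but the flagged package itself is hypothetical:

```python
import json

# npm lifecycle hooks that execute arbitrary shell commands at install time
INSTALL_HOOKS = {"preinstall", "install", "postinstall"}

def install_scripts(package_json: str) -> dict:
    """Return any install-time lifecycle scripts declared in a package.json."""
    manifest = json.loads(package_json)
    scripts = manifest.get("scripts", {})
    return {name: cmd for name, cmd in scripts.items() if name in INSTALL_HOOKS}

# Example manifest with a suspicious postinstall hook (hypothetical package)
manifest = json.dumps({
    "name": "example-pkg",
    "scripts": {"postinstall": "node ./collect.js", "test": "jest"},
})
```

A check like this doesn't tell you whether the script is malicious — only that one exists and deserves a closer look, which is exactly the signal the 2.2% figure is about.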
Or you know it's coming from a sanctioned country — we've seen this in the bank. The fact that a package doesn't have a signature is not necessarily a bad sign, but if a signature is failing, that would suggest it is. I'm not going to go into the malicious code; I think that's where the pre/post-install scripts on the package managers come into this red area. As a bank — or any organisation or enterprise — you've got to try to make a decision. But then you get all the gray area. Should any one of those failing mean you block the usage of that library? No. It's up to each individual enterprise to make a risk decision based off this information and have their own policies around it. And there's a lot of information in there — a lot of signals and a lot of different feeds coming in. We need a way of having a very simple policy for enterprises to check these signals and decide whether they should ingest this library or not. And then you have to start scaling this up. It's okay to do it for one or two libraries, or 50 libraries, but what if you're dealing with three million libraries? And you have to do it continuously, so it needs some scale around it. So, as I said before, here's a bit of history. I've done this manually for a long time. At the Department of Defence we used to do this: they wanted everything built from source, and if you brought in a library you had to justify its existence every single time. So you were very careful about what its maturity was, so you didn't have to replace it. This is a spreadsheet that my colleague François Eric Gauymars used to come up with, and we used it every time a developer came in and asked for a library.
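The green / red / gray split described above can be expressed as a tiny triage function. This is only an illustrative sketch — the signal names (`known_malicious`, `sanctioned_origin`, `trusted_source`) are invented for the example, not taken from any real feed:

```python
def triage(signals: dict) -> str:
    """Map ingestion signals for one library to allow / deny / review,
    mirroring the green, red, and gray areas described in the talk."""
    # Hard denies: malicious code or a sanctioned origin is never ingestible.
    if signals.get("known_malicious") or signals.get("sanctioned_origin"):
        return "deny"
    # A *failing* signature is a red flag; a *missing* one is only gray.
    if signals.get("signature") == "failed":
        return "deny"
    # Trusted upstream programmes pass, provided their signature verified.
    if signals.get("trusted_source") and signals.get("signature") == "verified":
        return "allow"
    # Everything else is the gray area: an enterprise-specific risk decision.
    return "review"
```

The point of the "review" branch is exactly the talk's argument: the interesting decisions live in the gray area, and each enterprise needs its own policy there rather than a vendor's flat yes/no.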
They had to go through this checklist, and then, like an emperor, we'd give a thumbs up or a thumbs down. And that included Oracle Database — we had to build Oracle Database from source for the DoD and put it into escrow, which was quite interesting. What I would like to say about this is that the process of building everything from source is not possible for most people; only organisations of a certain scale can do that. Google will do it for their own code, and the DoD will demand it of a lot of organisations, but trying to do a reproducible build, or build all your source code, is impossible for most, because the environments are all so different, and so are the install scripts — each library requires something else to be done for the build to happen. So it's too difficult and too expensive to do. I think the previous speaker mentioned this: there's a load of new tooling coming in that is allowing us to automate a lot of this. SBOMs, SLSA, attestations — it would be great if we could get those every time. VEX, as I mentioned. There's also KEV — Known Exploited Vulnerabilities — which is an interesting one, because it lets you say there actually is a real exploit here. These are now allowing us to automate and scale this, and that's what we're hoping to do. The tools do exist and have evolved a bit — we have Scorecard and things like that — but they're normally limited to proprietary systems. A lot of the vendors have the intelligence, but they don't actually let you make a decision on the gray area; they'll just say yes or no, and that's it. You might want to do some further investigation and make some policy decisions around that. So there's limited availability in the current tooling to remove this toil.
We've been working together with Control Plane at Citi to come up with a tool that allows us to ingest software, make the checks beforehand, and allow the libraries to be groomed, removing a lot of the toil I mentioned before — and obviously using the best practices you've seen around SLSA and a few other frameworks about signing and making sure we bring them in the correct way. Some of the use cases — I'm going to have to squint at that — are getting a bit more complicated. At the moment we're focusing on the blue areas, top right, about how we ingest the libraries. But we're doing a separation between the policy of what you ingest and the scanning or test policy you apply. When we collect the evidence, we don't make a decision within the container at that time; we collect the evidence, sign it, and store it for later, and then make a policy decision at the end. That allows us to rerun it in the future — the next day, and the day after that. If we've collected the evidence and the policy changes, we can re-evaluate it, and this lets us groom the libraries daily. I think it's very interesting to separate the initial lookup from the scan policy. The initial lookup policy decides whether this PURL can come in at all, or whether the way it's coming in is okay. But the scan policy can have a massive effect across all the scans and all the libraries you're using. If I change the policy on whether a signature is required, I want to see what the effect of that change is across the organisation. If I've got three million libraries in, and I say everything must be signed, you could lose 50% of your libraries. So you need to test and show what the effect of that scan policy change would be.
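The evidence/policy separation just described can be sketched as a store of check results that a policy is replayed over. Everything here is illustrative — the purls, evidence fields, and policy keys are made up for the example, not the real schema:

```python
# Hypothetical evidence store: check results are collected and kept, and
# policy is applied afterwards, so a policy change can be re-evaluated
# without rescanning anything.
evidence_store = {
    "pkg:npm/react@18.2.0":      {"signature": "verified", "high_cves": 0},
    "pkg:npm/left-pad@1.3.0":    {"signature": "missing",  "high_cves": 0},
    "pkg:npm/shell-quote@1.7.2": {"signature": "verified", "high_cves": 1},
}

def evaluate(policy: dict) -> dict:
    """Re-run a policy over stored evidence; returns purl -> pass/fail."""
    results = {}
    for purl, ev in evidence_store.items():
        ok = True
        if policy.get("require_signature") and ev["signature"] != "verified":
            ok = False
        if ev["high_cves"] > policy.get("max_high_cves", 0):
            ok = False
        results[purl] = ok
    return results

def impact_of(policy: dict) -> float:
    """Fraction of the estate a proposed policy would fail — the
    'what if everything must be signed?' question from the talk."""
    results = evaluate(policy)
    return sum(1 for ok in results.values() if not ok) / len(results)
```

Because evidence collection and policy are decoupled, `impact_of` answers the "would we lose 50% of our libraries?" question before the policy change goes live.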
We also want to be able to subscribe to feeds so we can bring these in earlier — I think that's the red section on the left. But for individual libraries, you've got to be able to give an override as well. We see this a lot: something is brought in that fails a policy — a very specific policy, which Thomas will demonstrate later — and you have to be able to say, for this library, it's fine, no problem at all, let it in, because we've done some background checks to allow it. Otherwise, even though you have a sensible core policy set up, one failure is going to blow you out of the water. Okay, some flows — I'm going to try to go a bit quicker so we can get more demo in. As I said, there's a separation around the PURL, the package URL, which is our universal lookup. We make a policy decision there; we can use feeds to make an instant yes-or-no decision. But then we can go into different flows. The first is the intel flow: go to Scorecard, or Chainguard, or other feeds from other vendors. Within Citi we've got about four or five different feeds that we use, giving us various information, even about individual developers. Most things will then go through the standard flow, but some might go further: okay, this is an npm package — do we want to actually run it in a sandbox and see what it does in the install scripts? Run it over a period of time with a faked timestamp so we can see what happens over a two- or three-week period, accelerated, because it might have some sleeper code in it that gets triggered in two weeks' time. That's the sort of extension we're looking to put into this. Also getting libraries from alternative sources — I mentioned that in the first slide. So, we've done a lot of talking — it's a very similar talk to the one we gave in Dublin — but this time we want to give a demo of what we're doing.
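The PURL (package URL) used as the universal lookup key above has a simple core shape, `pkg:type/namespace/name@version`. A minimal parser for that common shape — real implementations such as the packageurl libraries also handle qualifiers, subpaths, and percent-encoding, which this sketch deliberately skips:

```python
def parse_purl(purl: str) -> dict:
    """Parse the core pkg:type[/namespace]/name[@version] shape of a purl."""
    scheme, _, rest = purl.partition(":")
    if scheme != "pkg":
        raise ValueError(f"not a purl: {purl}")
    path, _, version = rest.partition("@")
    ptype, _, name = path.partition("/")
    namespace = ""
    if "/" in name:  # e.g. npm scopes or Maven group ids
        namespace, _, name = name.rpartition("/")
    return {"type": ptype, "namespace": namespace,
            "name": name, "version": version}
```

Using one canonical identifier like this is what makes a single lookup policy workable across npm, Maven, PyPI, and the rest.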
We're in early alpha, but we'd like a bit of feedback on it if we could. So, I'll ask Thomas to come up. — Can everyone hear me all right? All right, let me go through an example. We're going to demo the npm ecosystem today, with React — everyone knows it, I guess. If someone requests React, we also always care about the dependencies. In this case there's a direct one, loose-envify, and a transitive one, js-tokens, that we also need to ingest and check. We've built the open source ingestion on AWS, EKS, and Tekton, so we have Tekton pipelines in place. Let's just kick it off. These are the two purls I'm going to send off: one is the React example I mentioned, and the other is the shell-quote package — I'll get to why later. This is just a test script that sends them to our API, and off they go. The Tekton dashboard is a good way to show what's going on: we can see four pipeline runs running. The reason is that the first one is React, then we have the two dependencies of React, and then we have the shell-quote package. This one is React, running our different checks, and we're almost halfway through already. Let's go a bit deeper into the checks we're running at the moment — this is something that can expand quite a lot; in the future we're going to have the pre- and post-install script checks in a different task, SAST checks, et cetera, all kinds of things we can run here. Right now, in the first stage, we have the intel stage with the Scorecard check itself — we look at the GitHub repository of the dependency. Then the signature check: lots of signatures everywhere nowadays, but there's no point in them if you don't verify them, and we do that here. Most package managers or registries publish or supply the signatures.
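The fan-out into four pipeline runs can be sketched as a breadth-first walk over a toy dependency graph matching the demo (react → loose-envify → js-tokens, plus the separately requested shell-quote); the graph here is hard-coded for illustration, whereas the real system would resolve it from the registry:

```python
# Toy dependency graph matching the demo packages.
DEPS = {
    "react": ["loose-envify"],
    "loose-envify": ["js-tokens"],
    "js-tokens": [],
    "shell-quote": [],
}

def resolve(requested: list) -> list:
    """Breadth-first walk: every requested package plus every transitive
    dependency gets its own check pipeline, as in the Tekton demo."""
    seen, queue = [], list(requested)
    while queue:
        pkg = queue.pop(0)
        if pkg in seen:
            continue
        seen.append(pkg)
        queue.extend(DEPS.get(pkg, []))
    return seen
```

Two requested purls expand to four pipeline runs, which is exactly what the Tekton dashboard shows in the demo.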
We check the signature here against the public key that's publicly available. Then, in the next stage, we run our vulnerability checks: we call out to Snyk and check whether there are any vulnerabilities, any CVEs, things like that. James mentioned the policy before — let's go inside the task itself to see. In the policy decision stage, we fetch all the check results of the previous stages, get them back, and run a policy check against OPA with a bit of policy we've written. In this case — a bit of a spoiler — it's a pass. A policy pass means we can continue with ingestion and put it in our internal libraries. Now, the last one, shell-quote. One second. I got a bit of a different message here. First, a message from Scorecard: branch protection is not so great — many GitHub repositories have that. It could be a warning or a failure, depending on your policy, so that's quite flexible. But here we also have a high CVE coming back from Snyk, and it's a remote code execution CVE, and we probably don't want that to be ingested. So what does that mean in terms of ingestion? Let's go in here — we have an egress stage where we do a couple of things. Depending on the previous policy decision, we go a different path: if everything passed and we're golden, we can go to the internal registry; or we can throw it into quarantine; or we can straight up deny things, or emit warnings. And at the same time, it's not only about putting it out to registries, but also making sure that everyone knows about it — notifying people is important. Right now we're sending simple emails out via SNS. And here I can see a summary of what happened inside our Tekton: shell-quote, you already know why that failed; let's look into js-tokens — here the signature check fails.
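The egress routing just described — internal registry, quarantine, or deny, plus notification — might look like this in outline. The field names and destination labels are illustrative only, not the actual implementation:

```python
def route(check_results: dict) -> tuple:
    """Hypothetical egress step: map aggregated check results to a
    destination plus the notifications to send, mirroring the demo's
    three paths (registry, quarantine, deny)."""
    notices = []
    # A high CVE (like shell-quote's remote code execution) is a hard deny.
    if check_results.get("high_cves", 0) > 0:
        notices.append("high CVE found - package denied")
        return "deny", notices
    # Warnings (like weak branch protection) route to quarantine for review.
    if check_results.get("warnings"):
        notices.extend(check_results["warnings"])
        return "quarantine", notices
    # Everything clean: publish to the internal registry.
    return "internal-registry", notices
```

Keeping the routing as a pure function of the collected results is what lets the same decision be replayed later when a policy changes.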
The reason that signature check fails is that npm does not force you to put signatures on your packages. In Maven that is mandatory, for example, so if a signature check fails in the Maven ecosystem it gets much more interesting, because then something is really going wrong. But in npm it's not mandatory, and if the signature isn't there, we can't run a signature check, and therefore it fails for us. In terms of egress, right now we put things into an S3 bucket that is effectively our output store, and you can see here the two packages that both got accepted, along with their signatures for further consumption — if someone else wants to run signature checks again, they can. Our two failed packages, of course, are not in there, because we didn't allow them. There's a bit more going on. In terms of check results and provenance, we use Tekton Chains to create provenance for us; we sign that with our KMS key and put it in a DynamoDB for all of our provenance. But the provenance doesn't capture the actual result of the checks. For example, the Scorecard JSON I get back is not part of the provenance from Tekton Chains, and we still very much care about those results and want to fetch them back — which is why we've written a small database client. It's NoSQL, so we have a DocumentDB running here. Let's go into it — is that readable? Okay. So we have a bunch of information in here, and there's going to be more in the future, but all of that is metadata we can use to cross-reference data, to reference what other checks have been running, the time of ingestion, things like that — and also to trigger re-ingests. But we also have a payload. The signature check is fairly simple right now: it either passed or it failed. So here we have our current example, React — that one passed, and so we have signature verified: true.
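A stored check-result document of the kind just described — pipeline metadata plus the raw check payload — could be shaped roughly like this. The fields are an illustration of the idea, not the actual schema; note how a versioned check image also answers the audience question below about versioning the checks themselves:

```python
import hashlib
import json
import time

def make_check_result(purl: str, check: str, image: str, payload: dict) -> dict:
    """Hypothetical shape of a stored check-result document: metadata that
    links back to the pipeline run, plus the raw check payload, so results
    can be cross-referenced and re-ingestion triggered later."""
    doc = {
        "purl": purl,
        "check": check,
        "check_image": image,  # versioned container image = versioned check
        "ingested_at": int(time.time()),
        "payload": payload,
    }
    # A content digest makes tampering with the stored payload detectable
    # even before a full envelope signature is verified.
    doc["payload_sha256"] = hashlib.sha256(
        json.dumps(payload, sort_keys=True).encode()).hexdigest()
    return doc
```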
But we are outside of Tekton Chains here. With Tekton Chains we can verify our provenance, but we also want to be able to verify our check results themselves, which is why we have built a mechanism with DSSE envelopes to verify these documents too. The signature is part of the document, so any time we pull those check results back, we can verify the signature of everything and make sure that no one has messed with our check results to get something through. That is pretty much it for our technical demo. There's a bit more to it, but that's all we have time for now, and I think it's also a good time for questions, if there are any. [Audience] That UI — was that in Tekton? You had a screen drilled down with the package, about two minutes ago. Your check didn't look like it had a version — you had an npm check for a signature. Are you versioning your checks? [Thomas] Yes, kind of, because the check itself is versioned: it's unique in the context of Tekton Chains, and we have the task-run ID in our context in the database to make it unique; the database entries carry that context too. As for versioning the actual code we run inside those checks — those are, at the end of the day, Docker images that we can version accordingly. That information is also part of Tekton Chains, so someone could go back and check what version of a check was running. [Audience] Your example was someone going in and editing a result, which made it a super good example, but the check itself — it didn't look like you had enough information on it. You had a timestamp and the name of the check, but not the version of it, so that could itself be... [Thomas] We can put it on there. Any other questions? Thank you for listening. Thank you.
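The DSSE envelope mechanism mentioned in the demo is specified around a pre-authentication encoding (PAE) of the payload type and payload, and the PAE is what actually gets signed. A sketch of that flow — the PAE layout follows the DSSE specification, but HMAC here is only a stand-in for the asymmetric KMS signing the real system uses, and the payload type string is invented:

```python
import base64
import hashlib
import hmac
import json

def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: the bytes that get signed."""
    pt = payload_type.encode()
    return b" ".join([b"DSSEv1", str(len(pt)).encode(), pt,
                      str(len(payload)).encode(), payload])

def envelope(payload: dict, key: bytes) -> dict:
    """Wrap a check result in a DSSE-style envelope (HMAC as a stand-in
    for KMS asymmetric signing)."""
    body = json.dumps(payload, sort_keys=True).encode()
    sig = hmac.new(key, pae("application/vnd.check+json", body),
                   hashlib.sha256).hexdigest()
    return {"payloadType": "application/vnd.check+json",
            "payload": base64.b64encode(body).decode(),
            "signatures": [{"sig": sig}]}

def verify(env: dict, key: bytes) -> bool:
    """Recompute the signature over the PAE and compare."""
    body = base64.b64decode(env["payload"])
    want = hmac.new(key, pae(env["payloadType"], body),
                    hashlib.sha256).hexdigest()
    return hmac.compare_digest(want, env["signatures"][0]["sig"])
```

Because the payload type is bound into the PAE, an attacker can't swap a signed payload into a different context — which is the "make sure no one messed with our check results" property described above.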