So welcome. I feel like we have such a big room. I want to ask everyone to step forward, but that's okay. Very excited to be here. This is my first time at this event, although I've followed the site for quite some time.

A little bit about myself. My name is Susan. I work in product at WhiteSource. For those of you who might not be aware of who WhiteSource is, we're an SCA (software composition analysis) solution provider. We focus on identifying and remediating open source vulnerabilities as part of your SDLC workflows. Not to make this into a commercial or anything like that.

So today what I really wanted to focus on is the software supply chain. One of the things that we've noticed when we talk to our customers, or even within our own industry, is that there are a lot of different biases and a lot of different mindsets about what the software supply chain really is.

So, pretty graphics. Actually not so pretty. I didn't have time to have marketing or anybody take a look at this slide deck. But I know when I first started thinking about a software supply chain, I thought, well, of course, it's the software that you put in as part of making your application. So that could be my open source. That might be my off-the-shelf components. That might be my custom code, but it's really the software piece. And it's really so much broader than that.

The reason I'm belaboring the point a little bit is that when we look at these attacks, which we'll get to in just a moment, and when we think about the problem statements behind some of the regulations and frameworks we're trying to meet, they're really looking at it from this broader perspective. So they're looking at your GitHub and your Artifactory, your code and your binaries. They're looking at your Jenkins and your CircleCI and your GitHub Actions, right? Like how things are built.
They're looking at Docker Hub and S3, and they're looking at Composer and npm and orchestration platforms. And they're looking at your code. So all of this together makes up your software supply chain. When we're looking at the problem statement, this is the scope.

Why do we care about it? I'll be the first to admit, and this is probably the wrong place to admit this, that I'm not a big open source geek. I don't go and find cool studies to read and say, oh, tell me everything about the community. But as part of preparing for this talk, I found this EU threat landscape report that just came out. I think it was in July. If you're into that sort of thing, it's a really good read. It's not like 50 pages, and it has a lot of good information about what they're seeing in the supply chain landscape.

I thought this graphic was kind of compelling. Yeah, we hear about software supply chain attacks in the news, and we'll go through a couple of them. And I know I've probably heard about SolarWinds in the last six months more than I ever wanted to hear about anything in my professional life. But you can see from the graph, if we look at 2020, of course it's out there. This is nothing new. What we see towards the end of 2020 going into 2021 is the frequency. The frequency is increasing, and it's increasing across the board. It's not just nation states. It's not just these APT groups. We're starting to get cyber criminals in the mix too, so the group of attackers is also expanding.

Some additional information, because what's one more graphic to belabor the point. Within that same bit of research, ENISA, I guess that's how you pronounce it, estimates about a four times increase just within 2021.
Another report, the State of the Software Supply Chain, looked at this and said that, year over year, it's a 650% increase.

And this I thought was very interesting. If you remember the first graphic, the software supply chain is a very complex thing. Lots of different components, lots of moving pieces. But when they looked at where the main focus of these attacks was, it was really the code. To put that into perspective, if you read the whole report, a lot of the time people had no idea how the attackers got in, which is scary in and of itself. But in the cases where they knew how the attackers got in, it was really through the code. So that would be your open source, your off-the-shelf components, and whatnot that goes into your software supply chain.

So, an example. I am assuming that everyone has heard of Codecov. Is that a valid assumption? Okay, I got one thumbs up. I'm assuming there are virtual thumbs up here. I thought it would be worthwhile to go from the very abstract to a couple of examples of what we're talking about.

Codecov was fairly well publicized in the mainstream press. And again, in my very biased way of looking at things, I'm like, okay, I have my open source, I have my code, I have this and I have that. That's what I need to be worried about. With this particular attack, they were going after Codecov, which is a very well-known way to make sure that you're producing quality code. It gives you code metrics. How well are you doing in terms of quality? Very cool. Lots of languages, lots of cool functionality.

What happened is they updated their product, as we do, and they introduced a bug in that update. They updated, they deployed again, they put it out there, and a malicious attacker said, oh, I see this bug.
So they were able to take advantage of that bug and modify a bash script within that container. Then when a customer downloaded it to get the latest update, their repository credentials would be sent to the attacker. So now the attacker had access to their repository, where their code is, where their IP is. Again, something that you wouldn't think about. This is just a code coverage tool at the end of the day. I'm using it to test. And it was able to give a malicious attacker insight into an organization's intellectual property.

If you search on Codecov, one of the victims of this was Rapid7, and Rapid7, I mean, they're an AppSec company. They know their stuff too. So when you look at these supply chain attacks, one thing is very interesting. It's not that Joe Schmo over there doesn't know what he's doing, oh God, he doesn't have an AppSec program, he's not securing anything. It's really some of these larger organizations, like Rapid7, like Apple, like Microsoft, that are being affected.

Another example that maybe isn't quite so well known, just to drive home the point about where potential risks can live, is within Composer. Composer is a package manager for PHP. PHP is one of the most popular languages out there. I don't think it's number one, but it's certainly top five in terms of the number of applications deployed using it.

So if I'm a developer and I'm using PHP as my language, I would have Composer installed, or it might be within the CI/CD pipeline. I would say, hey, I need this open source library. Composer is like, okay, I'm going to go out to Packagist, which is a public package repository, and ask if it has it. If it does, Packagist sends some information back. And then Composer is like, okay, I know where it is. I'm going to go and get it. So cool.
I mean, this sort of model has allowed PHP and Packagist to really expand their popularity and get a lot of people using them.

Some of the challenges, or some of the problems, with this setup: first off, it's a public repository. If Susan wanted to upload a package to Packagist, all I need to do is create an account and upload it. Which is good in terms of popularity. Secondly, there's really no validation when I upload that package or the metadata that describes my package. It checks whether the JSON is properly formed and it does some very, very basic checks, like whether a URL is HTTPS and stuff like that. But very, very basic. So for those of us in the AppSec industry, what's the problem with accepting user data that you don't really validate? It's usually not good.

So you can see on the left, or I guess, yeah, your left, what a good, well-formed, safe package manifest looks like. I have some descriptors. I say where it lives. It's over here on this Git repository. This is where Composer would go and get it. Okay.

But a researcher was able to show, well, guess what? Again, it's a bug in the software. This isn't anything that was malicious. And that's the difference between this and the previous example: this is a vulnerability. The other one was an actual attack where people got attacked. With this one, it was discovered beforehand and then fixed. But with the proof of concept, you can see, well, hey, I'm going to take that URL field in the JSON metadata and I'm going to put my own system command in there. And so the package was uploaded. It wasn't validated beyond, you know, it looks good. And then on the Packagist server, in this case the command is just a list command.
You could easily expand that into a more damaging injection attack against that Packagist server, to potentially take control of it or do malicious things to its users.

In this particular case, this was fixed very quickly. The researcher was able to work with the maintainers of Packagist. I think they got a hotfix into production in under 12 hours. So it was very fast. It was a great example of working with the maintainers of a registry server. And once they pushed it out, it was good. The bug was fixed. Of course, if you were running a private Packagist server of your own, you would need to upgrade to a certain version. I'm not sure what's posted in terms of the presentations, but all of the references are in the notes too. So in the Slack channel afterwards, if you can't find them, just ping me and I'll send them out.

But it was pretty cool. And again, something that you use to build your code, the package manager, was the risk. Who would have thought of that as something I need to be worried about as part of a supply chain vulnerability or a software supply chain attack?

So great, Susan. Sounds like there's nothing to be done. It's too complex. I know when I look at problems personally, if they're very complex, it's very hard to figure out where to even start. So the first thing I would say in terms of getting a handle on your software supply chain is that you have to pick somewhere to start. And I think the number one thing we can always do when trying to solve a problem is, first of all, know what the problem is. Otherwise, how do you fix it? It's true of anything. So know what you have. A good way to do that is to start with the software bill of materials, or SBOM, which I'm sure we've all heard so much about.
For those of us who haven't, a software bill of materials is very straightforward. It's a very simple concept. It's a formal record that has your components listed. It has the relationships between your components defined. And it covers both open source and proprietary, free and paid, and you can read the whole verbiage here. But really, if you Google SBOM, what is an SBOM, the first analogy that will always come back is that it's an ingredients list. And in its simplest form, that's really all it is.

In my previous experience, I supported network devices. They did WAN emulation, usually testing for network changes, like if you were migrating your data centers to the cloud or somewhere else. So anyway, we would put these devices into data centers. Data centers kind of run people's businesses. So what's one of the first things they would ask? What's in that device? Of course, there's the chassis, the hardware components that you need to list. But there are also the software components, right? Oh, it's running Apache because it has a web console, so you don't have to go and plug into it every time. Or it's running Java for Apache Tomcat. So we would need to put that into our bill of materials so that the device could get approved, hopefully, to go into their data center. This is the same thing. It's just from a software point of view, right?

If we look at our practices, I do think there has been a good start with this, especially for the open source piece of applications and components, which, depending on which report you read, could be anywhere from 50 to 90% of your application. A lot of those tools will give you some sort of inventory report. This is what you have. Here are the versions. Here are the vulnerabilities. Here's the license and whatnot.
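To make the ingredients-list idea concrete, here is a minimal sketch in Python of a record that lists components and the relationships between them. It is not SPDX or CycloneDX or any other standard format, and every name in it (`Component`, `Sbom`, `dependents_of`, the device, the vendors, and the version numbers) is made up for illustration, loosely following the network device example above.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Component:
    name: str
    version: str
    supplier: str   # an open source project or a commercial vendor
    license: str


@dataclass
class Sbom:
    components: list = field(default_factory=list)
    # (parent, child) pairs meaning "parent depends on child"
    relationships: list = field(default_factory=list)

    def dependents_of(self, name):
        """Return every component that directly or transitively depends on `name`."""
        found = set()
        frontier = [name]
        while frontier:
            current = frontier.pop()
            for parent, child in self.relationships:
                if child == current and parent not in found:
                    found.add(parent)
                    frontier.append(parent)
        return found


# Illustrative data only: a hypothetical device with a web console.
sbom = Sbom(
    components=[
        Component("wan-emulator-firmware", "1.0.0", "ExampleVendor", "Proprietary"),
        Component("web-console", "1.0.0", "ExampleVendor", "Proprietary"),
        Component("apache-tomcat", "9.0.0", "Apache Software Foundation", "Apache-2.0"),
        Component("java-runtime", "11", "ExampleJdkVendor", "Example-License"),
    ],
    relationships=[
        ("wan-emulator-firmware", "web-console"),
        ("web-console", "apache-tomcat"),
        ("apache-tomcat", "java-runtime"),
    ],
)

# Machine-readable form of the same record.
print(json.dumps(asdict(sbom), indent=2))

# If the Java runtime turns out to be vulnerable, what else is affected?
print(sorted(sbom.dependents_of("java-runtime")))
```

Keeping the relationships alongside the component list is what lets the last line answer the question of what is affected when one ingredient goes bad. A flat inventory report cannot answer that.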
But I think the difference here is that it's expanding the scope to include proprietary or commercial off-the-shelf components, as well as the relationships. So we put that all in the blender, the mix, the funnel, and we get this SBOM in a standard format.

One of the things that I think is really interesting is that the recommended minimum requirements, if you look at the executive order, are to list all the components and list the relationships, and it needs to be machine readable. Well, who cares, right? Well, it's important, I think. Say I sell you WhiteSource, right? You're like, I'm going to buy that stuff, that sounds great. And Julie here is like, oh great, I get to be part of this presentation. And Julie's like, okay, I just bought WhiteSource. Well, now that's part of my supply chain. So I need to be able to track that, and I need to integrate it with all the other things that are part of my software supply chain. So having this machine-readable, standard format is extremely important.

And it should be automated. For those of you who work with AppDev teams, or DevOps, or just deal with the pace of application deployments today: if it's not automated, this becomes another compliance thing that we use to check the box, and it has no real impact on managing risk in the software supply chain.

It supports improved supply chain risk management when it's part of the overall process, not just some SPDX file that I need to check off the box, which is kind of the vibe I'm getting now when I listen to customers look at this. Beyond that, one thing coming out of the standard that maybe isn't so emphasized, and that I think could be a real bonus, is the ability to potentially standardize how we talk about components and how we identify components.
It's consistent whether we're talking about open source from Node versus Maven versus Go versus some off-the-shelf component that I bought from that vendor over there. Once we have a common language, it becomes easier to track and manage everything that goes into that pipeline, into that supply chain. And again, better risk management, better understanding of all of the pieces that might be vulnerable. Then if something does happen, like a SolarWinds, or like with Composer, I know that, oh, this guy is related to this, this, this, this. I need to put some mitigating controls in there or I need to somehow fix it. But at least I know what I'm dealing with, which I don't think we have much of an idea about today.

That's great. So I have this report. It's got some dependencies. It's automatically generated. It's awesome. It fixes nothing. Let's be real. It fixes nothing. Is it an important piece of supply chain risk management and of making sure that we're safe from attacks? Absolutely. I can't protect what I don't know, right? But it doesn't actively protect against supply chain attacks.

So before we jump into the protection piece, how are we actually exposed? Right now I'm focusing specifically on open source supply chain attacks. There are two main ways that we see this happening.

The first is that there's a brand new package out on Maven or Node or RubyGems or wherever, and it's malicious from the get-go. People try to entice you, trick you, whatever, into installing it. Some examples of that: certainly the dependency confusion that was all over the news earlier this year, I think it was. That one was a research project, but it's a good example of this. It was never meant to be a real package. It was always intended to be that trick, that malicious piece, from the very beginning. Brandjacking.
Again, something that looks like maybe it comes from Google, but it really doesn't. They're just leveraging that name. Typosquatting. That would be one that I would be very, very susceptible to with my typing skills. All of these are examples of attacks that were never anything but what they were. Their whole purpose is to trick people into using them.

The second method that we see out there is that I have something that's good, that's useful, that I'm using. And then it takes a turn to the dark side. Through some sort of method, it becomes malicious. A malicious takeover of an account or a package, I think that one's pretty straightforward. There's a well-known example within the Node community. I believe the guy didn't want to maintain it anymore, so he passed it over to somebody who was very active in the community. That guy didn't do good things. He made bad choices. So that would be a package takeover. There might also be an account takeover, and that's more associated with social engineering and fraud: I'm impersonating Susan from WhiteSource, but I'm not really Susan from WhiteSource, and I do bad things on her behalf.

The trickier one, in my opinion, is the imposter. This one's kind of a hybrid between the first method and the second, in that it has aspects of brandjacking in it as well. In this particular example, we have a well-known library, DD Trace by Datadog. A very well-known library, extremely useful for integrating Datadog metric gathering into your software. There was another package called DDD Tracer. And if you looked at it, I mean, everything about it says Datadog, Datadog, Datadog, Datadog. It just had a little R at the end. And there was nothing malicious about it to begin with. It was impersonating.
The idea is that, OK, we're going to offer the same functionality, we're going to do everything the original library does, we're going to get users, and then we'll switch over. And we'll stop acting as a good citizen of the open source community. So again, taking advantage of a well-known name, a well-known package. That second category, those are the tricky ones. Those are the hard ones to protect yourself against. The first one, are they tricky? Sure they are, but they're easier to see and they're easier to block. Whereas these other ones look so similar. They might have been good to begin with, but they've gone over to the dark side.

So first and foremost, there is no silver bullet. With anything we have to deal with in AppSec, there is no silver bullet. We just have to accept it.

One of the risks with these types of attacks is that, if you look at the ecosystem, a lot of the time I don't even have to install these packages to be exposed to the risk. I just download them. The arbitrary code executes even without me running npm install or composer install or whatever the command is for the package manager. So you have that on one side.

On the other side, these package registries do monitor and do self-police. They don't want malicious stuff in their ecosystem. That gives them a bad name. Nobody will use them. But there's very little real-time scanning and, as you can probably attest, there are very few full-time people looking and evaluating. And if Susan from WhiteSource says, hey, did you know that this package is malicious? I looked at the code. It's not good. It's not doing any input validation. Or it's phoning home to some country and sending my data over there. They have to evaluate that. It takes them time before they yank it, assuming that they agree with Susan from WhiteSource. So in the meantime, you're at risk.
So you download an updated library or package. It does some random things with its arbitrary code. Yeah, the maintainers of the registry are looking at it. But it's still there.

So basically, whoops, I skipped over here. So basically, there is an education aspect to this. Again, there is no silver bullet. There is no tool that will fix everything and remove all risk. What we need to do is start looking not just at the software, but at the pieces that go into the software supply chain.

In this particular case, I'm looking at potentially hardening the package managers. Before they download and install a package and run that arbitrary piece of code, they scan the code in an automated way. There are tools out there that can do that. A plug-in for a particular package manager looks for malicious or suspicious behaviors, and it'll block the package from even being downloaded. This works whether I'm looking out to the public internet or at my internal, private registry. Maybe I have a firewall rule that says, hey, don't download this, if it's da-da-da-da-da, and then I'm using a cached version. It works on the developer's workstation and within the CI/CD pipeline, wherever we're actually working with the package managers. So it becomes a multi-layered approach.

In terms of best practices, this is kind of a tools, people, and awareness sort of deal. We need to be smart about package managers. So, an awareness campaign. Packages that are really new, like less than 30 days old: hold off, hold off. The same goes for packages that haven't been touched in years. Who's maintaining those? And then all of a sudden somebody's on there? Sounds sketchy, right? Maybe not the best selection from a usage point of view. And then, of course, use only verified sources, which we'll come back to in just a second.
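The age rules above are simple enough to automate. Here is a minimal sketch in Python, assuming you already have the release dates from your registry's metadata. The function name `review_package`, its parameters, and the two-year reading of "haven't been touched in years" are assumptions for illustration, not any particular tool's behavior.

```python
from datetime import datetime, timedelta, timezone

MIN_AGE = timedelta(days=30)         # "less than 30 days old, hold off"
MAX_IDLE = timedelta(days=2 * 365)   # assumed reading of "not touched in years"


def review_package(first_published, previous_release, latest_release, now=None):
    """Return a list of warnings for a package; an empty list means no red flags.

    All arguments are timezone-aware datetimes. `previous_release` may be
    None when the package has only ever had one release.
    """
    now = now or datetime.now(timezone.utc)
    warnings = []

    if now - first_published < MIN_AGE:
        warnings.append("package is less than 30 days old")

    if now - latest_release > MAX_IDLE:
        warnings.append("package has not been updated in years")

    # A long-dormant package that suddenly ships a release is worth a look:
    # who is maintaining it now?
    if (
        previous_release is not None
        and latest_release - previous_release > MAX_IDLE
        and now - latest_release < MIN_AGE
    ):
        warnings.append("dormant package suddenly has a new release")

    return warnings


def utc(year, month, day):
    """Small helper for building timezone-aware dates in the example."""
    return datetime(year, month, day, tzinfo=timezone.utc)


today = utc(2021, 10, 1)
print(review_package(utc(2021, 9, 20), None, utc(2021, 9, 20), now=today))
print(review_package(utc(2015, 1, 1), utc(2016, 6, 1), utc(2021, 9, 25), now=today))
```

A check like this only produces warnings. Whether a warning blocks a download or just prompts a manual review is a policy decision for your own organization.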
The internet, as we all know, is not always your friend. I tell my daughter, not with regard to open source, but in general: the internet is not your friend. Trust, but verify. Same goes here.

In terms of package review, of all of these things here, which we can go through, the number one thing is that we need to look at the packages. Packages are code. So we look at them when we're considering putting them on the approved list. We look at them when they update. We're looking for code vulnerabilities. We're looking for that malicious, targeted behavior. And, going into the ecosystem, if we can automate that with some of the tools that are out there, all the better. Of course, we're also looking out for typos or similar names. Again, the typosquatting or the brandjacking. And we haven't really talked about this, because we've been focusing on security risk, but we need to make sure that the package licenses meet our policies and our thresholds as well.

So all of this goes into awareness. It goes into what we can do, maybe from a tooling point of view. The ecosystem is really, I think, a mix. I think this is also about culture. It's great that I can tell my app dev teams, only download from verified sources. Okay, what's a verified source? What's the policy for getting something onto the verified list? Do we ever review those? So you have to have the foundation to put some of these things in place.

I'm a big fan of automating where it makes sense. If we can automate these package reviews, awesome. If I can automate some of the blocking, awesome. If I don't have anything in place to tell me what to block, or I don't have anything in place for the fixes I want to automate, don't automate. You're just going to make everybody mad and make a big mess. So only look to automate where it makes sense.
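As one example of a package review step that is easy to automate, here is a minimal sketch in Python of a similar-name check for typosquatting. It uses plain edit distance against an approved list. The package names are hypothetical, the threshold of two edits is an assumption, and real tools in this space use more signals than name similarity alone.

```python
def edit_distance(a, b):
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, or substitutions needed to turn `a` into `b`."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete a character from `a`
                curr[j - 1] + 1,     # insert a character into `a`
                prev[j - 1] + cost,  # substitute, or keep if equal
            ))
        prev = curr
    return prev[-1]


def normalize(name):
    """Treat case and '_' versus '-' as insignificant when comparing names."""
    return name.lower().replace("_", "-")


def lookalikes(candidate, approved, max_distance=2):
    """Return the approved names that `candidate` closely resembles
    without actually being one of them."""
    hits = []
    for name in approved:
        distance = edit_distance(normalize(candidate), normalize(name))
        if 0 < distance <= max_distance:
            hits.append(name)
    return hits


# Hypothetical approved list and hypothetical requested packages.
approved = ["example-trace", "acme-utils"]
print(lookalikes("example-tracer", approved))  # one extra letter at the end
print(lookalikes("exmaple-trace", approved))   # two letters swapped
print(lookalikes("example_trace", approved))   # same package, different separator
```

A hit here does not prove a package is malicious. It is a reason to send the request to a human reviewer instead of approving it automatically.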
And tying this back to the first section, the SBOM is a great tool. It will not do what it's intended to do if it's just a governance or compliance or check-the-box thing. So really look at how you can leverage it to mitigate your supply chain risk, along with some of these policies and procedures.

So, some takeaways that hopefully you got out of this session. The first is that the software supply chain encompasses everything. Everything that you did not write yourself is your software supply chain. Everything that goes into deploying that component, deploying that subsystem, deploying the business application that your customers will use. That is part of your software supply chain.

The second is that you need to know what you have. Know your ingredients. Otherwise it would be very difficult to cook anything. Just kidding. This gives you a good foundation of knowing what is where. So if, and hopefully it's if and not when, you are exposed to a supply chain attack, you know what you need to do. You can start to formulate incident responses around it. I didn't put it up on the slide, but I think it also becomes a mechanism for dealing with your suppliers. Give me your SBOM. What do you got?

And then the third thing: it's great to identify and document and know what your software supply chain is, but it can't just be about that piece. You have to have active, proactive policies on the awareness side, the culture side, and the tooling side to really protect yourselves from these targeted, malicious attacks.

Thank you very much. I went 35 minutes. The guy before me went 50-ish minutes. So I think I win because I give you more time back. Again, thank you guys very much.