My name is Zach. Thanks for coming. I'm here to tell you whether large-scale automated scanning is going to stop malware on open source software repositories. No, it's not. I saved you 40 minutes. You can go take a break before the next session. So this talk is based on some research work I've done. And it's open access, so you don't need to pay $50 to go read the PDF version. I'd encourage you to read it for a whole lot more detail than I'm going to have time to get into now. But I think it's a pretty interesting look at software repository security and how it actually shakes out in practice. So who am I to be telling you about all this stuff? I'm a research scientist at a startup called Chainguard. We work on software supply chain security, whatever that means, and that extends to include things like the security of open source package repositories. My academic background is in applied cryptography, but I'm interested in general in software repository security and package repository security, especially package signing and provenance, but also things like two-factor auth and, as you're going to see, malware detection. I'm also interested in policy for secure supply chains. That includes things like government policy, and it also includes things like machine-readable policies for deciding, when I'm running some software, whether I should trust it or not. So we're going to kick off with a little bit of background on malware detection. What do I mean by malware detection? As long as we've had malware, we've had scanners for malware. The first computer worm was something called Creeper. It was at BBN in the early days of networked computers, and it was just a little piece of software that duplicated itself. If it was present on your machine, it would just print out this message: I'm the Creeper, catch me if you can. So very innocuous as far as malware goes. But they realized at some point that this had gotten onto every machine on their network, kept duplicating itself, was showing no signs of stopping, and was starting to actually cause disk usage issues. So they made a new program called the Reaper, which just looked for the Creeper program and deleted it. And this, I think, represents two things: one, if you've got computers, people are going to do bad things with them, whether they mean badly by it or not. And two, it's actually pretty useful to have software that can go look for the software doing the bad things and get rid of it, or prevent it from entering in the first place. So how does malware detection actually work? There are two main flavors. One is what you'd call signature-based. This is what you think of when you think of antivirus. It has a big, long blocklist of software, and the simplest possible way to do this is to just have a big list of hashes. It looks at a file and says, this file belongs to this virus, don't run it. You can get a little bit more sophisticated, because obviously this is very easy to defeat: the malware can just change a byte at the end, which doesn't affect the operation of the code, and now all of a sudden it's got a brand new hash. So there's kind of an arms race in terms of normalization: the scanner will normalize the file and then try to look up the hash. But all of the signature-based stuff basically works the same way: I've seen evidence in the wild of this software behaving badly, or this software looks like a very specific kind of malware that I've seen before in the past, so I'm going to flag it as malware.
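To make that concrete, here's a minimal sketch of hash-based signature matching with a toy normalization step. This is purely illustrative, not any real antivirus engine's logic, and the blocklist digest is a placeholder:

```python
import hashlib

# Hypothetical blocklist of SHA-256 digests of known-bad files (placeholder
# value; a real engine would ship a very large set of these).
KNOWN_BAD_HASHES = {
    "0123456789abcdef" * 4,  # illustrative digest, not a real sample
}

def normalize(data: bytes) -> bytes:
    """Toy normalization: strip trailing null bytes, so appending junk to the
    end of a file doesn't give it a brand new hash."""
    return data.rstrip(b"\x00")

def is_known_malware(path: str) -> bool:
    """Return True if the normalized file matches a known-bad digest."""
    with open(path, "rb") as f:
        digest = hashlib.sha256(normalize(f.read())).hexdigest()
    return digest in KNOWN_BAD_HASHES
```

Real engines use far smarter normalization and fuzzy matching, but the basic shape, normalize and look up against a blocklist, is the same.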
So it doesn't at all generalize to new samples. It's never going to work against the first instance of a piece of malware that you see. In contrast, behavior-based or anomaly-based detection, and those two terms get used interchangeably, tries to figure out what the software does. So these malware detection methods do generalize to new pieces of malware. That doesn't mean you have to actually run the software and wait for it to do something bad, like send a credit card number to some foreign server. You can, right? You can do runtime monitoring, you can run things in sandboxes, and you can also do static analysis and just look at the file itself or the source itself. For the most part, we're going to be talking about the latter category today, because while signature-based malware detection does have a lot of uses in the package repository setting, we're really concerned not about exact copies of malware we've seen before, but about new malware coming onto a package repository. So these are the flavors of behavior-based malware detection. There's static analysis, which looks at the software itself. In Python, you can look for things like an eval of a base64 decode of some block of data. That's very suspicious; it's very rare that you would do that in normal, genuine code unless you were trying to obfuscate or hide what you were doing. You can look for things like, not to pick on Russia, URLs in your software library that end in .ru, or, in a binary, you can just run strings on it and look for .ru. You can look for execution of code that's loaded remotely, all sorts of stuff. And the heuristics can range from very, very simple to very, very complicated ones that try to parse the AST of the program or decompile something and figure out what it does. Static analysis, I would say, is in general much easier to do in interpreted languages, because you don't have to do that decompilation step. So we're going to talk a lot today about how this works in Python. It can definitely miss things. The flip side of that is dynamic analysis. This is where you actually run the software and see what it does. This is obviously a little dangerous, right? If you're running a potentially malicious piece of code, it could do malicious things, so in general you want to be doing this in kind of a sandbox environment. It can also miss things, and in particular, dynamic analysis can be detected by malware. It's very easy to do something different when you notice you're running in a sandbox than what you would do under normal circumstances. So you can have a flag that says, if it looks like I'm being scanned by a malware detector, don't be evil; otherwise, be evil. And here's a cartoon of, I don't know if you remember the Volkswagen emissions testing scandal a couple of years ago, but that's exactly what they did: they had the car figure out when it was being emissions tested, and when it was, it emitted less. So how does all this stuff work? There's a pretty rich academic literature on malware detection; as you can see, there are hundreds of thousands of academic papers on it. The techniques that get used range from regular expressions to looking for patterns in the abstract syntax tree of the software. They decompile things. They sandbox. There are antiviruses, which are more often the signature-based methods that do hash or file-based matching, and as part of that, they often normalize the software that they're checking.
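As an illustration of the kind of static heuristic described above, here's a minimal sketch of my own, not one of the real PyPI or research checks, that walks a Python AST looking for eval() wrapped around base64.b64decode() and greps the source for .ru URLs:

```python
import ast
import re

SUSPICIOUS_URL = re.compile(r"https?://[^\s\"']+\.ru\b")

def flag_suspicious(source: str) -> list:
    """Very simple static heuristics; real tools use many more rules."""
    alerts = []
    if SUSPICIOUS_URL.search(source):
        alerts.append("URL ending in .ru")
    for node in ast.walk(ast.parse(source)):
        # Look for eval(base64.b64decode(...)), a common obfuscation pattern.
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == "eval"
                and node.args
                and isinstance(node.args[0], ast.Call)
                and isinstance(node.args[0].func, ast.Attribute)
                and node.args[0].func.attr == "b64decode"):
            alerts.append("eval of base64-decoded data")
    return alerts

print(flag_suspicious("import base64\neval(base64.b64decode('cHJpbnQoMSk='))"))
```

Real detection rules are much richer than this, but the shape is the same: a pattern over the source or its AST, plus a label for the alert.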
You can look at metadata about the software: filenames, where did it come from, is there a signature on it, who is that signature from? And then I'm just going to hand-wave over deep learning and AI, but there are a number of methods that throw a neural net at a piece of software and see if that neural net says it's good or bad. And then, obviously, you can combine these in arbitrary ways. Cool. So that's what malware scanning tends to look like. What's a package repository, or what's an open source software repository? These are usually tied to operating system or language ecosystems. So you have things like apt or Homebrew or Portage. In some sense, you can consider the Apple App Store or the Google Play Store to be a package repository as well, though that's not the focus of our talk today. There are language ecosystem repositories, like PyPI, which is going to be something we talk a lot about today, and things like npm. And then there are things that are in between. Conda is a repository that's focused on data science and machine learning applications, so it's mostly Python, but it blends: it'll install anything from Fortran to CUDA code to whatever it needs. There are things like Nix, which will install, again, pretty much anything. So there are all sorts of these things. And I'm being a little bit sloppy here. If I'm being precise, we should be distinguishing between package managers, which are the thing you're running on your machine to install the software, and the repositories, which are the remote service where the software itself is living. When we're talking about malware scanning, we're usually talking about doing that on the repository itself, either to detect malware that's been uploaded after the fact, or to prevent malware from even winding up there in the first place. And one big distinction we're going to make is between curated and community-based repositories. A curated repository, something like the Debian apt repository, will have a small number of trusted maintainers. These are often members of an organization or employees of a company, and so often your threat model says, let's really, really hope that none of the people with access to publish packages to the apt repository are evil, and if they are, we're kind of in a bunch of trouble anyway. So malware scanning is not necessarily the solution there. Community repositories, in contrast, allow quote unquote anybody to submit. If I wanted, I could close this tab, open a new one, make a new account on PyPI, and have software uploaded inside of, I don't know, 10 minutes for my Hello World Python package. And that's actually really great. We're all at the Open Source Summit because we think open source is great. Democratic software development is awesome. Everybody gains when we have this big software commons that everyone contributes to. But it does come with risks, and in particular, we've seen a number of instances where there are malicious packages on these software repositories. I could have kept going, but I was starting to run out of room on the slide. There are all sorts of attacks that you see, and if anyone's curious, come up after the talk and I can point you at a cool data set that just lists compromise after compromise after compromise, or at least attempted compromise, attack after attack after attack, on these kinds of repositories.
So it's at this point that folks who sit at home and comment on websites with orange banners tend to ask questions like, why don't we just scan for malware? This will solve all the problems. There's this rich literature on malware detection, and so obviously, if we do that, we can just have a little program that says, if is_malware, don't let people upload it, and problem solved. And so we actually started this project thinking, yeah, we hear people say this all the time, no one's really doing this in practice, what's the reason for that disconnect? Our assumption is that the people who run these repositories are pretty smart and pretty dedicated to maintaining the security of these repositories, so if they're not doing it, there's probably a good reason. And maybe the reason is just the gap between the performance that we see and the needs that we have. So let's go off, download a bunch of these tools, benchmark them, and see how well they do. That was the project. But at some point, we decided to stop before we just collected some data, went to these repository operators, and said, hey, we know better than you about how you should be running these things. Instead, we decided that before we sit down and try to benchmark, we need to know what we're looking for. What sort of properties? What are the compute resources available? Is it OK if we come up with a good malware detector that takes hours and hours and hours to run on every individual package? Are we OK with missing malware occasionally? Are we OK with occasionally flagging things that are good as malware? These are the sort of questions that I would rather answer ahead of time than after the fact, after we've gone through a lot of work, to make the case for why we should or shouldn't be doing this. Yeah. So this slide is basically what I was just saying. In general, I like to know what I'm looking for before I collect a bunch of data, and that's exactly what we did here. And in fact, we weren't the first people to ever propose doing this. We weren't even the first people to go out of our way to try to put this in place on a repository. In particular, on PyPI, the Python Package Index, which is run by the Python Packaging Authority, which is part of the Python Software Foundation, around 2019 they started putting together the groundwork for a big project that did automated detection of malware. And this got merged, as you can see on the left, in February 2020. And then in May of this year, it got removed. So what's the reason for that journey? We wanted to start with so-called user research on PyPI. Can I just sit down and talk to people who are involved with this and ask, what were you trying to accomplish? Why is this something that never really wound up getting turned on in prod? And why did it ultimately get removed? And then I also wanted to talk to someone who is an academic, who spends their time working on the malware detectors, and ask, what are you trying to build? What are you looking for when you're building these tools in order to make them useful in such a context? And this is going to let us identify the requirements that we have when we're deploying these things. And then as part of this, we can identify other priorities.
I don't want to come in and say, hey, automated malware detection is the solution to all your problems, when maybe there's a bad password policy and everyone's password has three characters in it. So: what are the security priorities that you, as an administrator of PyPI, are dealing with, and where does malware detection fall in that list? So what did we actually find? The first thing that's kind of an interesting point is, does malware on these repositories actually matter? And I'm picking a little bit on Phylum here, which is a company that does security research. To be clear, I think this is good and important research, and I'm glad that they're out there finding attacks on PyPI. But they made this flashy blog post saying, oh, we found an ongoing attack, someone's uploading malware to PyPI, and you're all going to get hacked. And then if you look at the names of the packages being uploaded, it's things like "libinfo hacked". If you pip install that, you kind of deserve what's coming to you. And in general, I think malware is a relative concept. If you think about penetration testing tools, they do things that look a lot like what malware would do. So is that bad? Well, it depends on your context, and it depends on your expectations as a user. And so I don't, in fact, think that there is one judgment that even a human who sat down and studied this for years could make that says, this software is definitely good, or this software is definitely bad. There is no such thing; it's all relative. Furthermore, for malware to affect someone, it needs to actually get run. There's the saying, if a tree falls in a forest and no one's around to hear it, does it make a sound? Physicists tell me the answer is yes, but the point of the parable is that we don't care. And I feel the same way about malware. If someone uploads some malware to PyPI and then no one downloads it and no one runs it, it's not a big deal. And in fact, downloads on PyPI are one of those things that follow a power law distribution: there's a handful of packages that pretty much everyone downloads, really popular things like requests, asyncio, and so on, and then there's a big long tail. PyPI has about half a million packages on it right now, and there's this big long tail of packages that got one release, got downloaded maybe once, probably by the author of the package, and then never again. PyPI is actually really awesome in that they commit to making a lot of data open and freely available. They have a big data set that you can query about downloads of different packages, and you can see what the statistics are. Every package does get some downloads, but we think, and this is kind of confirmed by talking to the administrators, that the bulk of these downloads for most packages are coming from mirrors that are just scraping the whole thing. Every time something gets published, if I have a mirror and I'm collecting packages to host internally at my organization or for research purposes, I'll just download that right away. So we do see downloads of every Python package, but we're still pretty sure that for most packages, no one ever runs them.
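That download data lives in a public BigQuery dataset. As a rough sketch of the kind of query you could run to see how skewed the distribution is, here's one way to poke at it from Python; the dataset name and columns reflect the public bigquery-public-data.pypi.file_downloads table as I understand it, so treat the schema as an assumption and check PyPI's documentation before relying on it:

```python
# Sketch only: assumes BigQuery access and the public PyPI download-log
# dataset `bigquery-public-data.pypi.file_downloads` (schema may change).
from google.cloud import bigquery

client = bigquery.Client()
sql = """
    SELECT file.project AS project, COUNT(*) AS downloads
    FROM `bigquery-public-data.pypi.file_downloads`
    WHERE timestamp > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
    GROUP BY project
    ORDER BY downloads DESC
    LIMIT 20
"""
for row in client.query(sql).result():
    print(row.project, row.downloads)
```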
So, all things equal, we'd prefer to remove these things. If something is a piece of software that, when you run it, scans your hard disk for things that look like credit card numbers and ships them off to some remote server, we don't need to be hosting that on PyPI, and we'd like to get rid of it. But we care much more about specific cases of malware. One of these is typosquatting: does the package have a name that someone would plausibly install by accident? So, for example, requests is a popular package, and maybe if you spell requests with an extra S or something, someone's going to grab that package name and it'll get installed by accident. You also worry about compromises of existing packages: if someone hacks the account of someone who publishes one of these really, really popular packages, we worry a lot about malware there too. But again, something like "self-hacked CVNVIDIA", maybe that's not quite as big a deal. The other thing we found in talking to them is that the constraints are different than you might think. One thing I was worrying about was, what if PyPI is going to run these malware detection scripts and they take, like, four minutes to run? That feels like a lot of compute, a lot of dollars to burn on AWS or whatever. And it turns out most of these repositories don't really care, because as open source projects, while they may be incredibly understaffed, they don't suffer for want of cloud resources. It's the easiest thing in the world for these cloud providers to say, we're not going to give you any volunteer time to help make the repository better, but we can throw some credits your way. So if PyPI comes along and says, hey, we're very desperate and we need to run these malware scanners, it's pretty easy for GCP, AWS, or Azure to say, sure, have some credits with which to do that. So we're not actually that constrained on compute. I mean, to a point, right? If we were running some fancy deep learning thing, like these LLMs you hear about that cost $14 in compute per query or whatever, that's probably not going to fly, but something that's even pretty computationally intensive isn't actually a huge deal. The latency is also a little bit flexible. What I mean by that is, if you're going to block publication of a package, you have at most a second or so. When I type npm publish, if that goes through a malware scanner before it's actually available on npm, the expectation is that I'll be able to download it pretty much immediately after publication. This is just what users have come to expect. You wind up in situations where you're trying to redeploy something and the thing you're redeploying pulls from PyPI or npm, and you're making an emergency fix, so you want to be able to turn that around pretty quickly and not see something like, oh, your package has been held, it's going to be another couple hours until someone can manually review it, blah, blah, blah. So blocking at publication time feels, in general, like a non-starter. But if we say we're not so concerned about preventing these things from winding up there in the first place, we just really want to catch them after the fact or pretty soon after, then no big deal; it's OK if the latency is a little bit longer. But engineering resources are quite limited. If you say, hey, we have something for you to implement, all you have to do is read this academic paper, transcribe this awful pseudocode algorithm into a real programming language, debug it, put tests in for it, and then deploy it, that's actually a problem. So anything that we want to roll out has to be pretty simple.
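In that spirit of keeping things simple, the typosquatting case mentioned above is one where a cheap check gets you a long way. Here's a minimal sketch of my own, not PyPI's actual logic; the list of popular names and the similarity threshold are made up for illustration:

```python
from difflib import SequenceMatcher

# Hypothetical shortlist of very popular package names; a real check would
# pull the top few thousand packages by download count.
POPULAR = ["requests", "numpy", "urllib3", "setuptools", "pip"]

def possible_typosquat(name: str, threshold: float = 0.85) -> list:
    """Return popular packages the candidate name is suspiciously close to."""
    name = name.lower()
    return [
        pkg for pkg in POPULAR
        if pkg != name and SequenceMatcher(None, name, pkg).ratio() >= threshold
    ]

print(possible_typosquat("requestss"))   # -> ['requests']
print(possible_typosquat("flask"))       # -> []
```

A real deployment would probably use a smarter distance metric and account for things like character substitutions, but the basic idea really is this simple.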
And also there are limited admin resources. The admins on these repositories tend to be quite technical, but they have pretty limited time. In particular, PyPI is quote unquote volunteer run. For most people who are admins on PyPI, it's not their day job; they have something else that they're paid to do and something else they're supposed to be doing during business hours. And if a report comes in and needs to be dealt with as an emergency, some of them are in positions where they can take 20 minutes during the day to go deal with it, but others kind of have to wait for five o'clock and quitting time before they're able to do that. So in particular, what this means is that false positives, where a proposed malware scanner looks at some legitimate software and says, hey, this is suspicious, why don't you, the human, review it, are kind of an issue. And so you can see on the right my malware scanner that has no false negatives: it's going to catch every piece of malware. The problem with this, of course, is that it causes a lot of noise. So when you're thinking about false positives, you want to think about the base rate. What do I mean by that? Every year, there are about a million package updates that happen on PyPI. There are half a million packages, but most packages get updates, new versions, and so on, and each one of these is something that we're going to want to scan for malware. There are about 10,000 packages that get removed by the administrators every year. Some of these are malware. Actually, a lot of them wind up being spam and things that are not going to cause any problems, but are just, say, the full upload of a feature film that someone put in .whl format for some reason. If you look at the newly published packages on PyPI, this is actually quite common. I don't really know why, other than I guess it's free hosting. So suppose I assume an unrealistically good malware detector, and I assure you that this is unrealistically good, and we'll go into some benchmarks to compare it with soon: it has 100% sensitivity, so it finds every package that we want to take down, and it's got a 1% false positive rate, which means that only one out of every 100 good packages is going to get flagged. These are really, really good numbers compared to what we see in the literature. Even so, most of the reports are going to be false positives. And that's maybe a little bit counterintuitive. You say, oh, it catches everything bad and only flags 1% of the good things. But you have to realize there are many more good packages than bad ones. So we wind up, basically, with a majority of reports being false positives, which means a lot of wasted admin time. And the worst part of this is that the things most likely to get reported as false positives got reported because of some heuristic, which means that there is some ambiguity as to whether they're good or bad. It's not the cases that someone can look at and say immediately, oh, that looks like malware, or that looks legitimate. It's the edge cases that humans are going to wind up having to spend all their time on. And this is the big reason why the malware checks in PyPI got removed: they were creating far too much noise and far too many false positives.
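To put numbers on that base-rate argument, here's the back-of-the-envelope calculation as a tiny script. The count of truly malicious updates is a made-up assumption purely to illustrate the shape of the problem:

```python
# Back-of-the-envelope base-rate calculation (illustrative numbers only).
updates_per_year = 1_000_000      # roughly the volume of PyPI releases discussed above
truly_bad = 1_000                 # assumption: even if 1,000 of those are malicious
sensitivity = 1.00                # the hypothetical detector catches every bad package
false_positive_rate = 0.01        # and flags 1 in 100 good packages

true_positives = truly_bad * sensitivity
false_positives = (updates_per_year - truly_bad) * false_positive_rate

print(f"true positives:  {true_positives:,.0f}")
print(f"false positives: {false_positives:,.0f}")
print(f"share of reports that are false: "
      f"{false_positives / (true_positives + false_positives):.0%}")
```

Even with these generous assumptions, roughly nine out of ten reports landing on an admin's desk would be false positives.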
The other really, really interesting thing that I learned is that there is a malware detection system in place. It just involves humans; it's not fully automated. In this system, what happens is that independent researchers run their own tools. Those tools produce a bunch of false positives, and the human who ran the tool will go ahead and look, check those, and filter the false positives out. Then they just email the PyPI admins, the PyPI admins do a quick confirmation to make sure they're not taking down something legitimate, and then the packages get taken down. It's kind of surprising that this is the solution that's emerged, because it's not like these researchers are employed by the Python Software Foundation or anything. But it turns out they get something out of it, which is, I guess, fame and glory in a very limited, nerdy sense. For instance, that blog post from Phylum that I showed a picture of earlier is one instance of this. Security companies often really like to say, hey, we found all this malware, we're keeping you safe with our products, and it seems like a worthwhile investment for them. And it's great, because then the PSF doesn't have to employ teams and teams and teams of vulnerability researchers to do this, and they don't have to employ tons and tons of admins to scan through false positive flags all day. And these researchers now are really incentivized to have good tools that they keep updating, because malware authors are pretty crafty. If you knew the total set of automated malware checks that were going to run on PyPI, it's a pretty simple matter to go run those yourself against your packages before you upload them, so you can make sure nothing fires, and still snare people. You can do things like keep the checks secret, but that feels very contrary to the open spirit of the open source world we work in. So with that out of the way, I'm going to spend relatively little time on the quote unquote meat of this, just because I think, as you've seen, it's a lot less interesting now that we know what we're actually looking for. But we did do some benchmarking, and it's at least instructive to consider the gap between where we're at and what we need. So what did we use as a data set? There are a couple of previous projects that collect malware that has been on open source repositories. One's called MalOSS, one's called the Backstabber's Knife Collection. These are previous academic works where they went and collected, from various package repositories, a bunch of malware. You can't just go download these, because that would make it a little bit too easy for someone with malicious intent to have a big repository of bad things to pull from and re-upload. But they are available on request, so if you just say, hey, I am so-and-so, I have a research interest in this problem, they'll go ahead and send it to you. And then we needed a data set of goodware, which is non-malicious packages. Actually, it's pretty easy to get this: the data set of goodware is just packages that are on PyPI, because we assume that anything that hasn't been taken down is benign. In theory, there could be malicious things in there that haven't been detected, but those are the things that no existing malware tools have caught, so our tools are obviously not going to catch them either. So this sort of upper-bounds how many of the bad packages we know we're catching, but it's better to catch some than none at all. So we took a random set of packages and a set of popular packages, because there are, again, half a million of these things.
And actually, these malware scanners tend to take quite a while to run, so we only grabbed 1,000 or so to make it computationally tractable, and then we de-duplicated the set. OK, so what tools did we use? We started out thinking we were going to have a dozen of these things to run, but it turns out we had three requirements. One is that the source is available, and actually, a lot of the proprietary tools that companies offer, because they're proprietary, the source is not available, so we can't run them ourselves. We were also looking more for anomaly-based detection, which most of these tools do anyway. And then finally, there's an actually really important point: a lot of malware detection tools are kind of like engines. They say, we're going to provide you a system that looks at a source file, runs a regex on it, and then reports whether that regex matches or not. Or, we're going to parse a Python AST and let you make some query about whether it contains code with this structure, a function call with this name, or whatever. Which is great, and it's a great start, but that doesn't actually let you detect malware. The meat of that is what you'd call a detection rule, which is the regular expression itself or the AST query itself, the thing that says, I'm looking for an eval of base64 of something. And a lot of these analysis tools don't make their detection rules available, which makes it actually very hard to benchmark them. We could write our own, obviously, but then what we'd be benchmarking is not the engine, it's the bad rules that I just came up with. So we narrowed down to three of these tools: the checks that were supposed to run on PyPI but never actually got deployed, a tool called OSS Detect Backdoor, and a tool called Bandit4Mal. So, OK, what's our setup? We take our packages, we get our data set, we get the latest release of each of these things, and we run each tool and count the alerts. And what did we find? These are the packages with at least one alert when we ran the scanners on them. We ran in two settings. One is that we just ran on the setup.py file, which covers things that can compromise you at install time: back when Python packaging was designed, we didn't know better than to let the package you're installing run arbitrary code. A lot of newer packaging systems say, OK, cool, when we install a package we're just going to put files in place. But Python lets you run whatever you want, including network access, including remote execution, whatever. And so if you look, OK, we're finding most of the malicious packages, which sounds pretty good; it's a majority in pretty much every case. But then you look, and we're also flagging a majority of the benign packages, too. I'll draw your attention, in particular, to the PyPI checks that were proposed. If you just look at the popular packages and you scan all the files, not just the setup.py files, 94% of packages got flagged as potentially malicious, which, I'll point out, is a higher rate than for the actual malicious packages. So maybe we're not being fair. These tools aren't issuing verdicts, they're just issuing alerts. They're saying, hey, we noticed something matching this pattern. And the table I just showed you counted packages where at least one alert fired.
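For a sense of what that setup looks like mechanically, here's a rough sketch of the benchmarking loop. The scan function is a stand-in for whichever tool is under test and the package list isn't shown; the PyPI JSON API endpoint is real, but treat the rest as illustrative scaffolding rather than our actual harness:

```python
# Rough sketch: fetch a package's latest sdist from PyPI and count how many
# alerts a scanner raises on it. `scan(path) -> list` wraps the tool under test.
import json
import tarfile
import tempfile
import urllib.request
from pathlib import Path

def latest_sdist_url(package: str) -> str:
    """Look up the newest source distribution via PyPI's JSON API."""
    with urllib.request.urlopen(f"https://pypi.org/pypi/{package}/json") as resp:
        info = json.load(resp)
    return next(f["url"] for f in info["urls"] if f["packagetype"] == "sdist")

def count_alerts(package: str, scan, setup_only: bool = False) -> int:
    with tempfile.TemporaryDirectory() as tmp:
        archive, _ = urllib.request.urlretrieve(latest_sdist_url(package))
        # Note: extracting and scanning untrusted archives should itself
        # happen in a sandboxed environment.
        with tarfile.open(archive) as tf:
            tf.extractall(tmp)
        pattern = "setup.py" if setup_only else "*.py"
        return sum(len(scan(str(p))) for p in Path(tmp).rglob(pattern))
```

The setup_only flag mirrors the two settings mentioned above: scanning just setup.py (install-time risk) versus scanning every Python file in the release.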
And so alerts can be innocuous, and maybe only certain combinations of alerts are things that we should worry about. You could imagine coming up with something really sophisticated for looking at sets of alerts and turning that into a binary good-or-bad analysis. We didn't want to do that. So instead we just said, well, what if we thresholded the quantity of alerts? And again, these are the things you would need to have in place for these tools to be deployable. You can't just hand a regex matching engine to PyPI and say, hey, if you wrote really good regexes to look for malware, you'd be able to find it, right? You need to actually have a prescription here. So we looked at setting various thresholds. These are charts that show what happens: the first table was at least one alert, and this is what happens as we go to one, two, three, four, five. What you wind up seeing is that as the threshold of alerts goes up, the number of false positives goes down, but so does the number of true positives. Pretty soon you're catching so few, and often, if you look at chart D, you'll find that the number of true positives goes down substantially faster than the number of false positives, and pretty soon we can no longer distinguish between the malicious and the benign packages. I'll also point out, in general, that people really love to throw scores at things when scanning for malware, but the score of actual malware just needs to be anything non-zero: it takes only one line of code to turn a package that's totally innocuous into something that does something evil. So I don't really love this idea of thresholds, but we figured we'd give it a shot.
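Concretely, the threshold sweep amounts to something like the following; the per-package alert counts here are hypothetical stand-ins for what the scanners produce:

```python
# Sweep an alert-count threshold and see how detection and false-positive
# rates move together (alert counts below are hypothetical inputs).
def sweep(malicious_counts, benign_counts, thresholds=range(1, 6)):
    for t in thresholds:
        tpr = sum(c >= t for c in malicious_counts) / len(malicious_counts)
        fpr = sum(c >= t for c in benign_counts) / len(benign_counts)
        print(f"threshold >= {t}: catch {tpr:.0%} of malware, flag {fpr:.0%} of benign")

sweep(malicious_counts=[1, 2, 3, 0, 5, 1], benign_counts=[0, 1, 4, 0, 2, 0, 0, 1])
```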
So those are our empirical results, and if you want to dig a little bit deeper, again, I'll refer you back to the paper, or I can take questions in a second. But yeah, just wrapping up: we're a long way away from having in-the-loop automated malware detection. Other security measures should probably take precedence, and in fact, if you look at the PSF blog and the work that PyPI is doing, they are taking precedence. PyPI rolled out a two-factor auth requirement recently and is ramping that up. And for the malware case in particular, a pretty reasonable alternative that requires some manual work has evolved. It could be better, and there are a lot of places for it to be better, but not necessarily in the tooling. I didn't get into it so much, but the worry is not that the malware detection tools the independent researchers are running aren't good enough; there are plenty of people working on that problem. The worry is the interface that the admins have to deal with when they're taking down packages. Right now they often have to download the package, unzip the wheel or whatever the format is, inspect the files themselves in the console, and kind of look at what's been reported. You could imagine a better web application that exposes, hey, a researcher flagged this, here's the line of code that they're pointing to as evidence, almost like a dating app interface on your phone: swipe left, swipe right. And there's a bunch of stuff like that. And again, if you just talk to the Python folks, they will tell you; they actually maintain a really great list, they call it something like fundables.markdown, which is just things that they really want done but don't have the time to do themselves, that they've identified as high-impact security improvements. That's a really great way for researchers to listen to the folks who are actually on the ground, rather than hanging out in the ivory tower doing research and trying to impose a solution on them. And I will say, if you are a researcher, this is a really great opportunity, because it makes writing the introduction of your paper really easy: instead of trying to justify some convoluted thing that you did, you can say, we solved a real problem, and here's really good, hard evidence as to why the work we're doing is impactful. And then I'll also give a shout-out to the OpenSSF, which is part of the Linux Foundation. They had a whole day of talks on Monday, which I assume will be available online pretty soon. The OpenSSF runs a Securing Software Repositories working group, which is really great because it brings together folks who work on these package repositories, folks from industry, and researchers as well, and it helps coordinate and fund cross-cutting efforts. Because you'll notice that a lot of the stuff I mentioned that has actually been effective, like two-factor auth, none of it is super unique to Python itself or to npm itself, right? They can all benefit from implementing a lot of the same tools and solutions. Yeah, so that's the talk. Happy to take any questions? Billy? Yes, so that's a great question. Some of these tools' alerts do come with severities. We looked at different decision rules, and then we thought about whether we should try to have some statistical algorithm learn a decision rule. And then we got about three steps into that and realized that at that point, we weren't evaluating existing tools, we were making our own new tool. And so, again, I think that points to a need, if you're trying to design one of these tools, to make that granular alert-level information available: oh, we noticed this, it's a severe alert; oh, we noticed this, it's just a little suspicious. Make that information available so that if a human is going to review it after the fact, they have it to help make a decision. But you also really need to make a binary, good-or-bad verdict available. And that can even be tunable, right? You can say, I don't know what your data set looks like, you're going to need to mess with this parameter, and maybe that's the threshold or maybe it's something else. But I think that's the job of people proposing these tools, which is, I guess, my cop-out way of saying I didn't want to do all the work of trying to figure it out. But it did feel like, at that point, we weren't evaluating the tools; we were evaluating our own new tool that we were making up on the fly. I'm not trying to keep anyone here, so we can just call it. Thanks everyone.