We're good to go. All right. Welcome to the session on metrics, models, and software to identify your most significant at-risk dependencies in a portfolio. When I say portfolio, I'm anticipating that you have some collection of software projects contained in repositories that you're responsible for or that affect your organization. Maybe you're in an open source program office or managing a community, and you want to keep an eye on some collection of projects and get a general sense of how current the dependencies are within the collection of stuff you care about. So there's a whole family of things under the category of what I'd call risk, what the CHAOSS project calls risk, and anything that's critical infrastructure that you're especially concerned about. What your critical infrastructure is might be different than my critical infrastructure. If you're running a real-time operating system, you actually have a safety-critical system, and those things are going to be managed much more closely than if you're running a series of web apps or something. So it's really: what is the business that you're in, and what are the risks that you're going to encounter? Essentially, you want to have a knowledge of these general spaces. And when it comes to critical infrastructure, the things that relate to security and the things that you consume or have in production, those software packages are the ones we're going to talk about when we talk about dependencies. There are four types of dependencies in our way of thinking, and I'll explain each of them. There's a direct dependency, a transitive dependency, an interdependency (I'll get to that), and circular dependencies. So what does each of those look like? This is a direct dependency. How many people are aware of what their direct dependencies are, or could at least get a list of them? Most of you, OK. So this is pretty straightforward. In the CHAOSS project, we call these upstream dependencies.
These are projects or libraries that your project or software portfolio depends on. There can be, obviously, multiple, and there often are. And the numbers are often growing. Transitive dependencies: who wants to take a stab, based on this visualization, at describing what a transitive dependency is? A dependency of your dependency. So you can keep going with your dependencies pretty far down. And one of the choices that you have to make is whether or not you're going to track or manage the dependencies of your dependencies. Now, what's a case for not managing the dependencies of your dependencies? What would be the reason not to try to manage that or be aware of that? Sounds like a lot of work. It's a lot of work. I like this guy. All right. What other reasons? Yep, if you're not redistributing them. Another case that comes to mind, in addition to those, is if I know what I'm dependent on and I know what the current versions are. To some extent, if I trust the package manager and I trust the developer of whatever I'm dependent on, then I'm going to know that the version I'm at isn't considered to be in a risk area somewhere. It doesn't have a security advisory out about it. And if it does, then all of the dependencies for that library are also suspect. Now, if I'm really wanting to dig in, I can track transitive dependencies as well. Here's where we get into sort of interdependent dependencies, where my project could depend on all three of these libraries, and they could actually have dependencies on each other. If you've ever installed a Node program or a Python program, you know these things are real. And I'm actually pretty impressed with how pip has made things a lot easier to see. Like, if I've got the wrong versions of something, it tells me now, and it didn't used to. Now we get into some of the really crazy dependencies, as I call them. Your project can be dependent on these three things.
And then they could each be dependent on each other in kind of a circular way, where everything's dependent on everything else. And this can be kind of messy. I would put this in the category of things that you have to decide if you want to track or not. And you may not want to track them, or you may. So when I think of tracking dependencies, I want to understand my dependency risk at a portfolio level. And I have choices all along the way about how I'm going to understand that. One is to examine every project. You could actually take every project in your portfolio and do an analysis with your brain, looking through the dependencies of each of those projects to determine if there are dependencies that you want to update or need to be concerned about. That's a fairly manual process. How many people here have portfolios of over 100 repositories in their universe? 1,000? OK, you might be able to do an examination of every project if you have 100. But once you get somewhere between 100 and 1,000, you need to find some mechanism for identifying your greatest dependency risks. So step one is you have to enumerate all of your dependencies somewhere so that you can understand what they are and track how they change. And the next thing you need to do is try to come up with a heuristic for what your greatest risks are, and we have a CHAOSS metric for that, which I'll tell you about in a minute. If you're concerned about some small set in that 100 space, then again, you can do this very manually without tools. But if you're concerned about a large portfolio in the thousands, then you need some kind of way to understand things at the portfolio level. Your critical success factors are increased awareness of the dependencies in your open source software, so that's the list, and then awareness of where your greatest risks are.
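As a rough illustration of the dependency types described above (direct, transitive, and circular), here is a minimal Python sketch. The package names and the `DEPS` map are invented for the example, and this is not how any particular CHAOSS tool does it; it only shows that once dependencies are enumerated as a graph, the categories fall out of simple graph traversal.

```python
from typing import Dict, List, Set

# Hypothetical dependency map: each key depends on the packages in its list.
DEPS: Dict[str, List[str]] = {
    "my-project": ["lib-a", "lib-b", "lib-c"],  # direct dependencies
    "lib-a": ["lib-b", "lib-d"],                # libraries depending on each other
    "lib-b": ["lib-c"],
    "lib-c": ["lib-a"],                         # closes a cycle: a -> b -> c -> a
    "lib-d": [],
}


def direct_dependencies(graph: Dict[str, List[str]], project: str) -> Set[str]:
    """Packages the project names explicitly."""
    return set(graph.get(project, []))


def transitive_dependencies(graph: Dict[str, List[str]], project: str) -> Set[str]:
    """Packages reachable only through other dependencies."""
    seen: Set[str] = set()
    stack = list(graph.get(project, []))
    while stack:
        pkg = stack.pop()
        if pkg in seen:
            continue
        seen.add(pkg)
        stack.extend(graph.get(pkg, []))
    return seen - direct_dependencies(graph, project) - {project}


def has_cycle(graph: Dict[str, List[str]], start: str) -> bool:
    """Depth-first search: True if any cycle is reachable from start."""
    visiting: Set[str] = set()
    done: Set[str] = set()

    def visit(node: str) -> bool:
        if node in done:
            return False
        if node in visiting:
            return True  # came back to a node still on the current path
        visiting.add(node)
        found = any(visit(nxt) for nxt in graph.get(node, []))
        visiting.discard(node)
        done.add(node)
        return found

    return visit(start)


print(direct_dependencies(DEPS, "my-project"))
print(transitive_dependencies(DEPS, "my-project"))
print(has_cycle(DEPS, "my-project"))
```

Whether it is worth tracking the transitive and circular cases at all is the judgment call discussed in the talk; the sketch only shows that the information is cheap to derive once the enumeration exists.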
So what information do you need to make an estimate of what your greatest risks are? You need the list of libraries or things that you're dependent on. What else do you need to know about them? Need a way to sort them. Sorry? Need a way to sort them. Need a way to sort them, right. There's another piece of metadata about your dependencies that I'm looking for, that you would need to be aware of, and that would be the version of the dependency that you're using, right? That version might tell you a little bit more than just the dependency itself about whether or not you have some vulnerabilities or risks to be concerned about. So in the first case, this is a CHAOSS tool called Augur that essentially scans any repository and enumerates everything that you are dependent on, along with a count of how many different files have that dependency in them. So this is just the straight-up enumeration. This is the awareness of what all my dependencies are in a piece of software. It gathers this information for every repository in a collection set, and it keeps track of all of the changes over time. So you can calculate whether, for example, your average number of dependencies is going up or going down on a project-by-project basis. It can create some elementary, rudimentary awareness. And then we have a metric in CHAOSS called Libyear. Libyear is exactly what you would think it is. It is the age of the dependency compared with the most recently released version. Not compared with today's date, but with the most recently released version of the software. So in this case, for example, October 21, 2021 is the most recent release for this piece of software here, and the release used in my software is from April. So I have a Libyear number of 0.53-something years, about half a year old.
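The Libyear arithmetic just described can be sketched in a few lines of Python. The latest release date (October 21, 2021) is from the talk; the talk gives only "April" for the version in use, so the exact day below (April 11, 2021) is an assumption chosen to land near the quoted 0.53, and the 365.25-day year is likewise an illustrative choice rather than something stated in the talk.

```python
from datetime import date

DAYS_PER_YEAR = 365.25  # assumption: average year length, including leap years


def libyear(used_release: date, latest_release: date) -> float:
    """Age in years of the release in use, measured against the newest release.

    The comparison is with the latest release date, not with today's date.
    A version newer than the known "latest" is treated as fully current.
    """
    delta_days = (latest_release - used_release).days
    return max(delta_days, 0) / DAYS_PER_YEAR


latest = date(2021, 10, 21)  # most recent release, as stated in the talk
in_use = date(2021, 4, 11)   # assumed day in April; the talk only says "April"

print(round(libyear(in_use, latest), 2))  # 0.53
```

Because the metric only needs two release dates per dependency, it can be computed without touching a vulnerability database, which is what makes it usable as a quick estimate.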
Now, if I'm thinking about dependencies and I'm trying to swag the vulnerabilities, right? Like, I'm not trying to tie this into a critical vulnerabilities database, partly because those databases are not neat and friendly and easy to integrate. They can be. But if you just want to get a swag without getting into CVE management, knowing that a dependency was pretty recently updated is kind of a clue that you're more likely to be all right than if it wasn't. Does that make sense? OK. So again, I'm thinking about the portfolio level. And here I had the same thing. I think I'm OK. So if I want to think about the portfolio level, this is my tool output. These are repository IDs, and these are Libyears. Just to give you an idea what the data is: for these repo IDs at the top, the average Libyear in that repository is actually 0. So these are very current. Most of my software in those repositories is at the current release. Here I have some that are kind of in the middle, around two years old. And then I have these projects down here at the end where I have dependencies up to 10 years old inside there. What this graph is showing you is essentially the Libyear count as the size of the square. So your 10-year-old dependencies are up here on the left, and your other dependencies are down here. If you're thinking at the portfolio level, you now have a rough idea of how much of your portfolio is in a fairly comfortable dependency zone in terms of how outdated the dependencies are compared with the current release, and you have a sense that you have some projects that are significantly outdated that you need to pay attention to. So this is taking the Libyear metric, applying it across a large portfolio (this particular portfolio is over 10,000 projects), and understanding what my dependency health is. Any questions so far? OK. I just want to go back here for a brief second and point out that you can actually see the version.
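To make the portfolio-level view concrete, here is a small Python sketch that averages Libyear per repository and sorts the most outdated repositories to the top. The repository IDs, the Libyear values, and the two-year threshold are all invented for illustration; the talk does not prescribe a cutoff, and this is not the output format of any specific tool.

```python
from statistics import mean
from typing import Dict, List, Tuple

# Hypothetical data: repository ID -> Libyear of each dependency in that repo.
PORTFOLIO: Dict[int, List[float]] = {
    101: [0.0, 0.0, 0.0],   # everything at the current release
    102: [1.5, 2.5, 2.0],   # middle of the pack
    103: [10.0, 0.5, 4.5],  # contains a ten-year-old dependency
}


def summarize(portfolio: Dict[int, List[float]]) -> List[Tuple[int, float, float]]:
    """Return (repo_id, average Libyear, worst Libyear), most outdated first."""
    rows = [
        (repo_id, mean(values), max(values))
        for repo_id, values in portfolio.items()
        if values  # skip repositories with no recorded dependencies
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


def flag_outdated(portfolio: Dict[int, List[float]],
                  threshold_years: float = 2.0) -> List[int]:
    """Repo IDs whose average Libyear exceeds an (assumed) threshold."""
    return [repo_id for repo_id, avg, _ in summarize(portfolio)
            if avg > threshold_years]


for repo_id, avg, worst in summarize(PORTFOLIO):
    print(f"repo {repo_id}: average {avg:.2f} years, oldest {worst:.2f} years")
print("needs attention:", flag_outdated(PORTFOLIO))
```

Reporting the worst dependency next to the average is a design choice: an average can hide one very old library in an otherwise current repository.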
So you can know which version you're using and which version is the latest version. The latest version and your current version. Yeah. I think it actually has those errors. If I'm thinking about categories of dependency-related risk, I want to understand dependency awareness. And these are some other tools for understanding dependencies and how healthy they are. You can do reproducible builds. There's a supply chain security function in Kubernetes. You have security vulnerability scanners and proactive error detection. OWASP has a dependency check. OpenSSF Scorecard gives you kind of a sense of the health of the project overall with regard to security. And then these are the national vulnerability databases that I mentioned earlier. So if you're thinking about software dependencies, that's kind of the information that you want to be aware of. So let me ask, what kinds of dependency information do your projects have outside of the scope of what I've just gone over here? So we've talked about dependencies, and I've kind of brought them to a very simple state, which is: you want to enumerate them, you want to calculate the Libyears on them, and you can get a rough idea of how out of date you are or aren't across a large portfolio. And you can get a visual representation of that. When you think about dependencies, what other perspectives do you bring to it? Or what other questions might you have at the portfolio level that this doesn't quite get to? Where they're from? OK. When you say where they're from, what do you mean? Like the GitHub repository? OK. Yep, you don't get that. Yeah, you don't get how active they are. That's true. So there's an awareness of the resources that might need to be applied to them, particularly if you're dependent on a library.
So one way that we solve that particular question in CHAOSS is that if I have a portfolio of repositories that I'm directly aware of, I can also start to monitor any repository that is a critical dependency in my infrastructure. There are a couple of organizations I know that do this. If I get to the point where I know specifically that, for example, these are some Node packages that I rely on, there are organizations that will traverse back, identify the repositories where these libraries are produced, and also run health metrics for things like level of activity, newcomer welcomingness, how quickly pull requests are addressed, how much general activity there is, and how quickly issues are responded to. So if you have critical dependencies in your infrastructure, one thing that you can do is add those to the list of repositories that you're paying attention to at a basic level. You can classify them as maybe not under your direct supervision or responsibility, but as things that are critical. And you can monitor and understand the health metrics for those repositories, so that when you come to decide whether or not to use or keep a particular dependency, you know how healthy it is. There's another best practice, I think, that your question gets at as well: how do your projects manage the introduction of new dependencies? Is there a decision-making process beyond a developer importing a library, for example? It's difficult to put constraints like that on a development team or on an open source team. But with the awareness of what the dependencies are, each time that you run CHAOSS tools over your collection, you can see if new dependencies have been introduced.
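Spotting newly introduced dependencies between two scans comes down to a set difference per repository. The sketch below assumes each scan has already been reduced to a mapping from repository name to a set of dependency names; that data shape, the repository names, and the package names are all hypothetical and do not reflect a real tool's output format.

```python
from typing import Dict, Set

Snapshot = Dict[str, Set[str]]  # repository name -> dependency names (assumed shape)


def new_dependencies(previous: Snapshot, current: Snapshot) -> Snapshot:
    """Dependencies present in the current scan but absent from the previous one.

    Repositories that are new in the current scan report all of their
    dependencies as new. Repositories with nothing new are omitted.
    """
    added: Snapshot = {}
    for repo, deps in current.items():
        fresh = deps - previous.get(repo, set())
        if fresh:
            added[repo] = fresh
    return added


# Hypothetical scans taken at two points in time.
last_scan: Snapshot = {"web-app": {"requests", "flask"}, "etl": {"pandas"}}
this_scan: Snapshot = {
    "web-app": {"requests", "flask", "left-pad-ish"},
    "etl": {"pandas"},
    "dashboard": {"plotly"},
}

print(new_dependencies(last_scan, this_scan))
```

The resulting list is a natural queue for the kind of health review described next: activity level, bus factor, and responsiveness of the newly added projects.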
And you might want to evaluate those dependencies with regard to how active they are, whether they have a bus factor of more than one, and whether the project is attentive to issues and pull requests that are created against it, especially if this is a dependency that's injected into a critical piece of your infrastructure. It's probably good to get in front of that and understand, before something gets too far down the road and too woven into your open source project, whether that's a dependency that maybe isn't part of a healthy project overall. Again, these are resources to address dependency risk. Another perspective: we have all these dependency metrics and things that we can use to understand our portfolio. One question that you want to ask yourself at a very high level is: is what I'm doing secure enough? In other words, not all systems, not all open source projects, need to be perfectly secure. I mean, they need to not have critical vulnerabilities, but some things are on the front line and you have a higher degree of concern about ensuring that they're secure. And there are other things that you can have maybe a softer set of rules around, and around what is safe enough. So when you're trying to keep dependencies up to date, I read a statistic recently that the number of dependencies in the average open source project has increased significantly in the last number of years. So if I want to make a decision about whether or not my project is safe enough, some of these heuristics, like knowing how old the libraries are on average, are just a good way of getting a sense of where you're at. And tracking that over time is a good way to get a sense of where you're going. And how do I measure this for dependencies, because we're increasingly depending on them? The biggest thing that's not looked at is: can I use an unsafe or less-than-secure component and still have a secure result? So if you have a component that's vulnerable, is it possible to have a secure result?
Sometimes it is. Sometimes it is because you're not using the flawed piece of that dependency. So understanding how you're using a dependency, or what your software is using it for, can help you manage which things you go after fixing and with what level of aggressiveness. Because if you've got 10,000 projects that are part of your portfolio, or a thousand, if you've got a thousand projects, you probably have 30,000 dependencies, or 10,000 at least. You've got some large multiple of dependencies and you have to make decisions about which ones you're going to go after. So you can build a trustworthy machine and not have trustworthy results. How do you do that? And as we go from 95 to 97% of an application being these third-party libraries, we have to figure out which things we're going to attend to. Because you could spend all of your time making sure that all of your libraries are current, sort of in a defensive stance against the potential that there's a vulnerability out there. And keep in mind, the reason I think Libyear is a pretty useful metric, and that we promote it as one of the metrics for understanding dependency risk, is that when you look at the high-profile data leaks over the last number of years, if you think about Heartbleed, if you think about Log4j, each of those vulnerabilities was in versions of the software that were significantly older than the last year. So you're covering yourself to some extent if you're keeping your library versions current. You're hedging your bets, you're managing risk, and you're not necessarily waiting for there to be a critical vulnerability identified by the security world. So on the one side of the coin, there may be packages that you can use that have some kind of imperfection, because you're not using the imperfect part. On the other hand, you can be very proactive about ensuring that you don't run into dependency issues just by managing how much is up to date.
And you probably can't always keep everything up to date, because if you have hundreds or thousands of projects, your development team would basically spend all their time updating and testing new versions of the dependencies instead of building new code. So this is where you want to pick the pieces of your infrastructure that are most critical to keep up to date: the things that are forward-facing, the things that you would have high-visibility problems with if there was a flaw. Those are where to put more of your energy than other things. And how do I measure what I have so I can force these rapid updates? Well, my advocacy would be that you use Libyear as a baseline. It's a very simple metric at its core, but it does give you a sense of where you're at at a portfolio level. That is my slideshow, that is my talk. What other questions do folks have? Because we do a lot of dependency analysis and we've thought about this a good deal. Another thing that's come up recently is: how do I determine what the end of life is for the things that I'm dependent on? If you're using really old versions, at some point you want the library maintainer to keep older versions current up to some place and time. And you're counting on some contract with them, maybe in the first year or two of use. And then as the dependency version that you're using gets older, somewhere along the line the culpability for having a really old dependency that isn't secure, or doesn't work, falls to you. So that support window is a factor that we've been discussing. Other questions? Anything that anyone wants to talk about regarding dependencies? Yes. So I mean, I can only reflect on my own experience in different ecosystems. Like the Python ecosystem, they do a nice job of telling you when you have versions of. Yeah, yeah. I mean, pip is way better than it used to be. Like keeping me from having anything like that.
npm, it's an interesting ecosystem, because I think a critical organizing factor in that community is the idea that you can just create some small plugin and contribute it, and then people can use it. It's a very small contribution for you, but it's something that's useful to other people. Where the Node environment, at least in my experience, has become problematic is when I start to use cool widgets across an ecosystem or a project that I'm maintaining, and the maintainer just stops maintaining them. There's no commitment in that community to maintaining things that are put into the world. So that can cause problems. And npm does an okay job, but whenever I do npm audit fix, it's usually on something on the front end and I usually have to manually test the crap out of it. Like, there's no test suite for determining if the new version of some Node library has broken my front end. I have to go in and figure that out myself. So when it comes to Node, I tend to lock things down pretty tight, and Node's something that I allow to be out of date on non-secure systems, so systems that are presenting data without any CRUD operations. I tolerate out-of-dateness with Node because I can't afford to keep it all up to date. Yeah, yeah. Anymore, I always check. It's just a heuristic, a quick heuristic: find the repository where that library lives and see if anybody's updated it in the last year. And if they haven't, you know, consumer beware. I mean, yeah, is it enough? It depends on your application. It's certainly better than not having Dependabot or Snyk on a project, because at least then you're aware. And I'm sure everyone has pretty similar experiences with Dependabot. It finds things that need to be updated. It tells you how critical the update is, and it also tells you if there are breaking changes, and if there are breaking changes, those kind of go into sort of a triaged bucket.
Things that don't break anything can be implemented right away, and you just test them before you release. So is it enough? No, probably not, but is it better than what you had before those things existed? Yes, much, much better. And again, think about the context you're operating in. If you're responsible for the national healthcare website, you've got a honeypot that people want to vandalize, and so you've got to be on top of your game for anything that touches the web. If you're running a metrics webpage, there's probably not a lot of people looking to come after that. If you're a credit card processing company, your honeypot's a big deal. And so, you know, you hopefully have good security practice around that. There's a whole lot of layers beyond just the software for a system that's critically secure. In my opinion. I guess this is all my opinion. I speak for no one else other than myself here. Any other thoughts or questions? Okay. Okay, all right. You're just gonna keep us here all day. All right, let's go. Woo! Yeah. What's the tool? Augur. The tool's Augur, and it's one of the CHAOSS software tools. Sure. And Augur generates a whole bunch of metrics in addition to dependencies, but that's what I focused on here. Yeah. Yeah, I'm here to talk about dependencies, but yeah, it's Augur. Do you have anything else? That was the last one. Okay. All right. I'm happy to talk with anyone afterwards. The rest of you are free to go unless anyone else has anything they want to share with the group. Thank you. I really appreciate you being here. Thank you.