Welcome. We're going to talk today about package URLs and the version range specification, and how they can make better identifiers for dependencies and vulnerabilities.
Hey, I am Hritik Vijay, and I am very passionate about open source, Linux, and infosec. I've been associated with VulnerableCode for over a year now, and I am a maintainer of VulnerableCode and univers. You can contact me at the addresses shown here. That's it.
And I am Philippe Ombredanne. I'm the maintainer of ScanCode and the AboutCode projects, and co-creator of package URL, among other projects. I'm a long-time contributor to free and open source software and co-founder and CTO of nexB, which is a software company that makes DejaCode and is the primary sponsor of AboutCode. You can find me on IRC, and I signed off on one of the largest deletions of lines of code in the kernel. So I'm very good at deleting code. I'm not sure if I'm as good at writing code, but I hope so.
So, our agenda today. The problem is that we have more and more free and open source software dependencies, which is great. I mean, we can build software better, faster, and more efficiently, but we also have more bugs and more vulnerabilities. So we're going to see how we can deal with this. The difficulty of identifying packages is one key topic. Understanding dependency resolution is difficult too. Understanding ranges of vulnerable versions as they apply to packages is a problem too. Generally speaking, I wish versioning was not as hard as it is, but it is. So we're going to explore solutions with what we propose today, which is purl and version ranges.
So as I said, things are getting more and more complex. We can build applications like taking Lego bricks and assembling them together. And if I compare the number of packages we were using a couple of years ago with today, it's really an explosion: we have 10 to 100 times more. Not only that, but we also have complex stacks.
We're systematically mixing multiple application package technologies and ecosystems, but also system packages. If you think about container deployments, you may be running a bit of Alpine Linux and Ubuntu and Debian and SUSE in a single stack, across multiple containers on the host operating system. We also have many unstated dependencies across all these ecosystems and package types that we use. And it's very difficult. Actually, the boundaries between the dependencies are difficult to express. Say, how can I tell that I depend, as an application, on a certain database or on other applications? All of this is complicated. And because of the explosion and this complexity, we have more bugs and vulnerabilities for sure. Because of the volume, it's also difficult to automate.
So how can we deal with some of this complexity? The first way is to provide a bunch of, you know, README instructions: say, hey, here are my installation prerequisites, go and install this and that. That works okay. Another approach, which has been used in some of the big tech companies, is to replace all the package managers with a single one to rule them all, typically with a monorepo and a build system which is going to build everything from source and essentially replaces package management and provides the dependency management system. Another approach is to use the general-purpose package managers that are emerging. Spack, coming from the world of high-performance computing, and conda, from scientific computing, are good examples of general-purpose package managers that address both system-level and application-level needs. And finally there are functional package managers, such as Nix and NixOS, which is based on Nix, which provide a way to replace the dependency and package management system across the entire stack, both system and application. And there are containers, which are a way to also mix system and application in a frozen environment, which we call a container.
So it's difficult. There are many standards and many package managers, so we need one more to rule them all. That sounds like a bad idea, but actually it's not so much adding a new standard; it's having a way to accommodate all these other standards. In practice, even though it feels ridiculous, it's actually a good thing to have something which can unify different standards to identify packages.
Meet purl, the package URL. The problem we've tried to solve here is that each ecosystem, as we said, has its own conventions. There are so many standards. A package could be called file and be a Ruby gem, or it could be called file and be a Debian package, which has nothing to do with the Ruby gem in question. So we're trying to find a way to define an expressive string, which is obvious and minimalist, and which allows us to identify and talk about a package across these different conventions and ecosystems.
A good example: say you have an npm package. It's a URL, so it starts with the pkg scheme. Then you have the package type, in this case npm, then the name, an at sign, and the version. So it's very obvious if you've been using npm: foobar@12.3.1 is something used all the time, because that's the convention to talk about an npm package. In the same way, we can talk about version 1.11.1 of Django, which is a Python package. And based on this string, I can really figure out where to go to find and fetch everything about this package.
This is something that originally started with ScanCode and VulnerableCode, with the help of many other organizations and projects. It's now adopted in many places. It's used as a de facto standard in several projects of the Linux Foundation, including the Open Source Vulnerability (OSV) database schema at the OpenSSF. It's been adopted by CycloneDX at OWASP, and by Dependency-Track and OSS Index at Sonatype, and so on. It's really emerging as a simple de facto standard, a minimalist lingua franca to talk about packages.
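The structure described above (the `pkg` scheme, then the package type, the name, and the version after an at sign) can be sketched in a few lines of Python. This is a minimal illustration and not the official purl library: `SimplePurl` and `parse_purl` are hypothetical names, qualifiers and subpaths are simply discarded, and none of the per-type normalization rules from the purl specification are applied.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import unquote


@dataclass
class SimplePurl:
    """Hypothetical container for the main purl components."""
    type: str
    namespace: Optional[str]
    name: str
    version: Optional[str]


def parse_purl(purl: str) -> SimplePurl:
    """Split a package URL into type, namespace, name, and version."""
    scheme, sep, rest = purl.partition(":")
    if not sep or scheme != "pkg":
        raise ValueError(f"not a package URL: {purl!r}")
    # Drop the optional subpath (#...) and qualifiers (?...) in this sketch.
    rest = rest.split("#", 1)[0].split("?", 1)[0].lstrip("/")
    # The version follows the last '@'; an '@' inside a namespace or name
    # is expected to be percent-encoded, so it does not interfere here.
    if "@" in rest:
        path, _, version = rest.rpartition("@")
    else:
        path, version = rest, ""
    pkg_type, _, remainder = path.partition("/")
    if not pkg_type or not remainder:
        raise ValueError(f"missing type or name: {purl!r}")
    # Everything before the last '/' is the optional namespace.
    namespace, _, name = remainder.rpartition("/")
    return SimplePurl(
        type=pkg_type.lower(),
        namespace=unquote(namespace) or None,
        name=unquote(name),
        version=unquote(version) or None,
    )


print(parse_purl("pkg:npm/foobar@12.3.1"))
print(parse_purl("pkg:pypi/django@1.11.1"))
```

The two strings at the end are the npm and Django examples from the talk. For real use, the purl libraries mentioned in the talk are the better choice, since they handle the parts this sketch skips.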
And we have libraries in many different languages that have been contributed over time. It's even been recommended by the NTIA as an alternative package identifier for software bills of materials, SBOMs.
Okay, if we look a bit at other approaches, the key benefit here is improved readability and obviousness. On the left, and I'm not picking on CPE specifically, that's one of the existing things we're trying to support and improve on. CPEs have a lot of stars, so a lot of placeholders. The point here is that if I look at this one, pypi/feedparser, I know exactly where to go and where to find the source code. The convention of the CPE doesn't tell me much. I could search for Mark Pilgrim feedparser, and I would be hard pressed to find out whether it's on PyPI or elsewhere, and that makes it difficult. Same thing here with log4j. There are a lot of different builds of this. With these package URLs, you have something which is much more precise and accurate, and focused on the code that you find in your code base.
The key point is that we're trying to avoid any kind of arbitrary assignment; instead, you can infer the package URL from what you observe in your code base. Like your Maven dependencies or PyPI dependencies: it's going to be pretty obvious what you have there. Of note, the National Vulnerability Database version 5, the new API, is essentially adding an ecosystem. So it's not exactly purl, but it becomes very close to purl. OSV is also supporting purl, as well as an ecosystem there. And there was a reference in the news, which we are very proud of, where Stephen Hendrick from the Linux Foundation was saying that package URLs will eventually be supported and become one of the leading data formats and ways to identify vulnerabilities. That was very encouraging, and it was a couple of months ago.
So that's the first thing. We can name the package now. What about versions?
We talked about packages, and it was a tough journey to name packages; naming anything is a tough journey. But now we can, with the help of package URLs. But what about versions? Versions are tough, and how do we name them? Any ideas?
Well, why do we need them, first of all? We need them for resolving package dependencies. Every dependency of a package comes with a range of possible versions. It could be greater than 2.0 or something like that. An example would be: I need some package at version 2.0 or greater. Now a very simple problem arises here: do we want inclusive of 2.0 or exclusive of 2.0? That's a very simple issue, but it can manifest itself in different ways. Now that's just the dependency side, and that is used by package managers and such. We are also interested in vulnerabilities and vulnerable versions. Security is important, and we need to take care of the affected, vulnerable versions that we get in the advisories. So how do we deal with those version ranges? Everyone publishes them with slight differences, which makes the whole thing very difficult.
Well, version numbers should be very, very simple to understand. When I talk about, let's say, Django from PyPI, it's very simple, very obvious. In package URL, as we saw earlier, it was as simple as writing pypi/django. That was the main part of it. It is as simple and boring as a URL, and we know everything about it. But what about version numbers? They should be very simple as well. They should not have a lot of different types of operators. Sometimes they'll have a caret or a tilde or maybe an exclamation mark: everyone comes up with a different type of operator for their versioning system. It supports their ecosystem, but it does not support the environment as a whole. Now what if they could be expressed in a very universal way?
Well, now you'll go back to that XKCD that Philippe just showed you and say that we are going to come up with a new versioning scheme and tell you this is the thing that you have to use. But no, we are not talking about a new versioning scheme. We are talking about something that can accommodate all of them, that suits all of them, and that does not replace any of them.
Well, first of all, we can use package URL, and that solves our problem of naming packages across different ecosystems. And then we come up with vers, that is, the version range specification, and it holds for ranges. For every package URL, we have an ecosystem, the package type, the namespace, and so on and so forth. And then we come up with an exhaustive, simplified comparator set, which is all of these shown over here. So here we make a promise that we won't come up with a new comparator. We won't say that there is a caret anymore. We say that these are the comparators and that's all. And we have very specific version comparison that is based on the ecosystem. So if you are on node semver and you use the vers greater-than, you will get the equivalent greater-than comparator in the node semver ecosystem. And we make the translation inside the specification itself. It helps us to figure out both the dependency ranges and the vulnerable ranges.
So what can we come up with from this? Maybe a universal package manager, who knows, maybe a lot of other things. But before exploring the future goals, let's have a look at the specification itself, that is, the vers specification. Well, the problem was very obvious: everyone comes up with a new convention to specify version ranges. We need something very simple, a string that is very minimal, very easy to write, easy to use, and easy to include in your code base. And it should be compatible with purl as well. Now, keeping all of this in mind, we have the vers specification.
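To make the idea of a fixed comparator set concrete, here is a minimal sketch of how ecosystem-specific operators such as the npm caret and tilde can be rewritten using only greater-than-or-equal and less-than. The function name `expand_npm_range` is hypothetical and this is not code from the univers library. It assumes plain three-part `x.y.z` versions and follows the documented node-semver meaning of `^` and `~`; prerelease tags and partial versions are not handled.

```python
from typing import List, Tuple


def expand_npm_range(spec: str) -> List[Tuple[str, str]]:
    """Rewrite an npm caret (^) or tilde (~) range as explicit bounds.

    Returns (comparator, version) pairs that use only '>=' and '<'.
    """
    operator, version = spec[0], spec[1:]
    major, minor, patch = (int(part) for part in version.split("."))
    if operator == "~":
        # Tilde allows patch-level changes only.
        upper = (major, minor + 1, 0)
    elif operator == "^":
        # Caret allows changes that keep the left-most non-zero part fixed.
        if major > 0:
            upper = (major + 1, 0, 0)
        elif minor > 0:
            upper = (0, minor + 1, 0)
        else:
            upper = (0, 0, patch + 1)
    else:
        raise ValueError(f"unsupported operator in {spec!r}")
    return [(">=", version), ("<", ".".join(str(part) for part in upper))]


for native in ("^1.2.3", "~1.2.3", "^0.2.3", "^0.0.3"):
    print(native, "->", expand_npm_range(native))
```

The inclusive-or-exclusive question raised earlier becomes explicit in the output: the lower bound is always inclusive and the upper bound always exclusive, with no ecosystem-specific operator left to interpret.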
We come up with vers, and that is a URL scheme that we will be registering soon enough, so that comes first. Then we provide the namespace, that is, the versioning scheme, such as npm. And then the version number, so 1.2.3, and then the constraints. Now the constraints are limited to what vers supports, and those constraints get translated into the ecosystem-specific constraints. We started this in VulnerableCode with the univers library, and we already have a Python implementation ready. You're welcome to look at univers, as in universal versioning. And it has already been used in CycloneDX and ScanCode.
It can pave a very clear path to universal dependency resolution. And that would pave a path to the universal package manager, which is the one single thing that is missing to make Linux the desktop of the new era, you know, because every distro comes with a different package manager. So that would potentially solve this issue as well. And of course, vulnerability processing would be very, very simple.
Now we have talked about vers and its specification. We don't claim to be the only solution on the market. There are others, and we'll have a look at them. It could be something like a single syntax for everything, like CPE. We could have everything as semver. But again, a solution like this enforces one single standard to remove all of the other standards, and that brings us back to getting one new standard, that is, 15 standards when there were 14 standards already present. And that is not a good thing. So here as well, we see that vers accommodates every other standard. It does not outdate every other standard.
When we put it all together, we get a universal package naming scheme with the help of purl, and a mostly universal version range spec, that is, vers. So we have both problems tackled for storing vulnerability ranges and evaluating them to find the vulnerable packages in a list of dependencies.
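The shape of a vers string as described above (the `vers` scheme, the versioning scheme such as npm, then constraints) can be illustrated with a small containment check. This is a simplified sketch and not the univers implementation: `parse_vers`, `version_key`, and `contains` are hypothetical names, constraints are assumed to be separated by `|`, and versions are compared as dotted integers, whereas vers delegates comparison to each ecosystem's own rules.

```python
from typing import List, Tuple

COMPARATORS = (">=", "<=", "!=", "<", ">", "=")


def version_key(version: str) -> Tuple[int, ...]:
    """Assumption for this sketch: versions are dotted integers."""
    return tuple(int(part) for part in version.split("."))


def parse_vers(vers: str) -> Tuple[str, List[Tuple[str, str]]]:
    """Split 'vers:<scheme>/<constraint>|<constraint>...' into its parts."""
    head, slash, tail = vers.partition("/")
    prefix, colon, scheme = head.partition(":")
    if prefix != "vers" or not colon or not scheme or not slash:
        raise ValueError(f"not a vers string: {vers!r}")
    constraints = []
    for raw in tail.split("|"):
        raw = raw.strip()
        for op in COMPARATORS:
            if raw.startswith(op):
                constraints.append((op, raw[len(op):]))
                break
        else:
            # A bare version means equality.
            constraints.append(("=", raw))
    return scheme, constraints


def _above(v, lower) -> bool:
    key, op = lower
    return v > key or (op == ">=" and v == key)


def contains(vers: str, version: str) -> bool:
    """Return True if the version falls inside the range."""
    _, constraints = parse_vers(vers)
    v = version_key(version)
    if any(op == "=" and version_key(c) == v for op, c in constraints):
        return True
    if any(op == "!=" and version_key(c) == v for op, c in constraints):
        return False
    # Remaining constraints are bounds; sorted, they pair up into intervals.
    bounds = sorted(
        ((version_key(c), op) for op, c in constraints if op not in ("=", "!=")),
        key=lambda bound: bound[0],
    )
    unbounded = object()  # marks "no lower bound seen yet"
    lower = unbounded
    for key, op in bounds:
        if op in (">", ">="):
            if lower is None or lower is unbounded:
                lower = (key, op)
        else:
            if lower is not None:
                above = lower is unbounded or _above(v, lower)
                below = v < key or (op == "<=" and v == key)
                if above and below:
                    return True
            lower = None  # interval closed
    # A trailing lower bound leaves the range open-ended at the top.
    return lower is not None and lower is not unbounded and _above(v, lower)


print(contains("vers:npm/1.2.3|>=2.0.0|<5.0.0", "4.9.9"))
print(contains("vers:npm/1.2.3|>=2.0.0|<5.0.0", "1.5.0"))
```

Checking a dependency list against an advisory then reduces to calling `contains` for each dependency version. The dotted-integer comparison is the weakest part of the sketch, since real ecosystems disagree on prereleases, epochs, and padding, which is exactly the work a shared library takes on.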
It also opens the way to a multi-package installer for all the ecosystems, so that finally every Linux distro could agree on one format for installation, and that would be fun. And that is simple. And a very simple dependency declaration, shown over here. This would be an incremental approach. We will go slowly, one by one, and accommodate everyone, not replace everything. As of now, we have implementations in the univers library for a few version ranges, and we're looking forward to having a lot of different implementations very soon.
Now that's all about how it works, but is it already out there? Does anyone even bother to use purl and vers? Yes, they do. All of these are great organizations, such as SPDX, OWASP, OpenSSF, and CycloneDX, and of course VulnerableCode is already using purl and vers, and it has been helpful throughout the entire development. Among others.
Now, let's jump to the conclusion. Naming anything is very difficult, even if you have a cute cat. The name sticks for the entire time. It has to be consistent. It has to be something that is easy to remember. So, we saw the naming of software packages with package URLs, and that is just as easy as it could get. The version range notation can be solved by, of course, vers, the version range specifiers. Vulnerability identifiers have been solved by CVEs, and that has been very, very important. It really helps with the categorization of vulnerabilities. And version control systems have VCS URLs. So here we have a complete set that we can use to build a very nice vulnerability resolver that works across ecosystems, that could improve by itself, that could support a lot of different types of advisories, and so on and so forth.
But we are not done yet. There is still a lot that we would like to do, and if you would like to help, you can of course contribute code, time, documentation, cash, whatever you are comfortable with. So have a look at that.
Have a look at our projects under github.com/package-url and the projects under AboutCode and nexB. We are working towards making vulnerabilities, version ranges, and software packages easy to handle.
Yes, and in addition to this, you can check specifically the version range and package URL specifications, as well as the libraries we have mentioned, in many languages. We are also preparing a webinar specifically on VulnerableCode. This is where purl and vers were born, and it makes extensive use of purl and vers. That will be announced in this conference. There's a separate presentation on VulnerableCode that you can follow. That's pretty much it. Thank you very much and have a wonderful day. Bye now.