We all love kittens, right? Okay, so I think it's time to start. Let's go. We've heard a lot about packages and names today. I'm trying to present something which is a small initiative: a kind of social experiment on the one hand and a standards experiment on the other, to define a common way to talk about packages irrespective of what they really are. I agree with what Sun said: we don't know what a package really is, but intuitively we feel what it is. Nevertheless, it's sometimes difficult to talk about them.

A quick thing about me. I'm on a mission to make it easy to reuse free, libre, and open source software. I contribute to quite a few projects, including the Linux kernel and strace. I write code in strace. I don't write code in the kernel; I just bug every maintainer to ensure that they put proper licensing information in their code files. I'm a co-founder of SPDX, and I used to be a committer on Eclipse and JBoss. Most of the code I write is usually Python, JavaScript when I'm under duress, so mostly Python and a bit of C, and I'm trying to dabble in Rust these days.

So why should you care? Very often we have more than one package environment, not only one. How many of you use only a single package manager? Which one is it? Okay, that's fair. Portage, okay, that's fair too, but nevertheless you're rare birds. And Pacman, okay. So there are a few oddballs, or a few advanced users, who have found the right way to actually simplify their lives, and I commend you for that. But most of us are morons, and we continue to use many package managers, whether it's a good thing or not. So we should use Nix, Pacman, and Portage on Gentoo. And sometimes it's probably even a problem for some of these package managers. When you need to talk about a package across these different package managers, it's surprisingly difficult. Why would you want to do that?
Well, there are some narrow use cases. One is that you build an inventory of every package out there, like Andrew does with Libraries.io, and you want just every package. Or Grafeas, which is a new API from a Google project to provide information about things running in a container. With that API you want to get information about not every package, but any package that exists in the container, for instance, irrespective of whether it's a system or an application package.

So the problem: if I'm telling you I'm using "file", which one is it? To each of us it means something different. If I'm a Python developer, okay, it's the file library. But no, it's also a JavaScript library. Eventually, in most of these cases, it would be the actual Fine Free File package, which is the original libmagic file command. Here we're talking about the same thing, more or less, but not exactly: each is packaged in a different way. That's the very simple problem I'm trying to address.

More formally, we really use software from many different places. In some cases, if you're forced to use JavaScript, massively, with a lot of npm packages, and if you're forced to use containers too, you will eventually use a lot of packages across many containers. Each of these package managers, environments, and tools has a different way to talk about mostly the same thing. Each of them has different protocols, and eventually they rely on different package repositories and registries. Some will use Git, some will use a tarball over HTTP.

The point is that when I say npm, it encompasses a lot of things. It's a tool. It's a convention to document the package. It's eventually, primarily, a language: a way to build the package, a way to express dependencies. A lot of things go into these three letters. Whether it's npm, pip, or gem, it's a whole protocol of its own that describes an entire system. And the whole idea here is to say: rather than try to make that complicated, try to make it very simple. If npm's
mean npm's, then we'll use it.

Now, tracing back a bit to our origins: why try to do this URL standard of sorts? I maintain a tool called ScanCode, which does two things. It scans your code for license files and license mentions. It's a kind of search engine for licenses where the index is very small, about 20 megabytes, and the query is eventually gigabytes of code. So it's the inverse of a Google, where you have tiny queries and a very large index: here you have a tiny index and eventually a large query. It also scans and parses package manifests and tries to stuff them and squeeze them into a common model. In this model I have the problem of how to identify "file" as being a Ruby gem, or a PyPI package, or a Node package. I cannot just say "file"; that's not enough as a name.

Then there were some folks from JFrog and Google doing this Grafeas project in the middle of the fall, and I stumbled on their home page. They need to aggregate information about packages deployed in containers; that's the primary purpose of the API. And they had some kind of informal specification.
It said: this is how you identify packages as resources, with some informal URI of sorts, which was like "maven", colon, the name of a Maven group ID, an artifact ID, and a version, and similar things for Debian and RPM. I said, that's interesting. It was not super formalized, but it looked really cool.

Then I was also diving into the schemas and the internals of Libraries.io, because I love packages, and there was also something there which was very similar: the notion of the type of package, where it's coming from, the versions. Looking at a few other ones, most other package indexes at some level seem to use the same approach, each with subtle differences, but mostly the same approach.

So the solution was to say: let's try to come up with a simple and expressive URL which is formally defined, which can abstract all the subtle differences, and which can be reused across all of these. Again, it's for a very narrow use case. If you don't do a Libraries.io or a ScanCode, maybe you aren't interested. Though if you're using many packages and somebody in your team asks you, "Can you tell me all the third-party open source packages that we are using in our products?", what's going to be your answer? How can you provide an inventory of this set of packages in a simple, clear, and clean way?

The applications may be multiple. You may want to ensure that the licenses match your license policy: if you're a GPL project, you don't want proprietary software, and maybe the other way around applies too. You may want to make sure your packages don't have security bugs, quality problems, and so on. So even though you may not have the concern of detecting or inventorying packages on a large scale, you'll still have it in the small, and everyone who develops software may have this problem.

Now, whenever you start on standards, you know, you see: okay.
They're using this standard, that standard, all these standards; let's come up with a new one to rule them all. That's an easy trap to fall into, and hopefully we've tried to avoid it at some level. It's not entirely possible yet.

So, the approach. First, it's been so far mostly a social experiment, in the sense that I had this issue in ScanCode, and then I started chatting with [unclear] and others and pinged them, started a GitHub repo, started to ping folks there, put up some documents, and elicited more feedback, really trying to get something which was as inclusive as possible from the very inception of the discussion. The other important thing in the approach was to say: let's not try to invent anything new. If an npm version says, I don't know, angular@1.2.3, it shouldn't be much more difficult to express a URL for an npm package. So it's an attempt to standardize, but rather than trying to come up with yet another way, it's more about defining a few data elements and an optional syntax to express them as a URL, hopefully avoiding the standards trap.

So now, any questions so far? Things make sense? The folks who are coming from overseas are not too jet-lagged and starting to doze? I see someone sleeping. Sorry for waking you up, that's not nice.

Okay, so what is a purl? It's six data elements: the type, like npm, Ruby gem, and so on; the name and the namespace, so two elements, one of them optional, the namespace; a version; and that's pretty much it. The rest is extra things which you want to use only in special cases when they're needed.
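
As a rough illustration of the six data elements just listed, here is a minimal Python sketch. The class name `PackageRef` and its `to_string` method are hypothetical names made up for this example. The rendering uses the single-prefix form `pkg:type/namespace/name@version?qualifiers#subpath`, which is an assumption here (the prefix question comes up later in the talk), and percent-encoding is deliberately left out.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class PackageRef:
    """Hypothetical holder for the six data elements of a package URL."""

    type: str                         # required, e.g. "npm", "gem", "deb"
    name: str                         # required
    namespace: Optional[str] = None   # optional, e.g. a Maven group ID
    version: Optional[str] = None     # optional
    qualifiers: Dict[str, str] = field(default_factory=dict)  # extra key/value pairs
    subpath: Optional[str] = None     # optional path inside the package

    def to_string(self) -> str:
        """Render the elements as one string (no percent-encoding in this sketch)."""
        parts = ["pkg:", self.type, "/"]
        if self.namespace:
            parts += [self.namespace, "/"]
        parts.append(self.name)
        if self.version:
            parts += ["@", self.version]
        if self.qualifiers:
            # Sort keys so the same elements always give the same string.
            query = "&".join(f"{k}={v}" for k, v in sorted(self.qualifiers.items()))
            parts += ["?", query]
        if self.subpath:
            parts += ["#", self.subpath]
        return "".join(parts)


print(PackageRef("npm", "angular", version="1.2.3").to_string())
```

Only `type` and `name` have no default, which mirrors the point that the remaining elements are optional.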
So, some examples of the syntax. A Bitbucket repo: here my type is bitbucket. It encapsulates a lot of things about Bitbucket, which could be using Git or Mercurial. There's an API, when I know it's Bitbucket, that I can query and that returns certain information, and I know exactly where to go, which repository it points to, and which revision it points to. In this case this is a revision, and the same applies to GitHub or others. The thing is that this "bitbucket" encompasses, again, a lot of information, and that's why we are treating it as eventually a quasi-package-like animal, where there's more than just a Git or Mercurial protocol and a repo behind it.

Another example, for a Debian package. Here, part of the namespace is: what's the distro? That's actually the provider of the distribution, because there's more than one Debian; there are a lot of derivatives. In the case of system packages, sometimes you need information about architectures, since you have native code that's been built, and there may be other qualifiers that come after, in the query string, on the right part. Right? Yeah, the right part is the query string, basically.

Some more examples of the syntax. Docker images, which may be from a registry that you know about or are just published on the standard Docker registry, are typically identified by a hash checksum on the image. A gem sometimes has a platform, or not; it could be Java, for instance. Go is an interesting animal, because there's really no notion of naming beyond the language packages themselves, and they represent a namespace. Here we're just saying: the last segment of the name is actually defining a whole package, and inside it you can have subpaths, hence the pound sign, which defines an optional segment that is a path inside something which is a package. You may or may not want to track it; it's entirely optional. You may want to track just a subset, a sub-package you use, or a whole
repo. The point is that it offers this flexibility when it's needed and when it's possible.

Another example, from Maven, where you have a combo of what they call artifact ID and group ID. Additionally, you could point, in this case, to another repository, and that's possible for the others too, and so on. So think about the very basic case: for npm, you've just prefixed the way you reference an npm package as an npm user with "npm". That's pretty much it, not much more than that. It's meant to be extremely simple.

You may wonder why I call things pypi for Python and not pip, for instance, pip being the tool used to install Python packages. Well, pip itself doesn't encompass the whole protocol for Python packages. That's the protocol of the PyPI type of registry, which exposes an API, a way to download, and also all the conventions to document versions and dependencies. That is what really qualifies the package itself, and not so much only the tool; there can be several other tools that can be used to install it. Ruby packages work the same way: if you think about gem, you could install them with Bundler, and most of the time you will be using Bundler to install gems.

Okay, so, if we dive in a bit more: most of the elements are optional. In fact, there's only one that's required besides the type, so a type and a name. It's eventually saying npm, colon, something. In this case you're not specifying any version; there are cases where that's actually a valid use case and you don't care about the version per se. The namespace is also optional. Some may use it: scoped npm packages, or the artifact ID and group ID from Maven, require it, and it makes sense to put that in the namespace.

The other thing you don't see in this URL: there's actually no host, right? It doesn't tell me where to get the package. The point here is that you get the package from wherever you usually get it. When you npm install, by default, unless you have a special configuration, you get it from the npm registry, and that applies
whether you use pnpm, yarn, npm, or any other command-line tool to provision your packages. Same thing for Maven: if you install something with Maven, it will go by default to Maven Central. So in the vast majority of cases there is a default, centralized, public registry for the packages, and it doesn't make sense to repeat it over and over as a mandatory attribute. It doesn't make sense to put it in the path at all times. You add it as a qualifier if and only if you need it, when you use something which is not on the public registry, such as an alternative package repository or a private registry.

So qualifiers, the query string, are key-value pairs and can be anything. That makes it easy: you can stuff in any kind of weird, oddball, quirky thing which may in some cases be useful and needed, but then you don't pollute the standard, common case, 90% of the time, with everything that's needed in 10% of the cases. And we've already discussed the subpath.

Okay, so let's put that to rest. There's no host; there's no authority, as it's called in the URI or URL scheme. Yet this is a URL; this is a locator. When I say just npm, colon, angular, at something, it points to exactly one central host, which is the npm registry, and therefore I can look up my package without any ambiguity whatsoever. There's been a long debate on that topic, but this is not about purism. This is a URL.
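
The "no host, but still a locator" idea above can be sketched as a small lookup: each type implies a default public registry, and a qualifier overrides it only when needed. This is a minimal sketch under stated assumptions. The function name `resolve_registry`, the table `DEFAULT_REGISTRIES`, and the registry URLs chosen for it are illustrative, and the qualifier key `repository_url` is an assumption of this example, since the talk does not name the key.

```python
from typing import Dict, Optional

# Illustrative defaults: one well-known public registry per package type.
DEFAULT_REGISTRIES: Dict[str, str] = {
    "npm": "https://registry.npmjs.org",
    "maven": "https://repo1.maven.org/maven2",
    "pypi": "https://pypi.org",
    "gem": "https://rubygems.org",
}


def resolve_registry(package_type: str,
                     qualifiers: Optional[Dict[str, str]] = None) -> str:
    """Return where to fetch a package of this type.

    The common case needs no host at all: the type implies the registry.
    A qualifier (assumed key: "repository_url") is consulted only for the
    uncommon case of a private or alternative registry.
    """
    override = (qualifiers or {}).get("repository_url")
    if override:
        return override
    if package_type not in DEFAULT_REGISTRIES:
        raise ValueError(f"no default registry known for type {package_type!r}")
    return DEFAULT_REGISTRIES[package_type]


print(resolve_registry("npm"))
print(resolve_registry("npm", {"repository_url": "https://npm.example.com"}))
```

Keeping the override in the qualifiers means the common case stays as short as the name an npm or Maven user already types.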
It's also a URI, like every URL, and it's been reviewed by real authorities on URIs and URLs, namely folks like Anne van Kesteren and Mark Nottingham, Mark Nottingham being the guy behind the HTTP protocol and Anne being the lead of the URL spec for the WHATWG. They know a great deal more about URLs than all of us combined, so I trust their judgment, and we pinged them to actually get their feedback and their take on the topic.

There's one tidbit today that needs a bit of ironing out; it's not entirely settled. There are really, potentially, two ways to present a purl. The one at the bottom is the one I've shown you today. One possibility would be to have (oh, and there's a typo, whoops) pkg, something like pkg or purl, a single scheme and prefix for all the purls. That could eventually make things simpler if you want to have an official registration as an RFC in the future, because we would register only one scheme as opposed to one scheme per package type. I'm really soliciting feedback there. I don't know which, so who would prefer to have one single scheme, pkg, and then a type, versus the second case, where you have many different schemes, one per package type? So, case one: one prefix. Okay. And case two: one unique prefix for each package type. It's about even, maybe a tiny bit more for pkg, as long as it doesn't have a typo. Excuse me?

So why do you prefer the former, the first over the second? It's not simplicity of implementation; it's simplicity of official registration as a URI scheme, as an official RFC. The implementation doesn't make much difference between these two cases. Yes? Okay. Yeah, the second one is simpler, because there are three fewer, I mean four fewer, letters in total.
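
To make the two candidate presentations concrete, here is a small Python sketch of a parser that accepts both: the single `pkg:` scheme followed by a type, and one scheme per package type. The names `ParsedPurl` and `parse_purl` are hypothetical, percent-decoding and per-type normalization are left out, and a version containing a slash would not be handled. It also suggests why the implementation cost is similar either way: the two cases differ by one branch.

```python
from typing import Dict, NamedTuple, Optional
from urllib.parse import parse_qsl


class ParsedPurl(NamedTuple):
    type: str
    namespace: Optional[str]
    name: str
    version: Optional[str]
    qualifiers: Dict[str, str]
    subpath: Optional[str]


def parse_purl(text: str, single_scheme: str = "pkg") -> ParsedPurl:
    """Parse either 'pkg:npm/angular@1.2.3' or 'npm:angular@1.2.3' (sketch only)."""
    scheme, colon, rest = text.partition(":")
    if not colon or not scheme or not rest:
        raise ValueError(f"not a package URL: {text!r}")

    # Peel off the optional parts from the right: '#subpath', then '?qualifiers'.
    rest, _, subpath = rest.partition("#")
    rest, _, query = rest.partition("?")
    rest = rest.strip("/")

    if scheme == single_scheme:
        # Case one: a single scheme, and the type is the first path segment.
        pkg_type, slash, rest = rest.partition("/")
        if not slash:
            raise ValueError(f"missing package name in {text!r}")
    else:
        # Case two: the scheme itself is the package type.
        pkg_type = scheme

    # The last path segment is 'name[@version]'; anything before it is the namespace.
    namespace, _, last = rest.rpartition("/")
    name, _, version = last.partition("@")
    if not name:
        raise ValueError(f"missing package name in {text!r}")

    return ParsedPurl(
        type=pkg_type.lower(),
        namespace=namespace or None,
        name=name,
        version=version or None,
        qualifiers=dict(parse_qsl(query)),
        subpath=subpath.strip("/") or None,
    )


# Both presentations yield the same data elements.
print(parse_purl("pkg:npm/angular@1.2.3") == parse_purl("npm:angular@1.2.3"))
```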
So that's appealing at a large volume. The first way is unique: if you imagine a future where a web browser wants to make these clickable, with the second way it has to implement a whole list of which prefixes are packages, which may be extending and not fully known. If you have this at the start, it knows this is one it would have to open with my future, imaginary, universal package manager program. Yeah, and by the way, that's an interesting possible option, which is to write a kind of unifying meta package manager which would install things from any place. Yes? Yes. Okay. Yeah, okay.

So the reason why I brought up the problem is that I was much more in favor of the second one at first, but it's eventually more of a source of problems down the road, and maybe these few extra letters are worth it. As long as they're spelled right. Okay.

Now, in terms of language implementations, we have today a Go and a Python implementation, and the implementation is very simple. There's a spec, which you can see here, which is evolving, and we're accepting any pull requests on this spec. Literally, it's been a bit quiet since the holiday break, but there was really this issue that I wanted to discuss first, and I was waiting to have a bit of that chat about whether we should prefix it with a unique prefix, and I think that's mostly settled now. There are some folks I've discussed this with, and I think there's already a Java implementation somewhere. And, as I said, Python and Go. It's pretty trivial to implement, especially since there's already a URL parser available in most cases on all the platforms.

So we need some help, if you want to contribute something. I hope that my project will be accepted again as a Summer of Code project this year, and I'll put up a few tasks for doing that. It can be a fun, FizzBuzz-like programming project, where you have four or five different language implementations to do as a student. Same thing. To help the
implementation, we have a single unit test suite, which is just a set of expectations in JSON, so we can have things which are mostly compliant.

And some credits to the contributors. Again, I said it's a social experiment, and we had a lot of contributions there. And that's it. Thank you very much. Any time for questions? I believe we'll be back for the panel.
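
The shared test suite mentioned above, a set of expectations in JSON that every language implementation runs, can be sketched as follows. This is a minimal illustration, not the project's actual suite: the JSON field names, the two sample cases, and the names `toy_parse` and `run_suite` are assumptions made for this example, and `toy_parse` is a deliberately tiny stand-in for a real parser.

```python
import json
from typing import Callable, Dict, List

# Assumed layout: one object per case, with the input string and the expected elements.
SUITE_JSON = """
[
  {"description": "simple npm package",
   "purl": "pkg:npm/angular@1.2.3",
   "type": "npm", "name": "angular", "version": "1.2.3"},
  {"description": "name only, no version",
   "purl": "pkg:gem/rails",
   "type": "gem", "name": "rails", "version": null}
]
"""


def toy_parse(purl: str) -> Dict[str, object]:
    """Stand-in parser: handles only 'pkg:type/name[@version]'."""
    _, _, rest = purl.partition(":")
    pkg_type, _, rest = rest.partition("/")
    name, _, version = rest.partition("@")
    return {"type": pkg_type, "name": name, "version": version or None}


def run_suite(parse: Callable[[str], Dict[str, object]], suite_json: str) -> List[str]:
    """Run every JSON expectation against a parser and collect mismatches."""
    failures = []
    for case in json.loads(suite_json):
        actual = parse(case["purl"])
        for key in ("type", "name", "version"):
            if actual.get(key) != case.get(key):
                failures.append(
                    f"{case['description']}: {key}: "
                    f"expected {case.get(key)!r}, got {actual.get(key)!r}"
                )
    return failures


print(run_suite(toy_parse, SUITE_JSON))
```

Because the expectations are plain data, a Go, Java, or Rust implementation could load the same file and run the same checks.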