to my talk about how to simplify copyright using SPDX IDs. I'm Stephan Lachnit. I'm a contributor to Debian. Together with Jonathan Carter and the Games Team I maintain, for example, the gamemode package.

So what is SPDX? SPDX is the Software Package Data Exchange. It's a working group of the Linux Foundation, and the most important thing they do is maintain a list of common licenses. This is a part of the list; it's pretty long. As you can see, there are a lot of licenses and they all have a short identifier. That's the middle column here. So for example, for the Academic Free License, that would be AFL. You can also read the license text and the default license header. And they also have a column showing whether the license is approved by the FSF or, more interesting in the case of Debian, by the Open Source Initiative.

So why would one use these identifiers? First of all, they're short. Saying or writing "AFL" is shorter than writing the full name of the license. They're also standardized, so it's clear what is meant. A common problem is, "oh, it's a BSD license", but there are several BSD licenses. If you use these identifiers, it's exactly clear which license you're referring to. They're also easy to parse. In the case of the GPL, you have a pretty long header for files, and with SPDX license identifiers that is shortened to basically one line, as you can see in the examples. It's also possible to state several licenses. For example, I've written down MIT OR FSFAP. So that's also pretty easy to parse if you have multiple licenses.

So the next question is, why do we want to use them in Debian? Well, we have DEP5, which is the machine-readable copyright format. But it would be much more readable for both machines and humans if we didn't have to write out basically all license texts; only a couple of licenses, like the GPL, are stored on Debian systems.
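To illustrate why the one-line form is easy to parse, here is a minimal Python sketch. It assumes the common `SPDX-License-Identifier:` tag convention for file headers and only handles simple expressions joined with `AND` / `OR`; `WITH` exceptions are not handled. The function names `find_expression` and `license_ids` are made up for this example.

```python
import re

# Matches a tag line such as "// SPDX-License-Identifier: MIT OR FSFAP".
# An optional closing "*/" of a C-style comment is left out of the capture.
SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*(.+?)\s*(?:\*/)?\s*$")


def find_expression(source_text):
    """Return the license expression of the first SPDX tag line, or None."""
    for line in source_text.splitlines():
        match = SPDX_TAG.search(line)
        if match:
            return match.group(1)
    return None


def license_ids(expression):
    """Split a simple expression into identifiers (AND/OR and parentheses only)."""
    tokens = re.findall(r"[A-Za-z0-9.+-]+", expression)
    return [token for token in tokens if token not in {"AND", "OR"}]


header = "// SPDX-License-Identifier: MIT OR FSFAP\nint main(void) { return 0; }\n"
print(license_ids(find_expression(header)))
```

A real tool would use a full SPDX expression parser; this only shows that a single regular expression is enough to pick the tag out of a file header.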
Having more of it standardized is easier to read for humans, because it's shorter, and for machines as well. It also shortens the copyright file, in some cases pretty dramatically. And if we used SPDX identifiers for the copyright file, then when creating a package with dh_make we could simply check for file headers with an SPDX identifier, which is pretty easy to parse as we've seen before, and automatically fill in the copyright file. So that would be easier for maintainers. We could also automatically check whether a package is DFSG compliant or not, if it only has SPDX licenses, because we know which licenses are DFSG compliant and which are not. So that would also be pretty helpful, I guess. Also, we could make more precise statistics about license usage, because the identifiers are standardized.

It would also remove some clutter. Right now you have the license assignment, which file has which license, in the same file as the license text. If we split these into two files, one with the license texts and one with just the assignments, that would be, at least in my opinion, a little bit cleaner and would remove a little bit of the clutter.

So how would it look? Well, this is an example from MangoHud, and it's more like a worst-case example. This is a part of the copyright file; it's a little bit longer. As you see, in the beginning there are the normal file assignments, but if you go down you have like four licenses, and I don't think it's nice to look at. With SPDX identifiers, it would simply look like this. Nothing really changed, I simply cut the picture, but you get the idea.

So how could one actually implement it? I thought of three new entries. One would be License-SPDX.
So looking back, instead of writing License: MIT, you would write License-SPDX: MIT. If there is a license which is not covered by the SPDX license list, one would store it in a separate file. And something I'm not really sure is that useful, but for a short license, maybe License-Text. Below you can see some examples of how it would look with SPDX or a file.

This is pretty helpful for non-free packages as well, I think, at least for some. The SPDX licenses are not all free according to the DFSG, so some of them we might not want on the system. But a package which uses them could still be in non-free. What we can do is have a package which contains all the non-free SPDX licenses, and automatically depend on it with debhelper. So that would be pretty neat as well.

So now to the more interesting part: what is the current state of SPDX adoption in Debian? DEP5 is actually pretty close. A lot of people write GPL-3+ or something like this, and this is actually from the SPDX version 2 standard. There is a wiki entry which collects the biggest differences between current practice with DEP5 and SPDX identifiers. But a lot of the wiki page is outdated, and there's not really much progress. At least I don't see anyone really working on this. One thing that is nice is that the SPDX license package is in NEW; I think it entered NEW four months ago or something like this. But that's it. I haven't seen any initiative for a change of DEP5.

So this is where the discussion part comes in. The talk is basically over. Now is the part where I want to discuss how one could implement it, and maybe reasons not to implement it. Maybe I've overlooked something, but I think it would be pretty neat if Debian adopted this standard.
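As a rough illustration of the proposed License-SPDX entry and the automatic DFSG check described in the talk, the following Python sketch reads the field from a copyright file and flags identifiers that are not in a known-free set. The `License-SPDX` field is only the proposal from this talk, not part of the current copyright format. The set `DFSG_FREE_SAMPLE` is a tiny illustrative sample, not an official Debian list, and the function names are hypothetical.

```python
# Illustrative sample only; NOT an official list of DFSG-free licenses.
DFSG_FREE_SAMPLE = {"MIT", "BSD-3-Clause", "GPL-2.0-or-later", "GPL-3.0-or-later"}


def spdx_fields(copyright_text):
    """Collect the values of the proposed (hypothetical) License-SPDX field."""
    values = []
    for line in copyright_text.splitlines():
        name, separator, value = line.partition(":")
        if separator and name.strip() == "License-SPDX":
            values.append(value.strip())
    return values


def needs_review(identifiers, known_free=DFSG_FREE_SAMPLE):
    """Return the identifiers that are not in the known-free set, sorted."""
    return sorted(set(identifiers) - set(known_free))


sample = """\
Files: *
Copyright: 2020 Example Author
License-SPDX: MIT

Files: data/*
Copyright: 2020 Example Author
License-SPDX: CC-BY-NC-4.0
"""

print(needs_review(spdx_fields(sample)))
```

Anything returned by `needs_review` would still need a human decision; the sketch only shows that standardized identifiers reduce the check to a set lookup.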
So in the hypothetical case that Debian would adopt it, what are the main things that would need to be done? First of all, SPDX releases versions of the license list. I think the current version is 3.10. So if we write up a standard, we should state in that standard which SPDX version we're referring to, or at least the base version. I think if they add new minor versions, they won't remove or rename licenses, they would only add some, but on major versions they might remove or rename licenses. So for example, one could say for the bullseye cycle, let's take version 3.10 and write it in the standard. Then obviously the licenses would need to get into Debian somehow. A proper standard needs to be written, and then lintian would need to be updated to be able to parse the new copyright format, and debhelper as well for the non-free stuff I talked about earlier. And then later, not that important, we could add fancy parsers, like the automatic checking or checking against file headers and stuff like this.

So yeah, thanks for listening. Here are some credits for this super nice template, and let's start discussing.

Thanks a lot for your talk, Stephan, and welcome to DebConf. Now we have a couple of minutes for some questions. The first one is: what has been the hardest thing about SPDX until now for you?

I don't know if I get the question right. I think the hardest thing, when one tries to implement it, is the sheer number of licenses. Each license has to be sorted into free or non-free, which is a lot of work, but someone already did it. I actually don't know the name, and it's just stuck in NEW. So that's the hardest part.
Because there are, I don't know, like 100 licenses I think. Like I said in the talk, they have this OSI-approved column, but not all of them are actually approved, and I think some of those are actually still DFSG compliant.

Right. The second question: why not indicate the used SPDX version in the copyright file?

Yeah, I also thought about this. It would probably be a very good idea to just have a little header at the beginning. But I still think a base version is a good idea, because it only really makes sense to give a newer version when that version is also in Debian. It wouldn't be a good idea to say, okay, I use SPDX 3.10, but in Debian we only have 3.7. Then the whole point is lost, because you're missing the license text, which you actually need to comply with the copyright. So I think it would be a good idea to set a version for the cycle, and one could still say, okay, if you want to use a newer SPDX version, then you can add this field, and make it optional. But yeah, that's a really good point. I also think that would be a good idea, but optional.

Wonderful. Thanks a lot for your answers and for this great talk. Thanks for sharing about SPDX.

Yeah, thanks.