Hello everyone, and welcome to this presentation about REUSE. In the next few minutes I would like to present the principles of REUSE and guide you through the steps by which you can make your project, your community or even your company an adopter of the REUSE principles. We will also have a look at the tooling, so the helper tool and the API, and furthermore have a look into the future, so how REUSE will develop. My name is Max Mehl, I work for the Free Software Foundation Europe, which is a charity that empowers users to control technology. First of all, let's have a look at the common issues that we see in free software compliance. Of course, I'm probably not telling you anything new, but let's go through it step by step. So we often have the issue that we don't have enough information, or any information at all, about the licensing and copyright situation of a single file or even the whole project. This can even be about our own code, but it is mostly about third-party code. Then, for instance in the Linux kernel project, but probably in many more as well, there is the issue of license ambiguity. So for instance, if you have a source code file in which you can find a license header, a notice, then it's often unclear which GPL version is meant by that. And even if someone, mostly companies but also communities or individuals, finds out or clarifies this information, it is often only stored in private databases. So in silos, which are not shared with each other. And this becomes even more complex if there are changes by upstream, for instance with software updates, because these would, in theory at least, trigger rechecks. So people in different companies and different areas have to check the licensing and copyright information of a project all over again, and they do not share this information.
And because free software compliance is quite complicated, and of course we cannot solve everything, developers need a lot of training, and so do individual developers who want to produce free software. And because many projects and initiatives try to improve the situation, all of this often leads to conflicting practices. So this is the situation that we saw in 2017, when we at the FSFE thought about how we could improve something. And the starting point of our thoughts was: should we create another initiative that tries to fix the issues that have already been created by different parties through diverging practices, or should we rather try to solve them before they are created? Wouldn't that be more clever? Well, of course this has been tried in the past. There have been some guidelines, some practices, on how you can make free software compliance easier. But as I said, these are often quite complex and conflicting, and do not really cover all the edge cases that you will find. What we came up with is REUSE. And REUSE tries to solve everything with a few simple principles. So the starting point was that we want to make copyright and licensing information available for every single file in the repository. So that there's not only a license file or multiple license files, but that this information is available for every single file. So that every single file in the project can just be taken out and be reused in your own projects, while you know which copyright and licensing situation applies to it. And the goal is also to avoid silos by storing this information inside of the repository. So there are no external databases; the idea is to have the information about copyright and licensing as close to the files as possible. Another goal is that this information shall be readable by humans and machines alike, so that tooling can pick up this information and process it, but also that humans understand what actually happened.
And furthermore, all of this should be compatible with existing initiatives wherever possible. We didn't want to reinvent the wheel, but instead do something on top or below, however you see it, that improves the whole situation and also makes life easier for other license compliance initiatives. But overall, the most important goal for us was to make licensing easy and fun for developers. Because right now it isn't. There's so much conflicting information, and there are guidelines that do not really cover all the cases. And we wanted to provide best practices that can be applied by any developer for any project size, no matter whether it's a hobbyist project or the Linux kernel. So how did we do that? We came up with three simple steps. The first step sounds simple, but it's quite important. We ask developers to choose and provide licenses. So they're asked to make an informed choice about the license or licenses they want to have, and to provide these licenses in a standardized form, so as license text files. Then the next step is probably the most burdensome one, I would say. People who want to adopt REUSE are asked to add copyright and licensing information to every single file in a repository. The idea is to have this as close to the file as possible, and we'll see in the next slides how this happens. And the last step is to confirm REUSE compliance. So ideally, if the first and the second step have been done, then the third step is just a matter of one command with our REUSE helper tool or other tooling that we offer. Now let's have a look at these steps individually. So the first one, again, is to choose and provide licenses. So we ask people to use a license. We recommend, of course, free software licenses, but in theory every other license could work as well. The idea is to save these license texts inside of the LICENSES directory. So this is a directory that REUSE invented.
We didn't see a way around it, because before that there were so many different ways of storing multiple licenses in a directory or in a repository. So the easy way is to put everything inside of the LICENSES directory and to name the license text files according to their SPDX license identifier. This way everyone, humans and machines, knows where to find the license text for a license and which licenses exist in a repository. The second step, as I said, is adding copyright and license information to every file in this repository. Ideally this happens by modifying the header of the file. So we add information about the license and the copyright holder as a comment. For the licenses we insist on using the SPDX-License-Identifier tag, followed of course by the SPDX license identifier itself. For the copyright holder you can use many formats. We recommend using the SPDX-FileCopyrightText tag because that's quite unambiguous, but traditional copyright lines are supported as well. So in the example below you can see that we have a comment here with the information that the file is under a GPL-3.0-or-later license and that there are two copyright holders. What happens if we cannot edit the file directly? So for instance for a binary file like a picture, or for files that do not support comment headers, like JSON. The easiest way is to add a separate file named after the original file, with .license appended. So for instance in the example below we have a picture called cat.jpeg and we create another file called cat.jpeg.license. That's just a text file which has to contain the same information about license and copyright that we have seen before. So in this case it's CC-BY-4.0 and copyrighted by a great artist. Now what happens if we have many of these binary files, or other cases where separate license files would not work? In this case we support the DEP-5 format that has been introduced by the Debian project.
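The comment header and the companion .license file described above could be produced roughly as follows. This is only a minimal Python sketch, not the REUSE helper tool: the function names spdx_header and write_sidecar and the holder names are made up for illustration, and the "#" comment prefix is an assumption that fits files such as a Makefile.

```python
from pathlib import Path


def spdx_header(license_id, holders, prefix="#"):
    """Build a comment header: one copyright line per holder, then the license tag."""
    lines = [f"{prefix} SPDX-FileCopyrightText: {holder}" for holder in holders]
    lines.append(prefix)  # empty comment line between copyright and license
    lines.append(f"{prefix} SPDX-License-Identifier: {license_id}")
    return "\n".join(lines) + "\n"


def write_sidecar(target, license_id, holders):
    """Write '<target>.license' for a file that cannot carry a comment header."""
    sidecar = Path(str(target) + ".license")
    # The sidecar is a plain text file, so no comment prefix is needed.
    body = "".join(f"SPDX-FileCopyrightText: {holder}\n" for holder in holders)
    body += f"\nSPDX-License-Identifier: {license_id}\n"
    sidecar.write_text(body, encoding="utf-8")
    return sidecar


if __name__ == "__main__":
    # Hypothetical holder name, used only for the demonstration.
    print(spdx_header("GPL-3.0-or-later", ["Jane Doe"]), end="")
```

Calling write_sidecar("cat.jpeg", "CC-BY-4.0", ["A Great Artist"]) would create cat.jpeg.license next to the picture, with the same two tags that a commentable file carries in its header.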
The DEP-5 file is only one file, stored in the .reuse directory at the root of the repository. It has a different format than what we have seen before, but it's quite simple to use. So we say in this case that all the files in the image directory are under the copyright of the great artist and under the license CC-BY-4.0. So this also works for declaring licensing and copyright in bulk for many files. The third step is to confirm REUSE compliance. For this we have a tool. This was one of the first things that we invented in REUSE, and it has one command, reuse lint. So in this case, for the example repository that we went through here, and we will see that later in practice, it tells us that we now have information available for all six files in this repository and that the project is compliant with the REUSE specification. So this is the ideal output. We will see in a later step what this looks like if the situation is not so ideal. So, summarizing, what are the specialties of REUSE? As I said, we want to be compatible with existing practices, but this was not always possible. So the LICENSES directory is a specialty. But for us this is the cleanest solution, and by the way it's already supported by the CCI initiative and also by the Linux kernel project. So they already recommend or practice storing the license text files in this directory. Another specialty is to add the information about licensing and copyright inside the files, in the comment header. Of course this has happened before, we just formalized it here. We provide alternatives for uncommentable files, and all of this results in an unambiguous declaration of copyright and licensing information for every single file in the repository. So this is quite unique. Over the course of the years, and as I said we started in 2017, we developed a few components of REUSE. Everything started with the best practices, and this is still the most important part.
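To make the idea behind the compliance check concrete, here is a heavily simplified Python sketch of it. It is not the implementation of reuse lint: it only looks for the two SPDX tags inside each file or inside its .license companion, it ignores DEP-5 declarations and traditional copyright lines, and the function names are hypothetical.

```python
from pathlib import Path

TAG_LICENSE = "SPDX-License-Identifier:"
TAG_COPYRIGHT = "SPDX-FileCopyrightText:"


def has_tags(path):
    """Return True if the file mentions both a license tag and a copyright tag."""
    try:
        text = path.read_text(encoding="utf-8", errors="ignore")
    except OSError:
        return False
    return TAG_LICENSE in text and TAG_COPYRIGHT in text


def untagged_files(root):
    """List files below root that carry no tags, neither inline nor in a sidecar."""
    root = Path(root)
    missing = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root)
        # Skip version-control data, license texts and the sidecar files themselves.
        if rel.parts[0] in {".git", "LICENSES"} or path.name.endswith(".license"):
            continue
        sidecar = path.with_name(path.name + ".license")
        if has_tags(path) or (sidecar.is_file() and has_tags(sidecar)):
            continue
        missing.append(rel.as_posix())
    return missing


if __name__ == "__main__":
    for name in untagged_files("."):
        print("missing copyright or licensing information:", name)
```

An empty result corresponds to the ideal case from the talk, where information is available for every file; the real tool additionally verifies the license texts and understands the bulk declarations.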
So we have a quite formal specification which is ready to be integrated by communities, but also by industry actors. You can read through it, understand what REUSE is and know how to comply with it. But to lower the entry barrier, because we also want to be friendly to hobbyist developers, we provide a tutorial that guides you through the example repository, and also an FAQ. The FAQ doesn't only answer questions about REUSE itself, but also about basic licensing and copyright issues. So with this we have a resource that empowers developers to understand what they actually do and what they should do to make their project reusable. But in order not to rely on manual work, we developed the helper tool. Initially this was just a linter, checking whether a project is REUSE compliant, but a few nice features have been added that make it possible to make a repository completely REUSE compliant with only a few commands. So we're going to have a look at how this works now. First of all, we have a look at what this example repository looks like. As I said, this is public, so you can download it and reproduce what we do here. So which files do we have here? We have a .gitignore file, a Makefile, a README file and a source code file. These are plain text files: one is a C file, another is a Markdown file and so on. But we also have two binary files here, a cat and a dog picture. So how does REUSE see this repository at the start? We see that copyright and licensing information is missing, which is obvious, because right now these files do not carry any information about this. So the summary is that we have zero files with copyright information and zero files with license information, and of course this is not compliant with the REUSE specification. But we will change that very soon. So first of all, we add headers to the C file, to the Makefile and also to the README file, with the copyright holder Jane Doe and the license GPL-3.0-or-later.
So we just run this command, and it says that it successfully changed the header of these three files. Let's look at what changed exactly. We see here that the Makefile got this information as a comment header, the README file as well, and the C file. Also note that the comment syntax is chosen according to the file extension, so there shouldn't be any issues. You can also customize this with templates. But of course there are still three files missing. So now we add information about copyright and licensing to all the picture files. These are under the copyright of the great artist and under the CC-BY-4.0 license. And here we're going to apply this to every single file inside of the image directory. And here the output is a little bit different. It created a separate .license file for each picture, so the file name with .license appended. Let's see what such a file contains. So we have these files here and look, for instance, at the cat.jpeg.license file. It carries the same kind of information, so we have the SPDX-FileCopyrightText tag and the SPDX-License-Identifier tag. So one file is missing. We add this for the .gitignore file; in this case the license is CC0, because there's nothing copyrightable in it, and yeah, this is pretty standard. So what's the situation now? Let's run reuse lint. Oh, the project is still not compliant with the REUSE specification. Let's see what happened. Now we have missing license texts here. So it successfully understood that we have three different licenses, but of course we haven't provided the license text files yet. Well, we could now download these manually, but the easiest way is to use the REUSE tool as well. So we have the command reuse download --all. You can also download a single license. And yeah, it successfully downloaded those three licenses. So let's see what the situation is now. And yes, very good. So within a few commands our whole project is now completely REUSE compliant, with three different licenses stored in a nice location.
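The "missing license texts" situation from this walkthrough can be illustrated with a short Python sketch. It collects the identifiers used in SPDX-License-Identifier tags and compares them with the text files present in the LICENSES directory. The function names are hypothetical, the expression handling is deliberately naive (it splits on whitespace and parentheses and treats everything except AND, OR and WITH as an identifier that needs a text file), and it is not how the REUSE tool itself works.

```python
import re
from pathlib import Path

LICENSE_TAG = re.compile(r"SPDX-License-Identifier:\s*(.+)")
IDENTIFIER = re.compile(r"^[A-Za-z0-9][A-Za-z0-9.+-]*$")
OPERATORS = {"AND", "OR", "WITH"}


def used_license_ids(root):
    """Collect every identifier that appears in a license tag below root."""
    root = Path(root)
    ids = set()
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        # License texts and version-control data are not scanned for tags.
        if path.relative_to(root).parts[0] in {".git", "LICENSES"}:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for match in LICENSE_TAG.finditer(text):
            for token in re.split(r"[\s()]+", match.group(1)):
                # The pattern also drops comment closers such as "*/" or "-->".
                if IDENTIFIER.match(token) and token not in OPERATORS:
                    ids.add(token)
    return ids


def missing_license_texts(root):
    """Return the identifiers that have no 'LICENSES/<identifier>.txt' file."""
    root = Path(root)
    licenses_dir = root / "LICENSES"
    present = set()
    if licenses_dir.is_dir():
        present = {p.stem for p in licenses_dir.glob("*.txt")}
    return sorted(used_license_ids(root) - present)


if __name__ == "__main__":
    for license_id in missing_license_texts("."):
        print("missing license text:", license_id)
```

In the example repository this kind of check would report the three identifiers before the download step and nothing afterwards.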
So everyone knows the copyright and licensing situation for every single file. And as a small bonus, we also have an spdx subcommand that just outputs an SPDX bill of materials. Many of you in this devroom are probably aware of the format. And because we have the information available for every single file, this is pretty simple for the REUSE tool to output. So as you have seen, the helper tool makes things so much easier, but we didn't stop here. We also wanted to make this more public, so that people can see the status of other repositories as well. For this, we created an API. This API allows you to register your project with the API service by filling out just one form. And it creates a badge depending on the current REUSE status of a repository. So this way, people can add a dynamic badge, for instance in their README file or on top of their project, to show people that their project is REUSE compliant and to be warned if something is wrong. To make use of the API, we start here at the reuse.software home page, and we go to the API button here. And now people could check their repositories. As I said, this is quite simple: just a name, an email address, and of course the project URL. We don't do this here right now, but we have a look at the already compliant projects that we have. So we see here that we currently have 433 projects registered with the API that are also REUSE compliant. And let's just have a look at different projects that are already here. So let's have a look at the OSS Review Toolkit, which most of you should already be aware of. The output that one sees in the API is firstly the badge. So the Review Toolkit is fortunately REUSE compliant, very good. So we can take the badge here and copy it to the README file, which the project already did. We even have parsable JSON output. But probably most interesting for humans is the plain lint output.
So we see here what the REUSE tool outputs. And as you can see, this is the same output as that of the REUSE tool that we've seen before. And we see that almost 3,000 files inside of this repository are REUSE compliant. And this is also the output that people see when they click on the badge at the top of the repository. So as you can see, there are a lot of ways in which people can adopt the best practices, and a lot of things which make it easier to do that. And there is much more. If developers prefer to see the REUSE status of their developments continuously, they can also add the REUSE tool, or the lint command specifically, to their CI pipelines or to pre-commit hooks, and much more. So as you can see, it's quite simple to adopt REUSE. But we do not stop here, of course. There are a few things that I would like to share with you which we are currently developing and where we'd love to have your input as well. So first of all, we will improve the tooling, of course. So even more automation regarding the helper tool and the API. We have many ideas, and you can contribute by going to our public issue trackers and of course to the code base. But we will also work on the specification. It is already very stable right now, and I would say we cover most edge cases, if not all, but we want to provide more flexibility. So one idea is to have a REUSE YAML file. This should soft-deprecate the DEP-5 file format, which is a little bit strange and not so fitting for our purposes. Instead, yes, we would invent our own file format, which makes it even simpler and more flexible to add these REUSE YAML files to different parts of the repository. We also want to enable people to declare snippets inside of their source code files. So for instance, if someone copied something from Stack Overflow or a different site, this has a different copyright holder and license.
And the specification shall enable people to mark the licensing and copyright information of this specific snippet in their source code files. We also want to integrate REUSE better into source forge platforms, but also with other ongoing free software compliance initiatives. This is quite important for us because, as I've said in the beginning, we want to be compatible with existing license compliance initiatives and work better with them to make everyone's life easier. But most importantly, we want to spread the word about REUSE. So we will support communities and companies with adopting the best practices that REUSE offers. We have a few nice ideas here, and we'd love to have your support when this goes public. So yes, this is very important, and please also help us to spread the word. Speaking of spreading the word, who's already using REUSE? Well, unfortunately, I cannot tell you, because this is not a centralized project. Everyone can do it on their own source code on different forges. But what we know is this. As you've seen, we have more than 400 projects registered with the API that are successfully following the REUSE specification. There's also a majority of the projects that are funded by the EU via the Next Generation Internet project. And this is really great, because now we have publicly funded code that is reusable for everyone, with clarity about the licensing and copyright information. What's also nice is that the KDE community, which you hopefully already know because it's quite old and big, adopted the REUSE best practices in their policies. So every developer is now asked to follow the REUSE standards. And they didn't only put this there in theory, but also in practice. So their frameworks, which form the foundation of many KDE applications, are also completely REUSE compliant already.
On the corporate side, we know that REUSE made its way into the policies of Siemens, SAP and Liferay, but also, for instance, into the energy working group of the Linux Foundation. Also, the Linux kernel is already partly REUSE compliant. And given the age and the complexity of this project, this is quite good news, and we hope that they make good progress. So is your project already REUSE compliant? If you want to adopt REUSE or get more in touch with it, the first step would be to sign up to the mailing list to take part. It is quite low volume, no spam, promised. But this is the primary way to get informed about the latest developments around REUSE. And this is not only top-down communication; you can take part in decisions and in discussions which form the future of REUSE. The next step would be to make one of your projects REUSE compliant. See it as a demo. As you've seen here before, it's quite simple to do that. You can use the REUSE tool. You can do it manually. Just try it out and give us feedback. You can also integrate REUSE into your community, like KDE did, or into company policies, like Siemens, SAP and so on did. Furthermore, you can contribute code to REUSE itself. As I've said, we have all the code, all the documentation, all the specifications public. You can create pull requests. You can give feedback or create issues. And you can also become a corporate sponsor, like Siemens does and did for many years. But most importantly, please help others to adopt REUSE. Spread the word, tell your company, tell your community about this, because this benefits everyone. The more projects use and adopt REUSE, the easier it is for everyone to just reuse parts of others' third-party code. And that would be an ideal situation, right? So I hope I was able to give you a nice introduction to REUSE. I hope you enjoyed it. And now I'm very much looking forward to your questions. Thank you.
Here we go. Thank you very much, Max. That was a nice talk, and I really enjoyed hearing about REUSE. So there's a lot that you can take away. And I'm very, very interested in the... I have the duplication. Sorry. There was a duplication, so I had to stop the one. I was very interested in the efforts concerning the snippet declaration. I think this is a very important thing and can save us a lot of work. Can you elaborate a little bit more on that and on what the status is? Yeah, so it's... Maybe I can also add a question to that, because if you care for snippets, then I would also love, as an additional feature, to check whether the snippet license is compatible with the file license, and to give a warning or something like that. So is this also considered? Yeah, so I have to say we had a short discussion on the mailing list, on the REUSE mailing list, which you are very welcome to join. And yeah, the current status is that we want to do that, definitely. The thing is that we would like to coordinate this with SPDX, since they already have a few tags that one can use. Frankly speaking, we are not so happy with them, because they have a different naming scheme in some parts, so that may be a little bit confusing to users. So it's on my personal to-do list to get in touch with SPDX very soon, also regarding the REUSE YAML file, just to coordinate here. Regarding the features, we already clarified that we want to keep it somewhat simple. So the idea is, for instance, not to have nesting. So for instance, you cannot take a snippet that contains a snippet and make this a snippet of your own code, because that would be really confusing. So basically you can start a snippet and end a snippet, start a snippet directly afterwards again, and so on. So that's a feature limitation, but it makes it so much easier on the tooling side and also for users. Regarding the check whether the snippet license matches the file license, that's very interesting.
We didn't discuss this yet, but it would make sense to have some kind of check or warning there. But as you noticed, or as I've said, REUSE is not a tool to check whether you are license compliant in the sense that your licenses fit together. Basically, you just mark your files with the copyright and licenses, and the analysis still has to be done by the users. Sure, I agree, but in the end, if a person integrates a snippet with a license which is not compatible with the file license, then this file cannot be reused, because there's a legal problem which cannot be solved. Plus it makes it... I understand your point. We should discuss it. As I said, that's a really good point. I'm just a little bit afraid to add a license compatibility checker inside of the REUSE tool, because that could be really complicated given the number of licenses that are on the SPDX list alone. And this becomes even more complicated if you add custom licenses, for which there are a number of good reasons. Like for instance, for MIT licenses, there's the edge case where you have custom licenses, at least for the REUSE tool. So how to check that? But it's a good point. We should think about this. Like perhaps some common things with CC BY-SA, when people copy something from Stack Overflow or so, because this is probably one of the most common origins of snippets. Coming from the capabilities point of view, I would... Sorry. But I think, from the capabilities point of view, I would put this into a kind of solver, where you really make all your legal decisions. So if it is outlined that there is a certain license, then I think this is already enough. It is then up to you to interpret whether you're going to use it or not. Mm-hmm. Yeah. I mean, definitely, something like a flag saying this file contains multiple licenses caused by snippets. That may be a good point. Yeah. Absolutely. People can also use license expressions and license a file under conflicting licenses.
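The two ideas from this exchange, snippets that may follow each other but never nest, and a simple flag when snippets bring additional licenses into a file, can be sketched in a few lines of Python. This is purely illustrative: the speaker says the syntax was still to be coordinated with SPDX, so the marker names SPDX-SnippetBegin and SPDX-SnippetEnd are an assumption borrowed from SPDX's snippet tags, and the function names are hypothetical. The sketch compares license strings literally and makes no compatibility judgement.

```python
BEGIN = "SPDX-SnippetBegin"  # assumed marker name, not confirmed by the talk
END = "SPDX-SnippetEnd"      # assumed marker name, not confirmed by the talk
LICENSE = "SPDX-License-Identifier:"


def licenses_by_region(lines):
    """Return (file_licenses, snippet_licenses); reject nested or unclosed snippets."""
    file_licenses, snippet_licenses = set(), set()
    in_snippet = False
    for number, line in enumerate(lines, start=1):
        if BEGIN in line:
            if in_snippet:
                raise ValueError(f"line {number}: nested snippets are not allowed")
            in_snippet = True
        elif END in line:
            if not in_snippet:
                raise ValueError(f"line {number}: snippet end without a begin")
            in_snippet = False
        elif LICENSE in line:
            expression = line.split(LICENSE, 1)[1].strip()
            (snippet_licenses if in_snippet else file_licenses).add(expression)
    if in_snippet:
        raise ValueError("snippet is never closed")
    return file_licenses, snippet_licenses


def has_foreign_snippet(lines):
    """Flag a file whose snippets declare a license the file itself does not declare."""
    file_licenses, snippet_licenses = licenses_by_region(lines)
    return bool(snippet_licenses - file_licenses)
```

A file that declares GPL-3.0-or-later in its header and contains one snippet marked CC-BY-SA-4.0 would be flagged; whether that combination is legally acceptable is left to the user or to a separate tool, as discussed above.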
Well, then they are the copyright holders, but yeah, true. How do you handle the multi-license case? What do you mean by that? There are sometimes components and things that come with multiple licenses assigned. Yeah. And as a user, I have to make a choice. Okay. But can REUSE identify and treat this? Sure. We support the full SPDX license expressions. So you can declare as many licenses as you want. There's no limitation here. We use the library by SPDX. You can say, for instance, that something is licensed under GPL and LGPL or MPL. So everything is possible in this regard, whatever is possible with SPDX. Cool. Sorry, Oliver, I didn't want to cut you off. No, no, no problem. No problem. So you have mentioned that many of the EU-funded projects are going for REUSE conformance. So what's the big thing on the REUSE, let's say, roadmap in order to make it even more used in the ecosystem? Yeah. So the first thing, which I've mentioned already, is to ease the tooling and the specification. This REUSE YAML file, I think, will perhaps create another uptake and make it easier. But also the snippets might solve the last edge cases that make it harder to integrate REUSE, or even impossible if people want to cover the legal edge cases. Regarding the uptake, I hope that more communities and companies join in adopting REUSE. We are also planning to make that easier. I'm just teasing this now, because it actually isn't public yet, but we're planning to run a project this year, in the next few months, for which free software projects will be able to apply to the FSFE to ask for help. So basically they receive the same support that we give to these NGI projects. Our experts will help them with identifying potential issues and guide them along the way, along the roadmap to become REUSE compliant.
And this shall lower the threshold for projects, especially if they're quite uncertain about a few licensing or copyright issues. And I hope by this we will also spread the word a little bit and show that adopting REUSE is not as hard as it may seem at first sight. Great. When I think back to a year ago, when I came across you for the first time, we had one short discussion about assigning licenses and writing things into files that are probably licensed by someone else. Because if I have my own code and I'm using it for my own code base, it's all fine. Did you meanwhile think about what happens if somebody takes code into his own code base and then, by formatting it with REUSE, accidentally changes things, overwrites things, or assigns things to a license which might not be the correct one? Unfortunately not. I mean, this is a real problem. I see the point. The actual reason why we kickstarted REUSE is that this shouldn't happen anymore. That people do not just copy code or files randomly from the internet and put them into their own repositories, and a few months later nobody knows that this file actually belongs to someone else and is licensed under a different license or comes from a different copyright holder. So the idea of REUSE is that if the licensing and copyright information is ideally inside of the file, or as close to the file as possible, this doesn't happen anymore. So if I copy something over from a REUSE compliant repository and I myself am a REUSE user, then I will be warned by my REUSE tooling that there's a new copyright holder and potentially a different license in my repo now. So this way it's much easier to reuse.
I'm not so sure, but I'm looking forward to feedback, whether there's some way for the REUSE tool to detect that I just reused third-party code under a different license without respecting it, or that I just slapped my own copyright on a third-party file. I'm not so sure whether our quite slim tooling is able to do that without adding some AI or so, which we definitely want to avoid, because, as I've also said in the talk, the goal is to make it as simple as possible and to keep the tool very slim, easy to install and especially easy to use. Totally agree. I mean, in the end a fool with a tool is still a fool, and the best tool won't avoid that. That's for sure something that you cannot cope with, but you're right, I think it's probably a good choice. I mean, in the end it helps, and that's something that is pretty clear: it helps to get the new stuff in right from the beginning. And you're right, if it's written into every file that is inside the repository, it is close to the source and it is at least there. It's still not protected from manipulation, but this is something that is probably not really worth thinking about. Absolutely, and there's also room for other initiatives. Like I said, we want to cooperate with other people, and we also want to prepare the ground for initiatives or best practices that come after REUSE in the supply chain. Basically, REUSE is upstream, so it works directly at the source. So if we are able to make the work of tools like FOSSology easier in the end, to detect such issues that have been created by manual mistakes, then that's even better, and we're very happy about this.
Absolutely. I'm pretty sure, when I think about scanning, that having this information in place will definitely help and will clarify things a lot. Because wherever you have to go to look for a component, you are always in the discussion: is it a declared license or is it an effective one? But if it's in the source, it's effective, and that's pretty clear if you are the original source. Yeah, okay. How much time do we have left? Five minutes or so? Still some minutes. One comment on REUSE and FOSSology: for example, recently an agent was added to the FOSSology toolkit which handles especially the SPDX license expressions. The agent is called Ojo, and recently I scanned a package with FOSSology, and it's really not a big deal anymore to analyze packages, even if they have different licenses inside. That's great to hear. So are there any other questions that came up? Let me see. Not sure whether this is associated here. Oliver, I think if there are no more questions, we might switch over to our last talk. I guess you had the chance to look into it, so you can probably use this session to introduce it, because then we have an option to say something about the talk before it starts. Seems to be muted, sorry. Classic problem. Yeah, thank you. Thank you very much, Max, it was a very interesting and inspiring talk. And I think, looking also at the chat in the devroom about REUSE and more ideas, this is really a good thing. The next talk today will be given by James Curtis and his colleagues, and he will talk about continuous integration tooling to deliver OJ compliance reporting. I think it's not that long a talk, but I found it very interesting too, and yeah, I'm happy to have James and his colleagues there. And I'm handing over to the FOSDEM organizers then, to start the video track in a couple of minutes or seconds. So thank you. Thank you, bye. Thank you, bye bye.