How many ways can I embarrass myself at my first FOSDEM? I can apparently use a broken USB key. And then I can follow up with a Mac. So we're here to talk today about copyleft data and theory, which we could have alternately titled "Sorry, We're Going to Talk About It Yet Again." And so let me go through really quickly what we're going to hit on today. We're going to talk a little bit about some copyleft data. In the meantime, I'm going to be loud.

So, brief introduction. Who am I? I see a lot of old faces, which is awesome. Thank you. Oh, thanks. I have written way too many copyleft licenses, including quite a few that were discussed. I was involved in the writing of some of the ones that were mentioned in the last talk, for which I apologize. I was young and stupid and didn't know any better. I am now the co-founder of a company called Tidelift. I'm not going to talk about Tidelift very much today. But one of our goals is to help free software in general by helping get free software maintainers paid to do some of the boring maintenance work that they otherwise sometimes tend to ignore, or that they just get asked about: well, why didn't you do it? And now they'll actually have, perhaps, some better reasons to do it.

So let's talk about some numbers. Libraries.io is an open source project. Tidelift hired two of the maintainers, including Andrew, who's down here in front. It covers a wide variety of package managers, everything from the Emacs package manager to Node, and lots of things in between. It covers about 2.4 million packages, or maybe it was 2.5 million this morning, I don't remember exactly. And it also includes dependency graphs. So one of the things that's been missing, we think, from the copyleft data discussion is that when you scan all of GitHub or you scan Fedora, there's this discussion of, so which are the important packages? Which are the ones that we actually care about?
Because, of course, the folks who are saying copyleft is super healthy say, look at Fedora, which by my count a couple of weeks ago is about 55% copyleft. And Fedora is a curated, high-quality, high-value set of packages. Node, with all due love for Node, has 600,000 packages. Perhaps not all of them are of the highest quality. And in fact, we know from our data analysis that some of them are literally class projects that are uploaded to Node. Whereas in something like Fedora or Debian, there's no such thing as that. Everything is of the highest value. So by using dependency information, we can see what are the top 10% most depended-on packages in Node. Presumably, the class projects are probably not in there. I admit I didn't check.

This is a graph. On the x-axis, what's the percentage of copyleft? On the y-axis, how many packages are in this repository? These are for the repositories. Wow, this is where, see, I forgot to mention, I actually am a lawyer, which is why I apparently flip the x- and y-axes now. Like, that's how long it's been since I did actual math. On the x-axis, total number of packages. On the y-axis, what percentage is copyleft? You'll see that outlier all the way to the right is Node, with 600-and-some thousand packages. It's about, I forget the exact number, it's about 3% to 4% copyleft. We've got one that's 80, I think that's our repository. It's not very big. It's about 20,000 packages, it's about 80% copyleft. All the rest of them are clustering in this lower right-hand corner. This is all the packages in our repository that have dependency information.

So what if we look at the top 10%, right? Because I think, personally, there are many concerns and questions when counting copyleft. But for me, the most cogent critique is this one of, but what about the curated? What about the best packages? It turns out the graph looks pretty much the same, right?
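As an illustration of the comparison just described, here is a minimal sketch, not the analysis actually run for the talk. It assumes a hypothetical list of package records with a `license` field (an SPDX identifier, or `None` when unknown) and a `dependents` count; the `COPYLEFT` set, the field names, and the sample data are all made up for the example.

```python
# Hypothetical, non-exhaustive set of SPDX identifiers treated as copyleft.
COPYLEFT = {"GPL-2.0-only", "GPL-3.0-only", "LGPL-2.1-only",
            "MPL-2.0", "AGPL-3.0-only"}


def copyleft_share(packages):
    """Fraction of packages with a known license whose license is copyleft."""
    known = [p for p in packages if p["license"] is not None]
    if not known:
        return 0.0
    return sum(p["license"] in COPYLEFT for p in known) / len(known)


def top_decile(packages):
    """The 10% of packages with the most dependents (always at least one)."""
    ranked = sorted(packages, key=lambda p: p["dependents"], reverse=True)
    return ranked[:max(1, len(ranked) // 10)]


# Made-up sample data: ten packages from one imaginary package manager.
packages = [
    {"name": "a", "license": "MIT", "dependents": 900},
    {"name": "b", "license": "MIT", "dependents": 350},
    {"name": "c", "license": "Apache-2.0", "dependents": 120},
    {"name": "d", "license": "GPL-3.0-only", "dependents": 40},
    {"name": "e", "license": "MIT", "dependents": 12},
    {"name": "f", "license": "MIT", "dependents": 5},
    {"name": "g", "license": None, "dependents": 1},
    {"name": "h", "license": "MIT", "dependents": 1},
    {"name": "i", "license": None, "dependents": 0},
    {"name": "j", "license": "MIT", "dependents": 0},
]

overall = copyleft_share(packages)
curated = copyleft_share(top_decile(packages))
print(f"all packages: {overall:.1%}, top 10% by dependents: {curated:.1%}")
```

Ranking by dependents stands in for "curation" here: a class project that nothing depends on falls out of the top slice without anyone having to review it by hand.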
And the takeaway here is that the vast majority of package managers, both the average, the median, all of them cluster around these numbers of 8% to 12%. So the next time, somebody asks you what's the state of copyleft in open source? Precision of some of the counts we've seen from, but gives what I hope is a pretty accurate number. Let me say first, there are some caveats about the quality of that data. These are packages in the database from these package managers with dependency information. We know at least one license, which might not be 100% accurate, but we know a license for about 79% of them. We have no license information at all for 14%. And from 7% of them, actually believe in copyleft.

I was told last night at dinner, I think, that nobody believes in copyleft anymore. I'm here to tell you that that's not actually true. And I think there are three reasons, and these are, I think, often historically treated as different reasons to support copyleft. I think it's important to understand that they overlap. So there are folks, often associated with the FSF, but not entirely, who believe that copylefts are important because of freedom. I'm not gonna belabor this to this audience because you all know most of this. Quality: this is sort of traditionally the OSI-associated thing. If we open the code, it will be better. And so copylefts are good at opening the code. And there's a third one, which I think is sometimes underappreciated, which is: if we copyleft this code, we are encouraging people to share back with us. We are getting these contributions and we are preventing free riding.

And it's important, I think, to understand, again in the discussion last night, someone told me, someone who has sometimes been publicly skeptical of Linus and Linux, mentioned that the Linux kernel was clearly and obviously the greatest success of copyleft licensing. And it was interesting because this person is definitely in camp number one.
They're all about the freedom. Linus himself, to the extent he ever indicates that he cares at all, is in camp number three. And yet, Linus's action, by trying to solve his needs under camp number three, increased freedom in a really important and tangible way. All of these reasons are still relevant. They're still valid. And yet, we've gone from 55% in Fedora to 10% now. And so what happened? Why are we only at 10%?

Before we get to some theories about why we're only getting to 10%, I assume most of you know that perhaps the most dominant theory of why we are where we are is that, oh, developers hate copyleft, or companies hate copyleft, or take your pick of any number of people who supposedly hate copyleft. So after my child was born, a couple of years back, I left the Wikimedia Foundation and I spent a couple of years serving as outside legal counsel to many of the biggest companies in Silicon Valley. And these folks are supposedly the leading, cutting edge of, oh my God, copyleft is the worst. Now it is true that for quite a few of them, in fact, their official corporate position is that copyleft is the worst. However, it really surprised me, the number of them that did not hate copyleft. More than one of them said to me, yeah, I would actually love to use a license that, when I'm doing software as a service, actually encourages my competitors to share code with me. They specifically cited, hey, copyleft sounds good because it's gonna encourage people to contribute. They're definitely not in the freedom game, but they are still in the copyleft game, at least in theory.

Here's the problem. Then I was like, well, you want a software-as-a-service copyleft? AGPL or AGPL or AGPL. And then their faces turned white. And I mean, it's Silicon Valley, so the faces sadly were mostly already white. And it's funny for, we don't get out much.
And yeah, and their interest in copyleft was not high enough to overcome the complexity of AGPL, I would say the ambiguity of AGPL. They really were interested, and AGPL really didn't solve their problem.

It's not just these big companies that I talked to that are frustrated, of course. Any of you who follow other people in free software or open source on Twitter will have seen an increase in the past couple of years in, oh my God, why are people so rude to me in my issue tracker? Why are people so inconsiderate? I'm doing this for free, but why am I doing this to myself? And the desire, again, for copyleft has not gone away. These folks also want contribution, they want participation. And so even if, in fact, they don't believe that copyleft leads to freedom, even if they don't believe that copyleft leads to greater utility, there's still, I think, quite a bit of desire amongst developers for copylefts that actually help with the contribution problem. The problem is our current licenses do none of these things.

This feels to me like a solvable problem. AGPL is not fundamentally broken. It is pretty darn hard to read, though. And it has an FAQ that's not very helpful. We've got to try to fix it. And that's where I think my frustration comes from right now. And in fact, again, I bear part of the blame as a former board member of the Open Source Initiative. If you'd asked me the question five years ago, what do you think about proliferation? The answer would have been: proliferation bad, fewer licenses, we need to streamline more, we need to streamline more. And that answer, I think, has, well, like the computer, the progress of licensing has stopped. And we've found ourselves in a situation, and the software industry has not stopped, right?
In the past 20 years, we've gone from a world where almost all software was distributed to end users to a world where the vast majority of the world's most valuable software is distributed as services. Big data is huge and never distributed. Machine learning is huge and very rarely, if ever, distributed. And that's not for nefarious reasons, often, not always, but it's not for nefarious reasons. It's not like, oh, you know, we hate you. It's just, it turns out you don't have a million cores handy. And so this machine learning code is probably not actually that helpful to you.

And so what I want to transition to in the last part of the talk, in slides that are thankfully brief, is a plea for us, for those of us in the room who are lawyers, to think again about not just licenses, but also, as in the previous talk, contracts, other legal forms. What can we do to create the effect of copyleft, to create more freedom, to create better utility, to create better cooperation? That might be something as simple as AGPL 3.1. It might even be as simple as a new AGPL FAQ, for at least some of the worst problems of AGPL's confusion. It might be a lesser AGPL. That's something that I think a lot of people in the room have had ideas about. It might be things like, some of you may have seen recently, something called License Zero. It's an attempt to rethink copyleft in the commercial context. I don't know if it's a great license, but it's an idea that really pokes and prods a lot of our intuitions about what a license should be. Similarly, there's a project called, and this is where my slides would be useful, the Civic Data Trust. The Civic Data Trust is an attempt to say, you know what, actually maybe copyright licenses aren't the right tool for this job. Maybe we should be using alternative legal forms like trusts and contracts in order to experiment with things that reach the same goal.
That have us, again, working better with each other, not hoarding our code, sharing our patches with a little bit of a nudge from the legal system. And so my plea to you as lawyers in the room is to think creatively. The next time someone you know who's on the OSI board tells you not to write a new license, ignore them. Go, seize your joy, write that new license. You know you want to. Just don't name it after your company. And for those of you who are developers, I urge you to think again about what your options are. It's in fact not true that just because you choose a copyleft license, no one is gonna use your code. It's true not everyone will, but definitely some people will, and you might find yourself as the next Linus, not just because you wrote an awesome piece of code that everybody uses, but because you wrote an awesome piece of code that encourages people to give and to rethink what they thought they knew about licensing. On that note, I will stop my ranting, close my embarrassing Mac laptop, and call for questions or just spitballs.

Do I get to, audience vote, should I prioritize questions from the institutions I've just insulted, or accidentally slighted? Actually, I'm gonna start with, so for those of you who couldn't hear, Bradley called me out on saying lawyers. I believe my slides actually say legal nerds, because I do think some of the creativity here can definitely come from outside of the lawyers. We can help you identify tools, but ultimately we can't feel the pain of failed cooperation because we're not the ones doing it. So I totally agree with you, Bradley, it shouldn't just be up to the lawyers, and I would hope that those of us who are lawyers can help support those in a way that's constructive and not just destructive, as you're correct to point out we sometimes have been. So that's one institution I may have accidentally slighted. Second institution.

So the question is, why is 8% to 12% bad?
What am I measuring that against? So the initial draft of this slide deck said, I am not making any judgment, I'm just telling you it's 8% to 12%. I realized that was a lie, so I took that out. I was comparing it by reference to Fedora. Like I said, I did a scan of Fedora 27 a couple of weeks ago, of the complete repository, 44,000 packages or so. That is not part of Libraries.io because, for various reasons, it doesn't do the core operating system, but if you want to submit a patch to Libraries.io to make it handle the core operating system, Andrew's right there. That code, by the way, AGPL.

Yeah, I mean, I think compared to what we think of as the golden age of copyleft, where there was more, how to put it, certainly I come from a far enough back age where the first piece of software that I ever released was under a copyleft license because I believed that that would encourage contribution, right? And ultimately, at some level, I still do. And so my judgment call is, I think we're a little bit worse off if the world has gone from where Fedora or Debian used to be the most dominant package managers and were predominantly copyleft, to a world where most package managers are far and away vastly permissive. And by the way, I didn't mention, those 8 to 12 numbers are about one half. It depends a lot, there's more variance in that number. But overall, it's about half weak copyleft, about half strong copyleft, and network copyleft is a rounding error.

Hope that, James or? The man with the bow tie, sorry Dave. Yeah, so James mentions that Silicon Valley should be interested in copyleft licenses because of the much stronger patent language that is typically contained in copyleft licenses. I think I agree with that, so I think that's, to some extent, a flaw of history.
As some of you may know, Intel and Oracle have both, well, an Intel employee and Oracle have both published very permissive licenses with very strong patent grants, though not necessarily patent terminations. So I don't think that's quite as clear as it once was, but yes, yes, right, right. And yes, and so the other part of that is that if you put your patents under a permissive license, they can go all over the place, whereas if you put your patents under a copyleft license, you retain some control, some visibility, and the scope of the spread of those patents may be limited, which is definitely correct. Though that's one I would tell one of my clients, because I'm supposed to tell them the full spectrum of solutions, but as a free software advocate, I'm of mixed opinions about it. I think we're now done. I'd love to have more discussion, but we're on to the hallway track or stalking online. Thank you.