Today, I'm going to present on a topic that's been top of mind for a lot of us, though mostly as conversations. In my opinion, there aren't a lot of lessons we've been able to internalize from all the recent activity around open source licensing. Now, changes to licensing might feel a little personal to a lot of people. To others they might feel like just a technical blowback, because you were introduced to a technology at work. But irrespective of how you feel, I thought that together, as a community, we could go over what some of the recent licensing changes in the open source world have been. I don't want to limit this to the one example that everybody knows about. I'd rather look at it in light of a larger, rolling set of licensing changes that have been happening across various technologies, some of which you might be very familiar with because of their popularity and scale, and some of which are less visible to the public and somewhat more obscure. So with that introduction, welcome to this talk on use versus abuse: open source licensing. Let me begin with the question of what fairness is, and let me be very clear about the fact that I don't know.
Fairness could mean many different things depending on where you're coming from and who you are. You might be a business that was forced to make a change. You might be a person who contributed to a code base thinking that the community's trajectory would go a certain way, and it didn't. You might be an organization that's downstream of, and dependent on, some upstream code bases. So any change in licensing could land anywhere on the spectrum from "this is really okay and I understand why it was done" to "this is totally not okay and it should definitely not have been done." Even a cursory glance at the comments on the angry orange site, as I like to call Hacker News, reveals that whole spectrum of emotions. Like I said, I don't have an answer to that question, but let's revisit it about halfway through the presentation and again towards the end, and see if we can derive an answer. Licensing changes are not without their consequences. One implication of a licensing change can be reduced adoption of a code base. Others can be legal, whether from making the change or from not making it. There's always a potentially increasing amount of uncertainty because of the entropy that changes bring. And perhaps the biggest downsides are increased fragmentation within the ecosystem and the increasing cost of communicating these changes, maintaining them, and what have you. There are also deeper implications. One is in the realm of packaging and distributing the software that a community is responsible for. Another is uncertainty about any patents that have been awarded, or will be awarded in the future, and who should own them.
There's also the whole conversation around how and why the software should be customized and packaged for downstream consumers, and whether that's being done to derive business benefit or not. So there are really many dimensions along which to slice and dice the consequences of any change in license. Now, I'll quickly go through a few stories. I know it's after lunch, people tend to doze off, and it's probably not the best time for telling stories, but the first one I'll open with happened about five years ago. Yeah, definitely over five years ago. MongoDB, everybody's probably aware of this. People know MongoDB? Yes, no, maybe? Yeah, good. You're not all sleeping. So MongoDB changed its license from a GPL-family license, the AGPL, to what is known as the SSPL, or Server Side Public License. Now, the GPL is more popular than the SSPL, and that's because the SSPL was invented at that time by MongoDB. The GPL belongs to a family of what are known as copyleft licenses, and we'll get into a little more detail on those later. What the Server Side Public License basically said was that you can call yourself open source software as long as you make some part of the source available to the community. And if you look at it pedantically, there's a question of whether you're truly open source if you're not allowing open source contributors to come and participate in the development of your software. In this case, MongoDB said: I'm going to make the source available, but it's not open for people to come create a pull request, write some modifications, and merge them back into the code.
That is what they were before the change, but after the change they just said, we'll make the source available, and we'll also control a lot about how the things you modify are distributed publicly. The reason for doing this was very straightforward. Mongo, which was even a public company back then, wanted a way to control much of the distribution of its own software, and it was trying to preserve its ecosystem. It was trying to build new revenue streams, and so some of the forces of free market capitalism dictated that Mongo deliberately extend that kind of control and influence over what it was publishing. So the move was very simple. Another example, quite a popular one, came in the year that followed; I think this is about four years old to the date. People have heard of Elastic, right? Elastic does a lot of fantastic work in terms of the tools they make, but four years ago Elastic was forced to change its license from the Apache license, which is again a big favorite in the open source world, to what they call the Elastic License. Again, this license was invented by Elastic, in order to guard against what Elastic felt was unfair treatment of the project itself and the community that was building it. In particular, we have a notoriously popular infrastructure company to blame. What Elastic was trying to do at that point was protect itself, its products, its projects, its tools, its contributors, and its committers from AWS creating a fork of Elasticsearch and making that available on its platform as a cloud service. People could then go ahead and subscribe to an AWS Elasticsearch service as opposed to subscribing to Elasticsearch from the Elastic company and its cloud service. A lot of other companies followed a similar model.
Redis, the popular messaging tool, also went from Apache to source available. Again, they were trying to protect a lot of their rights to distribute, and they went from being fully open to source available. And then we come to our favorite tool, right? The one that spawned all of the conversations this year and dominated Hacker News. Everywhere we went, we kept hearing about the HashiCorp license change which, frankly, has affected my usage in no way at all. Whatever I was doing with Terraform and all of these other projects, I continue to do. I'd love to hear stories if things changed for others. The move was from the Mozilla Public License, which is an open source license, to the Business Source License. This is, again, a slight variation on the server-side license, but it served to let HashiCorp choose who may redistribute Terraform-based work and who may not. And we come back to this question again, like I promised: is this fair? Now, given what we know about Terraform, HashiCorp justified this in blog posts and elsewhere. They issued a lot of clarifications saying that they're trying to protect themselves from other companies that take the work that has gone into Terraform, create a deployment platform using Terraform, and offer that as a commercial service to end users. So this was really about protecting the core business interests at HashiCorp, and it was not aimed at making things more restrictive or constraining the experience for contributors and committers.
Which raises the question: if you've had 2014 to 2023, roughly eight or nine solid years, where a large part of the project's evolution and life cycle was based on the work that a lot of open source contributors have done, then while it is very important that you preserve your business ethos around the tool, is it fair to alienate new and existing contributors in that manner? That's really the question we are left with at the end of their justification. So let's revisit the question one more time closer to the end, and let's look at a few more terms and learn a little more before we give a fully formed answer. I want to start by introducing a few terms, just to make sure we're all on the same page and we all know what we're talking about. First, there's the concept of public domain. Within the whole notion of software ownership, public domain refers to the state where the intellectual property rights that normally exist for software do not exist. Public domain need not apply only to software, but I'll continue to speak about it in the context of software that's being written. Being in the public domain means that software is freely available for anyone to use. It's available for anyone to modify. It's available for anybody to distribute, and even to monetize and build a business around, without any restrictions or permissions. There are many examples of digital commons where a lot of public domain software is available for you to go and use. On the other end of the spectrum is proprietary software, and we all probably know and use a lot of examples of it every day. Proprietary software is defined on two axes.
There's the notion of ownership and there's the notion of control. By ownership I mean that an entity, whether an individual or an organization, that specifically grants the rights to make use of a piece of software exerts ownership over that piece of software. Then there's the notion of control, where the entity that has the ownership also has the ability to exert influence over how that piece of software is modified and how it is distributed, and this is managed through a set of restrictive licenses. So again, this is just to be aware of the other end of the spectrum. Between these two ends lies the meat of what we are going to focus on. I want to introduce the first kind of open source license, which is known as a permissive license. Once I show some examples it will become much clearer, but basically a permissive license grants users some freedoms. It's okay for a user to come along and start to use, modify, and distribute a piece of software. Permissively licensed software comes with very minimal restrictions on what you can and cannot change and how you can and cannot distribute it. There are some very popular examples of permissive licenses. One is the MIT license, which is very popular among open source distributors. The Apache license is again a big favorite. I think all of the CNCF projects follow the Apache license, if I'm not wrong, so it's a big favorite in this community. I think Kubernetes is also Apache licensed. There's a blog post by a former CNCF general manager, I think Dan Kohn, who mentioned that Apache, for them, is the most permissive license that still lets them extract a lot of commercial value while accommodating enough of the non-commercial use cases around their software.
So it's the one he describes as fitting the wider spectrum of work that they have. The other big, popular permissive license is the BSD license, which I like to think is the one that started it all off. Back in the day, when Unix was being forked from AT&T, out of the collaboration between AT&T and Berkeley, the first licensing issues started to come up, and BSD was among the first licenses that came out of that. There's a very simple two-clause BSD license, which is very open. It just states the copyright and how things can be redistributed. There's also a three-clause BSD license, which adds a clause about how the names of the contributors may be used. So these are, again, very popular examples of what are known as permissive licenses. Then there's a second type of license, known as copyleft. Everybody here is probably aware of the term copyright, and this is a play on that word. It's a slightly more restrictive license, in the sense that it still allows you to do open source the way you want to, and a lot of open source projects continue to use copyleft. It's just that any derivative work based on existing copyleft software is required to remain copyleft. So if you write some piece of code and it's derived from other copyleft software, let's say GPL-licensed software, then you're required to keep a GPL license, or a similar copyleft license, on that work. In that sense it's a little less permissive, or a little more restrictive, than the permissive licenses. Popular examples of copyleft include the GPL, a bunch of its derivatives like the LGPL and AGPL, and also the Mozilla Public License. So if you've been paying attention, HashiCorp actually switched from an open, copyleft Mozilla Public License to the Business Source License.
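To make the taxonomy so far a little more concrete, here is a minimal sketch in Python of how the license families mentioned in this talk could be written down as a lookup table. The table is hand-made and hypothetical, not an official classification: the keys are SPDX-style license identifiers, the grouping simply follows the way the talk groups them (so the Mozilla Public License sits under copyleft, even though it is usually described as a weaker, file-level copyleft), and the names `Family`, `LICENSE_FAMILY`, `classify`, and `families_in` are made up for illustration.

```python
from enum import Enum


class Family(Enum):
    """License families as grouped in this talk (not an official taxonomy)."""
    PERMISSIVE = "permissive"
    COPYLEFT = "copyleft"
    SOURCE_AVAILABLE = "source-available"
    UNKNOWN = "unknown"


# Hypothetical, hand-maintained lookup keyed by SPDX-style identifier.
LICENSE_FAMILY = {
    "MIT": Family.PERMISSIVE,
    "Apache-2.0": Family.PERMISSIVE,
    "BSD-2-Clause": Family.PERMISSIVE,
    "BSD-3-Clause": Family.PERMISSIVE,
    "GPL-3.0-only": Family.COPYLEFT,
    "LGPL-3.0-only": Family.COPYLEFT,
    "AGPL-3.0-only": Family.COPYLEFT,
    "MPL-2.0": Family.COPYLEFT,
    "SSPL-1.0": Family.SOURCE_AVAILABLE,
    "BUSL-1.1": Family.SOURCE_AVAILABLE,
}


def classify(spdx_id: str) -> Family:
    """Return the family for an identifier, or UNKNOWN if it is not in the table."""
    return LICENSE_FAMILY.get(spdx_id, Family.UNKNOWN)


def families_in(spdx_ids: list[str]) -> dict[Family, list[str]]:
    """Group a list of identifiers by family, preserving input order."""
    grouped: dict[Family, list[str]] = {}
    for spdx_id in spdx_ids:
        grouped.setdefault(classify(spdx_id), []).append(spdx_id)
    return grouped


if __name__ == "__main__":
    sample = ["MIT", "MPL-2.0", "BUSL-1.1", "Made-Up-1.0"]
    for family, ids in families_in(sample).items():
        print(family.value, ids)
```

A table like this says nothing about whether a particular combination of licenses is acceptable. It only makes visible which families are present, which is the first thing you would want to know before asking anyone for an opinion.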
So I just wanted to introduce a bunch of the terms and some of the taxonomy surrounding software licensing. Now, to dive into the Business Source License a little more: some people call it BSL. Officially, though, it's abbreviated BUSL, so I prefer that, although when you say BSL people are probably not going to mind, except maybe someone very pedantic. This is a source-available license, which means that you obviously make the source code for the project available, and it borrows some characteristics of the other open source licenses. But it's not considered a true open source license in either the permissive or the copyleft sense of the word, so it really doesn't bode well when you assign a BUSL to your project and call yourself open source, because, to be honest, it's not well and truly open source. But there are some advantages to this route. It allows you better management of commercial licensing, and it lets you place some restrictions on how your software can be used commercially, which is what we saw in the HashiCorp story. Let's look at a few more terms that people have to know. I'm sure everybody has come across the term intellectual property at some point. It basically allows some exclusivity to be granted to somebody who's exerting ownership over software. Then there's the term patents, which opens up its own can of worms in terms of what you can patent and what use you can allow. Trademarks are slightly orthogonal but somewhat belong in this world. Licenses do have to say something about trademarks, what's covered and what isn't. And then comes one of the most debated terms in the open source world: free software, and what constitutes free and what doesn't. It probably deserves a whole conference of its own, so I'm not going to go into much detail here.
And then there's the final thing, which is more definitive in this world of open source, and that is freedom. Open source licenses are defined by four kinds of freedoms that they allow. So your choice of license should come from what kind of freedom you want your users and your community to have. Depending on what you choose and what you leave out, there's always a license for that. There are about 200 different kinds of licenses. It's hard enough getting people to read through one license, let alone fully understand and internalize what's in 200 of them. But the question remains unanswered for me. Despite doing a deep dive into all of these aspects of what software is, what constitutes fairness, and what goes into a license, I still haven't been able to come up with a definitive answer to whether this is fair use of the different clauses. So what does a room full of engineers have to do with this? As you start to incorporate more and more open projects into the work that you do, please be aware of the licenses associated with all of those projects. Please be aware of what you're including and what you're excluding. And let me ask this room before we close: are you all writing a Dockerfile at some point or other? Yes, no? Yes. I mean, I still can't tell if you're sleeping or not if you just nod, so maybe a little sound in the room. When you're creating a container using this Dockerfile, let's say you borrowed some parts of it from somewhere, you asked a colleague, and so on. What kind of license would you apply to the Dockerfile? Any idea? And when you use that Dockerfile to create a container, what kind of license applies to the container itself? Right? Questions to think about. Let's say you're writing a Node, well, you all seem like hipsters to me.
So let's say you're writing a Node.js app, right? This Node.js app has some npm dependencies, and then it has some stuff that you wrote. Maybe you also looked on Stack Overflow and borrowed some ideas about other layers you wanted to include. So you now have a multi-stage Docker build that's building a container, some of whose components are GPL, some of which are Apache, and some of which you've written on your own. So what's the license that finally applies to the container you just built? I don't have an answer. It's something for you to think about while you're digesting lunch. There's very little awareness about what we are writing into our own containers. Lack of transparency in software is a much bigger problem, but ask yourself: are you setting yourself up for success, or are you setting yourself up for failure? These are the questions I want to leave you with at the close of this presentation. This is not to dampen your enthusiasm for using open source projects. But the realm of security and compliance also includes, in some sense, the business continuity that comes from knowing what's in your container and what does and does not apply to you. That's really what I'd push this group to think about. As a community, we need to start coming up with ways to think about this more transparently and more in the open: what it is that we are building and what exactly assigns ownership. But thank you all for coming to this talk. Hopefully it was not too much to end on a bunch of uncomfortable questions. I'm happy to chat through any opinions you might have. I'll leave the floor open for a couple of minutes for any questions. Yes. Thank you. Thank you very much for the presentation and talk. So are there any tools that we can use?
For example, given a set of questions, like a Q and A, okay, do you want to license this part, that part, and so on, and at the end it will recommend which license suits your needs. I'm not aware of a specific tool that does just that yet, but have you heard of SBOMs, Software Bills of Materials, and the SPDX format? Yes, this morning. Okay, good. If you haven't, there's a talk I'm giving at 3:30 in the other room where I'm going to go into SBOMs. Sorry for the shameless plug. SBOMs allow you to determine the license that's associated with every single component inside your build. By using SBOMs, you'll be able to say definitively which licenses apply to which parts of your container and which components are inside a build. And one of the things you should be able to do downstream of ingesting an SBOM is determine what the best license for you is going forward. So I don't know of a tool that does this fully automatically and spits out a one-word result, which means, you know, let's build one. But I also think that ingesting SBOMs and working out that answer from them is the best path to take. Okay, all right, thank you. Great, thanks everyone for staying awake and coming to my talk. Hope you have a great rest of the evening. Thank you.
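As a rough illustration of the SBOM idea from the Q&A, here is a minimal sketch in Python that reads an SPDX document in JSON form and groups package names by license. It rests on some assumptions: that the SBOM follows the SPDX 2.x JSON layout, with a top-level `packages` list whose entries carry `name`, `licenseConcluded`, and `licenseDeclared` fields, and that `NOASSERTION` marks a missing value. The function name `licenses_by_package`, the sample document, and its package names are all made up for illustration. It does not pick a "best" license. It only makes visible what is in the build.

```python
import json
from collections import defaultdict


def licenses_by_package(spdx_json: str) -> dict[str, list[str]]:
    """Group package names by license from an SPDX 2.x JSON document (assumed layout)."""
    doc = json.loads(spdx_json)
    grouped = defaultdict(list)
    for pkg in doc.get("packages", []):
        # Prefer the concluded license; fall back to the declared one.
        license_id = pkg.get("licenseConcluded") or "NOASSERTION"
        if license_id == "NOASSERTION":
            license_id = pkg.get("licenseDeclared") or "NOASSERTION"
        grouped[license_id].append(pkg.get("name", "<unnamed>"))
    # Sort license ids and package names so the report is stable.
    return {lic: sorted(names) for lic, names in sorted(grouped.items())}


# Made-up SBOM fragment for a container like the Node.js example in the talk.
SAMPLE_SBOM = json.dumps({
    "spdxVersion": "SPDX-2.3",
    "packages": [
        {"name": "web-framework", "licenseConcluded": "MIT"},
        {"name": "http-client", "licenseConcluded": "Apache-2.0"},
        {"name": "line-editor", "licenseConcluded": "NOASSERTION",
         "licenseDeclared": "GPL-3.0-or-later"},
        {"name": "my-app", "licenseConcluded": "NOASSERTION",
         "licenseDeclared": "NOASSERTION"},
    ],
})

if __name__ == "__main__":
    for license_id, names in licenses_by_package(SAMPLE_SBOM).items():
        print(f"{license_id}: {', '.join(names)}")
```

In this made-up example, the `NOASSERTION` bucket is the interesting one: it holds the code you wrote yourself and anything the SBOM generator could not identify, which is exactly where the "what license applies to my container" question is still open.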