Okay, hello everyone. My name is Steven McCourt. I work for the Wikimedia Foundation legal team, and I'm here to talk about free and open source software licensing. The purpose of this talk is to provide a general introduction to open source. If you work here, you're probably very familiar with how open source works, and probably with the licenses as well as the technical side, so I'm not going to go into a lot of detail, but feel free to ask questions. Also, I work on the legal team, and I'm happy to advise the members of the Wikimedia Foundation engineering team, so if you have really specific questions or projects you're working on, it's probably best for us to discuss those afterwards. I think I'll have a little bit of time for your general questions at the end.
Two notes. First, I have a copy of this presentation with more complete details on what I'm presenting, so if you'd like to refer to that afterwards, let me know. And second, in the spirit of free culture and the Creative Commons licenses we use on all of our projects, all of the images in this presentation, as well as the presentation itself, are freely licensed. This presentation is a remix including a lot of work that was prepared previously, so you'll find attributions at the end of the deck with more information.
A note on terminology. You'll see the terms free software and open source used around the internet largely interchangeably, but they are actually different terms. They come from different historical and philosophical backgrounds, and there's a bit of an ideological and linguistic war that goes on between free and open source software. We use them roughly to mean the same thing here, but they are different, though we won't go into the details of the differences.
A lot of what we do qualifies as free software, but we also use a lot of open source software and licenses, and there are some arguments about where exactly the difference lies. This presentation is designed to be largely pragmatic, so to avoid falling into the war between free and open source software, we're just going to refer to them collectively as free and open source software, or FOSS. You might also see the phrase free licensing, or free/libre open source software (FLOSS), around the internet, but we'll just be pragmatic about the terms we use.
So why do we use open source so extensively at Wikimedia, around the office and in all of our projects? If you're hired as an engineer working for Wikimedia, you probably have a gut reaction to that, and you probably also understand a lot of the arguments for free software. But it's important to revisit them so we can inform our discussion of how we comply with free licenses, because exactly how we use them is largely motivated by why they're good for our organization.
First and foremost, the Wikimedia Foundation is a public charity. We're a non-profit organization, and our mission includes a dedication to developing content under a free license. So we use free software licenses and open source licenses because it's part of our core mission. We build all of our materials, not just the photos and text but also the software and tools, under licenses that allow others to use what we create and then build upon it. That's just how we do things at Wikimedia.
Second, we have a community of people that work on open source software. This community contributes back to our work, and they're able to participate in our work because we're so open. So by using open source and free software we develop a healthier community and grow that community. I think if you look at the community on Wikimedia.org, you'll see one of the more popular and productive communities that we have.
And then finally, we use open source and free software because, basically, it works. It does great things. You probably already know that; that's probably why we work here.
So that means we comply with open source obligations both as a legal matter and as a moral one. The bottom line is that we're legally required to comply with these licenses, but it's also the right thing to do, and it helps our software improve and our community grow.
So I'm going to walk through some of our licensing obligations. I'll talk about some of the categories of licenses and the different obligations they have, as well as some examples. These are rough categories. You'll find them around the internet, and a lot of people use them, but they're not set in stone, so you might see some differences if you go through the Wikipedia articles on all of these licenses.
In general, they fall into three categories. All open source licenses allow people to use, modify, and share work. We use the list of licenses that's prepared by the OSI. If you look at our contracts you'll see the phrase "OSI-approved licenses", but there are other lists of what qualifies as a free or open source license. The Free Software Foundation publishes one of these lists. If you see anything that isn't on the OSI list or the Free Software Foundation's list, I wouldn't recommend using it. That's usually a big warning flag, mostly that people are confused about what they're doing, or perhaps that they're not actually releasing it under a free license. If you have concerns about an acronym that you see, come talk to the legal team and we can help you figure out if it actually is an open source license. But the OSI list is pretty comprehensive. It's probably longer than you need, and you don't need to know every single one of the acronyms on there.
And then the three categories that we'll discuss today. I think you can group all of the open source licenses into one of these categories. There are, of course, nuances within them, but these categories are a useful way of thinking about it. The first is permissive licenses. These are designed for simplicity: basically, you want to get your software out there. The second is copyleft licenses. These are designed to promote freedom, to ensure that stuff stays free and open source. And then finally the Affero category, which basically consists of one license, and that's designed to really, really promote freedom, especially in a world that has web services. All of these licenses are very similar. They allow you to use, modify, and share, but they have different obligations depending on exactly how you use or share the software. So we'll discuss some of those.
The first are permissive licenses. Permissive licenses allow you to do anything, including sharing copies of the binary code without sharing copies of the source code. They are very popular, and increasingly so these days. They are basically designed for simplicity. Why would you want to use a permissive license? Basically, it's for something that you just want to get out there, and you don't want to attach any additional conditions. They have very minimal strings attached, and we'll discuss exactly what those are.
There are really only two things you need to think about when you're using a permissive license. First, if you give someone a copy of the code, then you have to share a copy of the license, in the license file. And second, if you do share a copy of the code, you shouldn't change any of the licensing text or just claim that you wrote it. You have to keep the original credit or attribution in place. Exactly how this works differs from license to license, but basically, if you follow those two key obligations, you'll be in the clear.
And of course, I'm happy to discuss if you have questions about particular licenses. These obligations occur when you distribute the code, so we'll discuss how that works with a few examples. The common permissive licenses include the BSD and MIT licenses, as well as Apache. We use Apache for mobile apps. It's sort of our recommended license, but there are others you can use. If you see a new one and you're not sure exactly how it fits, we can discuss it, because there are many more than these three. But there are also benefits to sticking to a few standard licenses and just using Apache as many times as you can.
So what do you do if you distribute a piece of binary code, like a mobile app, under a permissive license? The answer is pretty simple. All you're required to do is include licensing information inside of the application. This is something that you do already. If you look inside the Wikipedia mobile app, you'll see a list of libraries used with links to the licenses, and that material is readily available for people who use the app. You'll find this in many other apps as well. And we actually go above and beyond: we distribute copies of the source code. That's not one of the things you're required to do under a permissive license; we just do that because we go above and beyond in this case.
So what happens if you use permissively licensed code inside of a service, where you're not actually distributing the binary or source code but just using it, perhaps for running software? This is even simpler, because when you're not distributing permissively licensed code, the obligations don't come into effect. This is part of why people like permissive licenses. They're just very easy to use without really thinking about what the license requires of you. There are two exceptions that you need to always keep in mind.
If you do share source code, you should always preserve the licensing information. Since we do go above and beyond with the Wikipedia mobile app, this is an important thing to keep in mind. And second, if you're putting the source code into Debian packages or other distribution repositories, you might be required to include correct licensing information, and there are tools that can help you with this sort of compliance.
The second category of licenses is the copyleft licenses. Copyleft is different from permissive. As I said before, it's designed to protect freedom. In a nutshell, copyleft licenses allow you to use and modify, but if you share binaries built upon the source code, then some additional obligations go into effect. You have to share a copy of the source code, and you also have to release the code that you have incorporated it into under the same license. This is sometimes called a viral clause. A better term might be reciprocal or hereditary, but the key is that you have to give out the code under the same license that you got it under, and that is how it protects freedom. It ensures things are not locked up by someone who is contributing modifications to your code.
In our context we usually use the GNU General Public License, or the GPL. It's far and away the most common viral or copyleft license that you'll see, and it's widely used, for MediaWiki, our software, but for many other pieces of software as well. For our purposes we use the GPL version 2. GPL version 2 and v3 are very similar, but they are different licenses, so you should keep that in mind. When we label things for Wikimedia's purposes, we use GPL v2 or any later version. That doesn't mean we put v3 on there. It means we put the phrase "any later version" on our code, and that's just a standard practice with GPL software.
So what do we do if we include a GPL-licensed library inside of an application distributed in binary form?
This is similar to the first example under the permissive licenses, but the outcome is slightly different. Under the copyleft license we're required to include the license in the app, which is the same as permissive, but we're also required to make the source code available to users, and under the same license. That is an additional requirement that is sometimes burdensome for people.
So one question that you're probably wondering about is: what must you make available under the GPL? What does this viral clause actually apply to? If you're including the GPL code inside of the code of the Wikimedia app, then the answer is more clear-cut, but if it's a separate library that's being incorporated in there, the answer can get a little bit tricky. I recommend talking with the legal team to understand exactly how this works. If you have any questions about combining the GPL with other licenses, then we probably need to discuss how the combination is being made in order to determine if the viral clause comes into effect.
So what happens if you're using GPL-licensed libraries on our servers? This is the second example under the permissive licenses, and similarly to the permissive licenses, there is no obligation, because we're not distributing the code to our users as binaries; we're merely using the code. Now, there are two exceptions, as we discussed before. When we do publish the code, we have to maintain the licensing information, and if we're distributing it in a different package, then we have to comply with those rules as well. But for the most part, using the code doesn't trigger the license compliance obligations.
The Affero license is not technically a new license, but it's one that's becoming more popular in the age of web services. It goes above and beyond what the GPL requires. This is the last category of license, and it consists largely of one license, which is the Affero General Public License, or the AGPL.
The AGPL requires you to distribute code even if you haven't distributed the binary version, if you're using it to create an interactive service that users can access over the web. Essentially, if you're running something like MediaWiki as a web service, then you would also need to comply with the distribution obligations under the license. This is very different from traditional open source licenses, but it's designed to ensure that we can protect freedom in a GPL style in an era when people often don't distribute code but just run it as a service.
So if we're using an AGPL-licensed library on a server that is user-facing, where users actually interact with that library, we're required to make the source code available, and possibly that requirement extends beyond the AGPL library itself. So this requirement is rather big. Very few people actually use the AGPL. There are some notable services that do, but it's not as common as the GPL. The GPL was sort of designed with Linux and GCC in mind, and the AGPL was designed with web services in mind, but changes in licensing trends meant more things ended up permissive rather than under something like the AGPL. But if we're looking to protect freedom and be aggressive about that with a web service, then the AGPL might be a choice to consider.
There are some open questions, just like with the GPL, about exactly what source code you're required to release, but unlike the GPL we don't have a lot of industry practice around it, since it's a newer license and not as widely used. The legal department has a checklist that we can help you walk through if you're considering using the AGPL, or if you want to release some software you've created under the AGPL, but the standards are somewhat unclear.
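
As a rough illustration of the three categories described in the talk, the sketch below encodes when each category's obligations kick in. The names `Category`, `Use`, and `obligations` are hypothetical, made up for this example, and the rules are a simplification of the talk's summary rather than a complete or authoritative model of any license. Open questions such as how far the copyleft requirement reaches are deliberately left out.

```python
from enum import Enum, auto
from typing import List


class Category(Enum):
    PERMISSIVE = auto()  # e.g. BSD, MIT, Apache
    COPYLEFT = auto()    # e.g. GPL
    AFFERO = auto()      # AGPL


class Use(Enum):
    DISTRIBUTE_BINARY = auto()        # e.g. shipping a mobile app
    DISTRIBUTE_SOURCE = auto()        # e.g. publishing a repository
    RUN_INTERNALLY = auto()           # server-side, users never touch it
    RUN_USER_FACING_SERVICE = auto()  # users interact with it over the web


# Obligation labels (hypothetical wording for this sketch).
INCLUDE_LICENSE = "include a copy of the license"
KEEP_NOTICES = "keep existing licensing text and attribution intact"
SHARE_SOURCE = "make the source code available"
SAME_LICENSE = "release under the same license"


def obligations(category: Category, use: Use) -> List[str]:
    """Return the simplified obligations for a license category and a use."""
    distributing = use in (Use.DISTRIBUTE_BINARY, Use.DISTRIBUTE_SOURCE)
    # Only the Affero category is triggered by running a user-facing service.
    network_trigger = (
        category is Category.AFFERO and use is Use.RUN_USER_FACING_SERVICE
    )
    if not distributing and not network_trigger:
        return []  # merely using the code triggers nothing in this model
    result = [INCLUDE_LICENSE, KEEP_NOTICES]
    if category in (Category.COPYLEFT, Category.AFFERO):
        result += [SHARE_SOURCE, SAME_LICENSE]
    return result
```

For example, `obligations(Category.COPYLEFT, Use.RUN_USER_FACING_SERVICE)` is empty in this model, while the same call with `Category.AFFERO` is not, which is the difference between the GPL and the AGPL that the talk highlights.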
We also have an additional new question: what is user-facing code, or code that users interact with? That means that not necessarily everything we run on our servers that's released under the AGPL would trigger the AGPL requirements, but this is a complication that we can work out together.
So, in summary, there are basically three things to keep in mind. Always include licensing information when you distribute code. If you're using a copyleft license in mobile or distributing the binary, then you need to consider it in a little more detail. And be extra careful if you see the AGPL anywhere, and discuss it with us. I have a few additional points on top of those three.
First, always keep track of where you get your code. Many developers don't think about it, but just because you see that code is under an open license doesn't mean it's free to be used in any way at all. If you're copying and pasting random code from around the internet, you might end up with licenses that are incompatible. One notable example is Stack Overflow, which uses Creative Commons BY-SA 3.0 for their license, with a share-alike clause that can cause some complications when using it with the GPL. So please avoid copying and pasting code blindly from Stack Overflow into GPL-licensed code. If you have questions about where code came from, come talk to us; perhaps it's available under another license that is compatible with the GPL, since things might have had an original repository other than Stack Overflow.
I think the Open Source Initiative is trying to get that fixed, aren't they?
No, we're trying to get Stack Overflow to fix the license for their code.
Yes, because using CC BY-SA for code is suboptimal.
Yeah, and Creative Commons has worked hard to get CC BY-SA 4.0 compatible with the GPL, though exactly how that will help with Stack Overflow is unclear.
Only GPL v3, Greg.
Yes.
And yeah, so for the status quo, be careful. But in general it's good to know where your code came from, even if it's not from Stack Overflow, because that can help us track down who created it. In practice, people who release their stuff on Stack Overflow or elsewhere think being used by Wikipedia is awesome, so we can talk to them about giving it to us under the right license if we need that.
So, be careful about incompatible licenses. As I said before, just because it's free doesn't mean that it is free without conditions, and sometimes those conditions don't match up easily. So there are sometimes some mix-and-match problems. Here are a few of the most common ones, but if you're ever in doubt, come talk to the legal team. We can sort it out and basically make it so you don't have to worry about it.
It's really helpful to use consistent licenses to avoid this sort of mix-and-match problem, so we recommend that you use the GPL v2 or later for standard MediaWiki work. This helps ensure that we use the same license, that we're on the same page, and that we avoid unintentional mix-and-match problems and compatibility questions. If you're creating other scripts to do things, a permissive license might make more sense, but if you're contributing to MediaWiki core or MediaWiki extensions, then GPL v2 is a great standard.
Another really important point is to ensure you have a license header and a license file with all of the code. It not only helps us keep track of the licenses that we use and the obligations that we have, but it also helps others comply with the licenses that we're giving them. Basically, we should make it easy for them to understand what their obligations are. In practice, licensing files can get separated from the code that they're attached to, so using a consistent licensing practice can be very helpful.
The best way to do this is to literally include a license file in the top-level directory and then include brief headers in all of the source code files. This pie chart is for MediaWiki core.
So you're telling me that only just over a quarter of the files have a clear license header indicating that it's GPL v2?
So this is based on an automated code scan, so it's possible that the automated scan isn't picking up on the standard practice that we're following. It's also likely that we don't follow a standard practice.
I'm just surprised, because that entire repository is supposed to be GPL v2 plus. It's policy.
There are things that end up in there, such as JavaScript files, that have different licenses, right?
And you're actually right, there's a piece of it that is online.
So using a consistent practice can be a great way to change that pie chart, and once we know what that practice is, we can build a better scanning tool and make sure that we have a pie chart that actually reflects what's going on inside our projects. I did that with an open source scanning tool, so we can manage that. And this pie chart doesn't have a date on it, but it's now two years old, so it might be worth rerunning.
Is it two years old?
Time passes.
So the Software Freedom Law Center recommends this header standard. It's a little bit wordy, but I think it's a good reference point as we think about what we want to do in our own source code. There are a few key things that we want to note here. What this has is a reference to the top-level directory where you can find the license file. It doesn't actually mention the license in the header itself; it just says go here and find out what our license is, which means we can have this in every single file and keep track of one license for the repository. That's all it has.
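
A minimal sketch of the kind of automated scan mentioned here: it walks a source tree and reports which files lack anything that looks like a license header in their first few lines. The marker strings, file extensions, line limit, and function names are all assumptions chosen for illustration; this is not the tool that produced the pie chart, and it is not a recommended header format.

```python
import os
from typing import List, Tuple

# Hypothetical settings for this sketch; adjust to the project's own practice.
HEADER_MARKERS = ("license", "copyright")
SOURCE_EXTENSIONS = (".php", ".js", ".py")
HEADER_LINES = 20  # only the top of each file is inspected


def has_license_header(path: str) -> bool:
    """Return True if a marker word appears in the first HEADER_LINES lines."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        for _, line in zip(range(HEADER_LINES), handle):
            lowered = line.lower()
            if any(marker in lowered for marker in HEADER_MARKERS):
                return True
    return False


def scan_tree(root: str) -> Tuple[List[str], List[str]]:
    """Split source files under root into (with_header, without_header)."""
    with_header: List[str] = []
    without_header: List[str] = []
    for directory, _subdirs, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(SOURCE_EXTENSIONS):
                continue  # skip files that are not source code
            path = os.path.join(directory, name)
            if has_license_header(path):
                with_header.append(path)
            else:
                without_header.append(path)
    return with_header, without_header
```

A header that only points to a top-level license file would be counted as present by this check, but the check cannot say which license applies, so a scan like this measures header coverage rather than license identity.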
It doesn't have a list of names; it just has a credits file, so we don't have to change all of these headers every single time there's a new person contributing. And again, it keeps everything in one central license file. This makes it easier for programmers but harder for code scanning tools, so if we do want to follow this header consistently, we need to keep that in mind.
So this can be easy. There are Emacs extensions and other tools to add this sort of thing in. It's possible to have your IDE collapse the license header so you don't have to look at it all the time. It's just there for others who download the code. And as I said before, it helps other people know what we're using. I think we've all had trouble when we've found something awesome on GitHub that has an unclear license and we can't tell if we can use it or not. We don't want to put other people in that difficult situation. And it can also make it easy for us to track what we're doing in our components.
And then finally, if you're ever asked to sign a CLA, or contributor license agreement, come talk to the legal team. We do sign these, but for a variety of reasons they can be somewhat different, so talk to us before you end up signing one. Basically, these are tools that other projects use to keep track of everyone who has given them an open source license. They have a lot of great advantages, but we just want to make sure they're not going above and beyond what they claim they're doing. So if you ever see a CLA or are asked to sign one to contribute to something like HHVM, come talk to us and we'll make sure it meets our standards.
And then finally, you have a legal team. We love to talk about open source licensing: the history, the philosophy, as well as the practical impact on you. Feel free to reach out to one of your lawyers, or even ex-lawyers, because we love to talk about open source. So, some credits for the slides, and I think I have time for a few questions. I'm happy to discuss with
anyone at this time.
So, like, in OIT we have some scripts going. What should be the generic license we put on them? The GPL?
Yeah, so you don't really expect anyone to use this, that's not your intention, and it's not part of a larger project, it's just sort of your own scripts?
Yeah, like some of the scripts here and there.
Then just be careful about using other things. If it's a wrapper around another script, that will need to include a license for that.
Yeah.
And when we're talking about just simple scripts, then I think a permissive license is usually best.
Like, I think, the MIT one?
Yeah, MIT works great. We use Apache for a lot of our stuff. We can discuss some of the pros and cons of them, but any one of those three that I mentioned, Apache, BSD, or MIT, works.
So there are a couple of questions from IRC. One is regarding CLAs: does WMF have a page somewhere listing CLAs that have already been reviewed, and the outcome? And I know I also have some questions for you.
Yeah, I don't believe we have a page listing CLAs that we've signed, but we do have records of them in the legal team. So if you have one, come talk to us, and perhaps we can put together a general list of the CLAs we've looked at. Brian gave us a good question there.
I was wondering, can you talk about the difference between the GPL and the AGPL? As I understand it, for the stuff we do, we could change to the AGPL, but presumably that would have an effect on other people deciding whether or not to use MediaWiki software.
Yeah, so I guess there are really two questions there. The first is just the difference between the GPL and the AGPL. That is a matter of when we're going to require people to comply with the terms of the license. People don't often distribute MediaWiki as a binary; they often just use it as a web service. So right now the requirements of the GPL are not as applicable, I guess, to most uses of MediaWiki, and putting it under the AGPL would increase the number of uses that would require people to look at the
license and comply with its terms. So if we wanted to ensure that everyone making modifications to MediaWiki and running it for users to interact with was required to release those changes under the AGPL, then the AGPL is the license to use. It's more aggressively pushing the copyleft terms onto our user base. That, I think, is an open question, and I think increasing the burden for users to comply with the license has pros and cons. It can help us get better code, but it can also decrease the number of people who use our software. It's something to discuss and consider, but I don't think there's a clear-cut answer on whether it's good or bad.
The second question is: can we make the change from GPL to AGPL? Let's say we decide that's a good thing. We have a license under the GPL from all of the contributors to MediaWiki, but that doesn't necessarily give us the power to change the license for the entire project historically. So we would need to have a conversation about not only whether we should make the change but how we make that change happen. Is it for the project going forward, or is it the existing contributors agreeing to a new license? Those are two relatively big questions, I think, and in my view we'd want to see a pretty big gain for the project, a pretty big advantage for the whole ecosystem, to make the switch. I know there's another question.
Sure, I have tons, actually. The other one is about sample code, for example the BoilerPlate extension, where we're saying we want people to use this. It sounds like it might be better to use a more permissive license for that, because we're encouraging people to use it. But the second part is: when you excerpt code, you kind of say, oh, here's a way to, I don't know, register an extension, and you quote part of a licensed piece of software. Should I be concerned about that, or is the level of a function short enough not to worry about?
Yeah, so that is
a question that doesn't have a super clear-cut answer. Just like you can quote a piece of text in your essay, you can quote a piece of code in your documentation without necessarily requiring permission from the authors. That is something that's generally allowed under fair use. So depending on what you're quoting and how much you're quoting, you might not need to comply with the terms of the license. But as I said before, compliance isn't just a matter of the technicalities of what we're legally obligated to do. Providing a link back to the code repository that the quote came from, and giving credit to the authors as much as you can, is a good way to inform your users about what's going on. So it's a more fluid answer than you probably want, but I'd recommend always linking back to the repository so that people have a way to find the license information.
That's really curious. Okay, great.
I have a question.
Go ahead.
Great. Yeah, it's actually kind of a follow-on to S's question, and I'm sorry for asking it. Yesterday I was looking at an issue where we have a couple of extensions that are bundled in the tarball, and they have unclear licensing history. Not terrible licensing history, just not clear licensing history. And the question that came into my head, which is one that I'm sure you've thought about but don't want to give a real quick answer on, and you're free to say let's talk more, is: are all extensions to MediaWiki de facto GPL v2, due to MediaWiki core being GPL v2, given the tight integration between extensions and core? The reason I ask is that it would make this issue that I'm seeing cut and dried, you know, not an issue.
Yeah, I think it's something that we probably want to talk a little bit more about, but I do encourage you to look at the overall intentions of the users. I think users are contributing to a repository that is under the GPL, but in making it an extension in
some separate form, or uploading it to mediawiki.org, there are lots of different places where people could be providing us permission, depending on how exactly we acquired that piece of code. Ideally we have a license file in there that tells us what the authors of the code want. But if you're concerned about something, then we should sleuth around and try to find what it is, where we got it from, and what they might have agreed to as part of the process, and ultimately talk to the authors if we need to, to get the right permission in place. But yeah, I do see your point. These contributions are being incorporated inside of a larger GPL project, and that's the intention of the copyleft clause in the GPL. Unfortunately, the application of that copyleft clause is not always cut and dried, so we should probably talk in a little more detail before we just assume it is GPL by default.
Definitely, because we haven't had, that I know of, fortunately or unfortunately, the WordPress moment, where WordPress went through a big discussion with the SFLC and others about WordPress plugins, I think they call them, and their copyleft status due to WordPress core. So yeah, that's just a, sorry, teaser, really, mostly.
For what it's worth, I think strictly as a legal matter, putting aside the ethical issues, that's a legitimately complex question. I don't think it's as clear-cut as WordPress made it sound at the time. It's something where we could make a good-faith interpretation either way, so it's a little bit more of a policy question than a legal question.
Yeah, agreed. And just so it's clear, the extensions I'm referring to are predominantly contributed by staff members or long-time volunteers who have other GPL v2 contributions in the repository, etc. So it's pretty clear socially what the intention is; it's just the technical legal stuff.
Yeah, and I think it's important to keep in mind that even if we are in a legal gray area that is acceptable, or we have an agreement from a WMF
staffer to release everything they write as part of their job under the GPL, but they just haven't specified that inside of the code, that may put us entirely in the right under the GPL, but for an outside user of the code it's not very helpful. So we still have work to do. Any last questions, or anybody else?
Sure. Another thing I ran across was that I wanted to take a screenshot of some code in one of those online editors, I think it was CodePen. The code was freely licensed, just something that Crinkle wrote, but they claim that their website is copyrighted. So it's free code, but I wanted to show it running in this cool editor. I tried tweeting them, and they couldn't really tell me whether it was okay to do it. I think they finally said go for it. Do you have any thoughts on that, where you're taking a picture of something that's free but it's framed within an online editor, or a site like Flickr or Google Photos or something?
Yeah. As a copyright matter, the code that you're seeing inside of CodePen is freely licensed, but the license doesn't go into a lot of detail about the copyright status of things generated by that code, and it probably shouldn't. We don't need a 15-page license that covers all the eventualities. So I think the copyright status of the code is probably pretty easy to figure out, and the frame around it could be a fair use. Including screenshots of things is a really helpful way to inform users about what's going on on the site, it's being used in a very different way than the site itself, and I doubt that frame is something that they are worried about protecting. But these are all subjective considerations and will depend on your exact jurisdiction. So, legal caveats aside, you're probably safe, but if you're concerned, we can talk more.
All right, thank you. Thanks, everybody, for coming to the talk. See you next time.