Now, everyone, thank you for joining this late session of the day. I'll try to make it a bit entertaining for everyone. And we'll start with something interesting from last week. Last week, Java and Scala developers woke up to this announcement. As soon as my clicker works with my computer, I can even show you that. OK. OK, this one. Lightbend is changing the license of Akka from Apache 2.0 to a non-open source license, specifically to BSL, the Business Source License. And I didn't pay them to do that as a teaser for my talk, by the way. Akka has been around for over a decade. It's a very popular toolkit. Many use that toolkit in their code, in their programs, in their products. Now, imagine what it felt like last week. In my company, we've experienced something similar twice over the past couple of years with different projects. It was a painful and insightful experience. And I'd like to use this talk to share with you some of these insights and look into these cases when your open source turns to the dark side.
I'm Dotan Horovits. I'm the principal developer advocate at Logz.io. At Logz.io, we provide a cloud native observability platform that's based on popular open source tools. So this topic is very relevant for us, as you'll soon see. I'm an advocate of open source and communities in general, and the CNCF in particular. I co-organize the local CNCF chapter in Tel Aviv, where I'm based. I also have a podcast called OpenObservability Talks, about open source based observability. You're all welcome to check it out in your favorite podcast app. And as you have probably noticed, I'm a bit of a Star Wars geek. So a funny anecdote from last week: I was invited to speak at Container Days. Someone was even there to see. And I was surprised, amazed to see how many Star Wars LEGO kits were handed out there in the booths. So may the open source be with you. And let's start with our experience from last year.
So it was the beginning of 2021, the second week of January. Everyone was coming back from the New Year's vacation, starting the year. And then a bomb dropped on us. On January 14th, 2021, Elastic NV released this announcement that it's changing the license of Elasticsearch and Kibana from Apache 2.0 to a dual license, SSPL and the Elastic License. And not only that, it was going to take effect immediately. Back then, the latest version was 7.10. 7.11 was due a couple of weeks later, and it would already come with the new license.
For those who are not familiar, the ELK Stack, or ELK, is a very popular stack for search and for log analytics. It's been around for over a decade. It's based on Apache Lucene. Elasticsearch is essentially the database. Then Kibana is the visualization. And then there's Logstash, Filebeat, and other tools and libraries for ingestion and instrumentation, all of which used to be Apache 2.0 until that point. As I mentioned, I work for Logz.io, and we provide an observability platform that's based on open source tools, such as Prometheus, Jaeger, and also Elasticsearch and Kibana. So for us, Elasticsearch is at the core of our system. It's a critical component. And we've been investing in tweaking and optimizing it for our use case for years. So you can only imagine the confusion that this announcement caused.
And in fact, it was even a bit more confusing, because the title of that announcement was "Doubling down on open". And the announcement itself said, as you can see here, that this license change ensures our community and customers have free and open access to modify, redistribute, and collaborate on the code. Free and open to distribute and collaborate. Sounds a bit like free and open source software, right? Maybe it's not such bad news. Maybe SSPL is open source. And we weren't the only ones confused by that.
Actually, shortly after the announcement, because of that, the OSI released a special notification declaring that the SSPL is not an open source license. It does not comply with the Open Source Definition. It discriminates against specific fields of endeavor. They essentially described it as a "fauxpen" license. And that's, by the way, very important. Source available is not open source. It's a fauxpen source license.
So as you might imagine, there was lots of shock and confusion, not just for us, but for the entire community. Lots of turmoil, many posts on social media, blogs, articles. "Doubling down on open is not open at all." "Elasticsearch and Kibana are now business risks." "Elastic promises open, but delivers proprietary." Even angry bunny rabbits, which is really spooky. So that was the sentiment at the time. And shortly after, people started calling to fork the project, to keep it open source, to keep it Apache 2.0. And I'm glad to say that we at Logz.io said that immediately and out loud. We came out with this very clear message that we are in favor of forking and that we'll do everything we can to support such a forking effort. And a far greater player in the market, AWS (we have a prominent member from AWS here who can testify), decided to step up and make this fork happen. The sentiment was very clear. People indicated that they would prefer such a fork over an Elastic SSPL version, and that as soon as such a fork was made available, they would switch from Elasticsearch to that new fork.
So what happened with the licensing? What did the community do? What did we do? I'll get back to that very soon. But first, let's talk about what open source is anyway. We all know open source licensing, say, Apache, GNU, MIT. There's lots of material about that, and discussions here in these forums. However, is open source software licensing enough? Does it prevent the project from changing its license? And very importantly, who can change the license?
The OSI has a slogan on its website that I really love: guaranteeing the "our" in source. So in the same vein, ask yourselves: who is the "our" in source? Who governs the open source project? There are essentially three main categories for that.
The first one is open source by individual maintainers. Open source maintainers, enthusiasts doing it in their own free time. Actually, that's the vast majority of projects out there on GitHub, and largely with even just one or two maintainers behind the project. Here you see a couple of very high profile ones. There's curl, which is deployed on millions of devices, from your washing machine to your car: a single maintainer. And Log4j: we all remember Log4Shell from less than a year ago and the extent of its reach. I think a tenth of Maven Central was somehow exposed to this vulnerability. Two maintainers. So I think this is very clear. This is the first category.
The second category is open source backed by vendors. We saw Elasticsearch and Kibana. We can talk about Grafana, MongoDB, and Akka, which we just mentioned was re-licensed last week. That's the second category.
And the third is open source under foundations. We're here at the Open Source Summit by the Linux Foundation, an excellent example. And obviously all the affiliated foundations, the CNCF, the CDF and so on, are excellent examples. And unlike maybe the other categories here, there's more diversification. It's multi-vendor, multi-entity. So there's vendor neutrality, in many senses of the word, to a greater extent.
So these are the options for who can govern the project. And why is that important? You'll soon see. Now let's look at some case studies of open source turning to the dark side. And we'll start with a case of open source going non-open source. Let's go back to the Elasticsearch case study that we started talking about. We saw the announcement re-licensing Elasticsearch and Kibana.
And by the way, Elastic NV said that it did that to fight off competitors, primarily AWS, which was very big and had a commercial offering based on that open source, while Elastic felt that they were the ones doing all the heavy lifting. That was the reasoning behind it. And just to put it in perspective, Elastic NV itself is not a small company. It's a publicly traded company, with a market cap of around eight billion dollars or so.
But the problem is that it didn't end there. The rest of the ELK Stack that we mentioned before remained Apache 2.0. But then they started introducing breaking changes to the components to make sure that they work only with Elastic's official distro. Just for an example, there's Filebeat. It's an agent that can read logs off of a local log file and then send them to a remote Elasticsearch cluster. It stayed Apache 2.0, but then there was a piece of code that checked the remote Elasticsearch, and if it was not official or not certified, it would not work. So many people upgraded Filebeat, for instance, and suddenly it stopped working for them. That was the case for other distros, but also, by the way, for older versions of the open source. So if you still ran Elasticsearch version 7.10 or older, it would stop working for you as well, with the open source. It happened with Filebeat, with other Beats, with Logstash, and with the client libraries that are used for instrumenting your source code. And every other day, you'd hear of another user, another developer, who upgraded and found that it stopped working for him or her. They started digging into the code or documentation, finding these pieces of code, and obviously being enraged by this thing. I think the best description for those who don't know Elasticsearch is this tweet that gives the analogy from relational databases.
Can you imagine the reaction to Oracle's MySQL team if they had decided to fix the MySQL client libraries so they could only connect to an official MySQL version?
So that's it, and as I mentioned, the community called to fork the project and created the fork. The fork was named OpenSearch. It was led by AWS, together with Red Hat, SAP, Capital One, and my company, Logz.io. And you'd say, OK, what's the big deal? Just hit the fork button on the Apache 2.0 version, and that's it, right? That's what open source is meant to be. But then we discovered that it wasn't that simple by any means. This, by the way, is a summary from the community call on the project effort. As you can see here, the engineers that went in to do the forking discovered that both the Elasticsearch and Kibana projects were entangled between the Apache 2.0 code base and the proprietary X-Pack code base. So they needed to separate one from the other, sometimes even by line-by-line traversal. It was not exactly the "fork it" experience that you'd imagine. And inside there were also other things that they found, like dial-home features, telemetry that was collected, and some branding elements. So things were very entangled there between the open source and the non-open source, Elastic proprietary things. And if you want to hear more about that, first of all we have the distinguished gentleman, Kyle, here, but we also had an episode on the OpenObservability Talks podcast. You're more than welcome to check out the episode, relaying beautifully all this journey and the great effort that was made to make this fork happen. An amazing experience.
And gladly, half a year later, in July 2021, the fork reached 1.0: OpenSearch reached general availability. And shortly after, many started moving to it, including some big names such as Dow Jones, Goldman Sachs, Pinterest, SAP, Zoom, and Rackspace. Obviously Amazon moved to use it. Logz.io moved to use it. So that's the story with Elasticsearch.
Elasticsearch was the example of open source going to a non-OSI license, a non-open source license. But remember what I said at the beginning: open source is more than just a license. And things can also happen within the OSI licensing realm. For example, going copyleft. And I'd like to look into the case study of Grafana. Grafana is a very popular open source tool for metrics dashboarding and monitoring. It was Apache 2.0, and it's backed by Grafana Labs, which also offers Loki and Tempo, other projects. And in April 2021, last year, Grafana Labs released an announcement that it's re-licensing Grafana, Loki, and Tempo from Apache 2.0 to GNU AGPL version 3. And by the way, Grafana explained it as needing to balance the open source community's needs with Grafana Labs' commercial needs, something like that, which essentially comes down, again, to fighting off competitors using the open source project.
AGPL is an OSI-approved license that meets all the criteria of free and open source software. So what's the problem, right? The problem is that there's a new reality. People woke up to realize that the open source tool that they use has suddenly become infectious. It became a copyleft license. For example, Google, in its official open source policy, bans the use of AGPL, saying very clearly that the risks heavily outweigh the benefits. And it's not just them. That's the case for many others.
So why is that such a risk, and what is copyleft anyway? Without going into the legal talk (I'm not a lawyer, by the way, so bear that in mind), as an engineer putting it very plainly: using AGPL software with modifications requires that anything it links to must also be licensed under AGPL. So it effectively spreads virally in this case. It means that if you modify the code, you're at risk of license contamination. And actually, even if you link to it, like a DLL, you may also be at risk in some cases.
But even more so, this is triggered if the AGPL software is interacted with through a computer network. That's section 13 of the license, which effectively means that, if you think about it, you don't have to actually package a product and ship it to be liable. As soon as anyone connects to it, it already applies. So think about Google or any other SaaS company. The actual product is a service that users interact with via the internet. That makes AGPL a very problematic license, business-wise, for the SaaS model.
But just one note about using it for internal use only, OK? Let's say that you don't expose anything. Even then, this viral effect might be tricky, because let's say that you have, I don't know, vendors working for you, contractors, temporary employees, something like that. From the licensing perspective, a user is a user. There's no distinction between internal and external. And you might find yourself needing to expose source code or things that you hadn't planned on exposing. So definitely check it out, even if you just use it internally. It could be quite infectious.
But we're here at the Open Source Summit, and it's important to say that it's not only problematic for vendors like Google and others. It's also problematic for the open source community itself in some cases, because of the license contamination. If I'm a project that wants to be Apache 2.0, that's my decision. I definitely do not want to have another license imposed on me because of a tool or library that I use. And that's actually what happened with Grafana. It's widely used by quite a few projects under the CNCF. And after Grafana's announcement, the CNCF, the Cloud Native Computing Foundation, released a very clear clarification about using AGPL components. They didn't name Grafana. It's a general guideline, of course, for AGPL. It's open source, but it's very problematic.
And you can read the problematic part there, but because of that, the guideline that they provided, the directive, was: switch to an alternative component, or freeze the component at the version prior to the re-licensing (please do not upgrade to the re-licensed version), or ask for an exception. That's the guidance that the CNCF released to its projects.
So we saw the examples of Elastic NV and Grafana Labs. But it can happen not just with vendors, which brings me to the third case study, the case study of two very popular NPM packages, Colors and Faker. Both of them are MIT licensed, a very permissive license, I'm sure you'll all agree. And both are maintained by Marak Squires, an individual, a single open source maintainer doing it on his own time, of his own free will. And earlier this year, on January 5th, 2022, Marak deleted the entire code base of Faker and released a new version of the package to NPM, version 6.6.6. Now, I found it a bit ironic that the logo for Faker is this magician's hat, and then, poof, it disappeared. But jokes aside, Faker had around 2 million downloads weekly, in addition to many, many other JavaScript and Node.js projects that had dependencies on this project. So just imagine those poor folk that upgraded automatically to the latest update. And what happened?
In his defense, Marak gave a heads up a couple of months earlier in an issue on GitHub. You can read it here, but essentially: no more free work from me. I'm no longer going to support Fortune 500s. It's plainly "pay me or fork it". But it didn't end with Faker. Three days later, on January 8th, 2022, he released a version of the Colors package, the other package, with essentially malicious code, an infinite loop, that essentially put any Node.js server using it into a denial of service situation. And Colors is even more popular than Faker. It has 20 million downloads weekly and has another 4 million projects on GitHub dependent on it in one way or another.
So obviously, immediately after this release, it created a ripple effect. Many projects went down or were broken, including some very high profile ones. The AWS CDK, the Cloud Development Kit, is a great example. That lasted until NPM rolled back the rogue release and stopped this from getting worse. Marak released this blog post titled "Monetizing Open Source Is Problematic", which I think puts it very plainly why he did this, why he found the need to put these extreme measures in place. So that's the case of open source going rogue.
We saw case studies from the past year or so, so not even going too far back, of open source going non-open source, open source going copyleft, and open source going rogue. But what can we learn from these cases? I would like to go over some learnings for building open source, for using open source, and for vetting new open source for your organization.
First, if you're building open source, or considering building open source, remember this. If you're taking one thing out of this talk: open source is not a business model. The problem is not with the commercial vendors, as we've seen. It's with the commercial incentive. So if you're a vendor and you choose to go down the open source path, you should have a sustainable model in place. If you don't, you will end up in conflicts between the open source community's needs and your business ones, and you'll end up doing things such as re-licensing defensively and pulling the rights ratchet on your users, as it's sometimes called. Not to mention also essentially ripping off the community members that actually contributed their code and time to the project, which is another problem, with the CLA, the Contributor License Agreement, versus the DCO and others. I'm not going to open this discussion here. It's definitely an important discussion to have in the OSPO forums. So that's for vendors.
And if you are a maintainer and you decide to open source your project, please do not expect material compensation. Yes, even if all the Fortune 500s are going to benefit from your project. There are enough opportunities out there to get paid for development, for coding, and by the way, even for developing on top of open source as an employee of a company. But open sourcing a project is not the way. And of course, if you do want to monetize your project, you can build a vendor entity around the project and offer services around it. You can see the examples of Chronosphere for M3, or Confluent for Kafka, or others. And if you go down this path, of course, remember my advice for vendors from before. So that's for building open source.
If you're using open source, here are a few best practices on how to keep safe. First, manage your third-party licensing exposure the same way you manage your security exposure. So prefer the least restrictive licenses that meet your needs. Look for license contamination. If you work with SBOMs (we had quite a few talks here about SBOMs), do use them, not just for security, but also for mapping the components and their respective licenses. Manage your third-party licensing.
Next, take care with automation. Put license compliance checks in place before updating third-party components. Don't do auto-updates without safeguards in your CI/CD pipeline, or whichever automation you have in place.
Also, code smells in the open source can signal that something is wrong there. And that can buy you time to act proactively rather than reactively. Remember the examples we gave before, things such as entangled code, some dial-home features, and so on. Obviously, spotting code smells requires you to have some familiarity with the source code of the project. Not everyone has that, but it's not infrequent that heavier users of an open source project go in, whether to modify it or just to understand better how it works.
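A minimal sketch of the kind of license compliance check mentioned above, run before updating third-party components. It assumes the component names, versions, and license identifiers have already been extracted (for example from an SBOM) for both the current and the proposed dependency sets. The `Component` type, the `ALLOWED_LICENSES` policy, and the `find_license_violations` function are hypothetical names used for illustration, not part of any specific tool:

```python
from dataclasses import dataclass

# Hypothetical policy: the SPDX identifiers an organization has approved.
ALLOWED_LICENSES = frozenset({"Apache-2.0", "MIT", "BSD-3-Clause"})


@dataclass(frozen=True)
class Component:
    """One third-party component, e.g. one entry read from an SBOM."""
    name: str
    version: str
    license_id: str  # SPDX-style license identifier


def find_license_violations(current, proposed, allowed=ALLOWED_LICENSES):
    """Compare the proposed dependency set against the current one.

    Returns a list of findings: licenses outside the allowlist, and
    licenses that changed between the current and proposed versions.
    """
    current_by_name = {c.name: c for c in current}
    findings = []
    for comp in proposed:
        # Flag any license the organization has not approved.
        if comp.license_id not in allowed:
            findings.append(
                f"{comp.name} {comp.version}: {comp.license_id} is not on the allowlist"
            )
        # Flag a re-licensing between versions, even to an approved license.
        previous = current_by_name.get(comp.name)
        if previous is not None and previous.license_id != comp.license_id:
            findings.append(
                f"{comp.name}: license changed from {previous.license_id} "
                f"to {comp.license_id} in {comp.version}"
            )
    return findings
```

A pipeline step could call `find_license_violations` on the dependency sets before and after a proposed update and block the update whenever the returned list is not empty, so that a re-licensed version is reviewed by a person instead of being pulled in automatically.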
So when you go in, keep your nose open for code smells. And lastly, if you do find yourself needing to tweak the open source to suit your needs, please prefer extending the open source functionality with plugins over downstream modifications. Or, if you can, do contribute the changes upstream. The reason behind that is that vendors blocking plugins is less common than vendors blocking code modification via re-licensing. So that's for using open source.
And if you're vetting a new open source project for your company or your project, here are a few things to consider beyond the standard things that you already have in place. First, the obvious: which open source license. And remember, not all the OSI licenses are born equal. For example, the copyleft licenses. And also remember, source available is not open source.
Also ask yourselves and check who's behind the open source. Is it a single maintainer? Whether it's a maintainer or a vendor, a single entity is a single point of failure. Be careful of that. If it's a vendor, remember, it may pull the rug from underneath your feet. You can find yourself in the rights ratchet situation there. And obviously, in that respect, foundation open source is the preferred way. It provides more diversification of the entities, as much as it can. It also has its risks, of course, but it's important.
Also ask yourselves, what is the governance policy? How do they ensure that no single entity grabs control? What's the promotion path to contributor, to maintainer? Who can review PRs, who can approve PRs, and who can ultimately perform such a re-licensing? And again, here, foundations help by facilitating the governance and providing oversight in that respect. So that's a good advantage there as well.
And lastly, if these problems are a great concern for you, you can consider vendor distros or some SaaS offering over the open source, which can shield you from some of this.
So a distro is essentially a packaging of the upstream open source delivered by a vendor, but it's delivered with indemnification, along with some support and some hardware certification if you run it on-prem, or, if you run on the cloud, it could be as a SaaS model. And along the way, you can also help fund the open source, because many of the contributors to these projects are actually these companies and vendors that provide the distros or the services around the open source. So that's for vetting a new open source.
Now let's summarize what we've seen. Open source is more than a license. As we've seen, open source can turn to the dark side in many ways. It can be re-licensed, it can go rogue, or otherwise pull the rug from underneath your feet. It can happen to veteran projects over a decade old. It can happen anytime. Remember Akka just last week? So beware of the bait and switch stunt. For me, it's a personal concern to see this rights ratchet model spreading. I actually wrote a blog post about that over a year ago. You can see it here. I called it "Is Vendor-Owned Open Source an Oxymoron?" You're welcome to check it out with the QR code.
And to summarize the best practices. First, select open source wisely. Check which license it is, who's behind the project, and what governance policy is in place behind the project. Also, use open source wisely. Manage the licensing exposure. Don't auto-update without safeguards. Beware of code smells, and so on. And last but not least, build open source wisely. And again, remember, open source is not a business model. My great concern is that people will start losing trust in open source over these vendors' activities. So always ask yourselves: who's the "our" in source for the project that you're looking into?
I'm Dotan Horovits. Thank you very much for listening. And may the open source be with you. And I believe we have a bit of time for questions. So I'm glad to answer any questions that people have here. No questions? Yes, please.
I don't have a question. I have a comment.
Just a second. We need a microphone for that. Or I'll repeat the question. I thought there were some microphones. Say it, please.
I'm a bit surprised about why the license changed, regarding Elastic. And in the end, you mentioned that open source is not a business model. You also mentioned that your company used it for ages. So basically, what happened is that Elastic put a lot of money into the product, and you used it for free. And as far as I understand, perhaps I'm wrong, it all started because somebody, another entity, wanted to use Elastic for free, get all the benefits, and pay nothing. That you forgot to mention.
OK, let me repeat the question for the audience, and then I can answer. So he asked why Elastic NV, the vendor, decided to do the re-licensing, the reasoning behind it. And he also asked, as a follow-up question, about people using the open source without paying anything. That's the summary. It was a long phrasing. I hope that I summarized it briefly. So first of all, I thought that I explained that, but I'm glad to repeat the explanation. As I mentioned, and I quoted how Elastic NV explained it itself. I can't speak on their behalf, and I'm not repeating it word for word, but the quote was there. What they explained, essentially, is that they're fighting off competitors making use of the project (I think it's in a similar vein to what you said), while Elastic is doing the heavy lifting in contributing to and maintaining the open source projects. So that was the answer to the first question, which is?
It's a conglomerate.
Sorry?
It's not really.
A cloud provider is an aggregate of many, many services. It has data streaming. It has a database. So it's a conglomerate, but in that specific vertical, they had a managed service that provided, essentially, a managed version of Elasticsearch.
So that was it, and AWS was the biggest threat, obviously, because it's a giant. And they mentioned that as part of the reasoning, yes. And for your other question, you said that others use the open source without paying for it. I think that's the essence of open source. So I don't know how to react to that. This is OSPOCon, so everyone is familiar with the open source model. The model is that you have an open source project, and everyone is free to use it. The "free" is not about the cost, by the way, the fact that you don't pay money. The "free" is the freedom to use it in whichever way, shape, and form you choose, including commercial uses. And actually, the fact that the new license, the SSPL, the Server Side something License (I forget the full name there), discriminated between fields of endeavor was against the very Open Source Definition.
So you invest a lot of money in a product. And then somebody says, oh, that's super nice. We pick it up, and then we make all the money. So it's, in my opinion, a problem of economy, just like with the individual contributor. And it's a bit more complex than, oh, you should be careful about open source. It's, you get what you sow.
So you're opening a whole discussion about whether you agree with the Open Source Initiative on the Open Source Definition. I'm not going to open that. It's out of scope for this.
No, no, I'm fine with that.
But I'm just saying that what I presented was not meant to challenge the very definition of open source. Open source has a definition. You may say that the Open Source Definition is wrong, or that it needs adaptation. It's an important discussion. But what I presented is aligned with the Open Source Definition as it is. And at least I also showed the OSI's specific announcement. It said, this does not comply with the Open Source Definition. It's not open source software. If we need to change and maybe invent more licenses, that's fine.
My personal opinion here, and I didn't go into that in the talk, but I'm glad to say it: I think that extending the definition of open source to say that open source is also the business model is wrong. Someone gave this analogy. Someone tells you that the sales of oranges went down 60%, and says, OK, let's extend the definition of an orange to pears and, I don't know, grapes, and then we'll do better. So it's the same thing. Say that open source will also include some sort of commercial licensing, and then we'll have more adoption of open source. No, open source is open source by the very idea that you put something out there, and people use that for whichever need they have.
And let's remember, these vendors became successful because it was open source. So they pulled in people who might not have used them otherwise. I see that, actually, from the discussions after Akka's announcement. I followed discussions on Reddit and elsewhere. People said, we would never have chosen Akka had it not been open source. So it was a great funnel for them to reach that point. So a bait and switch, where you say, I'm open source, come use me, and afterwards you start tightening the ratchet on the licensing, I don't know. For me personally, and that's already beyond the facts, that's my personal opinion, I don't think that's the purpose of open source.
As for new licensing models, I actually had a discussion with Stefano, the executive director of the OSI, about that. I think we need to cater for more licensing options. There is a variety of licensing models within the OSI realm to cater for several models. I suggested choosing the right model. I would say that educating more about building sustainable business models is also very, very important. But again, that's a whole talk on its own, about how to build a sustainable business model around that. I hope that answered the question. Any other questions? Anyway, thank you very much for listening. You can find me tomorrow.
I'll be at the CNCF booth in the exhibition hall. If you have any questions, I'll be around today and tomorrow. Or just reach out to me, @horovits, on Twitter, LinkedIn, Quora, WordPress, whichever medium you want. I'd be happy to follow up on questions. Thank you very much for listening.