Are we good to go? Okay. Thank you for coming to our presentation today.

Before I begin, can I have a quick show of hands: who's running a Linux server in production at their company right now? Okay. And of those people with their hands up, how many of you have a legal team that's concerned that the Linux kernel is GPL licensed? I'm sorry for you. But for everyone else, the Linux kernel being GPL licensed is just a normal part of what we're doing, right? So by the end of this talk, I hope you'll also learn to stop worrying and love the GPL for eBPF-based programs, too.

With that, let's begin with the most important part. We're not lawyers. This is not legal advice. This is a summary of intent. If you really want an extremely precise and expensive answer, go talk to your lawyer. This is a free conference talk.

Okay, so what's in this talk? First, we're going to walk through the motivation: why is eBPF worth this GPL hassle? Then a quick overview of eBPF licensing considerations, a review of the history of the relationship between CNCF and GPL licensing, and then a little bit of practical guidance about how you should think about GPL-licensed eBPF programs in your application.

So, the motivation. It's no secret that eBPF is eating the world. Why is that? If you think about the Linux kernel, it's a 30-year-old technology. It doesn't innovate as fast as it used to, because it needs to provide stability: it's running on billions of devices worldwide. The Linux kernel is great because it provides a lot of performance and visibility into the system. It controls everything that happens between the application and the hardware. But it has kind of lost the flexibility of adding new functionality on the fly as you need it. You can do things like kernel modules, but they can be difficult to use, they can be unsafe and crash your kernel, and they may not always be stable. So how can we bring programmability into the kernel, and how does this benefit a cloud native environment?

That is what eBPF is. It brings flexibility and programmability into the Linux kernel, so that we can provide new functionality to meet the demands of the cloud native world in terms of scalability, performance, and complexity. It's a safe, performant, and observable way to instrument the Linux kernel, and the great thing is that it's available by default in most modern Linux distributions. So it provides a lot of benefits, but we're not here to talk about that.

A bit of history: in the old days, it could take five years to get from a user demand for something in the Linux kernel to that thing actually being available in production. That's a very long cycle. With eBPF, we can now do that in a couple of days. This drastically shortens the innovation cycle for what we can add and instrument to provide what our applications need. In the words of Brendan Gregg, this kind of gives superpowers to the Linux kernel.

Here's another way to think about it. You have your smartphone, and originally it shipped with just the applications the device provider gave you. But with the iPhone and the App Store, suddenly you could download new functionality onto your phone on the fly: an application to guide you around a new city, a way to buy metro tickets, or a way to call a taxi. That's new functionality that you needed and were able to download. eBPF allows you to do something similar. You're able to add new functionality, on the fly, to a device you already have, and that's pretty powerful.

So you now know why eBPF is great, but what do you actually have to provide when you put eBPF in your project? There's a lot on this slide, but I'm going to break it up into three pieces.
The first thing you have to provide is the eBPF bytecode that needs to be loaded into the kernel. This is what ends up, practically, needing to interact with GPL code, so you have to be careful there. It optionally interacts with a user space application that you also provide. This is the normal setup in a project: you have a user space application and eBPF bytecode. The user space application usually does all your business logic, interacts with the kernel via maps, and is usually run privileged. There's an asterisk on "usually": there are a small number of use cases where you don't have to run as a privileged application.

The Linux kernel provides a whole bunch of stuff for you already. There's a syscall API that you use to load the eBPF programs and the maps they use to exchange data with user space. There's also a just-in-time compiler, a verifier, and a virtual machine, all inside kernel space. This is sort of the real magic: you don't have to provide any of those components. You just have to load your bytecode into the kernel.

So here's a really simplified, very high-level lifecycle for an eBPF program. You start out in user space and compile your source, usually C or Rust (though I guess we have Go now, right?), into eBPF bytecode. Then you use the syscall API to load that bytecode into kernel space, where the verifier takes over. Your application then does whatever other business logic it needs.

Before I get to maps: in a modern user space program that uses libbpf, you can wrap those two pieces together into one application, and the application will then load the bytecode for you, right?
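To make the idea of "bytecode wrapped into the application" a little more concrete, here is a minimal sketch in C. It assumes the smallest possible eBPF program (set r0 to 0, then exit), written out by hand as raw bytes. The names `prog_blob` and `prog_insn_count` are hypothetical and exist only for this illustration. A real application would get the blob from the compiler and hand it to the kernel through libbpf or the bpf() syscall, which is deliberately not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Every eBPF instruction is a fixed 8 bytes:
 * opcode, registers, 16-bit offset, 32-bit immediate. */
#define EBPF_INSN_SIZE 8

/* Hypothetical embedded program: "r0 = 0; exit".
 * 0xb7 = 64-bit move of an immediate into a register, 0x95 = exit.
 * To the user space application this is just opaque data. */
static const uint8_t prog_blob[] = {
    0xb7, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* r0 = 0 */
    0x95, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* exit   */
};

/* Number of instructions in a blob, or 0 if the length is not a whole
 * number of instructions (illustrative helper, not a kernel API). */
static size_t prog_insn_count(const uint8_t *blob, size_t len)
{
    assert(blob != NULL);
    if (len % EBPF_INSN_SIZE != 0)
        return 0;
    return len / EBPF_INSN_SIZE;
}
```

The point of the sketch is only that the application carries the bytecode as data and never executes it in user space. Everything that runs those bytes happens on the other side of the syscall boundary.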
You don't have to do it as separate steps. Modern applications do that as part of normal startup.

If you flip over to kernel space: once you've loaded the bytecode via the API, the kernel verifies that it's safe to run and that it's possible to run. Then it attaches it to the correct kernel event hooks, whether that's in networking or the file system layer or wherever the eBPF program is supposed to be in line and executing. From then on it's an event-driven executable. It's not async, and it's not running in parallel on some sort of thread. As event data or kernel data flows through the kernel, the kernel uses these hook points to activate the eBPF programs at that point, which is what makes it super efficient.

The big thing I want to come back to at the very end is that the kernel space side and the user space side interact via these eBPF maps, through the syscall API. This is how you communicate back and forth: you're able to get observability data out of the kernel, or you're able to reconfigure eBPF programs to take different actions.

So now that we know why we want to use eBPF, and a little bit about what it is, let's dive into the licensing overview. Why are we even talking about the GPL if it's so scary for so many people? eBPF has its origins in the Linux kernel, and a lot of what we're going to talk about is licensing concerns specific to the Linux kernel implementation. Because the Linux kernel is GPLv2 licensed, we have to talk about how that affects the way eBPF programs are licensed. So the big question is: do eBPF programs need to be GPL licensed? Well, like any good lawyer would say, it depends.
We have a few more billable hours that we need this month.

When eBPF bytecode is loaded by the Linux kernel, the kernel checks whether it uses any helper functions that are marked GPL-only and whether the license of the eBPF program is compatible with the GPL. If the program uses a GPL-only helper and doesn't have a compatible license, the kernel rejects it. Not all eBPF helper functions are GPL-only. However, as Alexei, one of the co-founders and co-maintainers of eBPF, would say, all meaningful BPF programs are GPL licensed. So the practical answer at the end of this is yes: eBPF programs need to be GPL licensed.

One example that shows why the practical answer is yes is the helper function that generates log output. If you have a non-GPL-licensed eBPF program and you want to add that helper for debugging purposes, you actually need to change the license to get the kernel to accept the program. That's why it's highly impractical to avoid licensing eBPF bytecode as GPL. So as you're thinking about adding eBPF to your project, you should think about how GPL licensing fits into it overall.

As Bill just said, the kernel actually has a very specific call where you tell it which license applies to the eBPF program you're about to load. For every eBPF program, you have to mark it as licensed a certain way. This is separate from the other license auditing you're doing: it's a declaration you make to the kernel runtime, or the eBPF verifier, so that it's acceptable for your program to use certain helper functions. And it uses a different nomenclature than what you'll see in the header files. So it comes down to this: what does the kernel want you to choose, and what is available?
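The load-time check described above can be sketched in a few lines of C. This is a simplified user space model, not kernel code. The list of accepted strings mirrors the kernel's own `license_is_gpl_compatible()` helper, while `struct helper` and `program_accepted` are hypothetical names invented for this illustration. The sketch also assumes that the log helper mentioned in the talk is `bpf_trace_printk`, which the kernel marks GPL-only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* License strings the kernel treats as GPL compatible. Note that these
 * are not SPDX identifiers: "GPL-2.0" would not match. */
static bool license_is_gpl_compatible(const char *license)
{
    static const char *const ok[] = {
        "GPL", "GPL v2", "GPL and additional rights",
        "Dual BSD/GPL", "Dual MIT/GPL", "Dual MPL/GPL",
    };
    assert(license != NULL);
    for (size_t i = 0; i < sizeof ok / sizeof ok[0]; i++)
        if (strcmp(license, ok[i]) == 0)
            return true;
    return false;
}

/* Toy helper descriptor: only the gpl_only flag matters here. */
struct helper {
    const char *name;
    bool gpl_only;
};

/* Simplified gate: reject a program that uses a GPL-only helper while
 * declaring a license that is not GPL compatible. */
static bool program_accepted(const char *license,
                             const struct helper *used, size_t n)
{
    bool gpl_ok = license_is_gpl_compatible(license);
    for (size_t i = 0; i < n; i++)
        if (used[i].gpl_only && !gpl_ok)
            return false;
    return true;
}
```

In this model, a program that only calls a helper that is not GPL-only, such as `bpf_map_lookup_elem`, is accepted under any declared license. Adding `bpf_trace_printk` for debugging gets it rejected unless the declared string is one of the six above.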
There's actually a list in the kernel licensing documentation of acceptable licenses. Generally you're going to want to pick a dual license. You can do GPLv2 alone, but, especially in CNCF, I think you'll want to choose a dual license. What matters is that it's GPL compatible at runtime, when the eBPF program loads into the kernel. But CNCF doesn't actually allow the GPL as a license under its IP policy, so CNCF projects that have used eBPF have had to ask for an exception. Historically that has made it difficult to add eBPF and get it through the process. There have been some recent changes there, but before we talk about them I want to finish with the other licensing issue you may have run into in discussion.

The other issue, apart from the eBPF side, is: what do you have to do with the user space executables that interact with these eBPF programs? The truest answer is "it's complicated, contact your lawyer." But the documented intent from the Linux kernel developers is that this should be sort of a normal, reasonable practice, because everything in user space here is using the Linux syscall API, which has a very specific carve-out as an exception, so that it's treated as user space and not as part of the kernel.

There are a lot of terms associated with licensing, and before I go any deeper I think it's important to reflect on what these terms are, especially in GPL licensing. What are we really interacting with here? What is the terminology?
One of the big issues is: are eBPF programs derived works of the kernel? That's sort of why they have to be considered GPL licensed. If you think about kernel modules, they absolutely are, because you're interacting with a very specific Linux kernel and you have to link against a specific version. With eBPF programs it's a little more complicated, but the compiled bytecode, even if it comes from separate sources, is generally considered a derived work because it interacts with the Linux kernel and uses its functions. Or at least that's the intent. Again, we're not lawyers.

The other issue is what happens when you take your bytecode and your application code and package them together. Is that a modular program, or is it an aggregate? The documented intent is that they should be treated as aggregates when it's eBPF, because the eBPF program is using the syscall API. The thing with aggregates is that you have two programs that, even if they interact, use normal mechanisms provided by the OS. Normally you'd see this in your shell, where programs interact via pipes or file descriptors. The Linux syscall API is meant to be similar: you interact across the syscall, and the two sides are considered separate programs from that point on. So even though you've developed them together and they're intended to interact, they are considered an aggregate when you package them, and you're able to load the bytecode from the application. And like I said, it all comes out of the intent around the Linux syscall API license exception note.
I think the Linux kernel developers have done a really good job of documenting the intent here. They actually provide this syscall note on all the header files that make up the syscall API, so that you can be clear about their intent as to where the boundary is between user space and the Linux kernel itself.

Here's a little diagram of what I'm talking about. In a modern eBPF user space application, you bundle the bytecode that you compiled as just a data blob, and then you use the Linux syscall API to load it across the boundary. This is why it's considered an aggregate: we're using that syscall boundary, instead of just loading it into memory inside user space as sort of two highly connected functions or applications.

And this is the reference for why I say there's clearly defined intent from the Linux kernel developers' side. There's actually some documentation that specifically says that, generally, they expect user space programs under a proprietary license to be able to interact with eBPF programs. Again, this is about the intent that's been documented.

There is another remaining question about aggregate versus modular, and that's eBPF maps. If two programs share memory in a highly detailed way, where you have to know the internals of the memory, you may have to consider that a modular program. So a question can come up about whether maps are that sort of structure. Again: it's complicated, contact a lawyer. But the practical answer is that, because these maps are exposed through an interface via the syscall, the intent is to consider them as functions an aggregate can take advantage of. A great example is bpftool, which is used by everybody as a diagnostic to see what maps are doing, right?
It's independently developed, and it has access to all the maps of all the other programs running on the system. So you aren't that intimately tied to the map, even though you provide it as part of spin-up. When your application is loading its bytecode, bpftool can actually access that map, and you can get diagnostic information from it.

One word about that, though: you should think about documenting your maps. A lot of the map types are sort of self-describing. They're simple, and they're defined on the Linux kernel side. But there are a couple of types that are very flexible and can appear as sort of a blob of data that you can't really peer into diagnostically. There's actually a solution here. If you're already writing a modern compile once, run everywhere (CO-RE) BPF program, you're already getting documentation of these types for free, automatically, because there's a thing called the BPF Type Format (BTF), and to make CO-RE work you have to implement it. Once you've used it, bpftool can see inside your structs and actually give you diagnostic information. It's an important thing to do from an intent standpoint, because your own users who need to diagnose these things should be able to look into these maps.

So, a real quick licensing recap after all of this. User space, practically speaking, can be licensed however you want. If you're a CNCF project, that's not true: you have to abide by the CNCF policy and pick a license that CNCF is okay with. For the eBPF bytecode side, generally you're going to want to choose a dual license, with the GPL plus some other permissive open source license. And for CNCF projects, that has been an exception to the rule.
So let's talk about that now.

CNCF and the GPL have had, as Jeff has been hinting, an exceptional relationship. It's kind of interesting: CNCF is a sub-foundation of the Linux Foundation, which obviously hosts the Linux kernel, and obviously the Linux Foundation supports the GPL. However, CNCF, being a sub-foundation, has its own separate IP policy. The key line here is that all outbound code will be made available under the Apache License, Version 2.0. That's what Jeff was talking about: all the user space applications for CNCF projects have to be Apache 2.0. However, under this IP policy, projects can also ask the governing board for exemptions. So using eBPF in a CNCF project has historically been complicated, because the GPL is not specifically allowed within CNCF's IP policy.

However, that's not really going to work, and the governing board saw that there's really no stopping eBPF in cloud native. Cilium, Falco, Blixt, Istio, Pixie, Inspektor Gadget, and even Kubernetes are using eBPF right now. A lot more projects are considering it, and I wouldn't be surprised if almost every single project on the CNCF landscape were using eBPF in some way in the future. So to really take advantage of eBPF, we need to reconsider how we think about licensing. And the big news here, which Jeff has been hinting at, is that the governing board recently approved a blanket eBPF licensing exception. Essentially it says that CNCF allows projects to have eBPF bytecode licensed as GPL 2.0 or later. This is great news, because now we can have eBPF programs in CNCF projects.

So that's the state of the licensing. But you still have to do some things in your project to simplify this and make sure that you manifest the statement of intent that we've been discussing. So, the first thing: these are my personal opinions.
I am not a lawyer. This is not legal advice. These aren't even my employer's opinions, nor are they Bill's opinions. These are mine. Sorry, Bill.

We can use Cilium as a guide here, as a project that's doing some things the CNCF had already looked over as sort of okay practice, even before the exception.

The first thing you should probably think about is making sure that the eBPF source code that has to be GPL compatible is compartmentalized in your project a little bit. It doesn't have to be its own repo. It can be a subtree, but it needs to hang together and be obvious to everybody looking at it that this is the portion of the code that will be compiled into eBPF bytecode. I think you should probably choose a dual license instead of just the GPL by itself, for sort of future-proofing reasons. And you want to make sure that this complies with the intent in the documentation around eBPF.

One thing you'll also have to watch out for is that the mechanism I spoke of, how you tell the Linux kernel that your program is GPL, is different from your license audit mechanism. At the file level you'll need the SPDX license headers, and those use a different terminology than what the kernel requires in its call. So keep up with that and make sure you don't have drift there: that you don't accidentally put wrong or incompatible nomenclature in either the license header at the top of the file or in what you actually say to the Linux kernel.
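One way to guard against that drift is a small consistency check in the build or in CI. The following is a minimal sketch in C, under stated assumptions. The `policy` table is a hypothetical project policy that pairs each SPDX expression used in the eBPF source tree with the license string declared to the kernel. The pairs shown are examples only, not an official mapping, and `license_in_sync` is a made-up name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical project policy: which kernel license string should
 * accompany each SPDX expression in the eBPF source tree. */
struct license_pair {
    const char *spdx;          /* as written in the file header */
    const char *kernel_string; /* as declared to the kernel     */
};

static const struct license_pair policy[] = {
    { "GPL-2.0-only",                 "GPL" },
    { "GPL-2.0-only OR BSD-2-Clause", "Dual BSD/GPL" },
    { "GPL-2.0-only OR MIT",          "Dual MIT/GPL" },
};

/* True when the header and the declared string agree under the policy.
 * An SPDX expression that the policy does not list is treated as a
 * mismatch so that a human reviews it. */
static bool license_in_sync(const char *spdx, const char *kernel_string)
{
    assert(spdx != NULL && kernel_string != NULL);
    for (size_t i = 0; i < sizeof policy / sizeof policy[0]; i++)
        if (strcmp(policy[i].spdx, spdx) == 0)
            return strcmp(policy[i].kernel_string, kernel_string) == 0;
    return false;
}
```

A check like this catches the common slip of pasting the SPDX identifier into the kernel-facing declaration, or changing the file header without updating the declared string. How the two values are extracted from the source files is left out of the sketch.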
It can drift, so you have to watch out for that.

Going above and beyond that, and again this is my personal opinion: do what you can to document those maps. You should think about the maps as a user-space-facing interface, which means you should think about them as an API that you document. There's probably not a single project that does this 100% to the extent I'd like to see, but the BTF format gets us really close, to the point where you can actually debug, from user space, these complicated map structures that applications are using. Practically speaking, that's a big benefit to everybody, even developers when they're interacting with their own users and need to debug these things. I sort of feel like there's an opportunity to go a little further here. Maybe we can get some more automated tooling and better generated documentation, but that's sort of a future concern.

So what are the future concerns? What if all the things we've told you so far had to be thrown out the window? Where does that leave us? As I said before, eBPF came from the Linux kernel, but that's not the only place you can implement it. It's possible to implement eBPF for other operating systems, for things like microcontrollers, or for Windows. eBPF for Windows is a project on GitHub right now, and you can go check it out. eBPF is coming to other types of operating systems. Because these other operating systems may not be GPL licensed, eBPF programs for them may not need to be GPL licensed either. Some of these implementations, maybe for edge computing or for running Windows nodes on Kubernetes, are going to be relevant to CNCF projects. One open question to take away from this is: can CNCF get ahead of Windows eBPF licensing requirements before they start showing up as requests from projects? Because we don't want to go through this again.
That's the short story.

So that's the end, and we're just going to wrap up here. If you want to learn more about eBPF, I recommend checking out ebpf.io. It's available in multiple languages, including French, Italian, and Portuguese. We're also looking for people to help translate it into more languages, to bring eBPF to other regions. I know there are currently translations in progress for Italian, Swahili, and Chinese. So if you speak another language, please talk to me after this talk.

And one last thing, shameless self-promotion: I just released a book. I haven't even seen it yet, because it arrived at the shipping dock at 10:30 this morning. The book is Buzzing Across Space: The Illustrated Children's Guide to eBPF. I'll be signing at 1:30 at the Cilium experience center. If you want to see it for the first time at the same time I do, come by.

With that, thank you for coming. Unfortunately, we do have a little bit of time for questions, and we're not lawyers, but we can take any questions you have.

Maybe I'll start. I'm a novice with eBPF. I know a little bit of it, but if I have to start learning, where do I start?

Yeah, so I'd check out the resources I was recommending. ebpf.io has a lot of good ways to get started. The children's book also breaks it down pretty simply. And if you want to go past that, there are two more books, What Is eBPF? and Learning eBPF, available on the Isovalent website.

Thank you.

Any other questions? Last call. Okay, thank you for coming today. See you around the conference.