Hi, I'm Liz Rice. I'm the Chief Open Source Officer at Isovalent, the company behind the Cilium networking project, and I'm also the chair of the Technical Oversight Committee at the CNCF. Today I want to talk to you about a technology I'm very excited about called eBPF, and how it's bringing superpowers, particularly for networking, security, and observability, to the world of cloud native.

So let's start by saying what eBPF is. It stands for extended Berkeley Packet Filter, but I'm not sure that's terribly helpful. What eBPF allows us to do is run custom programs within the kernel. You can write custom programs, and you don't need to reboot your machine; you can add and remove these programs dynamically, and it's incredibly powerful. Over the last few years there's been a lot of innovation and development within the kernel itself, and also in the surrounding toolchains, verifiers, and other components that have made this eBPF technology possible. And nowadays, most of the Linux kernels that people are using in production support sufficient eBPF functionality that we can really start making use of it in the world of cloud native today.

You may have heard that an eBPF Foundation was created recently, and this really concentrates on the underlying eBPF technology: the changes to the kernel and the toolchains that enable projects to build on top of eBPF. The eBPF Foundation is, like the CNCF, part of the Linux Foundation. So where the eBPF Foundation concentrates on that underlying technology, there are several projects within the CNCF that make use of it, and we're going to look at some examples of using these projects to explore some of the power that eBPF gives us.

But before we do that, let's just understand what we mean by eBPF programs. Our user space applications can load eBPF programs into the kernel. This can happen dynamically, as and when required. We also have to attach those eBPF programs to an event, and when that event happens, it triggers the associated eBPF program. What kind of events? It could be the arrival of a network packet. It could be a user space application making a system call. It could be hitting a particular tracepoint or one of the dedicated hooks within the kernel. There are a lot of different places we can attach these eBPF programs.

So let's make this concrete by looking at a very, very basic example of an eBPF program; this is going to be like a hello world. This is my eBPF code that I'm going to run within the kernel. It's very straightforward: it's called hello, and all I'm going to make it do is write out a trace message, let's say "Hello KubeCon". And I'm going to attach this to the event of entering the execve system call. execve is used every time we execute a new program, so whenever any new program is executed on my virtual machine, it's going to trigger this eBPF program. I've also got some user space code that's going to read the compiled object created from that C code, load it into the kernel, and attach hello to that execve system call. So when I make this, it's got two steps: it does a go build to build the user space part of the code, and it also uses clang to compile the C code into the eBPF bytecode that will run in the kernel. I need to run this as root. And that's attached my tracing function to execve system calls, and we can see there's lots going on on this particular virtual machine.
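To make that a little more tangible, here's a rough sketch of what the kernel-side C code for a hello world like that can look like. This isn't the exact code from my demo; it's a minimal example in the libbpf style, attached to the execve tracepoint, and the file and function names are just illustrative:

```c
// hello.bpf.c -- minimal sketch, not the exact demo code.
// clang compiles this to an eBPF object file, and a small user space
// loader (Go, in the demo) loads it into the kernel and attaches it.
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

SEC("tracepoint/syscalls/sys_enter_execve")   // run on entry to execve
int hello(void *ctx)
{
    bpf_printk("Hello KubeCon");   // writes to the kernel trace pipe
    return 0;
}

char LICENSE[] SEC("license") = "Dual BSD/GPL";
```

Once a program like this is loaded and attached, its trace messages show up in the kernel's trace pipe (/sys/kernel/debug/tracing/trace_pipe), which is typically where output like this is read from.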
You can also see that the tracing function I've used automatically gives us some additional contextual information about what was happening when this trace was generated. We can see the name of the executable, we can see the process ID, and there's timestamp information. So eBPF programs, when they run, have this additional context available, and that can be very powerful.

So that's my eBPF hello world example, and I hope it illustrates that it was triggered by processes running across the whole virtual machine. eBPF programs are aware of everything that's happening across the entire kernel; we're seeing executables being run in lots of different processes here.

When we run an eBPF program, it has to not crash the kernel, and it has to not leave the kernel in an infinite loop. Those things would be bad. In order to ensure that an eBPF program is safe to run, there's a component called the verifier. The verifier will check, for example, that every time you have a pointer that you want to dereference, your eBPF program checks that the pointer is not null before you dereference it; that ensures you're not going to dereference a null pointer. eBPF programs are limited in what memory they can access, and the verifier will check what memory the program is trying to access. It will also check for things like not falling into an infinite loop. (I'll show a tiny sketch of that null-check pattern at the end of this section.)

Because the verifier is ensuring that the eBPF program complies with these rules, eBPF is sometimes described as a sandbox, and that's absolutely true. But for those of us used to the idea of containers, it can be a little bit confusing, because people also describe containers as sandboxes, and these are very different types of sandbox. One misconception that I've heard is people thinking that maybe eBPF is a replacement for containerization, and that is not the case. Both are kinds of sandbox, but they're for very different purposes.

So let's take a look at what it means to be able to run an eBPF program in a Kubernetes environment. In Kubernetes, all our application code is running in containers that are grouped together into pods. But however many containers we have, however many pods there are, there is one kernel per node. Whether that node is running on a bare metal machine or a virtual machine, there is one kernel for that machine. And every time our applications running in pods want to do anything interesting, they have to get help from the kernel to do it. For example, if your code wants to read or write a file, or send or receive a message, or even when Kubernetes is creating new containers, for all of these activities the kernel has to be involved. That means the kernel is aware of everything that's happening in all the pods and uncontainerized processes on that host.

The kernel is aware of everything on the host, and we can attach eBPF programs to that kernel, so our eBPF programs can be aware of everything that's happening on the host. We can, for example, redirect or drop network packets from inside eBPF, and that's the basis of eBPF networking as used by Cilium. We can bring together the information obtained by eBPF programs, share it with user space, and pull it together into really powerful observability applications. And we can use it to detect potentially malicious activity for security purposes.

One thing I want to be clear about is that I doubt very many of you will need to write eBPF programs yourself. I showed you that hello world to make it concrete.
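Here's that tiny sketch I promised of the null-check rule the verifier enforces. It's a hypothetical fragment, not something from the demo: it counts execve calls in an eBPF map, and because bpf_map_lookup_elem can return NULL, the verifier will refuse to load the program if that pointer is dereferenced without being checked first:

```c
// Sketch only: the result of a map lookup must be NULL-checked,
// otherwise the verifier rejects the program at load time.
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct {
    __uint(type, BPF_MAP_TYPE_ARRAY);
    __uint(max_entries, 1);
    __type(key, __u32);
    __type(value, __u64);
} exec_count SEC(".maps");

SEC("tracepoint/syscalls/sys_enter_execve")
int count_execs(void *ctx)
{
    __u32 key = 0;
    __u64 *value = bpf_map_lookup_elem(&exec_count, &key);
    if (!value)                        // mandatory: the pointer may be NULL
        return 0;
    __sync_fetch_and_add(value, 1);    // safe to dereference after the check
    return 0;
}

char LICENSE[] SEC("license") = "Dual BSD/GPL";
```

If you delete that if (!value) check, the program still compiles, but the verifier rejects it when you try to load it, which is exactly the kind of safety guarantee that makes it reasonable to run this code inside the kernel.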
But most of us can simply use the eBPF-based applications that have already been written by projects that make your life easier. eBPF programming can get really complex, so unless you have a good reason to do it yourself, you will probably be better off using the tools created by different projects.

So let's take a look at some examples of using CNCF eBPF projects to illustrate some of the powerful things that we can do with this technology. I have a Kubernetes cluster running a few different sample applications, with Falco and Cilium and Pixie all installed.

Let's start by taking a look at some network flows. This is using Cilium's Hubble UI, and we can see a service map here showing us how network traffic is flowing between different services in this particular namespace, tenant-jobs. Because Cilium, as a CNI, is aware of the Kubernetes entities making and receiving these network requests, it can map network flow details to the pods and the services that made and received those requests. So we can build a nice UI like this and see, for example, that a service sent a message to the Elasticsearch service, and we can dig into the details of that HTTP request. So that's an example of using Cilium for Kubernetes-aware network information.

As another example, let's take a look at Falco detecting one of its rules being violated. Falco, much like my hello world example, attaches to system call events, and this particular rule is, like my hello world, looking for processes being executed using the execve system call. This rule is looking for processes called nc, Netcat, which is often used for malicious activities. So let's try to trigger this rule. First of all, I'm going to tail the logs from Falco, and there are a few things that have been detected in the past already. Now I'm going to exec into a pod in this cluster and run a shell, and we should straight away see another rule being detected. There it is: a shell was spawned in a container with an attached terminal. And if I run Netcat, it won't actually do anything useful, but the point is that the system call was detected, it triggered the assessment of that rule, and Falco emitted a warning about Netcat running inside this container.

Now let's turn to observability. We already saw in Cilium's Hubble UI the ability to observe network traffic. Pixie allows us to run scripts to generate lots of other kinds of observability data. I quite like this example; let's run it again just to prove that it's live. It shows a flame graph of how resources are being used, so we can dig in to see which applications are using the most resources, and we can see the whole stack trace to understand where our CPU is being used across the cluster. We can also get some interesting data about node resource usage using the Pixie UI.

So we've seen that eBPF applications can offer very powerful observability, because they can access everything that's happening across an entire node. They can also give us some real performance improvements, and I want to touch on how eBPF improves network efficiency. When we run applications in a pod, the pod has its own networking stack, and that's connected to the networking stack on the host through a virtual Ethernet connection. So imagine a packet arrives on the physical interface of this host and is destined for a pod on this host. That packet has to traverse the whole networking stack on the host to get sent on through that virtual Ethernet connection to the pod.
As an eBPF application, Cilium can hook into an event called XDP, which fires the moment the packet is received on the host, before it has hit the networking stack at all. And because Cilium was responsible for setting up the networking inside the pod, it can detect that the packet is destined for that pod and route it there directly, avoiding having to go through the whole IP stack on the host. This leads to some genuine performance improvements. (I'll show a tiny sketch of what an XDP program looks like at the end of this section.)

We can see this in another flame graph. Here we can see what's happening when a packet is received: it spends some time being received in eBPF on the host, and then it's sent into the TCP/IP stack within a pod. I think that probably illustrates how this is more efficient than it would be if we also had to run through the TCP/IP stack on the host, but more importantly, we measured it, and there's lots of detail in the CNI benchmarking blog post on the Cilium site. The blue line on the left shows how fast requests and responses can happen with no containerization at all, just sending and receiving between two nodes. The red and orange bars show that we can achieve very close to the same network performance using eBPF; those are Cilium and Calico in their eBPF modes. And you can see that's significantly better than the yellow and green bars in the middle, which are Cilium and Calico in non-eBPF mode. So the takeaway here is that eBPF can enable faster networking, and there's a lot more detail in that benchmark blog post.

So eBPF allows us to build this huge range of different networking and observability and security tools. But the characteristic of being able to see everything that's happening across an entire node gives us another really significant advantage: we don't need to change the application, or the way an app is configured, in order to instrument it with eBPF. Contrast that with the sidecar model that we typically use for a lot of instrumentation in the Kubernetes world. Nathan Leclerc did a really great cartoon illustrating how eBPF is a great new model for instrumenting applications. Let's explain why that is.

In a sidecar model, we have a container for the instrumentation that's injected into the application pod. In order to get that sidecar into the pod, there has to be a YAML definition. Now, you probably don't manually configure that YAML; it's probably done as part of an automated process, perhaps in CI/CD, perhaps through an admission control webhook. Something adds the YAML definition for the sidecar container into the pod specification. But what if that goes wrong? Or what if somehow an application doesn't go through that automated process? If you don't have the sidecar in the pod, it's not instrumented. You have to get that sidecar successfully configured into every application pod that you want to instrument.

That isn't the case if we're using eBPF, because the eBPF program is hooked into the kernel, not into the pod. It can see what's happening in any pod without any need to change or reconfigure that pod. That makes it much more secure in terms of being able to see every application that's running, whether it got there by fair means or foul. If somebody manages to maliciously run some code on your cluster, they probably won't go to the trouble of adding a sidecar to it. With eBPF, it will still be instrumented, because it's running on that shared kernel.
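And here's that tiny XDP sketch I promised. This is a toy example, nothing like Cilium's real datapath: it just shows the shape of an XDP program, which sees the raw packet as soon as the driver receives it and returns a verdict. A CNI can use a helper such as bpf_redirect at this point to steer traffic towards a local pod instead of letting it climb the host's IP stack:

```c
// Toy XDP sketch -- not Cilium's datapath. It passes every packet up
// the normal stack; a real CNI would parse headers here and redirect
// traffic destined for a local pod (e.g. with bpf_redirect()).
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

SEC("xdp")
int xdp_hello(struct xdp_md *ctx)
{
    void *data     = (void *)(long)ctx->data;       // start of packet
    void *data_end = (void *)(long)ctx->data_end;   // end of packet

    // Any access to packet bytes needs a bounds check like this,
    // or the verifier will reject the program.
    if (data + 14 > data_end)    // not even a full Ethernet header
        return XDP_PASS;

    return XDP_PASS;             // hand the packet to the normal stack
}

char LICENSE[] SEC("license") = "Dual BSD/GPL";
```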
If we use eBPF tooling to instrument our Kubernetes applications, you've seen how we're able to collect a really wide range of different information, different characteristics about our applications, and we can use that for security forensics. Here's an example of something we've been experimenting with in Cilium. You've already seen how we can map network flows to Kubernetes identities: we know exactly which pods and services are involved in any given network flow. You've also seen, even in the hello world example, how we can see contextual information about the process that was involved when a particular eBPF program was triggered. If we combine those two things, we can get some really deep insight into what happened, perhaps for malicious activity.

In this example, we can see a pod that has network connections to two completely legitimate destinations, the Twitter API and an Elasticsearch service. And we can see not only exactly which pod has those connections, but within that, which executable, in which process ID, in which container, generated those network connections. We can also see when this happened. So imagine that one of those destinations was a command and control center for malicious activity, or perhaps a cryptocurrency mining pool. We'd be able to see exactly what process caused that network connection to the dubious endpoint, we'd be able to see when it happened, and we'd be able to see what pod was presumably compromised. With this information, we would find it much easier to track back whether the pod was compromised at runtime, or whether it was already compromised before it was even deployed, perhaps at some point during the supply chain. This kind of deep forensic insight that we're able to collect using eBPF is hugely powerful.

So I hope this has given you some insight into why I believe eBPF, and the ability to run custom code in the kernel, is revolutionizing the kind of tools we can build for networking and security and observability. It's creating a whole new generation of amazingly powerful tools to help you operate your clusters safely. Thank you so much for your time. I do hope you have questions, so come and ask me your questions. And if you want to find out more about eBPF, the foundation site is at ebpf.io. And of course, you can find out more about Cilium and about Isovalent on our websites. Thank you.