So welcome, everyone, to our session on BPF, where we hope to go beyond the buzzword and explain how this kernel technology can have a real impact on your daily Kubernetes experience. My name is Andy Randall. I'm the business guy at Kinvolk, an open source company based in Berlin.

My name is Alban Crequy. I'm a co-founder at Kinvolk and the director of our labs team, which is very active in various Kubernetes- and BPF-related projects.

Yeah, and I know it's always a struggle working out who's speaking in a joint talk, so to help you out, just remember that I'm the one with the British accent; that's Andy.

And I am Alban, the one with the French accent.

So first of all, hands up those in the audience who have heard of BPF or eBPF. Okay, I see a few of you out there, a few at the back, maybe not so sure. For those of you who aren't so clear: the concept of eBPF, or extended Berkeley Packet Filter, is pretty simple at a high level. First of all, it allows the user to write programs that run in kernel space. As Brendan Gregg of Netflix put it, this is a fundamental change to a 50-year-old kernel model, because up till now you've been used to writing programs that run in user space; now you can actually run them in the context of the kernel. These programs are not completely unrestricted: there are certain hook points where they can attach into the kernel, and specific data structures and helper functions available to them. And lastly, they run in a restricted virtual machine environment, which acts as a sandbox, and all the code is verified against strict conditions to ensure, for example, that it will complete and not hang the kernel in an infinite loop.

So why should you care? As a Kubernetes application developer, you're probably not going to be writing BPF code, but you are going to be using capabilities that BPF provides. The first of those is the fast, customizable networking that becomes available with BPF. Next, anyone who writes applications needs to debug them, and BPF provides some pretty cool tools for surfacing what's going on in the kernel, which helps you debug either functional problems or performance problems. Lastly, from an operations perspective, BPF gives you capabilities that really help when it comes to monitoring applications, particularly from a security perspective, to check that you haven't got attacks in your cluster. So these are all pretty good reasons to care about BPF, I think. At this point, I'm going to hand over to Alban to talk a little bit about the history of BPF and how we got to where we are today.

Thank you. So it started in 1997, when the first BPF support landed in Linux. It was used by tcpdump, so for network capture, and that's pretty much it until much later, in 2014, when we got the new eBPF, for extended BPF. Alexei Starovoitov came up with the idea of a universal in-kernel virtual machine, where you can have many different eBPF programs running together for a variety of applications. A bit later, in 2015, the IO Visor project was established; it came initially from technology from PLUMgrid, and was then moved under the Linux Foundation umbrella. A bit later, in 2016, we got XDP, for eXpress Data Path. The idea is to run a BPF program very early in the network stack, ideally right in the NIC driver, so that we can filter packets and block unwanted traffic.
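To make that concrete, here is a minimal sketch of attaching a compiled XDP program to an interface with iproute2. The object file name xdp_drop.o is a placeholder, and the section name depends on how the program was compiled:

    # attach a compiled XDP program to eth0 (xdp_drop.o is a hypothetical object file)
    $ ip link set dev eth0 xdp obj xdp_drop.o sec xdp
    # verify it is attached
    $ ip link show dev eth0
    # detach it again
    $ ip link set dev eth0 xdp off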
That's what makes XDP so fast. In 2017, in Linux kernel 4.11, we got new data structures like LPM trie maps, which make it possible to match network traffic against large numbers of IPs or CIDRs. Also in 2017, we got a feature called sockmap: a new kind of BPF map used to speed up intra-host networking by bypassing part of the network stack. That's used by Cilium with Istio, when we run the Istio service mesh together with Cilium, for example. In 2018, Facebook announced the Katran project, a BPF load balancer, and told the world that they are using it in their network. Also in 2018, we got a new BPF helper function for filtering events by cgroup, which is useful for containers; for example, I used it in traceloop and Inspektor Gadget to figure out which trace came from which pod and which container. And lastly, in kernel 4.8, sorry, 5.8, we got the new BPF ring buffer, a communication mechanism between a BPF program in the kernel and user space that is much more memory-efficient than its predecessors. So I picked a few of those items to show how they are relevant to Kubernetes, for example with Cilium or Istio, or for tracing tools on Kubernetes.

So let's take these capabilities that I've talked about and look at the landscape of tools out there that you can use. The first layer I want to talk about is a set of low-level tools, many of which have been around for quite a while, which give you very low-level access to BPF capabilities, typically on an individual host at the process level: something like bpftool, which actually ships with the Linux kernel, or the BPF Compiler Collection (BCC), which we'll talk about in a minute. If you're coming at BPF from a programmer's perspective and want to write programs to inject into the kernel, there are libraries available that wrap a lot of these capabilities to make things a lot simpler; whether you're writing Go, C++, Rust or Python, there's probably a good library out there for you.

When we get to the next layer up, we start to look at more complete solutions rather than individual APIs and libraries, and security and networking is one of the main areas where BPF is used. Cilium was one of the first BPF-enabled networking solutions for Kubernetes, but Calico has added BPF support recently as well; there's Katran, as I mentioned; and Falco, from Sysdig, is used for security applications. So there are a lot of things you can plug into Kubernetes there. The last layer is visibility. This is something anyone with an operations hat on should be looking at: how can these tools give me greater insight into what's running in my cluster? Things like Weave Scope or Hubble provide the ability to see in real time what network connectivity is happening and how things are connecting in the cluster. There are a few other tools as well which don't fit into any of these categories, things like Cloudflare's ebpf_exporter project, or netcost, which analyzes your network traffic to give you some idea of what it will cost you to run it in the cloud.

As an example of one of these tools: Hubble, from the Cilium folks, sits on top of the Cilium CNI plugin, surfaces metrics, and also renders a graph view of your traffic in the Hubble UI. It looks something like this, nicely displayed so you can see what the network traffic is.
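To give a feel for that low-level layer, here's a minimal sketch of exploring BPF on a single host with bpftool; the output will vary by kernel version and distribution:

    # list the BPF programs currently loaded in the kernel
    $ bpftool prog show
    # list the BPF maps those programs use
    $ bpftool map show
    # dump the contents of one map by id (42 is a placeholder)
    $ bpftool map dump id 42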
And there are other tools that do a similar thing to Hubble. With Weave Scope, for example, quite a while back the team at Kinvolk worked together with Weaveworks to author the agent that collects the data about network connections using BPF. So here you can see which pod is talking to which pod, all thanks to BPF. One of the nice things is that, as part of this project, we developed a library called tcptracer-bpf, and that's available, so if you wanted to build something similar, a lot of the base capabilities are there for you to build on.

The BPF Compiler Collection we've mentioned a couple of times because it really is the granddaddy of all these BPF tools. The nice thing about BCC is that it's very Linux-style: these are command-line tools with arguments that do one thing and do it well, and they let you snoop on and control your system. It's really quite easy to get up and running with any of these tools, and I recommend you check them out if you haven't already. bpftrace is kind of the cousin to BCC, maybe less well known and less frequently used because it does have a bit of a learning curve, but it's great for quickly writing custom BPF programs, and it gives you a lot of flexibility for tracing and debugging your Linux apps.

As you can see with BCC and bpftrace, there are a lot of really powerful tools that come off the shelf, but they don't know anything about containers or Kubernetes, so if you run a Kubernetes cluster, there is no easy way to use them there. But if only there were a really easy way for people running Kubernetes clusters to take some of these tools and apply them in Kubernetes... Yes, if only! That's where Kinvolk's Inspektor Gadget comes in. You can think of Inspektor Gadget as a Swiss army knife of BPF tools, or gadgets, for Kubernetes. Some of them come directly from BCC, with a thin wrapper on top to make them Kubernetes-aware, and some others were developed independently, outside of BCC.

So what do we need to make a BPF tool Kubernetes-aware? A few things. First, the granularity we care about: we don't want to trace at the PID or process level, and we don't necessarily want to trace every process on the system; usually we only want to trace the processes from a specific pod. That's not so easy to do, because BPF programs running in the kernel don't know anything about Kubernetes pods or Kubernetes labels, so we need to do a bit of work there. Another thing we want is to aggregate the information by labels: rather than picking individual processes by PID, we aggregate using Kubernetes labels. The last thing we want is a kubectl-like experience: developers should not need to SSH into a specific node, and they should not need to know in advance on which node their pod is running. They should have a kubectl experience, with a command-line interface directly for that. Ideally, that's a kubectl plugin, which is exactly what we built.

It looks a bit like this. At the bottom left, I'm on my laptop. I run kubectl gadget; that's a kubectl plugin, so kubectl executes the kubectl-gadget plugin binary, and it only talks through the Kubernetes API. It doesn't SSH into the nodes; it uses Kubernetes-native concepts like pods and DaemonSets and so on. From there, Inspektor Gadget deploys a pod on each node, as a DaemonSet, and each of those pods is able to execute the different gadgets.
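Getting to that point is only a couple of commands. A sketch, assuming the kubectl-gadget plugin binary is already on your PATH; the exact deploy behavior, namespace and labels differ between Inspektor Gadget versions:

    # render the Inspektor Gadget DaemonSet manifest and apply it to the cluster
    $ kubectl gadget deploy | kubectl apply -f -
    # confirm a gadget pod is running on each node (namespace and label here are assumptions)
    $ kubectl get pods -n kube-system -l k8s-app=gadget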
Each of those gadget pods will then execute, for example, traceloop or the tools from BCC, and those install a BPF program in the kernel. The kernel gathers the information and sends it back up to the CLI, to kubectl gadget.

So what do we have today as gadgets in Inspektor Gadget? The first one here is capabilities. It's a way to get information about the capabilities that are exercised in your pod. The use case is, for example, that you have a pod that needs some privilege; maybe it needs CAP_NET_ADMIN or CAP_SYS_ADMIN, but it's quite difficult to know which, unless you are deep into that level of kernel knowledge. So what people often do is just grant all the capabilities, instead of only what's needed, and that's not so great security-wise. This gadget lets you run the pod and see which capabilities it actually exercises, and then you can use that information to write your PodSecurityPolicies.

Other gadgets in Inspektor Gadget are opensnoop, execsnoop and bindsnoop. They tell you when a file is opened, when a new binary is executed, or when a new TCP port is opened by your pods, and you see this information in real time, as it happens. Then there are tcptop and tcptracer, which show you the network traffic happening inside a pod: tcptop shows which TCP traffic is the most voluminous, and tcptracer prints one new line for each new TCP connection, so you can see what's happening in your pods. profile is a CPU profiler; this one comes directly from BCC as well. It's useful to see why something is slow; sometimes it's not so easy to see which pod or which component is slow. One example I had to debug was a Kubernetes system with way too many iptables rules, due to a bug in old versions of Kubernetes, and that made the network traffic slow, but it was not easy to see why. With this tool I could see which kernel stacks and user-space stacks were called most often, and from there work out what was happening.

traceloop is what I call a flight recorder: it records the system calls made by all the pods and keeps them in memory, in a ring buffer, just in case something crashes; then the user can ask what happened and see the last few syscalls that were exercised. And the last one I'll talk about now is the network policy advisor. The use case: when you need to implement security on your pods, but you come to a project without any previous knowledge of it, it's quite difficult to know which pod is supposed to talk to which other pod. With the network policy advisor, you can run your pods, observe what kind of network activity there is, and it will automatically create network policies that you can review; if they make sense, adopt them. That's a lot faster than creating them from scratch when you don't know the project.

So let's take a look at a couple of those tools in action. The first tool I want to show you today is execsnoop. That's an Inspektor Gadget tool that lets me see what kind of programs are executed inside each pod. So let's get started: I run kubectl gadget and ask Inspektor Gadget to run the execsnoop tool. And I don't want information about all the pods, only a specific one, so I select the namespace default, and then I use a Kubernetes label selector to choose only the pods with the label run=cooking. So let me run that.
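The command being typed here looks roughly like this; a sketch, since the flag spellings vary between Inspektor Gadget releases:

    # trace new processes, but only in pods labelled run=cooking in the default namespace
    $ kubectl gadget execsnoop --namespace default --selector run=cooking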
I have only one node in my cluster, on minikube. And now it starts recording the new programs being executed. There is nothing to display yet, because I don't have any pod with that label yet. So now I will create a new pod: kubectl run, with some options. I select the image fedora, and I give it the name cooking, so it will have the label run=cooking and be selected by my Inspektor Gadget tool. And in there I will execute a shell script, fetched with curl. I will use this anti-pattern, piping a downloaded shell script straight into bash, for the purpose of this demo: I curl the shell script from this website and pipe it directly into bash. So I don't know what this script contains; hopefully Inspektor Gadget will tell me what it is doing. First it takes some time to download the fedora container image, and when it's ready... here it is. I start to see the commands. I see it executes grep, cut, sort, mkdir, rpm, and so on. So I see it downloaded an rpm file here, and then installed it with rpm, and so on. In this way I can use Inspektor Gadget to select a specific pod with labels and then see what it's doing, with execsnoop in this case. But I could also use opensnoop to see which new files are being opened, or bindsnoop to see if it listens on a TCP port, like 80 for example, and see that in real time in my terminal. Okay, hopefully it will finish soon, and then I will show you the next demo. Let me stop that first.

Okay, the next demo is about the network policy advisor. First I will show you that I don't have any pod running in the demo namespace. Then I will deploy something there, but first I will use the network-policy gadget to record all the network connections that happen, and try to generate network policies from that. So let's start with kubectl gadget network-policy. The first step is to use the monitor subcommand to record the network traffic. I select a list of namespaces, in this case only one, and I output the result into the file network-traces.log. What this does is install the BPF filter, and every time there is a new connection, new network traffic in that namespace, it records something in that file. For now there is no traffic, because I don't have any pod.

So let me deploy something: kubectl apply, in this demo namespace, with this Kubernetes manifest file. As you can see, it contains a lot of services and a lot of deployments, but it doesn't contain any network policies so far, so all the pods are free to talk to each other. The only thing I need to do now is to see what traffic is gathered and what network policies are generated. Let's see if my pods are created: kubectl get pod -n demo. Okay, most of them have already started; one of them has some difficulty getting started, but that's good enough. Now let's see what has been generated. I stop the gadget that was recording into the file. Just to show you what it looks like: it's just a piece of JSON listing the different TCP connections that happened there. But you don't actually need to read that file. What I do is use Inspektor Gadget's network-policy report subcommand, which takes the network trace file I just generated and creates the network policies.
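Put together, the flow just narrated looks something like this; a sketch, where subcommand and flag names follow the Inspektor Gadget version used in the demo and may have changed since, and app-manifests.yaml stands in for the demo's manifest file:

    # step 1: record TCP connections in the demo namespace into a trace file
    $ kubectl gadget network-policy monitor --namespaces demo --output ./network-traces.log
    # step 2: in another terminal, deploy the application and let it generate traffic
    $ kubectl apply -n demo -f app-manifests.yaml
    # step 3: turn the recorded traces into suggested NetworkPolicy resources
    $ kubectl gadget network-policy report --input ./network-traces.log > network-policy.yaml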
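The generated file contains standard Kubernetes NetworkPolicy resources. An illustrative entry, with hypothetical pod labels and port rather than the demo's actual output, might look like this:

    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: cartservice-network    # generated name is an assumption
      namespace: demo
    spec:
      podSelector:
        matchLabels:
          app: cartservice
      policyTypes:
      - Egress
      egress:
      - to:
        - podSelector:
            matchLabels:
              app: redis
        ports:
        - port: 6379    # redis's conventional port, hypothetical here
          protocol: TCP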
So now I can look into this network policy file, and it contains the list of network policies that have been generated by Inspektor Gadget. I can see, for example, that the cart service is allowed to talk to redis on this TCP port, and so on. So it created network policies for all the different traffic that happened. What you can do now is check whether that makes sense, and if it does, use it as a base for your own network policies. That's a lot faster than writing them from scratch. Thank you.

So that's it. Thanks, Alban, that was fantastic. To all of you out there in kubeland: I hope you got excited by that and feel inspired to go out and try it on your own clusters. Thanks for coming to our talk today, and we look forward to your questions.