Thank you. I'm sure many people will be interested in Rubberbanks Adventures in the Cloud. For now, we have Christian Fecative from solo.io. Give it up for Christian, please. Sorry, we're having a little problem with a buzzing sound. It's not in the room, it's in your ears. It's being worked on, and I'm sorry for the audio trouble.

Good morning, everyone. This presentation will be about eBPF. I'm not sure how many of you are already familiar with eBPF. If you could raise your hand, please. Oh, nearly everyone. Are there any eBPF experts in the audience, maybe? The lights are shining in, so I cannot see any. The last time I gave this talk was at KubeHuddle in Edinburgh, at the end of last year, and there were lots of eBPF talks there. It was almost an eBPF summit, because there were multiple talks. But here I didn't see that many, so hopefully this will be something interesting for you, especially if you are just starting out with eBPF, or you want to find a new way to experiment with eBPF itself.

First, some introduction. My name is Christian Fecative. I'm a field engineer at solo.io. You can find my contact information on this slide. The company itself is an application networking company, founded not that long ago. We are based in Cambridge, Massachusetts, in the US, but we have multiple locations and we all work remotely around the globe. We mostly do application networking related work, but that's quite an extensive field, so nowadays we cover multiple layers in the application networking stack, starting from the kernel and going up to the traditional application networking layer, the application layer, which is layer 7 in the OSI model. Our products follow an open-core model, but we also have a few completely open source projects as well. For example, Bumblebee, which I will talk about quite soon. We have many happy customers. We are well-funded. We are also hiring.
So if this is something that's interesting for you, please reach out. Come say hi at our booth and enter to win a drone.

First, my background, because this presentation has the title of how I got to know and love eBPF, so it makes sense to include a little bit of background about me first. As you can see, I was mostly a systems and infrastructure person, then SRE and platform engineer, throughout my professional life. I was mostly doing the same thing in all of my previous positions, but you can see the trend in how the name of the position changed over the years. My main focus is, and was, observability most of the time. I designed and operated observability solutions for video streaming clusters: hundreds of VMs in on-premise data centers across the globe, serving millions of users. I was an SRE at a password manager company that's quite famous nowadays; maybe you can guess the name. And I was also operating clusters running Kubernetes and Istio.

eBPF is quite a hot topic nowadays, as you know. It also has a steep learning curve. Fast forward to last April, when I joined solo.io: I got to get familiar with this technology. I was able to make some contributions to our Bumblebee project and to the BCC framework as well, which, as you will see, is basically the first generation of eBPF-based tools out there. I was also able to create a workshop on developing eBPF applications. If you go to solo.io and choose the events page, you can find workshops that we do frequently, and there are some eBPF-specific workshops there as well.

So first, let's see what eBPF is. eBPF is basically a flexible, safe, and fast way to inject custom logic into the kernel. Its origins date back to the days of tcpdump. If you have been around for a long time, you might have already used tcpdump to troubleshoot networking issues; that was originally using BPF, which is the predecessor of eBPF. There are multiple use cases for eBPF.
These can be sorted into four main families: security, tracing and profiling, networking, and observability and monitoring. Security is quite intuitive, because eBPF, as you might already know, is based on kernel events, so you are able to track the actual syscalls happening in your system. Tracing and profiling is also quite important, because you can get a unified overview of your actual process running in user space, and you can get information about this process from the kernel side. There are multiple profiling tools currently that are quite popular, and the more modern ones are based on eBPF as well, for example Parca. For the networking use case, you can do, for example, layer 4 load balancing and other similar things with eBPF. Again, we are at the kernel level, so this can be quite performant, and it is safe to do so. That's another great use case; Cilium comes to mind, if you are familiar with that open source project. And the last family is observability and monitoring, and this is what I will cover in this talk. This is also quite a nice area, because eBPF can solve observability challenges that cannot really be solved otherwise.

This diagram is probably my favorite eBPF diagram, because it's quite simple, but I still think it's the best way to understand what eBPF is and how it works in your system. On the left-hand side, you can see the user space program. On the right, you can see the kernel space program. When you have an eBPF program, you have to take both sides into consideration, because the user space program is the one that the user will interact with. This will display the statistics, for example, if you are getting statistics out of the kernel. And the kernel side is basically where your actual eBPF logic is. So first, the user space program is responsible for generating the BPF bytecode. This is just lifecycle management for your actual BPF code.
But this also has to be done by the user space program. After the BPF bytecode is generated, it is loaded into the kernel. As I mentioned, eBPF is safe: there's a verifier on the kernel side, and you are only able to run custom logic in your kernel if it passes the verifier. After the verifier is passed, you have that BPF rectangle there; that's your actual BPF logic. eBPF is basically event-based, so you can specify multiple events: kprobes, uprobes, tracepoints in your kernel. When these points are reached in the kernel, your custom logic is triggered, which is the BPF rectangle there. The last piece on the kernel side is maps. Maps are a way to exchange data between kernel space and user space. Once you have the data you are interested in in the kernel, you populate your various BPF maps on the kernel side, and from these maps, you will be able to read the data from the user space program. This is basically how you, for example, visualize this data.

OK, so now we know what eBPF is, to some extent. But why is it important? As I mentioned, it's important because it can be a solution to impossible tasks and to scaling issues. Scaling issues I mentioned when talking about the networking use case category: eBPF is quite performant. It's in the kernel, so, for example, you can do really efficient packet processing between your network interfaces. If you are operating at a really large scale regarding networking (for example, you are a cloud provider), it might make sense to look into eBPF. It can also be a solution for other impossible tasks, for example observability-related ones. If there are some SREs here, I guess you have had a hard time catching out-of-memory exceptions in Kubernetes clusters, because there are only a few, error-prone ways to catch those.
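The kernel-side half of this picture can be sketched in a few lines of C. This is only an illustrative sketch: `bpf_map_lookup_elem`, `bpf_map_update_elem`, and the `SEC()` annotation are the real names from `<bpf/bpf_helpers.h>`, but the stub definitions below exist only so the sketch compiles standalone outside a kernel; a real program declares the map with BTF map syntax and attaches to an actual probe.

```c
#include <stddef.h>

/* Stubs so this sketch compiles standalone. In a real eBPF program these
 * come from <bpf/bpf_helpers.h>, and the map is declared with BTF map
 * syntax so the verifier and loader know about it. */
#define SEC(name) /* ELF section annotation, consumed by the loader */
#define MAX_ENTRIES 64

struct hash_map {
    unsigned int  key[MAX_ENTRIES];
    unsigned long val[MAX_ENTRIES];
    int           n;
} counts; /* stand-in for a BPF_MAP_TYPE_HASH map */

static unsigned long *bpf_map_lookup_elem(struct hash_map *m, unsigned int *key)
{
    for (int i = 0; i < m->n; i++)
        if (m->key[i] == *key)
            return &m->val[i];
    return NULL;
}

static void bpf_map_update_elem(struct hash_map *m, unsigned int *key,
                                unsigned long *val)
{
    if (m->n < MAX_ENTRIES) {
        m->key[m->n] = *key;
        m->val[m->n] = *val;
        m->n++;
    }
}

/* The part that mirrors real kernel-side logic: the probe fires on an
 * event, and we bump a per-PID counter in the map. User space later reads
 * the same map to display the statistics. */
SEC("kprobe/do_sys_openat2")
int count_open(unsigned int pid) /* a real probe receives a pt_regs ctx */
{
    unsigned long one = 1;
    unsigned long *cnt = bpf_map_lookup_elem(&counts, &pid);

    if (cnt)
        (*cnt)++;
    else
        bpf_map_update_elem(&counts, &pid, &one);
    return 0;
}
```

The lookup-then-update pattern is the common idiom in real eBPF programs too; the verifier forces you to NULL-check the lookup result before dereferencing it.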
If you want to export those as Prometheus metrics, you might have an even harder time, because there are no real exporters. You either have to mount the Docker socket somewhere and get the data from there, or use cAdvisor, which doesn't really cover all of the out-of-memory exception use cases, so it's quite hard to get right. This is a great example of where eBPF can help. And it's important because there are multiple personas, for example application developers, SREs, DevOps engineers, and network operators, who could benefit from this. In this presentation, I will focus on the SRE and DevOps engineer part, because back in the day, I was also one.

OK, why is eBPF scary? Some can think it's scary because, basically, it's the kernel, and that just doesn't have a nice ring to it. Nowadays, everyone is spinning up new web applications, microservices, and Kubernetes clusters. Everyone is parsing JSON. You might have a Kafka queue in the middle of your stack, but basically, most engineers nowadays are not used to doing low-level systems engineering work. So when they hear that eBPF is something related to the kernel side, it can be scary. eBPF can also be scary because of the lack of documentation. I'm not saying there's no documentation; there is some, and it's getting better and better. But you have to either seek out blog posts by various eBPF authors, or check out the kernel mailing lists, and the user experience there is not that great. Most people, again, are not really used to checking the kernel mailing list when they want to troubleshoot a bug in their JSON-parsing web application. eBPF can also be scary because people don't really know where to start getting familiar with it. There are lots of emerging tools; there's a new eBPF-based tool every other week, so it can be quite daunting to find the right one to get familiar with this technology. And I mentioned it already.
It's the kernel, and I think that this is the thing that can be the scariest of all. Let's take a look at the eBPF landscape. On this slide, you can see the application landscape of eBPF. This logo collection is from the end of last year, when I last gave this presentation. I was thinking about updating it with the new logos, but it was already quite hard to fit these on the slide, so I decided not to. There are a few new tools out there, but as you can see, it's quite an extensive landscape. So again, it can be quite hard to decide where to start getting familiar with this technology.

To make it simpler, let's focus on observability. For observability, we can follow the advice of Brendan Gregg; some of you might have heard of him. He is one of the biggest contributors to the eBPF ecosystem. His quote means that you should not reinvent the wheel all the time. You should not start from scratch and program your way to everything. If there are tools available, let's just use them, because people spent hard hours getting those tools to work properly, and we should really benefit from using them in the first place.

On the slide, you can see a screenshot of BCC. There are multiple layers here, and there are BCC eBPF tools attached to all these layers. For example, if you want to troubleshoot, let's say, system libraries, then killsnoop can be a good tool to get started finding performance issues or troubleshooting problems. There are multiple other programs targeting different areas. This is basically a GitHub repository, so you can go to github.com, find the BCC repository, and you will find all these traditional BCC tools in the tools folder. If you want to run these, you do it like this: you write a single Python program, and that single Python program will contain the user space program I was talking about.
That's written in Python, together with the kernel space program, which is written in C. But how is that possible? How can two languages be in a single program? It's possible because the C program is basically a string. And yes, this is one of the reasons why eBPF can be hard to start with. You can see a screenshot here: this is actually the oomkill troubleshooting tool. You can see the BPF code written in C inside the Python program. This was the traditional way to get started with BPF. Then, during execution, BCC will call Clang and LLVM and perform a header lookup to know where certain points in the kernel are. Clang, LLVM, and the kernel headers have to be present on the target machine where you execute your program. Then you have to compile your code at runtime, and this can be quite problematic. Let's say you are operating VMs in production, and you don't like to underutilize these VMs, so they are consuming, I don't know, 70% of the available resources, CPU or memory. And let's say you want to troubleshoot something, for example, try to catch out-of-memory exceptions. So you SSH into the machine and deploy the script. You have Clang and LLVM there, which are quite huge. You have the kernel headers available, but you still have to compile your program on a virtual machine that is already consuming a considerable amount of resources. If you compile your traditional BCC code on that production machine, the SRE teams will be at your door in a few seconds, because those machines can easily tip over: you were operating at the desired resource utilization, and now you are performing a very resource-heavy operation. So it might not be a good fit for production use cases.

There is a better way to do this, and that is by following the BPF CO-RE methodology. CO-RE means compile once, run everywhere. For this, you have to have the BTF type information available.
This is usually done by having a vmlinux.h header file available on your system. This will make sure that you can run your program everywhere. You will still use Clang as a compiler, plus libbpf, the user space loader and linker library. If you have all these things together, you are on the route to creating portable and better, more modern eBPF programs. The kernel and the user space code, in this case, are both written in C, which is great. So no more Python, no more injecting code as a string. And you can even compile it in advance, so you don't need Clang to be available on the target machine. You can just compile in advance and ship the result to the target machine where you want to execute it. As long as the requirements are met, you can easily run eBPF code in a quite efficient manner on a production system.

But we can do even better than this. It's also possible to focus only on the kernel space program. As you would imagine, the kernel space program is where the really exciting and interesting stuff is. The user space side is basically just plumbing: you have to manage the lifecycle of loading the code and cleaning up, then visualizing the data. That's not something that's very interesting. So focusing on the kernel space program can be much more fun.

So, back to the original question: where to start learning eBPF? The answer can be, for example, the Bumblebee project, which is an open source project by solo.io. You can go to bumblebee.io or find the repository under the solo-io organization on GitHub. It's fully open source, and there are multiple examples there, so it's a great way to get started with eBPF. Bumblebee can help you build, run, and distribute eBPF programs as OCI-compatible images. It can even expose the events as Prometheus-compatible metrics, which is quite nice, because you most probably already have Prometheus, so it's quite trivial to hook it into Prometheus and visualize your data that way.
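As a small illustration of that last step, here is a sketch of the user-space side turning a counter read back from a map into the Prometheus text exposition format. The metric name and the `struct sample` are made up for this example; a real exporter would fetch the values via libbpf's `bpf_map_get_next_key()`/`bpf_map_lookup_elem()` calls on the map's file descriptor.

```c
#include <stdio.h>

/* Hypothetical counter read back from an eBPF hash map (pid -> count).
 * The struct and metric name are invented for illustration. */
struct sample {
    unsigned int  pid;
    unsigned long count;
};

/* Render one counter in the Prometheus text exposition format:
 * metric_name{label="value"} number */
int render_metric(char *out, size_t cap, const struct sample *s)
{
    return snprintf(out, cap,
                    "oom_kills_total{pid=\"%u\"} %lu\n", s->pid, s->count);
}
```

Serving lines like this over HTTP on a `/metrics` endpoint is all Prometheus needs to scrape the data, which is why the integration is so cheap once the kernel side fills the map.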
And if we are talking about the observability use case, it's also quite natural to use the de facto standards. It works by taking your existing eBPF kernel space code and generating the user space code for you, without you even needing to know what that user space code will be. We also wanted to focus on the user experience and the developer experience. With Bumblebee, you have a Docker-like experience, meaning you can use basically very similar commands to the ones you would use with Docker. For example, you can use bee build to build the OCI images with your code, bee run to run them, and bee push to push the images to a registry and pull them from there. That's a really nice way to distribute these programs as well.

You can see a very short video of Bumblebee in action. You can see an example: it's a C file, and that's the kernel space logic. That's all you need. After you have the file, you can do something like bee build and use the local file there. After the image is built, you can push it into a local or remote registry, and after it's pushed, you can just run it. It even has a user interface like this if you don't want to expose the metrics and, just for local development, you only want to see what the output will be.

Bumblebee is also quite nice because, if you go to the BCC repository, you can find the libbpf-tools folder inside it. There you find, most of the time, two files per tool, and these are basically the traditional tools rewritten to use the CO-RE principle, which is quite nice. These are the more modern equivalents of the traditional tools, and you can take examples from this folder and put them into Bumblebee. Currently, you only have to take care of a single thing, and that is having map types compatible with Bumblebee. Currently, only the hash map and the ring buffer are supported. So for the few tools that are using the perf buffer, the traditional map type, you have to port those to the ring buffer first.
But after that, you can basically just put it into Bumblebee, and your code will work, and you have Prometheus metrics with the exposed data. On this slide, you can see a few differences between the perf buffer and the ring buffer. One of them uses a buffer per CPU; the other uses a single shared buffer. There are performance differences, the event ordering is much better, and the developer experience is also much better: there's a reserve/submit API to make sure that you can write better code. The requirement is to have a fairly new Linux kernel version, but I think most of the cloud providers are already shipping kernels newer than this, so these are not really hard requirements.

I would have a demo, but I will skip it. Please find me if you're interested in a live demo; I can do it at our booth or somewhere around here. I will just skip ahead to the finishing slides. You can see the roadmap for Bumblebee here. We are trying to keep up to date with libbpf. There's active development going on with the libbpf tools. There's, for example, a new compat layer that can abstract away the actual map type that you use. So, for example, if you are running your code on an older Linux where only the perf buffer, the older map type, is supported, it will fall back to that, and if a newer one, the ring buffer, is available, it will use that. These are quite nice improvements, and we are aiming to keep Bumblebee updated to be able to use these changes. We are also working on tighter Kubernetes integration, for example, to make correlation with pod names easier. And there's operator work already in progress that can basically enable you to have Bumblebee deployed into Kubernetes clusters as a DaemonSet, and that DaemonSet can load the OCI images into itself and expose the data as it goes along. Currently, it's already possible to use Bumblebee with Kubernetes clusters; for that, you just have to package it as a DaemonSet.
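The reserve/submit pattern mentioned above can be sketched like this. Again, this is a standalone illustration: `bpf_ringbuf_reserve()` and `bpf_ringbuf_submit()` are the real helper names from `<bpf/bpf_helpers.h>`, but the stubs below just simulate a fixed-capacity buffer so the code compiles and runs outside the kernel; a real program operates on a BPF_MAP_TYPE_RINGBUF map.

```c
#include <stddef.h>
#include <string.h>

#define RB_CAP 4 /* tiny capacity so the drop path is easy to exercise */

struct event {
    int  pid;
    char comm[16];
};

/* Stubs simulating the shared ring buffer. */
static struct event rb[RB_CAP];
static int rb_used;

static void *bpf_ringbuf_reserve(void *ringbuf, size_t size, unsigned long flags)
{
    (void)ringbuf; (void)size; (void)flags;
    return rb_used < RB_CAP ? &rb[rb_used] : NULL; /* NULL when full */
}

static void bpf_ringbuf_submit(void *data, unsigned long flags)
{
    (void)data; (void)flags;
    rb_used++; /* commit the slot reserved above */
}

/* Mirrors the kernel-side pattern: reserve space in the shared buffer,
 * fill the event in place, then submit it; no intermediate copy, which
 * is one advantage over the older per-CPU perf buffer API. */
int emit_event(void *ringbuf, int pid, const char *comm)
{
    struct event *e = bpf_ringbuf_reserve(ringbuf, sizeof(*e), 0);
    if (!e)
        return 0; /* buffer full: drop the event rather than block */
    e->pid = pid;
    strncpy(e->comm, comm, sizeof(e->comm) - 1);
    bpf_ringbuf_submit(e, 0);
    return 1;
}
```

Because there is one shared buffer instead of one per CPU, events come out in submission order, which is where the better event ordering mentioned on the slide comes from.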
The original CLI tool can be packaged as a DaemonSet, and it works quite well. If you are interested in a demo, please find us at our booth and I can show it to you. We are also planning to add new map types, and we are planning logging integration, so you can, for example, log all the events that you want to expose and inject these logs into your logging platform. We currently don't support the histogram metric type for the events; we only support gauges and counters. But histograms are something we are also working on. And we are continuously adding new libbpf examples to the Bumblebee repository, and you are all welcome to contribute to those examples.

My key takeaways from learning eBPF with Bumblebee can be seen here. Basically, now I know that eBPF is really a game changer. Focusing only on the kernel space is fun; it's much better than doing all the plumbing with the user space code. Integration is key, because, for example, for observability, it's really nice to have integration with Prometheus right away. And we are just getting started, and we would like all of you to benefit from the eBPF ecosystem. So feel free to take Bumblebee for a ride, and please reach out if you have further questions. You can join our Slack, and come to our booth to enter to win a drone. Thank you very much.

Beautiful. Thank you very much, Christian. Before you get off, I actually have a question that is not eBPF. Do you know when Ambient Mesh will be made production ready? Excellent question. Ambient just got merged upstream last week, I think, or maybe this week. No, I think it was last week; I have to double-check. But I think the next Istio version, Istio 1.18, will include this profile. So if you go to the Istio repository, you can already find the ambient-specific part of the readme, and you can download the artifact there if you want to try it out.
You can also check out the Solo Academy website, where we have a virtual environment running. It's a free course, and you can take Ambient for a ride there as well. But I think the next Istio release will already include Ambient as a profile, so that's also quite a nice way to experiment with it. Thank you very much. Do we have any questions for Christian?

My question is about what we can monitor in the observability space with eBPF. Can we only see L3/L4-level metrics, or can we also see L7-level metrics? And if it is L3/L4 only, then will it be my primary observability, or will it be my fallback level of observability: if nothing else can capture it, then eBPF captures it? I don't know if that makes sense.

Yeah, that absolutely makes sense. eBPF can be used for a lot of things, and it can even parse user space logic as well. So it's definitely possible to generate layer 7 metrics. You just have to think about the most efficient way to do things, and you also have to take maintainability into consideration. If there are tools available that might not use eBPF but are solving your problem, it might make sense to use those for that purpose. But take monitoring as an example: in the Prometheus ecosystem, node_exporter is the de facto standard to monitor hosts, and it can solve a lot of the problems that you might have. But out-of-memory exceptions, for example, are a problem that it cannot solve, so it might make sense to use eBPF for that one. Or let's say you cannot use node_exporter for some reason, but you only need a subset of the features that node_exporter provides. Then it's also possible to write some eBPF code covering the features you are interested in, so you can have a more lightweight and efficient node_exporter based on eBPF, if that's what you wish. Any more questions? Thank you for your presentation. I have a question.
So you presented a lot of cases where you could use eBPF, but what would be a use case where you would not recommend using eBPF? Oh, excellent question. Let me think. For example, there might be cases where security is taken quite seriously, and running eBPF might require advanced privileges. There are improvements happening in this area as well: there's now a capability, CAP_BPF, that is aiming to provide the least privileges necessary to run eBPF code. So we are getting there, but there may still be some security concerns that might raise eyebrows if you want to run eBPF in those cases. Another example might be... I'm not sure. The previous question about layer 7 metrics: that's basically solved with layer 7 components, for example Envoy proxies. If you're running a service mesh, you get observability out of the box to cover all these metrics. This can be solved with eBPF to some extent, but Envoy and the Istio sidecars, for example, are a great and well-adopted fit for this use case as well. Mainly, with eBPF, I would focus on the really hard observability challenges, because I think this is the area where eBPF can shine. And if you have fine integration with the observability ecosystem, then again, it's a great fit to solve these challenges. Running little eBPF programs and exporting the results as Prometheus metrics is quite lightweight and efficient. The code for these logics is basically just, I don't know, 10 or 20 lines of C code, so it's also not too hard to maintain. So I would say that's the best use case for eBPF, in my opinion. Thank you. Any more? We have one more question. Well, I'll take it then. I have one question. Oh, we see one more.

So, I have had the chance to play around with eBPF and try out maybe 20 or 21 of the 25 most common syscalls. And I saw that Bumblebee makes things a little bit easier, preparing the framework for the type of program that I have to run.
But still, I think it's a very steep learning curve and a very hard adoption for consumers of the tech. Does Solo plan to make it even easier than with Bumblebee? Yes. For example, our product is using eBPF to some extent under the hood, to accelerate the service mesh, and we are also generating metrics based on eBPF. We are using some parts of the open source Bumblebee code in our product to do that. But the open source Bumblebee exists for this exact reason: to make it all easier. As you mentioned, it's still not the best user and developer experience; the curve is still quite steep. But on the roadmap, you can see that we want to have, for example, better and tighter Kubernetes integration, and that's one area where we still have some work to do. The operator is a great step in that direction, but being able to correlate to actual pod names can also make it easier, because without this, you still have to do some user space plumbing on your own to get the actual data. So the answer is: yes, we are still working on improving Bumblebee. And it's open source, so if you have any ideas or feature requests, feel free to file a GitHub issue or even contribute to the project. It's still quite a small project, and I think it's one of the easiest ways to get started with eBPF. So we are happy to onboard you to the open source project if you are interested.

Thank you very much, Christian. I am sure that everyone was very interested. And no one dared to say that they were eBPF experts. Thanks, Christian. Please give it up for Christian. We are going to have a little break, 21 minutes, and then we start again.