Hi everyone. Thank you for coming. I'm Quentin, and I've been working on BPF for about five years now. I started at 6WIND, and I worked at Netronome on BPF hardware offload. The objective of the presentation today is to give you an update on the latest BPF features: what's getting into the BPF architecture in the kernel, what can be used to create more efficient programs, and this kind of thing. I'll start with BPF basics, the core features of BPF that are evolving, and then present the new features we are getting with the latest patch sets. Depending on time, I'll also say a word about the BPF universe, meaning BPF tooling and projects based on BPF itself.

Just before we start, a few reminders about how eBPF works. You get programs compiled from C, most often with Clang and LLVM, and then injected from user space into the kernel, where you can attach them to one of the existing hooks: that can be TC, XDP or sockets for networking, kprobes for tracing, and so on. Before attaching them, you have to make sure that those programs won't crash your kernel, so you have a verifier that makes sure the program is safe and terminates. You can also JIT-compile it for more efficient execution.

As for the characteristics of such programs: they use 64-bit long instructions and 11 registers, including one for the stack, which is 512 bytes. You can have about 4,000 instructions in a program, and you don't have loops. Well, that was true at least at the beginning; we have had a few changes recently. The stack is still 512 bytes, but there is a mechanism that allows you to use more indirectly. I'll come back to that later. Excuse me. We don't have the same limitation as before in terms of the number of instructions in a program. You used to have only about 4,000 instructions; now you have up to one million. That doesn't necessarily mean you can have a program that is one million instructions long.
It's the number of instructions that will be validated by the verifier. The verifier emulates all the execution paths of the program, and it can check up to one million instructions. That went up from 130,000 or so a few months ago, or a few years ago.

We have bounded loops too now, starting from kernel 5.3. They are bounded in the sense that you must still ensure that your loop will terminate. Basically, the verifier needs to make sure that you won't be doing weird stuff that could introduce infinite loops in the kernel, but that covers most of the common use cases that need loops, so it's quite nice to have.

In terms of performance, I won't go too much into the details, but a number of performance improvements are happening in the kernel for BPF. LLVM can favor 32-bit sub-registers in the programs, and we get better performance and smaller code size for some architectures, mostly 32-bit architectures. That was especially relevant when I was at Netronome, because we were trying to make the programs as small as possible to make them fit on the cards, for example.

One of the latest additions to BPF is a new set of subcommands that help you work with maps, especially when you want to do a lot of operations on them, like looking up or updating a high number of values in a map. Before that, you would iterate over each entry in the maps, hash maps or array maps, with one bpf() system call per entry, and you risked hitting a deleted entry: the entry had been deleted before you reached it in the list. Now you have those batch operations that help you do that faster, more efficiently, and without this risk.

AF_XDP gets some improvements too, but there's a presentation on page pool later, I think, so I won't cover AF_XDP here. So I'll jump to what's actually new in terms of things you can do with BPF, but that's still pretty low level.
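Looking back at the bounded loops mentioned above, here is a minimal sketch in plain user-space C of the general pattern: a loop with a compile-time constant bound plus an explicit length check. It is not a loadable BPF program, and `MAX_OPTIONS` and `count_matches` are hypothetical names chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical upper bound, chosen for illustration only. */
#define MAX_OPTIONS 8

/*
 * Count how many of the first bytes of buf equal `wanted`, looking at
 * no more than MAX_OPTIONS bytes and never past `len`.
 * The loop bound is a compile-time constant, so the number of
 * iterations is known in advance and the loop always terminates.
 */
int count_matches(const uint8_t *buf, size_t len, uint8_t wanted)
{
    int matches = 0;

    assert(buf != NULL || len == 0);

    for (int i = 0; i < MAX_OPTIONS; i++) {
        if ((size_t)i >= len)   /* never read past the data we were given */
            break;
        if (buf[i] == wanted)
            matches++;
    }
    return matches;
}
```

The point of the sketch is the shape of the loop: the exit condition depends on a constant and on a value that is checked on every iteration, rather than on something that could keep the loop running forever.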
I'm not talking too much about new use cases, because we're still focusing on networking, so that's still the classics, anti-DDoS and load balancing. I'm talking just about what new things you can use in your programs themselves.

So we have BTF, which is the BPF Type Format. It's a data format close to DWARF, which is used for debugging programs on Linux. BTF provides information for BPF programs and maps too. One simple example is here. We have a dump of a program that's running inside the kernel, and we can see that we still have the C instructions that were used to compile this program. The C instructions were encoded into BTF and sent along with the BPF bytecode to the kernel, so we can keep track of them.

BTF is not really one of the latest features, in the sense that it's been around since 4.18, but it's receiving a lot of changes and improvements, and it's used by more and more features too, so that's why I'm mentioning it. It's generated with pahole or LLVM. BTF objects are verified in the kernel for consistency, so you cannot just introduce any BTF object you like. It has to match the programs and maps you're using.

We can also produce a BTF blob for all the symbols in the kernel. We do need a specific configuration option for doing that, but after that you have BTF data available in the sysfs file system, which gives you access to all the symbols that can be used by BPF probes. For example, when you're using BPF on tracepoints or kprobes, it allows you to access data structures from the kernel not with an offset from the beginning of the struct, as we had to do before, but directly with the name of the field in the struct. That's especially important for compiling a tracing program just once and being able to run it on a variety of kernels that might have changes in that structure depending on the configuration options or kernel versions.
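To make the offset problem described above concrete, here is a small user-space C illustration, not BPF code. The two structures are hypothetical layouts of the "same" type as built from two kernel configurations; all names (`task_v1`, `task_v2`, `read_int_at`, and so on) are assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Two hypothetical layouts of the "same" structure. These are
 * illustrative only, not real kernel structures. */
struct task_v1 {
    int pid;
    int prio;
};

struct task_v2 {
    int pid;
    long extra_accounting;  /* field added by some configuration option */
    int prio;
};

/* Old style: read an int at a byte offset from the start of the struct. */
int read_int_at(const void *base, size_t offset)
{
    int value;

    assert(base != NULL);
    memcpy(&value, (const char *)base + offset, sizeof(value));
    return value;
}

/* Offset of the field named `prio`, resolved from the type definition
 * of each layout rather than hard-coded by the programmer. */
size_t prio_offset_v1(void) { return offsetof(struct task_v1, prio); }
size_t prio_offset_v2(void) { return offsetof(struct task_v2, prio); }
```

A program that hard-codes the offset valid for `task_v1` would read the wrong bytes from a `task_v2`. Resolving the offset from the field name and the type information of the running kernel is the role BTF plays for BPF programs.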
Compiling once and running on many kernels is what we call "compile once, run everywhere" for BPF, but really BTF is being used for a lot of things.

We also have global data in BPF now, which means that we can use global variables in the BPF source in C, and it translates into data being stored in specific sections of the ELF file. That's useful for making BPF templates, in a way: you can have your object file that you compile from C with this global data in read-only sections, and then just update the read-only sections instead of trying to find the relevant information in the code section. So you can adapt your program to a variety of use cases or configuration changes. This global data relies on maps, somehow, to interact with user space too, so you have the possibility to map them from user space, to read them, and to see what the program in the kernel is using.

Something close to global data is another kind of variable: external variables. You can have an extern something in a C program that you compile into BPF. It's actually limited to a very small number of variables, which are LINUX_KERNEL_VERSION and the CONFIG_ something options that you use to configure your kernel. Support for those external variables is one thing that relies on BTF, for example. Using that lets you adapt your program with clauses like "if I'm using a Linux kernel version higher than 4.0, then do this".

We have BPF trampolines, which convert the native calling convention, so the host calling convention, into the BPF calling convention. It's a way to attach programs more efficiently to the entry and exit of functions. It's useful for networking programs too, because now you can attach programs at the entry and exit of XDP programs and see all the packets coming into your BPF program and the packets going out, so you can see the changes that occur, for example. So that would be a good thing for debugging.
BPF trampolines can also be used in what is called the BPF dispatcher, a mechanism that reuses those trampolines to avoid the cost of retpolines, introduced following the Meltdown and Spectre attacks, when calling XDP programs. So we also get performance improvements through that.

Another thing is global functions and dynamic linking, which appeared just in the last few weeks or days. We have global functions supported by BPF now, which means the functions you're using in your main program don't have to be static anymore. They can be loaded as separate programs. The functions are loaded inside your program and act as placeholders. So at runtime, you can jump from your BPF program into another BPF program, of type BPF_PROG_TYPE_EXT, and come back, just as you would with a regular function call. But these are different BPF programs, so you're starting to get something really modular, and you could imagine building a BPF library that is injected as a set of small programs, and calling them from a main program. That allows for dynamic policies: I want to change the processing of that packet depending on its metadata. That can help with code reuse: I don't want to repeat the same snippet in all my programs, I can just call it as an extension. And since I have less duplicated code, I get a shorter verification time, because I just need to inject those extensions.

Another mechanism that appeared recently is the possibility to override struct ops in the kernel. It is quite restricted at the moment, because there is some wrapping to do in the kernel for the struct ops, those structures that hold the operations to perform for some specific algorithm. The only one handled for now is the TCP congestion ops. Martin KaFai Lau, I think, used that to reimplement custom TCP congestion control just with BPF programs, by overriding the operations that are done by default in the kernel. So that's a possibility to introduce new use cases too.
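The idea of overriding a structure of operations can be sketched in user-space C as a table of function pointers that the caller goes through. This is only an analogy: `cc_ops`, its two callbacks, and the toy window arithmetic are hypothetical and do not reproduce the kernel's TCP congestion control structure or any real algorithm.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operations table, loosely inspired by the idea of a
 * congestion control ops structure. Not the kernel's definition. */
struct cc_ops {
    const char *name;
    unsigned int (*on_ack)(unsigned int cwnd);
    unsigned int (*on_loss)(unsigned int cwnd);
};

/* "Built-in" behaviour: grow by one on ACK, halve on loss. */
unsigned int default_on_ack(unsigned int cwnd)  { return cwnd + 1; }
unsigned int default_on_loss(unsigned int cwnd) { return cwnd > 1 ? cwnd / 2 : 1; }

/* Custom behaviour supplied from outside: grow by two, shrink by four. */
unsigned int custom_on_ack(unsigned int cwnd)  { return cwnd + 2; }
unsigned int custom_on_loss(unsigned int cwnd) { return cwnd > 4 ? cwnd - 4 : 1; }

const struct cc_ops default_ops = { "default", default_on_ack, default_on_loss };
const struct cc_ops custom_ops  = { "custom",  custom_on_ack,  custom_on_loss };

/* The caller only goes through the table, so swapping the table swaps
 * the algorithm. 'A' is an ACK event, 'L' is a loss event; any other
 * character is ignored. */
unsigned int run_events(const struct cc_ops *ops, unsigned int cwnd,
                        const char *events)
{
    assert(ops != NULL && events != NULL);

    for (size_t i = 0; events[i] != '\0'; i++) {
        if (events[i] == 'A')
            cwnd = ops->on_ack(cwnd);
        else if (events[i] == 'L')
            cwnd = ops->on_loss(cwnd);
    }
    return cwnd;
}
```

In this analogy, the code calling `run_events` never changes; only the table it is handed does. With BPF, the replacement callbacks are BPF programs instead of functions compiled into the same binary.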
There is more to come, very likely, because the community is very active. We have improvements on XDP: multi-buffer XDP is being discussed, and egress XDP is being discussed. Static linking is being discussed too: that would be the merging of several object files into just one BPF object, so you could really have libraries of BPF programs written in C and just compile them together to get your BPF programs. We'd like to have step-by-step debugging one day, to be able to better debug BPF programs. Some other use cases too: there is a Linux security module based on BPF, which is being discussed at the moment and not merged yet.

I wanted to give a brief update on the tools and projects, but I don't have much time, so I'll just leave this slide up, and if you have some questions, maybe I'll take them now. Thank you.

Yes. So, do I think the possibility to mmap some values is going to be the end of the regular way of using the bpf() system call to communicate through maps between user space and the kernel? No, I don't think so, because it's restricted to some specific use cases. I think it will be global data for now. And you have a lot of different map types and a lot of different things you can update in them, and that would not necessarily be suitable for mmapping. So the more we can mmap, I suppose, the better it gets in terms of performance, but we're not there yet in terms of replacement.

Time's up, so if you have any other questions, please come and let me know. Thank you.