We're going to talk about eBPF today. This is quite a condensed version of a series of blog posts I wrote on the Collabora blog, and I will provide links to those posts, because this is a lot of material for a short amount of time. If you want a more in-depth understanding of the topic, you can read them. I apologize in advance for how condensed this is.

A little bit about me. I work at an open-source-oriented consultancy named Collabora. We work with customers who use open source and do it properly, and we also help customers get it right so they don't paint themselves into a corner with open source technology, for example ending up in a situation where they can't upgrade the kernel. As a hobbyist, I also really enjoy working with embedded systems and taking them apart. I'm a huge fan of projects like OpenEmbedded, Yocto and OpenWrt, and I'm always looking for new tech to improve embedded systems. That includes eBPF, and it is how I arrived at eBPF. I don't consider myself an expert in the subject; I'm a user and I'm constantly learning. And I really like open systems.

As you are well aware, we have a lot of smart devices around us. These devices are getting more and more powerful, and they are all connected. Usually the hardware is much more capable than the default software that ships with it. Software complexity is also rising on embedded devices. I'm just finishing a project where most of the logic on the embedded device was written in JavaScript: it runs a browser, and that is basically the embedded device.
In this new IoT world of smart devices we also have very important privacy and security concerns, so it's important to know what is actually happening in the systems around us. This is where technologies like eBPF help, because we can observe the system and gain insight into what is happening inside it.

This is the problem we have with embedded devices. In the past, embedded devices were things like 8-bit microcontrollers driving an LCD display. Nowadays they are much more powerful: they have an MMU and run a full operating system, a mainline Linux distribution in a recent version. And yet nobody has ever complained that developing on embedded Linux is too easy. Why is this? Obviously the increased complexity in hardware and software, but also the resource constraints around embedded devices.

I've made a list of those resource constraints here. The list is obviously non-exhaustive; these are all pet peeves of mine, and you can add your own. I have nothing against BusyBox, by the way. It's a really awesome and very useful project, but I have bumped my head against it many times. So these are the types of resource constraints embedded systems deal with.

Embedded engineers have a lot of solutions for working around those constraints, from network file systems to remote GDB sessions, so the embedded developer's toolset is quite big. This is where we can add eBPF as just another technology that helps embedded developers. eBPF does not intend to replace any of the other methods used to develop and debug embedded systems. It supplements them; it's another tool to add to our toolchain. This may sound weird: the technology already exists, and we're trying to apply it to embedded systems.
It sounds like a solution in search of a problem, and it kind of is. In my experience, we embedded engineers look at what the non-embedded engineers have and want those capabilities. We want that much memory. We want to be able to build for or operate these systems on our own machines, to avoid cross-compiling and things like that. It's the same with eBPF. It's a technology that was developed for, and is mostly used on, servers, to introspect big-iron machines, and we want to use it on embedded. Because we're using the same kernel, we can reuse it, even though it's mostly used in another industry. There is a precedent for this: symmetric multiprocessing. Support for it was added to the kernel very early, and not for IoT or embedded devices, but now really small devices reuse that code. We can do the same with eBPF: because it's there in the kernel, we can make use of it. How we make use of it is the subject of this talk.

I'll go into a brief explanation of what eBPF is and how it works. Basically we have three parts: the kernel part of eBPF, the user space part, and the bytecode that runs on a VM between them. I'll put some better learning resources at the end of the presentation, because I can't really explain this in depth, but the basics are as follows. We have a virtual machine that runs in the Linux kernel. Bytecode for that virtual machine gets loaded from user space via a syscall. That bytecode is verified for safety, and by safety I mean technical safety: it's verified so that the bytecode can't crash your kernel and can't cause hangs. So the bytecode is very restricted. It is limited to a specific maximum number of instructions, and it can't contain unbounded loops, because those can cause hangs.
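To make the two restrictions just mentioned concrete (a cap on program size and no loops that might never end), here is a minimal sketch of a toy checker. It is not the kernel verifier: the instruction tuples, the `toy_verify` name and the rules are hypothetical simplifications, and the 4096 limit is only the classic program-size figure used here for illustration.

```python
from typing import List, Tuple

# Hypothetical toy instruction: (opcode, relative_jump_offset).
# This is NOT the real eBPF instruction format.
Insn = Tuple[str, int]

MAX_INSNS = 4096  # illustrative cap; the real limit depends on the kernel version


def toy_verify(program: List[Insn], max_insns: int = MAX_INSNS) -> Tuple[bool, str]:
    """Reject programs that are too large or that jump backwards.

    With only forward jumps, every instruction runs at most once,
    so the program is guaranteed to terminate.
    """
    if not program:
        return False, "empty program"
    if len(program) > max_insns:
        return False, "program too large"
    for pc, (op, off) in enumerate(program):
        if op != "jmp":
            continue
        if off < 0:
            return False, f"backward jump at instruction {pc}"
        if pc + 1 + off >= len(program):
            return False, f"jump out of bounds at instruction {pc}"
    if program[-1][0] != "exit":
        return False, "program does not end with exit"
    return True, "ok"


if __name__ == "__main__":
    straight_line = [("mov", 0), ("jmp", 1), ("mov", 0), ("exit", 0)]
    looping = [("mov", 0), ("jmp", -2), ("exit", 0)]
    print(toy_verify(straight_line))
    print(toy_verify(looping))
```

The real verifier does far more (it tracks register types and memory accesses, for example), but the principle is the same: the program is rejected at load time, before it ever runs.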
Once the bytecode passes the verifier in the kernel, it's compiled to native code, and that native code is executed in various execution paths. It can be inserted into both kernel execution paths and user space process execution paths. It's a form of function hooking, so the processing is event-driven: you write a piece of eBPF code that gets attached to a particular function call or a particular instruction. You can use tracepoints or kprobes to attach this logic, and when the normal workload runs, your compiled eBPF code executes as well.

The purpose of that code is to collect data and share it with user space from the kernel or the application. Because it is a very small piece of natively compiled code, it doesn't really affect the performance of the process you're inspecting. This is very important and one of the benefits of eBPF: you can analyze normal production workloads, because usually you're just adding a few instructions at a very specific point. The data is then shared with user space. You can, for example, print text to a ring buffer, like the trace ring buffer, and just read it from there, or you can use maps, which are special data structures for sharing data.

Here's the diagram. You have a user process that calls BPF load. The code passes through the bytecode verifier, and if validation is unsuccessful, the syscall just returns an error. Otherwise the code is compiled and, in this case, attached to the sys_open handler for the system call. Then, when other processes in the system try to open a file, the user process that attached the eBPF code can read what they are doing: what is this process opening, what is it writing? We'll see some code that does this in a short while.

And this is what the bytecode looks like. If you look at it, it's 64-bit.
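As a rough illustration of that 64-bit format: each basic eBPF instruction is 8 bytes, made of an 8-bit opcode, two 4-bit register numbers, a 16-bit offset and a 32-bit immediate. The sketch below packs two instructions by hand, assuming a little-endian host. The `encode_insn` helper is a name made up for this example; the opcode constants follow the kernel's UAPI header.

```python
import struct

# Opcode building blocks (values as in the kernel's BPF UAPI headers).
BPF_ALU64 = 0x07  # 64-bit arithmetic instruction class
BPF_JMP = 0x05    # jump instruction class
BPF_MOV = 0xB0    # move operation
BPF_EXIT = 0x90   # return from the program
BPF_K = 0x00      # operand is the 32-bit immediate


def encode_insn(code: int, dst: int = 0, src: int = 0, off: int = 0, imm: int = 0) -> bytes:
    """Pack one instruction: opcode, dst/src register nibbles, offset, immediate.

    Hypothetical helper; assumes little-endian, where dst is the low nibble.
    """
    regs = ((src & 0xF) << 4) | (dst & 0xF)
    return struct.pack("<BBhi", code, regs, off, imm)


# "r0 = 0; exit" is about the smallest valid program: return 0.
program = (
    encode_insn(BPF_ALU64 | BPF_MOV | BPF_K, dst=0, imm=0)
    + encode_insn(BPF_JMP | BPF_EXIT)
)

if __name__ == "__main__":
    for i in range(0, len(program), 8):
        print(program[i:i + 8].hex(" "))
```

Writing programs this way is exactly the "byte by byte" approach that is as tedious as hand-written assembly, which is why a compiler backend is normally used.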
The virtual machine inside the kernel is always 64-bit. It's generic, so it's 64-bit even on 32-bit systems, for example.

How does user space get this bytecode? Sometimes people write it byte by byte. There are projects that do this, and it's very hard, very much like writing directly in assembly. To avoid that, a backend was added to Clang that can produce eBPF bytecode by compiling a restricted C language. It's restricted due to the nature of eBPF: even though you can write a while loop in restricted C, it gets unrolled in the bytecode, so the VM is not actually running a loop.

eBPF is evolving very fast in the kernel. In recent kernel versions the maximum number of instructions was increased significantly, to around a million, because the verifier was made very efficient. The verifier simulates the code you pass to it, and because it's now very efficient, they also managed to add support for bounded loops. If your loops are bounded, they can be compiled and executed. So the eBPF VM is constantly evolving and new features keep being added, but for safety reasons it will never be Turing complete, so that you can't crash the kernel.

The problem with using just Clang to compile the bytecode is that you only solve the problem of creating the bytecode. What about inserting it into the kernel, fetching the data, and making the system calls to read data from maps? That's really hard to program. This is why the BPF Compiler Collection (BCC) project exists: it's a framework for writing user space programs that interact with BPF.
BCC abstracts the Clang calls: it compiles the restricted C you pass to it on the fly, so you don't have to call Clang manually. It has bindings for multiple languages for the user space side, so you can, for example, write the user space interaction part in Python and use restricted C for the kernel part. You have these two parts in the same program, and that is a BCC tool. The BCC project comes with a lot of preexisting tools that are production ready, very well tested and widely used, and that is the biggest advantage of this project.

This is an example of the diagram we saw before. We have the restricted C as a string in Python, and it gets compiled without us interacting with the compiler at all. We just create a BPF object, pass it the restricted C string, and tell BCC to attach this code to the do_sys_open kprobe. In the restricted C part we have just one function with one parameter, a context parameter that contains the register values of the do_sys_open function call. Basically, the register values are the pointers to the parameters of the open syscall. What we do here is read the file name that gets passed to the open syscall and print it to the trace ring buffer, and that is it, so it's pretty simple. Then we do a while-true sleep loop just to keep the program from exiting, because eBPF programs loaded into the kernel get unloaded automatically when the process that loaded them terminates. Until you kill this process, the trace ring buffer will show everybody who tries to open files and which files they open, which is pretty neat. That is basically all, and it's quite simple to write.

So, as we explained, we have two parts: the part in the kernel that collects data and sends it to user space, and the part that does the Clang compilation, which BCC calls for us.
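The program described above might look roughly like the following sketch. It is a reconstruction from the description, not the code from the slide: the `trace_open` handler name, the 64-byte buffer and the `--attach` flag are choices made for this example. It assumes BCC's Python bindings are installed, root privileges, and a kernel where `do_sys_open` can be probed (on some newer kernels the symbol to probe may differ).

```python
import sys
import time

KPROBE_EVENT = "do_sys_open"   # kernel function to hook, as named in the talk
HANDLER_NAME = "trace_open"    # hypothetical name for the restricted C handler

# Restricted C, compiled by BCC when the BPF object is created.
BPF_PROGRAM = r"""
#include <uapi/linux/ptrace.h>

int trace_open(struct pt_regs *ctx)
{
    char fname[64] = {};

    /* The second argument of do_sys_open() is the file name pointer. */
    bpf_probe_read(&fname, sizeof(fname), (void *)PT_REGS_PARM2(ctx));
    bpf_trace_printk("open: %s\n", fname);
    return 0;
}
"""


def main() -> None:
    # Imported here so the file can be inspected on machines without BCC.
    from bcc import BPF

    b = BPF(text=BPF_PROGRAM)
    b.attach_kprobe(event=KPROBE_EVENT, fn_name=HANDLER_NAME)
    print("Attached. Read /sys/kernel/debug/tracing/trace_pipe to see the output.")
    try:
        # The kernel unloads the program when this process exits,
        # so just stay alive until interrupted.
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        pass


# The flag is only a guard for this sketch, so that nothing privileged
# happens unless explicitly requested.
if __name__ == "__main__" and "--attach" in sys.argv:
    main()
```

Run as root with `--attach`, then read the trace pipe from another shell to watch file names scroll by as processes open files.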
Now, the main benefit of using this technology is the BCC tools package. This is a huge collection of tools that you can use right away. This diagram is from an upcoming book on eBPF; there are quite a lot of tools, and they're very useful. For example, there's a tool called execsnoop: when you run a program with it, it prints all the execs that process does. If you run execsnoop on Emacs, it will tell you, look, Emacs just forked and exec'd GPG. You can inspect all the children very easily with one simple command: show me all the processes that this one forks and execs. That's really useful. You have a lot of tools for introspection like that, and they're readily available.

Here is a summary of the benefits. You can't crash the kernel. It doesn't cause performance degradation. If you use GDB, for example, that stops the process, and it's very hard to analyze bugs when you're stopping the process to inspect it, especially bugs that only reproduce in production. It's much easier with something like eBPF, which inspects the running workload. Basically, it's a tracer, not a debugger. You also don't need special debug builds for this; you can leave it always enabled. If it's enabled and you don't load any eBPF code into the kernel, it's as if it's not there, so you have no overhead, and it's already enabled in the majority of distributions. It's fully upstream, so there is no downstream code that you need to compile and maintain. It has quite an active community.

And you can do more than just observe the system. By default, eBPF programs do not have side effects; this is for safety, so they can't compromise the system. But in certain subsystems, like networking, you can implement firewall rules using eBPF, because you can attach eBPF programs to every packet receive, for example.
They can inspect the packets in that restricted C and say, drop this packet, accept that packet, so you can implement firewalls with it. So this is pretty convincing.

Now, when you try to use eBPF on embedded devices, there are some problems. Some of them are general eBPF problems and not necessarily embedded-specific, but some are very specific to embedded. There are multiple approaches and multiple projects that solve these problems with different trade-offs, and there's no silver bullet, no single project you have to use if you want eBPF on embedded. You can mix and match the projects.

One of the problems, which I also talk about in the Collabora blog post series, is that because the VM is 64-bit, you have trouble reading data from a 32-bit system. In the VM, all registers and pointers are 64-bit by default. But when you get the context data structure from the kernel or from the user application, all the memory you get access to through the context and the registers is 32-bit, with structure offsets aligned for 32-bit, and it's very hard to read that data. You have to do some pointer arithmetic. Even though the eBPF VM has 64-bit registers, it can do 32-bit subregister addressing, so if you fill only half of the register with data, you can dereference that as a 32-bit pointer. Doing it like this is very fragile and not portable. I have examples in the fourth part of the blog post series.

To address this, BTF was developed, and it's already upstream in the kernel. What it does is add type information to eBPF programs. It is part of CO-RE, the project that tries to make eBPF code portable, which is a really hard thing to do. The dream is to compile an eBPF program once and then be able to load and run it on any kernel. And that's hard.
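The 32-bit pointer problem described above can be simulated in user space. The sketch below is only an analogy, not eBPF code: it builds the memory image of a made-up two-field structure as a 32-bit little-endian target would lay it out, then reads the pointer field once at 64-bit width and once at 32-bit width. All names and values are hypothetical.

```python
import struct


def pack_target_struct(name_ptr: int, flags: int) -> bytes:
    """Memory image of a hypothetical `struct { char *name; int flags; }`
    on a 32-bit little-endian target: 4-byte pointer, then 4-byte int."""
    return struct.pack("<Ii", name_ptr, flags)


def read_unsigned(buf: bytes, offset: int, size: int) -> int:
    """Read an unsigned little-endian value of 4 or 8 bytes."""
    fmt = {4: "<I", 8: "<Q"}[size]
    return struct.unpack_from(fmt, buf, offset)[0]


mem = pack_target_struct(name_ptr=0xB6F01234, flags=7)

# A naive 64-bit read swallows the neighbouring 'flags' field as well.
wide = read_unsigned(mem, 0, 8)
# Reading only 32 bits (or masking the upper half) gives the real pointer.
narrow = read_unsigned(mem, 0, 4)

if __name__ == "__main__":
    print(hex(wide))
    print(hex(narrow))
    print(hex(wide & 0xFFFFFFFF))
```

The same kind of care is needed for every pointer-sized field, which is why hand-written offset arithmetic like this is fragile and why type information is attractive.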
About adding type information to the bytecode: the bytecode can be stored in something like an ELF file, so what BTF does is add a new ELF section containing the type information, which is extracted from the kernel sources. This is hard because kernel sources are C and are preprocessed, and the preprocessor cuts parts of the source code out. There are hacks that work around these problems. For example, if you have an #ifdef, the value being tested is converted to an eBPF variable that gets tested at runtime, and both branches, one of which the preprocessor would cut, are included in the bytecode. That bloats the bytecode a little, but these kinds of hacks exist and they work.

eBPF is a stable kernel ABI. Once the BPF system call was exposed to user space and user space could load bytecode instructions, that became a public ABI, and it has to stay backward compatible. The kernel headers are like that as well. The problem is that when you want to poke at internal kernel data structures, they may not be exported. You get the memory in the context structure through the registers, but you don't know what you're accessing, and the kernel headers that usually come with a kernel dev package in a distribution only contain the user-space-facing structures. To access the others, you need to copy and paste the kernel structures into your restricted C program so you know what you're working with. That works, and it's what we're doing right now. I don't know if BTF will solve this; hopefully it will help.

We also have the problem that a lot of kernel structures have different layouts depending on the kernel config. There's a huge variation in kernel configs, and that leads to a very big variation in how the structures are laid out. Because of this, I don't know if there's any hope of having truly portable eBPF code, but we can make it as portable as possible.
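To see why config-dependent layouts hurt portability, consider this small sketch. It computes field offsets for a made-up structure in which one member only exists when a made-up config option is enabled. The structure, the `CONFIG_DEBUG_FOO` option and the simplified alignment rule (each field aligned to its own size) are all assumptions for illustration, not real kernel definitions.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

# (field name, size in bytes, config option guarding it or None)
Field = Tuple[str, int, Optional[str]]

# Hypothetical structure, loosely in the spirit of a kernel struct.
TASK_LIKE: List[Field] = [
    ("state", 4, None),
    ("debug_cookie", 8, "CONFIG_DEBUG_FOO"),  # made-up config option
    ("pid", 4, None),
]


def layout(fields: List[Field], enabled: FrozenSet[str] = frozenset()) -> Dict[str, int]:
    """Return field offsets, skipping fields whose config is disabled.

    Simplification: each field is aligned to its own size.
    """
    offsets: Dict[str, int] = {}
    offset = 0
    for name, size, config in fields:
        if config is not None and config not in enabled:
            continue  # the preprocessor would have removed this member
        offset = (offset + size - 1) // size * size  # round up to alignment
        offsets[name] = offset
        offset += size
    return offsets


if __name__ == "__main__":
    print(layout(TASK_LIKE))
    print(layout(TASK_LIKE, frozenset({"CONFIG_DEBUG_FOO"})))
```

Bytecode compiled against the first layout reads `pid` at the wrong offset on a kernel built with the second one, which is the portability problem that type information in the bytecode is meant to address.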
Because BCC currently compiles the eBPF bytecode on the fly using Clang, it requires kernel headers in the file system, since the eBPF code needs to access those data structures. Starting with kernel 5.2, kernel headers can be bundled inside the kernel image and accessed at runtime. These are the complete kernel headers. The work on this is still ongoing.

Another really interesting problem is security. Right now, any program that loads eBPF into the kernel needs to be root or have admin capabilities, and eBPF code is assumed not to be malicious. Adding unprivileged eBPF support is very hard, because you need mitigations for side-channel attacks, and it's not likely to happen. So care must be taken when you run eBPF code in production. Don't fetch restricted C or eBPF bytecode from the internet, compile it and insert it into your kernel, because that can expose a lot of system information to outside processes. There's a really good LWN article on this.

OK, now we're going to talk about the methods for using eBPF on embedded. The first: the kernel tree comes with a few examples. They already have precompiled eBPF; in many of them it is just stored as byte arrays in the C source code, and you have the libbpf library provided by the kernel, which you can use to load and interact with that eBPF. There are pros and cons to this. It's really lightweight, it's easy to compile, and it runs on very low-memory devices, but you need to maintain all the user space interaction yourself. If your eBPF use case gets very complex, you're basically reinventing BCC and all of that framework. But there are projects that do this, and we'll see one where it makes sense.

You can also use BCC directly. It's pretty big. The advantage is that the upstream project is very well maintained.
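Before running BCC directly on a target, two of the prerequisites mentioned above, privileges and kernel headers, can be checked with a few lines. This is a sketch under assumptions: `/sys/kernel/kheaders.tar.xz` is where kernels built with `CONFIG_IKHEADERS` expose the bundled headers, and `/lib/modules/<release>/build` is a common location for installed headers; the `preflight` function name is made up for this example.

```python
import os
import platform
from typing import Dict

# Exposed by kernels built with CONFIG_IKHEADERS (assumption: built in,
# or the corresponding module is already loaded).
KHEADERS_PATH = "/sys/kernel/kheaders.tar.xz"


def preflight(kheaders_path: str = KHEADERS_PATH) -> Dict[str, bool]:
    """Report whether this system looks ready for on-the-fly compilation."""
    build_dir = os.path.join("/lib/modules", platform.release(), "build")
    return {
        # Loading eBPF needs root or equivalent capabilities.
        "is_root": hasattr(os, "geteuid") and os.geteuid() == 0,
        # Headers bundled with the running kernel.
        "bundled_headers": os.path.exists(kheaders_path),
        # Headers installed on the file system by a distribution package.
        "installed_headers": os.path.isdir(build_dir),
    }


if __name__ == "__main__":
    for check, ok in preflight().items():
        print(f"{check}: {'yes' if ok else 'no'}")
```

On a small target, a "no" for both header checks is a hint that the on-the-fly approach will not work there as is.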
You get all the BCC tools, and everything works and is well tested. The problem is space. BCC compiles on the fly using libclang, which means each tool links at runtime with Clang, which is pretty big, and you also need RAM to run it. In my tests, I needed at least 300 megabytes of space just for this. It was developed for cloud servers. There is a project for embedded devices, androdeb, that uses this approach. It bundles a full BCC toolchain. It includes more than BCC, which is why it requires two gigabytes of storage, so it's a pretty big trade-off if you want to go that way. CO-RE, if it actually eliminates the on-the-fly compilation with Clang, will slim down BCC and make this variant more attractive. It will still require Python, though, because most of the tools are written in Python.

The third approach is BPFd. I actually had a lot of success with this project. I tried to upstream it into BCC and ended up abandoning it entirely. The main idea is to run a daemon on the embedded device, a very small C program, which communicates with a full BCC installation on a host machine. You can use multiple transport layers: telnet, SSH, even the Android Debug Bridge. You run Python and Clang and compile the bytecode on the host machine, then send it over the transport to the embedded device, where the daemon just interacts with the kernel. This works surprisingly well.

One big problem was latency. For most of the tools the latency was not that bad, but some tools read a lot of data from the kernel, and the poor embedded device needs to serialize all that data and send it over the wire, so you get noticeable latency. The problem that I think actually killed this project is that it's very hard to maintain. There is no standardized protocol for communicating with the embedded device, so what we did was take the internal interfaces of BCC and make them communicate with the BPFd binary.
And the internal interfaces of BCC change all the time. I don't think anyone wants to create a protocol there and make sure it stays stable, so that's a maintenance nightmare.

The fourth approach is to use a domain-specific language compiler and compile eBPF directly on the target device. This is the ply project. It builds a very small binary that is an eBPF compiler for a very specialized, AWK-inspired language. This binary is not portable: you compile it for a specific kernel and a specific device, and then you can put it on the device and run it. Here is an example. If you attach to a kprobe for an I2C transfer, you can print stack traces. This is very useful, for example, for finding out who does I2C transfers, because for every call of that function you get a stack trace that tells you who made it. It's really simple to do: basically, you copy the binary to the target and run that one command.

The disadvantage of this approach is that you lose the control you get from BCC. With BCC you get a lot of control over how you interact with the eBPF program in the kernel: what you read, how you write, when you read it. Here you can only interact through that very simple domain-specific language. So you lose control, but you get information really fast. The project is still under heavy development: that specific snippet is from version 1.0, and version 2.0 is a complete rewrite. This project also uses method one, which we saw earlier, so it's a totally self-contained project with its own custom user space compiler.

Another approach is to use the Go bindings, which is useful for replacing the Python dependency. Go is really good at producing statically compiled binaries, and you get a lot of control with Go, the same control you get with the Python bindings.
The problem is that most of the existing BCC tools are already written in Python, and many of them are very complicated. Porting them to Go is hard, and not many people see the value in doing it. Some tools were ported, for example execsnoop, which I mentioned. You can find them at that location, in the bindings directory; the bindings also contain ports of the tools. For the blog post series I used a generic loader written in Go and loaded precompiled eBPF objects. I transferred the Go loader and the bytecode to the target, and that worked as well.

OK, so, ways forward for eBPF on embedded. The most important thing is to actually have portable eBPF. You can have a Go project that is very small, but the problem is that the bytecode you copy is not very portable. With portable bytecode you could have a small generic tool that you bundle in a distribution and say, it doesn't matter what board you're on, you just install this distribution, you have the tool, and you can use it. So CO-RE needs to be as successful as possible; it will also make BCC more lightweight. We can use Go to eliminate the Python dependency. ply will continue; BPFd, very unlikely. eBPF is already useful today if you're willing to jump through some hoops, but there's work remaining, and in the future it will be more accessible on embedded devices. People are interested in doing this.

I have some learning resources here. LWN has some really good articles on it. You can also visit Brendan Gregg's blog, a really good blog with lots of information about eBPF. There's also a book that Brendan is writing, which will be out later this year; it's a very comprehensive book about eBPF. And there are the blog posts I mentioned, which I wrote, where you can get a more in-depth and hopefully better-explained version of this talk.
You can also find a lot of information by simply googling eBPF. A lot of people are writing blog posts and doing interesting projects with it. Many projects are trying to port their existing code to eBPF, especially projects that used to insert kernel modules to observe the system. And that was it. Thank you very much for attending.