So hello, I'm Jose, and today I'm going to talk about something we have been working on at Oracle, in the toolchain team, for the last good five or six months: supporting the kernel's internal virtual machine — however you want to describe it — BPF, in the GNU toolchain. By the GNU toolchain I mean this collection of development tools: GCC, binutils, the assembler, the disassembler, and so on. We will go through this very fast, because we only have 35 minutes to cover everything. So let's start. What is this project that we started at Oracle about? Basically, we have three different phases. The goal is to add support for eBPF to the GNU toolchain, which means introducing a new target, in the same way that, say, x86 on GNU/Linux is a target. We will see the support we specifically added to each of the components. But then — and this is what makes this target interesting — one thing is being able to generate BPF programs from your compiler, and something completely different is generating programs that can actually run in the kernel. Why? Because the kernel has a verifier that can reject your program quite easily. That is phase two, and that is the phase we are in right now: we have the target on board in the toolchain, but now we are fine-tuning it so that the generated programs can actually run in the kernel context. And then there is a third phase, basically in parallel with the second one, which is an ongoing activity: expanding the support for this specific target.
The general idea is that BPF developers should have a complete and useful toolchain — a toolset like the one developers for any other platform have — even though BPF is very peculiar. I don't think I need to introduce BPF here, but just in case, because the naming part can be a bit confusing: BPF is the Berkeley Packet Filter. Originally it was intended for filtering packets — basically, for describing the characteristics of a network packet in order to discriminate it. Especially in recent years, it has been evolving into something else: a much more generic sort of virtual machine that runs in the kernel. If you ask the BPF kernel hackers, they will tell you that eBPF is not a general-purpose virtual machine, but everything else indicates that it is becoming one. As for this eBPF/BPF naming thing: the original, restricted BPF for packet filtering is now called cBPF, or classic BPF; the newer, more general BPF is called eBPF, or extended BPF. But I was told two or three weeks ago that now they are just calling eBPF "BPF", which I think is good. So what is eBPF, from a programmer's perspective — especially from the perspective of a toolchain person like me, a compiler person? Basically, eBPF is an instruction set architecture, and if you look at it, it is pretty straightforward. Most instructions are 64 bits long; it's very uniform.
There is one instruction which is 128 bits long: it is used to load a 64-bit constant into a register. But then it has some peculiarities. For example, it doesn't give you a stack pointer, which is quite weird when it comes to instruction sets and von Neumann machines. It has no floating-point support, which makes full sense, because these programs are executed in the kernel, and in the kernel floating point doesn't make that much sense. The instruction set is supposed to be orthogonal to endianness; it is not, which is a bit unfortunate, but okay. And the instruction set is limited on purpose, by design. It's not a general architecture like SPARC or MIPS or x86, designed to support very general-purpose languages; eBPF is not. It has registers which are 64 bits wide; they are all general-purpose registers, except one which is a frame pointer, and which is read-only. And this is what the instruction set looks like: the first row is divided by the different classes of instructions. It's pretty straightforward; it is small and simple. Then, as in every architecture: okay, you have an instruction set, but what about the ABI? In the case of compiled eBPF — well, what do we do with compiled eBPF? We put it in ELF object files. Now, what is a valid ELF file containing a BPF program? What makes it valid or not? In more normal architectures, that is typically what the ABI tells you — the psABI, for example, in the case of a Unix system. In the case of eBPF, currently, in practice, what constitutes a valid BPF program in an ELF file is determined by what the existing LLVM backend produces and what the different kernel loaders accept.
There is a bpf_load.c in the kernel, used in the samples directory, but the main consumer of ELF files with BPF in them is libbpf, which is also part of the kernel tree. So this ABI is not really documented. Why? Because up to now there was only one compiler producing compiled BPF code from C, and only one consumer, the Linux kernel. We can expect there to continue being only one consumer — the Linux kernel — but now we have another compiler generating BPF code, and at this point I think it would be good if we start thinking about documenting the ABI: what constitutes a valid BPF program, which relocations are used for which purpose, the section names, things like that. So, we started the port. The first thing you have to do when you add a new port to the GNU toolchain is to decide the triplet. The triplet is basically a name in three parts — like x86_64-linux-gnu, for example. The first part is the CPU, then you have a vendor part and an operating system part. In the case of BPF, which is sort of a bare-metal architecture — if we imagine the kernel as being the hardware which actually executes BPF — BPF doesn't run on any kind of operating system, and in our case the vendor part doesn't make much sense either. So in the end the triplet became bpf-unknown-none, and of course — which is a good thing — you can use shorter names like bpf-none, or just bpf, to refer to it. Once we defined the new triplet, we had to add the ELF support.
There was already an ELF machine identification number defined for BPF — 247, the one LLVM uses — and then we added a new list of relocations, of which three are the only ones that survive in the object file after linking. We did that carefully, so that they use the same relocation numbers as LLVM, and we stay binary compatible with programs generated by LLVM. We don't want to break compatibility with what LLVM produces — obviously, that would be totally silly. Once we had ELF support, we had to create a BFD port. BFD is the library in the GNU toolchain that gives you support for object files, and we added support for 64-bit ELF files for BPF, in both big-endian and little-endian. Then the next part was to define the opcodes. I come from the GNU Tools Cauldron and I am reusing some slides here, so this is maybe too detailed, but it's nice anyway. Some of the GNU ports use a piece of software called CGEN, where you basically define the characteristics of the hardware you want to target in RTL — a language which is very similar to what GCC uses internally as the intermediate representation of programs.
So we defined the architecture: hardware elements like registers, fields of instructions, operands, and the instructions themselves — including their semantics, because that is how we generate the simulator, for example. CGEN is very nice. Anyway, from the CGEN description we generated the BPF assembler and disassembler. So we have bpf-as, the GNU assembler (gas) port for BPF, and, using objdump, you can disassemble BPF code in exactly the same way you would with any other architecture. And that is how it looks: very straightforward. We also made a port of the linker. Now, this is a bit less straightforward. Why? Because BPF is peculiar, and the way libbpf and the kernel use ELF to load BPF programs doesn't match the ELF model that well. In ELF we have object files, which are relocatable, and then we have executables. An ELF executable has a single entry point — _start. BPF today doesn't quite follow that model, because when you load an ELF object containing BPF programs into the kernel, libbpf expects the ELF file to contain a set of sections, and the names of those sections specify where, and how, those BPF programs are installed in the kernel. So an object file containing BPF programs has multiple entry points, and when you try to reconcile that with the ELF world, it doesn't work that well. We are going to explore, in the future at Oracle, some possible alternative models. But we got the linker working, and at least it allows you to compile one set of functions in one object file and another set of functions in another, and then generate a third, linked object containing the combination of everything.
How useful is this for real BPF development at the moment? I'm not that sure, honestly, but we are going to continue exploring that line. Then, of course, there is the rest of the plethora of binary utilities: you can make archives, you can get symbols from the objects, you can copy objects while adding new sections — the typical things, like in any other port. Then finally — and this is what we are working on right now — we have a GDB port and a simulator. We have not pushed this stuff upstream yet, because it still requires work, and, due to BPF being peculiar — especially the execution model of BPF — there are some problems we are still working on. Because the idea, of course, is that if you are a BPF developer, you should be able to do the same things as if you were writing software for any other architecture — let's say an embedded architecture — for example, to inspect your program with GDB. Now, we will see that BPF is problematic in that sense, because the instruction set is so restricted and the verifier is so strict that, for example, there is no way to get backtraces in a BPF program. So how useful would a debugger like GDB be without the ability to do backtraces? We shall see. Anyway, we have a simulator; it works, but it doesn't work very well, so we are working on it. As soon as we have the simulator working properly, we will push it to GDB upstream, so it will be released with the next version of GDB. And then, finally, the compiler: we made a GCC backend, and those are the architecture-specific options that this port supports. I was at the Linux Plumbers conference two weeks ago and I got a lot of feedback from the kernel hackers, for which I am very grateful. This is funny, because I am a compiler guy, so I look at the BPF problem, and the BPF world, from that perspective: you give me C, I give you BPF, right?
But when it comes to using BPF — the applied usage of it — I really appreciate any feedback I can get; for me, that is super precious. For example, something that made full sense to me while I was writing the backend was to have a -mkernel option, which would be the equivalent of the -mcpu option in other architectures. Well, apparently it's not that useful. Why? Because of the backports that production kernels have — in distributions and so on. So, okay, the option is still there; it defaults to "latest", but we will probably remove it at some point. Then you can specify whether you want to generate big-endian or little-endian BPF programs. And I also added an option related to the stack: you know that the size of the stack of a BPF program is currently limited to 512 bytes, which is not a lot. So I added an option, and the compiler will, by default, give you an error if you exceed that limitation — so you learn at compile time, rather than at run time, that your program will not run in the kernel. Then I also added some compiler built-ins, in this case for generating the non-generic load instructions of BPF; LLVM has similar things. And then, the helper functions of BPF in the kernel: BPF programs are so limited that, in order to do certain things, you escape to the kernel with sort of a pseudo system call that a BPF program can make. The BPF program runs in the kernel, but it can call one of a limited set of kernel functions — to do complicated stuff, or stuff that requires, say, reading or writing memory, which could be potentially dangerous. At the beginning, I used compiler built-ins for those helpers.
Then, at Plumbers, using the feedback from the kernel hackers, I realized that was a bad idea. So now I have a patch — I have not pushed it to GCC yet — substituting the built-ins with an attribute that you can apply to function declarators. When you write a BPF program in C using LLVM, you always want to include a header file called bpf_helpers.h, which is part of the Linux kernel. That header file defines the helpers, as well as some data types and goodies which are generally useful. At the moment, GCC is not able to use the kernel header, because it contains some LLVM-specific stuff. I'm working with the kernel hackers on this right now, but for the moment I am shipping a file, bpf-helpers.h, along with GCC, which should give you the same interface that bpf_helpers.h gives you. But please note, before you jump on me, that this file is meant to become obsolete. I want to remove it; I hate this file; but at least for now, temporarily, we actually need it. Does it work? Well, the numbers you can see there may not be that impressive to you, but I can tell you that for a new GCC port this is not bad at all. This is the result of running two big test suites which are part of GCC. Of course, those are compile-only tests, because until I get my simulator up and running, I cannot run the part of the GCC test suite that requires executing the programs. So this was the part where I show you what we have done so far. I don't know how much time we have left — nine minutes? Oh, we have plenty of time. So: BPF is peculiar, right?
And that makes it especially interesting, and fun, to compile to. Some examples. Endianness: in theory, BPF was designed to be orthogonal to endianness, which makes full sense for something like BPF. Why? Because the Linux kernel can run on both big-endian and little-endian machines. So the initial idea was: let's make our instruction set agnostic to endianness — the fields are the same, everything is the same, but the values encoded in the different fields of the instructions follow one endianness or the other, depending on the architecture the Linux kernel is running on. It makes full sense, and it made me very happy. However, the way a BPF instruction is defined in the kernel basically makes this impossible. Why? Because it uses bit-fields which are smaller than one byte, and the C specification tells you that the compiler is free to order those fields as it likes. And it happens that in GCC, and also in LLVM, when they see something like this, the destination register field and the source register field are swapped depending on whether you build on a little-endian or a big-endian machine. So, despite the good initial intentions, in practice there is no one BPF instruction set. There are two — one for little-endian and one for big-endian — and the difference between them is that in one the destination register field comes first, and in the other it comes after the source register field. If you work in the kernel using this definition, it's not that painful, because it's transparent to you. But trust me: if you make a toolchain that generates code for this architecture, it's extremely painful, because you have to swap those fields all the time. Fortunately, CGEN — the program I mentioned before for describing hardware
architectures — has very good support for macros, so I could tackle it there. I still have problems in the simulator, too, because of this. Anyway. What else is peculiar about BPF? Well, this is the function prologue and epilogue of a function in MIPS. I chose MIPS because it's one of the most straightforward architectures. What do you do? You get the stack pointer from the caller; you allocate space for your frame by decreasing the stack pointer; you save your frame pointer; you use your stack and your frame; and in the function epilogue, before returning to the caller, you restore the value of the stack pointer. Fine. This is the typical thing that every architecture, or most of them, does. In BPF you don't have a stack pointer. First shock: okay, I don't have a stack pointer — so where does my stack come from? It is automatically allocated, sort of, by the kernel verifier. One of the things the verifier does when you load a BPF program into the kernel is to scan it instruction by instruction. It finds where the functions are — where each function starts and where it ends. Then it tracks the usage of the frame pointer that is given to each function, and from that it is able to determine how big the stack frame for that specific function is. It's super cool: it is like a hardware architecture that does automatic memory allocation for you. That's why you don't get a stack pointer — you don't need to keep one; the "hardware" keeps it for you. Okay, this is very comfy, right? But it's not that nice when it comes to compilers.
GCC did not like this idea that much. Initially, I told GCC to eliminate the stack pointer in favor of the frame pointer, and that worked — until I tried to compile the first function using a variable-length array, or alloca, and then GCC went into an infinite loop. It was very, very sad. So basically I made a hack: I am using register r9 to implement sort of a pseudo stack pointer, so you can use things like variable-length arrays and alloca. Of course, the kernel hackers will probably tell you that you don't want to use alloca or anything like that in BPF programs. Fine — but as a BPF compiler writer, I want to support as much C as possible. Especially because recently BPF is getting so popular, in different kernel subsystems, that I would not be surprised if in five years we want to support C99 on top of BPF. Whether that is good or bad, I'm not getting into; but I would not be surprised. Actually, people are starting to say: "Oh, I want half a virtual machine in this userland application — can I use BPF for that?" I have heard that more than twice in the last few weeks. So it's coming. Another peculiarity. This is an excerpt from the kernel interpreter. The kernel has two different implementations of BPF: an interpreter, and also just-in-time compilation — each architecture has its own JIT. Every time one BPF function calls another BPF function, the interpreter allocates a new array, which becomes the stack frame of the called function, the callee. This means that the stack is disjoint — or, at least, we should assume that the stack is disjoint. Actually, in the JIT implementation the stack happens to not be disjoint, but you cannot assume that.
This means that, generally speaking, you cannot access the stack frame of your caller. The verifier supports referring to it using an absolute address — I was so grateful for that — and that is used to implement passing arguments by reference: you pass a pointer instead of a value. But you cannot access the stack frame of the caller as an offset applied to your frame pointer — relative to your frame pointer. And what happens then? There is no way of passing arguments on the stack. So you have a very hard limit on the number of arguments that you can pass from one BPF function to another. You can imagine how painful this is for me, for example when it comes to testing the compiler. There is also the stack limit. (Yes, five minutes.) Another interesting thing: BPF doesn't give you a signed division instruction. Okay, it may be that people don't need it, I guess, I don't know. But LLVM gives you an internal compiler error if you try to compile such code, while GCC generates a function call — which I think is much more elegant, but equally useless. BPF doesn't have a zero register. I come from maintaining the SPARC backend of GCC, so for me a zero register is like — I don't know — underwear: I feel uncomfortable without one. But it works. And it has many other limitations, and those limitations are not capricious — there is a reason why the instruction set is limited like this — but when it comes to implementing C on top of it, well, it's fun. Very good news: Alexei said at Plumbers last week that a memory model is coming. The memory model of an architecture basically tells you how memory loads and stores are ordered,
implicitly, or explicitly with explicit instructions. So it's good that BPF is getting a well-defined memory model, because that means I can add an instruction scheduler to GCC. This is something I am doing, and there is also a proposal, because I have a huge problem: BPF is so limited that I find myself in great pain trying to test my compiler. The number of arguments to functions is limited, the size of stack frames is limited — everything is limited. So when I try to run the GCC test suite, it's carnage. So I'm introducing a new target option, xBPF — "exceptional BPF" or whatever; I don't actually care about the name — which is basically BPF without the restrictions. The main purpose is to test my compiler. Also, using this, you should be able to have backtraces, and if I support that in the simulator and in GDB, you should be able to sort of debug your programs before using the kernel-side debugging facilities that Alexei and the others are designing. And I also think it will be good for exploring the impact of changing or lifting this or that limitation. So this xBPF thing is happening in the toolchain; it's work in progress. The current status is that the binutils port is upstream, so it will be part of binutils 2.3-something — I forget exactly — which will be released soon; and it will be part of GCC 10, which will be released next year. The next step is the phase two I was talking about before: now the programs we generate from GCC should work in the kernel, and I'm working with the kernel hackers on that. For example, I am now working on the self-tests in the kernel: they should work with both LLVM and GCC, and they should do the same thing — they should be interoperable. And we want to support "compile once, run everywhere", which is based on BTF.
Basically, I don't have time to get into that, but it's actually quite interesting. And, in general, the plan is to work with both the kernel community and the LLVM people, so that all together we can evolve this field of compiled BPF — which needs love, basically. And that's it. Any questions?

Q: What will the simulator be able to do for specific BPF program types? Are you going to mock out the context?

A: Yeah, the idea is that we are going to simulate kernel contexts, because our main idea is that you should be able to develop and debug your BPF application as much as possible before making a syscall to the kernel. I am aware that the perspective of the kernel hackers is different in this sense, because they test the LLVM backend with the kernel tests and do all their testing on the kernel side. I sort of disagree on that point, because I think we should be able to do as much as possible before going into the kernel, and I think that will be beneficial for the kernel as well. So we are going to support some kernel contexts, yes.

Q: I would like to come back to the -mkernel option you were talking of removing. If I understand correctly, that means that, in a way, a newer GCC would not be able to compile BPF for older kernels?

A: No, it's not like that. Basically, the idea was that, in the same way that GCC tells you that an instruction doesn't exist in a given version of the ARM architecture, for example — my idea was: if a BPF instruction was added in kernel 5.3,
and you build your program and you want it to be able to work on kernel 5.2, then you build with -mkernel=5.2, and the compiler warns you, or gives you an error, because this instruction is not a valid instruction for your target. That was the idea. It turns out that most people are running kernels in production containing backports from future kernels.

Q: That's an assumption — I'm not sure it's true, at least in the embedded world. Maybe we can discuss that afterward.

A: Yeah. Anyway, I think I found the solution for that, which is to keep the option but have it default to "latest". So basically, if you ignore it, it's not going to bite you.

Q: With the stack-trace generation: could you have an external channel through which the machine would help you generate the stack trace, if you need it for debugging?

A: What do you mean by an external channel?

Q: That you don't try to derive the stack trace from pointers on the stack, but query somewhere external — the simulator just gives you a list of the functions, because it knows them.

A: Yes, yes. Actually, it's going to have to be like that, because the concept of a return address in BPF — I don't think it even exists, actually. My problem is DWARF in this case, because I hit this problem straight away when I was writing the backend: okay, what is the CFA address? Oops. But yeah, that's a good idea; it's going to have to be something like that — or xBPF, which I know is not optimal. But hey: compile your program with -mxbpf and then you can debug your program logically. Because that's the thing — logically.
The code generated with -mxbpf is not going to be the same as when compiling real BPF, but at least you can do some logical debugging of the algorithms you are implementing. Of course, currently BPF doesn't allow you to have unbounded loops, for example, so the algorithms you can encode in a BPF program are very limited. But I don't know; that may change in the future.

Q: Okay, thanks. We're done.

A: Thank you.