My name is Arnaldo Carvalho de Melo. I have worked for Red Hat for the last 14 years, from Brazil. I have presented several times at DevConf, but always in Brno, not like this, without seeing anybody, which is really strange, but let's try it.

This talk is about the BPF Type Format (BTF), a format for encoding types such as C types: kernel data structures, user space data structures, kernel ABI data structures, et cetera. It came from the BPF subsystem, which is not really the main topic of this presentation, but BPF gave us BTF, a very compact way of representing types. It ends up at a fraction of the size of DWARF, which is another type information format, but it also carries only a fraction of what is contained in DWARF. Tools outside the BPF ecosystem use it as well. For instance, you can use it in the kernel with printk, let's say, or some snprintf, where you pass a pointer and specify the type for that pointer, and instead of printing just an integer or a string it will print the whole type, like you do in tools like GDB or crash. I will also show something I did that uses this information to print raw, arbitrary data, using the BTF type information that is now available in the kernel.

So, where is it in the kernel? There is a new sysfs file, /sys/kernel/btf/vmlinux, and all the types for the kernel, I don't know how many, it's thousands of types, are all there. You can get full information about them: all the names of the data structures, enumerations, everything. More recently, we have it available for kernel modules as well, with a feature called split BTF, which we'll come to. So, what is in there?
All kernel types. There is the kernel ABI, which is set in stone: it's something that was defined at some time in the past, and we can't change it, or we can perhaps add some new field while keeping the existing ones with the same semantics and at the same offsets from the start of the data structure. And there are the kernel internals, which are always in flux, like the internal kernel data structure that represents a task, or a socket buffer, or a block device, things inside the kernel that are not covered by the kernel ABI. Sometimes, for companies like ours, for Red Hat, we have an extended kernel ABI where we guarantee that even some internal data structures don't change, just like the things that are in the upstream, or community, kernel ABI.

There is a tool which I wrote long ago that can consume both BTF and DWARF. Its name is pahole, which I will explain a little bit later. If you just say pahole rwlock_t, you are asking for that data structure, be it an enumeration, a union, a struct, a typedef, whatever. Since you don't specify where to get this information, it looks first at the BTF information, which nowadays is present on most distributions. If you look at your system right now and it's up to date, you are going to see it at the location I mentioned. So it gets this information and reconstructs the data structure on the screen in a way that is even compilable. It also provides extra information, like its size, how many cache lines it uses, how many members it has, and the number of bytes in the last cache line. This is interesting for developers.

pahole started as a tool for looking at struct holes. The kernel has thousands of data structures, and these data structures have lots of members.
If you are not careful, you can end up with alignment holes between the members of a data structure. If you are careful enough and reorganize the data structure, you can make it smaller, so it consumes less memory and fewer cache lines, and you can make the kernel faster for things like network socket buffers or whatever represents data going to and from a file on a disk.

But this name is strange, so let's call it something else. We can use aliases in the shell command interpreter and say that typedef is the same thing as pahole, that struct is the same thing as pahole, and the same for union or enum. Then, instead of pahole rwlock_t, you do it very naturally, like typedef rwlock_t, and you get the typedef that you asked for from the kernel that is running on your machine right now. If you do struct list_head, which represents one entry or the head of a linked list in the kernel, you get that other data structure. If you do enum and then the name perf_event_type, you get the enumeration, again in a way that you can even recompile and build into your source code. This is used by BPF programs when they are building bytecode from restricted C. Keep this enumeration in mind; we are going to use it later.

And when I said all kernel types, I meant all kernel types. You can use the --expand_types option of pahole, or of typedef here. It takes that rwlock_t we saw before and expands it all the way down. You'll see that arch_rwlock_t is in fact a struct qrwlock, which in turn has a union, which has an atomic_t, which in turn is just an integer that is accessed using some special operations. It goes all the way down, and you get the offsets from the start.
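
To make the idea of alignment holes and member offsets concrete, here is a small Python sketch using ctypes, which lays out structures with the platform's C alignment rules. The two structures and the describe helper are hypothetical, made up for this illustration; they are not kernel types and this is not how pahole itself works (pahole reads the layout from BTF or DWARF).

```python
import ctypes


class Padded(ctypes.Structure):
    # Hypothetical layout: 1-byte members interleaved with 8-byte members,
    # so the compiler has to insert alignment gaps.
    _fields_ = [
        ("flag", ctypes.c_uint8),
        ("counter", ctypes.c_uint64),
        ("state", ctypes.c_uint8),
        ("timestamp", ctypes.c_uint64),
    ]


class Reorganized(ctypes.Structure):
    # Same members, largest first, so the small ones pack together.
    _fields_ = [
        ("counter", ctypes.c_uint64),
        ("timestamp", ctypes.c_uint64),
        ("flag", ctypes.c_uint8),
        ("state", ctypes.c_uint8),
    ]


def describe(struct_type):
    """Return ([(name, offset, size, gap_after), ...], total_size).

    gap_after is the number of unused bytes between the end of a member
    and the start of the next one (for the last member: tail padding).
    """
    fields = struct_type._fields_
    total = ctypes.sizeof(struct_type)
    rows = []
    for i, (name, _ctype) in enumerate(fields):
        member = getattr(struct_type, name)  # field descriptor with offset/size
        end = member.offset + member.size
        if i + 1 < len(fields):
            next_start = getattr(struct_type, fields[i + 1][0]).offset
        else:
            next_start = total
        rows.append((name, member.offset, member.size, next_start - end))
    return rows, total


for struct_type in (Padded, Reorganized):
    rows, total = describe(struct_type)
    print(f"struct {struct_type.__name__}: size {total}")
    for name, offset, size, gap in rows:
        print(f"  {name:<10} offset {offset:>2} size {size} gap after {gap}")
```

On a typical 64-bit Linux machine I would expect Padded to take 32 bytes and Reorganized 24, but the exact numbers depend on the platform's alignment rules, which is why the sketch computes them instead of hard-coding them.
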
These offsets can be useful when you are decoding kernel oopses, or when you want to know what is at some offset in a complex data structure full of substructures. But okay, let's continue.

Split BTF is for kernel modules. The kernel module BTF, which has the types that are exclusive to that specific kernel module, refers to the kernel BTF information so that it doesn't duplicate it. It's available since pahole version 1.19 and kernel 5.11, the latest one, so it's not yet enabled in things like Fedora. There are Kconfig variables for that. This kernel specifically was built with Clang, and it works there as well. So when you go to that directory, /sys/kernel/btf, instead of just vmlinux you have one file per module. And what's in there? You call pahole specifying acpi_pad as the name of the file from which you want to print all the types. It doesn't work, because you didn't specify the base BTF. So you have to pass --btf_base along with acpi_pad, and then you get the first data structure that is specific to this kernel module. If you call pahole and specify this prefix here, it does everything for you: it sets the base to vmlinux and goes on.

Okay, can we do more? We can do more. Using plain struct foo is powerful. When you want to see a kernel data structure, you just say struct foo and you get the kernel data structure. For developers, for reconstructing types, there is no need for kernel headers. Instead of using kernel headers to build something that will ultimately be turned into a binary and loaded into the kernel, you just use this type information, which is available on every machine and matches the running kernel.

Then there are some extra uses. There is a coworker at Red Hat called Joe Lawrence. He came to me and said: pahole knows about types. We need to extract modversion info in shell scripts.
It's related to work that we do in kernel live patching. Can you help? I said, sure, let's try to do some pretty printing of raw data, using type information to format the standard input of pahole, and let's support arrays. That is available in this version of pahole.

Okay, so modversion_info. You get this information from a kernel driver that is compiled with this specific feature. This is the data structure. Now let's pretty print it. We have to extract the data using binutils: objcopy gets the __versions section and outputs it to this versions file. The versions file has this size, and then you say: print the first three entries of an array that starts at offset zero of this versions file, and get the type information from the kernel module. The type is this, and we get this output. So, yeah, it seems to satisfy what Joe Lawrence asked me for.

Let's go to something else. You can now get ABI definitions like the ELF64 header from the kernel. That's the definition of the type. So if we take any binary, like /bin/bash, and say, give me the first one, then the first one is the header. If you look at those numbers, they will make sense. And then for BCP as well. Case closed?

Okay, no, let's add some more features: --header. Instead of giving a count of one, you say that the header for this thing is elf64_hdr, and then it stops, because there is just one header. We can also have header variables: fields in the header type that can be referenced later to decode ranges in the file, for instance. Take the perf.data header. I maintain the perf tools in the Linux kernel, so I was thinking, yeah, I may use this to decode the perf.data file. You have the magic number and a size field, so the information we need to find where to start decoding things like the attributes or the events is here, as an offset, a size, et cetera.
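
As an illustration of what pretty printing a fixed-layout header from raw bytes involves, here is a minimal Python sketch that decodes an ELF64 file header by hand. It is not how pahole does it: pahole takes the layout from BTF or DWARF, while this sketch hard-codes the field layout from the ELF specification and only handles little-endian files. The function names and the synthetic sample bytes are assumptions made for the example.

```python
import struct
from collections import namedtuple

# Little-endian layout of the 64-byte ELF64 file header.
ELF64_HDR = struct.Struct("<16sHHIQQQIHHHHHH")

Elf64Hdr = namedtuple(
    "Elf64Hdr",
    "e_ident e_type e_machine e_version e_entry e_phoff e_shoff e_flags "
    "e_ehsize e_phentsize e_phnum e_shentsize e_shnum e_shstrndx",
)


def parse_elf64_header(data):
    """Decode the first 64 bytes of a little-endian ELF64 file."""
    if len(data) < ELF64_HDR.size:
        raise ValueError("not enough bytes for an ELF64 header")
    hdr = Elf64Hdr._make(ELF64_HDR.unpack_from(data, 0))
    if hdr.e_ident[:4] != b"\x7fELF":
        raise ValueError("bad ELF magic")
    if hdr.e_ident[4] != 2:  # EI_CLASS must be ELFCLASS64
        raise ValueError("not a 64-bit ELF file")
    if hdr.e_ident[5] != 1:  # EI_DATA must be ELFDATA2LSB
        raise ValueError("only little-endian files are handled here")
    return hdr


def pretty(hdr):
    """Render the header roughly like a C designated initializer."""
    lines = ["{"]
    for name, value in hdr._asdict().items():
        shown = value.hex() if isinstance(value, bytes) else hex(value)
        lines.append(f"\t.{name} = {shown},")
    lines.append("}")
    return "\n".join(lines)


# Synthetic sample (made-up values): an x86-64 executable header.
sample = ELF64_HDR.pack(
    b"\x7fELF" + bytes([2, 1, 1]) + bytes(9),  # e_ident, 16 bytes
    2,          # e_type: ET_EXEC
    62,         # e_machine: EM_X86_64
    1,          # e_version
    0x401000,   # e_entry
    64,         # e_phoff
    0,          # e_shoff
    0,          # e_flags
    64,         # e_ehsize
    56,         # e_phentsize
    0,          # e_phnum
    64,         # e_shentsize
    0,          # e_shnum
    0,          # e_shstrndx
)

print(pretty(parse_elf64_header(sample)))
```

To try it on a real binary, read the first ELF64_HDR.size bytes of the file in binary mode and pass them to parse_elf64_header.
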
So let's look at the perf event header, perf_event_header. It describes a variable-sized record. You have a type, which specifies how to decode the thing. You have misc, which are flags, and you have the size; the size of the variable-sized record is specified here. The kernel produces these records, and user space, the perf tool or any other program that uses the perf events interface, uses this information to know how to decode what the kernel produced. So the perf header is effectively an ABI. perf can have this type information in BTF format or in DWARF format, it doesn't matter; pahole can use both. The format uses variable-sized records and has well-known member names like type and size, which is good.

So I pass --seek_bytes, which is the offset to start at, and --size_bytes, which says how many bytes after this offset belong to this specific perf_event_header type. Then I say, give me the first four: --count 4. And I say that the sizeof operator, which tells what the size of each variable-sized record is, is the size member inside the perf_event_header type. So it shows me the first four, but since there is no more information here, the type is just an integer, the best it can do is show just numbers.

Then you can use header variables: you can reference something like $header.data.offset instead of hard-coding the numbers. With --header perf_file_header it gets the information from the perf.data file itself, and you get the same output.

But do you remember that enumeration, perf_event_type? Take a look at this specific entry, PERF_RECORD_COMM. It's number three. If you come back here, we see that this one is three. So I say that there is a field in this perf_event_header that is the type for this thing. If I don't specify type equals something, it assumes that it is type equals type, just type.
It's a well-known member name for this specific semantic. Then you specify that the enumeration for the type that's specified here is perf_event_type. So it takes that number and tries to do a lookup in this enumeration. And if it finds it, like it did with the number three here, things start to get better: you get PERF_RECORD_COMM instead of just 3.

And this enumeration gives us something interesting. It mapped number three to PERF_RECORD_COMM. The types here show that it's a union, so the comm record has the same start as the header. And I can go from this enumerator, if I lowercase it, to this type, perf_record_comm. So now I know what to cast that specific chunk of data coming from the standard input to: this type, perf_record_comm. And then I can get to this: I know that there is a pid, and it is formatted as hexadecimal because I asked for that here. I know that comm is a string, so I can do better. And I can do this for all the other types and get everything decoded.

Okay, there is a lot more, but we end up with something like this, which let me explain quickly, as time is running out. You have pahole. You point it at the perf binary, which is where to get the data types. You say that the header is this, so it is this type that it gets from there. You say that there are records of the type perf_file_attr, and you say: in the header, get the range from this member that has an offset and a size. Same thing for perf_event_header: I say that there is a size member and a type member, and that to decode those types you have to use this enumeration plus this other enumeration, combine the two, and then decode everything.

So the goal here is to use this to document the file format while providing a full pretty printer, replacing the dumper of raw events in perf report. New records that we add in the future will get supported automatically.
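
The decoding scheme described above (walk variable-sized records using the size member, look the type member up in an enumeration, lowercase the enumerator name to find the record type to cast to) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the header layout (u32 type, u16 misc, u16 size) and the first enumerator values follow the kernel's perf_event_header and perf_event_type, but the decoder table, the helper names and the sample bytes are made up, and a real PERF_RECORD_COMM record may carry extra trailing sample data that this sketch ignores.

```python
import struct
from enum import IntEnum


class PerfEventType(IntEnum):
    # Only the first few enumerators of the kernel's enum perf_event_type.
    PERF_RECORD_MMAP = 1
    PERF_RECORD_LOST = 2
    PERF_RECORD_COMM = 3
    PERF_RECORD_EXIT = 4


# struct perf_event_header: u32 type; u16 misc; u16 size (little-endian here).
HEADER = struct.Struct("<IHH")


def decode_comm(payload):
    # Hypothetical decoder for the part after the header:
    # u32 pid, u32 tid, then a NUL-terminated command name.
    pid, tid = struct.unpack_from("<II", payload, 0)
    comm = payload[8:].split(b"\0", 1)[0].decode()
    return {"pid": pid, "tid": tid, "comm": comm}


# Lowercased enumerator name -> decoder for that record type.
DECODERS = {"perf_record_comm": decode_comm}


def walk_records(buf, offset=0, size=None, count=None):
    """Yield (type_name, misc, record_size, decoded) for each record."""
    end = len(buf) if size is None else offset + size
    seen = 0
    while offset + HEADER.size <= end and (count is None or seen < count):
        rtype, misc, rsize = HEADER.unpack_from(buf, offset)
        if rsize < HEADER.size or offset + rsize > end:
            raise ValueError(f"bad record size {rsize} at offset {offset}")
        try:
            name = PerfEventType(rtype).name  # e.g. 3 -> PERF_RECORD_COMM
        except ValueError:
            name = str(rtype)  # unknown type: the best we can do is a number
        payload = buf[offset + HEADER.size:offset + rsize]
        decoder = DECODERS.get(name.lower())
        decoded = decoder(payload) if decoder else payload.hex()
        yield name, misc, rsize, decoded
        offset += rsize  # the size member tells us where the next record is
        seen += 1


def make_record(rtype, payload, misc=0):
    # Helper for building synthetic test data.
    return HEADER.pack(rtype, misc, HEADER.size + len(payload)) + payload


sample = (
    make_record(3, struct.pack("<II", 1234, 1234) + b"bash\0\0\0\0")
    + make_record(99, bytes(range(1, 9)))
)

for record in walk_records(sample):
    print(record)
```

The offset, size and count parameters play the same role as the seek, size and count options mentioned in the talk: they bound the range of the buffer that is decoded.
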
The future is to experiment more with this and to finish the perf.data dissector using features that are not specific to it, so that you can use this to decode a GIF file or any other file type. I also want to integrate it with the decoding that is present in perf, which maps integers in the kernel ABI to strings where they are not documented as enumerations, maybe add these features to other tools, like GDB or crash, and fix bugs. This presentation is available here, together with other presentations about BPF. There are other features that are already present, like filtering by the enumerator name, which maps back to the number. It's already quite capable. So that's it for the presentation. Now it's time for questions. Thank you.

Thank you, Arnaldo. It seems like we don't have any questions in the Q&A section, but if anyone wants to ask anything, please follow up on the Discord server, in the Discord track there. And I think that will be everything. Or, if you want to add anything, we still have four minutes left, so it's really up to you.

Yeah, I mean, this BTF thing is really enabling lots of new features. All this type information was already available in DWARF, but DWARF was too big. DWARF versions four and five improved a lot in this regard; there are new features that make compaction possible. But by now BTF is supported inside the kernel, by the BPF verifier, so a really vibrant community has popped up around it. It's used in lots of places, like bpftrace. I saw a talk yesterday by Jiri Olsa, who has worked with me, even on pahole, where he showed examples in which you can access task_struct fields. In the past that required kernel headers that matched exactly the kernel you were running. This was sometimes a source of confusion and problems.
But now the type information for the kernel is shipped together with the kernel image, and when you boot it, it is available there, and the kernel can use it from the inside to validate lots of new features that are being implemented in BPF, like BPF CO-RE, Compile Once, Run Everywhere. It's really exciting what you can do with this. It's fast to use; you don't have to process lots of data. So I really think that people should look at it more and find ways to use it, since it's available everywhere.

Thank you very much for your talk and for the new tool to try. And I think that's it for this talk.

Okay, thanks a lot.

Yeah, thank you, Arnaldo.