Yeah, so BPF documentation and standardization. The BPF folks in here might have noticed me doing a lot of work on the in-kernel BPF documentation, and the background is that eBPF gets used quite a lot outside the Linux kernel these days, and people who are not just doing fun stuff in the kernel tend to really want a specification for the instructions they're working with, especially when they want to incorporate it into another standard. Until I started that work, if you looked for BPF documentation, you'd find something in the kernel that was heavily intermixed with classic BPF and packet filters, plus a random GitHub page that hadn't been updated for years but actually had a somewhat better, though very different, piece of documentation. This was around the time the eBPF Foundation came about, and other people involved in standards approached them and tried to figure out whether we could get a standard and how all that official stuff works, but nothing ever got done. So what I've been trying to do is get the kernel documentation into a shape where it's actually a useful reference, and I think we've made good progress on it. There are a couple of things the documentation is entirely silent about, and one of the interesting ones is the split between the verifier and the actual instruction set: the verifier ensures that a lot of things that would trip exceptions in a classic instruction set just don't happen. Now, a lot of people using eBPF outside the kernel never got that verifier memo and are trying to run eBPF code that isn't verified at all. That's another issue, and one I've been in a few fights over. For now I've been trying to document the instruction set, and I wonder how we can make it more or less official. And one thing that comes with standards is that we need some versioning.
There's a bunch of things in eBPF that weren't there from the beginning and got added later, and how do you version that? Do you version it by the kernel release an instruction first appeared in, so people can look at that for their backports and external projects? Can we assign a version number? Can we assign names for extensions, so people can say what they're implementing? The only versioning we've had comes from the compiler, specifically Clang, which made up instruction-set versions on its own, and those are referenced by the existing kernel documentation in a, let's say, somewhat messy way. So I've been trying to get a feeling from the community: what do you all think about putting a version number or extension names on this, and how should we handle the whole thing? And also, what should we do about exceptional cases, whether they're handled by the verifier in the kernel or by something else? Can we define what happens on overflow, divide by zero and so on, or just say "don't do it", which works pretty well with a verifier but not for everyone else? That's just what I wanted to kick off.

Yep, excellent questions. As far as version numbers go, you're right: we have v1, v2, v3, and only because that's how Clang does it. We've never reduced or removed any instructions so far, though in reality that's a possibility. Do we need different names for them? I don't know. We'll probably just continue with v1, v2, v3 as we add new instructions, so I don't see why we would name them differently; how would giving them names help?

So you'd be fine with basically taking over the Clang versioning (everyone says "C-lang", I always tend to say Clang), bringing it into the kernel documentation, and maybe saying which kernel version first supported each one?

Yeah, yep, that would be good.
And the other interesting thing in that context is these magic packet access instructions brought over from classic BPF, and obviously any environment outside the Linux kernel doesn't have all that much interest in them.

Right, well, we can't remove them. Technically they used to run directly; now they're just pseudo-instructions and they don't appear in JITs. In the past, JITs knew about these special instructions and did crazy stuff with them; now they're just meta-instructions: when the verifier sees them, it converts them into function calls. So at the JIT level, they don't exist.

Yeah, so one thing I was thinking about was moving them into a separate document, removing them from the main reference, and just declaring the opcodes reserved for that extension, or something like that.

That's actually good, yeah. It's probably worth describing these meta-instructions, which aren't really instructions, as special, because they don't have a one-to-one, or even one-to-many, mapping onto any normal CPU. So it's definitely worth somehow explaining that they're obsolete to some degree. And I don't think anyone is really using them anymore. Even to use them, they either had to be written in assembly or there were special intrinsics, which are still in Clang; only then would Clang emit them.

Yeah, so I guess what I took out of this: versioning is good, you like it. What can we do about publishing a more or less official language spec that's not just the source code of the kernel? That's a question for experts on standardization committees. Oh, Paul... well, Paul just disappeared. I asked Paul once what he thinks about standardizing, and his answer was: well, what standardization body do we want to go to?
Yeah, and the question is also how heavyweight the whole standardization process has to be. I don't have a good answer, because I have no experience doing standards.

There's an opinion in the back.

Yeah, so I've been doing IETF since about 1994. I don't think you can go to the IETF, because this isn't a protocol. The IETF doesn't do instruction sets or APIs and such things; it leaves that to language-specific bodies. The IETF does do abstract APIs: for example, if you're familiar with security, there's GSS-API, for crypto libraries and things like that. But it didn't do the concrete binding into any particular language like C and so on; it just says you've got to have the following arguments in some syntax or whatever. That's really all you could do in the IETF, I think. So just documentation, right? There are BPF docs up in the IOvisor bpf-docs repository, and maybe we can move that under the eBPF Foundation, so that's one possibility, because it's just documentation. If you wanted to go to some standards body, then the real question I would have is: do you care about any particular programming-language binding, or are you just trying to document the instruction set? You get different answers for where to go for those things.

I think the really important bit is the instruction set, right? When people build hardware devices that interpret it, it's really useful, not to say basically necessary, to have a valid specification that you can implement and test against. Whether that's called an official standard, an industry specification, or just a very reliable website probably doesn't matter too much, as long as it actually is reliable documentation.

Yeah, so for that, I'll get to ISO in just a second, but probably the shortest path is to host it under the eBPF Foundation; that's my opinion.
For ISO, it is very common that a spec is first published under another organization and then put into ISO to be ratified, okay? And then it becomes an international standard, right? So if there's a reason to do that, like if you need to show up in government RFPs and things like that, and there's a reason to get an ISO number assigned, then the most typical way, and probably the appropriate one for us, is to have something like an eBPF Foundation published specification that's stable, with a reference and a version number and so on, and then submit that to ISO for ratification. I'm familiar with at least three other organizations that do it that way. You can host stuff directly in an ISO working group, but given that you've already got a community of people in the eBPF Foundation, I don't think trying to create an ISO group to do that is the fastest path. The fastest path is: create a document, publish it under the eBPF Foundation, or just on a website, as long as it's a stable, referenceable URL with a version number, and it won't change because you've snapshotted it as a PDF or whatever, right? Then you submit that to ISO for ratification, and that can work, but it's really only useful to go that route if you think you need an ISO number, like if you need to show up in government RFPs; then it's worth it. If you don't think you need that, it's not worth it.

Right, I think the bar is much lower right now than that.

Right, right. So that's why, right now, the simplest step is to take the repository that's currently on IOvisor, the bpf-docs there, move it under the eBPF Foundation, and have it be maintained by the eBPF Foundation. That's what I would propose.

And yeah, my impression was always that the plan would be that the kernel is kind of the first source, and we generate documentation out of it rather than lifting it out entirely. I personally don't care which way it goes.
I was just thinking about a process that allows for it.

Yeah, I would agree that the kernel probably is the source of truth as far as documents go. We just need to have sanitized snapshots that people can point to.

Yeah, I mean, the point you made earlier is that a lot of people don't do stuff in the Linux kernel, right? It's difficult for them to look at the Linux kernel as the authoritative source. So, for example, the fact that the ISA at least has a document in the IOvisor bpf-docs repository right now has been very useful for people doing other runtimes; some people have user-space runtimes.

It's pretty incomplete, though.

I know it is, right? But it's better than nothing. And for some people it's better than having to look at Linux kernel source code, because it's more accessible.

But the kernel docs, I thought, get automatically published somewhere, not just in the kernel git, right? Once they go through whatever the docs-generation machinery is.

Yeah, I mean, the kernel docs that I've been working on, which grew out of that initial one, are basically a forest of markup files, and that lets you generate things like PDFs or HTML documents, which is a pretty readable format. But as I said, we want to be able to incorporate this by reference into other standards. In theory we could say "the Linux 6.0 version of this document", but that's not very useful. What would be really good is to generate it from a tree, maybe actually a slightly patched tree that stamps a version number in, and then publish it with the eBPF Foundation or whomever, so you will always find a stable version of this document at this place. And once in a while we might do a new one.

Makes sense. And I have to say, I've worked with the IETF a little bit, not anywhere near as much as you did, and I really like the process with the proposed standards and everything, but...
Yeah, it's got a great process. It's just that its scope does not currently extend to APIs or even instruction sets, right? So it'd be a hard sell to try to get the IETF to do it. I'm trying to think which area you'd even take it to, and all I can think of is probably the apps area; it would go to the dispatch working group, and they would have to decide what to do with it. So I'm just trying to branch-predict what would happen in the dispatch working group.

I mean, the IETF has some weird corners where things work differently, say the NFS working group, which is what I was active in. If they actually wanted to do this, I'm pretty sure they could make a weird case work, but I don't know if they would.

It's really easy when there's actually a protocol that goes across a network, as with NFS, right? But for this, there's nothing that goes across a network per se, and so the IETF people would ask: does it really belong in the IETF, or is it better in some other organization? And yes, it's got a great process and culture and so on, no question there.

So how do we replicate that in some other forum? I do like the eBPF Foundation document idea best.

Just a reminder: kernel.org/doc has all the docs from the git tree, all the time. So it is a fixed location with a constant URL that's always up to date.

Yeah, I mean, one thing is to always have the latest one; the other is to really have a repository of the stable snapshots that people built their hardware or software or whatever against.

Yeah, no, I agree.

So I have one question. Let's say we put this under the eBPF Foundation as an official document. One thing is to describe the individual instructions. Should there also be a description that is maybe more verifier-specific, for example the division by zero and all the stuff the verifier would prevent, or should that not live there?
I think that's a very good question, and I don't have all the answers to it, right? Because, as I said, I've been dealing with non-verifier environments that I don't like all that much. So my approach would have been to take one step back and say: this is what is considered invalid in BPF; in a verifier environment, the verifier would catch it, and in a non-verifier environment, well, that's the next question. Are we going to have a standard for all the non-verifier environments, because there are a lot, or do we leave it to each individual usage? But I think it would already be very useful to say what the verifier shall do for a verified environment, because right now there is not a whole lot of verifier documentation, and it's also very Linux-specific, in that it combines generic verification of the instruction set with verification specific to program types.

I guess I don't have a strong opinion, but you talked about the case where you're using the verifier and the case where you're not. I'm personally less interested in the no-verifier case, but I agree with your argument, because there's not just one verifier, right? For these other environments there is another verifier used by other runtimes, PREVAIL, and there could be third ones in the future. So if there was a spec that says any verifier had better be doing the following things if you want to ensure safety, here's kind of the minimum bar, that would probably be a useful piece of documentation. But again, to your point, it would have to avoid things specific to any given program type. You can state program-type-generic things, like: if you're going to return a pointer, you'd better ensure this; if you're going to pass in a pointer, then here are the things you have to do, according to whatever the prototype is, right? And that, I think, is useful, right?
...since I deal with at least two verifiers right now.

Yeah, and in the standardization project I'm working on, people got really upset about some of the tighter restrictions, like the whole loops thing; that was the issue there, and that's why some people decided they don't want the verifier. It didn't help that the other prototype, which was mine, was running uBPF, which didn't really have a usable verifier at that time anyway. One thing I had briefly suggested, before we moved on to other things entirely, is that we do a minimal verifier that does only what you really need for a valid instruction stream, because the instruction set has no exceptions. Then we can have a separate discussion about whether we actually want any validation above that, or whether we have a runtime exception model for things that are application-specific.

Division by zero is a very good example of something you have to make some choice about, right?

Yeah, so divide by zero is an example where we've changed our minds. First we just borrowed from classic BPF that divide by zero should terminate the program, and that turned out to be bad for all sorts of reasons: the program just exits all of a sudden and state is left incomplete. That is more dangerous than doing it the arm64 way. So then we switched divide by zero to do what our CPUs do, which is just return zero. And for the modulo operation, it leaves the dividend in the destination unchanged.

Yeah, so that would be the next step, and I guess I'll just stay in contact with all of you about what we can do intelligently there. I wanted to bring up a third topic if you're done with the verifier, but if you have more verifier stuff...

Nope, I don't. Maybe anyone else?
So, part of the discussion I was leading in the BPF track, for anybody who wasn't in there, was about making certain things cross-platform, like libbpf and bpftool, given that there are other runtimes coming along and BPF isn't just going to be only for Linux, right? It's broader than that. And so the question came up about things like: is there a spec for BTF, right, if you want other runtimes and tools to support it? Right now the only BTF spec I've ever seen is the one in the Linux kernel, which is, you know, GPL. So is there anything we can do around an actual BTF spec or something there?

There is a document, and it's not GPL. There is a btf.rst that describes all of the format, and the documentation has a different license, to my understanding.

So maybe we can collect them all in one place then and post them on the foundation or something.

Yeah, that'd be awesome. Cool, I like it. Okay, thanks everyone.