I think we can start. This is Yonghong from Meta. I will discuss two topics in the rest of the session. Some of this is work I have actually done, and some of it is imagination for now. Let's just continue.

The first topic is comprehensive kernel function support, and you will see what I mean. Today there are a few ways for a BPF program to use kernel-internal functions. For example, with struct_ops you can call some congestion control functions in the kernel. But the most popular way is BPF helpers: inside a helper you can call kernel functions. Helpers do not scale, though. You cannot create a helper for every kernel function. So earlier this year the BTF kfunc ID set was introduced. This is a bunch of code I just copied from the kernel source. You have kfunc types, which say what kind of kernel function this is, and then you define a set of functions. Each set is associated with a kfunc type and a program type. The particular call here registers a set: you give a program type and an ID set. The ID set is a bunch of functions classified by kfunc type. In this way, each program type has a set of kernel functions it can call directly. In the BPF program you just declare the kernel function prototype, marked with __ksym, and then you can call the function directly. So this is a big improvement for calling kernel functions from BPF programs.

But what if we want to use more kernel functions in BPF programs? As I mentioned, currently a kernel function is classified only by BPF program type and kfunc type.
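As a rough illustration of the registration scheme described above, here is a toy model in Python. It is only a sketch of the idea (a per-program-type set of callable kernel functions, classified by kfunc type), not the kernel's implementation: the kernel stores BTF IDs rather than names, and every identifier below, including `example_lookup` and `example_release`, is hypothetical.

```python
from enum import Enum, auto


class ProgType(Enum):
    """A few BPF program types; names chosen for illustration."""
    SCHED_CLS = auto()
    STRUCT_OPS = auto()
    TRACING = auto()


class KfuncType(Enum):
    """Categories a kernel function can be registered under (illustrative)."""
    CHECK = auto()    # the function may be called at all
    ACQUIRE = auto()  # the function returns a reference that must be released
    RELEASE = auto()  # the function releases such a reference


# Maps (program type, kfunc type) to a set of function names.
# The kernel works with BTF IDs; plain names keep this model readable.
_registry = {}


def register_kfunc_set(prog_type, kfunc_type, funcs):
    """Add funcs to the set registered for this program type and kfunc type."""
    _registry.setdefault((prog_type, kfunc_type), set()).update(funcs)


def kfunc_allowed(prog_type, kfunc_type, func):
    """Return True if func was registered for this program type and kfunc type."""
    return func in _registry.get((prog_type, kfunc_type), set())


# Hypothetical function names, registered for one program type only.
register_kfunc_set(ProgType.SCHED_CLS, KfuncType.CHECK,
                   {"example_lookup", "example_release"})
register_kfunc_set(ProgType.SCHED_CLS, KfuncType.RELEASE, {"example_release"})
```

In this model a verifier-like check is a single set lookup: a call is accepted only if the function was registered for the calling program's type, and the kfunc type tells the checker what extra bookkeeping (such as reference tracking) the call needs.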
If we want to use many more kernel functions in BPF programs, we probably need to classify kernel functions with more information. Here are a couple of random examples I found this morning. Take proc_cgroup_show: say I want to call this kernel function, and suppose all the arguments are resolved. The function takes a mutex lock, which basically means it could sleep. So we probably want to record this information and attach it to the function, so that at verification time we know something could go wrong if the caller is not a sleepable program. Another example is insert_inode_hash. Here, again, are two functions, and inside there are a bunch of spin locks as well. You need to be careful about the context these are called in, and inside a lock you probably cannot call them at all. There are a lot of EXPORT_SYMBOL functions, and other functions, that could be enhanced with additional information and then be used by BPF programs.

So how do we do this? I implemented the BTF declaration tag last year, and this is one of the purposes of the tag, although we have not implemented anything with it in the kernel yet. The BTF declaration tag attaches information to functions, and it is encoded in the vmlinux BTF. For example, for proc_cgroup_show you can have an attribute here, a declaration tag with the string "mutex_lock:cgroup_mutex". So you encode this information in the BTF associated with the function. Then, when the kernel parses the vmlinux BTF at load time, it gets this information. Later on, if the function is called by a BPF program, the verifier can reason about the function's properties: what kind of lock it holds, or what its preconditions and postconditions are.

Yonghong, I have a question.

Sure.

By the way, this is awesome work.
I hope this comes true someday, and then we'll be able to call all kinds of functions. A question regarding the mutex_lock declaration you have there: it says cgroup_mutex. Does it mean the function acquires cgroup_mutex and releases it, or is it an imbalanced lock? Or does it mean...?

Yeah, exactly. Here I just wrote mutex_lock. You can have finer granularity of information. This particular function has both the mutex lock and the unlock, so you could say, OK, mutex_lock "complete" or "full", or something like that. The attribute itself just takes a string, so I am pretty sure the kernel needs to define some kind of simple language for how this information is encoded. For imbalanced locks, the kernel already has __acquires and __releases and all that stuff. We are going to convert that part to the declaration tag, and it will be encoded in the BTF. The same thing goes for the insert_inode_hash case.

The BTF declaration tag does not apply only to functions. It can apply to structures, structure members, global variables, and function parameters, so there is a lot of latitude. Later on I will give an example of why attaching it to structure members is also useful.

This is the type tag. The type tag is used to annotate types. The most popular example is the few pointer annotations we have, which were originally defined with address_space and are only visible to sparse. For example, you have an __rcu pointer, and sparse will do some checking. Humans also check it visually, although it has no effect on the compiled code. But we could make __rcu a BTF type tag, and that type tag will be encoded in BTF and carried through to the vmlinux BTF. Currently the kernel supports the BTF type tag for __user and __percpu.
They are actually used by the verifier. Suppose you have a __user pointer and you try a direct memory access: the verifier will flag that as illegal. And __percpu is useful for detecting that a particular structure member or global variable is a per-CPU pointer, so that you can then use the appropriate per-CPU helpers.

Another interesting use case is printing opaque kernel data. Why is that useful? For example, we have the task struct, and it has this bpf_local_storage __rcu *bpf_storage member. If you go to bpf_local_storage, it has this list, and the list is opaque. Suppose you use the current kernel helper bpf_snprintf_btf, whose intention is to print the whole hierarchy of data rooted at a task struct, or at any particular BTF type. It will stop here, because it does not know what type the elements are. It is just a list with a bunch of elements. Who knows? So it stops and just prints a bunch of pointers. But we actually have a comment here saying that this is a list of bpf_local_storage_elem. If we knew that, we could just continue printing downward, right? That's cool. This is exactly where the declaration tag information could be used. Previously we had only the comment, but we could annotate this member with a declaration tag, like bpf_local_storage_elem. It may need additional information to traverse the list, but this is the basic information. You encode bpf_local_storage_elem here, vmlinux will carry it, and bpf_snprintf_btf can be enhanced to print the actual storage contents: the hlist and its detailed contents, and further down the bpf_local_storage_elem contents.

So I think this is the first part. Any comments, discussion, any new use cases?
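To make the function-level tag from the first part concrete, here is a small Python sketch of how a "category:name" tag string such as "mutex_lock:cgroup_mutex" could be parsed and checked against a program's sleepable property. The talk notes that no tag language has been defined yet, so the format, the rule that a mutex_lock tag implies the function may sleep, and the function `example_no_lock` are all assumptions made for illustration.

```python
from typing import NamedTuple


class DeclTag(NamedTuple):
    category: str  # what kind of fact is recorded, e.g. "mutex_lock"
    name: str      # the object it refers to, e.g. "cgroup_mutex"


def parse_decl_tag(text):
    """Split a 'category:name' string into a DeclTag; reject other shapes."""
    category, sep, name = text.partition(":")
    category, name = category.strip(), name.strip()
    if not sep or not category or not name:
        raise ValueError("malformed tag: %r" % (text,))
    return DeclTag(category, name)


# Assumption of this sketch: a function tagged with one of these
# categories may sleep.
MAY_SLEEP_CATEGORIES = {"mutex_lock"}

# Tags as they might be read back from vmlinux BTF, keyed by function name.
# "example_no_lock" is a hypothetical function with no tags.
FUNC_TAGS = {
    "proc_cgroup_show": ["mutex_lock:cgroup_mutex"],
    "example_no_lock": [],
}


def check_call(func, prog_sleepable):
    """Return None if the call is acceptable, otherwise a rejection reason."""
    if func not in FUNC_TAGS:
        return "%s: unknown function" % func
    for text in FUNC_TAGS[func]:
        tag = parse_decl_tag(text)
        if tag.category in MAY_SLEEP_CATEGORIES and not prog_sleepable:
            return "%s may sleep (%s on %s) but the program is not sleepable" % (
                func, tag.category, tag.name)
    return None
```

For example, `check_call("proc_cgroup_show", prog_sleepable=False)` returns a rejection reason, while the same call from a sleepable program returns `None`. A real design would need more categories (spin locks, imbalanced acquire and release) and a richer notion of program context.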
So I guess there will be a lot of work figuring out all these locations in the kernel and tagging them. Do you expect any pushback from people saying, look, don't add these things to my code, I don't like it?

It's possible. We have to define things first. That's why we haven't done anything with this declaration tag yet. We have to define some kind of language, because if you tag many things you need a consistent mini-language for what the tag looks like and what information is encoded in the string. It's just like a tracepoint: you have a provider, you have a category, and then you have a final name. So we may need something similar. I agree it will take some effort. But the good thing is that this information is attached to the data structure or function definition itself.

So, like a comment, the goal is to make it easy. If people understand the semantics of these tag declarations, then when a declaration changes they should just change the tag on the spot, right on the declaration itself.

Yeah, so I have a follow-up question. What we have right now is the func ID and type ID stuff. What happens pretty regularly is that someone renames a function and we don't notice. We get some warning at build time that no one notices, and all that stuff. The same thing will happen here. If someone renames bpf_local_storage_elem, we won't know until someone reports it as a bug. Do you have any ideas how we can catch this at compilation time and fail the compilation with a meaningful message?

In this particular case, yes, the tag is intended to encode, again, some functions, and as you mentioned the function name could change, right? And then the kernel calls that function. I don't have a good way to do that, to be honest. Unless we have something additional, just like the BTF ID set, right?
You have some of them there, and otherwise you get a zero in that variable. But this, yeah...

Well, I mean, we can do something like what objtool does, right? We have resolve_btfids. We can probably generalize it into a BPF resolve-whatever tool, and that tool can check all of this. We'll need to teach it all the semantics, obviously. But it's probably a better way forward, because otherwise we'll just have this big bit rot, essentially. Some of those annotations will be used very rarely, in some particular applications, and we'll just not notice in time. So I guess my point is that it is probably very important to think through how to prevent degradation of those tags.

I haven't designed that, but typically the best thing is to not rely on the function name. The verifier shouldn't really check the function name. It should check what the function does, and map the function to a name known to BPF. That way you will be OK: if a function is renamed, the verifier only checks that there is some name encoding the semantics of this function, and that the function exists, and that will be good. So it's indirect: you don't check the function name, you check a particular stable name in the verifier. A function rename is resolved in that case. If the function is gone, that's a different story, but a rename through this indirection will probably be OK, I would say. But I don't know. We'll have to actually design it.

Sorry, actually, I'm from ByteDance. At ByteDance we have a similar requirement. Many times we need some new BPF helper and it's not there. But we take a different approach, so I think this is very interesting. At ByteDance we do a generic helper.
So basically, we just register a couple of dummy functions, so you can write a dummy helper. Then the application calls that dummy helper, and we add a new kernel module to really implement what we want to do in the kernel. So the kernel module provides the real functions, and the generic BPF helper just provides a route for the user-level program to call them. This way it's pretty flexible with respect to function renames and other changes, so it is easier to resolve. I don't know if you have a similar approach, or a more direct approach, in mind.

Yeah, I didn't really investigate the kernel module part. But I looked at it, and it looks possible to make quite a few kernel module functions available to BPF programs, because in many cases they are a little simpler, and their functionality, preconditions and postconditions are limited.

No, I mean, I don't need to convert. We just call into them.

You just call them?

Yeah. From BPF you cannot call directly, right? So we write the kernel module, call that function, and then come back to the BPF program. So you don't need to convert anything. I mean, it's kind of a shortcut.

Yeah, but the goal is that BPF itself does not really go through a kernel module.

So the generic helper will call that function in the kernel module. We say how many arguments you pass in. So basically the user-level program and the kernel module have to agree on what they are calling. One part is the function name. The other is the arguments they provide. They have to be consistent, and if there is an inconsistency, they'll have some issues.
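The generic-helper arrangement described above can be modelled in a few lines of Python. This is only a guess at the shape of the idea from the description in the discussion, not ByteDance's code: a module registers real functions under a name and an argument count, and one generic entry point dispatches to them after checking that caller and module agree. All names here are hypothetical.

```python
import errno

# name -> (expected argument count, implementation), filled in by the "module".
_module_funcs = {}


def module_register(name, nargs, fn):
    """Called by the (modelled) kernel module to expose a real function."""
    _module_funcs[name] = (nargs, fn)


def generic_helper(name, *args):
    """Single generic entry point that routes a call to a module function.

    Returns a negative errno value when the caller and the module disagree
    on the function name or on the number of arguments.
    """
    entry = _module_funcs.get(name)
    if entry is None:
        return -errno.ENOENT
    nargs, fn = entry
    if len(args) != nargs:
        return -errno.EINVAL
    return fn(*args)


# The module side provides the real implementation.
module_register("example_add", 2, lambda a, b: a + b)
```

The sketch shows why the approach tolerates kernel-side renames (the caller only knows the registered name) and also where it is fragile: the only contract between the two sides is the name and the argument count, which is the consistency requirement raised in the discussion.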
So I want to go back to Andrii's question and ask if there's a different way to accomplish this. I don't know the bpf_snprintf_btf internals, right? But the problem you pointed out, Andrii, arises because the local storage element name is in a string, not something the compiler recognizes, because it's inside a string literal. And here you have to modify the code anyway to annotate the structure in some way, right?

Yes.

Suppose that, where you have the hlist_head, the structure instead had a union between the hlist_head and something strongly typed using bpf_local_storage_elem. You could still access it through the hlist_head, but the union would use the actual type name, so you'd get a compiler error if you did the rename. So that would annotate it, and maybe bpf_snprintf_btf would even expand the contents because the type is used in the union. So that's my question.

That's actually the initial idea I tried, but it doesn't really work. The reason is that with the hlist you go to the list element itself, and it needs a list node. If that structure has multiple list nodes, you don't know which one, and then you cannot really recover it.

Which one has multiple?

So basically, you have a list element, right? Within the list element you need to have a list node again, and you could have more than one. Then you don't know which one to use to recover the element.

Which list head in the embedded structure?

Exactly. Suppose the same data structure, like bpf_local_storage_elem, is involved in two lists. You don't know which one. You need additional information.

So you're talking about the case, if I understand right, where one of those list heads is the next and previous pointers, or whatever, in the list being referred to here, and there is another one that is the beginning of a list of something unrelated.

Exactly.
And so how do you walk the list?

Yeah. And also, we think this is ugly, I mean. Because this is a list, and you have a union with bpf_local_storage_elem, but they are not really a union.

I get your answer, but it's perhaps not insurmountable. It's just cumbersome, right? Because you can still do it.

It's not insurmountable, I'm assuming. Yeah, I mean, you can do it.

The same idea: if you were to use a union here, you could use a union in both of the other two cases inside the list element, right? In one case you'd have a union with other list elements. And the second member, or the first member, depending on the order, would be unioned with some other type there.

Yeah, yeah.

And so you could still derive the right information and still be compile-time type-safe, right? It does mean that you have to expand every use of a generic structure into a union to get that. So maybe that's not desirable in all cases, but it does solve Andrii's problem.

Yeah, yeah. If you have a union here and there, it may work, I mean.

It might work, but it will be NACKed as too ugly. We can do it for bpf_local_storage, maybe. But if you go and do it in task_struct, will it work? No. So I don't think so.

But you are right. That's a hack, and I tried it early on. It should work, but it's not pleasant. I mean, from the code perspective, you are right.

There's one other problem that we sometimes have when dereferencing stuff in BPF: there are fields called private data, which are void-star pointers.

Yes, yes.

Do you think we can annotate that? The type is decided at runtime, based on who is the owner of this dentry, or who is the owner of this fs_context thing.

That's why I mentioned that we can have some kind of language. If you have a limited set of choices, it's possible. I mean, if it's not arbitrary. I don't know how to encode that, so we have to study case by case what kinds of things need to be encoded.
You are correct. A lot of it is like struct file: it's private data, right? And you cannot really define that thing on the private pointer, because it's a generic data structure and there are too many things there.

Exactly. You have to have the people who do the conversion do it at, basically, the local site. Generally there is some information in the struct to decipher it, right?

Yes.

So maybe we could have a map of some sort that says this field represents the type of the dentry...

Yes.

...or the type of fs_context, whether it's FUSE or a legacy one or whatever, and then based on the value of that field the type can be resolved to a narrower type or something.

It could be encoded in the decl tag or type tag as well, I guess.

Yeah, something like that. You provide a little more high-level information to the attribute, a little higher-level than just "private". Give a little more context. Yeah, it's possible.

I had a question about kfuncs that acquire a lock internally. How do you verify or enforce that, by the time the program ends, the lock is released, or that we aren't trespassing on any safety properties?

So you are talking about the verifier side, right?

Yeah. I think a couple of slides ago you said something about how you can tag a function as acquiring a mutex within it.

Yeah. I didn't really do any study yet, but the idea is that we have program types, and we potentially know the location where the program will run. We should know that in most cases: a kprobe, or a kernel function, or some program. For this location we may annotate the relevant restrictions, and then we match them against the kernel function, which must satisfy the constraints of that location. For example, you have a tracing function, and that function runs in, for example, NMI context, right? In that case we don't want any spin lock.
And then if you call a kernel function that takes a spin lock, you have problems. So something along those lines. Verification could be a bit more complex: you need both the program context, where the program runs, and a lot of information about the function itself, to satisfy the verifier.

Yonghong, could you go back a couple of slides?

Sure.

So I think... yeah, even more. I think you're answering how this mutex lock, whether it was acquired, will be verified, right? But I think Johan was asking about something even earlier. Can you go back even more? Yeah. Are you asking about this kfunc type acquire stuff?

Yeah, I think I was asking more about the other slide, where you have a function that can acquire some mutex, and then about enforcing that the mutex is released by the time the program ends. Or, if you're calling a kernel function, there's an if branch where, if some condition is met, you acquire the mutex, and otherwise you don't. Just making sure that everything is released by the time the program ends.

You grab the lock, and then there's another function call that makes it grab it again.

Yeah.

It takes quite a bit. Yeah, with locks, I don't think we can.

So I don't think it will be possible for locks to still be held after the program ends, because that's what these kernel annotations, this __acquires, are doing. Otherwise it is only potentially doable through lockdep: if we do something like lockdep inside BPF and do all of this lock checking dynamically. But that's an even bigger story.

Apart from the kfunc stuff, right, you can annotate the kernel data structure members, saying that this member needs a mutex, and then in BPF you could provide primitives to acquire that mutex in a sleepable program, so that you can access the data structure members safely.
So the kernel function story is a bit more complicated, because there are a bunch of branches and it's too hard to do that. But the data structure stuff could still work and be pretty useful, actually.

Yeah, data structures may be a little easier, but kernel functions, yeah, that's really hard. I don't know what kind of language you would use at this point, or how to do that, and the BPF signature. Yeah, that's what I said: this is just the initial baby step, I mean. So how to really use it...

My other concern would be that it seems like it'd be fairly trivial to deadlock yourself in BPF now and just lock the kernel up.

But if necessary, we could easily convert the existing attributes, like acquire and release, to the BTF tag. That's straightforward.

I'm not too pessimistic about this, right? If you have a function that takes a mutex inside, acquiring and releasing it, you could mark it with a simple annotation, and then use that information to check whether the function should be used in a sleepable program. That is very valid. The imbalanced locks, yeah...

Where this will get ugly, I think, is if somebody comes along and changes that function, adds a mutex to it or removes the mutex. Then they would have to, I guess, also update the annotation. But your program that was calling it would now need to be recompiled, or rethought, re-verified and refactored. But I mean, I don't know. This is on the edge, right? Maybe there's tons of stuff in there where that doesn't apply and it's super useful to have. I just think that, as a generic problem, it's quite hard.

The program would need to be re-verified.

But refactored too. Your program could now become wrong because the annotation changed.
Like, if you thought there was a mutex release and somebody refactored the kernel code to pull the mutex out of that function, now your BPF program is wrong and will presumably not pass the verifier.

Exactly, this is what we expect, right? The maximum damage should be taken by the BPF program rather than the kernel, so the BPF program should be rejected. We expect verifier rejections, so we keep multiple versions of the program and chain-load them in reverse chronological order.

But this is the same for helpers, right? A kernel helper calls some kernel function. Currently the ones we call are simple, but if you call a complex one, it could change. And then, I don't know. So it's the same.

Yeah. Thank you.

So is the goal here to have perfect verification, so that it's safe 100% of the time to call this kernel function, or is the goal just to be safer than writing your own kernel module? Because I think the latter is what I would target. I think that's absolutely, realistically possible, and then we can peel away and add safety as we go. This is fantastic, because I think the end result is still a lot better than making your own out-of-tree modules.

Yeah, we can try kernel module functions. That's also my first target, I mean: try to implement kernel-module-style things.

I would also agree that the goal should be the latter. If we claim perfect verification, people are going to challenge that heavily, right? So the goal is to improve safety: if you want to implement kernel-module-like logic, here is BPF providing the same sort of functionality. And it is better to do it that way, because you're less likely to make a mistake, since the verifier is going to try to catch most of these issues. And then as the ecosystem develops, it gets safer and safer. If we say that this is going to be foolproof all the time...
The other thing, I think, about the goal of this is that it could make BPF a hook for kernel modules to grow their features into the kernel, for example at a tracepoint, that kind of thing. That way the kernel module doesn't need to change the kernel tree. It can use BPF to grow its features into the kernel itself and change part of the behavior of the kernel.

Yeah, I don't know your question. I don't know what exactly you are expecting. Kernel modules are just one case here. We target any kernel function, not just kernel module functions. So I'm not sure exactly what your question is.

I mean, we just mentioned that the purpose of this is either to make, for example, kernel modules safer, or to give BPF programs more features. The other way to look at this proposal is that it gives a kernel module the ability to change the behavior of the kernel. Right now, if you want to hook a kernel module at some tracepoint, for example, it may be difficult. With BPF we have another way to do that: the kernel module can use BPF to hook at some tracepoint or some other hook, and the callback into the kernel module can implement the feature.

That's possible too. A kernel module could use BPF. Actually, I would say both are possible: BPF programs can call into kernel modules, and kernel modules can use BPF. But I think that's probably not recommended, I would say.

From the user journey's perspective, I'm more interested in moving away from kernel modules and moving them over to BPF. This is what we did with our security telemetry stuff when we were building BPF LSM. The more functionality you provide to do that safely, the more benefits it has.
Yeah, the reason I mentioned kernel modules is the exported symbols, because they are fairly standalone, without a lot of inter-function relationships. OK, it seems there are no more questions, so we go to the next one. The next one is more about imagination: making writing BPF programs more pleasant. The first item is more abstraction at the source level.

One more use case that I wanted to mention, since you didn't mention it: not instead of, but in addition to this, another application of the BTF tag is to tag BPF code itself. Right now we have static subprogs and global subprogs. Global subprogs have a lot of advantages in terms of limiting the complexity of the code the BPF verifier has to verify. But they have limitations compared to static functions, because the verifier doesn't know the limits of integer arguments. In a static function the verifier might know that argument one always has a value from zero to a hundred. We lose that in global functions. So I think this could be a great mechanism to let a BPF developer give the verifier hints about restrictions on global function input arguments, output arguments, stuff like that. We've done some of that based on types, in some particular cases, where we allow a first-level dereference of memory and all that stuff. But tagging could be the answer to feature parity between static and global subprograms. Do you have anything like that in mind?

It's being recorded.

OK. The good thing is also that we need that only in Clang, right? So we can use it from day one, basically, for BPF programs.

I would say it's possible. We can annotate more detailed information, preconditions and postconditions, and the verifier can try to verify based on these assumptions. That's maybe a little easier. And if an assumption doesn't hold, you just reject.
So instead of exploring all possibilities, you check against the assumptions, and maybe that's a good way to improve verification speed, I mean. Yeah. And maybe there are other benefits too. OK.

Let's go on to see how we can write interesting BPF programs. The first idea is to express parallel loops explicitly. So it's something like "for auto item in objects", and you do a series of things with the item. The objects can be collections, or map arrays, or dynamically allocated arrays, or some other stuff. The reason we want this explicit loop is that it shows there is no dependence between loop iterations. In this case we could have an explicit, special BPF instruction encoding to pass this to the verifier, so the verifier can maybe do a little bit of a trick to shorten verification time.

There's really no need to... We tried this before, right? Last time we tried this, we tried to put the whole control flow graph into the verifier and then verify the control flow graph.

Yeah.

Sorry, maybe you have a next slide. I'm sorry if I jumped in too early. I'm just interested in this in general because we tried it once before. It's nice to write this at the C level, but then the question is: what does the LLVM IR look like? And then what are the actual BPF instructions? How do you verify that and ensure it at the block level, you know? It feels to me like you need to go into LLVM IR and do an intrinsic or something to make this happen correctly.

That's a good question. I just came up with this slide this morning.

I just threw out a whole bunch of questions. So a long, long time ago we tried to do this with an intrinsic.

Yeah.

That was just a BPF backend intrinsic. It went all the way through the compiler, and then the compiler could just say, I guarantee there are no jumps in here.
But the problem was that I never figured out how to get the verifier to check it without building the full CFG in the verifier, which we did, and then we ripped it out. It was kind of hard.

Yeah.

But anyway, it's cool. I wish we had an answer.

Yeah. It's a high-level construct. We may have some special IR and special instructions, so we can pass this information to the verifier. The verifier can then assume it, and if anything doesn't match, it can reject. Something like that.

It's tricky, though. Maybe the verification people know how, but I never figured out how to get the verifier to, I guess, verify that the control flow claim is true without a full control flow graph.

Yes, I agree. That's really hard. That's why this is just a proposal, and we can work on it.

I got excited and jumped in.

OK. So the next thing I want to discuss is macros. I spent a bit of time dealing with these. Currently, tracing programs use vmlinux.h, and networking programs typically include a bunch of UAPI headers, because they do not do tracing. So they get all these definitions, mostly macros, for their BPF programs. In the future, when networking programs gain certain tracing capabilities, they may need vmlinux.h for CO-RE relocation purposes. In that case, if they include vmlinux.h first and then include the other headers, they will very likely get redefinition errors, because a type defined in one is defined again later in the other. So we have a problem here.

What are the possible solutions? I have been working on an LLVM BPF attribute that accepts identical definitions. This attribute tries to ignore identical type definitions, for structures, for typedefs, for all these things.
In this particular example, if you have struct s { int a; } defined twice and the first definition has the attribute, the second one will be ignored instead of causing a redefinition error. A possible future extension could handle CO-RE-based type definitions: basically, if two types are not identical but mostly the same, we may still be able to proceed and ignore the later one. So in this case, hopefully we can resolve the vmlinux.h plus other .h files issue. But for macros, there is actually an alternative solution. We tried it earlier, but we didn't proceed. We can encode the macros into DWARF with additional compiler options, and we can convert these macros into a vmlinux.h with macros. Then you don't need to include all these other .h files; everything is just vmlinux.h from the BPF perspective. The downside is a possible space overhead in the ELF file of around five megs, because the kernel actually has a lot of macros. Just consider that even in this list, each element is a macro, and we would also include it in the ELF. So there are a lot of things there. But the plus side is that the vmlinux.h with macros would also include kernel-internal macros, so a BPF program you write can use source code close to the kernel's own. I think there's also a problem when you have multiple different definitions of the same macro, right? That can happen across multiple files in vmlinux. So what do we do in that case? That's actually unlikely, based on previous experiments. Most macros are in headers. If one has the same name as another definition, that means you have the same name in two different places in the tree, and it's probably best to modify the source code to make them a little bit different. Well, we have a pretty typical use case, even for BPF, right? Where we have the function definition, where you specify what you do with each function identifier, right?
And in different cases, you use different definitions based on which file it is, right? Yes, we have the same problem here. If you define all of these in .c files, the same macro in different .c files, yes, it will have the same issue. We have to figure it out, yeah. But for the most part we could ignore all the macros in .c files and just include the ones from .h files, as a first pass, including all the common ones, which should satisfy 99.9% of use cases. Yeah, that's my last slide, basically. Can you go back to the previous one? Okay. Oh, actually one more. Oh yeah. Loops, the loop stuff. We discussed it a little bit, but how hard would it be to teach, not the verifier, the verifier doesn't know about this, to teach the compiler to transform loops like this into bpf_loop calls with a callback? So extract the body of the loop into a callback and actually emit it as a bpf_loop call with that callback. Is it doable at all? It is potentially doable in the Clang front end, but there will potentially be some argument in the Clang community. So. Because it's conceptually very similar to what compilers do for async transformations, right? Yes, and we may be able to reuse some code from the C++ side. It's totally possible, but there may be some. Just saying there is a precedent, so. Yeah. C++? Oh, so you're saying you would do this in C++? Yes. So basically the C++. Have you ever tried to compile C++ to BPF? How does it go? Today you can actually compile C++. There are no compiler-side restrictions. You can compile C++ code to BPF. So what is the, like what is the? I didn't check, but you can compile. In this case, I'm not sure whether it works or not. The easiest thing is some simple C++; basically C code compiled as C++ works. I was just thinking you might be able to pick this up in the IR in the BPF backend, right? And then turn it into a function. It's possible, we could maybe do something.
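To make the loop question above concrete, here is a minimal sketch of what "extract the body into a callback" could look like. It runs in plain userspace C: sim_bpf_loop is a hypothetical stand-in that only mimics the calling convention of the kernel's bpf_loop helper (callback returns 0 to continue, 1 to stop; the return value is the number of iterations performed), and the example loop, struct sum_ctx, and function names are all made up for illustration. Nothing here reflects an actual Clang transformation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the bpf_loop() helper (assumption for illustration;
 * a real BPF program would call the helper itself). */
typedef long (*loop_cb_t)(uint32_t index, void *ctx);

static long sim_bpf_loop(uint32_t nr_loops, loop_cb_t cb, void *ctx)
{
    uint32_t i;

    assert(cb != NULL);
    for (i = 0; i < nr_loops; i++) {
        if (cb(i, ctx))          /* non-zero return means "break" */
            return (long)i + 1;  /* iterations performed so far */
    }
    return (long)i;
}

/* Original form: an ordinary loop with a local accumulator and a break. */
static long sum_until_negative(const int *vals, uint32_t n)
{
    long sum = 0;

    for (uint32_t i = 0; i < n; i++) {
        if (vals[i] < 0)
            break;
        sum += vals[i];
    }
    return sum;
}

/* Transformed form: locals touched by the body move into a context struct. */
struct sum_ctx {
    const int *vals;
    long sum;
};

/* The extracted loop body; "break" becomes "return 1". */
static long sum_body(uint32_t index, void *data)
{
    struct sum_ctx *ctx = data;

    if (ctx->vals[index] < 0)
        return 1;
    ctx->sum += ctx->vals[index];
    return 0;
}

static long sum_until_negative_cb(const int *vals, uint32_t n)
{
    struct sum_ctx ctx = { .vals = vals, .sum = 0 };

    sim_bpf_loop(n, sum_body, &ctx);
    return ctx.sum;
}
```

The point of the sketch is the bookkeeping a compiler would have to do automatically: every local the body reads or writes has to be captured in the context struct, and control flow out of the body has to be encoded in the callback's return value, which is the same kind of work compilers already do for async transformations.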
Yeah, I don't actually know what that looks like. Yeah, in the early IR stage we may be able to do something. Done? Okay. Oh, a question? Sure. Regarding the vmlinux.h problem, when you include vmlinux.h. This one or this one? Yes, yes, this one. So I think we are all aware that there is a redefinition problem, and the current solution is usually that you include just the first one, and when there are collisions, you define things yourself manually. Would it be possible? I don't know if this is a correct assumption, that you regenerate vmlinux.h every time you build the BPF program, but if that is the case, could you regenerate vmlinux.h depending on the includes that you have afterwards? I mean, generate a custom vmlinux.h that does not have the collisions? Yeah, I think that potentially means that every time you have to regenerate vmlinux.h based on the headers you include, right? If you have existing headers, it's possible that with some kind of preprocessing you could extract the structs or enums or other stuff and encode them somewhere, and then when you regenerate vmlinux.h you exclude all these existing types. I would say that is a hassle for users; typical users probably generate a vmlinux.h once and use it for quite some time for different programs. So your method potentially works for a single program, but if you have many programs, you have to regenerate every time. Another thing is that vmlinux.h is generated across different kernels in general. But again, all these standard header files are UAPI headers, so they are supposed to be stable. So vmlinux.h is probably okay in most cases with these UAPI headers, unlike kernel internals that change. But UAPI can be extended, so they are not that stable. Yeah, that's true. Especially for BPF. That's true. It can be extended, yeah.
But I'm trying to understand your question. Are you saying that we should generate just the subset of vmlinux.h types that are not provided by other headers? Well, look at the next slide, right? This identical, whatever it's called now, BPF accept identical definition. That's basically the idea, except you don't require the programmer to do anything. We are saying that for some types, even if they're already defined, that's okay for the compiler. I think that's a logistically much easier solution to use. We just need to get it upstream. I'll shed some light on how we ended up here; some history is probably due. This has been going on for a year or so now, fighting with the LLVM community about this extra flag. And one of the suggestions was to just create a tool that would indeed do what you're describing. You include vmlinux.h, then a bunch of headers, and the tool will go and see all of the compilation errors. So it would be a clang-based tool that sort of compiles, but then removes all of the stuff that is duplicated, generates some other .h file for you, and then you pass that to the compiler. So essentially it would mean that, well, in the past we had clang compile for x86 and then llc, we fed things through this horrible monster, and that's how we compiled BPF programs. Now we have just clang with -target bpf compiling everything, and it's nice. With this tool, everyone would essentially need to do another step: install this special clang-based tool, then teach everyone how to do this regeneration of vmlinux.h, and so on and so forth. So operationally it's just so much pain. That's why this idea was discussed but rejected, and that's why we're trying to go with this accept identical def. I thought it could have been an extension of bpftool btf dump, but yeah, it's not that easy. It's not that easy. Yeah, you have to do the whole clang parsing. Otherwise, it will be the...
It cannot just string-match the types. It needs to be an LLVM tool, a clang front-end tool, that tries to compare and recognize these definitions. I have a question. For this accept identical attribute, does it have to come with the first definition, or could it be attached to the second definition? Currently the idea is that we label the first one and ignore the rest. If you put it on the second one, currently you will still get a redefinition error. We would like to be able to reorder the header files, because at Google we have some style enforcement policies. We would like vmlinux.h to be... The last one? Not necessarily the first one. So it would be great if we could allow this attribute to be tagged on the second one, not have to be the first one. I think it's okay. You can annotate everything with this BPF accept identical def. And the idea is, once you see the first one, the rest must be identical, regardless of whether they have the attribute or not. And you can just have this clang attribute push at the beginning and the pop at the end, something like that. I think it should work. Yeah, and you need to annotate it first. You don't need to put this on every type. Yeah, you just have a clang pragma. It's like the way we currently do the preserve_access_index attribute for CO-RE, so it's only one line at the beginning of vmlinux.h. So if you want vmlinux.h to be last, just move this line and put it first in a file. If your style guide allows. Yeah. I'm just saying it'd be great to have that. This is awesome actually. This is an awesome feature. Okay, [inaudible]. Thank you. Perfect. So we're right at lunch time. We'll come back at one and continue with BPF CI and a demo of the BPF CI.
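For reference, the "one line at the beginning of vmlinux.h" mentioned in the discussion is a clang pragma that applies an attribute to every record until a matching pop. The sketch below shows that push/pop pattern with the existing preserve_access_index attribute; the struct task_struct_sketch type and the sketch_read_tgid function are hypothetical stand-ins, and the extra guard on __clang__ and __bpf__ is an assumption added here so the file also builds with non-BPF compilers. Per the discussion, the proposed accept-identical attribute would presumably be applied in the same way, but its final spelling is not given in the talk, so it is not shown.

```c
/* Apply the attribute to every struct/union until the matching pop.
 * The guard is an addition for this sketch: outside a clang BPF build
 * the attribute does not apply, so the pragma is skipped. */
#if defined(__clang__) && defined(__bpf__)
#pragma clang attribute push(__attribute__((preserve_access_index)), apply_to = record)
#endif

/* Hypothetical, cut-down stand-in for a kernel type from vmlinux.h. */
struct task_struct_sketch {
    int pid;
    int tgid;
};

#if defined(__clang__) && defined(__bpf__)
#pragma clang attribute pop
#endif

/* Field accesses look like ordinary C; on a BPF build the attribute is what
 * makes them relocatable against the running kernel's layout. */
static int sketch_read_tgid(const struct task_struct_sketch *t)
{
    return t->tgid;
}
```

Because the pragma covers everything between push and pop, no individual type needs its own annotation, which is why moving a single line is enough to change which header carries the attribute.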