policy enforcing everything, and then the problem I see with that is that other processes that are highly privileged can also just talk to the kernel, and they're unaudited, so we need to have some kind of kernel enforcement. The policy file could then move into the kernel and be stored there somehow, and then we get into the problem of how we want to update the policy file. The thought I had was to essentially add a signature to the program itself, and the signature is going to encode some extensions. So we add a code-sign instruction, which must be the first byte. This allows the kernel to quickly determine from the first byte whether we're able to enforce code signing. Then we have the signature, the signature extensions, and the eBPF program; the signature covers the extensions and the eBPF program. What I would propose is binding the identity, and there is some haziness here for me personally because I'm not very familiar with the cgroups work, but I think we want to encapsulate identity at a very granular level, and this could allow a program to run as a lower-privileged user based on a higher fleet-wide policy. Then we would add some kernel flags for enforcing signatures and a default trusted public key. Yeah, go ahead.

Since this is very security related, right, could you explain what you think identity means here? Is it a user, a public/private key, or something else? How does that work?

So the identity I was thinking of would be a delegation, and we would map it to something on the machine at runtime, essentially. I don't think we want to do a public/private key pair, because you don't want to ship the private key with everything, and that is why.
So what we're going to trust is a root key, and if you're familiar with the SSH certificate format, something similar to that might be more apropos for eBPF, where you have principals that this is allowed to run as, essentially. Did you have other questions there, or does that explain everything?

I think this makes sense.

So I want to introduce the concept of a signatory service, which may or may not be on the same machine. We should definitely have a reference implementation that ships on the machine to make it very easy to test this, but if you're talking fleet scale, or a big business where you have a bunch of machines, you may want to have this off somewhere else, kind of like your identity and access management. So essentially the user makes a request to the signatory service after they've generated the program and says, hey, I want to sign this, and based on that user's abilities it will sign the program with the appropriate allowed capabilities on it. Then that program is sent to the kernel, and the kernel verifies it at load, essentially. Before we even verify, we should be able to check the signature and say this is a signature that chains to the trusted root key. Then the capabilities are pulled out, and we make sure that all the capabilities in the extensions match the capabilities the program is actually trying to use in the verifier. If any of that fails, error out, no permission. If that succeeds, then go off and do all the dynamic translation necessary, just-in-time compile, and try to run your program. There are some open questions I have. Does the identity model break down somehow with cgroups? What needs to be represented when you start talking about namespaces? Do we need to support revocation lists?
And there's a big trade-off here when you decide whether you want to support revocation versus whether you want to use short-lived credentials. If you use short-lived programs that you re-sign constantly, now you have the problem of needing a pretty high-uptime signing service that deals with this all the time. Or you can insert the concept of time bounds into the credential itself, and say that some of these are 15-minute programs for an engineer who's been delegated, while these other static programs that the fleet deploys on a regular basis are good for months, and that would potentially solve that problem. But there's a trade-off to be made here, and I don't have the answer, and I don't know if there's a one-size-fits-all answer for this. It may be something we want to have flexibility on.

So I just want to separate things, because the unprivileged and identity concepts are a little more advanced compared to the very basic principle, right? The basic principle is that we have an eBPF instruction blob that is signed by a trusted authority, and that signature can be verified in the kernel. There are a couple of differences from when we last talked about this, right? We were talking about file-based signatures, but there's no concept of a file in a BPF program, so the signature is effectively in the program bytecode itself.

Yes.

That is sort of one key difference here. And in a privileged sort of user space, where BPF programs are only allowed for root, I don't think identity matters that much, as long as the program is signed by a trusted authority, or by the generator of the program in the dynamic case. I mean, we could also eliminate that if we think it's not a useful feature, right? Like, start with the smallest thing that works.

Yeah. But yeah.

I mean, do you need to decide up front what the identity is? Because I worry about identity; identity means different things to different people, right?
And I'm not sure, in this model... do you have more slides, by the way? Am I cutting off other slides?

No, no, no. I kept it short, because I figured we've had a very open-ended conversation on this for quite a while, from what I've heard.

So in my view, I think the kernel piece should be a BPF program that I can write, and then you can write your BPF program, and then we can evolve independently if we want. Eventually there'll probably be some common thing that lives there, I would suspect, over time, but we don't have to agree on identity at that point, right? My identity can be whatever my customers want, and your identity can be whatever your fleet wants. And I actually know we have customers with different notions of identity, for whatever reasons, right? I also know people want revocation lists in this, I've heard people talk about it, but maybe that doesn't make any sense to you, right? So I guess I would try to break it down to what we need in BPF to support this, and try to remain as flexible as possible, so that we're not encoding something into the kernel that everybody has to do.

So I would say that what I would like to see is the kernel become the enforcement point for checking the signature and whatever extensions you want to support. And if we want to make a helper program for different policies in that certificate, that might be an appropriate place to do it, so that you could have the concept of identity segregated for each use case and user. Thoughts?

I mean, we alluded yesterday to a mod_verify_sig sort of helper. There is an instruction byte; we have access to the program as the BPF program is being loaded.
We could encode the signature verification in the BPF verifier itself and add a kernel config, but that sort of disallows the flexibility, John, that you were talking about. So you could do this signature verification check in that bpf_prog_load hook there, use the helper of some sort that we discussed yesterday, and then we have verification. And you don't need to do anything else.

You do need to make sure, and this comes up, sorry to cut you off: I do think you need to get that program in at early boot, and we need to have a way to freeze it so it can't be unloaded, right? Because you do want it to be in the trusted boot, probably with the TPM at the bottom, or whatever your platform does.

There are two pieces that connect from yesterday, right? Roberto's use case about early boot BPF, and that area as well, and then the signature verification stuff.

So I'm looking at this from the cross-platform perspective, because I want something that works on both Linux and Windows, and I think everything we're talking about is actually applicable to both. And I agree with John's comment about using the gatekeeper-style thing instead of hard-coding it into the verifier or the kernel or whatever else. The one question I have for you is, when you talked about the first byte and the instruction, it sounded like you were putting this inline with the program itself, as opposed to metadata about the program. That's the one part I find perhaps questionable: why put it in the program? I'm thinking about scenarios like, I want to take the same program and have multiple signatures. I take your program, I vet it, and then I re-sign the same program with my signature for use in my fleet, right? Even though you wrote the program and signed it first, I'm going to counter-sign it, or replace the signature, or whatever.
And it seems to me that's much easier to do by putting it in, maybe, a separate ELF section and then passing it down along with the program, because as long as you have the same key signing it, and you're signing the same bytes, and the signature matches, it shouldn't matter exactly how you encode it. And I can imagine the encoding could actually vary by platform. I don't know if it needs to vary by platform, but it could, right? That's kind of what I'm wondering.

So what's proposed here is that you don't need to change the bpf() syscall. In that case, you have a file, you have the attribute; you would probably need a signature field.

So one way to put those two together is to say that on Linux, the way you map it into the syscall is by putting them back to back, prepending it, right? And on a different platform that uses a different syscall mechanism, one that passes two fields, you wouldn't necessarily have to do that. So it could be that the prepending is a Linux-specific thing, as opposed to eBPF-generic.

I would also say that the instruction bytecode doesn't prohibit anything you said, in so much as you could still wrap an instruction bytecode with another instruction bytecode. That would be a payload-wrapping mechanism, and as long as your kernel understands it, it would still be fine.

Maybe I didn't fully follow. Wouldn't it be simpler just to put it as a field in the attribute that you load? Like, we could just add it at the end: here's the signature array for the implementation. Because if I look at, say, Mateo's patch, he did something that was alongside, not in the program, so that's what made me think about it; this is a new way of talking about it that's different from how Mateo was doing it.
So the one thing I want to strongly argue is that when we do this signature, however we do it, we are signing both the capabilities and extensions as well as the eBPF program. The easiest way to keep those aligned side by side, so they don't get mixed up, is inline. If you start splitting the program off and putting the capabilities somewhere else, there's a chance to screw up the verification, or a chance to let somebody tamper with them. In this case, I'm calling the capabilities, potentially, an identity, which eBPF calls may be allowed, things from the CO-RE framework as well, things like that, essentially. Are you allowed to write data, or only allowed to read data? These could be things we could do. None of these are mandatory; I'm calling capabilities a generic catch bag of extensions that could be added based on use cases.

Yeah, the concept of capability shows up in a bunch of other security contexts. There's a specific term in the use of certificates and IPsec, the term escapes me right now... it's EKU, Extended Key Usage, so this is similar to an EKU usage. It says you're authorized to do the following things, and you can't go outside that. So I think that's a good idea.

An EKU is the common way of doing it, yes.

So I think that part is a good idea; it's just everything else, it seems to me, could be done either way. Whether you encode it prepended or as a separate field, it seems isomorphic either way: anything you can do with one you can do with the other, and so if we say this should be done cross-platform, that's okay. I don't know if that's going to cause any other constraints; that was my question.

Just so this is how I map it in my head, and you can correct me if this is an incorrect representation of the capability stuff, right? You have this byte that verifies, sort of, who built this program, right?
Who signed this program; that bit is checked there, and then there is a mini, sort of LSM-like policy: this is what you can do with this, what you should be able to do based on your credentials, and I've given you this permission, right? From an external entity: this is what you're allowed to do. And your mod_verify_sig, say in the bpf_prog_load LSM hook, basically then, after verifying the signature, expands this policy or capability stuff into the various checks it can make: okay, these helper calls, this stuff, right? That bit I think is more advanced, right? Currently the mechanism we're looking at for BPF programs, especially due to the side-channel stuff, is all or none, right? Or programs that are generated by trusted command-line delegates.

I agree that it is more complex. I also would state that it's not necessary for the first cut of anything. That's why I'm calling capabilities this grab bag, for whatever we want, whenever we need it. But you're right to point this out, because this is not a one-, two-, three-year thing, right? We want this to live long enough that it can be extended.

Yeah, I don't think we want to deal with an API change for how we do signing, and that's potentially why this has been stuck for so long. Build in the extensibility, even if we don't need to use it now. Yeah, it's very different from the simple engineering "you ain't gonna need it" rule; it has to be future-proof when you're designing this stuff. It's design resilience. And maybe we never use any of it except for whatever use cases immediately come to mind to the people in this room. I think identity is something that some people care about.

At which point, and what, are you signing? At which points?

So the signatory service is doing the signing of the program.

The bytecode. Which bytecode? Like, before it got relocated, all the sub-programs got added, and all that stuff, or after?
I was saying before, and then any translation that happens in the kernel, essentially.

So use cases like bpftrace are just out of the question in this?

What I would think is that bpftrace could still work if it had the ability to talk to the signatory service, as a flag. And if we standardize the way that works, then signing still works, right?

So you're saying you will trust bpftrace to do the signing?

To talk to the signing service.

What's the mechanism for trusting bpftrace, I guess? Because it doesn't seem BPF-related at all, right? Like, how?

Yeah, I have a talk today about the way we could do this. You can create a policy domain of dynamically generated BPF programs that are actually trusted and are allowed to run unsigned code at that point, right? Or, as Jason mentioned, they can talk to a signatory service.

So in that scenario, if you allow bpftrace to do its own signing verification, what does bpftrace provide to the kernel to prove to the kernel that this BPF program is okay to run?

So for our case, when bpftrace started, it would be signed. So you would know that that application is bpftrace. And then you would also know things like where it's running, what's the context: you'd have the UID, the GID, all of this stuff. Is it in a pod? You might also care about the attach point, but for a first cut, probably not. And so you would say, okay, at least I trust this application, and I trust it to run in this context, right? And at that point we would probably just audit it, right?

But then there is no signature that you send to the kernel, right? I mean, like these...

You could still do the signature, right? You could still apply a signature in bpftrace.

But what does the signature mean, right? Like, why?
What the signature gives you is that it lets you put enforcement on at the kernel, right? So now the kernel can, if you want enforcement on it, do it. Imagine the signature has some capability checks in it, right? And the capabilities encode what bpftrace should generally be allowed to do. If bpftrace somehow generates something, if somebody's found a way to exploit something through bpftrace, then the signature would be like, hey, you're not supposed to do that, and you're doing it.

Okay, so the signature is not just some hash function of the bytes. It's something more, composed of bits of what's allowed and all that stuff.

Yeah. I think some of this was, you'd asked the question, how does the signatory service know that bpftrace is the thing that's doing it? And I think that's what we were just responding to. As long as bpftrace is signed and so on, the kernel would trust whatever key the signatory service has, which could be on-box, right? Which is what Jason was saying before, right? So there's just a local RPC call, right? And the signatory service's key is the only key the kernel trusts; if you're using public/private keys, the private key would be held by the signatory service. If you're using something other than keys, it's whatever it is that the kernel has. And so the signatory service is what is then checking all of that: is it running in a particular context, is it really bpftrace, and so on. So I think that's what we were trying to answer with your question; the kernel still only trusts the signatory service in Jason's diagram.
I don't know if that answers your question, but I think that was...

That was...

I'll just say there's the other option, which is to not use the signatory service for bpftrace at all, and have the kernel verify: I know this application, and I know this UID, and all this kind of stuff. And I think that's what you were saying when you said you could just put it all into a BPF program: one BPF program would know that it's bpftrace, another one would just use signatory service keys, and we don't have to agree on that, because people can use different gatekeeper programs. I'm just a little...

Yeah, they can get complex, and we should avoid complexity where we can. And this may not be the right solution; maybe there is a better solution we can come to in this meeting, right? That's why we're here today.

I like this picture, because he specifically said the signatory service could be on-box or off-box. And I guess I like the left half of this picture, and I like the right half of the picture in the other one, John's diagram, that had the gatekeeper program on it.

I mean, I think you could still combine these two together.

Right, exactly. That's my point: I would combine these two, put the gatekeeper on the right side of this one and the signatory service on the left side, and the signatory service could be on-box, off-box, or bundled into the application. If the signatory service was bundled into bpftrace, you'd get one variation we were talking about. And so all those are possible; you'd have standard ways to compose these. But I would put this picture and the other picture together into one diagram.

I saw the room liked the solution, and I picked up the mic, so let's see. I'm curious to see what you folks think about this.
Well, I just wanted to remind everyone what Andrii was alluding to: the program itself is a completely useless thing to sign, because there's so much libbpf does after parsing, like doing the relocations, adjusting sections, fixing up functions, and so on. So only the most trivial programs, ones that don't use maps at all, can simply be signed and will not be modified by the user-space loader.

I agree with this, right? The signature here assumes the bytecode generated by the compiler itself, at that point, before runtime relocations. Have you ever considered doing these relocations in the kernel? Is it ever going to be the case that we do these relocations in the kernel?

Well, that's what we've been doing for the last year, right? Slowly moving all of the pieces that libbpf does into the kernel.

Once they are in the kernel, though, then you have the capability of putting all of that behind a signature check.

Sort of. The way it is right now, we have this loader program that can be generated; that's part of this light skeleton. What it consists of, in the end, is two main pieces: one is a big map array of one element, and the other is the program. So both need to be signed. This loader program is essentially the executor of the pseudo-instructions that are in this array, and inside the array there can be 10 different programs with their attachment points, 20 different maps, how they're created, how they're populated, and so on and so forth.

I'm just going to add that I like the idea of having all the relevant relocations here be done post-signature in a trusted component. On Linux that's the kernel; on Windows it's a trusted user-land process that basically runs with the same privileges as the kernel, just in a different address space.
And so, in that same direction Jason is proposing, if you do that, then I think it's abstract enough to work: even though the details differ between Linux and Windows, the concept is exactly the same between the two. You check the signature, you do all the relocations in the appropriate spot on the trusted side; you don't do any of the relocations until after you've confirmed that the signature matches and that you're going to be doing what you said you'd be doing; and then you can do the relocations before you actually install the JITed code, or whatever it is.

So I think fundamentally, right, you have what you mentioned here: there is a loader program that encodes the actions that are going to be taken by an intermediary like libbpf, and then what you do is sign those actions plus the original instruction blob, right? Which is fine. The format of the signature that Jason proposed is a byte at the beginning of the program; what follows is all the other stuff, the actions that are going to be taken by the loader program and the actual instruction bytes themselves. So the issue is with the verification check, how you're going to verify that stuff.

Just another quick comment on this new-instruction encoding of the signature: I still don't see any point in doing that instead of just providing the signature as another property when you do BPF_PROG_LOAD, right? Is there any benefit to embedding it into the instruction set? Why? Either way you need to change the kernel, right? So you just extend the syscall.

You don't have to change the syscall, is the answer there.

But you have to change the kernel. Even if you don't change the syscall, just to support this, right? If you want enforcement in the kernel, then you have to change the kernel anyway.

You don't, though, right?
Currently you just need the helper that Roberto proposed yesterday.

Okay, yeah. And then you verify it in the BPF program.

Sorry, I forgot that we're talking about the BPF gatekeeper hybrid model right now. The kernel API or ABI change is the helper that we add, and the helper goes beyond this particular signature use. Do you have a microphone? Yeah, sorry.

But if the verifier doesn't know about this instruction and doesn't skip it or do something with it, the program would be rejected. So either way you need to teach the verifier about it. And if you're going to teach the verifier, you might as well just teach the syscall to take this as a separate property.

I agree. The verifier change, to me, is not an ABI issue; it's fine if the verifier is going to skip instructions. The instruction set is an ABI.

Yeah, it is. It's an ABI, let's say. It's even stronger, basically, because part of the syscall you can sort of deprecate, while the instruction set stays.

I'm personally indifferent as to how we encode the signature.

So that was one. But another one, I want to make this more interesting, right? Let's say we even solve this whole problem, that the program code itself is not changeable until it gets to the kernel. What do we do about different attachment points, right? You can verify the uprobe or kprobe program and then attach it to many different functions, and that might have some security implications, right? The capability stuff would say that this function should always be attached to. Now, if you encode this, then basically this magic signature encodes everything that's possible to do in BPF. This becomes a language, basically, because the capabilities of BPF are super diverse, right?
Like kprobe targets, uprobe targets, fentry targets, and other different things; they are not always just one string that you attach to, right?

If you care about this, right, if you care about the attachment point there, and you consider that you don't have an extra policy that is enforcing this stuff, and you want it to be in the signature, then it could be in the signature. Otherwise you could just say, look...

So this signature is passed through into some custom BPF program in the kernel, and the kernel, meaning the verifier, let's say, doesn't know about its format. Is that the idea?

That's what we're proposing, yeah.

I see, okay. So this is some black-box set of bytes.

Yeah.

Which is wired through into a custom BPF program that makes the ultimate decision.

It works like that in my head, yes.

Okay. It just wasn't clear, sorry. So the verifier just, if you want, you verify; if not, we just ignore the signature, basically. I would just put it as another field in the struct and pass it all the way through, pass it in that attribute. The important thing that I didn't understand before is that the verifier and the kernel itself don't need to understand this signature's format, because if they did, then it's this whole committee discussion about what goes in there, and okay.

Which is why I think it's best to put it in the program. So we keep it a custom black box for a specific solution, right?

Yeah, okay. That's fine.

I was wondering, would there be a use case where you have several sections of the program with different signatures or capabilities, maybe, at some point, if it's encoded that way?

So you're assuming a helper that's more restricted than the parent, or...?
No, no; so that basically your pseudo-instruction for the signature verification would only cover some part of the program, and then you could have a different part with a different signature and capabilities, for example.

I was not envisioning that.

Okay. It seems like a lot of complexity.

But that would be an argument in favor of putting it in the instruction set.

That's what, yeah. If you can have multiple signatures in the instruction bytecode, then you're...

Well, it gives you the ability to wrap multiple signatures, but he's talking about actually partially signing the program, I believe. And partial signatures, I struggle with a use case for partial signatures.

I really don't like the idea of new instructions here. There is nothing like it; the BPF instruction set is like x86 and ARM and MIPS all combined. Do you have an instruction in x86 that says verify-my-signature? I don't know, are we talking about SGX? So, I thought about it. Even SGX is different.

So, to the question about what you are signing, and is it before or after: I said, well, this is what Jason's proposal was, and I had a chance to think about it myself, to see what the equivalent would be on Windows, or potentially other runtimes. On Windows, the verifier is not in the kernel; it's up in trusted user land. And so the question is, if you were to think about the gatekeeper program, or the checking of signatures, is that done before or after verification? In Jason's answer, you do it before verification. What that means is that if you're going to do the signature checking in a gatekeeper program, you need the gatekeeper program to be running in user land prior to verification, because before verification you never even hit the kernel. That's complicated. It's possible to run BPF programs in user land, but we haven't done any of that work.
And so that's actually fairly complicated. You could also say it's done afterwards. You could say that it depends on your implementation, just like with multiple gatekeeper programs: you don't have to agree, you can just say, oh, well, that's up to the platform. You could actually have multiple gatekeeper programs, one before and one after, if you wanted to apply that concept. Or maybe it varies by platform. So all of these I think are possible. And if you said that the gatekeeper program needed to run after verification, then having the signature be in the instruction set is odd, because now the verifier runs on something that has this extra instruction. It's just odd, right? So that's why I didn't like the separate-instruction thing. It's possible to do; it's just work, right? But I did want to think about the notion of signature checking; I can imagine cases for doing it both before and after verification, either for different platforms or even for different use cases on the same platform. So I think the question was...

I'm not sure how this would work on Windows, where the verifier is separate, but at least in the Linux kernel, what we can do is check the signature before starting the verification, but then it will only be a Boolean flag: did the signature check pass? Then, during the verification, we will have several LSM-like callbacks that check whether this particular helper is called, and so on and so forth. During the verification, the secondary gatekeeper program will be called many times. And at the end of the verification, there will be the final decision, because after verification we cannot do the signature check anymore; verification changes the code as well. It massages the code quite a bit, so the code is not at all as it was in the beginning. So it has to be many steps: one at the beginning, then several calls during, and then one at the end.
Another direction that we're looking towards for the future, and I think those conversations happened across platforms, but I don't remember which context it was in, whether it was in BPF meetings at the BPF summit, an LSF/MM meeting, or a BPF for Windows meeting, or some combination of all of those, is: is it possible to run the verifier off box? In other words, if something has been signed, does that mean it has been signed and verified by the signatory service, and therefore you can skip the verification step, because you know the verifier already passed it for the current kernel version? Is it possible to offload that, in which case you get a bunch of CPU cycles back because you're avoiding the verification step? So is it possible to do that? Is it a bunch of work? Yes. But is it possible to say my gatekeeper program causes it to bypass verification, because it knows the signature has that thing set which says the signatory service has already done that for me? Is that possible to do? It might be a good long-term direction. It is one that is being discussed. And so I think some of the things that Jason is talking about are actually aligned with helping in that direction, should people want to go there. I don't think there's anything prohibiting what you just described. That's right, and that's one of the things I like about this: it certainly doesn't prohibit it, and it may even make some things easier. For Linux, if this initial signature check passed, we can skip probably 90% of the verifier, but not all of it. Like dead code elimination: I don't even know if you can skip 90%, right? For dead code elimination you actually need to trace the state and know which ifs are taken and which are not. But really, in practice, is that much CPU time spent in verification? You verify once and run it millions and billions of times. So far it was never really such a big problem.
If the verification time itself is a considerable concern, CPU use for verification, let's say: does anyone have a use case where they're loading BPF programs very often that need to be signed? So, not very often. It would be useful if somebody had statistics for, say, recovery from a DoS attack, right? Where you don't have a lot of CPU cycles: how much of a difference does it make to have to go through verification? Does it slow things down because you're already maxed out? I don't know. So if somebody has stats, it'd be interesting. So, here's probably my naive understanding. The signature is like: you claim what you do, and you need someone to verify that you actually do what you claim and nothing else, right? That might be wrong, but I think that's my understanding; let me know if it's totally wrong. But the source of truth for what this program is doing is the verifier, right? The verifier knows this program exactly: does it call this helper or that helper? To me, the signature service should be together with the verifier. It's the verifier that says what this program is really doing; that's the source of truth. So why do that twice? So a signature is like a stamp or a certificate of authenticity, right? Like: I built this, I know about this program. There is some reference to the "I" here, who am I, and that is part of the signature, right? And I certify that this BPF program, blah, blah, blah. So that is what the signature represents here. It's more than just policy checks. Also, if you're doing enforcement in the kernel, maybe you have a workflow where you want to disallow an engineer from doing something with a BPF program in production unless somebody else signs off on that. And it's very hard to do that without some kind of additional service signing off on it.
And that's what the signature service allows you to do: build in that additional complexity for a large fleet. I think there are still two sides. One is what the program actually does. The other side is whoever initiated loading this program, and whether they're allowed to do that. I think those are the two parts of it. I think the verifier is still the source of truth for what the program actually does. Yes, but what if you want to disallow somebody, even though they have root privileges, from making write calls to something? So let's go back. I think the idea is: the verifier says what this program does, and then after the verifier you say, okay, this program does this, and here's my ticket, I can do this. So I think that would be after the verifier. Oh, that's the gatekeeper. So just to settle concepts here, right? This is what Jason mentioned: there is the capability, the MAC aspect, of the signature that we talked about. That is a secondary goal of the whole signature thing. Now, talking about that secondary goal: the MAC policy hook for BPF is located before the verifier. The bpf_prog_load hook is before the verifier logic itself. This is where you read the instruction code and try to decipher it. You have the syscall arguments, you have the instruction code. You see: is this a BPF program that my MAC policy allows to go further? And the verifier is actually making, as Alexei mentioned, significant changes to the instruction bytecode. So the MAC policy checkpoint is already established in the kernel, and it is before the verifier. If we need to move that check after the verifier, that is independent of this discussion, right? We can move it after the verifier if you think that is the right place, but it is independent of this particular discussion. The other aspect of signatures, I think you probably understand that.
The main aspect is to establish that this BPF program is known to be a good program, right? By this entity. It represents the connection between this entity and this program. That is what the signature represents. So it's just the capability part: the signature service only provides the capability part, and the signature service is just acting as a stamp of approval. So the signature service will not do analysis of the bytecode? The signature service can do whatever you want it to do. It could do analysis of the bytecode. It could also skip analysis and say: yep, that's a program, and here are the capabilities assigned with this identity. And if those turn out to not be true later, that could be a different problem, solved at verification time. For example, I want to say that all BPF programs should be built on this machine. That is my trusted build host, right? Then I need to ensure that when the kernel loads this program, the fact that it was built on a trusted build server is checked. This is done as follows: if you have a private key on that trusted build server, it signs the BPF program. The program is shipped with the signature. The signature is verified. So that would be it. So this signature is cryptographically signed, so you cannot spoof it, right? A random unprivileged incorrect process cannot just substitute it. So this is an important point; it would probably be nice to emphasize it for non-security folks. This signature, once you get it, you can verify that it's truthful. How did you get it? That's not the kernel's concern, basically. And then, to Song's point, we would need to teach the verifier about all those capabilities. If we do that, then the signature format becomes the API, right? Because the verifier needs to understand it.
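The point being made, that the signature binds an identity and a set of claimed capabilities to the exact bytecode so that an unprivileged process cannot substitute either, can be illustrated with a toy. A real implementation would use an asymmetric scheme (e.g. Ed25519) so the kernel holds only a public verification key; here an HMAC stands in so the sketch stays stdlib-only, which means signing and verification keys coincide, exactly the property the discussion says you must avoid shipping:

```python
import hashlib
import hmac


def sign_program(signing_key: bytes, bytecode: bytes, capabilities: list) -> bytes:
    # The signature covers the bytecode *and* the claimed capabilities,
    # so neither can be swapped out after signing.
    payload = bytecode + b"|" + ",".join(sorted(capabilities)).encode()
    return hmac.new(signing_key, payload, hashlib.sha256).digest()


def verify_program(verification_key: bytes, bytecode: bytes,
                   capabilities: list, signature: bytes) -> bool:
    # In the real design this would be public-key verification against
    # a trusted root key baked into the kernel.
    expected = sign_program(verification_key, bytecode, capabilities)
    return hmac.compare_digest(expected, signature)
```

Tampering with either the bytecode or the capability list invalidates the signature, which is what lets the kernel treat "how you got the signature" as out of scope.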
I think the alternative to that would be to instrument the verifier with interesting points, like "helper is called" and stuff like that, and then maybe call a program. I think we talked about this, but summarizing: we'll have a hook or tracepoint or some mechanism to make decisions before the program is verified. While the program is being verified, we'll have events like "helper called", "map was created", stuff like that. And then after that, we can still pass the finalized, dead-code-removed BPF instructions, maybe the JITed instructions, and let the program also process those if it really wants to. That's a lot of complexity, but if someone wants to go to great lengths to do that, they can actually analyze BPF instructions; we have bpf_loop now. So I think that would be the way to keep this signature completely black box to the verifier, which would be good, right? And as to the capability part of the signature, which I think currently is... Microphone? Yeah, this is the capability part of the signature, right? Where we think this is a future extension that we will need, or the first basic... Agreed, agreed. I like the proposal for the capability verification, if this needs to be in the verifier, as Song pointed out, right? But the other thing I wanted for the signature part, the basic thing, is to establish the identity-based verification of this. I have a signing key, I've signed this program, right? There is a verification key in the kernel, and that verification key verifies the signature. All of this is cryptographic math; nobody can spoof it. So there are these components. You can correct me if I'm wrong; I think that is where we have landed right now, yes. All right. So, and it just feels weird to me.
It's like the verifier checks every bit of your DNA, and the secure build server just gives you an ID: this is a valid, trusted person. But the verifier is looking at things in much greater detail. So I think in the model that Andrii was mentioning, the job of the verifier is just to check safety. It's not whether it's authorized, it's whether it's safe. And it happens to report, oh, by the way, here's what it did: it used the following capabilities. But it only checks safety. It's then the gatekeeper's job, or the signature checking's job, which could run before or after the verifier. And I think doing what you said actually allows it to be done either before or after, because you're basically doing an AND gate that says: as long as it is safe, and as long as the signature authorizes the capabilities that the verifier says it uses, then it doesn't matter in which order you do the math. Because what some of you are proposing is a question about the threat model that a signature is trying to solve. So I'll give you an example. I built a BPF program. The BPF program is a valid program; the verifier is going to accept it. I wrote some C code, compiled by Clang to generate some bytecode at the end. Now, it was not built on a trusted server, so the Clang is a malicious Clang, right? What it does is add a little side-channel gadget into the middle of the program, which the verifier cannot detect: valid BPF code that does branch target buffer poisoning, right? And then, yes, the verifier has no way to know that this was generated by a malicious Clang that added instructions in the middle. So what that signature represents is your trust, essentially; for our use case, that the build toolchain I used to build this program is trustworthy. I would put it that the verifier checks the technical correctness of the program.
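The "AND gate" argument above, that the final decision is just safety AND (authorized capabilities cover used capabilities), so the ordering of the two checks is irrelevant, reduces to a subset test. A minimal sketch, with all capability names hypothetical:

```python
def authorize(verifier_safe: bool, used_caps: set, signed_caps: set) -> bool:
    # The verifier answers only "is it safe?" and reports which
    # capabilities the program actually uses; the signature answers
    # "which capabilities is this identity authorized to use?".
    # A pure conjunction of independent facts: evaluation order of the
    # two checks cannot change the result.
    return verifier_safe and used_caps <= signed_caps
```

Because both inputs are computed independently, a platform that checks signatures before verification and one that checks them after reach the same verdict.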
So, like array bounds checks, while the signature provides the intent correctness. Like, yes, the intent is known and we are okay with it, given it's technically correct, right? So the verifier still does its job as if the program were root, with all the capabilities, but then we need to make a decision whether to pass it further if it's correct. I think that's how it works, right? I have a question regarding this: if you have bpftrace running on a machine, how does it talk to the signatory service, and how does the signatory service make sure it's actually the bpftrace it pretends to be? How does it look in practice? I mean, I'm just asking. So are we talking about a personal computer, or are we talking fleet scale? What I would imagine is, if we were going to have bpftrace be able to talk to a signatory service, we would define some kind of API contract for how you talk to a signatory service, and anyone can implement that, doing whatever they need to do, right? And if my signatory service wants to do some additional verifications, like that you're coming from a certain location in the world, and you're running as this UID/GID, and I can map that to this internal identity, then we can do that, right? It's just a contract for: I want to sign this, here's the information I have, and then, if all of that checks out, the signatory service will apply the signature. All we want to establish is that this bpftrace is something that I trust, right? So one way of establishing that is an fs-verity-like partition. I will say that, at the very minimum, right:
Any executable that generates dynamic BPF bytecode will be run from this fs-verity-verified partition, and then the signature service can ensure, based on your environmental setup, that this is coming from an fs-verity-verified partition, and verify the hash for that partition. And we could extend the model to have a pre-signed binary section, as was described by John earlier, right? There's nothing prohibiting that. Yeah, I mean, early on at Netflix we tried to get SystemTap going, and it had its remote compilation thing, and it was a nightmare, because service teams had implemented firewalls, and when there's an outage and you're trying to run scripts in a hurry, I can't even connect to anything. Networking is slow when I can connect to things. So it becomes a fairly big barrier to entry, but I understand you could just say that's an operational issue and people have to figure it out. My actual question is: sometimes when it's just urgent and we're ready to throw anything at it, is there any way to have, like, an NMI or a magic SysRq to say "don't do verification, this system is toast, I just need to run bpftrace"? So, if you put the verification in the gatekeeper program and keep it out there, there's nothing prohibiting you from having this magic secret you release to just disable everything, right? That would help a lot, because it's that situation where Netflix is down. You're essentially describing: we're gonna burn the world after we're done playing here, essentially. Yeah, the instance is gonna be destroyed anyway; it doesn't matter. Although it does sound like an interesting opportunity for a social engineering attack. I mean, even if it is a good opportunity, I'm assuming they have an audit trail when they release that kill key. Just, I mean, does anybody here have a signature service running already, with strong identity and a real fleet? I mean, you guys?
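The fs-verity idea above, trusting a requester like bpftrace because its executable lives on a measured, read-only-verified partition, boils down to comparing a file measurement against an allowlist. In the real design the kernel maintains the file's Merkle-tree root hash so the check is cheap and tamper-evident; this sketch substitutes a plain SHA-256 of the file contents as a stand-in:

```python
import hashlib


def requester_is_trusted(exe_bytes: bytes, trusted_measurements: set) -> bool:
    # Stand-in for an fs-verity measurement. With real fs-verity the
    # file cannot be modified after being measured, so a match here
    # means "this is the exact binary we decided to trust".
    return hashlib.sha256(exe_bytes).hexdigest() in trusted_measurements
```

A signatory service (or the emergency "unsigned but trusted binary" path discussed later) could gate its decisions on a check of this shape.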
Not for BPF. Okay. Yeah, yeah. So you could, well, I know lots of people have, but usually it's quite a complex setup, right? So, did you consider hooking into existing identity services and such? If you're gonna launch a project to create a signature service, that's fine, and it doesn't really involve BPF necessarily, but that's a whole committee of work. So I think where we've landed, though, is that the gatekeeper processes are per team, right? And the gatekeeper process is doing the appropriate verification. I'm fine from the BPF side; I think the gatekeeper works fine. I'm just more curious whether people have scoped what this signature service involves; it's not a light undertaking, as always. I think this is very valid, right? Once we hash out how the signatures are going to work, there are the questions around relocation and this stuff, right? Can we do relocation in the kernel? Do we sign what Alexei proposed, the loader program plus the bytecode of the program? The actual implementation of the signature stuff should be left to experts who are already doing that, right? I don't think I have the requisite expertise there. What I would say is that the reference implementation for a signatory service would be something at the level of ssh-keygen, right? Like: I need this, I have the root key somewhere on disk, and I need to do something with it. It is a toy. And then that's why you have all the other bigger, better implementations, which will become someone's GitHub project, most likely. But for it to be actually useful in the ecosystem, some CA should be able to do this, right? And you can think about the Istio CA or whatever implementing a signatory service with this stuff. But there's a lot of groundwork that needs to happen before that. I'm talking about the BPF stuff.
This looks like an Istio project. You can put Istio there, and then Envoy and... And CNCF... Yeah, I mean, you're gonna stick in whatever system you're using as your signatory. Cool. So I would like to echo some of your points that I think I finally get. What my point initially was: proving correctness, that it's totally safe, is very difficult. Even the verifier is only best-effort at knowing that; even that is probably not 100% sure that it's secure. But I do see the point: with a signature service you could get the trust going, or at least say, I have bpftrace, and the least I'll do is log everything I do. So if you intentionally do something bad, it's logged and you will be caught later. I think that's probably what we do a lot in production. I mean, it's always good to have detection after the fact, but preventing a problem before it can become one is even better. And bpftrace is the extreme case: it's hard to prove it's always safe, hard to make it safe. The other extreme is going to be very simple: exactly that program, you know that program. That's the other end. It's more that the signature provides a spectrum of coverage, from "it's absolutely safe" to "at least I know who did the bad thing". Yeah, I agree. I was going to answer Brendan's question there. Like Brendan mentioned, there's an external security service which has availability concerns, and in the middle of an outage, when you are at the limited-availability end of the spectrum and you can't reach the signature service, what do you do? For that case there's a thing that I'm proposing today as well, where you say that bpftrace will be allowed to run unsigned code as long as it is on the fs-verity partition, right? And then we do what you said, right?
We log all the syscalls, we anyway log all the BPF syscalls, the payloads and all that, and we ship all the bytecode that is being generated for offline verification, or whatever you want to do with that, right? To do more threat analysis after the fact. It's one solution if your operational needs demand that, which I think is a fair ask, right? Having bpftrace talk to a signature service might get more complicated. So I thought I understood what's going on, and then, like you said, we still need to figure out the relocations and all that stuff, and now I'm confused again, because I thought we don't really need to figure that out, because that's the signatory service's problem. And the signatory service is a custom implementation for each big customer, probably, right? So why do we need to figure out who does relocations, when they are done, and what kind of relocations? What am I missing? So let's say, in this case, you can teach your signatory service that libbpf is a trusted component, and just take the ELF as is, before all the relocations, before anything; just hash the ELF and trust libbpf to process it correctly. What's the problem with that? Similar to bpftrace: we are saying we trust bpftrace. I think that is fair. All right. Yeah, I just wanted to make sure, because the programs I use with libbpf do a bunch of runtime modifications of .rodata and all that stuff. They are not going away anytime soon, so we need to plan for that as well, I guess. The only thing that we have to agree on is what the extensibility points are, right? So for example, we talked about capabilities. If you don't put those in there now, it's really hard to retrofit them later, right? So which parts do you lock down right now as being the extensibility points, and then you leave everything else for later?
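The suggestion above, hash the ELF as-is before any relocations or .rodata patching and trust libbpf (or bpftrace) to process it, changes what the signed artifact is: the trust statement becomes "this original artifact, handled by a trusted loader", not "these exact post-relocation instructions". A minimal sketch of such a manifest, with all field names hypothetical:

```python
import hashlib


def make_manifest(elf_bytes: bytes, identity: str, capabilities: list) -> dict:
    # Hash the ELF *before* any relocations or .rodata patching.
    # Runtime modifications by the trusted loader will not match this
    # digest, and that is intentional: the loader itself is inside the
    # trust boundary.
    return {
        "identity": identity,
        "capabilities": sorted(capabilities),
        "elf_sha256": hashlib.sha256(elf_bytes).hexdigest(),
    }
```

The capability list lives in the manifest from the start, matching the point that extensibility fields are very hard to retrofit later.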
I got the impression that we bypass all that, because we are saying that the signature, which would probably be better to call a manifest, because it's not really just a signature, a signature implies this hashing, anyway, this signature slash manifest, right? That's a black box that's just passed through into the custom BPF hook in the verifier, right? So we don't need to agree on anything there. I think you're right. But I think what you're saying is: you can sign the application and trust the application, which is better than what we have now, right? Just doing that probably adds a ton of security to the system. Well, I think what they're trying to do is also sign the BPF programs themselves, right? In my mind: first sign the application, solve that problem, and then worry about the BPF program, but that's fine. But at some point they have to have a trusted application. So they can teach that trusted application to do this signature, right? I think with all of this, the trust boundary for one environment, where you can draw these trust boundaries, is going to be very different from where you can draw them in another environment. I may be able to trust libbpf, right? In that case, relocations are not an issue: I can sign the program pre-relocation, and for this particular use case I establish a trust boundary around the kernel plus some libraries and some tools. In some cases I may not be able to trust that, and at least in this particular case we need to cater to all of that, or at least ideate on all of these trust boundaries, right? My worry is that if something like this is adopted by, say, Fedora, they will just... yes. And you won't be able to do BPF development as a normal, not corporate-backed, BPF developer.
I would say that the trusted key should be writable at boot, right? Like with a flag: if you want to change the key, go for it. It would be like a PEM string, decode this, right? So say Fedora did this, or some distro shipped with this, right? It's just a BPF program loaded at boot, at init time. You should just be able to edit the command line and say, okay, now there's no BPF program, right? That's why I don't think we want it as a config flag that somebody can just turn on and then we're stuck, right? Your whole system stops working and there's nothing you can do. What if Fedora bakes this BPF program into the kernel? I think they probably can. They probably will. I mean, with a patch? They'll patch their kernel? The preload skeleton, right? You bake it into the kernel and it just automatically boots. It's simple in terms of deployment, right? So then you can't really do anything about it. Separate discussion, right: can we provide a reference implementation ourselves? Like, where we provide a reference implementation, or this will eventually happen, right? I mean, I don't see what would prohibit Fedora from doing this; why don't they just ship a kernel module that does this as well, right? They could lock it down however they want if they're shipping the kernel. By reference implementation, you mean the reference implementation of the signature service? And who's going to implement that? Everything, the whole ecosystem, right? This is how the signature verification is going to work across different trust boundaries, this is what we recommend, these are the goals behind the signature stuff. These are the goals, right? Which we think are relevant, and not for gatekeeping or limiting BPF development.
Honestly, if a distribution is curtailing BPF development significantly, it's to their detriment. There is sort of a Nash equilibrium here, right? If you make the distribution less useful to somebody, the distribution will be less useful, and it'll be less used, eventually. I also get the impression that some people believe, and I sort of disagree, that if you load a signed BPF program, you know that that's the BPF program you're running, right? But you're not fundamentally changing the security. We have lots of BPF programs where, if you could write to the maps, you could totally break the system, right? So you've closed one minor, in my viewpoint, okay, one minor security problem, but you still have all of these maps that you could DoS, you could exfiltrate data from, you could do all sorts of things that are, to me, probably a bigger security problem than the actual BPF program. Because it's like a hook with a map that does an LRU filter on whatever, right? Very honestly speaking, the way I see it, these are all supply chain aspects, right? Like: I'm trying to call BPF helper foo, and the compiler is like, yeah, no, I'm not gonna generate bytecode for foo, I'm gonna generate bytecode for bar, right? And this could happen: somebody could make your compiler sneak a binary into the thing. But that compiler shouldn't be allowed to sign your BPF program then.
Just a final remark: I don't want us to design the system around the assumption that you will be able to write all possible, or even the majority of useful, BPF applications in a way that their code will not be runtime-modifiable, because I don't think that's realistic. The majority of applications right now do some tweaking, whether it's libbpf or bpftrace generating code, or you do .rodata patching before the program is loaded; you change the code, or the code-slash-data logic, basically, before you load. So if you're saying this will work as a reference implementation only for things that you write that are unmodifiable, that's very limiting. So, some use cases here, right, from my perspective. You know these programs we talked about yesterday, which are fundamentally extending kernel functionality with cgroup v1? We want to be sure that these programs are shipped and signed and verifiably built, right? Another use case is a group of BPF programs that you want to allow on the fleet: look, I want to deploy this when something happens, and I want to be sure that those are verifiably built. In the dynamic use case, we want to ensure that whatever we implement here provides a solution for dynamic BPF code generation, right? This is very important, because we understand that when you trust the application, the BPF program that ships with the application is also trustable, because it has been trusted in the environment. But there are BPF programs that you sometimes are not attaching to an application currently; you're just shipping BPF programs that you want to attach and do something with. We do that as well. There's a question in the back. Plus one to KP; all of what you said applies in Azure as well, so same.
For the dynamic case, for programs which do the modification, are you able to protect the chain until the program actually runs? Because if you can verify the signature at the beginning, then you can protect it. I believe that's only going to be possible once the dynamic relocation moves into the kernel. Is it or is it not? It would not be possible until the dynamic relocation is in the kernel. Or some highly trusted process, and I believe in the Microsoft case, which is essentially, I'm guessing, the service account. Who's? He's not. Yes. I don't think anyone heard any of what you just said. Oh, so, is there a situation where we can move all of these things to the... you said this is possible with the loader program currently. Okay. We've already moved 90%, but there will be cases where it's just not possible. There will be dynamic modifications; bpftrace is a prime example. You cannot. Sorry. So it's only through the signature service, or, pretty much, it sounds like what we'll do, summarizing the discussion: we'll add the helper that can check the signature of whatever, the signature of the map, the signature of the program. And a bunch of hooks throughout the verifier, at the beginning and at the end, including ones that can be paired with a standard LSM hook, like binprm, where it can take, say, an IMA hash. And based on the IMA digest, recognize that this is actually bpftrace running. And so: bpftrace is now loading; connect these two events, that bpftrace started to run and that bpftrace is making a BPF syscall, and at this point say, well, this is my policy, this is how I want to implement it. I don't care whether anything bpftrace loads has a signature; just allow this process that initiates it, or decide based on the user.
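The policy shape just described, pair a binprm-style event ("a binary with this IMA digest started running") with the later BPF syscall from the same process, and allow unsigned loads only for recognized trusted binaries, can be sketched as follows. All names and digests here are hypothetical stand-ins, not real LSM or IMA interfaces:

```python
class Policy:
    """Toy policy correlating process identity with later BPF loads."""

    # Hypothetical measurement of a trusted tracing tool's binary.
    BPFTRACE_DIGEST = "hypothetical-ima-digest-of-bpftrace"

    def __init__(self):
        self.process_digest = {}

    def on_exec(self, pid: int, ima_digest: str) -> None:
        # binprm-style hook: record who this process is, by measurement.
        self.process_digest[pid] = ima_digest

    def on_bpf_load(self, pid: int, has_signature: bool) -> bool:
        # Unsigned programs are fine if the loading process is a
        # recognized trusted binary; everyone else must present a
        # signature.
        if self.process_digest.get(pid) == self.BPFTRACE_DIGEST:
            return True
        return has_signature
```

This is the flexibility argument in miniature: the kernel provides the correlation points, and the policy decides which combination (signature, identity, user) it wants to enforce.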
So it seems that the only reasonable thing we can do, without shooting ourselves in the foot and stopping progress, is to make it as flexible as possible, where the policy decides everything, whatever combination it wants to enforce. So next steps, then, right? What do we have to implement? We need the signature helper, right? We need these interesting hooks in the verifier. Yes. And that's it. And where is the signature stored? Where does it go, the syscall? Minor detail; anyway, no new instructions. OK. Damn it. Microphone. It would probably make sense to preserve this manifest, I like "manifest" more, but preserve it and provide it in bpf_prog_info, so bpftool and other tools can actually show it to you. Sure, yeah. But the signature for me is, like, SHA-256; that's a signature. I'm not a security person, sorry. Any other questions on this for me? Or should we give someone else a chance to talk about signing stuff up here? Yeah, my topic is a very simple policy use case, so no signing going on there. Yeah, then you're the next one. Yeah. Thank you. I don't have a live demo, so we'll see. Thank you.