I'm Lawrence. I apologize for my coughing; it's hard to control, but I'll do my best. I want to talk about BPF signing: what it means and what we might do about it. Most of you in the room will know that eBPF is powerful; we use it to build things like bpftrace. Still incomplete, yes, we're always adding to it. With that power comes potential for abuse. You can, unfortunately, build great exfiltration tools, keyloggers and so on, and I think somebody has actually gone and built a rootkit that uses BPF in some fashion. As a community this has always bothered us: it's something we do not want, because it creates a negative view of BPF among outsiders, and I think it's also a shame that the work we do would be abused for such purposes. The problem, of course, is that to prevent abuse we by necessity have to limit what BPF can do. That's obviously a big problem, so we've been circling this topic in various incarnations; at pretty much every conference there's one talk with this as its subject, and this time around it's my turn. I want to summarize a little of what we've talked about in the past. Way back when BPF started out, any application could load a BPF socket filter, attach it to a socket it owned, and do useful stuff with it. Then Spectre happened and the whole thing became a maintenance nightmare. The solution was to turn it off, which I think is in a way pleasing, when we can do that. So today there's a sysctl that controls whether unprivileged BPF is enabled, and as far as I know most distributions disable it. The next thing we discussed was that BPF is guarded by CAP_SYS_ADMIN, which seems like a pretty big hammer; can we do better? Andre gave a really nice overview of that discussion, and we came up with CAP_BPF.
But now it turns out that CAP_BPF is not enough; we want to solve this for user namespaces as well. Here we go: the BPF token seems like a nice solution. We've also had, and I'm paraphrasing what I think KP's intentions are (KP, if you're here, please correct me if I'm wrong), the Android case. Imagine you're an Android application: you're generally untrusted, but Google wants you to be able to load certain BPF programs that have been blessed in some way, because BPF is useful and allows you to do things you otherwise couldn't. So KP has given a talk and, I think, even published an RFC patch set that says: let's sign the bytecode. Going even further, there are companies with requirements that say anything that executes on the system should have some source of authorization. I think Microsoft is further along that curve than most companies, and they've given a really good overview of this, but there are other security-conscious companies coming to the same conclusion. Since this talk is about signing: what is the idea behind "sign the bytecode"? Very crudely: take the instructions that make up your program, hash them, sign that, done. This is not crypto advice, I want to tell you, and there are many, many details I'm skipping that make this genuinely difficult to work out. What I want to leave you with is the reaction that I, as a Cilium developer, have to this proposal: it would make our lives extremely hard. The reason is this diagram, which you might have seen; I want to thank Anton for making this great slide. It shows the programs we have in Cilium and how they call into each other. Notice that we had to angle it to fit it on the slide, which I think is very pleasing.
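The "hash it, sign that, done" idea can be sketched very roughly in a few lines. This is a toy illustration, not the actual proposal: the instruction bytes are a hand-assembled `mov r0, 0; exit`, and an HMAC with a made-up key stands in for the real asymmetric signature that a trusted build server would produce.

```python
import hashlib
import hmac

def bytecode_digest(insns: bytes) -> bytes:
    # Hash the raw BPF instruction stream (8 bytes per instruction).
    return hashlib.sha256(insns).digest()

# Stand-in only: a real scheme would use an asymmetric signature
# (e.g. PKCS#7) made off-host, not an HMAC with a key on the node.
SIGNING_KEY = b"hypothetical-build-server-key"

def sign(insns: bytes) -> bytes:
    return hmac.new(SIGNING_KEY, bytecode_digest(insns), hashlib.sha256).digest()

def verify(insns: bytes, sig: bytes) -> bool:
    return hmac.compare_digest(sign(insns), sig)

prog = bytes.fromhex("b7000000000000009500000000000000")  # mov r0, 0; exit
sig = sign(prog)
assert verify(prog, sig)
assert not verify(prog + b"\x00" * 8, sig)  # any change breaks the signature
```

The "many, many details" being skipped start right here: maps, relocations, and loader-rewritten instructions all change the bytes between signing and loading.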
So what I want to say is: if signed bytecode were the only thing allowed, if that were the whole answer to "BPF can be abused and we need to protect against that", it would be really bad for projects like Cilium, and also for other really useful things like bpftrace. Okay. So in the discussions we've had, Dave also gave this summary: instead of signing the BPF program, let's establish that the program trying to make the BPF syscall is somehow trustworthy. The goal I had with this talk is to show, or to try out, how difficult that is: how far away are we from being able to do this, essentially? And I think this is orthogonal to signed bytecode. If we really wanted to, we could do both: if the bytecode is signed or your program is trusted, then we allow it. It doesn't have to be exclusive. I wrote a little shopping list of the things I'm going to need to build this. The first is a way to identify a binary, which is typically a hash of some sort. You need to prevent that file from being modified on disk. You need some way to express trust, which is usually a signature. And then I need some way to write a policy. Number one, and I think I've heard it mentioned a couple of times already, there's a system called fsverity. It's a fast per-file integrity mechanism that's really easy to enable, which is nice. It gives you a hash, and it also essentially makes the file read-only. Signatures: I went looking around in the kernel to see what was there, and it turns out that IMA has a lot of what I wanted. One piece is a signature format and a way to attach it to a file on disk, in an extended attribute. There's in-kernel infrastructure for caching appraisal results, which is very important. And there's user-space tooling you can run to generate and attach these signatures.
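fsverity's per-file hash is a Merkle-tree digest over the file's blocks. Here is a much-simplified sketch of that idea; real fsverity hashes 4096-byte data blocks into a tree and then hashes a descriptor containing the root, salt, and parameters, while this toy version only captures the property that changing any block changes the digest:

```python
import hashlib

BLOCK = 4096

def merkle_root(data: bytes) -> bytes:
    # Simplified fsverity-style digest: hash each 4 KiB block, then hash
    # pairs of digests upward until a single root remains. Real fsverity
    # additionally hashes a descriptor (block size, salt, ...) on top.
    level = [hashlib.sha256(data[i:i + BLOCK]).digest()
             for i in range(0, max(len(data), 1), BLOCK)]
    while len(level) > 1:
        if len(level) % 2:
            level.append(b"\x00" * 32)  # pad odd levels
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]

blob = bytes(10000)  # pretend this is the binary's contents
root = merkle_root(blob)
assert merkle_root(blob) == root  # deterministic identity for the file
assert merkle_root(blob[:5000] + b"\x01" + blob[5001:]) != root  # tamper shows
```

The tree structure is what makes it "fast": verification of any single page only needs the path to the root, not a hash of the whole file.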
They also have some notion of key management, which maybe we can talk about later. And there are some interesting integrations: apparently RPM supports this, which was new to me. There's also a project called Keylime; the idea is that it takes all the information that comes out of IMA and tells you what your system is running, kind of a different take on Tetragon, I guess, and something we could integrate with as well. Finally, and I think this is the least surprising part: what's the policy going to look like? It's a BPF LSM hook and some new kfuncs. I'm going to attempt a demo as well; let's see how it goes. It shows up, yeah. The first thing I'll do is run a lightweight virtual machine; it runs a patched kernel. That's that; now a little bit of setup. Some of it has to do with how fsverity works, and some with how I built the proof of concept slash hack: you need an IMA policy to make it work, but that's something to discuss. Then I can show you the parts that we need. There's a certificate, a key that you can read. There's this create-map thing, which is just as boring as it sounds: it creates a map. Maybe I should have made it print something more interesting, I don't know. And then we have the gatekeeper program. What I'm going to do is first sign the gatekeeper. Does it just create a map, or does it actually pin it? It just creates a map; it's basically a placeholder for "can I make a BPF syscall", essentially. So first I sign the gatekeeper, and the second stage is to put the gatekeeper into the background. You can see it's attached as an LSM program, and there's some debug output that says it's allowing the gatekeeper itself access to the BPF syscall. This syscall is a great way to shoot yourself in the foot.
In the beginning I had a gatekeeper that didn't allow itself to execute the BPF syscall, which is kind of problematic. But yeah, I guess that's the future we might live in. I can try creating a map again. You can see multiple things happening, because the loader tries to do some magic, so there are multiple BPF syscalls behind the scenes. But you can see that it's denying access on number four; that's just some enum value I'm printing out for debugging. How can we fix this? We can now sign the create-map binary. I try again and, voilà, success: we can actually execute the BPF syscall. What does it take to write this policy? This is what it looks like. I had to export two functions that already exist, get_task_exe_file and fput, and I wrote a little new kfunc, this ima_file_appraised thing. You might be thinking that this new kfunc is going to be 200 lines, but it's actually more like 20. And the experience of doing this was incredible: using kfuncs to add a new helper is really fantastic, so thank you to everybody who's been working on that. My takeaway is that it basically works. Why do you need fput? Sorry? Fput, why do you need it from a BPF program? Is it a new kfunc? Well, why did I add it... it's not a new kfunc, I just exported the existing helper. That's probably never going to fly, but this is a proof of concept, so I can do what I want, basically, which is nice: you can cut corners. You could rewrite this somehow to say "appraise the current task's executable", or, I don't know, there are ways around it. My takeaways are: it works, and it works without too many changes. For the identity, this hash thing, I used fsverity, but if this were ever to become a real thing, other mechanisms could provide the hash and the integrity. It could be dm-verity, or something else entirely, I don't know.
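In spirit, the gatekeeper's decision can be modeled in a few lines of user-space logic. This is not the actual BPF program; it only mirrors the flow described here (get_task_exe_file, the new ima_file_appraised kfunc, fput), with all paths invented for illustration:

```python
# Toy model of the demo's BPF LSM gatekeeper. The real thing is a BPF
# program on the bpf() LSM hook; every name below is illustrative.
EPERM = -1

# Files whose IMA signature has been appraised successfully.
appraised_files = {"/usr/bin/gatekeeper"}

def lsm_bpf_hook(task_exe: str) -> int:
    # Mirrors: exe = get_task_exe_file(task);
    #          ok  = ima_file_appraised(exe);
    #          fput(exe);  return 0 if ok else -EPERM
    return 0 if task_exe in appraised_files else EPERM

assert lsm_bpf_hook("/usr/bin/gatekeeper") == 0      # signed: bpf() allowed
assert lsm_bpf_hook("/usr/bin/create_map") == EPERM  # unsigned: denied
appraised_files.add("/usr/bin/create_map")           # after signing the binary
assert lsm_bpf_hook("/usr/bin/create_map") == 0      # voilà
```

Note the foot-gun from the demo is visible even in the toy: if the gatekeeper's own executable isn't in the appraised set, it locks itself out.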
I think it's nice that the signatures are compatible with existing tooling, because it's a pain to come up with these systems and write the tooling. And finally, the obvious one: trust should be flexible. That's also the biggest question I have if this ever became a real thing. The way keyrings are managed for IMA is fairly strict; we would probably want either our own keyrings or some way of scoping trust. That's the second point I have here: right now it's an all-or-nothing thing. If IMA trusts you, you can execute the BPF syscall, but we probably need a much more granular way of saying "I only want a subset of signed programs, or programs signed by this specific key, to be able to access it." I don't exactly know how we would do that. One idea I had while listening to Andrii's talk is that we could tie something like a keyring to a BPF token: if you have this token, then the process also has to be signed by this key, something like that. There are many interesting avenues. The second thing is that for the proof of concept I had to write a little bit of IMA policy, which is not bad, but it raises the question of how much integration there would be if we ever did this. Would we be allowed to just reuse the bits (there we go, almost made it) of the IMA infrastructure that we find useful, or is there a bigger discussion to be had? And that's really all I have. If anybody has questions, please go ahead. I think one piece that's missing, and it would probably require work in tooling and libraries, is some way to identify the specific workload, right? In your case it's simple, you create the map, but say you're loading some BPF object file, right?
It's not even one program; each program is part of a bigger whole. So we probably need some way to instruct the BPF loader library, libbpf say, to either compute the digest from the original ELF file and provide that as part of the BPF program load, or even allow applications to override this with whatever custom identifier they want to provide, just as a buffer of bytes, stuff like that. For bpftrace it would be different, but a similar concept: bpftrace might normalize the script itself, maybe strip out the whitespace or whatever, and then compute the hash of that, or maybe take the IR tree, stuff like this. Equivalent programs should probably have the same checksum, but... So the way this works, it actually doesn't look at what program you're loading; it's completely oblivious, you could be loading... Exactly, that's my point. You can sign bpftrace, but you don't know what it is running. Exactly. And I'm proposing this from the model of what we would probably need for Cilium to work, because it's so dynamic. But that's too permissive, right? bpftrace can run any malicious script. So you probably want to make sure that bpftrace is trusted, and also that whatever script it's trying to run is validated. I mean, if you can figure out how to do that, I think that would be great; I don't know how I would do it, basically. Yeah, but that's one thing we can extend: a very simple property on the program load, where the loader, which supposedly is trusted, can provide some opaque string, and then whatever BPF LSM policy you have can check whether it's a known workload. Yeah, I think that's maybe the point I was making at the beginning: you could do this plus other stuff on top.
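The normalization idea mentioned for bpftrace, stripping incidental whitespace and comments so equivalent scripts produce the same checksum, might look something like this. The normalization rules here are invented for illustration; a real implementation might hash the parsed IR instead:

```python
import hashlib
import re

def workload_id(script: str) -> str:
    # Hypothetical normalization: drop // comments, collapse all runs of
    # whitespace, then hash. Trivially-equivalent scripts then share an ID.
    no_comments = re.sub(r"//[^\n]*", "", script)
    normalized = " ".join(no_comments.split())
    return hashlib.sha256(normalized.encode()).hexdigest()

a = 'kprobe:do_sys_open { printf("%s\\n", comm); }'
b = """
// trace openers
kprobe:do_sys_open {
    printf("%s\\n", comm);
}
"""
assert workload_id(a) == workload_id(b)  # same script, different formatting
assert workload_id(a) != workload_id("kprobe:do_exit {}")
```

The resulting hex string is exactly the kind of opaque identifier a trusted loader could hand to the kernel at program-load time.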
You'd say it has to be signed; or you could say both have to be true, or either has to be true. So, maybe a question. I'm maybe not the best person to know, but at least from the folks I've talked to, I'm not sure you want the signing key on the node, right? bpftool would have to have the key to sign, but nobody wants those keys on the nodes. They want those keys as far away as possible, in a box somewhere that's building the image. I think that has always been the hard part of this, right? bpftool wants to generate a program, and as far as I can tell nobody wants to give that thing the key, because it's going to be running on the node. You're worried about malicious code loading, but it has the key, right? Yeah, that's what I was saying. Thanks, Don. And this is the point I was trying to make as well: the key cannot be on the machine; it should be somewhere on the trusted build infrastructure. That's what establishes the provenance of the signature. Unless you give somebody rights to do dynamic loading; then the gatekeeper needs a policy that says these binaries can load unsigned BPF programs of the following types. That was sort of the thing I was getting at; it's still doable with this design, no? I think so. Maybe I'm missing something; I'm not a security guy. But this is about identifying the workload, not about signing or anything. I trust bpftrace to use a hashing algorithm, which could be just a simple hash sum, and I trust bpftrace to provide that identifier as the identifier of the origin of the workload I'm running. And then you can pre-calculate the same hash for, say, ten trusted bpftrace scripts and nothing else, right?
So if someone tries to run an ad-hoc script, that will fail, but if you run one of the pre-hashed scripts (it's not even signing, it's just pre-calculated hashes), it will be let through, by combining trust in bpftrace with knowledge of the workload ID. I think maybe these are slightly different use cases. The model in my proof of concept (maybe we need a different one, I don't know) is more like saying: I trust a group of people to do a good job and give me a binary that is as secure as it can be. That's applicable to bpftrace, to Cilium, et cetera. And then you might say, actually that's not enough for me, I want to vet the exact programs that go in. You could take the libbpf-tools, statically combine your thing into a single binary without external dependencies, and still use this process to say: these are the ten libbpf-tools I would like to be able to run on my infrastructure. You could still do that if you wanted to. Sorry, Dave. Yeah, so there are maybe two types of programs, right? One type is pre-authored and can be sent off to a signing server, to be signed by a key that's not online and so on; bpftrace is generally not in that category. And there's the type that is dynamically constructed, and bpftrace is in that category. Since you can't put the key on the machine in any secure fashion, you can't give it to bpftrace, so, as we're saying, you'd have to say bpftrace can load unsigned programs. You can have a gatekeeper that says: for bpftrace, only allow programs that attach to the following hooks and call the following kfuncs and so on.
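Combining the two checks being discussed, a trusted loader plus an allowlist of pre-calculated script hashes, could be modeled like this. All names and the policy shape are illustrative, not an existing kernel interface:

```python
import hashlib

# Check 1: is the loader binary trusted (e.g. an IMA-appraised bpftrace)?
TRUSTED_LOADERS = {"/usr/bin/bpftrace"}

# Check 2: is the opaque workload identifier the loader supplies one of
# the pre-calculated hashes of vetted scripts?
ALLOWED_SCRIPTS = {hashlib.sha256(s.encode()).hexdigest()
                   for s in ["kprobe:do_exit {}",
                             "tracepoint:sched:sched_switch {}"]}

def allow_prog_load(loader: str, workload_id: str) -> bool:
    return loader in TRUSTED_LOADERS and workload_id in ALLOWED_SCRIPTS

known = hashlib.sha256(b"kprobe:do_exit {}").hexdigest()
adhoc = hashlib.sha256(b"kprobe:do_sys_open { }").hexdigest()
assert allow_prog_load("/usr/bin/bpftrace", known)      # vetted script: allowed
assert not allow_prog_load("/usr/bin/bpftrace", adhoc)  # ad-hoc script: denied
assert not allow_prog_load("/usr/bin/python3", known)   # untrusted loader: denied
```

This captures the point that neither check alone suffices: trusting only the loader is too permissive, and the hash allowlist is meaningless if the loader supplying it isn't itself trusted.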
So you can constrain it that way: you can't constrain which programs they are, but you can constrain what operations it has access to, whether that's program types or kfuncs or whatever else. You could write a gatekeeper program to process the input that way if you care about that. The other point I was going to raise is a separate one, but that was my response to that. You say "it works", and it's great stuff, by the way, please continue. I didn't realize Cilium was in the category of dynamic code generation, more like bpftrace; you said it was really complex, right? Oh, yes. So for the parts of it that work like bpftrace: in the talk you're referencing, which I gave, there's a policy that says if you want hypervisor-protected code integrity, you disallow the dynamic code and allow only the static code category, the code you could actually sign in the data center or wherever. That means that by policy bpftrace will fail, but that's because I've set a policy that is enforced by the hypervisor. When you set that policy, it means there is a security distinction between code running in the kernel and what root can get: root can't get code running in the kernel, because root can't get the code signed by the key that the hypervisor insists on. So there's a security distinction. All of what you've talked about, even fsverity and so on, I believe still relies on the assumption that there is no such distinction, because root can poke into the process after it's been signed and loaded, modify it, and thereby subvert it and get code injected into the kernel even though it was signed, right? So this still relies on that assumption, and it will actually fail under hypervisor-protected code integrity. So you say "it works".
It works as long as you don't flip that switch and turn on such a security policy in your hypervisor. Yes; I'm sure you're much more versed in this stuff than I am, and I'm sure there are tons of holes in it, but in the mental model I usually operate under: executing the gatekeeper needs root, and if you have that, you can modify the policy anyway, switch it on or off. Although, in a way, IMA is designed such that once you set a policy you can never undo it, and maybe that's for the reasons you mentioned. It works great, please continue, it's great stuff. It won't solve the problem if you have the HVCI style of thing turned on, but for all the other cases I think it's a great approach. Cool, thank you. Lawrence, just a quick question regarding the signing key, just to clarify: in the use case where we want to sign binaries on a trusted build server, this would still work, right? I don't need the private key on the host? No, you don't need it; it's only there for the sake of the demo, I just wanted to show it. Maybe you can scroll up: in the setup here you can see it says one key in keyring. It's a little difficult to read, but basically it runs a binary called keyctl, and that loads the public key into a kernel keyring, and that keyring is what IMA uses to do its checks. The signing step I ran just takes the hash and computes a signature over it, but you could do that elsewhere; you don't have to have the private key on your machine. You could imagine it happening as part of building RPMs, or being baked into the container image somehow, I don't know. Does that answer the question? Yeah, it does, and it's really cool.
It also gives you flexibility: if I want to use IMA, or the implementation I had at some point where the helper was an already-existing kfunc doing the PKCS#7-style verification, you can do that. Yeah, this is cool. Are we going to provide a reference implementation, like, based on IMA or... Sorry, can you say that again? Are we going to provide a reference implementation? Is bpftool going to generate signed programs by default, with the signature identified by a particular header, and the kernel providing a reference implementation for distros or something to build on, that sort of stuff? Good question. I don't know; that's basically what I'm here to ask about. Okay, so I know this is work in progress, but the signature scheme seems kind of primitive: you just sign the main binary, but you can easily LD_PRELOAD something, or a library the binary depends on could get replaced with something else. Yes, you basically need to take additional precautions. You'd have to do things on top that say you can't execute other stuff, et cetera, et cetera. So you'd want a statically linked binary... Yes, yeah, right. I'm not saying that by doing this everything magically turns out super secure. Full disclosure, Cilium even ships clang, because we need to compile stuff. But if we have a system like this, we could work towards a future where we reduce the amount of stuff we take from outside. This is something people actually care about: customers come to us and say they want a version of Cilium that has these properties. What do you think it would take to be able to ship a Cilium container image that you can then lock down to say: I trust just the Cilium agent, I want it in the trusted compute base?
But all the other applications, potentially malicious or not, would not have that access. What would it take for us to ship this with the image, in the setup that you have? For me... I can go back to the shopping list. Okay. Cilium, being a cloud-native Kubernetes thing, would most of the time be distributed as a container, and that's just a glorified tarball, I think. So step number one would be to find a way to add signatures that we can make the kernel understand to that tarball. Right now there is a way; I think we already do this for Tetragon, please correct me if I'm wrong, John: there's a thing called Cosign, where you can take a container and sign it somehow. But it's not clear to me how to make the kernel understand that. I guess we could say: let's not involve the kernel as much. In the proof of concept I built, all of the logic for deciding "is this trusted, is this signature valid" sits in the kernel. We could take it out and be a bit more lenient about that, and then we could adopt something like Cosign more easily. But if we want to go with the approach where the kernel has the final say, we'd have to figure out how to add these fsverity-compatible or dm-verity-compatible (or something-else-compatible) hashes to container images. For me that's the biggest question mark. And then number two is the trust-and-identity bit. That's what I meant by the trust model: it's not clear to me how tightly we would want to integrate with IMA and the system keyring that exists. There are all of these concepts that I'm not well versed in, which we would have to answer, I think.
If you want to say "I trust the Cilium maintainers to produce a good binary", how do you actually go about configuring that? Maybe as an aside: since you're talking about trusting the build of the binary, I know folks have started to use Tetragon in their build process to get an SBOM-like thing out of it, basically a trace of the build process, and then they hash that and use it as input. People have tried to do this with strace before, but strace was usually too noisy, not consistent enough; maybe somebody will disagree with that statement, I'm not sure. But we could do that: use Tetragon to monitor the syscalls and the connections and so on for the build process. And today we sign it with Cosign, which has a secret key; we could sign it with something else, I suppose. Can I take a stab at answering Daniel's question in a slightly different way? It was more particular: you have one trusted process, the Cilium agent in that container, that wants to load BPF programs. What you could do is have a policy that says only a binary signed with this particular private key is allowed to load BPF programs; and it may load any BPF program, effectively, because that binary is then in your trusted compute base. So you build Cilium in a new container and sign the Cilium binary; you have the verification key preloaded into the container's kernel keyring; you ship the BPF program that enforces this policy and have it loaded by systemd when the container starts. Then at the bprm check, whenever you execute that binary, it verifies the signature and sets a blob on your task that records whether this is a signed binary that is allowed to load BPF programs.
Then at BPF syscall time you check that blob and reject anything that doesn't have it. So this use case is the opposite side: you have just one party that is part of the trusted compute base and is allowed to load programs. You don't need to sign the programs in this case; you just need to sign the binary that loads them, because that's the thing you trust, and then you effectively trust nobody else. Yeah, you could do that. Although I guess it would be kind of bad for us if only Cilium were allowed to run; there's other stuff we do as well. So it's probably not a future we want, to use this to lock everyone else out, but I think it's a realistic ask that at some point people will say: we want to be able to control who provides the software that runs on our systems. Then you can also think about a loader spec as well, right? This is how you interact with the BPF control plane: the vetted binary that can make the BPF syscall, and you interact with that binary via whatever your favorite IPC mechanism is. Yeah, you could use this to have a set of trusted programs that you load by talking to that signed daemon over IPC, I guess. Okay, I think I'm out of time anyway. Thank you very much. Yeah, thank you.
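The signed-loader scheme sketched in this closing exchange, verify the binary's signature once at exec time, stash the result in a per-task blob, and consult only that blob at bpf() time, can be modeled as follows. All names are illustrative:

```python
# Toy model of the two-stage check: an exec-time (bprm-check style) hook
# tags the task, and the bpf()-time hook only reads the tag.
SIGNED_BINARIES = {"/usr/bin/cilium-agent"}  # signed with the preloaded key

class Task:
    def __init__(self, exe: str):
        self.exe = exe
        self.blob = None  # security blob, populated at exec time

def bprm_check(task: Task) -> None:
    # Exec-time hook: do the (expensive) signature verification once.
    task.blob = {"may_load_bpf": task.exe in SIGNED_BINARIES}

def bpf_syscall(task: Task) -> bool:
    # Syscall-time hook: no signature work here, just consult the blob.
    return bool(task.blob and task.blob["may_load_bpf"])

agent, other = Task("/usr/bin/cilium-agent"), Task("/usr/bin/nc")
bprm_check(agent)
bprm_check(other)
assert bpf_syscall(agent)      # trusted loader: its programs are allowed
assert not bpf_syscall(other)  # everything else is rejected
```

Note that, as discussed above, this only establishes trust in the loader; it says nothing about which programs that loader then feeds to the kernel.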