Alright, so I'm going to talk about the relationship between eBPF and confidential computing. I'm actually a member of the technical steering committee equivalent in both foundations, both the Confidential Computing Consortium and the eBPF Foundation. When I first started being on both, I thought that the two technologies were basically orthogonal, just two separate things that I happened to do both of. Over time I kept getting asked what the relationship between them is, and from being in meetings on both sides, I started to think that there actually is some relationship, and so I'm going to share my thoughts here. There's no code here. This is maybe a slideware architecture discussion, right? It is worth noting that of the premier members of those two foundations, the majority of the premier companies are actually in common, and so my hope is there's a bunch of other people out there that are also interested in the overlap between these two technologies. All right. By the way, I gave a version of the same talk to the technical steering committee of the Confidential Computing Consortium, and of course there I was telling them about eBPF, so I put in some of the same slides just so you can see the ones that I showed them anyway. I'm not going to go through them, they're just in the deck, right? And I'm going to tell you about confidential computing stuff that they already knew, right? So I'm basically giving a variation of the same talk to both sides, right? All right. So this is what I showed them as the definition of eBPF, right? Cross-platform, and can run sandboxed programs to extend a privileged system component. This is the definition that appears on eBPF.io and so on, and the specific wording here will become important later on, okay? I talked about how eBPF runs in many contexts, notably the main processor, co-processors, you know, smartNICs and so on, and inside and outside containers and so on.
All of this will also become relevant, okay? I talked about how there were two scenarios. They were the same two I just talked about in Lawrence's presentation: in one case you may be constructing an eBPF program a priori, where maybe you can go off and get it signed by the key that's in your back room, and in the other case you're constructing it on the fly, like bpftrace does. Both scenarios exist in BPF, right? So I explained that. I showed them a classic picture here, and I'm putting this up here so you can get the color coding, because we'll be playing spot the differences, right? So this is maybe your classic Linux architecture. I showed them the Windows one and we played spot the differences: notably, the verifier and the JIT compiler move up into user space. The secure environment there could be a user-space secure environment, or it could be offline on a different machine. Notice the signing step over there on the other machine, and so you can send the bytecode off to a secure environment, which does the verifier, the JIT compiler, and the signing step. This is the example where it's actually signing the JIT-compiled code, not the bytecode, right? You can do it either way. I talked about both variations in my talk last year, but here we're talking about signing the actual native code, so that it will actually work with the HVCI case that I talked about. So the point is you can have a secure environment that is potentially not on box, okay? Okay, so now we're going to talk about confidential computing, now that I've shown you what I showed them in a way that is relevant to the later slides, okay? So notably, for confidential computing, here's the definition from the CCC: the protection of data in use. That means like your memory pages and so on. It's already loaded, it's executing, right? Can somebody poke into the process memory and change stuff?
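The flow just described — bytecode goes to a secure environment that verifies, JIT-compiles, and signs the native code, and the target machine only loads what checks out — can be sketched roughly like this. This is a minimal illustration, not any real implementation: all function names are made up, the verifier and JIT are stubs, and an HMAC stands in for the real signing scheme (which in practice would be an asymmetric signature so the target never holds a signing key).

```python
import hmac, hashlib

# In the real design this key never leaves the secure environment.
SIGNING_KEY = b"demo key held only inside the secure environment"

def verify_bytecode(bytecode: bytes) -> None:
    """Stub for the real eBPF verifier's safety analysis."""
    if not bytecode:
        raise ValueError("empty program")

def jit_compile(bytecode: bytes) -> bytes:
    """Stub for the real JIT; just tags the bytes here."""
    return b"native:" + bytecode

def secure_environment_build(bytecode: bytes) -> tuple[bytes, bytes]:
    """Off-box (or in-TEE) step: verify, JIT-compile, then sign.

    Note the signature covers the *native* code, matching the
    HVCI-style case described in the talk."""
    verify_bytecode(bytecode)
    native = jit_compile(bytecode)
    signature = hmac.new(SIGNING_KEY, native, hashlib.sha256).digest()
    return native, signature

def target_load(native: bytes, signature: bytes) -> bool:
    """On the target machine: accept only correctly signed native code."""
    expected = hmac.new(SIGNING_KEY, native, hashlib.sha256).digest()
    return hmac.compare_digest(expected, signature)
```

The point the sketch captures is just the split of responsibilities: verification, compilation, and signing happen in one trust domain, and the loader in another domain only does a signature check.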
It's done by performing computation in a hardware-based, attested, trusted execution environment, okay? You say, what's a trusted execution environment? Okay, well, it's an environment that provides assurance of data confidentiality, data integrity, and code integrity. That means people can't modify the data that the process is using. People can't extract and peek at the data that the process is using. And people can't change the code that's executing in the process, okay? And if the hardware, the TEE, can actually enforce all three of those, then you can call it a TEE. And if you're performing execution of something in there, then you can call that confidential computing, okay? So this is the definition that the CCC has, okay? Here's a picture, using the same color coding, of one of the variations of TEEs. This picture applies to things like SGX from Intel, TrustZone from ARM, and some others. I mean, there's things like RISC-V, where you can build your own, there's multiple compositions and so on. So this picture would apply to SGX, TrustZone, and things like that, okay? So at the top, you write a TEE library with some source code. You run it through a compiler toolchain and you sign the binary, and all of this is done in some secure environment. You then take that signed binary, and you have an application that calls into some TEE library to say, I'd like to load and talk to this thing, please, okay? It then gets loaded inside the TEE after the signature is verified, and so there's a TEE runtime, which in the SGX case is the hardware processor with microcode, and in the TrustZone case there's actually a little OS there; you'll see a diagram of that later on. It executes in there, and the two sides can communicate back and forth, and they can use shared memory.
You can say, gee, there's a lot of things in common here, which the color coding is hopefully able to help you see, and you can see the rich execution environment and the trusted execution environment, okay? And so, notably, in this style of TEE, the programs that run, the signed TEE libraries, are all passive, right? Which means you hook them up to some event, right? The event fires, it executes a bunch of stuff inside here, and then it returns. Sound familiar? Yeah, exactly. All right, so you can say, gee, just looking at the technology itself, there's a bunch of things that are actually consistent between them. Can you actually compose them for doing something interesting? Is there actually any use case for doing something interesting? So let's talk about this. Again, this is just one of the ways of constructing TEEs, the SGX and TrustZone style of stuff. So after this, I'm going to kind of rotate the REE/TEE boundary to make it look more like the previous picture. So, putting them together: eBPF is a cross-platform technology that can run sandboxed programs to extend a privileged system component. Well, a TEE is a privileged system component, okay? It may not run in the regular part of the main processor, so in that case it's sort of like running eBPF on a smartNIC, right? I can kind of, you know, offload it into the smartNIC and it runs over in this other processor. I want to offload it over into the TEE side of the processor, or a TEE processor, and that's similar to offloading a program into a smartNIC, okay? We know how to do that in BPF, right? And both of those two scenarios, both the design-time one where I may get stuff pre-signed, and the dynamic generation of code, more like the bpftrace case, can still apply. Okay, even in the TEE case, both of those scenarios apply, and you can say, oh, there's all these things you can do in both scenarios, okay?
For security projects, right, many people say, I want to write my security stuff in Rust, good idea; I want to write my security stuff to actually run inside of a TEE. There's a trend towards doing that, okay? And so all these same scenarios can still apply. So what would it look like if I composed these two pictures that I've shown you, the BPF picture and the TEE picture, okay? So at the top, you have an eBPF program that you compile into some BPF program bytecode, okay? You then submit that off to a secure environment. This could be off machine if it's the static case, or it could be on machine if it's the dynamic case, because even on machine, the key is not in the process. The key is inside a secure environment, which could be inside of a TEE chip, okay? So this is not as good as having the key off in a back room someplace, okay? But it's better than having the key accessible to various processes and stuff. And so I could potentially do online, on-machine signing by a key that's inside the TEE, okay? This is a possibility for the runtime case, okay? Dave, in this case, do the verifier and JIT also run in the enclave, or do they run outside the TEE — do they need to be in there or not? Notice I put "example" in the top right in the title there. In this example, I'm showing a case where yes, the verifier and JIT compiler would be running inside the TEE. Do you have to do it that way? You have to analyze the security properties, and if you don't do it that way, you have to convince yourself that you're not violating security properties. So it's easiest to depict in a diagram, like the one I showed the CCC TAC, if I show this happening inside the enclave, okay? Then it's easiest to convince people it's true. I don't know if that is necessary, but it is certainly sufficient. Yeah, I agree.
I think in this case here, you would really want the JIT to be in the enclave, because you want to be able to trust the JIT, right? Exactly. Okay, okay. And so you notice this box here matches what we started doing in Windows, right? We're not doing it with, you know, confidential computing or whatever, but this notion of putting the verifier, the JIT compiler, and the signing component inside a secure environment that's not in your normal kernel, okay? It's kind of the direction we already started going in Windows, and it's just consistent with that. It's just going the next step, right? And, you know, gosh, I'd like to do that in Linux too, by the way. Okay, so once it's inside there, then your normal application can load it, because now you have a signed binary, just like in the TEE slide, right? You have a signed binary because it's already gone through the JIT compiler, which is what I showed we're doing on Windows when you have HVCI, right? And so then you can use shared-memory API calls and it goes into and out of the enclave, which is just like calling between two things, okay? All right, so that's one example of composition. Okay, so what's going on here is you're doing runtime signing, which addresses the point that I was talking to Lawrence about in my comments back there, okay? That's one of the ways of putting them together. Here's another way. This is the older slide, maybe not the one that's on the share right now, but that's okay; the only change is the title changed to be more understandable. CVM, if you guys don't know what that is — if you look on the latest share, it says VM with Intel TDX or AMD SEV-SNP, okay? So I talked previously about SGX and TDX and so on — sorry, SGX and TrustZone and so on. A different style of TEE is one where you can run a whole logical virtual machine inside that secure part of the processor. Different technology, right?
Because you have to have the ability to run threads and things like that, instead of being purely passive — call in, do stuff, and return, okay? So in this style, you can run an entire VM inside there, and again, Intel TDX and AMD SEV-SNP fit into this category. So in this category, the red is the TEE; it's the secure VM, or the confidential VM, okay? And the stuff that's outside the red box is in the rich execution environment, okay? Where you have your hypervisor, your host OS, and a TEE library and things like that, okay? And inside the guest OS, this could be just classic Linux with your existing kernel that already has BPF in it, right? It's just a VM, right? And so inside here, typically inside of a CVM, for doing communication securely, you have some attested communication that goes out. This is an example where app one is kind of like a management daemon that can be installing other things, right? This could be like a kube agent; it could be any other type of agent that might be installing binaries, like an app installer, whatever it is, okay? So in this example, it typically does attested communication out, and then it gets back maybe a secure workload. Here's some stuff to run once I know that you're running inside of a TEE. You can run this confidential workload with some secret data; go and do some computation that's protected from anybody outside, including whoever owns and hosts the machine, like in a cloud-hosted environment. And so what happens in these is it may get back either app data to be run, or even another app to install, okay? This is what happens, and how people... Oh, here we go. Now the example title is actually correct. So the app deployment actually goes and dynamically installs this app two, or maybe just the app data for an existing one. It says, run this stuff and protect it from anybody so they can't peek in it, okay? So this is what happens in TDX and SEV-SNP today.
So with that, you can pretty easily see that app one, instead of installing app two, could just say, I'd like to inject this eBPF program, please, right? I've probably got the technology to attest out, to say that this is running inside there, go and fetch stuff and maybe install it, right? So that part is pretty simple, because you don't have to change Linux at all. You kind of already have this as long as you write an application that does attestation, just like attestation would be done in any of the other scenarios for this type of VM, okay? Since it's like the least work, you kind of already get this one almost for free. If you're doing this whole example, you basically already have BPF now, whether you're using it or not, right? You already have BPF you can be using in that context for the scenarios you already have, and they're confidential in the sense that whoever's hosting the machine, right, doesn't have access to what that BPF program is doing, just like it doesn't have access to anything the VM is doing, for that matter, right? So this is the confidential VM case. Okay. Last example. I mentioned I was going to come back and show you OP-TEE, so here's OP-TEE and the way that it works. This is an ARM TrustZone example. So on an ARM TrustZone processor you have the normal OS; let's say the thing on the left, for the sake of argument, just to show that it actually isn't relevant, was Windows — doesn't matter, it could be Windows, Linux, or anything else. Actually, let's use the case where the rich OS is Linux, let's use that. The one on the right has a kernel, but it's not a full VM. OP-TEE is a very lightweight OS that's meant to run inside the TEE on a TrustZone processor. Remember I mentioned that TrustZone is passive, right? You call into it, it does work, and it returns, right? It's not like a full OS.
So this is a very lightweight OS that can manage pseudo-processes and dispatch remote procedure calls between the two sides, okay? It can't create threads and things like that, and it doesn't have a networking stack or anything, but it's basically a mini OS. So that's what OP-TEE is. And the way this one works is that in OP-TEE you can install multiple of what they call trusted apps, right? We might maybe call those libraries. You can install multiple trusted apps inside the TEE on top of OP-TEE. And the way that works is you have trusted application one, which is like your management app, that attests outbound, maybe gets some other stuff and installs it, and it looks a lot like the TDX/SEV-SNP picture, okay? So what comes back in is maybe, I'd like to install TA2 on this machine or this device, or maybe I want to install just some app data that's used by a TA2 that's already there. And so the same thing can happen there. In theory, right — this is the part that's purely slideware, right? So today, OP-TEE is a miniature, very lightweight, secure OS, okay? It doesn't support BPF today, but as slideware, it could, right? We put BPF into lots of other things, right? There are other BPF runtimes and so on. If we put a BPF runtime into OP-TEE, then you could even do this on an ARM processor, on a Cortex-A class device, for example. That'd be pretty easy, right? Now, we'd have to work with the OP-TEE community, but it's absolutely possible to do this, right? And it works just like the other slides. This is the part where I said nobody's working on this; this one's slideware, but architecture-wise, there's no reason it wouldn't work. Okay. Last part here, big topic: one of the main things the CCC works on is the whole topic of attestation in the context of a confidential computing environment, right? Because attestation is a requirement to be part of confidential computing, right?
You're not only running in a TEE, you have to be able to attest that you're running in a TEE, right? You have to be believable that you're actually running in a TEE for the use cases, right? And so there are three ways to use BPF where it interacts with attestation, on very different axes, okay? So the first axis is to say, I'd like to use attestation in order to decide how and whether to deploy an eBPF program, okay? This is the case that I walked through in the previous picture, where you attest outbound and you're then handed — here's your eBPF program to install — and then I install that. And I only want to install the BPF program into something that can attest, okay? So that's sort of the first category of how attestation and BPF could be used together: your BPF orchestration system, whether it's, you know, Bumblebee or L3AF or whatever you have internally, could make use of attestation to deploy stuff into a TEE. The second category, completely orthogonal to that, is: I would like to write eBPF extensions to existing communication that's attested, okay? So I've got some attested TLS session going on, and I would like to install eBPF programs underneath, to maybe monitor that — observability, whatever it is, okay? And so here I'm trying to deploy code into something that's already been attested, right? The communication may already be live when I install the BPF program. Does that or does it not invalidate the attestation of that session? Because I've just changed the code that was attested when that session was established. This is a lot of what the Attestation SIG in the CCC talks about: how often do I have to attest? If I attest a session, is that good for the lifetime of the session? What if there are code changes, right? BPF is something that can cause code changes to your TCB after you've attested, right?
And so that means that if I attest at, say, boot time, then that is woefully insufficient if I can install BPF programs post-boot, right? So they have to be aware of that when designing attestation mechanisms for attested communication, okay? They either have to say, yeah, there's some time window, right? It was attested there, and that's good for the next, you know, 10 minutes or something like that. And the code may have changed since then, and I'll figure it out after 10 minutes when I reattest, okay? So that's the style of discussion that happens in that SIG. The third category is also completely orthogonal, which is to say, I would like to do some attestation; I'd like to write an attestation algorithm in a BPF program, right? Just like you can write a congestion control algorithm, right? You can run a firewall, you can run whatever. I would like to write an attestation policy using a BPF program. Okay, so the previous one was attestation for BPF; this is BPF for attestation, right? And so I could write a BPF program that somehow takes existing traffic and only allows it to pass up through a particular layer if something vouches that it's been attested, right? I can insert that at a particular layer. Or I could check that in APIs, and you'd say this kind of overlaps with maybe what a gatekeeper does, right? Or — in attestation there's a verifier, which is this concept of the thing that you pass your attestation evidence to; it decides whether you're good, it gives back an answer, and then you can use that answer in authorization checks. Verifiers often, but not always, run on some server someplace, okay? They could be running on box, but for scalability, people like to run them on a separate server somewhere. And so you could say, well, I would want to allow BPF programs to run inside the verifier, to, say, extend the verifier or observe the verifier in some way, okay?
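The reattestation question the SIG discusses — an attestation result is good for some window, and any post-attestation change to the attested code (such as loading a BPF program into the TCB) should force a reattestation — can be captured in a tiny policy sketch. The window length, the class shape, and the idea that a BPF load flips a flag are all assumptions made for illustration; nothing here is a CCC-specified mechanism.

```python
class AttestedSession:
    """Toy freshness policy for an attested session."""

    WINDOW_SECONDS = 600  # "good for the next 10 minutes or something"

    def __init__(self, attested_at: float):
        self.attested_at = attested_at
        self.code_changed = False

    def on_bpf_load(self) -> None:
        """A BPF program changed the TCB after the attestation was taken."""
        self.code_changed = True

    def needs_reattestation(self, now: float) -> bool:
        """Reattest if the window expired OR the attested code changed."""
        expired = (now - self.attested_at) > self.WINDOW_SECONDS
        return expired or self.code_changed
```

The point is the second condition: a time window alone is not enough once something like BPF can change the attested code mid-session.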
So all of these are very different use cases for saying, I want there to be some interaction between eBPF and confidential computing: either to use eBPF as part of confidential computing, or to use confidential computing as part of BPF, or whatever, okay? So my takeaway is, when I first started being on both committees, I thought that they were completely orthogonal and there wasn't really any significant overlap. This is my conclusion now. Some of those are immediate-term, like the case I mentioned where you can deploy BPF programs into a CVM that runs Linux right now, okay? No difference, okay? Through to things that are much more speculative, like, hey, should we put a BPF runtime into OP-TEE, right? And everywhere in between. So that is the end. Now you've seen the same presentation, or a variation of it, that I presented to the CCC. Happy to take any questions. You're all stunned. You're all ready for lunch. I'm curious what the feedback from the CCC was. Did they see an immediate use case they would love to adapt eBPF to? What was the discussion there? I think overall the people that were in the meeting were not as familiar with BPF, and so they were in the "gee, thanks for the information, need to think about this" mode, right? It was more of an education thing, which was also great, right? The fact that we could come from, you know, the BPF community and educate the CCC community was a great thing, and I think within attestation they said, yeah, we should talk about these things, because I mentioned, you know, how long is an attested session good for before you have to reattest it, right? If there could be code changes underneath it. That's something they said, yeah, we'll take on as a discussion item, that they hadn't really spent significant time talking about.
They started on it, but they didn't realize that BPF could change, you know, your code, and the people that were working on that hadn't really considered that aspect, and so they need to take that on, and now they're kind of talking about it. So the interest, in terms of immediate stuff, was mostly in how do I deal with the attested communication, because there's a whole SIG for that right now. How is this different from any other JIT — Java? Like, why were they surprised that the program can generate code? Because confidential computing grew up with the notion of signed binaries and hardware-based enforcement, rather than the notion of dynamic code. Going back to the style that's not the CVM case — I'll just use this one here. In this case right here, you don't have dynamic code, right? In the CVM case you do have dynamic code; in this case you don't, classically, right? And so this is the style that — you know, SGX has been around for a decade or something like that, and so the most classic one, the one that everybody has thought through, starts from this assumption here, and the CVM style is the new trend, right? TDX is new, SEV-SNP is new, they're like less than two years old, right? And so all the deep thinking hasn't been done on this part; all the deep thinking has been done on that part, and so that comes with a bias in terms of blind spots, right? But in this new model, could Java run inside the trusted environment? Yeah, and they would say you should never run an interpreter inside this one here, because it doesn't have the security properties; it's not safe. The security people will tell you that any time you put any interpreter of any sort inside of a secure execution environment, you cannot prove that you're not susceptible to side-channel attacks. Right now the evidence is that you are, right? But certainly the claim is there's no proof that you're not, right?
So they said it's extremely dangerous to put any interpreter into any secure environment, period. And so if we put Java in there, they'd say, please don't do that, you're insecure — or at least you can't prove that you're secure, you can't prove you're not attackable. How is BPF different? How are you going to convince them? That's a great question. But I would say, if I go back to this picture here, right, really it goes back to what we were saying before. If you talk about the static case, where you can take a BPF program and sign it in the back room, now it fits into their existing model. It's only the runtime cases, like if I want to run bpftrace inside there, that get them to worry about stuff. So if I just want to use BPF for the static case, because I want to deploy my layer 4 load balancer written in a BPF program, or my NAT or whatever, they'll say, yeah, that works great. So first, thanks for the talk. It certainly gave me a lot to think about. That's what they said too. So I have a more generic question. I stopped following this stuff back 10 years ago, with SGX and Haven and things like that. So I'm wondering, does it look like one approach — either the enclave SGX approach or the CVM approach, TDX or whatever it's called — will prevail? Is one more useful than the other? Or does it seem like both would be relevant? So if I understand your question, you're asking between these two models, between this style of picture and that style of picture, is one more useful than the other. I would say the trade-offs are: on the plus side, this one here has a smaller TCB. There's a lower attack surface area. There are fewer lines of code in this style of stuff inside the red box here. The red box is as small as possible, which means you can get a higher confidence in the level of security in it. It's easier to analyze because the set of code is much, much smaller, like by two orders of magnitude or something.
So this one has the small TCB, which lets it scale down to small devices, and it's easier to security-vet; there is an argument that this one is actually more secure, because you can actually reason about it and come to a belief that it's more secure. The CVM one, on the other hand, is far easier to incrementally deploy. In the enclave style here, I have to rewrite my apps to use this model. It'd be similar to saying, take an arbitrary app, take half of it, and write it in a BPF program — you're going to have to rewrite some of your logic. The CVM one runs Linux, and I can run my app there with no changes. That's the part that makes this one super attractive, and why the world is moving more towards it as the dominant case: people just want to lift and shift, and you can use this one to lift and shift. Take your existing app and run it with no changes — very attractive. And so very high security conscious things, like maybe the critical infrastructure case, some defense department in whatever your country is, might be using the enclave one, whereas the CVM one is for the masses that want to do stuff for financial and regular business purposes, and they move to this model because it's so easy. This one is the easy on-ramp to get confidential computing in the first place, and if you've got plenty of spare cash to rewrite your apps and want the ultra-secure thing because you care about nation-state attackers, then you go to the enclave model. Thank you very much. Super interesting.