All right, so let's talk a little bit about service mesh and eBPF. In general, when we talk about a service mesh and the things it provides, the abstractions it creates, we talk about observability, about security, about routing, L4 and, most importantly, L7. Things like path-based HTTP routing, canary rollouts, RBAC rules, metrics, logs, and so on. When we talk about eBPF, Thomas did a great intro, so I'll just skim through this. Basically, it's JavaScript for the kernel: the ability to run safe, flexible, and fast JIT-compiled programs in the kernel. Why are we looking at eBPF and service mesh? Well, a service mesh does L7 processing, and that has a cost. We want to see if we can use eBPF to alleviate that cost.

Now let's talk a little bit about whether we can actually do that. Can we use eBPF and service mesh together and gain some benefits? The answer is definitely a maybe, and in the next slides I'm going to talk a little bit about use cases and explore how eBPF can help with the goals we just discussed.

One thing that's important to understand (and excuse me for talking fast, ten minutes is a bit short) is the eBPF runtime model. Because eBPF runs in the kernel, it's important that the kernel knows when an eBPF program stops. It's not like a regular program; there's no thread that runs eBPF programs. eBPF runs when a certain subsystem in the kernel (and there are a bunch of such subsystems) invokes it explicitly, and for that to work safely, the kernel needs to know that the program will terminate. In other words, eBPF programs are not Turing complete, and that's a very important aspect. Otherwise, any eBPF program could hang and bring the kernel to a halt. We have a blog post covering some of this in more detail, so feel free to go check it out.

All right, so now here's an overview of what I'm going to talk about in the rest of these slides. eBPF can definitely help the mesh. It can mainly do stuff at the L4 level and help the kernel make decisions. As for L7, there are some problems, which is why we still need an Envoy. Let's go straight to the point.

So let's talk a little bit about security, starting with L4 policy. With eBPF, you can have programs run very early in the network stack, and you can definitely add benefits there. Blocking by IP (and, if you're in Kubernetes, by pod labels using network policy), creating logs, creating events, DDoS mitigation: these are all great use cases for eBPF. Do take into account, though, that because eBPF executes so early, if you use some of those features, some of your existing monitoring tools won't see those packets; eBPF can take care of them before they ever reach the rest of the network stack.

Now, L7 policy is a bit harder. Usually when we talk about L7 policy, we need some way to derive the client's identity, and we have some policy rules that we run against that identity. For example, we can talk about an mTLS identity, or a JWT identity carried in a request header. Think about something like a JWT, where we need to parse the token and verify it against a public key. Those are not very eBPF-native operations; they need more Turing-complete machinery. Or say you have a list of policy rules, for example, user X can do a POST request on resource path /foo. Running through a dynamic list of rules like that is also not something that's very eBPF-native.
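To make that concrete, here's roughly the kind of machinery a JWT check plus a rule lookup involves, sketched in user-space Go. Nothing here is from the talk or from any particular mesh; the types, rule fields, and function names are made up for illustration. The point is just that it's loops, crypto, and dynamic data structures, exactly the Turing-complete machinery that is awkward under the eBPF execution model described above and straightforward in a proxy.

```go
package main

import (
	"crypto"
	"crypto/rsa"
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"fmt"
	"strings"
)

// Rule is a hypothetical policy entry: "subject X may do METHOD on PATH".
type Rule struct {
	Subject, Method, Path string
}

// verifyJWT checks an RS256-signed token against a public key and returns its claims.
func verifyJWT(token string, key *rsa.PublicKey) (map[string]any, error) {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return nil, fmt.Errorf("malformed token")
	}
	sig, err := base64.RawURLEncoding.DecodeString(parts[2])
	if err != nil {
		return nil, err
	}
	// RS256 is an RSA PKCS#1 v1.5 signature over SHA-256("header.payload").
	digest := sha256.Sum256([]byte(parts[0] + "." + parts[1]))
	if err := rsa.VerifyPKCS1v15(key, crypto.SHA256, digest[:], sig); err != nil {
		return nil, fmt.Errorf("bad signature: %w", err)
	}
	payload, err := base64.RawURLEncoding.DecodeString(parts[1])
	if err != nil {
		return nil, err
	}
	claims := map[string]any{}
	if err := json.Unmarshal(payload, &claims); err != nil {
		return nil, err
	}
	return claims, nil
}

// allowed walks a dynamic rule list, the "user X can POST on /foo" style check.
func allowed(rules []Rule, subject, method, path string) bool {
	for _, r := range rules {
		if r.Subject == subject && r.Method == method && r.Path == path {
			return true
		}
	}
	return false
}

func main() {
	rules := []Rule{{Subject: "user-x", Method: "POST", Path: "/foo"}}
	fmt.Println(allowed(rules, "user-x", "POST", "/foo")) // true
	fmt.Println(allowed(rules, "user-y", "GET", "/foo"))  // false
}
```

And that's only the identity and rule-matching piece of L7 security.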
Or something like external auth, where you have to pause the request, send it to an external service, get a response, and then continue the request. Again, not something very eBPF-native. It requires complex state maintenance that is currently very hard to do. I'm not saying impossible (never say never), but it's not the natural fit when we talk about eBPF. So basically, all of this L7 security policy is not a natural fit for the execution model we just discussed, and it's better done in a sidecar. And again, this is 2022; in 2024 my answer might be different, but we work with what we have today.

Let's talk a little bit about the data path and load balancing. eBPF can definitely help here. Things like XDP, and things like sockmap, the sidecar acceleration that Thomas mentioned earlier, can definitely bring some performance benefits to the mesh: reduce latency, increase throughput. We've also measured it and seen the improvements. But once we talk about L7 load balancing, that again gets a bit complex. Think about an HTTP route table: you can route according to a path prefix, route according to a path regex, route based on a header, do canary routing (a certain percentage of the time send a request here, a certain percentage send it there). All of this is not exactly the best fit for eBPF, again because of the complexity and the state management. Think about complex load-balancing algorithms like least-request, where you need to track how many active requests each upstream has and make a routing decision based on that. You need to maintain a data structure, and with eBPF, because of the runtime model, the kernel needs to be aware of your data structures; you need your data structure to be supported by the kernel. So none of this is exactly easy, especially with HTTP/2. With HTTP/2, the downstream and the upstream have different state to manage: different streams, and headers that are compressed and managed in the context of the same TCP connection. So you have very complex state management that is separate for the downstream and the upstream, and you can't just copy bytes from the downstream to the upstream; at a minimum, you have to edit some of the metadata, the HTTP/2 frame fields.

All right, moving on to metrics and observability. Here eBPF is actually a great fit for the service mesh. Because eBPF sits at the kernel level, it can observe all your workloads, whether they're in the mesh or not; it doesn't matter. You get a good amount of L4 observability metrics. As for L7, it's harder but conceivable, because you don't need to manage any state; you just need to read the bytes on the wire as they arrive and parse them. I've seen some examples on the internet, so it's definitely possible, just a bit harder than regular L4.

All right, let's talk a little bit about resiliency, and specifically L7 resiliency. Think about how complicated a retry is. When the proxy does it, it has to send a request, get a response, see that the response is a 500, schedule an exponential backoff, and then retry the request asynchronously, all the while holding on to the original request and continuing to handle other request streams from the same connection. All of this complex state management is very hard to do with eBPF today.
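To give a sense of how much state even one retry carries, here's a rough user-space sketch in Go. The function and its parameters are hypothetical, not from the talk and not Envoy's actual retry logic: buffer the request body so it can be replayed, send, check for a 5xx, back off exponentially, and try again.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"time"
)

// retryingDo sends req up to maxAttempts times, retrying on 5xx responses with
// exponential backoff. To retry at all, the caller has to hold on to the original
// request body so every attempt can resend it.
func retryingDo(client *http.Client, req *http.Request, maxAttempts int) (*http.Response, error) {
	var body []byte
	if req.Body != nil {
		var err error
		body, err = io.ReadAll(req.Body) // buffer the body for replay
		req.Body.Close()
		if err != nil {
			return nil, err
		}
	}

	backoff := 100 * time.Millisecond
	var lastErr error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		if attempt > 0 {
			time.Sleep(backoff) // exponential backoff between attempts
			backoff *= 2
		}
		req.Body = io.NopCloser(bytes.NewReader(body)) // replay the buffered body
		resp, err := client.Do(req)
		if err != nil {
			lastErr = err
			continue
		}
		if resp.StatusCode < 500 {
			return resp, nil // success, or a non-retryable client error
		}
		resp.Body.Close() // drop the 5xx response and try again
		lastErr = fmt.Errorf("upstream returned %d", resp.StatusCode)
	}
	return nil, fmt.Errorf("all %d attempts failed: %w", maxAttempts, lastErr)
}

func main() {
	// Hypothetical usage against an upstream of your choice.
	req, _ := http.NewRequest("GET", "http://localhost:8080/foo", nil)
	resp, err := retryingDo(http.DefaultClient, req, 3)
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```

A real proxy does this per stream, concurrently, while still multiplexing other streams on the same connections, which is exactly the bookkeeping that's hard to express under the eBPF runtime model.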
The same goes for things like passive health checks (if you get a 500, take that host out of load balancing) and things like circuit breakers (if a certain host has a high number of outstanding requests, stop sending to it). These are hard to do with eBPF today. Again, a lot of state management. I'm not saying it's impossible; it's just a lot harder than doing it in user mode.

All right, so to summarize. L4 and eBPF is a very natural fit today. L7, even if it's possible (and I never say never), is not easy, especially with HTTP/2, which has complex state management, and even more so with HTTP/3, where everything is encrypted and you don't really have a choice about it. Additionally, when you think about L7 latency, think about where the bottleneck is. What eBPF is good at is saving context switches and saving buffer copies between kernel and user mode. That's not really our problem with L7. Our main bottleneck with L7 is CPU. When you think about request manipulation, adding headers, removing headers, decoding and encoding HTTP, stream management, all of that has a similar cost no matter where you run it. So the benefits of eBPF don't really shine when we talk about L7 processing, where the main bottleneck is actually the CPU. If you want your mesh to have the L7 smarts, you're going to pay that cost. And what we see with our customers is that they have their SLA, and as long as we're below the SLA, they want more features; they want more value out of their mesh.

So that's kind of my summary. And again, all of this is true for 2022; for 2024 or 2025, don't hold me to anything here, everything can change. And that's it. Thank you. Yuval Kohavi from Solo.io, we're here to help you with your service mesh. Thank you very much.