All right, hello. Thanks for coming. We're here to talk to you today about detecting suspicious data patterns in encrypted traffic with eBPF and kTLS. And I just want to say I'm excited to talk to everyone about this today, because I have been working on eBPF and TLS for close to 10 years now. So seeing all the pieces come together is exciting. My name is John Fastabend. I work on Tetragon. I'm the lead there. I'm also a Cilium maintainer, also a kernel maintainer. And I work at Isovalent. And Natalia is with me as well. She is also at Isovalent, and she's the product lead for, I think, everything we're going to be talking about today.

So here's a quick agenda. We're going to talk about Tetragon, just to give you a quick brief about our background, how we approach problems, our lens for looking at this. And then we're going to dive into L7 observability and security and use cases. So let's get going, because we have 34 minutes now.

So first of all, what is Tetragon? This talk isn't about Tetragon; we did have a talk on Monday, so go back to CiliumCon if you really want a deep dive on Tetragon. But I think it helps to understand what kind of problems we think about and how we go about addressing them. So Tetragon is a security observability and runtime enforcement agent. It's written in BPF, with a really heavy focus on overhead. We try to keep that overhead really low. We do that with BPF in the kernel. We take a lot of the business logic that is usually in user space and move it down into the kernel. So all of your filters are down there. We keep a state of all the execs that have happened in the system, and all of the sockets. So we have a map from sockets to executions. We have a map from DNS to sockets and back to the execution. And we can build kind of a map of what's going on in your platform, for observability reasons first and foremost. And then we follow that up with enforcement.
So you can say things like: if this pod, or even this binary, is talking to another binary, that's allowed. If it's doing a DNS query out to a new DNS server, that's not allowed. So you can build these fine-grained, binary-level policies. Thomas Graf actually gave a talk Monday about this, so go to the CiliumCon recordings if you want to know more about that. But that's really our focus: security, deep observability, low overhead. So that's where we're coming from, both from the networking side and from the runtime side. So think about file integrity monitoring. Think about building signatures to match syscalls and so on. So that's kind of our lens.

We want to make sure we do that transparently. We don't want to get into the application world, where you have to modify your applications to get this data. We want to have a single agent. You deploy it, and then all of a sudden you get this visibility. You get this really good enforcement property as well. And we're not just at the syscall level. We're getting down into DNS requests, HTTP requests, TLS, which we'll talk about, what files are being opened: deep inspection, down in the kernel, mapped back to binaries, user space, Kubernetes labels. That's the Tetragon viewpoint of the world.

And now we want to talk about layer seven. So one thing that we do at Isovalent, and I'm sure this isn't too foreign to most people, because a lot of different people do this kind of observability, is you get a nice dashboard like this. You've got your HTTP observability. You can build a service map. You can say this service is talking to this service. You can get latency histograms. You can get how many bytes are going to this URI or that URI. So a lot of different platforms build this HTTP observability and give you this visibility. And in Tetragon, we do something very similar. And we do it at the binary level, so you can pin it all the way back to the binaries inside the containers.
So this is great. We endorse this kind of view of the world. You can even do things like this, where you get a process execution trace. You see that inside a pod there, somebody executed a node server.js. Who are they talking to? They reached out to the Twitter API, apparently, a few other things, Elasticsearch, and maybe a malicious bucket that you want to detect in your system. So this is all really great. Like I said, we provide this along with a lot of other folks. But there's a problem with this when you go to zero trust and you start encrypting everything. Your observability tools cannot look inside a TLS session directly. And so that's what we want to talk about today.

If you're going to build these tools, there's a bunch of different places you could conceivably hook into the system. And this is a picture of that. In the upper left, you have the pod. And you can see there's a couple of options up there that you could hook. Library instrumentation is an option. You can instrument your library with observability hooks and extract data. Uprobes, we put up there. That's BPF's ability to hook user space. Very similar to a library: you hook into the user side of the program, your BPF program runs, and you get some observability out. Then we have sidecars, still inside the pod. Basically what that means is you take traffic from the containers and route it over to the sidecar. The sidecar can inspect traffic. So these are all hooks for layer seven observability.

Another option that Cilium provides is a host proxy. So take that proxy, put it on the host, bring it outside of the pod. Then when traffic is sent from the pod to the network, you route it through that proxy, either through eBPF or, if you have a more legacy or older system, you might use iptables as well. Then finally, if you're inside of a virtual machine, there's even hardware hooks for this.
So if you're a big cloud provider, you might have a hook on the DPU or the FPGA on the network side. So traffic comes out of the system and you get hooked there. That's an interesting one, if you have the ability to play with hardware. In most of our environments, we don't have the ability to drop FPGAs onto the system.

But we put our Tetragon lens back on. Because we are a security tool, we don't actually trust the pod. So we get rid of a whole bunch of observability points that could work, but don't work inside of our trust model. So if you look up there, clearly, we can't hook the library anymore because we don't trust the application. That's the thing we're trying to observe, the thing we're trying to secure from a malicious user. If that thing is the malicious, untrusted thing, we can't trust it to give us data. And then if you also look at the sidecar model inside the pod, from a security standpoint, we don't want to trust that entire pod. That's our boundary. We don't want to inject something into the pod and then have to trust it.

So we rule those out, which really leaves us with two hook points that are viable: either the host proxy on the left, which is an agent that we redirect to, or something we're going to call sockmap L7 BPF. And what the L7 BPF is, is a way for BPF to hook the socket directly. So any time an application sends data, we get a BPF event with the data. So you do a send, you do a sendmsg, even a recvmsg, for example. We'll get that data in that BPF program. What's really interesting about that is this program cannot be controlled by the pod at all. Even though we attach to the sockets, the user of that socket, the owner in the pod, cannot detach or somehow avoid the hook. We will get the event. So let's just put a BPF hook at the socket layer and read all the data. Those are our viable hooks.

You might have noticed I put TC BPF there. TC BPF is a way to hook in software at the network layer. It's a great hook.
We use it in Cilium a lot. Unfortunately, it's after the TCP stack. So if your mindset is L7, you really need to be at the socket layer. Because if you're not, you have to worry about TCP retransmits, which will throw off any parser, and about putting the packets back together. You end up writing an entire new TCP stack in BPF. We don't want to do that.

So what's the flow? Let's think about a curl to ebpf.io here. What happens? First, there's a DNS lookup. Then we connect. Then we do a sendmsg. If we're using the host proxy here, what will happen is that packet will be sent out of the pod. It'll be redirected to the host proxy. You'll get your event, and then the host proxy will forward the packet along. So it's kind of a bump in the wire on the host to do your observability. All good. Again, for HTTP, this is how the model works.

How does it work for the sockmap stuff? Very similar. You still need to do your DNS. You do a connect to the IP address, and then you do your send, same as before. Except that right after you do the send, we get the BPF hook. We run our BPF program over the data. We parse your HTTP headers, your Kafka, whatever. We're looking at TLS handshakes, and we post that event to our observability stack, and then send the data along. The advantage here is we don't have to do any redirects. We're not even in the routing stack at all. We're purely a BPF hook directly at the data source. Again, this works really well for HTTP and unencrypted traffic.

So as I mentioned before, this is really great. But what happens when we want to encrypt stuff for zero trust, service mesh, all the words that we keep hearing about? Everybody says to encrypt. So we do that, and then our dashboards do this, right? Everything's encrypted, so I don't know anything about HTTP anymore, because it's all wrapped in TLS. So everything's broken, and we need to find a solution. The first one, which Cilium supports as well, is the same flow again, curl, but this time the curl is to HTTPS.
It sends it out, it's encrypted, no data. So that's the flow again. It's broken, so what do we do? BPF doesn't really save us here; same idea. With curl, because the encryption happens in user space before they even send the message, it's the same problem. So we need to think up another solution.

Let's talk a little bit about what those are: what are our options, how do we resolve the encrypted data problem here? One option is to not encrypt. Because of our security model, unfortunately, that's not an option. The other one I've seen used quite a bit is uprobes. You could in fact stick a uprobe in front of the OpenSSL library to see the encrypt process. So basically in that mode, you take your uprobe and you attach it somewhere in the SSL library, whether it's OpenSSL or BoringSSL or any of these SSL libraries. You hook in before they do the encrypt, and you get your BPF hook. The only trouble with that, from a Tetragon standpoint and our viewpoint of this as a security problem, is that that memory is still owned by user space. We're at the mercy of, A, the user space calling our library, and B, a malicious user who could in fact change the data after we've read it. So we have a security problem from using uprobes. Otherwise, it would have been a great solution, but unfortunately, we cannot use it. So we really have TLS termination, which is the Cilium option, and then we're going to talk about kTLS, which is an option that we've been investigating.

So it's back down to the same flow again that we've talked about, curl, but this time to HTTPS. Uprobes, we've thrown that out. So that's basically before the encrypt, like I said. Then you do the sendmsg and you're good to go. So let's see here. Sorry, I got my slides slightly mixed up. There we go. So this is the next one, about the termination. This is the one I wanted. Basically, you do the send, you do the redirect. The redirect goes to the host proxy. The host proxy does the TLS termination, because you're sending it encrypted.
You then terminate at the host proxy, observe the data because now you have it unencrypted, and then you re-encrypt it and send it along to its destination. The complex part about this is really what this slide shows; it's from the Cilium docs. Now you need to inject your certs into the encryption. Because you're going to be decrypting, you need the application to do the encryption in a way that you can decrypt. If it just encrypted arbitrarily, you wouldn't be able to do that. So you need the certificate management process shown here. I'm not going to dive into all the nitty-gritty details of this, but the basic idea is you inject your cert, the application uses that, you use that to decrypt, and then on the other side, you re-encrypt and send it out. So you basically interject yourself into that flow.

So how do we do this with BPF? With BPF, we don't do this cert interjection. Instead, what we do is use something called kTLS. kTLS was designed as a performance optimization for TLS. The original use case, and it's in production and widely used, was to avoid doing an extra copy every time I need to encrypt data off of a disk. So you can imagine you have a disk image, a video perhaps, or something like this. There we go. I'm back. Netflix has talked about this in a couple of papers and talks. Basically, the idea is you have a video or something in a file, and you don't want to copy it to user space, encrypt it, and then send it back in. You want to be able to send it directly from the disk to the network. By moving the crypto into the kernel, you can do that process all in one operation: copy from disk, crypto, send, without ever going to user space. So that was the origin of kTLS. That's why we added it to Linux a handful of years back. That's why it's been in FreeBSD for years.
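To make the disk-to-network path concrete, here is a minimal Python sketch of the plain sendfile pattern, without any TLS. It is only an illustration: `send_file_from_disk` is a hypothetical helper name, and the sketch relies on Python's `socket.sendfile`, which uses the kernel's `sendfile` where the platform offers it and falls back to ordinary `send` calls otherwise. kTLS is what allows this kind of path to stay available once the stream has to be encrypted, since the encryption step then also happens in the kernel.

```python
import socket


def send_file_from_disk(sock: socket.socket, path: str) -> int:
    """Send a whole file over a connected stream socket.

    Hypothetical helper for illustration. socket.sendfile() asks the
    kernel to move the file contents to the socket where it can, so the
    bytes do not have to pass through a user-space buffer.
    Returns the number of bytes sent.
    """
    with open(path, "rb") as f:
        return sock.sendfile(f)
```

A caller would pass an already connected TCP socket and the path of the file to serve.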
If you're curious, the breakdown is to do the handshake and all the control plane in user space and just do the data path operations in the kernel. You can see performance wins from doing that. What we did after that is add a hook so that BPF in the kernel can run in front of that encryption. So when you send the data, you're sending the plain text, the BPF program runs, and then it's encrypted after the fact. That lets us read the data in plain text, but the application is allowed to keep the keys and do the entire control flow, and we don't have to interject ourselves on the certificate side.

This is the flow here, labeled out. The one piece needed to enable it is that the application does need to do a set socket option to turn it on. We've done work in OpenSSL and a few other libraries to add support for this so that it's mostly transparent to users. They don't need to opt in; it can just be used if the kernel supports it. The main thing is that you would need OpenSSL 3, some of the newer libraries, to have this, but it is available in production libraries already. So that's the basic idea. By doing that, we can get our dashboards back, and that's the big win.

Then if you look at what this looks like, we have a CLI here that we show for a lot of things when you do the crypto. Basically, the process gets created, it does the connect, we see the TLS. This is actually important, because if you want to make sure that you're enforcing this behavior, we can actually observe whether or not the user has set the socket option to use kTLS. Then we can do things like alert, or we can drop the traffic in a very strict environment, or at least alert folks that this is not being used. So we're not going to see the next line, which is the HTTP. So this was basically some code we added to Tetragon to do the HTTP parsing in the kernel, do the TLS handshake parsing, and then print it all out.
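The socket option mentioned above is, on Linux, the `TCP_ULP` option set to the value "tls". What follows is a minimal Python sketch of just that first step, under stated assumptions: a Linux kernel with the tls module available, and the constant 31 taken from `linux/tcp.h` because Python's `socket` module may not export `TCP_ULP`. `try_attach_tls_ulp` is a hypothetical helper name. In practice the TLS library does this itself after the handshake and then hands the session keys to the kernel with a further `setsockopt` at the `SOL_TLS` level, which is not shown here.

```python
import socket

# TCP_ULP is 31 in linux/tcp.h. Python may not export the name, so fall
# back to the numeric value (an assumption that only holds on Linux).
TCP_ULP = getattr(socket, "TCP_ULP", 31)


def try_attach_tls_ulp(sock: socket.socket):
    """Try to attach the kernel TLS upper-layer protocol to a TCP socket.

    Hypothetical helper for illustration. Returns (True, None) on
    success, or (False, errno) when the kernel refuses, for example
    because the socket is not connected yet or kTLS is not available.
    """
    try:
        sock.setsockopt(socket.IPPROTO_TCP, TCP_ULP, b"tls")
    except OSError as exc:
        return False, exc.errno
    return True, None
```

Returning the errno instead of raising keeps the caller free to fall back to user-space TLS when the kernel path is not available.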
Again, we didn't have to manage any of the certificates, we didn't have to do any of that kind of stuff; we're directly in BPF at the send side. With that, I will let Natalia talk.

Yeah. So now that we've got the L7 visibility back via kTLS, what would be the end use cases here? How, for example, would an information security or SecOps team use this? I brought two examples, and both of them are very cool and interesting.

The first one is data exfiltration via token. So the main security challenge is this: let's imagine an enterprise. Many enterprises use a SaaS service to store data, like Datadog, Twilio or Sentry. In that enterprise, each team has its own account and an associated token. So how could you spot it if you have a malicious employee in a team who just creates their own account and token, and then starts to ship data to that account? With traditional network monitoring tools, you only see that there is encrypted traffic going to that SaaS service. You don't really have visibility into what token is used. So this is just a very simple example. You have the organization, you have the application team and the platform team, each team has its own account with a token, you have the SaaS service, and basically you only see that the traffic is encrypted over TLS. And then you have this malicious employee who created an account with a new token.

So one solution is: let's use kTLS and in-kernel HTTP parsing, and then you would have full visibility into the L7 network flow. You would have visibility into the HTTP headers and the body. So you can identify what the token was, and then, for example, you can compare it with the other tokens on the whitelist. So you could actually spot that there was a new account with a new token. And the dashboard here is actually the HTTP traffic in a Grafana dashboard, exported by Tetragon.
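The token check described above can be sketched in user space as follows. This is a minimal Python illustration of the idea, not Tetragon's in-kernel parser: the allowlist contents, the `api_key` query parameter, the Bearer header convention and all function names are assumptions made up for the example, since real SaaS services carry their tokens in different places.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical allowlist of the tokens the organization has issued.
ALLOWED_TOKENS = {"team-a-token", "team-b-token"}


def parse_request(raw: bytes):
    """Split a plaintext HTTP/1.x request into request line, headers, body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("latin-1").split("\r\n")
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers, body


def extract_token(target: str, headers: dict):
    """Look for a token in the URI query string, then in a Bearer header."""
    query = parse_qs(urlsplit(target).query)
    if "api_key" in query:  # assumed parameter name
        return query["api_key"][0]
    auth = headers.get("authorization", "")
    if auth.lower().startswith("bearer "):
        return auth[len("bearer "):].strip()
    return None


def is_suspicious(raw: bytes) -> bool:
    """Flag a request that carries a token which is not on the allowlist."""
    _method, target, _version, headers, _body = parse_request(raw)
    token = extract_token(target, headers)
    return token is not None and token not in ALLOWED_TOKENS
```

Requests without any token are deliberately not flagged in this sketch; a stricter policy could treat a missing token as suspicious too.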
The dashboard is, of course, Kubernetes identity-aware, so you could see the Kubernetes namespace or Kubernetes workload, and what the binary was that actually initiated that request. You could see the method. You would have visibility into the URI, so that's where the token would live. And then you could see, for example, the HTTP version, content length, response code, and so on. This is another screenshot from the dashboard. Here you could see all the URIs that were queried by a workload, and how many times. And you could break that down, for example, per workload or per binary.

The second example is detecting sensitive data patterns in the HTTP body. It's actually a real problem: many enterprises struggle with meeting compliance standards, for example, to not log sensitive data like credit card numbers, social security numbers, IDs. With kTLS and HTTP parsing, you would have full visibility into the L7 network flow here as well. So you could actually create signatures to detect sensitive data patterns or even payloads. As a very simple example, you would see the header here, and then here you would have visibility into the payload. It's just a very simple POST request to a payment service. So you could see the web server information. You could see, for example, the user agent information; it was a Go agent. And then you would have visibility into the payload as well. So in case there was a payment, you could see: okay, there was the email address, the credit card number, expiry date and CVV. So you could actually detect those, and you could even create signatures to detect more malicious payloads.

So that's all we wanted to talk about. I'll do a little bit of a wrap-up. So we talked about: what is Tetragon? An eBPF-based runtime security observability and enforcement project.
We talked about L7 observability from different views, architectures, and trust models. We talked about what happens if you introduce TLS, how it breaks the observability, and then how you could still have observability if you have encryption. And we mentioned some use cases. So if you're interested in a more in-depth demo, just come to our booth, the Cilium booth or the Isovalent booth, and we can show you. Or if you're interested in Tetragon, just check out the GitHub repository. It's open source. We have a documentation website. If you have any new ideas, create feature requests. People are actively working on it. We have many quick starts and user guides, so tell us about your user experience. And if you're interested in more on eBPF, we will have a documentary film starting at 6:15, so you could check it out as well. So with that, I will open up for Q&A, and thank you.

Yeah, so if you have any questions, just come to the mic. I can hear you. And if there are no questions, we can go to... Oh, there we go. Okay. Perfect.

Thank you very much. Just curious, you said the Grafana dashboards were exported from Tetragon. Are those available to us? The Grafana dashboards.

I don't think we have any... we have some in the docs. I think there's some actually in the repo, but we haven't really added them to the docs at this point. I have to go back and check. I mean, from the Tetragon project side, we just released 1.0 last week, and we got a lot of the quick start docs in. I think a few of the Grafana dashboards are still buried in the code somewhere. We'll try to get them into the docs pretty quickly. There's no reason not to have them visible. I think it's just a matter of making sure we get them in the docs and visible for people to pull down.

Perfect, thank you.

Yeah, no worries. Yeah.

Neat talk, really. I've never seen that before. With kTLS, I have to have a version of OpenSSL, or whatever's doing the encryption, that supports kTLS.
Does that mean that I am maintaining my own containers that have specific versions, or is that pretty widely supported?

Yeah, you know, if you have OpenSSL 3.0, then you have kTLS support. This was two or three years ago: we went through and added kTLS support to OpenSSL. And because it was a major feature, we can't get it in at OpenSSL 2.0, because it would require backporting a feature. But if you have OpenSSL 3, which will hopefully become more and more widely available over time, then you do have kTLS support. And that's actually independent of eBPF. We use it with eBPF, layering the technologies together to be able to look into data before it's been encrypted. But you can use kTLS just as a performance optimization, even without eBPF. And so it exists there.

Your applications will have to opt in with that socket option. So that was where we landed on that; we didn't want to just default everybody to kTLS in every case yet. So you really have to do that. The application will have to do this set socket option to turn it on. Once that happens, if you're an application using OpenSSL, there's nothing else you need to do. Underneath, the actual library will just handle the handshake and offload it all to the kernel. And you're good to go.

You know, there's a Golang SSL library. There's a PR open to do kTLS there. I believe, and if there's somebody who's really up to speed on what BoringSSL is doing, they can maybe correct me, but I believe there was something there. I believe Facebook has talked about using this as well. So there are definitely some rather big organizations that are using kTLS today in production.

Awesome. Thanks.

Cool. Any other questions? We still have a few moments left. Okay, cool. Yeah, if you guys want to find me or Natalia later, we can talk to you more if you're interested. We definitely have a demo too.
If you come by the booth, we can show you a demo if you want. Just come find us. I didn't do it here just because swapping screens around looked a little bit complicated. I've seen a few demos that tried to reach out to the web and then the web didn't exist. So, you know, come find us. We can show you a demo if you're interested. All right, thank you so much.