Hello, everyone, and welcome to our talk. We are super excited to walk you through the concept of auto-instrumentation of the metrics you collect from cloud infrastructure, cloud-native applications, and beyond.

There are three main goals behind this talk, and they were the motivation for doing it. First of all, recently we hear a lot about eBPF, and for a good reason, because it's quite powerful. So we want to explain how eBPF actually works and how it can be leveraged in cloud environments for monitoring purposes with Prometheus, since there might be high potential there. Secondly, technologies are often overhyped, so we need to be careful. We want to look with a pragmatic eye at how feasible this concept really is on modern clusters like Kubernetes — is it ready? Last but not least, we would love to share some practical tips and give a starting point for those who really want to start using eBPF for observability purposes right now. We hope it will get you started faster, and together we can make the eBPF space in the cloud more mature, especially for observability purposes.

But before that, let's start with a short introduction. I'm here with Harshita.

Hi, this is Harshita. I'm a software development engineer at Amazon, I was a Thanos mentee, and Bartek was my mentor.

And my name is Bartek; my full name is Bartłomiej, but you can call me Bartek. I'm a principal software engineer at Red Hat. I am a Prometheus maintainer, I also maintain the Thanos project, I'm a CNCF TAG Observability tech lead, and I'm also writing a book with O'Reilly called Efficient Go.

So as you can see, we are not necessarily eBPF experts, although we spent quite a lot of time debugging and programming eBPF in recent weeks. Technically we are observability experts looking into low-level eBPF programming as a way to improve observability in the cloud-native ecosystem. That's why in this talk we will be looking at this concept from a usability perspective, especially at how realistic it is to apply today in the Cloud Native Computing Foundation space.

The flow of the talk is the following. First, we will explain the RED method: what exactly it is and how it is useful. We will talk about the practical instrumentation options we have today for implementing the RED method, and the challenges in obtaining and maintaining them. We will talk about eBPF and how it can improve the situation. We'll show a quick demo and talk about practical adoption concerns and potential. Harshita, do you want to tell us more about the RED method?

There are several methods of monitoring when it comes to managing workloads in production on the cloud, for example, learning the health and status of the system so we can alert when something is not as expected. To do that, we need to look at the service from many perspectives, especially if it's a web service, and there are methodologies and patterns people invented that help us identify what things are good to monitor and what might be unnecessary. The RED method is one example. In the beginning, there were the golden signals that we can read about in the Google SRE book. There is the USE method by Brendan Gregg, and fairly recently the RED method was introduced, I think by Tom Wilkie.

So what is the RED method? It is essentially a method to monitor request-based services.
Rate is the number of requests per second your service is serving: how much traffic you have. Errors is the number of failed requests per second. Duration is the distribution of the amount of time each request takes: how long things are actually taking. For example, maybe traffic is fine, not too much, and there are no errors, but requests are taking too long.

So the RED method is pretty amazing. We can build automation around it. For example, we can build dashboards that show common views of the system health for one service. We can build SLO views, alerts and recording rules — all the Prometheus goodies. And if another service is also using the RED method, we can easily reuse those dashboards, alerts, recording rules and so on, and save resources. There is no need for magic templating or reproducing things from scratch for every application. All developers need to think about is: how do I satisfy the signals described by the RED method? The DevOps tooling on the side will then be compatible with it.

But how can we instrument an application in order to satisfy the RED method? Let's take the simplest example ever. We have a Prometheus server. It is a web server that serves some HTTP requests; for example, it responds to PromQL queries, but it also collects metrics and allows monitoring. So let's use Prometheus to collect metrics about itself, to show how we can monitor Prometheus in the RED method fashion. We could then reproduce the same pattern for any application other than Prometheus. Let's go.

As we mentioned, Prometheus collects metrics, but it is also a server, so it exposes some metrics about its own web serving. One of the most important ones is prometheus_http_requests_total, which is a counter of all requests by handler and status code. How was this done? Well, the fact that we can see this metric means the instrumentation was implemented by hand in the Prometheus code. In this graph, we can see that overall it gives us the rate of requests served by Prometheus, so it satisfies the rate item in our RED method. Great.

But that's not all of it. It also satisfies the errors side of RED, as it has a response code label, so we know whether a request was successful or failed. We can then calculate the rate of errors and the percentage of those, and satisfy the errors element of the RED method. What about duration? We can look into a histogram: Prometheus exposes prometheus_http_request_duration_seconds, which lets us know the latencies of those requests. And that is the last item of our RED method. So with just two metrics, we satisfy all items of the RED method. We can show that we have basic and essential monitoring enabled for Prometheus: we measure and collect enough metrics to use RED method dashboards, alerts, recording rules and so on.

So that's the RED method. Let's now talk about how we can implement it. I mentioned that the way Prometheus implements the RED method is kind of handcrafted, which means that Prometheus maintainers like Bartek wrote the code from scratch: they literally imported the Prometheus Go client, defined and registered the required metrics, and incremented the counter everywhere a web handler returns a response.
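To make that concrete, here is a minimal sketch of what such handcrafted instrumentation roughly looks like with client_golang. This is our own simplified example, not the actual Prometheus source; metric and handler names are ours:

```go
package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// Handcrafted RED instrumentation: a counter for rate/errors and a
// histogram for duration, labeled by handler and status code.
var (
	httpRequestsTotal = prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Name: "http_requests_total",
			Help: "Total HTTP requests by handler and status code.",
		},
		[]string{"handler", "code"},
	)
	httpRequestDuration = prometheus.NewHistogramVec(
		prometheus.HistogramOpts{
			Name:    "http_request_duration_seconds",
			Help:    "HTTP request latencies in seconds.",
			Buckets: prometheus.DefBuckets,
		},
		[]string{"handler"},
	)
)

func main() {
	prometheus.MustRegister(httpRequestsTotal, httpRequestDuration)

	http.HandleFunc("/api/v1/query", func(w http.ResponseWriter, r *http.Request) {
		timer := prometheus.NewTimer(httpRequestDuration.WithLabelValues("/api/v1/query"))
		defer timer.ObserveDuration()

		// ... handle the request ...
		w.WriteHeader(http.StatusOK)
		httpRequestsTotal.WithLabelValues("/api/v1/query", "200").Inc()
	})

	http.Handle("/metrics", promhttp.Handler())
	http.ListenAndServe(":8080", nil)
}
```

That is roughly the pattern: every handler has to remember to increment the counter with the right labels, which is exactly where mistakes creep in.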
This works great, but it has its own limitations. There are a few challenges. Firstly, we have to implement it ourselves, but what if we don't have access to the code of the application? Maybe it's closed source; maybe we don't have enough knowledge, capacity, tooling or dependencies. And since you're implementing it from scratch, you can make mistakes, and it's harder to maintain.

There will also be inconsistencies. When one team handcrafts their instrumentation, they might have different metric labels, different behavior, or different label names than another team for the same metrics, which brings inconsistencies between instances. Yes, some of that can be fixed with templates, but it is not so easy. So that's why the handcrafted method is not the best; it's not what we would recommend.

The other thing you can do is use a well-known library, like gRPC or HTTP middlewares (there is a short sketch of this approach at the end of this part). For example, Bartek is a maintainer of Thanos and I was a Thanos mentee, and in Thanos we use an HTTP middleware package that we wrote, which is a library anyone can use. It can be hooked into any Go HTTP server and will immediately instrument all the routes and HTTP handlers. The same can be done in gRPC using the go-grpc-middleware project that Bartek maintains: similar to the HTTP middlewares, but focused on gRPC in the form of interceptors. As long as those reusable libraries are RED-method friendly, well tested and widely used, they are a much better choice, thanks to the consistency you get if you use the same library in all your applications. They help with maintainability and with avoiding mistakes.

But we still have problems. For example, we still might not have access to the code. Applications can be written in many languages, so you would need consistency across libraries in different languages, and those middlewares and libraries might not exist in all of them. We know it works in Go, but maybe not for other languages like Java, Ruby, Python, et cetera.

Last but not least, people often instrument their HTTP applications, for example, using something like a service mesh — Istio, Linkerd — or maybe just proxies, where you don't add a full service mesh, just a sidecar proxy next to the application, like NGINX or Envoy. Those proxies are part of every HTTP request, so they can collect RED-method-friendly statistics about the traffic. What are the pros? Those are tested and reliable; you don't need to write your own stuff, so you avoid making mistakes. There is solid consistency, because every application has the same sidecar in the service mesh, and it will have the same metrics. But there are problems. It is much more complex to operate such a system, because a service mesh is another process, another sidecar, more containers; there are a lot of things that can go wrong, and it's hard to operate at scale. For some applications it also introduces significant latencies, so it won't work for everybody. And there is some extra cost: it uses extra compute power like CPU and memory, and this can translate to real money at scale. So in the end, this might not be an ideal solution.
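As a rough illustration of the middleware approach mentioned above, client_golang ships promhttp helpers that wrap a handler and emit RED-friendly metrics without touching the handler's body. This is a minimal sketch with metric names of our choosing, not the exact Thanos package:

```go
package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
	reqs := prometheus.NewCounterVec(
		prometheus.CounterOpts{Name: "http_requests_total", Help: "Requests by code and method."},
		[]string{"code", "method"},
	)
	dur := prometheus.NewHistogramVec(
		prometheus.HistogramOpts{Name: "http_request_duration_seconds", Help: "Request latency.", Buckets: prometheus.DefBuckets},
		[]string{"code", "method"},
	)
	prometheus.MustRegister(reqs, dur)

	// The business handler knows nothing about metrics...
	api := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("ok"))
	})

	// ...the middleware wraps it and records rate, errors and duration for us.
	instrumented := promhttp.InstrumentHandlerDuration(dur,
		promhttp.InstrumentHandlerCounter(reqs, api))

	http.Handle("/api", instrumented)
	http.Handle("/metrics", promhttp.Handler())
	http.ListenAndServe(":8080", nil)
}
```

The same idea — wrap once, instrument every route consistently — is what the reusable HTTP and gRPC middlewares mentioned above package up for you.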
Great, thank you, Harshita. So as Harshita explained, we have quite many choices when it comes to collecting Prometheus metrics for RED-method monitoring of our HTTP requests, but they all have their own trade-offs, pros and cons, and none of them can be described as automatic, right? So let's explore whether this eBPF thing can get us there — but exactly where? What would be the dream world that we could call auto-instrumentation of our metrics?

To us, auto-instrumentation is when I can create a new application from scratch in any programming language I want, or take some application from somewhere, put it into a container, deploy it on my Kubernetes, on many clusters, and point my Prometheus servers at those applications — and I will have consistent metrics about what HTTP traffic is happening. So I can use the RED method to measure it consistently with my dashboards and alerts, I can create my runbooks and SLOs and all of those goodies that we care about, consistently. That's beautiful, right? That's amazing. Let's see if eBPF can get us there, can help us here.

So what is eBPF? First of all, nowadays it is a whole ecosystem, actually, with its own foundation called, not surprisingly, the eBPF Foundation. This foundation aims to help eBPF-related projects stay maintained, but also ensures that the core of eBPF is maintained and stays open source, because you never know. Big players are in this foundation, so it looks pretty serious: there is Google, Facebook, Microsoft, Cloudflare and more. So it looks very ambitious.

So what is the eBPF core? The core is essentially a technology that allows hooking into kernel functions, for example syscalls, and executing a small, safe snippet of code every time that function is invoked. Everything runs in kernel space, so very close to the kernel, quickly and safely. This is the underlying technology of eBPF. The feature itself was added in kernel 3.15, which is quite an old kernel — nowadays we have five-something — so the technology is quite mature.

You can also store some data from an eBPF program. It can be stored on the stack, of course, but also in an eBPF map that can be accessed from user space by the application that is loading and managing this program; in our diagram, it is a Go library that is doing this. For example, an eBPF program can gather how many TCP packets your Linux socket processed, and then a user-space program can access that information by looking at this eBPF map — the Go library will know about it.

So how can we practically use such data with Prometheus? It's quite simple in theory. We could develop an exporter — which, by the way, is already developed by Cloudflare, which is super amazing — that allows us to inject our eBPF program and define a mapping between the eBPF map data and the Prometheus metrics, with whatever labels we want (a rough sketch of this mapping in Go follows below). Then this ebpf_exporter will periodically check the eBPF map, or actually do it on every scrape, to expose the data in metric form, in time-series form. Then it's as easy as pointing Prometheus at this exporter, scraping it at whatever interval you want, collecting this data periodically and exposing it for PromQL and other integrations.

Sounds very easy, right? And indeed it actually matches a lot of the elements from our goal. You can implement one eBPF program that will hook into the places you want in order to gather those HTTP request statistics. And once you have that, whatever application you write, in whatever language, in whatever technology, as long as it makes normal HTTP requests through the kernel, we can measure it in a consistent way. There is no custom code involved: I don't change anything in the application, I don't add any sidecar, so there is no significant additional cost or operational burden. And there is consistency, because whatever application I use, this eBPF exporter will provide a metric with exactly the same name and behavior. So it's kind of powerful.
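Conceptually, what the exporter does on each scrape looks something like the custom collector below. This is a minimal sketch, not Cloudflare's actual code; readPerStatusCounts() is a made-up stand-in for however you read the eBPF map (for example via a BCC or libbpf binding):

```go
package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// readPerStatusCounts stands in for reading the eBPF map that our
// kernel-side program fills with request counts keyed by HTTP status code.
func readPerStatusCounts() map[string]uint64 {
	// In reality: look up / iterate the BPF map via your eBPF library of choice.
	return map[string]uint64{"200": 42, "400": 1}
}

type httpCollector struct {
	desc *prometheus.Desc
}

func (c *httpCollector) Describe(ch chan<- *prometheus.Desc) { ch <- c.desc }

// Collect is called on every scrape, so the metric always reflects the
// current counters held in the eBPF map.
func (c *httpCollector) Collect(ch chan<- prometheus.Metric) {
	for code, count := range readPerStatusCounts() {
		ch <- prometheus.MustNewConstMetric(c.desc, prometheus.CounterValue, float64(count), code)
	}
}

func main() {
	reg := prometheus.NewRegistry()
	reg.MustRegister(&httpCollector{
		desc: prometheus.NewDesc("ebpf_http_requests_total",
			"HTTP requests observed via eBPF, by status code.", []string{"code"}, nil),
	})
	http.Handle("/metrics", promhttp.HandlerFor(reg, promhttp.HandlerOpts{}))
	http.ListenAndServe(":9435", nil)
}
```

In the real ebpf_exporter this mapping — program, map (table), metric name, label decoding — is driven by a YAML configuration file rather than code, but the scrape-time flow is the same.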
But as with everything, this obviously has trade-offs, and there are some catches. This code is from our demo eBPF program that we wrote, and you can see it's written in C. It's actually worse than that, because it's a limited C: you cannot import standard libraries, you can only use a few dedicated helpers, and what is available really depends on the kernel version you are using. You have only 512 bytes for your stack, so you cannot allocate huge variables to store, I don't know, some buffer or whatever. It even expects you to understand what the stack is, which is not typical knowledge for a DevOps engineer or SRE and really shouldn't be required. So that's already a challenge. And you can still kill the performance of your kernel, of the whole machine, because you can still create deadlocks and major slowdowns. So it is serious. Also, you cannot hook into certain functions even though they are in the kernel code, because they are inlined, for example — simply due to compiler optimizations. So development here is quite brittle, I would say.

And there is more: the error handling is just painful, and I guess this is just C, so there are lots of misleading errors. For example, this one says "unexpected type name" something, but actually this BPF per-CPU macro just does not exist. Why doesn't it tell me that it simply doesn't exist? I was super confused. It also points to line 25, but it wasn't really line 25, it was some other line, because everything is shifted by those macros, so you cannot rely on that either. It's quite painful in comparison to the beautiful programming experience in Go or Python; you need to get used to it.

The second major catch is that you have to know Linux very well. You need to understand the tools and helpers that are available through BCC, the BPF Compiler Collection. You need to understand what is and isn't available in your specific kernel version and operating system, and how to check the exact signatures of those functions and syscalls, because you sometimes need those signatures just to hook into them, and then to use the arguments of those functions for your tracing or monitoring. So it's kind of hard. This kernel-version specificity also means that really none of the tutorials I saw worked for me, because I have a pretty new 5.11 kernel, so I had to dig into everything from scratch; my function names were totally different, and I had to build things up from the bottom. So expect your code to not be portable across operating systems and kernel versions, which means a lot of effort to maintain such programs at scale. As an example, I couldn't use kprobes — essentially one way of creating a hook into a kernel function — on the accept4 syscall, which is what's typically used to trace TCP connections. I had to use a tracepoint instead, which is a different mechanism. It was fine in the end, I could use it, but it took me four hours to figure out. So, great — it is a bit hard.
So let's go to the demo: what do we want to show, and what's the architecture? In this demo, you will see one way of getting two of the RED method monitoring signals using eBPF: rate and errors. We have the Cloudflare ebpf_exporter, running everything in containers because we are cloud native, and we have a Prometheus server. As Harshita mentioned, we use this Prometheus to collect metrics, but we also want to use the RED method to monitor Prometheus itself. So we want to compare the handcrafted metric with the eBPF-based metric.

The critical part of this demo is the actual eBPF program. There are roughly three ways I found to get these HTTP request statistics. First, you could trace your program itself: there is something called uprobes, which lets you hook into a function that is not in the kernel but actually in your application — similar to how a debugger does it. And this is great: we know that in Go we use the net/http package, and there are certain functions involved whenever we have a request and a response. We could hook into those, and every time we hit them in our code, our eBPF program would be invoked, so we can build statistics from it. But as you can imagine, this is really code dependent: if the library changes, if the semantics of a version change — every time the code changes, we need to update our eBPF program. Not fun; we would kind of be breaking our goal here. It also doesn't work with Java applications, for example, because those run in the JVM, a virtual machine, where everything is compiled just in time. It's just hard.

The second thing I've seen is that you can attach your program to a socket. This is useful for filtering packets or filtering HTTP requests. Unfortunately, I couldn't make use of that, but it may be useful.

The last option, which I chose, is kind of a neat way of hooking into the important syscalls: accept4 — you can use other ones, but this is typically what's used — which is called when the server wants to obtain a TCP connection and the file descriptor for it. Then you can read from that connection or write to it; on the server side we read the request and write the response, and then we close the connection. If we hook into those, we have visibility into what's happening. The tricky part is that read and write are used for any I/O, any file reading and writing, so we need to filter out only the TCP connections, ideally only HTTP requests, and ideally only from our container. That is a lot of work, as you will see. And I have no idea how to do it with TLS; I hope someone will show me at some point.
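For a sense of what "hooking a syscall tracepoint and reading a map" means from the Go user-space side, here is a heavily simplified sketch using the cilium/ebpf library. This is not what the demo uses (the demo relies on Cloudflare's ebpf_exporter), the object file and program/map names are made up, and exact APIs vary between library versions:

```go
package main

import (
	"fmt"
	"time"

	"github.com/cilium/ebpf"
	"github.com/cilium/ebpf/link"
	"github.com/cilium/ebpf/rlimit"
)

func main() {
	// Lift the memlock limit so BPF maps can be created (needed on older kernels).
	if err := rlimit.RemoveMemlock(); err != nil {
		panic(err)
	}

	// http_trace.o is a hypothetical compiled eBPF object containing a
	// tracepoint program "handle_accept4" and a counters map "http_requests".
	coll, err := ebpf.LoadCollection("http_trace.o")
	if err != nil {
		panic(err)
	}
	defer coll.Close()

	// Attach to the sys_enter_accept4 tracepoint (instead of a kprobe,
	// which did not work on our kernel).
	tp, err := link.Tracepoint("syscalls", "sys_enter_accept4", coll.Programs["handle_accept4"], nil)
	if err != nil {
		panic(err)
	}
	defer tp.Close()

	// Periodically read the per-status counter the kernel side maintains.
	for range time.Tick(5 * time.Second) {
		var count uint64
		if err := coll.Maps["http_requests"].Lookup(uint32(200), &count); err == nil {
			fmt.Println("HTTP 200 responses seen so far:", count)
		}
	}
}
```

The ebpf_exporter hides this loading and attaching behind its YAML configuration; the sketch is only meant to show where the tracepoint hook and the map access sit in the flow.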
So let's go. In order to show this, we wrote a kind of Go end-to-end test, using the Go test tooling. We have a Docker environment where we start Prometheus, and I obtain — in a somewhat hacky way — the Prometheus process ID, because this is how I will filter the requests in the eBPF program. Then I start the ebpf_exporter in a container as well. You can see it is a privileged container; I had to build my own version of the ebpf_exporter and patch it a little bit, and you need to provide a lot of stuff: kernel headers, access to the /proc resources, and the debug tracepoints file path.

So overall you start the exporter in this containerized environment, in the privileged manner. Then I have to configure the exporter, and the ebpf_exporter has a nice configuration file. I kind of have it typed out here, but in the end it's YAML. I specify a program, I specify metrics; I chose to have started connections, closed connections and overall requests, and I say which eBPF map I use — it's called a table here. Then I choose where I want to hook my program: I have four hooks — when we enter the accept4 syscall, when we exit it, when we enter write, and when we enter close. Then I take this program, put it into the YAML, and also substitute the PID, because I want to filter by a certain process ID. We do that, then we start everything, make sure Prometheus knows about the ebpf_exporter, start it up, and open some browsers. That's it.

When this starts, both the ebpf_exporter and Prometheus should be running, the browser should open, and after some time we should see that we obtain some metrics. The first metric is what Harshita showed: the Prometheus metric for HTTP requests. We have apparently around 30 of those. The second one aims to be the same information — HTTP requests, but from eBPF directly — and we have a similar value, around 30. There might be some different delays in when we check this data, so the numbers are not exactly the same, or maybe I have a bug in my eBPF program, but in an ideal world those would match: you get the same information from both methods, just that eBPF has its own benefits, as we mentioned. You can also see that the process started nine connections and didn't close any of them yet, so you get some additional information.

So let's test it out. For example, let's go to my code, and when I continue, it should open a browser with a wrong query: I do an API call that is bad, so it should return a 400 status code. We can see that the Prometheus handcrafted metric found it nicely — we have exactly one request counted with a 400. And let's see if the eBPF program found it. Yep, it found that there was one bad request happening in the system.

But there is one more thing I want to show. When we go to the code and click through to another step, we make a different call, which is essentially a not-found error, and you will see something interesting: the handcrafted metric doesn't show it — we still have only one error. However, the eBPF metric shows an increase: there are two errors with the 400 code. This is pretty neat, because what is happening is that in Prometheus — and I maintain Prometheus — we probably forgot to, or maybe just didn't plan to, put metric instrumentation on all the paths in the Prometheus server. So we missed this root path and some broken path, we didn't instrument it, and we wouldn't know if someone were spamming it and maybe killing our service with a DDoS of those requests. We would miss that, because our handcrafted implementation is missing something, but the eBPF program knows about all calls. So immediately we know more than with just the handcrafted metric, right?
With the eBPF implementation we would know that there are more of those requests, for example if someone were spamming that endpoint. So that's the whole demo. I could also show you the code of the eBPF program itself, but it takes some time to explain all of it; we can do that at some point, and maybe I will write a blog post about it. Essentially, a lot of work went in there, and the trickiest part is how to ensure we measure only real HTTP requests, and only the writes that count — because when you write a response, you potentially do many write syscalls, not just one, since you write some response body and not only HTTP headers. So there is a lot of complexity, but once you do it properly, it is rewarding.

So, just to finish: you saw that you can do it, but let's be honest about feasibility. The technology is mature, but we see that it requires some low-level programming skills. You need to know about the kernel. You need to be aware that there is literally zero portability here, so you need to be very aware of your kernel version. When you deploy this into Kubernetes, you probably need to — not probably, you have to — be in control of the kernel version and operating system on all of the nodes, which you usually are, so that's possible. You need to figure out your containerization access, so it has to be a privileged container and so on. Still doable. And there are mapping challenges: what the BPF program usually gives you is a cgroup ID, which is basically an inode number, plus a thread or process ID, and there are ways to map those back to containers. I had to create a patch on the ebpf_exporter to decode that, but I think with some work it is totally doable.

So thank you. I hope you learned something, and I hope this gets you started if you want to go and try to make something production-ready with eBPF for observability needs. Check out our demo and sources in this repo, follow us on Twitter, and ask any questions if you want. Thank you.