Hello, everyone. Welcome to my talk. My talk is about how to use bpftrace with Performance Co-Pilot and Grafana. Most of you probably know two or maybe three of the components, but for those who don't, I'll give a quick introduction.

Performance Co-Pilot is a system performance analysis toolkit. It consists of multiple components. One kind of component are the agents; we call them Performance Metrics Domain Agents. They are responsible for getting performance metrics from one source. For example, we have one agent for PostgreSQL, one agent for GlusterFS, one agent for Microsoft SQL Server, and we have around 92 agents in total. The latest new agents we wrote are one for BCC, that's the BPF Compiler Collection, and another agent to get metrics from bpftrace scripts.

Each metric in Performance Co-Pilot has multiple pieces of metadata associated with it. One is the semantics: a metric can be either an instantaneous value, like the temperature of a hard disk, or a counter. The Linux kernel exports a lot of metrics as counters, which are monotonically increasing unsigned integers. For example, if you write something to a hard disk, the Linux kernel aggregates this data, and if you want a rate, say how many kilobytes per second were written to one disk, you can take two values, calculate the difference, and divide it by the difference in time between the two samples; then you have the rate. We also store the unit; it can be kilobytes, bits per second, whatever. So Performance Co-Pilot always stores the data in the form it receives it, for example as a counter, and when the user wants to read the value, it automatically rate-converts it. We also support multiple data types: floats, 32-bit and 64-bit unsigned integers, and strings. And we have the concept of instances: for example, if you want to know how many bytes are written to a disk, you usually have multiple disks, so each disk is one instance. There are tools for visualizing the metrics in Grafana and as a Qt application. And there are also nice analysis tools: for example, if you know that your application was working properly at 8 in the morning, but there was some issue at 10 in the morning, you can compare the metrics of these two points in time automatically, and it reports the differences.

Grafana is a web application for the visualization of metrics. It supports visualization of time series data, heat maps, table data, world maps, and a lot more. It's highly customizable, so you can create your own dashboards: you click which type of visualization you want, for example a line chart, then you select the query, which metrics you want to visualize. You can configure the legend, which lines you want, which colors, and you can hide a metric if you don't want it to be shown. It also supports multiple data sources, for example PCP, but also Elasticsearch, Prometheus, PostgreSQL, and I think around 10 more. And it's very extensible, so you can create your own plugins, your own data sources, and also panels; panels are the visualization part. And the nice thing is, in one dashboard you can have multiple data sources: for example, in one panel you show bpftrace data, and in another panel you show data from PCP, for example CPU usage, or data from Prometheus or anything else.

And bpftrace, yeah, we just had a great introduction to bpftrace, so I'll keep it a bit short. It's a tracing language for eBPF.
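As context for the next part: a minimal sketch of the kind of script this example is, modeled on the read-latency tool in bpftrace's upstream example tools (the exact probe and map names here are my assumption, not necessarily the ones on the slide):

```
// Histogram of read() syscall latency, sketched after bpftrace's
// upstream example tools.
tracepoint:syscalls:sys_enter_read
{
	// remember when this thread entered the read syscall
	@start[tid] = nsecs;
}

tracepoint:syscalls:sys_exit_read
/@start[tid]/
{
	// time spent in the syscall, in microseconds, grouped into
	// power-of-2 buckets by hist()
	@usecs = hist((nsecs - @start[tid]) / 1000);
	delete(@start[tid]);
}
```

When the script is stopped, bpftrace prints the aggregated map as a text histogram, roughly like this (illustrative values):

```
@usecs:
[256, 512)       2759 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[512, 1K)         941 |@@@@@@@@@@@@@                           |
[1K, 2K)          238 |@@@                                     |
```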
And in this example, we attach to two tracepoints for the read syscall. When the read syscall starts, we save the current timestamp in a map, and when the read syscall ends, we calculate the difference, so we know how much time was spent in the read syscall. Usually, if you run a bpftrace script, you start the script, you wait for a bit so that data gets sampled, then you stop the script, and you get, in this case, a histogram of all the values sampled during the runtime of the script. The data gets grouped into buckets by bpftrace in the kernel, so it's quite fast. And here, the lower bound of the first bucket is 256, the upper bound is 512, and in the middle you see how many values fall into this bucket.

If you combine this with Grafana, you basically get another dimension to your data: you also get the time. If you compare the two: the console output is the summary over the whole runtime of the bpftrace script, while in Grafana, one column is basically one histogram. In Grafana, if you have the dark theme, the darker shades are buckets with fewer values. The buckets are fixed ranges, for example here 33 milliseconds to 66 milliseconds, and the lighter shades mean there are more values in a bucket. So here we can see that in the beginning, in the first roughly 10 seconds, there were a lot of values in the bucket between 31 and 63 microseconds.

The architecture of the full project is as follows. On the left side, we have the PCP part. There's the collector daemon, which collects the data from the agents; we call the agents in PCP Performance Metrics Domain Agents. This bpftrace PMDA starts bpftrace on the fly if you want data, and if the data is not requested anymore, for example if you close Grafana, it automatically stops bpftrace. And pmproxy, the name is a bit misleading: it started as a proxy, but it's now also used for exporting the PCP metrics with a REST API, so you can access them through HTTP. On the right side, we have the Grafana part. Grafana is actually not only a front-end application; it also has a daemon running, called grafana-server. And you can configure it however you want: either the browser directly connects to pmproxy to get the metrics, or everything goes through grafana-server. For example, if you are behind a firewall and only have access to grafana-server, you can let grafana-server proxy the requests. So if you open Grafana, it makes a request to pmproxy: give me the metrics of this bpftrace script. That gets forwarded to the PMDA, and if the script is not running, it starts it, and it gets the data.

So how do you install this on Fedora 31? Unfortunately, the latest kernel, 5.4, broke a few things in eBPF land. The reason is that the Linux kernel gets compiled with GCC, while bpftrace and BCC compile the kernel headers with Clang, and GCC introduced asm_inline, which Clang doesn't support. A patch for that landed in BCC, I think, multiple weeks back, and for bpftrace I submitted a pull request four days ago. So now it's in the upstream master, but there's no released version yet. But you can enable my Copr repository, and then you have the latest bpftrace version if you use the latest kernel. It's also in Red Hat Enterprise Linux 8.2, in the beta.
So in the beta, you don't need the Copr; you just need the other parts. That's bpftrace, the PMDA, Grafana, and grafana-pcp, the plugin for Grafana, so Grafana can speak to PCP and bpftrace. And because bpftrace needs to run as root, it needs a lot of privileges; it can also execute other programs, and it can read files. So you have to configure authentication. You have two options here: either you go with the PCP authentication, which is a bit difficult to set up and supports only SASL, or you just set up, for example, an nginx reverse proxy. Everything goes through HTTP, so you can put a reverse proxy in front of it and use HTTP basic authentication. Then you need to enable the PMDA by going to its directory and installing it. For Grafana, you need to enable the plugin: after you install it, you go to the configuration in Grafana and enable the plugin. And you need to configure a data source: you go to data sources and enter the address of pmproxy. Usually it listens on all interfaces, and the port is 44322.

And last, I want to show you a few bpftrace tools. There are a lot of finished bpftrace tools in the upstream bpftrace repository, the example tools, and I'll show you a few of them. One of them is the CPU scheduler run queue latency tool. This measures how much time a process is waiting for CPU time. Here you can see that you can write the bpftrace script directly in Grafana: you have a small code editor, and you get auto-completion for tracepoints, for bpftrace functions, and for built-in variables. And the bpftrace script works like this: we attach to two tracepoints. At sched_switch, a context switch is happening, so the kernel is switching from one process to another. We check if the previous state was still runnable; that means the process got interrupted even though it wanted to continue to run, so we store when it got interrupted. A few lines below, in line 12, we check for the next PID, so the next process, whether it got interrupted before, and if it was, we record the waiting time in a histogram. And in Grafana, we see it like this, as a heat map again. So we can see that at 19:56:45 there were a few processes waiting for CPU time between the lower and the upper bound, here between one and three microseconds.

Other bpftrace tools write their output not as a histogram, but just to the console. For example tcpconnect: it traces all the TCP connection requests in the kernel, no matter if they are successful or not. The original tool just prints them to the console like this; it also prints the source address, but I redacted it for privacy, it was an IPv6 address. And if you want this in Grafana, you can do that too; you need to make a few changes. Instead of padding the output with spaces, you separate it with commas, so that the PMDA can parse it properly and build a table. And with this script, it usually just keeps running until you close it, so the list of connection requests gets bigger and bigger. That's why I created a small metadata annotation for the PMDA saying it should only keep the last 10 entries in the table, and the rest of the data, the older values, should be cleared out. This is the rest of the bpftrace script; I redacted a lot, just so you can see the CSV output. And in Grafana, it's a table. In Grafana, you can customize everything: here it's sorted by time, you can remove columns, and it gets refreshed automatically.
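For illustration, the comma-separated variant could look roughly like this. This is sketched from the upstream tcpconnect.bt tool, simplified to IPv4 only; the kprobe, struct fields, and column set are assumptions from that tool, and I'm leaving out the PMDA metadata annotation, since its exact syntax is specific to the PMDA:

```
#include <net/sock.h>

BEGIN
{
	// CSV header line, so the PMDA can build the table columns
	printf("TIME,PID,COMM,DADDR,DPORT\n");
}

kprobe:tcp_connect
{
	$sk = (struct sock *)arg0;
	if ($sk->__sk_common.skc_family == AF_INET) {
		// skc_dport is stored in network byte order; swap it
		$dport = $sk->__sk_common.skc_dport;
		$dport = ($dport >> 8) | (($dport << 8) & 0xff00);

		// comma-separated instead of space-padded output
		time("%H:%M:%S,");
		printf("%d,%s,%s,%d\n", pid, comm,
		    ntop($sk->__sk_common.skc_daddr), $dport);
	}
}
```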
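And the scheduler run queue latency script from a moment ago, roughly as in the upstream runqlat.bt example tool (shortened here, so the line numbers won't match the screenshot exactly):

```
#include <linux/sched.h>

// a task is woken up and starts waiting on the run queue
tracepoint:sched:sched_wakeup,
tracepoint:sched:sched_wakeup_new
{
	@qtime[args->pid] = nsecs;
}

tracepoint:sched:sched_switch
{
	// the previous task was still runnable, i.e. it was preempted
	// and goes back onto the run queue
	if (args->prev_state == TASK_RUNNING) {
		@qtime[args->prev_pid] = nsecs;
	}

	// the next task gets the CPU now; record how long it waited
	$ns = @qtime[args->next_pid];
	if ($ns) {
		@usecs = hist((nsecs - $ns) / 1000);
	}
	delete(@qtime[args->next_pid]);
}
```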
You will see this table refreshing in the demo in a few minutes. With bpftrace, you can also sample kernel stacks. In this example, 99 times per second, I save the current kernel stack in a BPF map and count it, so if the same stack gets sampled multiple times, I can see how many times it got sampled. And with this information, I can render flame graphs. A flame graph is kind of a reversed stack trace: at the top is the actual kernel function, and the one below it is its parent. The x-axis is not the time; it's just that the more often a stack frame got sampled, the bigger its box gets on the x-axis. So here, for example, a spin unlock function got sampled a lot. And yeah, you can try it out yourself.

If you want to try it out immediately, I configured a container with all the dependencies included. You just run it; with the latest kernels, you don't need the kernel sources anymore, because the kernel headers are already included in the kernel itself. It listens on port 3000. You should take care that you only allow it to listen on localhost, because authentication is disabled, and otherwise every unauthenticated user could execute bpftrace scripts.

Let's see how it looks. When you install the grafana-pcp plugin, you get a few dashboards, which I prepared, and you open a dashboard. Here we can see a few bpftrace scripts: when you open the dashboard, the bpftrace scripts get started in the background, and the data gets plotted. Remember the CPU run queue latency program? Here we can see the settings of the panel: here is the code editor, this is the original tool from the example tools in the bpftrace repository, there's all the code, and there's the visualization. You can also look at, for example, the block I/O latency here. And if we generate some load, you should see a bit of change: you see the lighter boxes, because I'm just writing a few bytes to my hard disk. And if I stop it, it should get dark again.
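Before the questions, for reference: the stack-sampling one-liner from the flame graph part looks roughly like this (a sketch assuming bpftrace's profile probe and the kstack builtin, not necessarily the exact script from the slides):

```
// Sample the kernel stack 99 times per second on every CPU and count
// how often each unique stack was seen; the count later determines
// the width of the corresponding flame graph box.
profile:hz:99
{
	@[kstack] = count();
}
```

The printed stacks and their counts can then be folded and fed into the usual flame graph tooling.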
So yeah, I think now we have time for questions.

Can we use that tool to export the metrics, or to send the metrics to Kafka and import them into a time series database?

PCP supports an OpenMetrics endpoint; that's the format also used by Prometheus, and you can read it through an HTTP endpoint. So if you can read the OpenMetrics format, you can get the data out that way. PCP also supports writing the data to a Redis database; that's a recent feature, so you can write it to Redis directly. But in your case, reading the OpenMetrics format is probably the best option. And because this is integrated in PCP, these are normal PCP metrics, so you can use all the tools which are written for PCP, and there are a lot: comparing different archives, logging, and reacting to events.

Can this also run on archives, or is it just live?

If you configure pmlogger to store it in an archive, it gets into an archive. It's a normal PCP metric; you can do everything you can do with PCP metrics. Any more questions? Thanks.

This might be a very elementary question, but what actually is PCP? I tried to Google it, but I only found some drugs.

PCP is the abbreviation for Performance Co-Pilot. The project got started in 1993, so it's a pretty old project. Previously it was closed source, only for SGI, and then they open sourced it. It also works on FreeBSD and so on. Basically, it reads a lot of metrics, for example from the /proc filesystem, or from other agents, and it stores the data; you can store it in archives. And it's very helpful, for example, for support engineers at Red Hat. If a customer has a performance problem, they install it, and the tool keeps collecting data on the customer's server. Then the support people can gather the data, look at it locally, and hopefully see what caused the problem. Because usually, when there's a problem, the customer calls the support engineers and says, yeah, the problem was at 5 in the night or something. Since PCP stores historical data, they can analyze it after the fact. There are also conversations with customers like: yeah, it worked perfectly fine at 8 in the morning, at 10 we had some issues, we don't know what it was. And then maybe the support engineers can deduce that there was, I don't know, some lock contention going on, or some other issue. So yeah, if you Google it, you should search for Performance Co-Pilot, the full name.

Do we still have five minutes? Anyone else have a question? Then I can also show the tables. Here we have tcpconnect. So if I make a TCP connection to the Cloudflare DNS server, we should see... yes: here in the tcpconnect table, we see that I just started a curl process, and it connected to the destination address 1.1.1.1 at port 80. And I have something else running here, another panel with a lot of different bpftrace tools. This one is empty because nobody's connecting to my server. The other tools are the TCP sessions, and this one, for example, shows which cores of my processor got utilized; I have eight virtual cores, and it keeps refreshing. And if I close the browser, after one minute all the bpftrace scripts get stopped automatically, and you can configure the timeout however you want.

OK, I see there are no more questions. Thank you.