Awesome. Thank you. I am unmuting myself and moving the doc around. So, hey everybody, thank you for joining. My name is Everett Pompey, and thank you for joining me for Run Fast: Catch Performance Regressions in eBPF with Rust. This is going to be a bit of an extended session from the talk I gave back at the Linux conference. Alrighty. As I said, I am Everett Pompey, that's my email if you have any questions after the talk, and that is the repository for Bencher, which is a tool we'll be talking about here in a bit. So, what is eBPF? To start off with all the buzzwords in the title: conceptually, you have user space and kernel space. With eBPF, you take your source code and compile it down to eBPF bytecode. That bytecode then gets handed over to the kernel side and put into the eBPF verifier, which basically makes sure it can essentially solve the halting problem for your program. As long as that checks out, it's good to go, and it moves over to the eBPF VM, again inside the kernel, and it gets run. There are a lot of restrictions on that VM and on how the verifier works, and we'll talk about that a bit more as we go. Alright. So, the different kinds of things you can instrument with eBPF: there are tracepoints, which let you trace any syscall within the Linux kernel. There are uprobes, user space probes, meaning if you know a function definition, you can watch it, which is pretty powerful. There are kprobes, which allow you to watch and instrument kernel functions. And then there is XDP packet filtering, which is one of the original uses for eBPF, and even BPF before it: being able to filter packets really fast, before they even get into the kernel.
And then there are LSM hooks, so you can instrument any Linux security module hook and see what's going on there. And then, what is Rust? That is the programming language you write that source code in, and it has a focus on performance, reliability, and productivity. I used to write a lot of C++ and have now converted over to Rust and very much enjoy it. So I'm very excited that, as of Linux 6.1, it is now part of the build chain for the kernel itself. But that is not what this talk is about. This talk is about writing Rust for the eBPF side of things, not the kernel itself. So that is where we're writing our Rust code. Then, what is a performance regression? That's the last part of the title, the thing we're trying to prevent. The idea here is that performance bugs are bugs. A lot of times they get overlooked, right? Folks might have lots of unit tests but absolutely no benchmarks, and it depends on the application you're building whether that's important to you or not. eBPF can add up to 100 milliseconds of latency if you really try, and if you're adding that on every single syscall, that's quite a huge overhead. So I would argue it's pretty important here. And production is also the most costly place to find bugs, so preferably we would shift left as far as possible and find our performance regressions as early as possible. So when do performance regressions get detected? They can happen in development, and you can catch them there if you happen to be looking for them, if you're doing performance tuning and things like that. But a lot of the time that doesn't happen.
It's also possible to catch them in CI and automate it, just like we do unit tests; the whole original idea of continuous integration is that those tests get run automatically so you don't even have to think about it, but most folks don't have anything set up there for benchmarks. And so that means things end up getting all the way to production, when things are on fire and users are upset; that's when you tend to find these issues. So here's an overview of where we're going from here: we're going to look at a basic eBPF program written in Rust; we're then going to evolve that eBPF program; then we're going to benchmark it; and finally we'll look at continuous benchmarking for it. Just an overview of the possible eBPF tooling you could use: there are tools to write it just in C; then there's BCC, which lets you use C, Python, and Lua, and interacts with libbpf, which is written in C; or you could use Go, which is kind of a full stack Go solution. All of that is possible, but we are going to be going with Rust here, as you may have guessed from the talk title. Looking at the Rust tooling ecosystem: there's a wrapper around libbpf written in Rust, which means you can write your user space code in Rust, but you still have to write the eBPF code that gets compiled down into bytecode in C, and the syscall interface there is also C. Then another library came along called redbpf that lets you actually write your eBPF code in Rust, which is super cool, but it was still using libbpf as the interface for the syscalls. And then Aya, a really great library, came out, and it allows you to do full stack Rust. So we're swallowing the pill here and going full stack Rust. Now on to our first part: a basic eBPF program written in Rust.
This is going to be an XDP program, just to keep it nice and simple. All that it's going to do is log the IPv4 source address of a packet it receives. In order to create this, again, we've got our user space and kernel space sides. This little robot logo is going to indicate user space, and this bee is going to indicate kernel space; you'll see it up in the top right hand corner to help us know where we are at any time. We're going to start off in kernel space here in eBPF land. We've got a Rust function called fun_xdp, which takes in a context for an XDP call and returns an unsigned 32-bit integer as its result. Using Aya, we have a macro that says we're going to be doing XDP instrumentation here. Then, based off the result from a helper function we'll look at in a second, we either return the value we get, or, if there's an error, we return XDP_ABORTED to abort that packet. Taking a look at that helper function: we get the Ethernet header, and then see if it's an IPv4 packet; otherwise we just move along, because we only care about IPv4. We then read the IP header, get the source address from it, log it, and pass. And that's it, pretty simple. We're going to take a quick look at the ptr_at helper function we used when getting the source address from the header. This ptr_at function is kind of an important thing to understand with eBPF: we're trying to make the verifier happy and make sure we're not reading any memory we're not allowed to. We pass in an offset, and from the context we take the start and the end of the packet data, plus the length of the thing we're trying to read.
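To make that bounds check concrete, here is a plain-Rust sketch of the arithmetic the ptr_at helper performs. The name and signature here are mine for illustration, not Aya's actual API; the real helper works on raw pointers inside the eBPF program.

```rust
// Hypothetical sketch of the bounds check ptr_at does to satisfy the eBPF
// verifier: refuse any read that would extend past the end of the packet data.
fn checked_offset(start: usize, end: usize, offset: usize, len: usize) -> Result<usize, ()> {
    // If start + offset + len would run past end, bail out with an error.
    if start + offset + len > end {
        return Err(());
    }
    // Otherwise the pointer address is simply start plus the offset.
    Ok(start + offset)
}
```

This explicit comparison is exactly what convinces the verifier that every memory access stays inside the packet.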
Then we make sure that our start plus the offset plus the length is not greater than the end, and if it is, we error out. That tells the verifier we are being good stewards of where we're going in memory. Then we return the start plus the offset as the pointer address if it's good, and otherwise we bail. And then, again to please the verifier: Rust has the possibility of panicking, so we have to tell it we are not going to panic here, that this is unreachable. With that complete, we're going to move over to the user space side of things. In user space we have a main function, and we're going to parse some command line arguments and init logging. We also have a path to the eBPF object that we created, and we're going to take that bytecode and load it into the kernel. Then, for the program that we init, we're going to get a handle to it. Sorry, we loaded the logging there; this is getting the program and then initializing it. Then we attach it to the interface that gets passed in as one of the arguments to our program. And then we wait until someone hits Ctrl-C; once they do, we exit, but in the meantime we just run our function. And that is the very simple user space side of things. With that, we're done with our very, very simple eBPF XDP application. So let's actually have it do something other than, essentially, the hello world of eBPF programs. We're going to introduce the concept of maps. Maps allow you to communicate between user space and eBPF land and send data between them. We're going to add a fizz feature: we push fizz onto a queue if the IPv4 source address is divisible by three, and otherwise return XDP_PASS. So let's go ahead and do that.
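As a plain-Rust sketch of that fizz feature, the decision logic looks something like this. The enum and function names are mine, not the talk's actual code; in the real program the Fizz case pushes a message onto the queue map and the Pass case returns XDP_PASS.

```rust
// What message, if any, the eBPF side should emit for a given IPv4 source address.
#[derive(Debug, PartialEq)]
enum FizzMsg {
    Fizz, // pushed onto the queue map for user space to read
    Pass, // nothing to report; the packet just gets XDP_PASS
}

fn fizz_message(source_addr: u32) -> FizzMsg {
    if source_addr % 3 == 0 {
        FizzMsg::Fizz
    } else {
        FizzMsg::Pass
    }
}
```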
Again, we'll start out by taking a look at our mental model of where the map fits into things. From eBPF, we're going to pass data over to user space using that map. That's essentially what we're doing. Cool. So we'll start by taking a look at the map. The code for the map logic that's shared between user space and the kernel is pretty simple. It's just the message that we'll be passing. It needs to be representable as C, and we need to be able to clone and copy it. Then there's an Aya piece: on the user space side we need to know that this is something we can deserialize from the kernel, and that's what the Pod trait is for. Also on the user space side, just so we can see things nice and pretty, we can implement Debug; but in kernel eBPF land we can't do that, because there's no real concept of standard out and printing there. So that is the map side of things. Now we're going to go look at the kernel. With those kernel changes, the first thing we do is declare a map, which says we have this queue, it can hold up to 1024 messages, and those will be source address messages. Then we update that fun_xdp helper function we looked at a minute ago: instead of just logging, we check to see if the source address is divisible by three, and if so, we push it onto that queue. Then we pass. That's basically it. On the user space side, we also need to make some changes in main. We're going to create a spawn_agent helper function, and then wait the same as before. If we look at the spawn_agent helper function, we have to create the user space side of this map, which is again that queue.
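The shared message type described above can be sketched like this. The struct and field names are my own assumption of the shape; in the real code the Pod impl and the Debug derive are gated so they only exist on the user space build.

```rust
// A message that crosses the queue map from eBPF land to user space.
// #[repr(C)] pins down a stable C-compatible layout, and Clone/Copy let it
// be moved byte-for-byte across the map boundary (what Aya's Pod trait needs).
#[repr(C)]
#[derive(Clone, Copy, Debug)]
struct SourceAddr {
    addr: u32, // the IPv4 source address, as a 32-bit integer
}
```

Because the layout is just one u32, the message is exactly four bytes on the wire between kernel and user space.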
Once we have that, we can just loop over reading off of it as long as there are items in the queue, and print those messages from the user space side. So with that in place, that's it for v1 of our fizz feature. Then let's look at adding a simple update, right? This is our FizzBuzz feature. I'm sure you all know where this is going: you push fizz onto the queue if the IPv4 source address is divisible by three, buzz if divisible by five, or fizzbuzz if divisible by both. Otherwise, just return XDP_PASS. Then we'll take a look at the map and see what needs to get updated there. It's not that much; it's just the additional message types, where we had just fizz before. Then on the kernel side of things, we have to update our logic again, but it's just implementing FizzBuzz based off of which message we send. If there's a message, it gets sent across just like before, and otherwise, at the end, we just return XDP_PASS. There are really no changes to make on the user space side, so we're done. That is our next iteration of the code. And now let's make it even better, right? Let's make another seemingly simple update, which is FizzBuzzFibonacci. Okay. So: push fizz if divisible by three, buzz if divisible by five, fizzbuzz if divisible by both, except if the remainder of the IPv4 source address divided by 256 is part of the Fibonacci sequence, then return fibonacci. Otherwise, return XDP_PASS. So we go into our map and add an additional message type, which is fibonacci. Then we go over to the kernel and update our logic again. We change the logic there, and now we call this is_fibonacci helper function before checking which message we need to send.
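The FizzBuzz selection step can be sketched in plain Rust like this. Names are illustrative, not the talk's actual code; the real kernel-side version pushes the chosen message onto the queue map instead of returning it.

```rust
// Which message the eBPF program should send for a given IPv4 source address.
#[derive(Debug, PartialEq)]
enum Msg { Fizz, Buzz, FizzBuzz, Pass }

fn fizzbuzz_message(source_addr: u32) -> Msg {
    match (source_addr % 3, source_addr % 5) {
        (0, 0) => Msg::FizzBuzz, // divisible by both three and five
        (0, _) => Msg::Fizz,     // divisible by three only
        (_, 0) => Msg::Buzz,     // divisible by five only
        _ => Msg::Pass,          // nothing to report; just XDP_PASS
    }
}
```

Matching on the pair of remainders keeps the both-divisible case from being shadowed by the single-divisor cases, which is the classic FizzBuzz pitfall.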
And that is_fibonacci function just checks to see if the number is part of the Fibonacci sequence. Nice and easy. The user space side also stays the same, so we're good to go. And everything's totally fine, until three weeks later, when our customers are very upset and things are on fire. We spend the whole day chasing our tails trying to figure out what happened. That's when we finally get back to this chunk of code, take a look at it, and decide to investigate that is_fibonacci function. And we think: oh man, we are recalculating the Fibonacci sequence every single time. This could be drastically improved. So we're going to make a fourth, memoized version of our app, in which we simply check against a precomputed list, because there aren't really that many numbers in the Fibonacci sequence below 256. We can just check the list, nice and simple. Now, I don't know if you noticed anything here, but I asked ChatGPT to give me all the Fibonacci numbers below 256, and this is what it gave me. I don't know if you noticed, but some are indeed missing. So the robot overlords haven't quite taken over just yet. With that in place, we are now able to go play firefighter, put out the fire in production, and solve everything. But why did it have to get this far, right? Why did it have to get all the way to production before we could fix things? Why can't we shift this left? So we're going to take a look at trying to catch these performance regressions in development and what we can do there. Now let's look at benchmarking an eBPF program in Rust, because you can't improve what you don't measure. Before we dig in too much, we have to understand micro versus macro benchmarking. Micro, I think of as being like unit tests; this would be checking our is_fibonacci function directly.
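The memoized version can be sketched like this, with a list I computed by hand rather than asking ChatGPT. The function name mirrors the talk; the exact shape of the real code may differ.

```rust
// Since the input is taken modulo 256, there are only a handful of Fibonacci
// numbers we can ever see, so membership in a precomputed list replaces
// recomputing the sequence on every packet.
fn is_fibonacci(n: u32) -> bool {
    // Distinct Fibonacci values below 256: 0, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233.
    const FIB_BELOW_256: [u32; 13] = [0, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233];
    FIB_BELOW_256.contains(&n)
}
```

A thirteen-element scan is constant work per packet, which is exactly the property the regression-fixing version needs.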
Macro would be doing something more at the integration level, the equivalent of integration tests; this would be that spawn_agent function we had over on the user space side of things. First we're going to take a look at the micro benchmarks. There are some options for doing Rust micro benchmarking. There's libtest bench, which is built into the Rust toolchain, but it's nightly only, and it's currently in the process of some pretty drastic revamping. You have to use an unrelated crate, also called bencher, if you want to use it on stable Rust, and that crate isn't really being actively developed to keep pace. And, as I said, there's a lot of churn, a lot of things that are about to happen there. So keep a lookout, but it's probably not the best choice. Then there's Criterion, which is available on both stable and nightly, and has kind of become the de facto standard in the community. It is much more feature rich and allows for much better analysis and comparison of your benchmarks. The final option we'll look at is Iai, which is also available on both stable and nightly, but is considered experimental. It's from the same creator as Criterion, and it does things a little differently: single-shot benchmarking using Cachegrind, so it counts instructions, L1 accesses, L2 accesses, RAM accesses, and estimated cycles. This can be really great for single-shot benchmarking, as opposed to using wall clock time. Just to keep things simple, we're going to use Criterion. So let's take a look at using Criterion for micro benchmarks. We're going to go over to the map section, and what we're going to do is actually refactor code from the kernel side into the map section, since we're able to access it a little more easily there. And in our dev dependencies, we add Criterion.
Then we add our benchmark, called source_addr, and we set harness = false, since we're using Criterion and not the default built-in benchmark runner. We're going to implement a function for that SourceAddr message type we created, where we just pass in the source address, and here is where we calculate what message should be sent back based off of it. This is the exact same code as in the last version of the application. If we take a look at the benchmark itself, we take a Criterion object and run our source address benchmark for every number from zero up to 256, to try and cover every case. The rest is setup for Criterion: creating a benchmark group, and a main function to run the benchmark. All we then have to do is run cargo bench, and our benchmark will run, so we can benchmark the source address logic there. That's micro benchmarking, all done, nice and easy. Now, macro benchmarking: this is going to get a little more complicated, because we're having to test end to end with the eBPF side of things, which means we can't just move things over from the kernel side; we really need to instrument it. There are a couple of different ways to instrument eBPF. There's the kernel.bpf_stats_enabled sysctl, which tracks both the run time in nanoseconds and the total run count for all eBPF programs, so it gives you the total amount of time a program took and the number of times it ran. It was added in kernel version 5.1, and it's off by default. The next one is bpftool prog profile. That is sort of like Iai, but on the kernel side of things: it collects perf counters, so instructions, loads and misses, cycles, and things like that.
If we were using Iai, this might actually be a good complement on the integration side of things. It was added in 5.7, but it does require bpftool to be built with Clang greater than version 10. Another one in bpftool is prog run, which is pretty neat: it lets you run a specific eBPF program, provide its input and context, and it will return its output and data context as well. It was added in 4.12, but only for specific eBPF program types, which I've listed here. Even though XDP is on that list, most of the types folks are going to use aren't, so instead we're going to go with bpf_stats_enabled for our example. Alrighty. In order to make this work, we're going to have to make quite a few changes on the user space side. In this function, we're still going to parse arguments, but now we add this shutdown boolean, and we pass that into this helper ebpf_run function. Then we wait just as before, and if we get an exit, we tell the shutdown that we are indeed shutting down, and then we exit. Let's take a look at that ebpf_run function. It's going to return this process structure, which holds the process ID, the program FD, and a handle to the running task. This will become important for being able to both run this as an application and also cleanly start it up and shut it down when we are benchmarking. So we get the process ID of what is running. Then we get the program and load it just as we did before. Then we spawn it as an async task, so it's running, you can think, on another thread or green thread sort of thing, using the spawn_agent helper function. And then we return the process information. Taking a look at that spawn_agent helper function, it takes in the Bpf handle that we're running and that shutdown bool.
We create the map like we did in our previous helper function, then loop over it, get the source addresses, and print them out like before. The only real difference here is that we're checking every so often whether we should indeed exit. And that's it. Then, on to the custom harness. This is basically recreating what Criterion did for us in the micro benchmarking. We just have a benchmark with a name and a function, and we use this crate called inventory to help collect those up. We don't have a macro like Criterion's to make our main function, so we're going to do that ourselves. We have a vec of results, and for each of the benchmarks we registered with inventory, we go through, parse the benchmark name, and run that benchmark function. We take the output, assume it's in the JSON format we would like, and push it onto our results. Then we parse those results as JSON, put them into a string, and save it to a file. And there we go. Then that's us adding a specific fun_xdp benchmark to those we're going to run. Taking a look at that, you essentially just return a 64-bit floating point as the result of however long this benchmark takes. We're going to need to spawn a new runtime and create a new shutdown; then, with a copy of it, we spawn our process, very similar to what we did on the user space main program side of things, using that run helper function. Then we do some work, like going to a cool website like bencher.dev. Then we get the stats, and finally we shut down and return those stats. So, taking a look at get_bpf_stats:
We go into the proc fdinfo file for our specific process and the file descriptor we were given for the eBPF program. We read that file and basically loop through until we find the run time in nanoseconds and the run count that we wanted. If they're both there, we calculate the average runtime; otherwise, we return zero. So either we get the stats and return the average, or we don't. There we go. So now we can actually run this, and, like I said, bpf_stats_enabled is disabled by default, so we have to set that sysctl to one if we want to do this. Once that's enabled, we can build our eBPF object (it's not quite a binary, but the code we need to hand to the kernel), and we cd into the user space side of the code. Because this has to run eBPF, it needs elevated privileges, which is why we're sudoing here before we run what's equivalent to cargo bench. The output of that is going to be this: the result from our fun_xdp function in JSON format, and we have the value there for the latency. Alright, so that is the user space side of things, and those are all the changes we need for the macro benchmarking; a lot more work on the user space side for creating that custom harness. There's maybe a whole 'how to build a Rust custom benchmark harness' talk hidden inside this one as well. So that's the macro benchmarks all taken care of. Now that we've benchmarked our eBPF program, we're going to look at continuous benchmarking, which comes down to this question: does your software's performance matter? For eBPF and our use cases, I think the answer is yes in most instances. But looking more generally at software where performance matters, there's kind of a scale, the way I like to think about it.
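The fdinfo parsing step described above can be sketched in plain Rust like this. The function name is mine; the run_time_ns and run_cnt field names are what the kernel emits in fdinfo once bpf_stats_enabled is on, and the fallback-to-zero behavior mirrors the talk's helper.

```rust
// Given the text of /proc/<pid>/fdinfo/<fd> for an eBPF program, pull out
// run_time_ns and run_cnt and compute the average runtime per invocation.
fn average_runtime_ns(fdinfo: &str) -> f64 {
    let mut run_time_ns: Option<f64> = None;
    let mut run_cnt: Option<f64> = None;
    for line in fdinfo.lines() {
        if let Some(v) = line.strip_prefix("run_time_ns:") {
            run_time_ns = v.trim().parse().ok();
        } else if let Some(v) = line.strip_prefix("run_cnt:") {
            run_cnt = v.trim().parse().ok();
        }
    }
    match (run_time_ns, run_cnt) {
        // Average latency is total nanoseconds divided by number of runs.
        (Some(t), Some(c)) if c > 0.0 => t / c,
        // Either stat missing (or zero runs): report zero, like the talk's helper.
        _ => 0.0,
    }
}
```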
You've got performance criticality here, ascending as you go down. There are enterprise applications, where the buyer is not generally the user, so the performance doesn't really have to be all that great. Then you've got more business level applications, where it's sort of nice to have, and consumer applications, where folks may churn if it kind of doesn't work, but it can be, you know, a React webpage; it's fine. Then, farther down and a bit of a gap away, you have database software, library software, and system software, which is definitely where we are with the kernel; eBPF sits somewhere in between library and system software. So that's definitely us. Then, how to track your software's performance, because not all software has access to production. You've got another axis here: access to production. At the very far end you have Dynatrace, Sentry, New Relic, Datadog, all these APM tools, which, if you do have access to production, let you instrument everything, see all you want, observability and things like that. But for the folks who have either very limited or no access to production, there aren't really all that many good tools, and that's where Bencher fits in. The idea is that instead of waiting for things to get all the way to production, which is too late, we've shifted as far left as possible and are running things in development. The problem is that that's local only and manual, and for the same reasons we started running unit and integration tests in CI, we also want to run our benchmarks there. That's the idea behind continuous benchmarking. Bencher is an open source tool that I built in order to be able to do this. It's both self-hostable, and there's a SaaS version at bencher.dev.
It has multi-tenancy and multi-language support through adapters, so you can use languages other than Rust with it. It has statistical thresholds and alerts: if a threshold gets exceeded, you can have it fail your build and even comment on your pull request. It also integrates with GitHub Actions to make that really easy. The idea is to track your benchmarks as we went through those versions of our application: the first couple were kind of fine, maybe adding a little bit more latency, but that final version is what really blew the top off and should have generated an alert for us. That's what detecting those performance regressions is about. And this is actual dogfooding here, with Bencher being able to go in and detect performance anomalies. This is an example of a statistical threshold set between the main branch and a merge request branch. The way detecting those anomalies works is, again, statistical thresholds: as you get those first couple of data points, they're going to be near the mean, and then that last point is going to be out here, farther down the distribution. Essentially, where that green section ends is the cutoff of what is acceptable in the distribution, and you generate an alert when you exceed it. Continuous benchmarking with Criterion, if you're doing it with Bencher, is pretty simple: you do bencher run, use the Rust Criterion adapter, and then run cargo bench. You can pass in the err flag if you want, which will make it fail if a threshold is exceeded and an alert is generated. For the macro benchmarking, you do bencher run again, but this time the adapter is JSON, because we created that JSON output; you read in the file and then run the command.
And again, you can have it fail in CI if there's a problem. That is what Bencher allows you to do to catch performance regressions in CI. To add it to GitHub Actions nice and easily, there's actually a built-in GitHub Action to download it, and, you know, if that branch exists, it'll run. There's a bunch of documentation on the Bencher website about how to set this up, how to get things integrated with GitHub, and how to have it comment on your pull requests; you can go into that more if you're interested. So, in summary: with continuous benchmarking, detection leads to prevention. Production is too late, development is local only, and continuous benchmarking can save us a lot of pain. Also, don't reinvent the wheel. There's a prior art section of folks who have had to reinvent the wheel here, because there hasn't been an easy to use, readily available tool to do this before, so everyone has just rolled their own. I'm hoping to save you a lot of work and make this easy going forward. Bencher has really high dimensionality, which is one of the things that, I feel, lets you use it for just about any project that cares about this, because it compares things based off the branch, such as your main branch; the test bed, so if you're running on GitHub Actions that'd be something like ubuntu-latest; the specific benchmark, in our case fun_xdp; and also the specific metric kind, like latency. Here is an example of running four different benchmarks and comparing them all: they're running on the same branch and test bed, with the same metric, but they're different benchmarks. And if we instead wanted to compare something like the develop branch versus the main branch, we can also slice and dice that way: this is the same benchmark, but now we're comparing differences in the branch we're running on.
This is helpful both for looking at things like this with comparisons, and also when you're creating those thresholds. A lot of the other tools people tend to make are a little simpler, in that they can't handle these differences, so you can't separate out and say we're running on a Windows box, or a Linux box, or a Mac for the test bed, or, even if both are Linux, that this one's ARM and this one's x86, and things like that. A lot of folks do want to be able to do those comparisons, so this high dimensionality is super important within Bencher. This is a public perf page. For public projects, you're able to direct people here, and when Bencher comments on a PR, this is what it pulls up for you: an easy way to see exactly what's going on, track things over time, and do those slice and dice comparisons based off the dimensions I was talking about. It's publicly available; you don't have to sign in or anything. There's also the ability to embed your perf plot, so if you want to stick it in a bigger site, or somewhere in your documentation, or things like that, you can; if you just hit the share button, it's there to embed. And there are also static image versions of these, so you can embed them in your README, which I think is pretty nifty. You can't have dynamic content in a GitHub README, but all the dynamic stuff here is done on the back end, so it's a static image with basically a zero cache lifetime from GitHub's point of view: every time they pull it, we regenerate the image for you with exactly what your plots look like. So that's continuous benchmarking with Bencher. In our overview here, we went over creating a basic eBPF program in Rust. Then we evolved that eBPF program, right: we added fizz, then FizzBuzz, and then FizzBuzzFibonacci.
And then we finally fixed our performance regression there with FizzBuzz Fibonacci. Then we covered how to benchmark our programs, both micro and macro benchmarking, to figure out how we can prevent those performance regressions in the future. And finally, continuous benchmarking: when it's applicable, when we care about performance, and if so, what to do about it, as opposed to just rolling our own solution. That's been Run Fast: Catch Performance Regressions in eBPF with Rust. Again, that's the repo link, and if you don't feel like typing that much you can just go to venture.dev/repo, and that will forward you there. And if you like the project, please do give it a star; it actually helps people when they're evaluating projects. You said you can use this to benchmark any project. What is the largest project you or others have used it to benchmark? Yeah, I think the largest public one would probably be Hydra, which is a database that's using it. There are a few others who have picked it up and are using it. I guess the key question is: is the concern there the size, like the number of benchmarks being tracked? I was just looking to see the kind of projects or software that could benefit from Venture, wanting to learn more about how it's currently being used: who is using it. Yeah, I am using it myself. There are quite a few users who have signed up and used it; public projects are probably the best place to go look for that. But the Hydra folks might be a good example to talk about: they're using it to track their database performance. Yeah. So it tends to be library code and those sorts of things, where there's no production environment for them to instrument. So if somebody were to look into how they could use it to benchmark the kernel, per se,
any of that would be through the libraries, is that correct? Yeah, so the key part is that Venture is benchmark harness agnostic. There are these adapters to make it easy for folks who use pretty standard benchmarking harnesses; they don't even have to specify the harness. In the docs here I showed specifying the harness, but you don't even have to: Venture can detect which harness you're using without you telling it. And then if, say, you're doing kernel development and you have a special benchmark runner or harness, as long as you can output JSON that Venture understands, you're able to use it. So it's completely agnostic to exactly how the benchmarks are run. The way I designed the project is specifically so it could be used across basically any project: pick a kernel benchmark, hook it up with Venture, and make sure Venture can understand the output JSON. It's a pretty simple format; let's go to the docs. This is basically the format here: the benchmark name, then the metric kind, so latency, and then these values, the lower and upper values. If something outputs in that format, Venture can take it and do all of its magic. That sounds good. So, to tie this back to the talk: Venture obviously wouldn't have an adapter for a custom benchmark harness like the one we made for our macro benchmarks, and that's why, when we had that, we output this benchmark format here, the BMF, the Venture Metric Format. Does that answer your question? Yeah. So, we have some benchmarks we run: kernel performance benchmarks that the perf tool runs as part of the kernel.
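As a sketch of the simple output format just described (benchmark name, then metric kind, then a value with lower and upper bounds), here is a small Rust function a custom harness could use to emit one entry. The exact field names are assumptions based on the talk's description, so check the Venture docs for the real BMF schema:

```rust
// Emit one result in the JSON shape described above:
// { "<benchmark>": { "<metric kind>": { value, lower_value, upper_value } } }
// Field names are assumed from the talk, not taken from a spec.
fn bmf_entry(benchmark: &str, metric_kind: &str, value: f64, lower: f64, upper: f64) -> String {
    format!(
        "{{\"{}\":{{\"{}\":{{\"value\":{},\"lower_value\":{},\"upper_value\":{}}}}}}}",
        benchmark, metric_kind, value, lower, upper
    )
}

fn main() {
    // e.g. a kernel benchmark result hand-converted into the format.
    println!("{}", bmf_entry("sched_messaging", "latency", 88.0, 87.4, 88.6));
}
```

A real harness would likely use a JSON library rather than string formatting, but the point is that any runner that can print this shape can feed Venture.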
And that outputs... so if we wanted to use Venture to automate some of that benchmarking, we'd have to hook up the output to be JSON. You know, perf bench is what I'm thinking about here, the bench tool in the kernel that we use. Okay, it should be fairly simple to get them to play nice together. And if it generates that, then the benchmarks could be run on the kernel; we do run them, and people do run the benchmarks as new kernel releases come out. Yeah, right now it tends to be either manual or a homegrown solution. The Rust project's perf setup is an example of one; I've taken a lot of inspiration from them, but it's just for the Rust project, so you can't use it for anything else. So, does it tend to just be someone running it on a box they have at their house to compare, or do you have a hardware setup? Well, there is the kernel CI that we have; they run primarily tests, and I don't know if they run performance benchmarks. I tend to run them manually at the moment, when I'm looking at a performance problem, or something looks off, or I want to generally see what's happening. And kernel developers run them, of course; they're continuously using and running them, and various users run them as well. There is a user base, but most or all of it is manual. Okay, so it would be manual. Okay, who would be the right folks? Since Plumbers is next week, would there be any folks that would be good to chat with about that? So, if you're going to be at Plumbers, I'm going to be there as well, and there are performance talks happening.
Yeah, I don't think there is a microconference, but there is a perf talk, either at the Kernel Summit or the refereed track; check those out. I don't remember exactly. I am on the program committee, and I'm trying to remember; I think I have seen one or two perf talks. Okay, that would be on the performance side, like I mentioned with the perf bench tool that we have. Yeah, the perf tool does a lot more than benchmarking; it does a lot of other things, but benchmarking is one of them. If you have a kernel repo and you're running the latest kernel, you can just say perf bench cpu, for example, and it'll run benchmarks on the CPU, or perf bench mem, or perf bench all, which by default will run CPU, memory, and I think the scheduler benchmarks. So, that sounds really good. Okay. So, yeah, look those up on the LPC schedule. I think the schedule is out, so you can take a look. Yeah, I think the schedule is there; I've started taking notes on what I might go to. That sounds good. Any questions from anybody else? It's been a very quiet group today. Any questions on the importance of not letting these escape to the product, or on how expensive it could be if one were to escape, release after release, versus the benefit of finding it now? Yeah, it's like 10x the cost or something. Yeah, depending on where the problem is. In some cases, depending on what you're testing, if you have a product that has the vertical of hardware and software and kernel and all of those, the problem could be somewhere you can't really easily fix once it's deployed. Yeah, 10x. Sweet. Well, do folks want to take a look at Venture here? Yeah, go ahead; we have time. Okay. So, if we go back here.
This is me dogfooding Venture with Venture. So, this is the latest report, which is from when I pushed things up on Halloween for a release, and these are just those adapters I was talking about; we track those. And these alerts here are me playing with how tight I can get the threshold on GitHub Actions using just the default runners. There are a couple of schools of thought on how to run your benchmarks. You can get a nice, pristine, isolated bare-metal box that you run your benchmarks on, only use that single runner for all of your benchmarks, climate-control the room, and so on, to get as much noise reduction as possible. And then there's everywhere from there down to what I'm doing here, which is just running on the default runners on GitHub Actions. So if you look at this boundary, you can see me playing with it over time: even though this code hasn't changed, you see the variance across the runs, and I'm still trying to figure out the best way to handle that statistically, what the value should be and where the noise threshold sits. That's something I'm also looking at with Venture long term: making it easier to use dedicated runners that might be a little less noisy than CI. So if anyone is interested in that, please reach out. I've been working with some of the higher-profile Rust projects that are also looking at this, to see what they want to do there. So there's that.
And if we take a look at one of these alerts: even though this one is a false positive right now, because of the playing around with the constraints, if it weren't and we wanted to look at it, we can come in here and it will tell us where it occurred, the upper boundary that was calculated, what our actual value was, and that we exceeded it. We can also dismiss the alert. So then if we go back to the perf plot and refresh, that alert goes away. So you can dismiss alerts based on how you want to handle them, and as you can see, I have a bunch of alerts because I've been playing around while setting things up. The metric kinds are essentially the units. These are reports, which is the running of all those benchmarks together, which is what we were just looking at in the plot. For metric kinds, by default each project starts with latency and throughput, but you can add additional ones. So if you're using something like the iai adapter (there's actually a built-in adapter for that in Venture), it'll create its four metric kinds, the instruction counts and things like that. And with the kernel, if you're using perf bench, you can add whatever your standard units are that you care to track, if they're something different from these. Then, with the branches, as you can see, these are for merge requests. I did not create these; given the way I configured things in CI, they're created from the branch they're branched off of, which here happens to be develop. The branch basically does a shallow copy of that base branch, and then any new metrics added on top of it are just for that PR branch.
So that allows you to try to catch those performance regressions in CI, on PRs, before they actually merge into your main branch. When those branches get cloned, the threshold is also cloned, which we'll take a look at in a second. Then there are test beds; by default you just start with localhost. This is my ubuntu-latest test bed, which is used for running things on GitHub Actions. And then the benchmarks: these are the benchmarks I have set up so far with Venture. There are more adapters, but this is just showing these. The default adapter lets you not specify an adapter at all: the parsing is smart enough to figure out which adapter should be used and use it, but that means it's a little bit slower. And then these are the thresholds I was talking about. Essentially, if something is a merge request running against, say, develop or main, it will clone the threshold that's there and use it in its processing. This is an example of that cloning: a Student's t-test with a 0.95 cutoff, meaning values below that point in the cumulative distribution function pass, with a maximum sample size of 30. So those are the thresholds. We took a look at the alerts here; these are the alerts for the different benchmarks, which we looked at a minute ago. And with statistical thresholds, you can also visualize the lower and upper boundaries; it's probably easier to see if we're only looking at a single plot. So if you have those lower and upper boundaries and a lot of variance, you can visualize them. Right now this is plotted by date, so these are days in October here, but you can also change the x-axis to be based on the version number, which just increments with each version you push up.
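As a rough illustration of how a Student's t-test threshold like the one above could work: take up to the last 30 samples from the base branch, and flag any new value above the sample mean plus the t critical value (at the 0.95 cumulative probability) times the sample standard deviation. This is a simplified sketch under those assumptions, not Venture's implementation; in particular, the t value is hard-coded here rather than computed from the inverse CDF for the exact degrees of freedom:

```rust
// Upper boundary = mean + t_critical * sample standard deviation.
// A real threshold implementation would derive t_critical from the
// Student's t inverse CDF for n - 1 degrees of freedom.
fn upper_boundary(samples: &[f64], t_critical: f64) -> f64 {
    let n = samples.len() as f64;
    let mean = samples.iter().sum::<f64>() / n;
    // Sample variance (Bessel's correction: divide by n - 1).
    let var = samples.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / (n - 1.0);
    mean + t_critical * var.sqrt()
}

fn main() {
    // Made-up base-branch latency samples (capped at the last 30 in practice).
    let samples = [100.0, 102.0, 98.0, 101.0, 99.0];
    // t at cumulative probability 0.95 with 4 degrees of freedom is about 2.132.
    let boundary = upper_boundary(&samples, 2.132);
    let new_value = 105.0;
    if new_value > boundary {
        println!("alert: {} exceeds upper boundary {:.2}", new_value, boundary);
    }
}
```

With the made-up samples above, the boundary lands a bit over 103, so a new run at 105 would raise an alert; tightening or loosening the 0.95 cutoff moves that boundary, which is exactly the knob being tuned against CI noise in the earlier discussion.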
So if you want a more normalized view of things as they go through, that's available there. Also, for folks who are non-native English speakers, I have translated the docs using GPT-4 into eight different languages, so hopefully one of those might be more useful for you. I'm just looking at venture.dev: you have self-hosted and Venture Cloud. What does Venture Cloud entail if a project wants to use it? Yeah, great question. Venture Cloud is essentially the same thing as Venture self-hosted; I'm just running it for you on venture.dev. The nice thing is you don't have to worry about running it; that's the key piece. I'm trying to pay for and maintain it by charging for private projects, but if you're a public project, it's completely free to use. So that's the cloud one you're asking about. And that applies to both: if you put the self-hosted version behind your VPN where no one outside can access it, then as long as you're okay with folks who have access within your VPN being able to come to your project's page and take a look at things, you're fine. Okay, that's all good. Thank you. Any other questions? What would you suggest for developers that want to get involved in your project, by improving Venture and such? What would they do? Yeah, great question. I have a couple of good first issues tagged here for different adapters, if any of those are pertinent to folks. Probably within this community, the hyperfine adapter might be a good one; hyperfine is a CLI tool for benchmarking binaries and things like that, so that might be a good one to start with. And it should hopefully be very easy to get started contributing. I have a dev container setup.
You can use either a local dev container or a GitHub Codespace to start developing. So if you don't want to have to download and install everything, you basically just click, and I have scripts that will automatically set up everything you need inside. That dev container works on any operating system. Well, if you want to do the eBPF stuff, you need to be on Linux, but for general Venture things, yes, this dev container should spin up, and when it does, it also spins up the development back-end server and UI, so you're ready to go out of the box. And over here is an example of the kind of PR comment that you'll get with Venture: if you have an alert, this will pop up, and you can view the plot and the alert, things like that. Yeah. Thank you very much. Awesome. Well, thank you all so much. Yeah, I can wrap us up. Thank you, Everett and Shuah, for your time today, and thank you everyone for joining us. As a reminder, this recording will be on the Linux Foundation YouTube page later today, and a copy of the presentation slides will be added to the Linux Foundation website. We hope you're able to join us for future mentorship sessions. Have a wonderful day.