what are we using, and maybe spark some interest from the community to contribute. So everything starts with Patchwork. Patchwork is a patch tracking system. It's open source, and it makes the patch management process easier for maintainers. After you configure it for your subsystem (in our case, BPF), all patches submitted to bpf@vger.kernel.org will appear on the Patchwork netdev/BPF list. There it shows the status of each patch we tested, and it knows how to collapse the patches of a series into one entity. It can be enabled for any Linux kernel subsystem, and in fact many already did enable it.

For example, for BPF you can see the list of patches, and you can see that some of them have warnings and some of them failed. Warnings are usually coming from some netdev tests; the BPF tests are usually passing. If we open an individual patch, you can see that it knows whether it is part of a series, and it also knows which tests succeeded and which tests failed, which is pretty awesome and hopefully useful for maintainers. This is a small subset of the kernel subsystems that have already onboarded to Patchwork; here you can see netdev, BPF, XFS, and many, many more.

But what is the black box that actually takes a patch from Patchwork and does all the magic? That is the Kernel Patches Daemon. The Kernel Patches Daemon is a service implemented in Python, currently hosted on Meta's infrastructure. It tracks and identifies new BPF patches in the Patchwork BPF section, creates pull requests against the kernel-patches/bpf GitHub mirror, runs the GitHub Actions on top of them, and once that's complete, it fetches the results and posts them back to Patchwork.

This is how a pull request actually looks. This patch series has four patches; the daemon adds the CI files on top, making five commits, and submits it for merge into the kernel-patches/bpf GitHub mirror. And if we look into kernel-patches/vmtest, this is basically the commit that goes on top of the patch series. It has the .github directory that does all the magic: inside are the step definitions, what to do, how to execute the tests, everything. So we added the kernel-patches/vmtest and kernel-patches/bpf repositories. What is kernel-patches/bpf? It is just a GitHub mirror of the bpf-next and bpf Git repositories from kernel.org, updated periodically. Once a PR is created in this mirror, GitHub starts executing the tests defined in the vmtest repository. And yeah, this is just a screenshot: kernel-patches/bpf is a regular kernel tree, nothing special, no magic.

But where do we actually execute all these tests? They are executed on GitHub Actions runners. We are using self-hosted GitHub runners. We currently support two platforms, x86_64 and s390x, to test endianness, right? But we can obviously add more architectures. And if we had cross-compilation support, which we discussed with Dave yesterday, we could build things on x86 runners and execute them in architecture-specific VMs. That way we could support pretty much any platform. The GitHub runners pull work from GitHub, build the kernel, build the selftests, and start QEMU to execute everything. This is a screenshot from the runners page. You can see that for x86 we use AWS instances, and for IBM we use IBM Cloud instances. They are all registered and we can control them, so it's transparent and convenient.
So what do we actually run? We run the regular BPF selftests. In particular, we run test_progs, test_maps, and test_verifier, and we run two flavors of test_progs: one regular and one no_alu32. This is an example of a successful run: you can see that we have four tests executed, all passing, all awesome. This is an example of a failed run: test_maps failed, and we posted the error here, so the maintainer has a pretty strong signal about whether the patch has any issues and can redirect the submitter if anything went wrong.

So what is next? We have lots of ideas for how to improve the system. We want to add user space and kernel space sanitizers. We want to support Clang builds; currently Clang issues get fixed as they occur, thanks a lot to everyone doing that. We want to add the arm64 platform; it should be pretty easy with AWS Graviton instances, which are available, so we can just add them to the infrastructure. We also want to extend BPF CI support to other Linux subsystems. Some candidates are btrfs with xfstests, or RCU with the rcutorture tests. This just adds more weight to what Josef mentioned yesterday during the lightning talks, right, that all maintainers will benefit from it. We should also continue adding new BPF selftests and iterate on the test_progs infrastructure to make it more usable and more convenient. And one thing that would benefit almost everyone is to make the CI executable from a local machine. So even before you send a patch, you can actually initiate the run, get the results, and be pretty confident about your patch going to the maintainer for review.

So we have some CI tests that run nightly on bpf-next that do longer-running tests. We run netperf and then attach a bunch of programs and stuff. Would you consider adding that to something like this, or do you want to keep it to these unit-test-level things, basically?

So this system tests patches that haven't been applied to the tree yet, right? What you are saying, if I understand correctly, is testing the tree nightly, after patches have already landed. We currently don't support that, but we can definitely add it. I'm not sure if it will be through CI or not. Sorry, through GitHub, or maybe we'll need something else.

So on GitHub we basically have a nightly job, outside of the per-patch jobs, that runs these longer-running tests to make sure that some patch applied during the day didn't break something that the unit tests don't detect, something that only breaks when it runs for a long time, a long-running networking stress test or something. It's something we could contribute if it's interesting.

Not so long. I mean, we have some that run for minutes, and then we have one longer run that we run on, say, Sunday, which runs overnight or something.

Yeah, so one problem is that all of this runs for every single patch set, on every revision, and also whenever we update bpf-next or bpf, right? All of that is re-triggered once we land a patch. So if it runs for an hour, then... well, we could have a lot of workers to run it.

No, no, we would trigger it nightly, like a nightly test.

We could, yes. So I think it all is, or will be, configurable, right, which tests run where. You didn't mention it, but we have the bpf and bpf-next baseline tests as well.
So we have two fake PRs with no patches; they are just testing bpf and bpf-next. So we could add those long-grinding ones there. Whether it comes from the kernel sources or from outside is kind of secondary, though it would probably be easier if it's contributed to the selftests.

So I think, at least every time I saw nightlies, the way they are used is by people who are playing catch-up. In this case, you have the Cilium stuff; it's not part of this CI. That's why you have to run it nightly, that's why you test after it got merged, hoping that you catch all the regressions quickly. But instead, if you could contribute the Cilium tests in whatever capacity, either upstream to the selftests or to this BPF CI, your need to run nightly would be greatly reduced, because we would run it for you.

So we use something already very similar to this on the Windows side too. And there, what we run nightly is things that have a very low probability of catching something but take a long time. If it takes a long time and there's less than a 1% chance it would ever catch anything, then sometimes you want to run those nightly just to increase your agility. You can merge patches faster, because the probability of there being a problem is not worth delaying everything by consuming all the CI cycles on a per-patch basis. So most of the tests, unit tests and things that are short, run before and on every patch, right? But if a test is very time consuming and, gosh, it found one problem in the last two years, then it's probably not worth running per patch, so we run those afterwards. But otherwise it's almost equivalent to what you said: the top bullet and the bottom bullet are what we do, and what John said, so.

Let me just add: the problem with running everything on every patch is that some of these things, kTLS for example, really depend on the packets that are coming into kTLS. Maybe there's one combination that we never see very regularly, but we might catch it in a nightly. If we run for an hour, maybe every third day or something, we catch it. And you would never run your CI for an hour for every patch, right? The cost alone would rule it out.

The question is, who is chasing those cycles? Right now they're on my list of to-do things and...

So could we add synthetic traffic to the tests, and then put some of these weirder use cases we find into a synthetic test, essentially?

I mean, I think the answer... one thing we're looking at right now is: can we encode these in packetdrill? We've been talking with some of the Google folks about whether we can get a better packetdrill that builds all of these tests and then runs from CI, with sort of strange packet patterns, to catch some of this. But, you know, that's on my to-do list too.

Just one comment regarding regressions: the kbuild bot does some of that, but it's maybe not always reliable, or I haven't seen many reports in the past. But it's also a question of how much you fluctuate, of how stable the environment is where you actually run this, so that you can get the reliable results that you want. Yeah, performance tests, yeah.

So, to Jason's point: synthetic workloads are great. The question is, you know, are they fixed, always exactly the same synthetic workload, or is there randomization, from a fuzz-testing perspective, right?
And there are actually use cases for both of those. With synthetic ones, you know whether there's going to be a regression, because the workload is predictable and there's no flakiness or probabilistic element to it. If it's randomly generated, then it's more like a fuzz test, right, where you could have random failures because you suddenly hit a path that's very unlikely, but you finally hit it in the fuzz test. And for any fuzz-testing run, the run should output the seed, so you can always reproduce it by feeding the same seed back in, even if you run the test locally. You can do the same thing with synthetic network workloads if they're randomly generated or have some randomness in them: as long as you output the seed in the log, you can reproduce the failure.

Yeah, I agree with that. The thing with that is, I think most fuzz-testing tools that are public have a corpus you build up, so you have the known set, and then the tool starts permuting those packets in horribly broken ways. The seed for the permutation is the important thing to capture to be able to reproduce failures.

But I would regard this infrastructure as being mostly for correctness, for catching regressions. So it's more about the selftests, and maybe less about the performance stuff, right? So, yeah.

So I think adding more architectures is definitely super useful, especially arm64. And one more thing, maybe: the test_bpf kernel module. At some point this was on my to-do list, but I never got to it. I think it would also be useful to run, because it contains lots of tests that are not in the selftests themselves. It has a battery of tests and, given that it lives inside the kernel as a kernel module, it's mostly testing corner cases around the JITs. Someone in the past added a lot of new tests while developing the MIPS64 JIT, and right now they are not run as part of the CI. I would love to see that happening.

We can add it. If it's just loading a kernel module, it should be pretty easy to orchestrate.

So my understanding was that it's not supported right now, the kernel module part, I think you mentioned. Yeah, I think we have limitations on having external modules. We have bpf_testmod, so we can copy-paste that approach and such. It's all doable, but I think we should have a protocol between the kernel module and test_progs, because I would integrate everything into test_progs. test_progs provides the most advanced infrastructure: you can denylist tests, get only the relevant error output, and all that. We don't want to reinvent that; it's a lot of code, a lot of logic, a lot of iteration. For this module, we should probably have a test that loads the module and then, I don't know, maybe invokes it once per test case and collects the results, something like that. So, some simple protocol to run one test at a time and surface it as a subtest in test_progs; then it would be integrated very naturally (see the sketch after this exchange).

Yeah, you could embed the module in test_progs itself, do an init_module call, and then have some parameters that run one test at a time. Yeah, yeah. I think test_progs already loads something, right? That's different, that's different.
Yeah, yeah, it's a different module, but I mean, we already have some infrastructure that does this. And BPF preload we should also load as a module, because I saw some files are missing here; I think we currently compile it built in (=y), and we should also compile it as a module (=m). As for the second line, the question will be how we do it, and how we do it fast. Yes, yes. You're just looking for low-hanging fruit, right? I can't do it; I think I'm a week or more behind already.

But there are a lot of tests that are not run because they're not part of test_progs, right? A low-hanging fruit would be to take all of those; some of them were written before test_progs existed. I'm probably the guiltiest here: all of the test sockmap stuff is done on the side, and we run it in our own CI, but really those kinds of things should all be moved into test_progs, and then they would just naturally run on both the default and no_alu32 flavors and on all the platforms. Remember that test_progs also has the parallel mode now. Yeah, yeah, it's usable in practice. You'd probably want to do it like that. So, we'll review it.

So I remember you did the overhaul of the selftest framework, right, with the skeletons and stuff? No; in the selftests, in test_progs, we sort of, at some point, started agreeing on the format in which we write tests and such (a sketch of that format follows this exchange). Yeah, all those changes to test_progs, we rely on them here. Okay. So generally, over the last two or three years, we've been consistently moving or adding everything to test_progs. Sometimes we convert the existing tests into test_progs. There are still lots of them, especially network-related ones, that live outside of it, and, to be honest, we don't run those regularly, I assume, right? So there are lots of ways to contribute here, let's say.

So, the local CI stuff: I gave a rough first shot at it with the vmtest.sh stuff. Is it going to be... So I think this point of making the CI runnable locally will probably mean taking vmtest.sh, trying to reuse what's possible, and then somehow wiring it up with the CI, I don't know. They use the same image, by the way: vmtest.sh downloads the same test image. So it's just more work to put everything together and make it easy to use. Yeah, the idea behind running CI locally is that a developer who sends a patch can reproduce exactly the same tests that the maintainer will see. So if something is failing, you are both on the same page for the discussion. We talked about teaching vmtest.sh to either compile or download Clang and pahole and all that stuff, so you don't have to set up your environment beforehand. That direction, right: make it a completely reproducible sandbox. Like a hermetic sort of thing. Yeah.

So Clang has nightly builds, right? That's what we use in CI, yes. Okay, so this would be... They are semi-nightly, because sometimes their pipeline is broken, so you have to wait for a week or so. It's still better than spending 10 hours to realize that what has effectively gone wrong is that your Clang version is out of date, and then recompiling Clang. Yeah, we couldn't afford building Clang from source, because we would hit the GitHub Actions timeout. I'm happy to sit down with Mykola and try to add this to vmtest.sh while we are here. Awesome. Perfect.
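To make the test_progs format mentioned above concrete, here is a minimal sketch of what a test under tools/testing/selftests/bpf/prog_tests/ typically looks like. The "example" BPF object and its "triggered" global are hypothetical stand-ins; the example__*() functions are the names a bpftool-generated skeleton would provide for an object called example.

```c
/* Minimal sketch of a test_progs test, assuming a hypothetical BPF
 * object "example" with a global variable "triggered"; the skeleton
 * header is generated by "bpftool gen skeleton".
 */
#include <test_progs.h>
#include "example.skel.h"

void test_example(void)
{
	struct example *skel;

	/* Open and load the BPF object embedded in the skeleton. */
	skel = example__open_and_load();
	if (!ASSERT_OK_PTR(skel, "skel_open_and_load"))
		return;

	/* Attach all programs declared in the BPF object. */
	if (!ASSERT_OK(example__attach(skel), "skel_attach"))
		goto cleanup;

	/* Here one would trigger the code path under test (for example, a
	 * syscall the program hooks), then check what the program recorded
	 * in its global data or maps. */
	ASSERT_EQ(skel->bss->triggered, 1, "triggered");

cleanup:
	example__destroy(skel);
}
```

Because a test written this way follows the shared conventions, it automatically benefits from the test_progs infrastructure discussed above: allow/deny filtering, both test_progs flavors, and consistent error output.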
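And here is a rough sketch of the module-per-test protocol discussed in the exchange above, driving lib/test_bpf.c one case at a time so that each case appears as a test_progs subtest. The test_id module parameter does exist in lib/test_bpf.c, and as far as I know the module load fails when a case fails; NR_CASES and the load/unload-per-case protocol are assumptions for illustration, not an existing interface.

```c
/* Hypothetical test_progs integration for lib/test_bpf.c: run the module
 * one case at a time so every case becomes a filterable subtest. The
 * test_id parameter is real; NR_CASES is a made-up placeholder, and a
 * real protocol would query the module for its case count and names.
 */
#include <stdio.h>
#include <stdlib.h>
#include <test_progs.h>

#define NR_CASES 32 /* placeholder */

static int run_module_case(int id)
{
	char cmd[128];

	/* Load the module restricted to a single case; its init is assumed
	 * to return an error when the case fails, so the exit status of
	 * modprobe is the pass/fail signal. Unload before the next case. */
	snprintf(cmd, sizeof(cmd), "modprobe test_bpf test_id=%d", id);
	if (system(cmd))
		return -1;
	return system("rmmod test_bpf");
}

void test_test_bpf_module(void)
{
	char name[32];
	int id;

	for (id = 0; id < NR_CASES; id++) {
		snprintf(name, sizeof(name), "case_%d", id);
		if (!test__start_subtest(name))
			continue; /* filtered out via test_progs allow/deny lists */
		ASSERT_OK(run_module_case(id), "test_bpf case");
	}
}
```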
Thank you. Cool. So yeah, if you have any ideas or the desire to contribute, please do; everyone will benefit from it. And one cool thing is that if you make changes to the CI, we also run tests on those changes, so you will be able to see whether your change actually executed properly and what its effect was. In this case, for example, I added a kernel config printout just before running the tests, and I was able to verify it. So it's easy, it's not a problem. Any more questions?

So, aside from converting existing selftests that are not yet running in the CI, I think one other thing that is always a mess is the BPF samples. I would love to either move some of them into the CI or just remove them from the tree. Delete the samples; it's such a mess. Yeah. And there's enough material elsewhere on the internet already, right? There's not even...

That, I think, we should discuss as a separate topic in itself. We always bring this up, right: this is the program type, this is what this program can do, this is what you can do with that program. Some sort of curated list, like documentation on ebpf.io, would be nice, and that would be a nice place for samples. But there are two places in the kernel. I generally tell people who start with BPF to just look at the selftests, right? They explain how stuff works.

So yeah, for me as well: I recently started programming with eBPF, and I feel the samples are pretty good for getting started, but the selftests have a whole test infrastructure you have to understand first; they are actually more complicated than the samples. The samples are more standalone. We can move them somewhere else, but I think it's good to keep them somewhere, maybe linked from the documentation.

I think we can only guarantee that the selftests will keep working, right? Because I'm pretty sure that some of the samples are not running anymore, because some things changed. But test_progs we constantly give some love, so those are runnable.

With the samples, they are usually written as a demo application that does something, not as something that really tests in a fine-grained way. So converting them to selftests is really just writing the tests from scratch. So if we are doing something about them, let's just remove them. But I think the problem with the samples, which newbies especially don't realize, is that you have this convenient setup where you're compiling the samples against your current kernel, which is completely unrealistic for anything but networking, let's say, usually. It's just not a really good representation of how you would build a BPF application in real life. So maybe it's good for starting out, but we could also extract them into a separate repository as samples. We have libbpf-bootstrap for some simple tracing applications; we could have something like that, or extend bootstrap, whatever, I don't know. But we can keep it separate.

I totally agree. And what I like, I mean, it probably takes a bit of effort to build, but what I like is when you go, for example, to the Go language site: you have executable samples right in the documentation. I would love to have something like this, maybe for ebpf.io. That would be a starting point where users could play around. The samples in the kernel might be misleading in terms of where to start and how to start with things.
Like I said, we have libbpf-bootstrap for a uprobe example, a kprobe example, stuff like that, minimal CO-RE stuff. It takes a long time and a lot of effort to maintain all that, so it doesn't grow. So, you know, you'll need some active involvement if you want to maintain this as a collection, a big collection of simple examples. So we'll need to figure out how we want to go about this, but it should be a community effort.

Part of this discussion overlaps with my last slide, which talked about CI/CD for libbpf, where you'd want simple libbpf programs, and I mean programs both in the eBPF-program sense and in the sense of the user application that consumes them. And so I see a lot of analogies here. It's not just about testing the kernel per se; it's about how you develop samples and make it easier for developers to write programs with libbpf. And for at least some of those, you'd want them to be cross-platform; at least that's part of my goal. So where do we put those? ebpf.io is a nice place that could host a cross-platform set of samples and things. You could have some samples that are Linux-specific, some that are Windows-specific, and some that are cross-platform, and we'd have to have a way of categorizing them, labeling them, so that people would know what to look for. But otherwise, yeah, let's do it. Awesome, thanks a lot.
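As a rough illustration of the kind of minimal, standalone sample being discussed, here is a sketch of a user-space libbpf consumer, assuming libbpf 1.0+ error conventions. The object file minimal.bpf.o and the program name handle_exec are hypothetical; the bpf_object__*() and bpf_program__attach() calls are the real libbpf API.

```c
/* Minimal sketch of a user-space libbpf loader. Assumes a compiled BPF
 * object "minimal.bpf.o" containing a program named "handle_exec"
 * (both names hypothetical) and libbpf 1.0+ error conventions.
 */
#include <stdio.h>
#include <unistd.h>
#include <bpf/libbpf.h>

int main(void)
{
	struct bpf_object *obj;
	struct bpf_program *prog;
	struct bpf_link *link;

	obj = bpf_object__open_file("minimal.bpf.o", NULL);
	if (!obj || bpf_object__load(obj)) {
		fprintf(stderr, "failed to open/load BPF object\n");
		return 1;
	}

	prog = bpf_object__find_program_by_name(obj, "handle_exec");
	if (!prog) {
		fprintf(stderr, "program not found in object\n");
		return 1;
	}

	/* Attach according to the program's SEC() annotation. */
	link = bpf_program__attach(prog);
	if (!link) {
		fprintf(stderr, "failed to attach program\n");
		return 1;
	}

	printf("attached; press Ctrl-C to exit\n");
	pause(); /* the BPF program keeps running in the kernel meanwhile */

	bpf_link__destroy(link);
	bpf_object__close(obj);
	return 0;
}
```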