Now it's the boring thing that no one likes to talk about, which is testing, but we all have to do it somehow, right? So at last LSFMM, we had talks about things that were wishy-washy, things that we should do and work on. Folks discussed the possibility of maybe sharing a repository to post results for testing, fstests, blktests, that sort of stuff. That's happened. We have a shared git tree, we have shared responsibilities in terms of being adults. We agreed that we would be civil, and I think so far it seems to be working. Every now and then we do have commits from folks who break things, and we basically have to revert stuff, but it just takes about a day to notice and then someone complains. There's a Discord server, woo yay, I know some people hate that, but whatever. There's also OFTC IRC, so Darrick, I know Discord didn't work for you, so I'll be on IRC; if folks want to just jump on IRC, we can use that too. So just go to #kdevops on irc.oftc.net, you know. But let's see, what else. There is also the idea of not only storing failures, but also storing the actual full results. The reason is that you can use other tools and that sort of stuff to scrape results and provide fancy reports. I forget the name of the tool that was being evaluated by our team, but the request I got was that they not only want the failures, they actually want the successes as well. I didn't do the math for that; I only did ballpark, back-of-the-napkin calculations assuming we archived only the failures. So right now kdevops actually produces a full tarball that you can commit, and it basically only has the failures. I thought that would be sufficient, but apparently people want the successes as well. We can try to recalculate to see whether or not it makes sense to put that into the git archive. If not, maybe we just stuff it into another repository. So right now it does have failures, dmesg logs, that sort of stuff. Right now I think it's only been me and Chandan who have posted results from our testing. I encourage other folks who are using it to just create your own namespace for that. There are examples there: just your username, then whatever you want to call your environment. So I have, for instance, mcgrof/libvirt or whatever. It's useful because then you can always grep for things, right? These are compressed tarballs, you know? So you just have to implement the tooling to decompress them in memory and then grep for whatever it is that you want. We also now have support for compiling the kernel on the host and sharing it with the guest using 9p. We actually found some bugs with 9p, but that's because we had initially enabled caching. Right now we don't use any caching at all. Another option is for us to just switch to NFS; I'm not sure if there are any benefits there, but 9p seems to be working with caching disabled at this point in time. So just to be clear, this basically allows you to compile the kernel on a big system and then just install it on the guest. Darrick, for instance, had a question for me. He assumed that we were running things requiring only the kernel to be built, without modules. That's not the case; modules are supported.
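On the point above about grepping through the archived result tarballs, here is a minimal sketch of what that tooling could look like, in Python. The archive layout (per-user directories of .tar.xz files) and the example test name are assumptions for illustration; this is not the actual script or layout used by the shared results repository.

```python
#!/usr/bin/env python3
# Hypothetical sketch: walk an archive of result tarballs and report which
# runs contain output for a given test (e.g. generic/475). The layout
# (<user>/<environment>/*.tar.xz) is illustrative, not the real repository layout.
import sys
import tarfile
from pathlib import Path

def runs_mentioning(archive_root: str, test_name: str):
    for tarball in Path(archive_root).rglob("*.tar.xz"):
        with tarfile.open(tarball, mode="r:xz") as tar:
            for member in tar.getmembers():
                # fstests keeps per-test output in files named after the test
                if test_name in member.name:
                    yield tarball, member.name
                    break

if __name__ == "__main__":
    root, test = sys.argv[1], sys.argv[2]   # e.g. results/ generic/475
    for tarball, member in runs_mentioning(root, test):
        print(f"{tarball}: {member}")
```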
In fact, module signing is supported too, to ensure integrity, given that we're doing all this crazy testing, so that we can ensure the modules are fine. How is this used today? Well, we had a talk earlier today about how Amir and Chandan are helping with stable backports. So that's an example where kdevops is being used on two fronts: local virtualization solutions and also the cloud. In fact, Chandan actually ended up implementing support for OCI, so that's a new cloud provider, right? We didn't have support for that. And the way this is supported is basically just using Terraform. So if you are a cloud provider and you want to allow the flexibility to integrate into kdevops, all you need is a Terraform module. For example, I spoke a little earlier about cloud support for Alibaba. It was not clear whether or not there was a Terraform module; I Googled and it seems to exist. So we could support Alibaba pretty easily. It's pretty simple to add support for new cloud providers; it's basically just variables. I was asked earlier today if I was going to do a demo, but the answer is no, and the reason is that I wanted this to be more about the things that we should be talking about here. I had offered to give a demo, record it, put it on YouTube, whatever, and then you can just go watch that. If folks still need demos about specific things, I'm happy to provide that. There are YouTube recordings of that. There are also YouTube recordings about day-to-day usage. Just to be clear, I use kdevops not only for testing stuff, you know, file systems or the block layer. I use it for my day-to-day. All my development basically uses kdevops, either on the cloud or on my bare-metal system somewhere; I just instantiate and hack there. But obviously you can use it for your own testing. We have virtio support now, given that we had some issues with NVMe on QEMU. So just to be clear, this is QEMU instantiating the NVMe drives, and this is why I do know about the iothread stuff that was going on. We've been trying to see what the issues were, and iothreads certainly helped with virtio. For those who are not familiar, QEMU essentially works with this huge global lock, and using iothreads allows each storage device, for instance, to not share that global lock. So if you're using virtio, it's really, really fast. I know it helped Amir a lot to avoid all these oddball NVMe timeouts that were happening. If you layer that on top of btrfs or something like that and use loopback devices, you're always going to end up with timeouts. I started running into this as well, given that I'm also testing XFS with large block sizes. So I'm using virtio too, unfortunately. I do know that eventually we're going to have support for iothreads for NVMe too, so hopefully we can switch back to that. There's initial arm64 support for the cloud. I did this to start evaluating large block size support and what folks claim to be large block devices on the cloud. They're not really large block devices; they're basically using atomic writes. Everyone does it differently, you know. Some vendors are using NVMe, some of them are using other storage fronts. But if you want to evaluate, for instance, what Amazon did, you can go and spin up instances right now with kdevops and use the atomic write support.
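Going back to the iothread point above, the rough idea is that each virtio-blk device can be given its own iothread so its queue processing does not contend on QEMU's big lock. The sketch below just assembles an illustrative QEMU command line in Python; the disk paths, IDs, and machine options are made up and this is not the command line kdevops actually generates.

```python
# Rough illustration of the iothread idea: one dedicated iothread per virtio-blk
# disk. Paths and IDs are hypothetical.
def qemu_args(disks):
    args = ["qemu-system-x86_64", "-machine", "q35,accel=kvm", "-m", "4096"]
    for i, path in enumerate(disks):
        args += ["-object", f"iothread,id=iot{i}"]                      # one iothread per disk
        args += ["-drive", f"file={path},if=none,id=d{i},format=raw,cache=none"]
        args += ["-device", f"virtio-blk-pci,drive=d{i},iothread=iot{i}"]
    return args

if __name__ == "__main__":
    # Print the command so it can be inspected or pasted into a shell.
    print(" ".join(qemu_args(["/var/lib/vms/test0.img", "/var/lib/vms/test1.img"])))
```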
But my understanding was that there are no local virtualization images available other than Tumbleweed, and it's unclear if other folks are going to work on that. So what I wanted to talk about really, and dedicate some time to here, was help with infrastructure for testing and resources. To make it simple, I think it helps to reduce or limit the scope of what we want to test, and talk about how to test. The non-controversial thing, I think, is to use the MAINTAINERS file to describe components and their respective maintainers, and then get the respective maintainers to decide what it is they should be getting help with testing. Once we have that, we basically need resources, right? And then parties interested in helping. I just want to remind folks, I keep showing this slide every time I go anywhere and talk about this stuff, because we are really good at implementing code. We're not so great at testing. So this is that boring part about testing, right? And this is about trying to address the other components that we suck at. So let's talk about the system resources. These are the resources needed: we need actual systems for automation, and then we need folks to do the work. For system resources, Samsung has been really nice and has allowed me to share the system that they provided to me for development, to share it with the community and allow community developers who are interested in testing these sorts of things to just log in and use the system. It got to the point that I started running out of resources myself and couldn't get anything done. So I sometimes poke people and say, hey, I'm sorry, can I turn these instances off, and so forth. I go back to Samsung and say, hey, do we have more resources? Can we get another server? And they say, how about you talk to other vendors to see what they can do? So I have started to talk to vendors. In this particular case, I've been starting to talk to cloud providers to see what they can do. So, Darrick, would you like to say anything about OCI? Are you on the line? Yeah, hello, can you hear me? Yes, perfectly. Cool. Yeah, so as you all know, I work for Oracle. Suddenly now we're a cloud vendor, which is hopefully convenient, because OCI has both a free tier and an always-free tier. So now that Chandan, who also works for us, has done a whole bunch of work to enable kdevops to talk to the OCI control plane, this means that you can use kdevops with free OCI accounts in order to get testing done: we provide the hardware and Luis provides the software. And anybody can sign up for an account. Please feel free to use either your real accounts or fake throwaway ones, Gmail, whatever. For me, as an OCI user? That's my own personal opinion, not as an Oracle employee. But yeah, we have resources and we are perfectly fine with letting community members sign up for free accounts and use them to the maximum extent possible. Anybody want to volunteer to write some documentation on how to use Oracle? It's undocumented in the kdevops README right now. So I can certainly help with that. Well, all right, I haven't even used OCI, so I'll sign up for an account and then I can write the documentation. But I will say that the most difficult thing I run into in trying to use automation with cloud providers is that it is actually very, very easy, even for their own customers, right?
To basically say, yes, I want automation, and then they instantiate 100 guests, and then they basically have to mortgage their house or something like that, because they can't afford to pay the cloud provider since they left all these instances running. This happens in the industry, right? So cloud providers tend to make it difficult for you to enable this by default, and it's actually a royal pain to try to get the simple little nuggets that you need. I can say that at least for Amazon, it's a YAML file that you can have, and you just need to get the information somehow. I do have documentation on that, so we can work on the documentation for OCI if such doesn't exist yet; I'm not sure if it does. So yes, I can certainly help with that. I did also talk to Microsoft about Azure to see what could likely be done. So first of all, thank you so much, Darrick, for asking; it's incredibly useful, and now I can at least go back and say, yes, we have other vendors helping. So Microsoft said that they would evaluate this. I have no other news beyond that at this point in time, unless there's anyone here who does have new information. Okay. I don't have anything about Azure, but I will say that they've been trying to get us to use OCI internally for a long, long time, and I resisted because I grew up in the 90s and have a whole bunch of pet lab machines. And then eventually the light kind of came on: hey, you know, I can spin up like 170 VMs to go run testing on several dozen profiles, and I can run this thing every night. And it works; things just fire up. I don't have to go deal with capital expenditure, blah, blah, blah, accounting; it just is there and I can use it. And it worked great until a few days ago, when I apparently managed to consume all the department's resources and they're telling me I need to back off a little bit. So we may run into that. So I did figure I'd ask other cloud providers too. I did ask Alibaba. At least I got a reply, and they're also in the evaluation phase of considering this request. So I have no news other than that, but what can I say? I'm trying. And I also want to thank Jeff for applying to the Microsoft Azure program, right? Because then at least we're on the radar. If you don't ask, you don't get, right? It's a good opportunity to also ask: if there are any vendors here who can consider and evaluate talking internally to folks, it'd be useful to get feedback. For instance, I know s390s exist on the cloud today. Did you know that? How cool would it be to try to get instances for s390? Badass, I think. Anyway, if you have input, please ask internally for resources. Anything that we can get. I will say this: it is very boring to think about Linux as, you know, common property and a utility. But that's kind of what we're doing in a way. So I guess we should be able to run kdevops and do tests with, well, kernels, let's put it like that. You can do RHEL, you can do SUSE, you can do openSUSE, yeah. This came from SUSE, by the way. So that, I guess, we can do. What we cannot allow is to have anyone who is not an employee logging into these machines. So these machines' intended purpose will be offline to this effect.
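On the runaway-cost problem described above (instances left running until the bill hurts), one guardrail is a small script that stops any tagged test instance that has been running past a budget. This is a hypothetical sketch using boto3 against AWS, not something kdevops ships; the tag name, region, and 12-hour budget are all made up.

```python
# Hypothetical cost guardrail: stop tagged test instances running longer than
# MAX_AGE. Tag key, region, and budget are illustrative assumptions.
from datetime import datetime, timedelta, timezone
import boto3

MAX_AGE = timedelta(hours=12)

def stop_stale_test_instances(tag_key="kdevops", region="us-west-2"):
    ec2 = boto3.client("ec2", region_name=region)
    resp = ec2.describe_instances(
        Filters=[{"Name": "tag-key", "Values": [tag_key]},
                 {"Name": "instance-state-name", "Values": ["running"]}])
    now = datetime.now(timezone.utc)
    stale = [i["InstanceId"]
             for r in resp["Reservations"] for i in r["Instances"]
             if now - i["LaunchTime"] > MAX_AGE]
    if stale:
        ec2.stop_instances(InstanceIds=stale)   # or terminate_instances()
    return stale
```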
So we had a kind of beer BoF on the balcony, an unofficial beer BoF, and one of the crazy possibilities was: would it be possible to do something like, I don't know, test-at-home or some crap like that? It'd be really hard to try to, you know, describe what your job is and then send it off somehow and hope that you get the results. How do we trust that? I don't know. But for now I think we need the resources and we need to log in. Well, I guess, I mean, what we do internally already is testing: building a kernel from a git tree and then doing testing on that. And then we have the commit ID, the topmost commit ID, referenced in the build log, saying, right, commit ID XYZ, that happened. And that gives you quite a good level of confidence, because either you have that git commit ID and you can track it, or you can't. Then, well, these results are nice but possibly pointless. But if these results tie to an upstream git commit ID, well, then you know what happened: you tested against that upstream git commit ID. I think at this point, any type of help would be great. So if we have documentation about how to move forward with this, it'd be awesome, right? One of the key things, I think, that has come through dialogue with other folks who have been considering this, or testing right now, or adding workflows, is that it's important to be vendor neutral. We don't want to depend on one entity, right? Because layoffs can happen. We can jump between companies and so forth. So we need the resources to do this regardless of what company we go to, right? The more resources we have, the more company-agnostic this is, the better. So yeah, sure, if there are ways for us to test with git commits and point at a git tree, absolutely fantastic. Yeah, I was going to suggest that one of the things that might be a little bit easier for some companies to say yes to would be, you know, a particular git tree and a branch. And they just simply watch that particular branch on that particular git tree; when it changes, they run tests, and then the test results are sent somewhere. That becomes, I think, a little easier to manage, because we don't have random non-employees logging into a particular resource, and we could add throttling so we test no more than N different pushes to the branch a day or something like that. I imagine that with that kind of approach we might be able to get some companies to say yes without necessarily saying we need to give shell access to random community developers, which I think is hard. So I guess one of the difficulties there would just be that we do want the maintainers to be the ones to define what should be tested. So once we get the description of what needs to be tested, yeah, well, it could be, you know, in the kdevops world, a Kconfig for kdevops: configure it this way, this is what I want. Yeah, I mean, that's fine. So speaking as a maintainer, the thing that I would love to be able to ask people to do, and I can even ship them test appliances so it's really easy, is: on their local machine, run a KVM smoke test using xfstests. It takes 15 minutes on a home KVM machine. You don't necessarily even need major cloud resources for that, right? Just run the smoke test, and that will find like 95% of the obvious problems.
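As an illustration of that kind of pre-submit smoke run, here is a minimal sketch in Python, assuming Ted's xfstests-bld tooling is installed so that a `kvm-xfstests smoke` invocation works locally. This is illustrative glue only, not part of kdevops or xfstests-bld; the 30-minute timeout is an assumption.

```python
# Minimal sketch of a local pre-submit smoke run, assuming kvm-xfstests from
# xfstests-bld is installed and configured. Timeout value is made up.
import subprocess
import sys

def smoke(timeout_s=30 * 60):
    try:
        # "smoke" runs the quick smoke-test group against the default configs.
        proc = subprocess.run(["kvm-xfstests", "smoke"], timeout=timeout_s)
    except subprocess.TimeoutExpired:
        print("smoke test timed out -- likely a hang, go look at the console log")
        return 1
    return proc.returncode

if __name__ == "__main__":
    sys.exit(smoke())
```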
Because that's what I very often end up doing: someone sends me a patch, I fire off the smoke test, and then I send back an "I'm guessing you didn't test this, because it blew up," right? And that actually is one of those things we might be able to do that doesn't even require a lot of cloud resources: self-testing, using developers' own resources, for smoke testing, right? So back to what Luis said at the beginning: it's not clear to many people what the steps are, or how to do exactly what you just mentioned. And so the MAINTAINERS file could have a pointer to a simple script or something for people to run against their own patch. Yeah, what I would suggest is a URL, right? And that URL can point at: these are the steps that you should do before you submit a patch, that will allow you to do quick and easy testing, right? And that might lead to downloading a shell script. It might be a pointer to kdevops with certain preconfigured configurations that the maintainer wants you to use with kdevops, or, you know, my gce-xfstests or kvm-xfstests. But we can actually make that very general, just a URL, and that will allow people who are interested in that subsystem to know how to do a quick test. Because I think it really is important to do the big, expansive tests that take, you know, six days of continuous running on a dozen different VMs. That's the sort of thing that Darrick and I might do. But then there's also the quick test that we want developers to run before they send me a patch. And that simply scales so much better, because there are a lot of developers and like one or two maintainers. I agree that kvm-xfstests is as easy as it gets. But if it's so easy to download and run, and it takes so few resources, and there are not so many drive-by developers, it's probably easy for Google to have this CI: push to the branch and we will test it for you. I mean, which is easier? It would be kind of nice if we had a way, at least for some trees, not for all of them, a way to run this automatically. It is run on linux-next. Oh, yeah, yeah. For example. But linux-next implies that the patches were applied already, so. Because I think I said this last year: when I run fstests, I still do it in this manual mode. I haven't yet. Dude, I gave a specific live demo for you. It's recorded on YouTube. Yeah, I'm not sure what else to do. You need to shame me into doing it. I need to just go home or something with you and get a beer. How about we get a beer, you know? Shit, man, whiskey. I don't know. My God. I promise you, next time we see each other, I will have switched to kdevops. Okay, no worries, man. It's fine, man. I don't know what else to do. All right. Jeff, how did you find out about the breakage of ctime? Didn't you get an email from a bot? It happens. The bots do the tests already. So this goes into the... But they're not testing everything, right? They're testing ext4, they're testing XFS. They're not testing NFS. They're not testing cifs. You know, we need broader coverage of file system testing. That was a fantastic point, because I was looking at this: we moved to the latest fstests and we found a ctime bug in one of the servers. So, like, I'm sure in ext4 you have four or five configurations; I have five or six servers I'll try against. Well, we found a ctime bug. Would this have been found in linux-next? It probably would have been found years ago.
But, you know, it's a server problem, but still, we have to be running these on as many file systems as we can. You know, don't create new work, but any file system maintainer, like Ted or whoever, has a list of tests; I have a list of tests, you have a list of tests. This is so easy to set up. So I will say, and add to that: one of the things that I hear is a bit difficult is, I'm not sure what to do to get certain workflows going. I am available for consulting on these things. I'm not saying I get paid for this. I'm saying I'm willing to give you time, dedicate time to talk to you, help, and review. Well, it would be like if you wanted to automate your workflow using kdevops. I had implemented demo workflows to add more support to kdevops so that people can see what it looks like. CXL support is there. Now you can do initial testing with CXL. That's a complex environment; it used to take people quite a bit of time to ramp up with CXL. It shouldn't be like that now. What's next? You know, PCI passthrough. PCI passthrough is automated. Thanks, Joseph, for adding initial support there, and now you have a Kconfig for it. If you don't know how to do something, I'm happy to provide some time to talk about it. But other than this, we need people to actually do the testing. We need volunteers, as became very clear earlier from the LTS talk. We need volunteers to help the kernel maintainers in this room who do want help, because they need the help. It's extremely useful for learning, as was described, and you get insight into new features and technology. So if you would like to help, talk to any of the file system maintainers; they all need help, right? Just poke. If you have any interest, poke at them. The alternative, of course, is that we can pool money and resources together. I say this just because at one of the LSFMMs it was hinted that money's not the problem. We can pool money; it's just a matter of deciding how to do things. Well, I think we have enough work cut out already. We did some of the work. Do we really have the resources to be putting up money? Layoffs happen. I'm not sure if that's still the case. So it's not a good economic time to consider that again, even though it was said before. So the only thing that we do have is volunteers. However, if pooling money is something folks really are serious about, because it was hinted before that money shouldn't be an issue, well, should we just pay folks to help with this stuff, because this is common-utility type of work? I don't know. But in the meantime, we just need volunteers. So the more volunteers we have to help with some of this work, the better. Obviously, kernel developers need to become familiar with how this workflow would work, and then they can give their blessing and help get people to help. In the meantime, the question is: do we want to pool money, or do we want to just focus on volunteers? What would kernel maintainers prefer? I mean, is there a lot of benefit in hiring people to just push buttons? I don't really see it. So we're better off trying to automate as much of this as we can, and that way have the computers do the work for us as much as possible. Yeah, I mean, I think if we had the money, the place that I would put it would be funding work to enhance something like KernelCI, so that we would have a common dashboard for test results, right?
xfstests uses, you know, xUnit XML files as test results. It would not be that hard to have some way of submitting test results to some centralized location, so that you can actually get an easy dashboard of tests that have failed and, you know, the test artifacts that might help someone track down a flaky test. Because I think one of the benefits you would get out of that, and we're actually seeing this with the syzbot dashboard, is that we've got people who will look at the syzbot dashboard, find a syzbot report, and work the problem, right? And then the next thing I know, someone has sent me a patch saying it fixes this problem. So if you are uploading xfstests results and you have some weird file system config that has a failure, somebody else can actually work that test failure; then it's not just the person who's running the test who has to track down and root-cause all the test failures, there can be other people in the community who do that work. But that requires some sort of common dashboard. I think many of us have our own custom dashboards. Some of them are fairly personal dashboards; they're not something that's really very scalable for the rest of the world. I have access to a really, really excellent dashboard. Unfortunately, it only works inside the corporate firewall, so I can't share it. But that's the sort of thing where, if we had money, instead of actually paying someone to push buttons, as you put it, that's a missing piece of the infrastructure that I don't think I've seen a good open source, flexible solution for just yet. So I did look into that workflow and realized there's a series of complexities. I mean, sure, support for that can be added to kdevops, you know, it's just a Kconfig. So sure, if someone is interested, that support can be added. But it requires working with the whole LAVA stuff, which I was just like, no way, I'm not touching that. But, you know, that's what I mean. And I think that's the sort of thing that you really want on a public website, right? And so I think KernelCI is actually a better fit than saying, oh, I'll add a visualization dashboard to kdevops, because then each person running kdevops has to find a way of poking a hole in the firewall to export the web server, right? Whereas if the model is that people have their own test runners, you know, because Kent has his own test runner, I have my own test runner, kdevops has its own test runners, then the point of commonality is the xUnit XML file and maybe some standardized format for the test artifacts. We need to be able to send that to some common public website that then displays it for everyone, right? I think that's the right model. And, you know, KernelCI exists. It's just that at the moment all of the funding went into testing ARM, boot-up, and device tree, right? And someone needs to throw more money at KernelCI for kernel subsystems other than device tree. Well, I guess maybe one of the things could be to see if we can get these cloud resources doing testing for us and focus on non-ARM stuff, you know? But again, this is LAVA stuff. I'm not too interested in the LAVA stuff, but if someone is, that would be useful here, I guess, is what I'm hearing. I'm interested if you have any thoughts about what I ran into.
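On the "point of commonality" idea above, a small sketch of what aggregating xUnit/JUnit-style XML results could look like: fold the per-run XML files into one summary that a shared dashboard could ingest. The directory layout and file names here are assumptions for illustration, not the format any existing dashboard actually consumes.

```python
# Sketch: summarize xUnit-style XML result files into per-test failure counts.
# The results/ layout and *.xml naming are hypothetical.
import xml.etree.ElementTree as ET
from pathlib import Path

def summarize(results_dir: str):
    summary = {}   # test name -> {"runs": n, "failures": m}
    for report in Path(results_dir).rglob("*.xml"):
        for case in ET.parse(report).getroot().iter("testcase"):
            name = case.get("name", "unknown")
            entry = summary.setdefault(name, {"runs": 0, "failures": 0})
            entry["runs"] += 1
            if case.find("failure") is not None or case.find("error") is not None:
                entry["failures"] += 1
    return summary

if __name__ == "__main__":
    for name, s in sorted(summarize("results").items()):
        if s["failures"]:
            print(f'{name}: {s["failures"]}/{s["runs"]} failed')
```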
When we did test automation, one of the things that happened was that fstests changed. I think you guys remember, it was the copy_file_range error handling change. There are tests that break. So you do a git pull on fstests this week and the behavior of test 91 just changed. Yeah, so we certainly run into this, even amongst all the folks running with kdevops too. So what we did is we basically just git pull; the kdevops organization has its own fstests tree. And then we hope that it doesn't break when we do the git fetch and reset, but we do run into issues. Chandan, you know, reported an issue recently: in the latest fstests there were some issues with the scripting for the XML parser. So he reverted a bunch of stuff and pushed that, and we hope that eventually gets fixed. So my best recommendation is that we have to stabilize fstests as well. The only thing we can do there is rely on the tags and hope for the best. This is why I was asking Shin'ichiro about tags for blktests, because that would be very useful too: then we know, or at least this seems to work, we can stabilize on that for a bit and then move forward. Yeah, so I think the reality is that if you are a file system maintainer who wants to be test-driven and really rely on xfstests, you have to be on the fstests mailing list and you have to be an active participant, because we're constantly adding new tests. Some of the tests are: gee, let's see if this security bug has been fixed. You want that test, right? You don't want to stabilize on a version of xfstests from six months ago. And sometimes the new tests fail. And sometimes the new tests might, for example, rely on a specific version of Bash. And if your underlying system has just upgraded to a new version of Bash or coreutils, the tests flake out, and I patch the test and send a patch to the fstests list saying, oh, by the way, Debian just went to a new version of coreutils and it broke this particular test, because the test relied on something non-portable, right? It happens. But the reality is that the value I get out of xfstests is such that I'm one of the fstests community members, so I'm constantly watching the latest fstests and I see the test regressions independent of the kernel. So that's actually one of the things I will do right as the merge window is closing: I'll run fstests on the old version, pull the latest version of fstests, run it, see if there are any test failures. Okay, there's a test failure. Is it a new test or an old test? If it's a new test, then okay, I might have to root-cause that. Is it a test bug or a file system bug? Sometimes it's a test bug, but I get enough value that it's worth it to me to actually spend that time. But I think one of the goals for kdevops is simplicity, and this is not simple, I understand that. So what you did with kdevops is take the approach of expunge files that are very, very specific to 5.10.107. It doesn't match anything else. You want to test anything else? You need to create a copy or a symlink for 108. That needs to take into account the fstests tag, otherwise everything blows up. Every time I update fstests, I screw everyone else. So the thing with the very, very specific expunge list is far from being perfect, but at least if you want to meet the goal of simplicity, it needs to be added.
Yeah, what I actually started doing is running the exclude files through the C preprocessor, and I just have #if LINUX_VERSION_CODE greater than whatever, or, more to the point, less than whatever, because sometimes what will happen is that an older kernel will always fail a particular test. And I just put that in the exclude file behind a pound-if, because I didn't want to maintain a dozen different exclude files for a dozen different kernel versions. And it may be that kdevops should consider doing something similar. Your preprocessing could also look at fixed-by annotations in the tests and put them automatically into the expunge list. I would like to share something we have done in the past year with BPF and mailing list patches. We hooked something up to Patchwork. I think a lot of maintainers use Patchwork, and whenever something shows up in Patchwork, there are things in the background that trigger some tests that run on GitHub. And then there's a link from Patchwork; you can see there, oh, there are the test failures. And you can even see the whole thing, like all the pending patches. If they're all red, red, red, you know something is wrong with the test framework. If only your patch is red, I mean, okay, I broke something. And we even have a way, if you go to GitHub, to test your patch before submitting it. So, so far, we think that works pretty well in terms of the framework side. And of course it needs maintenance; especially with BPF, we are sensitive to the LLVM version, and certain versions are going to break things. And I think the other key is that we need the maintainers and the developers to see it there. We need that to be green in normal times, because if it's all red, a regular contributor doesn't know: oh, did I break that, or was it actually broken two months ago? But so far, I would say that framework has worked great for us. It takes effort, but I think Patchwork is probably the easiest interface, and GitHub has worked great for us so far. We do have Patchwork integration, surprisingly. So we show up on Patchwork, it's just unused, but I have a to-do item on my list to look into this, because I think it might be useful just for tracking: series that have been reviewed, series that are still outstanding. And it might have that feature where you have this sort of, even if it's just symbolic, delegate mechanism where you can say, okay, this needs to be reviewed by this person or by Amir, for example. Yeah, and this might be another thing. I don't know if the best place to put it is in the MAINTAINERS file or the subsystem profile in the documentation that Jonathan mentioned, but many of us have Patchwork instances. They're not all the Patchwork at kernel.org; I happen to use one that's at ozlabs. And so just simply telling people this is the Patchwork instance you should be looking at would actually probably be a good idea. And again, I'm agnostic as to whether the MAINTAINERS file or the subsystem profile is the right place to put that. I guess more people look at MAINTAINERS, but I don't know that that's the right place. So, something to think about. Yeah, the biggest advantage that I see is that you have a way of keeping track of what is actually still around, because currently it's the inbox and, thanks to Konstantin, I hope I'm pronouncing this correctly, lei. That's the only way to currently keep track.
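Going back to Ted's point above about running exclude files through the C preprocessor, here is a rough Python equivalent of the same idea: keep one annotated exclude list and filter it by the kernel version under test, instead of maintaining a dozen copies. The "# min:"/"# max:" annotation syntax is made up for this sketch; it is neither Ted's actual cpp setup nor anything kdevops implements today.

```python
# Hypothetical sketch: filter a single annotated exclude list by kernel version.
# Annotation syntax ("# min: X.Y" / "# max: X.Y") is invented for illustration.
def kernel_version_code(ver: str) -> int:
    major, minor, *rest = (int(x) for x in ver.split("."))
    patch = rest[0] if rest else 0
    return (major << 16) + (minor << 8) + min(patch, 255)   # mirrors KERNEL_VERSION()

def active_excludes(path: str, kernel: str):
    code = kernel_version_code(kernel)
    for line in open(path):
        test, _, note = line.partition("#")
        test, note = test.strip(), note.strip()
        if not test:
            continue
        if note.startswith("min:") and code < kernel_version_code(note[4:].strip()):
            continue      # only excluded on kernels at or above min
        if note.startswith("max:") and code > kernel_version_code(note[4:].strip()):
            continue      # only excluded on kernels up to max
        yield test

# e.g.: for t in active_excludes("xfs.exclude", "5.15.0"): print(t)
```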
Patchwork would be way better for this, I think. You know, from my perspective, when I compare this with other open source projects, the number one problem is we don't have enough tests. You may be at 800 or 799, but we should be at 2,000. We've got so many bugs that we fix, and not enough tests. So what I was looking at in this is: what would make people so impressed with your infrastructure that when they reported the bug to Ted or to Jeff or whoever, they also sent a test case that we can then add? Because that seems like the number one question: does this make it easy? Does it help in any way with the visibility of how important tests are to us? Because we're not going to know about all these bugs, right; the customers will. And so, will that help with that? The answer is yes, but not because kdevops has tests itself. It has workflow definitions that allow you to use existing repositories that are used by the community for testing. So, for instance, fstests is a workflow. blktests is another one. Each workflow has its own dedicated series of tests. So it's important to separate that. This is just the automation aspect of using existing infrastructure for testing. Kernel selftests are also supported within kdevops. So if you're a maintainer for the kernel and are using selftests and you want to do automation with selftests, you can do that with kdevops too. So it's important to increase the number of tests you have for your different workflows. This will help you with the automation and with getting new contributors to easily adopt the required workflows for your testing infrastructure. A complex environment, for instance, that we have recently been heavily involved in is CXL. One of the biggest complaints that I got was that it's really complex to deploy and instantiate a CXL test environment. Well, no, not anymore. It's just a few commands. Yeah. So I think, in answer to your question, Steve, in general writing a test case is difficult because you want the test case to be something small, self-contained, and easy to reproduce. And most often the type of bug report I get is of the form: run this proprietary workload, or run this open source workload that takes several hours to set up, and then run it for six hours on this class of hardware, and then maybe the bug shows up. And so generally what tends to happen is we work with the bug reporter, we root-cause it, and then I come up with a reliable repro, which is, you know, ideally 10 lines of shell script, and that's what actually goes into xfstests. So if you are lucky enough to get a bug reporter who can send you an easy repro that is just half a dozen lines of C or whatnot, they've done 90% of the hard work of fixing that particular bug. And I wish more users would send me those types of bug reports, right? But the reality is that's not what we get. So I think the challenge is that there is a certain amount of discipline involved. And sometimes what maintainers can do is send back a thank-you: that was a great patch, would you like to write a test case for me? Because very often the commit might actually even include the six-line reproducer in the commit description. And then it's simply a question of sending back to the patch submitter: would you like to take that reproducer and turn it into an xfstests test? Because that actually helps everyone.
And that's, I think, again, part of it: we need to push that back on the patch submitters, because today the vast majority of the tests are written by the maintainers and the senior developers of the file system, and that just doesn't scale, right? There may be things we can do to make it easier to write xfstests tests, right? And maybe we need to have some tutorials: so you want to write your first xfstests test, this is what you should do, right? And it's actually not that hard. We just have to write the documentation. Well, that's all I had. So if you have resources in terms of systems that you'd like to contribute, now you know that you at least have kernel developers who seem willing to help with some of this test infrastructure and add the automation stuff. If you need help seeing what it would look like: I'm not going to write the code, but I'm going to help you evaluate what you could do. There are enough demos out there; I've recorded demos of how to use it too. I will create the dialogue to add new cloud provider support. Yes, I'll easily add support; that's a few lines of code. Adding new profiles for testing file systems, that's just a Kconfig change to kdevops. So that's all. So one very quick thing. There was a talk about stable earlier, and I mentioned this to you earlier; I think this actually has even more benefit. Maintainers are running xfstests, a lot of them are running them already. But I mean, I hardly ever run against 5.15 or 5.19. So this is going to be a grand slam if we add a test 799 and we find it breaks 5.15: we didn't backport a fix to 5.15 that we need. Stable is going to be huge if we can find a way. Is that within scope as well, testing stable kernels? Any kernel can easily be added to kdevops. It's basically just a Kconfig change and a kernel configuration to ensure that it compiles. After that, all you have to do is run the tests to get your baseline. The complexity lies in establishing the baseline first, and that takes a bit of time. So you first need to establish that. And that unfortunately just means, you know, running the tests, seeing what fails, documenting that, and having a list of expunges to be used as the baseline. Now, this basically means you are expecting failures. You know, it's just how fstests works. We have tons of tests, and tons of tests fail for one reason or another. Sometimes good, sometimes bad. Maybe it's test bugs, sometimes it's real crashes. These known issues make up your baseline. Establishing the baseline is a pain in the ass. And yes, we can strive to automate that, but there's enough tooling to at least get us one step further right now, which is to get us a quick test environment. The baseline process takes a while. Getting to automate that, we can get there; there's just some work needed there too. Yeah, I mean, we talked about this last year. There are ways that you can validate tests without necessarily creating the baseline first. You know, the approach we've used is: when we get a test failure while validating a new patch, say for the stable backports, and we find out that we've got like a 5% flake on a particular test, at that point we go back and test the baseline and see whether or not that test has a similar failure percentage, and if it does, then we know the patch wasn't responsible for making things worse.
And if not, then we know we have a problem. But that way we're not actually spending a huge amount of time establishing a baseline for the common case, which is that most tests actually pass. So there are actually other alternatives. I know that's not the workflow kdevops is currently using, but I think there are some, you know, smarter ways of doing that. And one of the reasons I mention that is, if we're going to be using cloud resources, finding ways of using cloud resources efficiently is actually really, really important, right? And so, you know, if you have systems that will automatically shut down the VM when you're done with the test and fire up the VM just when you're starting the test, that's a lot more efficient than keeping the VM up all the time. And, by the way, if you're using Oracle's Always Free tier, I was just checking the documentation: if the node is idle for more than a certain amount of time, it automatically shuts down the VM, which makes good sense, right? So actually one of the big things I've done is to make sure that if the kernel hangs because it's spinning in a deadlock, I have timeout systems that will automatically kill the VM, because having the VM run for 24 hours, spinning in a dead loop, not making any progress, burns money, right? And so that may be the other thing that's worth thinking a little bit about: how can we do more efficient testing in a cloud environment, because resources are not free. So certainly, patches for features like that are greatly welcome as well. Any other questions? Back there, was there a question? No, no questions. One question from Shin'ichiro. We should have those things where you can throw the mic. Those are fun, right? Right. Thank you very much for the good discussions. I think many of the discussions were mainly about xfstests, but most of the discussion is applicable to blktests too, I think. Blktests is a rather small test set, so I'm not sure if a release tag really helps you or not. If I do a tag, you need to follow it, but if you just use a git hash, then at any time the kdevops user can choose which version to use. So is it okay for you to just use the tip of the blktests master branch? Yeah, right now I just pull the latest and greatest, but we can certainly just pin a git commit ID, and the same thing with the branch. So yeah, we can certainly do that as well. Is there any difference with doing the tagging? I'm sorry? Doing the tagging versus just picking up the tip. What would be the difference between those two? Well, tagging just dates it, and, you know, we should at least... for instance, one of the nice things would be having some sort of blessing on your part on a specific git commit. Like, I don't know, maybe running a series of tests on a kernel or something and making sure that there's no, like, bash or grep breakage or something like that, you know? That is, some sanity checks or something. So, you know, if you already do that for every single commit, then fantastic. Ah, I see. Yeah, so not only for kdevops: if any of the blktests users wish to have tags, I'm willing to do that, so if you have any comments, please let me know. Yeah, that's all. Great, well, thank you.