My personal perspective on this is that I'm using the 5.10 LTS kernel with XFS on it, and I realized that it has not been maintained at all. I can go into the history, but some of you already know it. For more than two years now there have been only three backported patches for XFS in 5.10, and it's not because there were no bug fixes over those two years. And I've done some work. Well, first of all, I'd like to say in advance: running XFS on 5.10 is my own choice, so it's my responsibility to backport patches, and I have indeed made some progress with that. I can show it later if there's interest, but I currently have a backport branch for XFS and I'm testing it along with Luis, who has the next talk after me. But I know that the fstests talk is going to be a lot broader, so I wanted to have this session to discuss process, because why do we even have the LTS kernels, right? We have them so we can collaborate on things and not duplicate work. And that only means something if people actually use LTS kernels. Of course I cannot force anybody, and the stable kernel maintainers cannot force anybody, to use the stable kernels, but if the big players, which traditionally didn't use the stable kernels, don't use them, then they are worthless for the subsystems in question. So the big players, the distros, didn't use to use the LTS kernels, and they still kind of don't. But we do have some big players: Google Cloud's COS and Microsoft, I hear, are following stable. They're not the only ones, obviously; Android also follows stable, but there's no interest in XFS on Android, so there has to be some overlap between a large player using LTS and using the subsystem. So, of course, in order to be able to have stable file systems, the most important thing is fstests, and we already know that; we need to collaborate on testing, collaborate on the test suite, and that's the next session. But baselines, baselines are kind of per kernel, and there they can be shared. I want to open the floor to thoughts. I mean, I can show the work that I did, but I don't know how many XFS folks are here, I think it's less than, whatever.

Yeah, so sorry, Ted. So when I worked at Red Hat, this was kind of a pain point for us: we're on an old kernel, QA would pull in the new fstests, things would blow up, right? And this became really annoying to us. But I think that this is actually a really good indication of where we can pull in patches. There are clearly things that shouldn't go back, like my reflink changes, you probably don't want to pull those back into 5.10. But I think that constantly updating fstests gives you a good indication of: OK, this test is new and it started breaking, is this a target for backporting, right? You know, depending on how much pain you're willing to absorb.

Maybe I should mention, there's also the factor that fstests is mainly used by developers to test upstream. And when you go to test LTS, things happen. I mean, you need to know that, you need to use a certain xfsprogs. It's not built to be, and it's not being maintained to be, friendly to people that test LTS kernels at all.
So, one of the things that I've done recently is to try to learn from LTP, which is another project that I'm working with, for fsnotify, and which is very much friendly to stable and is tested constantly on stable kernels. They have some practices that I thought we could maybe adopt. Just the beginning of the beginning is a standard way to mark tests as regression tests for a fix. So I added an annotation; it's supposed to be merged as soon as Zorro gets around to it. I added an annotation that says fixed by this commit and fixed in this kernel version. That doesn't mean the test will not pass on an older kernel version, in some enterprise distro or with a backport. It just means that if the test fails, you get a hint. The hint says: maybe you want to backport this commit, or maybe that fix is not going to work on the kernel that you're testing. And one more thing that you can do with those annotations, and I haven't done that, is create a simple script that works on your kernel branch, whatever private kernel branch, and checks whether the annotated fix commits are in your kernel branch, or whether they have been backported and annotated with the upstream commit. And then you can automatically create an expunge list that is custom to your kernel branch.

So, I think it perhaps might be useful to think about this from a more specific perspective. I think most file systems, with XFS being the notable exception, are very happy to let the stable maintainers auto-pick bug fixes into the stable kernel trees. My personal experience with ext4 is that less than once a year, maybe once a year, there's a screw-up where they have to revert a patch because it wasn't suitable for backporting, and they didn't know that, their automated scripts didn't know that, and it actually escaped into an LTS release. Much more often someone notices when it's sent out to the stable list and says, yeah, that one you really shouldn't backport, and then they catch it right away. That's why I believe most of the file systems have been very happy letting Greg and Sasha, the LTS maintainers, do their thing. XFS has been the notable exception to that, and there are good reasons for that. The XFS folks aren't in the room, so suffice it to say there's history there. One of the things that I had announced on the list, and unfortunately it has taken a lot longer for us to make forward progress on it, primarily because of the baseline issue, was figuring out which tests are currently failing on 5.15. Leah, who's a person on my team working on this, has been focusing on 5.15 because that's the COS kernel we're actually most interested in; it's based on 5.15 LTS. And so what we've been doing is, using an automated process, taking the patches that Greg and Sasha's automated scripts identified as XFS-specific, backporting them to 5.15, and then making sure that there were no regressions, and that's where the baseline is very important. We are doing that with, I think it's something like, order of magnitude, ten different file system configs. We consulted with Darrick, the XFS maintainer, about what he was using, and we basically replicated that in our testing infrastructure so that we could test all the things.
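To make the auto-expunge idea described above concrete, here is a minimal sketch of the kind of script being suggested. It assumes an annotation helper along the lines of `_fixed_by_kernel_commit <sha> "<description>"` inside each test (the helper name and paths are illustrative, not necessarily the final merged form), and that backports reference the upstream commit id in their message, as stable backports usually do:

```sh
#!/bin/sh
# Sketch only: build a per-branch expunge list from "fixed-by" annotations.
# KERNEL_DIR is the private kernel branch being tested, FSTESTS_DIR a local
# fstests checkout whose tests carry the (assumed) annotation helper.
KERNEL_DIR=${1:-$HOME/linux}
FSTESTS_DIR=${2:-$HOME/xfstests-dev}
EXPUNGE=expunge.custom

: > "$EXPUNGE"
grep -rl '_fixed_by_kernel_commit' "$FSTESTS_DIR/tests" | while read -r test; do
	sha=$(grep -h '_fixed_by_kernel_commit' "$test" | awk '{print $2}' | head -n 1)
	# Fix already in the branch (merged or rebased in)?
	git -C "$KERNEL_DIR" merge-base --is-ancestor "$sha" HEAD 2>/dev/null && continue
	# Backported with a reference to the upstream commit id?
	git -C "$KERNEL_DIR" log --oneline --grep="$sha" | grep -q . && continue
	# The fix is missing, so expect the test to fail and expunge it.
	echo "${test#"$FSTESTS_DIR"/tests/}" >> "$EXPUNGE"
done
```

The resulting file can then be fed to fstests' check script as an exclude list, so the baseline only contains tests whose fixes are actually present in the branch being tested.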
That's when we discovered that there were a whole bunch of tests that only pass if we cherry-pick some of the hundred-odd out-of-tree fstests commits that are in Darrick's personal tree, which he has never actually gotten upstream. He works on it; it's a process. And so I have my personal fstests fork that has some of Darrick's patches cherry-picked into it, so that we could actually run the fstests baseline. And that's what we've been doing. Unfortunately, Darrick decided to take a mental health break this week, Leah is also on vacation this week, and I knew that there would be very few XFS folks here this week. But our intent is to send a list of commits and a description of what we've done to the XFS list, and then there's going to be a negotiation about what is considered appropriate testing. Do we simply send it to Greg and Sasha, CC linux-xfs, and then let them NAK patches? Do they want to explicitly ACK patches before they go to stable? I have had members of the XFS community express that that was their desire. Other people have been willing to use the "send it to Sasha and then we'll NAK it" approach. It's a negotiation, right? We have to work with the XFS community to see what they're actually comfortable with. But that is a discussion that I hope we will be opening with the XFS development community next week.

So, one of the things that folks were comfortable with on the XFS front, at least at LSFMM in Utah, was essentially that there would be a group of volunteers, in this case Amir and myself, to essentially do the reviews for what would be candidate fixes for XFS. That went on for only two releases, but one of the things with this sort of work is, as those of you who are familiar with running fstests will know, it takes a long time to establish a baseline. It takes a long time and it's work, right? And then identifying the issues also takes time. So, essentially, it's thankless work, too. It does take time, and you need resources. And when I talk about resources, I'm talking about pretty big systems, right? I ended up having to buy my own systems at home, but that doesn't really scale to the level where you want to provide automation. In the next talk, I'll elaborate a bit more on how to scale some of these considerations, especially for stable. As it stands right now, essentially it's each developer doing their best effort, right? But I don't think that scales. I'm not sure if Darrick is going to be giving his talk on maintainer scaling and all that stuff. Is he? No, he's not. I got an email from him about what he wants to talk about; I'm going to pretend to be Darrick. So, I think that what we need as a community is essentially a bit more of a collaborative effort on that front. Resources are a consideration, but again, in the next talk I'll talk a bit about that, too. But on the XFS side, if you have candidate fixes, I think you can pretty much send them to Amir and me, and then basically we'll put them into the queue of tests that we're running, and that should essentially give us confidence whether or not these are proper candidates. So, by all means, if you do have candidate fixes, please send them. But I'm just trying to give the lay of the land of the last conversation that we had about trying to address the problem.
But I'm also trying to provide an apology, I guess, as to why you haven't seen any other fixes merged for XFS for a long time, and that's because if you change employers and you lose a huge system, then you basically lose your rig for testing a lot of stuff. So that's kind of one of the problems, right? If you change employers and you had a big system to run tests on, then where do you run them? There are solutions to these problems, but it requires a bit more collaborative effort.

OK, we're going to let Yong say something; he's been waiting for a long while.

Hi, hello. Yeah, you're good, man. Can you hear me? Yep. OK, so, how distros actually deal with backporting file system fixes is, for example, we very much do care about XFS fixes. So in our kernels, although stable doesn't pick up XFS fixes, we do pick them up for the distro kernels, and I think Red Hat does the same. We leverage the git-fixes infrastructure. Basically, when a patch has a Fixes tag, then even though the patch doesn't actually go to stable, it comes to us, and we backport the patch and add it to our distro kernel. And we also do all the testing, like Luis did a lot of the automation as well, but regardless of that, the point is that the distro then does the testing, to get the confidence that the patch is actually good to go to the customers. And these are the resources that need to be put in: developer time, people actually looking at the patches, people who have at least a bit of a clue whether a patch is safe, who look at the patch and decide whether it is good to backport or not, and then, of course, quite a lot of testing. So, as I said, it mostly works out fine even without targeted testing for ext4, but on the other hand, it didn't work out a few times for XFS, which I believe is why the XFS folks decided that it's better for them to not backport anything, because there were a few screw-ups and, let's say, the personalities didn't work out together. So, yeah, I think that git-fixes is also a useful thing for this, because there you already have the annotation...

Let me just... I'm sorry, Yong, are you finished? Yes. Okay, I just want to give a glimpse of something. I'll let you ask what you want, but maybe some of you have been staring at what I'm trying to display for a while, trying to figure out what it is. So, this was mostly auto-generated, right? This is something that I was looking for when I was backporting patches over a period of two years: I was missing the higher-level context. There are around 600 patches between 5.10 and 5.17, I think, for XFS, and I wanted to look at them at a higher level. So, I created a tool that takes all the commits, finds them in the public-inbox mailing list archive, finds the cover letter, and creates a link to the cover letters, and then I have all the patch series and the series revision that went in upstream. And then I could easily pick patch sets instead of picking commits. That's one thing. It's easy to rule out things that you don't need, and it's also easier to understand, by reading the cover letter, whether there are any dependencies and such. So, it's still human work. It's a lot of human work, but this is a great assistant for that human work. And I've created a branch with the backports already, which is being tested.
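As a rough illustration of the two workflows just described, sweeping a range for Fixes-tagged candidates in the git-fixes style, and jumping from a commit back to its posted series so the cover letter can be read, here is a small sketch. The range, paths, and example commit id are placeholders, and it assumes the commits carry the usual Fixes: and Link: trailers and that b4 is available:

```sh
# Sketch: list candidate XFS fixes in a range, then follow one commit back
# to its posting so the whole series (cover letter included) can be reviewed.
RANGE=v5.10..v5.17

# Candidate fixes: commits touching fs/xfs that carry a Fixes: trailer.
git log --oneline --no-merges --grep='^Fixes:' "$RANGE" -- fs/xfs

# For a given commit, its Link: trailer points at the lore.kernel.org post.
sha=0123456789abcdef                      # placeholder commit id
git log -1 --format=%b "$sha" | grep -i '^Link:'

# b4 can then fetch the whole thread, cover letter and all, given that URL
# or the message id inside it.
b4 mbox 'https://lore.kernel.org/linux-xfs/<message-id>/'
```

Reviewing whole series this way also surfaces dependencies that a single cherry-picked commit would hide, which is the point the tool above is making.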
One small feature to mention that I added is looking up references to fstests in the cover letter and the mailing list correspondence, and that's auto-generated here. So you can see it: I see a failure in test 301 mentioned, so maybe I need to backport this patch series. It's not always a failure, sometimes it's just related in some way, but it gives you context, so you can go and see what was discussed. By the way, this has been done by looking at the pull requests. But just so you know, as a maintainer, when you're sending a pull request, you can use the same tool to generate those release notes if you want to attach them to the pull request. This is generated as reStructuredText, but the tool can also generate plain text. So, I don't know, maybe you'll like that, you can use that. It hasn't been merged anywhere yet; I need to talk to Konstantin if you guys find it interesting.

One other thought that's probably worth some discussion: as the ext4 upstream maintainer, about every three or four months I'll take the latest LTS releases, so 5.10, 4.19, whatever, and run a full set of fstests, recent as of that time, to see what test failures there are. And it's not running the tests that's hard, right? I mean, for me it's literally less than $2 of retail GCE cost to run a full set of tests on twelve different file system configs. It's not machine resources, really. It's the developer resources to then interpret the failures, to figure out whether these are failures whose fixes will never actually get backported, because there's no way we're going to get those changes back into the 4.19 kernel, or whether it's a patch that should have been backported but somehow didn't get auto-picked. I do this about every four months, that's when I have time, and on average there are maybe between one and three patches that I will then manually backport and send to the stable kernel maintainers. So that is something that, I don't know how many upstream file system maintainers do that. It's simply a matter of time. Maybe there are improvements we can make to the automation that would make it less of a time burden, or maybe this is an opportunity for us to recruit other developers into our file system development communities to do that work, because there's no reason why it has to be the maintainer who audits the LTS test runs. But that is something that is good and useful to do. There is value in running fstests against the LTS kernels even if you're using the auto-backport mechanisms, because the auto-backports don't always get everything.

I would go further and say that, I'm not a file system maintainer, but I am a file system user, and if you're in charge of a file system project, upstream is not a product. No one is using upstream; you're using a stable kernel. Using a stable kernel which is not LTS, that's fine, but I want to ask you why: if you're choosing one kernel per year, why not choose LTS? I think that's obvious. Even in a company like Facebook, you probably have more than one kernel. At Microsoft, there are going to be kernels running that are basically LTS, like the example you mentioned, but there are also lots of workloads running on something that's more recent, closer to upstream. No, I agree, LTS is upstream. But XFS currently only gets fixes upstream, so it's the .0 release; not even the first LTS is usable.

So there's some data here that may be really helpful. I was curious about your point, so I looked at how many things are in 5.10 stable. From which file systems? I looked at eight of them. Ted
is doing a good job: over 25% of his commits are in stable. There have been 382 commits since 5.10, and 96 of those are in stable, which is well over 25% really, because I'm including merges in the total. Btrfs had 140. I looked at my percentage as a maintainer, and I'm obviously not doing a good job, because I'm at about 13% or something; I should have a higher percentage of commits than that. It looks like the number you're aiming for is about 25%, based on looking at some of these other file systems. NFS is close, it's about 20%. So file systems should be backporting about 20% of their commits; ext4 backports more. I don't think the percentage matters, though; a fix is a fix. I know these are fixes, but what I'm getting at is that, as a general rule, XFS was a disaster, but 9P had two. Is that a good thing? I backported fifty-something patches for XFS out of 600. I think the right number is probably higher than that, because there are probably performance fixes and other things, but I am curious that the numbers in LTS seem to be larger than 10% for most file systems, not just ext4. Obviously, as maintainers, we shouldn't be backporting all these things, they don't all have to go, but clearly there are file systems that do: NFS backports a lot higher percentage than NFSD. I don't know if their bugs are any different. I don't know, but anyway, it's an interesting story to look at. But one of the things that's hard for me to do as a maintainer is, I can't fork fstests, and today's fstests is going to fail on 5.10 because we changed reflink behavior. So, can't we have a stable branch of fstests?

Well, yeah, it's an issue. fstests is not friendly to people testing stable, not so friendly. But specifically for what you just said, the annotations that I added can help, because if the test says "fixed in kernel" such-and-such, you can run your own script to auto-expunge that test. Yeah. It's going to be merged into fstests soon; what I did is I created the helpers for the annotation and I annotated a lot of the overlayfs tests, so we have an example you can continue from.

And to be fair, I think it's great to have the actual infrastructure, but at least for btrfs, and I assume most everybody else, we've already had this: when we write the comments for a test, if it's a regression test, we say, okay, it was fixed by this patch title, so we can go through those. That's what I did; I didn't do the research now, I just used my own comments. As soon as your stuff is merged, I can go pull all that out and put it in there.

Okay, it's two o'clock, so let's close this call and join the storage track call. That's where we'll be. Thanks.
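As an aside on the numbers quoted above, here is a rough sketch of how such per-filesystem stable percentages can be pulled out of the trees. The branch names are the usual upstream and stable ones and may need adjusting for your remotes, and commits that merely mention "upstream" for other reasons will slightly inflate the count:

```sh
# Sketch: estimate what fraction of a filesystem's upstream commits since
# v5.10 ended up in the linux-5.10.y stable branch.
FS_DIR=fs/ext4

upstream=$(git log --oneline --no-merges v5.10..origin/master -- "$FS_DIR" | wc -l)
# Stable backports reference their source as "commit <sha> upstream" or
# "[ Upstream commit <sha> ]"; either pattern marks a backported commit.
backported=$(git log --oneline \
	--grep='commit [0-9a-f]\{40\} upstream' \
	--grep='Upstream commit' \
	v5.10..stable/linux-5.10.y -- "$FS_DIR" | wc -l)

echo "$FS_DIR: $backported of $upstream upstream commits made it to 5.10.y"
# e.g. 96 out of 382 is roughly 25%, the ballpark quoted for ext4 above.
```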