So, in past years, when asked when bcachefs is going to be ready for upstreaming, I have been saying: almost, not yet, almost. Now I'm saying, let's finally do it. I've got a couple of slides on status, where we're at, and why it's ready now, but basically I want the bulk of the time to be talking about process, because it's a massive 90,000-line beast and we need to figure out how to get it reviewed and what the process should be. Next slide.
I wanted to talk a bit more about the actual goals. I think as bcachefs is becoming a bit more developed, I'm able to talk with more of a straight face about what I want to do with it. This is the performance, reliability, scalability, and robustness of XFS, with modern features. That's a tall bar and we're not there yet, but I think we're pretty far along. Scalability-wise, people are using it on 100-terabyte file systems without any issues or complaints. I'm waiting for the first one-petabyte file system user. It scales beautifully. That's been an issue with btrfs that people have complained about. There are lots of other cool things to talk about that I'm not going to go into here. Next slide.
Status update. In the last year a lot of scalability work has been done, and that involved deep rewrites. The last of the code that dated back to bcache design-wise, the allocator code, has been rewritten and replaced with on-disk persistent data structures. So now there's no scanning required in the allocator or copygc. That's what got us to 100-terabyte file system scalability. We've got nocow mode. We were starting to do some benchmarks of the nocow write path versus XFS, and the results look encouraging. There's still a little bit of work to do, so I'm not showing those off yet, but that's a big check mark. Snapshots: people are now using bcachefs with MySQL and using snapshots for taking backups.
So they're taking, like, nightly snapshots, and with database workloads that stresses things pretty well. The last of the bugs that those people were hitting have been worked out, and I have been saying that snapshots are stable and waiting for the next person to break it, and things are looking good. Erasure coding is the last really big feature that I would really like to get done. I would love to have that done before upstreaming, but I think it's time to draw a line in the sand, and that can wait a little bit. Erasure coding is going to be really cool, though. There's no write hole, and it doesn't fragment writes like ZFS does. There's obviously a lot more work to do, but the big feature work is lessening. Let's say it's never-ending in a file system, but I will have more of my time available for being a good maintainer, unless I have to go off and block out the world for a month because I have to think about the feature I'm working on and can't think about anything else. That's a lot of what snapshots was. Next slide.
The team is growing. Brian Foster at Red Hat has been doing a lot of really great work and bug fixes. Shout out to him if he's here. Thank you so much for joining us; his help has been much appreciated. Eric Sandeen has been a big help in attracting interest at Red Hat. I've been starting to get a biweekly call going, the bcachefs cabal meeting. If anyone is interested in joining that and helping out, give me a shout and I'll get you an invite.
Test infrastructure. We've got automated test infrastructure that's awesome, and it's been making my life much easier. When everything is coming along nicely, a full run finishes in like half an hour. That includes multiple fstests runs and our own huge test suite.
And Rust is something that I'm not going to take too much time on here, but I'm really excited about it and evangelizing to anyone who will listen. I think continuing to write code in C when we finally have a better option available is madness.
I love writing code. I hate going back to debug code that should have been done with. I like to be able to move on to the next thing, and writing Rust just means a lot less time debugging. I intend to slowly, gradually rewrite bcachefs in Rust. That will be a 10-plus-year project, but it's already getting started. Part of the user space tools has been rewritten in Rust, so I'm looking at taking that work and bringing it into the kernel repository. So I think that's it for the status update. Next slide.
Upstreaming. I just posted this morning about 30 patches, all the non-bcachefs patches in my tree: stuff that we depend on. That's out on the list and it's already getting reviewed. The rest is about 90,000 lines and 2,500 patches that I did not post to the list; I just included a link to the git repository. I want to talk about what the review will look like for that and what the process will be. Let's open it up.
So I think the feedback is similar to what we said last year, right? Kent, we're all really excited for this to get merged. I think that you've basically done what we talked about last year, which is to have all your prep stuff, and that's the stuff that's going to need to be reviewed. But I'm not going to read your bcachefs code, and even if I did, I'm not going to be useful, right? I mean, maybe somebody else, like Brian or whoever; you have guys that are working on bcachefs, so just trust yourselves for that part. I think the generic stuff is what we need to review, and once that's in, I don't see a reason not to merge. I mean, the rest is, like, when it's right.
Yeah. My question is, what do we take to Linus? And my feeling is that for a project this size and this established, it should really be more about the process.
What do we say about the process? For me, the past years have been a lot about process: bringing in more people, getting Red Hat interested, and test infrastructure. That was my biggest milestone, a big weight off. Things go a lot more smoothly and more quickly when you have automated test infrastructure with a nice dashboard.
So, Kent, this is Mike Snitzer.
Hey, Mike.
Hey. So it's interesting. There was a failed post of the VDO target to linux-block yesterday, in that the amount of code that was captured in two patches ended up being like 512K for patch number two and 1.2 megabytes for patch number three. And it was an interesting contrast to see the really coarse-grained patches getting dumped on a list versus clicking on your git repository and seeing 2,500 patches split out in very fine-grained detail. Honestly, it seems insane to have that many patches factored out and have somebody sit there, review each one, and make a judgment. But just having put that work in makes you seem way more substantial, and makes it easier to trust your code and your process, than somebody just dumping a mega-patch that encapsulates a body of work over so many years. So I'm just saying, as DM maintainer who is getting set up to suffer with this VDO situation, whereas you have to convey to others and get others to trust the work that you've done, you've done the heavy lifting by doing all that work to split out patches, quite honestly.
Thank you. Rebasing the entire history, or close to the entire history, has been a lot of work, but I'm also very glad that I did that.
I ended up needing that about six months ago. Red Hat started doing some performance testing and went, "What the hell?" It turns out I had dropped the ball and had not been doing my own performance testing adequately, and there were some big regressions. But because I had the whole history, I was able to write some tooling to do automated performance testing and automated bisection of performance regressions over that whole history, and I got everything back, or pretty close to it. So yeah, having the whole history is just necessary in my book.
Yeah. And again, this is not really a question that anybody in this room can answer, because at the file system level your maintainer is Linus, and that's who you have to convince to pull your whole history. I don't know what he's going to say. It would be my preference that he pull the whole history, because that's super useful, right? But ultimately I think that's not a question the people in this room can answer. As far as review goes, you've done what we would all expect, which is to post the stuff that's not bcachefs that needs to be reviewed. I don't want to speak for everybody, but I know for myself, I'm not going to review anything that's bcachefs-specific. So then it becomes more a question of going to Linus and saying, "Hey, this is my git pull with my full history. Is this acceptable?" Not just "do this," but "is this acceptable? If not, what would you prefer to see?"
Right. Yeah, that's a good way to put it.
Yeah. If I were Linus.
Yeah, the interesting question is, after we merge this, it's going to be difficult for anybody besides you to do anything about bug reports, right? Or, I don't know how large the team is that can do anything about bug reports. I think that's the main question that needs to be answered, or needs to be explained, in the pull request.
Right. I want to marry.
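[Editor's sketch: the automated bisection of performance regressions over a full commit history, mentioned earlier in this exchange, might look roughly like the following. This is not the speaker's actual tooling, which is not described in the talk. The function name, the `benchmark` callable, and the threshold are all hypothetical, and the sketch assumes a linear history in which performance stays regressed once it has regressed.]

```python
from typing import Callable, Optional, Sequence


def first_regressed_commit(
    commits: Sequence[str],
    benchmark: Callable[[str], float],
    threshold: float,
) -> Optional[str]:
    """Binary-search a linear history for the first slow commit.

    `commits` is ordered oldest to newest. `benchmark` is a hypothetical
    callable that builds and measures one commit and returns a score where
    higher is better. A commit counts as regressed when its score is below
    `threshold`. Assumes that once a commit is regressed, every later
    commit is regressed too.
    """
    if not commits or benchmark(commits[-1]) >= threshold:
        return None  # nothing to bisect: the newest commit is fine
    if benchmark(commits[0]) < threshold:
        return commits[0]  # the oldest commit is already slow

    good, bad = 0, len(commits) - 1  # commits[good] is fast, commits[bad] is slow
    while bad - good > 1:
        mid = (good + bad) // 2
        if benchmark(commits[mid]) >= threshold:
            good = mid
        else:
            bad = mid
    return commits[bad]


if __name__ == "__main__":
    # Simulated history: throughput drops at commit c6.
    history = [f"c{i}" for i in range(10)]
    scores = {c: (100.0 if i < 6 else 70.0) for i, c in enumerate(history)}
    print(first_regressed_commit(history, scores.__getitem__, 90.0))
```

[With a fine-grained history of about 2,500 patches, a search like this needs on the order of a dozen benchmark runs per regression, which is one reason a fully split-out history is useful for this kind of work.]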
Yeah, I have a team that can support this. I think this is very important. That was actually one of my big criteria before upstreaming: it could not be a one-man show anymore. I was shouting at Eric Sandeen for quite a while, "You guys seem interested in this; you need to actually get me some help if this is ever going to go upstream, because I will go insane and run away to South America if I upstream it and then get deluged with bug reports and have to deal with it myself." So Brian helping out has been a huge help, and that's one of the things that makes me much, much more comfortable about upstreaming it now. Yeah, Brian's already been doing great work.
Yeah, I think I'm going to reiterate a theme about the fs/bcachefs patches, which is that ultimately Linus is the person who needs to be made happy. And I would suggest that since we are now talking about how we can remove file systems, paradoxically that's probably what's going to make it easier to add file systems, right? Because I can remember when Hans Reiser stepped forward, decades ago, being as enthusiastic as you are and asserting that he had a team, he had a company, it was going to be great. And it may have been great back then, but then it fell into disrepair. So now we have a process for saying, "Okay, there are still some enterprise users, but in four years we know how to remove it," right? And we've had Shaggy actually say, "Yeah, I have no objections to making fs/jfs go away, because no one's using it and I don't want to support it forever." So, ironically, just knowing that accepting a file system isn't forever makes it a whole lot easier. And I think having a team and saying, "This is who's doing it; it's not like this project has a bus factor of one," is going to be really, really important.
The other piece that I wanted to suggest, and this is sort of generic file system patch review semantics, comes from simply looking at that huge patch set. It's a lot easier to get people to review patches if you separate them out into smallish sets of, you know, 10 patches: these are all the closure patches, these are all the patches that add various interfaces that I need in the block layer, these are the lockdep-related patches. That may make it easier for people to simply look through it. For example, I looked at patch zero, and it said there were two lines added and two lines removed in fs/ext4. And I haven't found which patch actually changed something in ext4, because I have to go through all the patches to find it, right? That's the piece where I think people are a lot more likely to say, "Yep, I've reviewed all the lockdep patches," if they're separated out.
The one thing that we will need to relax as a policy question when we do that, and I believe we should but I want to call it out explicitly, is this general rule of thumb that we don't add infrastructure until we add the first caller. That becomes a problem when we're adding something big like bcachefs. My suggested amendment to that rule of thumb is that we add the infrastructure separately and point at the git repository and say, "Here's the first caller. It's not in the kernel yet, but it's going to be." We're sort of reviewing all the prerequisites first, getting that into the kernel, and then we do a pull request that has everything in it. Whether or not that's the right process, we'll need to gain consensus, and ultimately Linus makes the call on that. But that would certainly be my suggestion for how to break that chicken-and-egg problem.
Oh wait, you know, we need to add these changes to lockdep, add these function calls to the block layer, yada yada, and we don't have the first caller yet, because the first caller is in these 90,000 lines of changes that are going to be coming as a separate pull request. Maybe that's not the best way of doing it, but I throw it out there as a possible path forward.
My hope is that really only one or two of these patches are going to be of any controversy, and hopefully that stuff will just sail through. The vmalloc_exec one is the one that's turned out to be controversial. Christoph doesn't like it, but I kind of need it. Exactly that: I kind of need it. I've already spelled out exactly why, what it's for, and what the disadvantages of the alternate approaches are. The others are pretty small and well factored, and I think have already been discussed, like in the case of lockdep. But yeah, if we do end up needing that suggestion, I'll definitely keep it in mind. But to answer your question about the patch that touched ext4: there's a two- or three-patch series that reworks bio_for_each_page_all and bio_for_each_folio_all and adds a bio_for_each_folio. That's the one.
Yeah, okay, thanks. And that was going to be the other observation that I made, which is that right now anything involving struct page has been under radical change, as I assume you've noticed.
Oh yeah. bcachefs is already, yeah, already large folio.
Yeah, but the problem is, even after you've converted to folios, people have been changing the function signatures of folio-related functions, right? And ext4 got caught out by that. Something that returned a bool now returns an error pointer, and linux-next, "it builds, ship it," didn't catch it.
So, I'm still maintaining a 4.19 backport, so I feel the pain.
Yeah. I'm sure you've probably already seen that, but that's probably going to be the other piece. Tricky.
Oh, yes.
I'm not complaining, because I'm very supportive of folios and have been from the start, but I feel it.
The thing I wanted to point out was, whether or not Brian will be a resource in perpetuity, I do think that his experience working with you in the last month or so has shown that an experienced and capable developer can get into this code and contribute; that it's written in a way that you can get up to speed, you know what I mean? I think there's always this chicken-and-egg problem of who's going to work on it when there are no users, and how do you get users, right? You get to the point where it's out there and people start using it and getting interested. I think that we have a little bit of data to show that people who want to get into it and contribute will be able to do so, that it's a code base that lends itself to that.
I think the only thing that would be great is if, maybe not right from the start but fairly early on, it would be clear who could jump in, for example picking up patches and routing them to Linus, in case you should not be able to be around for a while. That's a problem we have seen with, I don't know, NTFS or NTFS3: the maintainer disappeared, and then, even though there were people who technically contributed, there was no clear path for who could route patches upstream. I think that's the most important part.
Yeah, I would trust Brian in that role as it stands right now, if he's willing to step up to that.
And the other part is, yeah, I think bcachefs is probably in excellent shape to be upstreamed, but in general my observation is that there are a lot of file systems that we have upstream, and I appreciate that we are slowly removing some of them. NTFS could probably also be a candidate in the future, and we should have a conversation about how to do this generally, like have a deprecation path for file systems.
I was just looking at this from the perspective that if you have to touch something that affects multiple file systems, things get painful very, very fast. And you're also faced with the problem that for a lot of file systems there's just no one there anymore to review it. I have no idea how reiserfs works, I have no idea how some other tiny file systems work. So I would like to be more conservative before we accept new file systems. I think NTFS and NTFS3 was a huge mistake.
The only thing I can do for that is just try to write the cleanest code I can.
Yeah, I mean, it's really not directed at you. I think this is just great work. But in general, we regularly get submissions like SSDFS or whatever, and it's really scary, because the only thing that really needs to be done is for a pull request to be sent and for Linus to be like, "I hear no one yelling, I'm pulling it."
I used to do a fair amount of those big refactorings across the block layer. I did a lot of the early bvec iterator work, and I know that kind of pain of having to jump into every random code base and learn it. Digging into the MD code was always like, my God. Neil Brown is such a great guy and always easy to talk to, but that code, and I don't know how much he wrote or inherited, was just so painful to try to follow. But then there are other code bases that just weren't that bad, like AoE. It really just comes down to how well factored and how well structured the code is.
For what it's worth,
I already started doing what Ted suggested for XFS online repair: putting all the infrastructure changes in their own separate set and saying, well, the real users for this are somewhat further down in the patch set, which in my case, unfortunately, is like 100 patches later. But I have finally persuaded Dave Chinner that there's some value in looking at both the individual patches and the overall diff that you get from going from the beginning of the djwong dev branch to the end of whatever I'm posting, so that you can see all of the pieces individually broken out bit by bit and also see how the entire system actually works, while saving me the annoying task of rebasing things repeatedly and having to play code golf, moving small helper functions up and down in the patch set. One thing that I've noticed while writing online repair is that I'll get an idea for something and think, well, this should be common infrastructure, and I'll go write it into some part of the patch set. Later I'll think I could also have used it over here to improve this other thing, and having to go take that little snippet of code and move it 100 patches back and upwards in the patch set is, like, a waste of time. Why don't I just put the infrastructure at the beginning and tell people that they can run git diff? So, to address what Ted said, I already started doing that, and yeah, it did create a bunch of complaints from the reviewers, but I did it. So there's at least some precedent for trying to do something that, I hope, generally preserves overall code quality while also not making patch submitters completely insane from having to move stuff around their patch sets over and over again, especially when the outcome is exactly the same code.
I'm not really worried about the 30 patches. They're all relatively uncomplicated, and it's not that big of a patch set. I'm just going to wait for the acks and Reviewed-bys to come in, and if they don't come in, then I'll take your suggestion as a plan B. But I'm less worried about that part.
Yeah, I mean, keep in mind that my current patch backlog is about 900 patches.
Okay. Well, thank you for your time.
Thank you, Kent.
Look forward to seeing everyone on the list.