I took x86 as the example because, why not? So in the x86 do_user_addr_fault function, we take the mmap read lock and then we call find_vma, and the whole way after this we're expecting the VMA to stay stable because we're holding the mmap read lock, and if you're going to change a VMA you have to take the mmap write lock, so we're guaranteed it will stay stable. And we pass the VMA down, and the problematic bit is when we get into __handle_mm_fault and we call p4d_alloc, pud_alloc and pmd_alloc. I'm really glad that David went first, because he did a fantastic job with those slides explaining what all these acronyms mean, but the problem from an RCU point of view is that those use GFP_KERNEL allocations. So we might end up doing page reclaim, we might end up sleeping, waiting on page writeback to happen and memory to become available, and this is all a giant mess because obviously you can't sleep while you're holding the RCU read lock. It's actually okay after that. 
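The lazy-allocation pattern described above can be sketched in userspace C. This is not kernel code; the struct and function names are invented for illustration. Each page-table level is only allocated on first touch, and that allocation (a `calloc()` here, a GFP_KERNEL allocation in the kernel) is the part that may sleep, which is what conflicts with holding the RCU read lock:

```c
/* Userspace sketch (not kernel code) of the lazy page-table
 * allocation that __handle_mm_fault does via p4d_alloc(),
 * pud_alloc() and pmd_alloc(): a level is allocated only if
 * missing, and that allocation is what may sleep in the kernel.
 * All names here are invented for illustration. */
#include <stdlib.h>

#define ENTRIES 512

struct level { struct level *entry[ENTRIES]; };

/* Return the table one level down, allocating it if missing.
 * In the kernel this calloc() would be a GFP_KERNEL allocation
 * that can recurse into reclaim and sleep. */
static struct level *walk_alloc(struct level *table, unsigned idx,
                                int *allocated)
{
    if (!table->entry[idx]) {
        table->entry[idx] = calloc(1, sizeof(struct level));
        if (allocated)
            (*allocated)++;
    }
    return table->entry[idx];
}

/* Walk p4d -> pud -> pmd for one address; count new allocations. */
static int fault_in(struct level *pgd, unsigned p4d, unsigned pud,
                    unsigned pmd)
{
    int allocated = 0;
    struct level *t = walk_alloc(pgd, p4d, &allocated);

    t = walk_alloc(t, pud, &allocated);
    walk_alloc(t, pmd, &allocated);
    return allocated;
}
```

As the discussion notes, most of the time the levels already exist and no allocation happens at all; the worst case is touching a p4d range for the first time and allocating all three levels.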
That's the only bit which is really causing me heartburn right now: we've got these three GFP_KERNEL allocations, and to add insult to injury, most of the time you don't even do them, because they already exist, but of course the possibility is that you've never touched anything in this P4D before and you've got to allocate all three levels. So Paul, I'm so glad you're in the audience. I think there is another mic floating around. I proposed a couple of ideas to Paul and he then asked me some trenchant questions, and I didn't quite get round to answering them. One question is: if you had something like SRCU that did not have the read-side full memory barriers, would that make things easier? Oh, I wish Laurent was here, if somebody can find him. Sorry. It's only 10, 11 o'clock there. Right. Well, it's nine hours' difference. It's 11 o'clock. Well, anyway, SRCU has been tried for this before, or for SPF before. There was an SRCU variant of it, and there were performance problems. Now, this was a few years ago, so maybe those performance problems are now fixed. The performance problems would still be there. The difference is that I might have a way of letting people choose between having the read side be slow, which is the current choice you get, period, and having the write side, you know, the grace periods, be a little more contorted. But I would want a use for it before I did that. So my question is: if you had something like SRCU, where you have to have the srcu_struct, but there was a way of doing it so that, I mean, you're probably going to have a preempt enable in there, okay, in the read side, so you're going to have some tests, maybe in both of them, maybe not, I don't know, I'd have to go through it, but there would not be a full memory barrier. If you had that, would that get you to the point where you could just throw the critical section around the whole mess and not worry about it? And that's actually to both of you, both Matthew and Michel for that matter. I don't know. 
I think more thought is probably required. I mean, my instinct is to say that actually that should be fine, but I think recent bugs for me have indicated that I do not understand memory barriers in any functional manner. I mean, I think there are cases right now where you would hold the mmap read lock for a long time, so in this case it might be fine, but the question is, like, how long could the grace period be? Well, that's up to you. It's your srcu_struct. No, seriously, that's the reason that SRCU has the multiple domains: with normal RCU, if anybody anywhere in the kernel decides they want to hang out in the read side for 100 milliseconds, that affects everybody. The thing about SRCU is, I mean, you could even have one per process if you wanted to. Yeah, we could embed it into the mm_struct. Yeah, the mm_struct needs some extra size then, right? I mean, it's not like it's lightweight. It's like two kilobytes or something. But, you know, you could just have a global one for all of them, depending on, you know, what the readers are doing, right? I mean, if you're having trouble allocating one, you're probably having trouble allocating the other; it's not clear having separate ones is useful, but, you know, I don't know the code, who knows. Most faults will be very fast, but, you know, there's always going to be one once in a while that hits the allocation cases or hits a slow disk, and so most faults take, you know, microseconds, and then once in a while there's one that takes 100 milliseconds, and you don't know at the start of the fault which one it's going to be. Right, but is it the case that processes have different backing stores? In other words, if one process is having problems allocating, are all the other processes too, or is there something with cgroups and namespaces and so on that means that some might be fast and some might be slow? Michel's nodding his head, but I'm not sure which way he's answering. 
Yeah, so essentially you can end up in both memcg reclaim and global reclaim, so no luck there. So we would want to have it per process most of the time. Well, per mm_struct. Well, and in FS reclaim, wouldn't we have a variable time based on which FS reclaim we're going into? Say it again, I was just too busy not screaming. So the FS reclaim is really a wildcard in itself, right? Like, we don't know how long that would take, so if we end up in FS reclaim, all bets are off that we're going to make the deadline. So, actually, I mean, I've got a room full of MM people here who can shout at me that I'm stupid and wrong. My inclination is actually to make GFP_KERNEL explicit rather than implicit in all these p4d, pud and pmd alloc cases. And in this path, at least the first time through, we make them GFP_NOWAIT, and we actually handle the failure to immediately allocate memory. And so the intent is that the quick path just does it under RCU, and, you know, the quick path is: we already have these things, we already have these levels of the page table in place, or we can allocate them immediately and the pages are in cache. What I haven't shown here, because, you know, the slide's full, is that filemap_map_pages can fail. It can say, well, the page you've asked for is not in the page cache, or the page you've asked for is in the page cache but we're going to have to run readahead in order to fetch the next batch of pages. So, yeah, you can have that page, but you're still going to have to drop the lock and do I/O. And so it will return FAULT_FLAG_RETRY or whatever it is all the way back up, and in that case we would take the mmap read lock again. We can fall into the slow path. That's fine, as long as, you know, in the 99% case we do the whole thing under the RCU read lock and never touch it; you know, we're already winning, right? So who's with me on actually adding a GFP flag to these three levels of page table allocation? That would be a lot of code to change. Less than you'd think. 
We really don't allocate page tables in that many places. And there's something like this that makes vmalloc able to operate with non-GFP_KERNEL flags. Yeah, I mean, this hasn't been a technical problem; I believe it was mostly that you've got those page-table alloc functions for all architectures, so you would have to touch a lot of architecture code. Have you seen how big my folio patch sets have been? Yeah, that's what I'm looking at. You're not scaring me here, Michal. I'm not trying to scare you, because you are hard to get scared, but all the people who would have to look at the code and essentially be aware of all those subtleties that might be there, like, I don't know, those contiguous page tables in some architecture and whatnot. I don't think it's a huge technical problem so much as a lot of work, and for some reason this would be really, really helpful to have from the very beginning, but that's not the case. A GFP flag, or can you just go to the slow path whenever there's not a PMD already there? Because I worry about the case where you have, like, a big virtual mapping and you start faulting everything in sequentially, and with GFP_NOWAIT you never go into reclaim, right? So you could pretty quickly deplete all available memory without being forced into reclaim. But the PMD is mostly going to be there, right? It's just the first hit that has to allocate it, and then all the subsequent ones wouldn't have to fall back. Yes, so if you're doing GFP_NOWAIT and you would have to reclaim, you just fail, and the response to that is to go back to do_user_addr_fault, reacquire the mmap read lock and then try it again with GFP_KERNEL instead of GFP_NOWAIT. Right. So that's going to force you into the reclaim path, unless somebody else did the work for you first. Then you wouldn't have to. You wouldn't have to add a GFP flag, right, if you just go to the slow path if there's no PMD yet. 
Like, don't even try to allocate it in the fast path, not even with NOWAIT, just don't. If it's there, you do the fast path; if not, then you fall back, and mostly it's going to be there. That's the least code to touch. That's really interesting. We should try that. Okay. All right, thanks. Yeah, so, where we're going from here. If I can have a question to Paul, actually. So if we have that whole thing in an RCU read section, what could actually happen if the reclaim, or any part of that path, just depends on RCU in some really awkward way? Because we simply don't know, because those are reclaimers that are out of MM's hands, so you actually cannot make any assumptions. Is that possible, even remotely? Well, so first off, I'll end up giving you, like, three answers, because we have three different things we're talking about. Okay. So the first one was a modified SRCU to be fast enough. In that case it's a different thing off on the side, and so there's no interaction unless you make it be. It's your own SRCU you've used. So if you put an interaction there, well, okay, you shot yourself in the foot, right? Which happens; I do that a lot myself, so, you know, welcome to the club. The next one is if we used GFP_NOWAIT or whatever it was, okay. In that case you go off into the reclaim, and it probably has RCU readers. If it does a call_rcu, that's fine, it just goes off and that's great. If it does a synchronize_rcu or a synchronize_rcu_expedited, that'd be a problem, but if that were to happen, lockdep would yell at you really quickly. So if you were doing that approach, my advice would be to do something to just force reclaim on that path manually. I'm pretty sure you can do that. You're looking at me like you can't, and maybe you can't, but if nothing else, just have something else allocate a whole pile of memory and that will force it, all right? This can be done, and if you then ran a kernel with lockdep enabled and it tried to do a synchronize_rcu inside an RCU critical section, it would yell at you. Okay. 
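The two fallback ideas discussed above can be sketched side by side in userspace C. This is a toy model, not kernel code, and all names are invented: the fast (RCU) path either tries a nonblocking GFP_NOWAIT-style allocation, or, in the simpler variant Johannes suggests, does not allocate at all and bails out whenever the pmd isn't already present; either way, the caller then retries under the mmap read lock with a blocking allocation:

```c
/* Userspace sketch of the fast-path fallback ideas: try a
 * nonblocking allocation (or none at all), and on failure return
 * RETRY so the caller redoes the fault under the mmap read lock
 * with a blocking GFP_KERNEL-style allocation. Names invented. */
#include <stdbool.h>

enum fault_result { FAULT_DONE, FAULT_RETRY };

struct mm_sim {
    bool pmd_present;     /* is the pmd already allocated? */
    bool nowait_alloc_ok; /* would a nonblocking allocation succeed? */
    int  slow_path_taken; /* how often we fell back */
};

static enum fault_result fast_fault(struct mm_sim *mm, bool try_nowait)
{
    if (mm->pmd_present)
        return FAULT_DONE;              /* common case: level exists */
    if (try_nowait && mm->nowait_alloc_ok) {
        mm->pmd_present = true;         /* nonblocking alloc worked */
        return FAULT_DONE;
    }
    return FAULT_RETRY;                 /* drop RCU, go to slow path */
}

static enum fault_result handle_fault(struct mm_sim *mm, bool try_nowait)
{
    if (fast_fault(mm, try_nowait) == FAULT_DONE)
        return FAULT_DONE;
    /* Slow path: "take the mmap read lock" and allocate with a
     * blocking allocation, which always succeeds in this toy model. */
    mm->slow_path_taken++;
    mm->pmd_present = true;
    return FAULT_DONE;
}
```

The trade-off in the discussion shows up directly: the NOWAIT variant avoids some fallbacks but risks depleting memory without ever reclaiming, while the no-alloc variant only pays the fallback on the first touch of each pmd.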
Assuming it hits, and, your next thing as well, maybe it happens only sometimes, and, yeah, I can only help you so much there, right? In the other case, where you aren't doing the allocations and you're doing what Johannes was suggesting, then clearly you aren't; hopefully that doesn't force a reclaim if you don't allocate, but what do I know? Did that answer things, or? Yeah, actually, yes, because the answer is that this could be really dangerous if any of the code paths that live outside of the page fault, so they're not under the direct control of that code, actually do something that is synchronizing RCU, which can be really hard to find. So, for example, just to give one example: in rcutorture, if it were to be doing a callback flood, which has made reclaim happen, and it gets the callback from the OOM handler and then says, I'm going to do an rcu_barrier, you think that might cause a problem? Yeah. Okay, well, yeah, there we have an example for you. You can worry about other things; I don't want to stop you from worrying, go ahead and worry about other possibilities. Is that fair? Yeah. Okay, so we do have an example where it could be difficult. In the sleepable one it's a full barrier, right? In the sleepable RCU? Where's the Catchbox when you need one, you know? Okay, so in SRCU as it exists now, yes, okay, there's a full barrier. That has not always been the case: there was a time, a long time ago, where there were just three grace periods, three RCU grace periods per SRCU grace period, which caused us trouble, all right? It's possible now to make something kind of in between, where we don't have memory barriers, where basically it would be possible to have a thing where you say init_srcu_struct and make it be a fast-reader one, and there will be some penalty, I don't know exactly what right now; if I have it on the update side, if you do a grace period, there's something that might take longer. 
Intuitively, that's exactly what RCU should be like: most users will assume that readers are cheap and writers are expensive, so the way you're proposing optimizing sleepable RCU makes sense for any users, because, like, hey, readers are going to be cheap, but at the cost of writers. I agree, but "any" is a strong word. Right, so I guess the other part of this is the VMA handling, and in that case we're thinking that, for this to work, once you look up a VMA under RCU, it can't somehow change to the point that the address you're interested in is no longer in that VMA, and that basically means that instead of resizing VMAs, vma_adjust and split would essentially use new VMAs, so the VMAs would be RCU-safe by being RCU-freed. So, yeah, that's kind of the other part of this. I don't know if that's a problem for anyone, or how. If the VMA has changed like that, and the old VMA you looked up at the start of the fault is not current anymore, you still need to detect that at the end of the fault, probably before you commit any new mappings to the address space. Yeah, so we were thinking about a flag for that in the VMA flags, an inactive flag. So if you hit a VMA that has the inactive flag, you know there's something happening to that VMA, and you keep looping until it's gone. Yeah, basically you abort the fault and you try again, either the same way or with the lock. And that flag, that flag on the VMA, I think, would be synchronized by the page table lock. So once you've got the page table lock, you know that you can check that flag, and it's going to be valid for the duration. Okay, there's potentially a lot of... Oh no, once you have the page table lock you can check the flag, because the flag can't change on you while you're holding the lock. The person who has set that flag will then take the page table lock and tear down all the mappings. That's a bit similar to SPF; I mean, I don't know if I want to talk about it right now, but there's a lot of similarities. 
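The inactive-flag protocol just described can be sketched in userspace C. This is a simulation, not kernel code, and the names are invented; a pthread mutex stands in for the page table lock. The key property is the one stated above: the fault path validates the flag only after taking the page table lock, and whoever marks the VMA inactive must take that same lock before tearing down mappings, so the flag cannot change while the fault holds it:

```c
/* Userspace sketch of the per-VMA "inactive" flag: check it under
 * the page table lock, where it is guaranteed stable, and abort
 * the fault (retry) if the VMA is being torn down. Names invented. */
#include <pthread.h>
#include <stdbool.h>

struct vma_sim {
    pthread_mutex_t page_table_lock;
    bool inactive;      /* set when the VMA is being torn down */
    int  mapped;        /* pages we committed */
};

/* Returns true if the fault committed, false if the caller must
 * retry (look the VMA up again, possibly under the mmap lock). */
static bool commit_fault(struct vma_sim *vma)
{
    bool ok;

    pthread_mutex_lock(&vma->page_table_lock);
    ok = !vma->inactive;    /* stable for as long as we hold the lock */
    if (ok)
        vma->mapped++;      /* commit the new mapping */
    pthread_mutex_unlock(&vma->page_table_lock);
    return ok;
}

static void start_teardown(struct vma_sim *vma)
{
    pthread_mutex_lock(&vma->page_table_lock);
    vma->inactive = true;   /* faults racing with us will now bail */
    pthread_mutex_unlock(&vma->page_table_lock);
}
```

A fault that loses the race simply sees `commit_fault()` fail and loops, either redoing the lookup or falling back to the locked path, exactly the "abort and try again" behavior described above.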
That's good, because that seems like the sensible way to do it to me, so I'm glad that it's also the sensible way to do it to you and to Laurent; so that's good. Yeah, I mean, should we compare and contrast our approaches here, or do you want to? Yeah, we could do that. So the way that I see Michel's code is that yours is perhaps separated in time and ours is separated in space. So the SPF version of this is: instead of taking the mmap read lock at the top here, you take a sequence count on the mm_struct, and so any modification to any VMA while you're doing the rest of this will be checked right before you do the insertion, and if the sequence count has changed, then you know that somebody has changed something somewhere in the mm_struct, and that might be the VMA that we have a handle on, and so we abort, we go back to the top, and we take the read lock and do the whole thing again, actually protected by a lock. Whereas what Liam and I are doing is separated in process address space: there's an inactive flag per VMA, but there is no sequence lock on the mm_struct; it's simply done by checking the VMA that you were looking at to see whether or not it's being killed by an mmap operation or an mprotect operation or something. So, I mean, you're still going to see false retries with our approach, because if somebody called mprotect on a giant VMA, that would cause the VMA to be split, and, well, now you have to replace two, maybe three parts of it with new VMAs, and so, you know, there's going to be unnecessary retries still with both approaches, but hopefully fewer with our approach. Yeah, I would think that it would be; being per VMA, it just means that it has to hit that one area, right? Yeah, it's going to be less. 
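The SPF side of that comparison, the per-mm sequence count, can be sketched as follows. This is a userspace simplification with invented names: the speculative fault samples the counter up front instead of taking the mmap read lock, does its work, and commits only if no VMA modification bumped the counter in the meantime. Because any writer increments the one counter, even modifications to unrelated VMAs force a retry, which is the false-retry cost being contrasted above:

```c
/* Userspace sketch of the SPF-style per-mm sequence count:
 * sample, work speculatively, commit only if unchanged.
 * All names are invented for illustration. */

struct mm_seq { unsigned long seq; };

/* Writer side: any VMA modification bumps the counter. */
static void vma_modify(struct mm_seq *mm)
{
    mm->seq++;
}

/* Returns 1 if the speculative work may be committed, 0 to retry
 * (go back to the top and take the mmap read lock for real). */
static int spf_attempt(struct mm_seq *mm, void (*work)(struct mm_seq *))
{
    unsigned long snap = mm->seq;  /* sample instead of mmap_read_lock */

    if (work)
        work(mm);                  /* speculative part of the fault */
    return mm->seq == snap;        /* commit only if nothing changed */
}
```

The per-VMA inactive flag replaces this global check with a per-object one, which is why it should see fewer false retries on a busy mm.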
Yeah, I think there are a few places you have to be careful, not just at the commit at the end. So if you go into handle_mm_fault, you know, when you go through the existing page table levels, you do have to be careful for the page tables not to be gone from under you, which can be done with RCU, but also, like, by disabling interrupts so that you hold off TLB shootdowns, depending on the architecture. So you have to be careful there. There's the place where we take the page table lock on the page table that we found; same thing, you have to make sure that it's still the page table at the instant where you try to get that lock, that it hasn't been freed from under you. And then, all the way at the end, when you're going to commit your pages, when you already have the page table lock, you have to make sure your VMA is still the right VMA for what you wanted to do. I think that will be similar whether we do it the SPF way or not. So in SPF we kind of have the same approach, except all of these three places that I mentioned have their own small RCU-protected section, and we don't care about the whole thing being one big RCU block; but I think that's kind of an implementation detail, whether we have one big RCU block or, like, three or four along the page fault path. That's kind of a similar idea. Yeah, thanks for that, Michel. You're absolutely right. I forgot to mention that one thing we're definitely going to need with this approach is that page tables get freed under RCU protection. That's already the case for some architectures. It's even the case for some x86 configurations, and I forget the details, because I looked at it once and I ran away screaming, but I'm going to have to get less scared of that. Okay, Michal, it turns out there are some things I'm scared of, and RCU page table freeing is one of them, so I'm kind of hoping somebody else does that, but I'll do it if I have to. And David wants to say something, give David a mic, but thanks. 
So, I mean, I had a look at that whole mess, and, like, I mean, page tables are just horrible. The thing here is, and I think I mentioned that on the SPF series, that we don't only need, like, freeing of the page table under RCU. We have to make sure that also any, let's call it auxiliary data that is glued to the page table gets freed using RCU. For page tables that is, for example, the page table lock, which on some architectures is embedded in struct page and on others not. And I think we'll get more into that problem domain once we, for example, use some dynamic allocation of, like, struct page parts, as you can imagine. Yeah, that's easier. I mean, maybe it's not so buried deep down in some call chain, but yeah. We actually have that unresolved issue in the current SPF patchset, and that's really only, that's configuration dependent. That's if you have split page table locks and you're going to allocate the spinlocks instead of just having them in the struct page. And I think the only legitimate configuration that triggers that is if you have CONFIG_PREEMPT_RT; that will cause your spinlocks to be bigger and you're going to hit that issue. I think also on 32-bit architectures it might, but I'm not sure. I'm not entirely sure. I'm not entirely sure how we want to handle that. We could definitely modify the code to also defer the freeing of the split page table locks. That brings me actually to the point that I was trying to come to, which is that the way we currently free page tables is a mess. And I think, like, we should defer that whole destructor, like, there is something called a destructor for page tables; we should find a way to defer that to the actual freeing. I have no idea how we would do that, but maybe that goes in the same direction as what you proposed. I've worked on that code before. I have thought about doing that, and I just didn't see a particular need to do it that way. You have a use case. Let's do it. 
I can whip that patch up for you in 15 minutes. Let's do it. Okay, thanks. Easy. I thought we were going to fight more. Sorry. What would be a nice conclusion? Well, I think the best plan would, well, first of all, the maple tree stuff I have out now doesn't conflict with either path forward. So that's great. If FS reclaim went away, that would be a great conclusion, but I don't think that's going to happen. So we're going to have to figure out allocations outside the lock, the mmap lock, for certain things. This is still really, this is step two, right? And then there's other things that can be done to go further in our grand scheme of the beautiful, sunny, rosy future. Matthew, you want to? There are some interesting problems we've been having around slab preallocation, and it's like, ah, yeah. So the basic problem, and we have one of the slab maintainers in the room, this is fantastic. The basic problem is the usual: I'm holding a spinlock and I need to allocate memory, right? And so, you know, you don't want to go into reclaim, et cetera, et cetera. So what we've been trying to do is preallocate at the top and then take the spinlock and go through. The code paths, yeah, this is updating the maple tree, yeah. Perhaps the worst is an mprotect in the middle of a VMA, so we need to allocate three new VMAs and we need to allocate three times the height of the maple tree plus one nodes. So we need to get quite a lot of memory preallocated to be sure that we won't need to allocate memory when we get all the way down to the bottom of the tree and find out that we're in our worst-case scenario. But of course that is the worst-case scenario, right; generally we're not going to need it. So what we really want is a very efficient way to have the slab allocator say: from this slab, I want 28 objects; and then a short while later: here's 26 of them back. Mempool, really? Oh, that's the classic hack, yeah, I hate mempools. 
It's perhaps irrational of me, and perhaps we should just be using mempools, but, I mean, I've gone outside my boundaries and I've looked into the slab allocator. It's like, you know, why not just give us a detached freelist, and then we just pop a couple of things off the top of it and then hand it back and say, here's your detached freelist back, and maybe? He's not saying no. We'll see. Okay. It's not a no. We could look into this. Okay. We'll have to do it. He's probably watching on the screen. Oh, there you go. He's yelling at us on IRC right now. Yeah, so I really expected a lot more fighting on this. I don't really have anything else. Did you have anything else? Did you want to talk SPF now? Well, I think we're going to start with SPF soon. Yeah? I mean, I want to say, in general, I think we agree on the big directions, that we want to do lockless things, but it's the details. First, we keep fighting on the details, but also, I mean, whenever we try it, there's always a few things we didn't see coming. I mean, I think it's time we get started actually getting stuff in, because we've been talking about it for a long time. I might have a question regarding that. So what scared me a bit, scared is the wrong word, but with the SPF series it was that it introduced quite some subtle lockless-versus-locked semantics to a lot of page fault handling code. That scared me a bit. It made the code, in my opinion, significantly harder to read and understand. With the approach that you're proposing, would it also be true that we would have similarly complicated page fault handlers, or would it just feel much more natural? Let's call it that the delta, for people that don't know their way around the page fault handlers, would be smaller? I think the delta would be smaller. I'm not supposed to move away from the podium. So what I have up on the slide is the state of today. What I would change from here is that the mmap read lock would not be taken the first time round. 
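The detached-freelist idea described above ("I want 28 objects... here's 26 of them back") can be sketched in userspace C. This is not the slab allocator's actual interface, just an illustration under invented names: a singly linked list threaded through the free objects stands in for the slab, so popping and returning objects never touches the allocator's locks:

```c
/* Userspace sketch of a "detached freelist": grab a batch of
 * objects up front, pop only what the worst case actually needed,
 * and hand the remainder back in one operation. Names invented. */
#include <stdlib.h>

struct node { struct node *next; };

struct batch { struct node *head; int count; };

/* "Preallocate": build a detached list of n objects. */
static struct batch batch_get(int n)
{
    struct batch b = { NULL, 0 };

    while (b.count < n) {
        struct node *nd = malloc(sizeof(*nd));

        nd->next = b.head;
        b.head = nd;
        b.count++;
    }
    return b;
}

/* Pop one object off the detached list (no allocator involved). */
static struct node *batch_pop(struct batch *b)
{
    struct node *nd = b->head;

    if (nd) {
        b->head = nd->next;
        b->count--;
    }
    return nd;
}

/* "Return": give the whole remaining list back in one go. */
static int batch_put(struct batch *b)
{
    int returned = b->count;

    while (b->head)
        free(batch_pop(b));
    return returned;
}
```

This mirrors the maple-tree use case: preallocate for the worst case before taking the spinlock, consume a few nodes, and return the rest cheaply.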
Once you return with FAULT_FLAG_RETRY, we would in fact take the mmap read lock. It's going to be: if first time round, take the RCU read lock; else take the mmap read lock. So it's not going to be a huge semantic change there. It's a few extra lines of code, but depending exactly how we solve the p4d alloc, pud alloc, pmd alloc thing, that's a tiny little bit of extra code there. While you walk the page table tree, you check that it doesn't go away from under you? No, because it's all under the same RCU section. You have to care about that because you have different RCU sections, but I've got one big RCU section. So I can do all of this stuff speculatively and then check that the VMA hasn't changed at the end. RCU won't have to disable interrupts for that. I mean, it's not true today, but sure. Once the page tables are properly being freed by RCU, we won't need to do that stupid interrupt-disabling dance. That might also be true for when we acquire the page table lock, that sort of thing. I mean, we have the same issues in, like, two or three places, and right now I kind of do the little dance every time to make it safe, but sure. So there's going to be a bit of extra code that's not on this slide, where we do the actual insert into the page table, and we'll check the VMA there just to make sure it's not dead. But I see very little change in the file-backed path; and I think about file-backed stuff because fundamentally I'm a file system guy. I don't think about anonymous memory because I'm not really an MM person. Don't lie. I'm a file system person at heart. I don't understand these unnamed pages. They make me uncomfortable. Anon is a lot of the same, but it's a lot more likely that you will have to allocate a page. And then at the end, when you have your page, you kind of have to check you still have the right VMA. 
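The "first time round take the RCU read lock, else take the mmap read lock" control flow can be sketched as follows. This is a userspace simulation with invented names; the real lock and unlock calls are shown only as comments, and a flag stands in for "the fault needed to sleep":

```c
/* Userspace sketch of the retry control flow: first attempt under
 * RCU; on FAULT_FLAG_RETRY the arch handler retries the same fault
 * under the mmap read lock. All names are invented. */
#include <stdbool.h>

enum { RES_OK, RES_RETRY };

struct fault_ctx {
    bool needs_slow_path;  /* e.g. must do I/O or a blocking alloc */
    int  rcu_attempts;
    int  locked_attempts;
};

static int handle_mm_fault_sim(struct fault_ctx *c, bool under_mmap_lock)
{
    if (under_mmap_lock) {
        c->locked_attempts++;
        return RES_OK;          /* slow path may sleep, always completes */
    }
    c->rcu_attempts++;
    return c->needs_slow_path ? RES_RETRY : RES_OK;
}

static int do_user_addr_fault_sim(struct fault_ctx *c)
{
    /* rcu_read_lock(); */
    int res = handle_mm_fault_sim(c, false);
    /* rcu_read_unlock(); */

    if (res == RES_RETRY) {
        /* mmap_read_lock(mm); */
        res = handle_mm_fault_sim(c, true);
        /* mmap_read_unlock(mm); */
    }
    return res;
}
```

The point made above falls out of the structure: in the common case the fault completes in one RCU-only attempt, and only the occasional slow fault pays for the mmap read lock.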
But that's one of the things where it might not be convenient to have the same RCU section, because most of the time you may have to allocate a page, or at least a lot more often than in the file case. I guess something that's going to change a lot of this, for both file-backed and anonymous, is using larger pages. Once we start deciding to allocate even, like, order-4 pages for both file and anonymous, we're going to see pmd alloc be needed many times more often. And I think that's going to change the whole cost-benefit analysis; or if it doesn't, it wasn't worth doing. I think having one single RCU section or several, I don't think it's such a big deal. We could always terminate the RCU section, allocate pages, whatever, start the next RCU section, check if the VMA is still not expired, whatever the expired bit is called. You can, because you've got the sequence count and you know the mm_struct hasn't gone away. We can't, because the VMA may have gone away. So if we drop the RCU lock, we have to redo the lookup, we have to re-call find_vma. Now, it's in a maple tree rather than an rbtree, so it's going to be quicker to find, and that may not be a huge performance penalty to do that. But it does mean that I do want to see us at least try to allocate a PMD page before we give up and say, oh, just drop the lock and try again. So the way I do that in SPF is that I actually make a copy of the VMA originally, when I get the VMA, and then I can do my check using the sequence counter. I have a sequence counter that's updated by any write. But that won't work with you always looking at this expired-VMA bit. That means you're preallocating, though, too, right? You're making a VMA copy. But it's on the stack. Oh, okay. Make sure it doesn't get too big. Also, VMAs have pointers; I guess you don't check those, the pieces that VMAs have allocated in the VMA itself, right? So if there's anything you need to check in there, don't check it. There's a piece of the VMA. 
In the VMA, what is the name? I don't remember the part that gets cloned. When you clone a VMA, there are certain things that get allocated besides just the start and end. Yeah, the reference structure. Okay. So don't use them. One thing that I would like to ask, and we have discussed that two years ago and probably more in the past, is that with the maple tree, do you think it's still worth considering the range-locking path, or is moving straight to RCU essentially the only reasonable choice? Because with the kind of guarantees that the maple tree gives you, maybe just getting the look-up to be RCU-aware and doing the rest by range-locking, tied to the VMA, means we have less data structure to look at. And probably that might help a lot without too much subtlety. And doing the range-locking with the maple tree is like a half-step to RCU look-ups. Right, because that might show a good performance improvement already, because you rarely do page faults from different threads on the same VMA. And, I mean, the rbtree was terrible for that kind of thing, because you have to do all the rotations when you manipulate stuff, but this should just make it so much easier. So have you considered that, or is it just a dead end? Yeah, so I was looking at that, and I was actually looking at, because it's a range tree, you could potentially have a lock per node, but it just takes up too much space in the node to do that. And then we started looking at just locking on ranges, and it's a lot of complexity only to turn around and throw it out. So I'm not sure you buy much by just doing the half-step. That's my opinion, anyway. So one of the approaches that we've explored, but not written code for, is that we could put essentially a read-write semaphore into the VMA. And then each, I've forgotten all the details of this, because I thought about this a year ago and then I went off to work on folios. 
So when we look up the VMA, we're using the entire VMA as the range lock, essentially. Yeah, that's what I have in mind. And so you would still have contention on the VMA as you acquire the rwsem for read. But then you can drop the RCU read lock at various different points, because you've got the VMA for read. Right. And you know it's not dead. So that actually solves a bunch of problems, but it does then create contention on that one VMA. What I've been describing is the write-less path. Or at least we're not writing to the VMA struct. We're writing to the page tables, sure. But, I mean, that's kind of the point of a page fault, to write to the page tables. What I've been describing is a write-less path. And, yeah, there is definitely a version of this which is lockless until you get to the VMA. And we could absolutely do that. And I'm perfectly happy for us to iterate towards an end goal, if the community at large is willing to go through all these locking changes over and over again. And maybe we would never get to the write-less stage that I've been describing, because it would just be good enough to be RCU. But I think there are applications that have these giant VMAs, like terabyte-sized VMAs, that are going to say, well, thanks, but you haven't solved my problem. Yeah, that might be a good push for later work for you. I was just going for the 10-out-of-10 gold-plated solution to the problem. I mean, I'm happy to go for the 8-out-of-10 solution first, if people want that. Just recognizing that it will be more disruptive eventually, over the long term. Is there anything else? For there to be contention on the VMA semaphore, it would have to mean that a parallel operation has taken place. That VMA could be going away, in which case the fault is racing with the thing just disappearing. So I don't think the contention on a VMA semaphore would be as severe as it is on the mmap_sem. Hey, Mel, great to hear from you. Thanks for driving in. Yeah, I mean, you're right. It's not going to be nearly as bad. 
I just think that, for some workloads, there are going to be some applications that say you haven't helped me. The gamble would be that someone that's creating a very large VMA is likely managing it themselves, and has done it for the express purpose of avoiding mmap_sem. While there would be applications that would have terabyte-sized VMAs, chances are they're managing their own memory quite explicitly for the express purpose of avoiding any parallel operations, meaning it's also less likely to see any contention. There would be some cache line bouncing, acquiring it for read, but the level of contention that you'd have for a threaded application that is allocating and faulting in its own address space is completely different to what it is on just pinning the VMA itself. Okay. I mean, if people would rather that we take that step forward and then only later go to this, I'm perfectly happy to work on that. How about you, Liam? I mean, when I was looking at the range locking, I was looking at locking each individual layer of the tree as we walk down, but if we're just going to RCU read lock and then lock the particular VMA, then, yeah, totally. We started already talking about SPF, and there was an introduction planned, so I'll have to cut it short a little bit. So I just want to present the current state of things.