Okay, let me say why this led to a new zone, rather than an existing zone, a node, or a hardware approach. So firstly, as we partially mentioned, for the existing zones: ZONE_NORMAL, to summarize, is not pluggable. It is not pluggable, so it basically only works for DIMM DRAM devices. And ZONE_MOVABLE does not allow pinning, and it is also meant for DIMM DRAM devices.

But I don't think the zones are specified to be for DIMM DRAM devices. I mean, why can't they be used for CXL devices?

Okay, yeah, please understand; what I want to say is that CXL DRAM has different hardware characteristics. I will mention them first. So I think there are some CXL hardware characteristics that the MM should be concerned with: performance can change dynamically due to link negotiation, QoS throttling, and error handling. There is a different RAS mechanism than DIMM DRAM, and switch and fabric connection errors can happen more often. For sharing purposes, security issues have to be addressed. And CXL DRAM allows asynchronous operations in the background; for example, data sanitization commands run as background operations. So I think those are hardware characteristics the MM should be concerned with.

But I mean, we have background scrubbers for DDR, we have post-package repair for DDR; these aren't unique to CXL. We have error handling, RAS, scrubbing, all these background operations that happen on DDR. Yeah. Those also happen on CXL. So CXL is not unique in that way.

Well, probably, but I'm not sure it would be 100% the same as DIMM DRAM.

It's not 100% the same, but it's close enough to not have to throw away the model we have here. It's the case that some people have had DRAM that comes up wrong, that trains at the wrong number of channels or the wrong speed; these are problems that we have on DDR too.

Okay, so the features I mentioned here are not 100% exclusive, but I summarized some CXL features that are different from DIMM DRAM, which even the MM needs to consider. So yeah.

The other point I'll make is that we can run with nodes until we can't run with them anymore. There's nothing that stops us from getting more complicated later. But I'm trying to find the summary of why we don't add zones, because it's just another dimension of manageability that the core MM has to be responsible for.

Okay, let me say further for this slide, to explain why we selected a new zone. So, ZONE_NORMAL and ZONE_MOVABLE are not geared for CXL DRAM, and ZONE_DEVICE, as we all know, does not allow page allocation. So we thought: a zone. Then why not a node? There are three reasons.

Firstly, we want to inherit the MM background, because we know the node is the topmost level of the MM hierarchy, and a node usually abstracts multiple memory channels. So if CXL memory becomes a single node, we would need to newly devise a larger level of management above it; let's call it the supernode. Whereas if you manage it as a zone unit, it is better reuse of the existing node and zone code, because the node remains the largest unit.

And the second reason is we also wanted to expand the MM hierarchy. As I mentioned, in the current Linux implementation, the zone actually implements some specific MM algorithms like compaction, reclaim watermarks, migration, and anti-fragmentation. So I think those features can be tailored for CXL DRAM (see the sketch below).
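For reference, here is a minimal sketch of where such a zone would sit, assuming the ZONE_EXMEM and CONFIG_EXMEM names from this proposal (neither is upstream); it is simplified from the real enum in include/linux/mmzone.h and is only an illustration, not the actual patch:

    /* Simplified from include/linux/mmzone.h; ZONE_EXMEM and CONFIG_EXMEM
     * are this proposal's names, not upstream ones. */
    enum zone_type {
        ZONE_DMA,       /* legacy devices with narrow DMA masks */
        ZONE_DMA32,
        ZONE_NORMAL,    /* ordinary DIMM-backed memory */
    #ifdef CONFIG_EXMEM
        ZONE_EXMEM,     /* CXL-attached memory, per the talk */
    #endif
        ZONE_MOVABLE,   /* allocations that must stay migratable */
        ZONE_DEVICE,    /* device memory; no page allocation, as noted */
        __MAX_NR_ZONES
    };

Each struct zone already carries its own reclaim watermarks (WMARK_MIN, WMARK_LOW, WMARK_HIGH) and free lists, which is the per-zone tuning the speaker is referring to; the debate that follows is about whether that tuning belongs in a new zone or in per-node configuration.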
And the third reason, and probably people here will disagree, is less dependency and maintenance effort. The node is more widely coupled with other kernel subsystems and with user space than the zone, so a zone requires much less code modification, and therefore probably fewer potential side effects and less maintenance burden. So those are the reasons we propose a new zone. And one plus versus the hardware way: from a functionality point of view, hardware and software can coexist, since the architecture I explained works after OS boot, driver by driver, I think.

Yeah, I really think, if I'm channeling the room, that this line scares people, that we don't want to have different algorithms. So I know you're saying this is less maintenance effort, but this looks like the maintenance effort of having memory-type-aware compaction and reclaim algorithms. We didn't change the kernel because we went from DDR1 to 2 to 3 to 4 to 5, so we're not going to change the algorithms for going from CXL 1 to 2 to 3 to 4 to 5. That makes sense.

What I want to say is, you claim that a zone requires much less code modification, but that's just not true. I mean, all of a sudden you're specifying a new mmap flag to be able to say that you want to allocate from a particular zone (sketched below). So now you're exposing zones far more widely, and I think that's what's really scaring people: all of a sudden zones become much more visible, and all we see is that everything you want to do with zones, you can do with nodes. So yeah, you can enhance the concept of zones, you can enhance everything so that zones can now do everything that nodes can, but, I mean, you can take a plane, cut the wings off, and drive it on the road, but you haven't made a good car.

And also, how many zones do we need, right? Ten, twenty, a hundred? A single zone simply cannot describe all the variety that comes with the new technology these days, and this might be a completely different thing in upcoming years. So I agree with Matthew that we shouldn't expose zones outside of the core MM, not to mention to user space. Because once you add an mmap flag for this special zone, do we want to have the same thing for ZONE_NORMAL, ZONE_DMA, and so on? There is a lot of complexity down that road.

Well, regarding the demand for applying different algorithms on a new zone: we would probably provide an ABI to set the levels of these algorithms for the new zone. For example, if we use ZONE_NORMAL, then the same ratio is applied across that zone; with a different zone, we can apply a different ratio for the algorithms.

So let me just try to understand what you are trying to do. You say we're going to add a new zone so we can have smarter algorithms working on it. What would stop you from using what we propose, just using a node and configuring the node to have a different compaction or whatever mechanism? I mean, as Michal said, most of your memory will be in ZONE_NORMAL or ZONE_MOVABLE, most probably in one of these. So just configure the node to have something else if you really need it, if you really, really need it. But I don't think that you need a new zone just to make something differently configurable, or hard-coded, even worse, hard-coded in the kernel.
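To make the objection concrete, here is a hypothetical user-space sketch of the kind of interface under debate. MAP_EXMEM is not an upstream flag; its name and value here are assumptions, standing in for whatever the patch set defines, purely to show how a zone would leak into the mmap ABI:

    #include <stdio.h>
    #include <sys/mman.h>

    #ifndef MAP_EXMEM
    #define MAP_EXMEM 0x800000    /* assumed value, illustration only */
    #endif

    int main(void)
    {
        size_t len = 1UL << 21;   /* 2 MiB */

        /* Ask for anonymous memory specifically from the new zone. */
        void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_EXMEM, -1, 0);
        if (p == MAP_FAILED) {
            perror("mmap");
            return 1;
        }
        ((char *)p)[0] = 1;       /* fault a page in from that zone */
        munmap(p, len);
        return 0;
    }

This is exactly the visibility the reviewers are worried about: once user space can name one zone in the mmap ABI, every zone becomes a candidate for its own flag.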
What stops you from going with nodes and then configuring that node, because it's so special, because it's so slow, to behave slightly differently in certain scenarios? That makes much more sense to me. Maybe... I see shaking heads, so.

And even with that, why would you need different algorithms for slower memory, for example? It's slower, but you still need to compact, to reclaim.

Because the system has hybrid memories, near and far, so there is probably some reclaim contention due to some kind of migration between the nodes.

But that's already encoded in the node distances. And you want to migrate to the best node you can anyway. And I'll also note that the other memory type we're dealing with on systems today is high-bandwidth memory, and we haven't done anything different. And that's radically different: CXL is supposed to be DDR-ish, while HBM is a whole other class that would need a whole other management scheme. And we haven't even thought about a new zone for HBM.

Okay, and regarding the allocation path that Matthew mentioned, let me explain some more. As I mentioned, we extended the syscall flags and added GFP_EXMEM, and how it works is that these flags traverse ZONE_EXMEM's free pages. So yeah, here, right here. ZONE_EXMEM allocates through the buddy system, the internal allocator, with the extensions that we made. So we think it is not that complicated; rather, the configuration can be made at OS boot and after OS boot. The CXL memories are integrated into the same zone, so allocation latency is not that long, I think, because the zone is already set up. When the application layer or kernel space requests a memory allocation, a page allocation, what happens is it just finds this zone and takes from the free pages. So I think it's better than looking up some kind of performance table at allocation time (a sketch of this path follows below).

Sure, yeah. You've looked at libmemkind?

Yes.

And it doesn't do what you need?

Well, actually, when I looked at memkind, it doesn't support CXL. So we extended our own library.

Okay, so rather than add 50 lines of code to libmemkind, you decided to do all of this.

Okay, but another issue with memkind is that it is a third-party library, so not every application can use it. So we thought the kernel should natively provide this. Yeah.

Yeah, I think we're over time, but I kind of want to land this plane. This car... I don't land this car. I think, so, I'm not hearing people being convinced, or scared about the ability to use nodes. That said, this community is willing to be proved wrong, but it needs to be proved wrong with a hard "I have this use case that I cannot do with nodes". "It might be better" or "this might be more efficient" is not sufficient to overcome this "hey, we just want this to look like every other node". So it isn't a "well, this is like a node today"; come back with the hard, can't-do-this-without-this-change kind of argument. That's how you move this forward. That was how MAP_SYNC came into being: it was, I think, a two-year discussion. We were like, we have no other way we can think of to tell the kernel, hey, we're not going to be calling msync on you, you need to sync your metadata. And that was two years, until we finally convinced people: yeah, okay, I think we need this.
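Here is a rough kernel-side sketch of that allocation path, assuming the proposal's GFP_EXMEM/ZONE_EXMEM names (not upstream); it only illustrates a gfp flag steering the buddy allocator to a zone, the same way __GFP_DMA steers it to ZONE_DMA today, and is not the actual patch:

    #include <linux/gfp.h>

    /* __GFP_EXMEM is the proposal's flag; with its patches applied,
     * gfp_zone() would map it to ZONE_EXMEM. */
    static struct page *alloc_exmem_pages(unsigned int order)
    {
        /*
         * The zonelist walk starts at ZONE_EXMEM, so the buddy
         * allocator returns free pages that were onlined into that
         * zone at boot or hot-added later; no per-allocation lookup
         * of a device performance table is needed.
         */
        return alloc_pages(GFP_KERNEL | __GFP_EXMEM, order);
    }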
So that's the kind of hill you're climbing at this point: something like a two-year effort to convince the community to get a new mmap flag. That's the bar. But yes, please continue to engage, and we'll definitely change our minds with evidence.

One last question. In terms of job scheduling in a fleet, how would multiple tiers of jobs take advantage of this proposal, if that's possible? Let's say I have a tier-one job and a tier-two job. How do I map them with this proposal? Because with nodes, we might be able to bind, you know, the two jobs to different nodes. With zones, is there a way to do this?

Okay, we want a tier-one job to use... well, I think the way to aggregate the bandwidth or the capacity would be similar whether it is a zone or a node. But in our case, how it happens is, yeah, here. As I mentioned, multiple CXL channels can be aggregated into one zone. And how we do it is we have some CLI commands; you can think of it kind of as RAID. We think of it as a kind of software RAID striping, so when you make a group, for example target node 1 and devices cxl1 and cxl2, then the devices are aggregated in here, and then it performs a round-robin algorithm. It interleaves in a software way (sketched below). But I thought this part could be replaced by the way the memory device and host bridge interleaving works; I think this part can be replaced by, or coexist with, that proposal. So basically it is not the scheduling of tasks; I think we are talking about how to aggregate the CXL memory capacity or bandwidth. Thank you.

Yeah, yeah, I think we're out of time, but thank you, and please continue investigating these cases that we can't do. Rick, thank you very much.
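As a toy model of the software striping described above (the device count, stripe granularity, and function are all assumptions for illustration, not the actual CLI's behavior):

    #include <stddef.h>

    #define NR_CXL_DEVS 2        /* e.g. the cxl1/cxl2 group above */
    #define STRIPE_GRAN 4096     /* assumed striping granularity   */

    /* Map a zone-relative offset to (device, device-local offset)
     * with round-robin striping across the grouped CXL devices. */
    static void stripe_map(size_t zone_off, int *dev, size_t *dev_off)
    {
        size_t stripe = zone_off / STRIPE_GRAN;

        *dev = stripe % NR_CXL_DEVS;                  /* round-robin  */
        *dev_off = (stripe / NR_CXL_DEVS) * STRIPE_GRAN
                   + zone_off % STRIPE_GRAN;          /* within dev   */
    }

Consecutive stripes land on alternating devices, which is how grouping multiple channels into one zone can aggregate their bandwidth, the same effect that hardware host-bridge interleaving achieves below the OS.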