Tom here from Lawrence Systems. I did a video on XCP-ng and over-allocation of memory: how Xen handles shrinking and growing memory in a running VM, how that passes between different servers, and the ins and outs of it. I asked in that video whether you'd like to see a video on over-provisioning CPUs, and a lot of people answered yes. I also just ran into a consulting job where the CPUs were over-provisioned in a bad way, and I thought, hey, I should really do this video, because there's some clarification that I think would be helpful. It's kind of fun to play with, and it's something I encourage you to play with in your lab to gain a better understanding.

Now, if you don't watch the whole video, the short answer is yes, you can. No problem: you can have all these different VMs assigned the full number of CPUs available in a system, and it will work. But there are some exceptions when you don't want to do that, and we're going to get into the more detailed ones, like: if you have too many things requesting too much load at once, you're going to get inefficient.

I also wanted to talk about NUMA boundaries. I won't be covering that in detail, but I want to make sure you're aware they exist. What that means is that when a VM has more vCPU cores allocated to it than exist in a single CPU on the underlying host, it will have to cross the NUMA boundaries when performing CPU instructions. In other words, if you don't have a single-processor board but a multi-processor board, and each of those processors has X number of cores, then yes, you can assign cores that are in two different CPUs to a single virtual machine. But once you start crossing NUMA boundaries, you end up with memory traffic going back and forth between the processors, and it can become very inefficient. So it's probably not the best idea to do that. Maybe someone will leave a comment down below with an exception where that's a good idea, but I generally find it not to be. So if you have a dual-CPU system with 24 cores in each CPU, you don't want to assign more than 24 to a single VM. That way it doesn't have some of those running on the other CPU. Xen is aware of the number of cores and where those NUMA boundaries are, and it has been for a long time, so XCP-ng will manage that perfectly fine. If you assign a certain number of cores, it won't grab a few cores from each one of the processors. It's smarter than that.

Now, as far as over-provisioning CPUs: it's actually not a bad thing to do, and I've got mine way over-provisioned, because I know that all these VMs I have don't work in concert with each other. One of the real reasons you may want to do this is that you have workloads that are ephemeral, where the power is there when you want it. For example, I run Graylog, and when I do a query, I want Graylog to have all the power. When I compile Xen Orchestra from source, I want it to have all the power. But I'm usually not compiling Xen Orchestra at the same time I'm querying Graylog. And if I am, they won't cause the system to crash. They'll just fight with each other a little bit over who gets the most resources. I'll leave a link to an article, because you can weight the CPUs inside of Xen for each VM, so you can have a winner versus a balance. That's kind of a "you" decision, and there are ways to do that. But generally, because I'm not doing those things at the same time, it is very efficient for me to do this. I have all the power when I need it, and when everything goes idle, all that extra idle time can be passed along to any other VMs.

So let's dive into some of the intricacies and show you how it works. It's actually pretty simple, and I will show you that yes, you can do this without restarting a Linux VM. You can just dynamically size the processors up and down.

Let's start here in the Xen dashboard. You can see I've got 48 vCPUs out of 24, so I've already doubled my allocation. Let's go over to the host itself so we can see that this is a 24-core system. There's a link down below to the build of this system. It's an AMD Ryzen 9 5900X 12-core processor, so it has 24 threads available, but physically it's a 12-core processor.

We'll go back over to the two VMs that we have, Phoronix lab 1 and Phoronix lab 2, and we see there are 24 CPUs assigned to each. The first thing that may come up is: how do you reassign CPUs? Do you have to shut the VM down? That depends on the operating system. It is very dependent on whether or not the operating system supports dynamically changing the CPUs. Right now I've got 24 cores assigned, so let me walk you through downsizing it. If I SSH in and run htop, you can see that there are 24 cores. Let's go ahead and shrink it down to 12 cores. Switch back over, and you'll see all these becoming absent. So yes, you can absolutely dynamically resize these. This is another option if you want to temporarily give something more power: you can give it more power and then revert that back down later. I want it to have 24 cores again, so we're going to go back over here, put in 24, and it'll bring all the cores back up, and they'll all become available to this system again. Pretty slick that it supports that.
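If you want to confirm the resize from inside the guest without eyeballing htop, here is a minimal sketch in Python. It assumes a Linux guest, where the kernel lists the online CPUs in /sys/devices/system/cpu/online using a range format such as 0-11. The function names are made up for illustration and are not part of XCP-ng or any library.

```python
from pathlib import Path


def parse_cpu_list(text: str) -> list[int]:
    """Expand a kernel CPU list such as '0-11' or '0-3,8,10-11' into CPU indexes."""
    cpus: list[int] = []
    for part in text.strip().split(","):
        if not part:
            continue  # empty input or stray comma
        if "-" in part:
            start, end = part.split("-")
            cpus.extend(range(int(start), int(end) + 1))
        else:
            cpus.append(int(part))
    return cpus


def online_cpu_count(path: str = "/sys/devices/system/cpu/online") -> int:
    """Count the CPUs the guest kernel currently reports as online (Linux only)."""
    return len(parse_cpu_list(Path(path).read_text()))
```

Calling online_cpu_count() before and after the resize should show the count dropping and coming back, which matches the cores going absent and returning in htop.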
I didn't have to restart it. Of note: that may still crash services that run on the system. If those services have a dependency where they count the number of cores and spawn their processes based on that information, and you then downsize it, you could cause some crashing or overloading problems, or certainly some inefficiency, because they would have spawned a number of processes expecting a certain number of cores. So it is still very application-dependent to be able to do this. Just some food for thought.

One more thing that's worth noting. You may run into this problem: going over here to Advanced, you may notice that you were not able to expand yours without shutting it down the first time. But once you've done it, you'll see this right here, and that's the CPU limits. The CPU limits are set to 24, and currently we have 24 of 24. But if we downsize this back down to, say, 12 and go over to Advanced, it will then show us only using 12 out of the 24. The other issue you may run into is with this second number, the max. You have to have the VM on a host that supports that max. This can be a challenge when you migrate the VM to another host and that host doesn't have as many cores: you'll get a "no host available" error when you try to start it. You can stop the VM and, while it's off, set the CPU limit, the high limit, to how many CPUs you want available to it.

Now, you may have noticed that the virtual machines were named Phoronix: Phoronix lab 1 and Phoronix lab 2. We're going to do a live demo here in a moment, but I wanted to show you the results from running the Phoronix Linux kernel compilation. The Linux kernel build does saturate all the CPU cores. When we assign 24 cores to a single VM, and only that VM is running on the system, so it's just by itself, it took 68 seconds to compile the Linux kernel. If we assign 24 cores to each of two VMs and both machines compile the Linux kernel simultaneously, the total time was 131 seconds for each VM. I don't have a way of displaying both of them at the same time, because Phoronix uploads each result separately, but that took just about double the time, which is as expected. It's actually kind of surprising: I'm guessing a little bit of efficiency was gained, because it took slightly less than double the time (double 68 would be 136). But when you're saturating the CPU fully, the hypervisor is going to take that CPU time and balance it between these two VMs.

But if you have processes that are kind of up and down, and you have two different systems, they may not align, and you may actually gain efficiency by having them over-provisioned. One is running a completely separate task that needs CPU some of the time but not all the time, and the same goes for the other, and that's where the system can still be more efficient even though you've over-provisioned CPUs.

All right, so I'm logged into my Phoronix lab 1 and lab 2 machines. We're just going to run phoronix-test-suite benchmark build-linux-kernel so we can suck up some CPU time. Option one. I don't care about saving the results, because I already did that in the previous test. Now we can watch the system get loaded up, and there we go, you can see it ramping up right here. We can switch over to the stats inside of here, and same thing: it's pinning all the cores. And if we go to the Ryzen system itself and look at the stats, it's rising quite a bit here. Now, I've got the other VM running, but it's not doing anything. This one's idle, so it's barely taking anything away from the other. I could stop it, but it's not doing much. So once again, I've got this VM just pulling all the CPU.
So it's peaking out right here, and we can go back over to Netdata and see that, yeah, we're pretty solid on CPU usage. But let's go ahead and stop that, because I want to show you what happens when you run both at the same time. What we're going to do here, and this is tmux if you're not familiar with it, is set the panes to be synchronized so I can type the same command into both windows at the same time. The command to do that is setw synchronize-panes, so we're going to go ahead and do that. Now the panes are synchronized, and if I type, you can see it goes to both windows at the same time. We want it to go ahead and clear, and we're going to go ahead and kick off the Phoronix build. So we're running the Phoronix Test Suite simultaneously on both machines. Option one for both, don't need to save it, and I want to show you what the CPUs do differently now.

It's going to take a second to ramp up, so we'll get this out of the way, and while this is ramping up, we'll look at what the CPU time looks like on a normal run here. Specifically, let's look at the CPU steal time, which is next to nothing. This is that small 1% you saw that the other system was using. CPU steal time means that the VM was waiting: even though it had been given the processor, it was waiting for its commands to run. It's pretty low right here, a small percentage, so nothing you really have to worry about.

Let's go over here to the current time and zoom in, and here's the ramp-up. But you'll notice the yellow is not quite as big. Let's go down to the steal time. The steal time is running at roughly 50% here. What that means is we have two VMs fighting over this, and the hypervisor is going back and forth. Because we didn't weight them differently, they're weighted the same, so it can only split that CPU time evenly between these VMs. So the steal time is that time in between, the VM saying: you said I have a processor, but you're not actually letting me use it.
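To put a number on steal time yourself, here is a minimal sketch in Python. It assumes a Linux guest whose kernel reports the steal column on the aggregate cpu line of /proc/stat (the eighth counter after the cpu label). The function names are made up for illustration, and this is not how Netdata or htop are implemented.

```python
# Counters on the "cpu" line of /proc/stat, in order. The trailing guest
# columns are left out because guest time is already included in user/nice.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line: str) -> dict[str, int]:
    """Parse the aggregate 'cpu' line of /proc/stat into named tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def steal_percent(before: dict[str, int], after: dict[str, int]) -> float:
    """Share of elapsed CPU ticks between two samples that were stolen."""
    total = sum(after[name] - before[name] for name in FIELDS)
    if total <= 0:
        return 0.0
    return 100.0 * (after["steal"] - before["steal"]) / total


def sample(path: str = "/proc/stat") -> dict[str, int]:
    """Take one sample from the first line of /proc/stat (Linux only)."""
    with open(path) as stat_file:
        return parse_cpu_line(stat_file.readline())
```

Take one sample(), wait a few seconds, take another, and pass both to steal_percent(). With two equally weighted VMs both saturating the host, as in this demo, you would expect a figure in the neighborhood of the 50% shown on the chart.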
So that's measured in the Linux kernel as steal time, so you can actually see when you're having a problem. By the way, Netdata is a really cool tool for this, but you can also see it using things like htop, where you can go in and see what the steal time is. But hey, I think it's cool to use Netdata here to get an idea of what's going on, and this is what gives you the tools to troubleshoot the CPU and understand what's happening. If we go over here to this system, you can see this is where it ran before, and this is what it looks like running now: it's only getting about 50% of the CPU. If we go to the other one, it's also only getting 50% of the CPU.

Now, hopefully this video left you with a better understanding of how CPU over-provisioning works and why it may or may not be a good idea, depending on the workload that you have. Also, check out my video on Netdata. It's a great tool to help you troubleshoot some of these problems, and that's linked down below, along with the CPU weight documentation over at XCP-ng. And do a little googling on NUMA cores. That can be a fun deep dive into how all of that works. As always, head over to my forums for more in-depth discussion, or leave your thoughts and comments down below. I love hearing from all of you. Thanks.