Okay, welcome everyone. I'm Chan-Sin Chinpasa, and this is my colleague, Khomkit. We are from NIPA Cloud in Bangkok, Thailand. Today we are going to talk about performance tuning and resource management for dedicated-core and shared-core flavors.

The agenda for today: first, how we decide flavors for our public cloud customers; then the shared-core tuning guide, the dedicated-core tuning guide, and finally the conclusion and future work.

In our company we have a marketing team that talks to our customers and gathers their requirements. The marketing team concluded that the main factor customers want from a public cloud in Thailand is high availability, or HA, and the second is cost and discounts, convenient self-service, and pay-as-you-go. For high availability, last year we had a session in Berlin where Dr. Apisak talked about how we designed OpenStack for HA. As the picture shows, the design puts the controllers across three sites, all active, with zones for the Nova cell controllers, and we use Tungsten Fabric instead of Open vSwitch. You can watch it again on YouTube to see how we designed an OpenStack public cloud, including the networking side.

Today we are talking about the second factor: cost and discounts. First of all, we need to understand the user requirements and design flavors that fit them. After that, we need to decide the compute-node hardware spec for each type. Talking to customers, we can group the requirements into two parts. The first part is non-critical applications: low-traffic web servers, CMS, blogs, dev/test servers, small databases, or a machine that just keeps a code repository. I call this group non-critical because it can tolerate downtime and does not need peak performance. The second part is mission-critical or performance VMs: for example, high-performance SQL, in-memory caching and indexing (memcached, Redis), JVMs, video streaming, and video encoding.

Here is an example from one of our customers. They moved from on-premises to our cloud, with Kubernetes clusters for their dev servers and production servers. The customer requirements were how to optimize cost for the non-mission-critical services, and how to maximize performance and stability for the Kubernetes production cluster, because this customer provides POS and vending machines across Thailand. After discussing with the customer, we knew how to save cost for them: we designed two flavor groups, shared-core flavors and dedicated-core flavors. The shared-core flavors cover the Kubernetes development cluster, logging server, Git server, OpenSearch, and monitoring. The dedicated-core flavors cover Kubernetes production, the database nodes, the gateway, and site-to-site VPN.

We are a public cloud, right? We need to calculate the cost. On the left-hand side is cost-effectiveness: focused on low-to-medium-workload applications, with the CPU as low-cost as possible, and with CPU and RAM over-committed. On the right-hand side is performance: focused on medium-to-high-workload applications, sustained CPU performance, and minimized CPU and RAM latency, with dedicated CPU and RAM.

So we design the flavors and the compute nodes together. For shared core, the compute-node spec we use is 32 cores, 64 threads, and 768 GB of RAM. CPU and RAM are over-committed, so the ratio will be high, around 1:10.
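As a rough sketch of how such an over-commit could be configured (the exact values here are assumptions for illustration, not confirmed from the talk), the ratios live in nova.conf on the shared compute nodes:

    # /etc/nova/nova.conf on a shared-core compute node (illustrative values)
    [DEFAULT]
    cpu_allocation_ratio = 10.0   # 64 hardware threads appear as 640 schedulable vCPUs
    ram_allocation_ratio = 1.5    # RAM can also be over-committed on shared nodes

With a 10.0 CPU ratio, the scheduler can place up to 64 x 10 = 640 vCPUs on one of these nodes.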
The dedicated-core flavors use CPU pinning, so there is no over-commit; the CPU-to-RAM ratio is lower, around 1:2 to 1:4. The spec of the dedicated compute node we use is 48 cores and 96 threads per socket, with two sockets, for a total of 192 threads, and 500 GB of RAM.

When you have shared-core and dedicated-core compute nodes in the same cluster, you need to create host aggregates to group those compute nodes. Here is an example of what we create: the first line is the grouping of the dedicated nodes, and the second line is the grouping of the shared nodes. After that, we use the Nova scheduler feature named AggregateInstanceExtraSpecsFilter. So we created the host aggregates, and we created flavors for shared core and dedicated core; how do we map a flavor to a host aggregate? You add a key in the aggregate_instance_extra_specs namespace to the flavor metadata, and set the matching property on the host aggregate. When a user launches an instance whose flavor metadata matches, it lands on the correct compute nodes. (A sketch of these commands appears below.)

For the resource types we offer, we provide two types of flavor: the shared-core flavor and the dedicated-core flavor. We borrowed this concept from the hyperscaler clouds. If we look at the hyperscalers, we can see that even with the same specification, the Geekbench performance scores are not the same. So we calculate a price-to-performance ratio, and you can see which one is the better value. Customers ask providers for four cores and eight gigs, right? But they don't think about the performance they actually get. So here is an example from our side: we ran Geekbench on all of our flavors and calculated the cost-to-performance ratio. Our public cloud users can adjust the resource type, to optimize cost or to focus on performance, using resize. That is a benefit of having shared and dedicated in the same cluster: a user can resize by themselves from shared to dedicated, and from dedicated back to shared.

For shared core, the over-commit ratio is 1:8 by default. That means if we have 16 cores in our inventory, we can utilize up to 128 vCPUs. When creating a shared-core flavor, we specify the option hw:cpu_policy=shared on flavor create or flavor set. Actually, a default OpenStack flavor is already shared-core. An important aspect of shared core is that we have to impose what is known as a CPU quota limit. If we don't, the problems that appear are unequal CPU usage, high contention, and unstable performance, because some VMs get the chance to use too much CPU and impact the other VMs, since all the VMs share the CPUs. (An example of setting the CPU quota when creating a shared-core flavor is also shown below.)

Next, let's see how creating a dedicated flavor differs from shared. The difference is that we add the option hw:cpu_policy=dedicated when creating the flavor or setting its parameters. For dedicated core, when dumping the libvirt XML we will see that the mapping between host CPUs and VM vCPUs is one-to-one. For shared core, it is more like a pool: a range of CPUs, where a VM vCPU can run on any CPU in that pool. With dedicated core, each VM vCPU maps one-to-one to a hypervisor CPU, so performance is essentially guaranteed and latency is good.

Finally, let's look at how the compute-node configuration differs between shared core and dedicated core. The important parameters for shared core are the over-commit ratio and cpu_shared_set.
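As a rough sketch of the host-aggregate and mapping commands described above (the aggregate names, host names, flavor name, and the pinned property key are illustrative, not necessarily the exact ones used in production):

    # group the dedicated nodes and the shared nodes
    openstack aggregate create dedicated-nodes
    openstack aggregate set --property pinned=true dedicated-nodes
    openstack aggregate add host dedicated-nodes compute-dedicated-01

    openstack aggregate create shared-nodes
    openstack aggregate set --property pinned=false shared-nodes
    openstack aggregate add host shared-nodes compute-shared-01

    # map a flavor to the dedicated aggregate; this relies on
    # AggregateInstanceExtraSpecsFilter being enabled in the Nova scheduler
    openstack flavor set --property aggregate_instance_extra_specs:pinned=true d4.xlarge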
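And a sketch of a shared-core flavor with a CPU quota limit (the flavor name and quota values are assumptions; cpu_quota and cpu_period are in microseconds):

    openstack flavor create --vcpus 4 --ram 8192 --disk 40 s4.large
    openstack flavor set \
      --property hw:cpu_policy=shared \
      --property quota:cpu_quota=50000 \
      --property quota:cpu_period=100000 \
      s4.large
    # each vCPU can consume at most 50000/100000 = 50% of one host CPU
    # per period, which keeps a noisy VM from starving its neighbors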
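To see the pinning difference on a compute node, you can dump the libvirt XML as described (the instance name here is made up):

    virsh dumpxml instance-0000002a | grep -A 4 cputune

    # a dedicated-core guest shows one-to-one pins, for example:
    #   <cputune>
    #     <vcpupin vcpu='0' cpuset='16'/>
    #     <vcpupin vcpu='1' cpuset='17'/>
    #   </cputune>
    # a shared-core guest instead floats every vCPU over the whole
    # shared pool, e.g. cpuset='0-15' for each vCPU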
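Those parameters live in nova.conf on each compute node. A minimal sketch, with assumed, non-overlapping CPU ranges:

    # /etc/nova/nova.conf (CPU ranges are illustrative)
    [DEFAULT]
    cpu_allocation_ratio = 8.0    # over-commit applies only to the shared pool

    [compute]
    cpu_shared_set = 0-15         # host CPUs for shared-core (VCPU) guests
    cpu_dedicated_set = 16-63     # host CPUs reserved for pinned (PCPU) guests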
For dedicated core, the important parameter is cpu_dedicated_set, as in the snippet above. When we configure it as mentioned, the inventory, the collection of resources in the Placement service, uses the VCPU resource class for shared core and PCPU for dedicated core. The values in cpu_shared_set are CPU core IDs: if you have 128 cores, you can list up to 128 of them. You can mix shared core and dedicated core on the same compute node, but please do not overlap the CPU ranges.

This slide talks about huge pages. In the last slides we talked about the CPU; now we also want to improve memory on the dedicated-core flavors. In a Linux system there is a page table. The page table stores the mapping from virtual memory addresses to physical memory addresses; the virtual addresses are what the application, the VM instance, uses. The default Linux page size is 4 KB. The more pages you have, the more time it takes to find a page's mapping to physical memory, right? The concept of huge pages is to increase the page size from 4 KB to 2 MB or 1 GB. Here is an example: with the default page size, a VM that needs 2 GB of RAM has about 500,000 pages. That is a lot. With 2 MB huge pages, that drops to about 1,000 pages.

So let me summarize the benefits of huge pages. The page size increases, so the number of pages decreases, and page-table walks are faster. Huge pages are not swapped out, so there are no swap operations, and the overall overhead is lower.

There are two things you need to do. First, you enable huge pages through the kernel boot parameters, like this: default_hugepagesz=1G and hugepagesz=1G mean the page size will be 1 GB, and hugepages=400 means you will have 400 pages. Then you update the bootloader configuration and reboot the server. Second, you add the flavor metadata hw:mem_page_size=large, which tells Nova to use huge pages for the instance. Here is an example of building a dedicated-core flavor with huge pages: we specify hw:cpu_policy=dedicated and hw:mem_page_size set to the huge-page size; in this example we use 1 GB huge pages. (A sketch follows below.)

Here is an example of the resource provider for both dedicated core and shared core on the same compute node. Observe that PCPU is the inventory for dedicated CPUs, and it does not have an allocation ratio, that is, no over-commit. VCPU is the inventory for shared CPUs, and you will notice it has an allocation-ratio value, which is the over-commit value. The Placement service will not multiply that ratio for us, so when we are working on capacity planning, we have to multiply it ourselves.
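Here is a rough sketch of those two huge-page steps (the flavor name and sizes are illustrative):

    # 1) kernel boot parameters, e.g. in /etc/default/grub,
    #    then update the bootloader and reboot:
    #    GRUB_CMDLINE_LINUX="... default_hugepagesz=1G hugepagesz=1G hugepages=400"

    # 2) a dedicated-core flavor backed by 1 GB huge pages:
    openstack flavor create --vcpus 8 --ram 65536 --disk 100 d8.xlarge
    openstack flavor set \
      --property hw:cpu_policy=dedicated \
      --property hw:mem_page_size=1GB \
      d8.xlarge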
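And a sketch of how that resource-provider inventory can be checked in Placement (the totals are made up; the point is that VCPU carries the allocation ratio while PCPU stays at 1.0):

    openstack resource provider list
    openstack resource provider inventory list <compute-node-uuid>

    # example output (columns trimmed, values illustrative):
    # +----------------+------------------+--------+
    # | resource_class | allocation_ratio |  total |
    # +----------------+------------------+--------+
    # | VCPU           |              8.0 |     16 |
    # | PCPU           |              1.0 |     48 |
    # | MEMORY_MB      |              1.0 | 786432 |
    # +----------------+------------------+--------+
    # Placement does not multiply for you: for capacity planning,
    # 16 VCPU x 8.0 = 128 schedulable vCPUs on this node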
Okay, let me conclude. We designed shared-core flavors and dedicated-core flavors for our customers. We separate the compute nodes for shared core and dedicated core, and we map flavors to them using host aggregates and aggregate_instance_extra_specs. For future work, we plan to upgrade to the latest version of Nova, because we are interested in a new feature: a strategy to reduce CPU power consumption when cores are unused. We haven't tried it yet, but it should reduce the power consumption of the dedicated nodes. Okay, that is it. Thank you.

So, do we have any questions?

You mentioned that in order to have dedicated physical CPUs for virtual CPUs, you pin virtual CPUs to certain physical CPUs. So my question is: if you need to live migrate or cold migrate these virtual machines, how does the CPU pinning work for VM migrations? Because if your hypervisors have different CPU models and different numbers of cores, the pinning may work on one hypervisor but not on another. I wonder if you have encountered that, and how it works in your case.

Right. When we live migrate, nova-compute recalculates the CPU pinning, so we can live migrate to another compute node.

So your hypervisors, do they have the same number of CPUs? The same number of cores?

No.

So the pinning is just a range, and Nova figures out the details?

Yes, nova-compute finds the available CPUs on that compute node and does the mapping again.

Okay, so Nova will find another hypervisor where the physical CPUs fit, and that is how VM migration works. Okay, thank you.

Any more questions? One more here. Sorry, only one question.

Why did you combine the shared and the dedicated compute together? Why not one compute node with all the dedicated CPUs and one compute node with all the shared CPUs?

That is not what we do in production; I just mentioned that it is possible. In our production, we split them. Thank you.

Okay, thank you. Thank you.