So, hi everyone, sorry for the delay. I'm Sharon Gratch, I'm a senior software engineer at Red Hat working on oVirt, and I'm going to talk today about high performance virtual machines, a feature that was added to oVirt 4.2 and enhanced in oVirt 4.3, which is the coming next version of oVirt.

Now, let's start by understanding what a high performance VM is. We are talking about a VM that we want to run with the highest possible performance, as close to bare metal as possible. What we are not supporting with this high performance VM is real time. A few people confuse high performance with real time, but they are two different things. When we talk about real time, we are talking about guaranteeing that a set of operations completes within a given amount of time, and when we talk about high performance, we are talking about maximizing the number of operations that can be processed in a given amount of time. They are two different things, so real time does not necessarily mean that you are running in a high performance mode. So just to make it clear: we support high performance, and we still do not support real time VMs.

Now, why did we add this feature? The main reason was that some applications need higher performance than others, and for letting them run on a VM and not on a bare metal machine, we wanted to create an easy way to offer a high performance VM. Even before this feature, users could have set up a high performance VM; they just needed to go over all the VM properties one by one, understand which properties to change, and then they would have a high performance VM, but it's not a straightforward task to do. So we now have a very easy way for the user to do that, much easier. In addition to that, when they tried to do that before this feature, live migration was not supported, and live migration, as we know, is a very crucial ability and feature for running VMs. Now it is supported, and it's something that we worked on. And the last thing that we worked on for this feature is that there were a few functionalities missing for really running a VM with the highest performance possible, and now we have implemented them. The list is on the slide; there are a lot of things, for example support for huge pages, support for headless VMs, disabling USB and other devices, enabling multiple queues per virtual interface, et cetera.

Now, when you create an oVirt VM, you must set a field called the VM type, or the "Optimized for" field. That means you are setting the type of VM that you want to run your applications on. There were two options, Desktop and Server: if you are planning to run the VM as a desktop machine, then you set it to Desktop; if you are planning to run it as a server, then Server. So we used that field and added a third value, called High Performance. Now, when you want to run an application that requires higher performance, you set this field to High Performance, and that's all; the VM is now a high performance VM. As you can see, this is the oVirt UI dialog for creating or editing a VM.
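For anyone who prefers scripting this instead of the UI dialog, here is a minimal sketch of creating such a VM, assuming the oVirt Python SDK (ovirtsdk4); the engine URL, credentials, cluster and template names are placeholders, not values from the talk.

```python
# Minimal sketch: create a VM with "Optimized for" set to High Performance,
# assuming the oVirt Python SDK (ovirtsdk4). URL, credentials, cluster and
# template names below are placeholders.
import ovirtsdk4 as sdk
import ovirtsdk4.types as types

connection = sdk.Connection(
    url='https://engine.example.com/ovirt-engine/api',
    username='admin@internal',
    password='secret',
    ca_file='ca.pem',
)

vms_service = connection.system_service().vms_service()
vms_service.add(
    types.Vm(
        name='hp-vm',
        cluster=types.Cluster(name='Default'),
        template=types.Template(name='Blank'),
        # The UI "Optimized for" field maps to the VM type; HIGH_PERFORMANCE
        # is the third value added next to DESKTOP and SERVER.
        type=types.VmType.HIGH_PERFORMANCE,
    )
)
connection.close()
```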
So in that dialog there is now a third value called High Performance. You set it, and that's all: the VM is a high performance VM, very easy. It is easy, but there are still a few things I need to talk about; it's not quite that easy. There are still a few things to consider, because by choosing this new high performance type, two things happen. One is the automatic settings, meaning that the VM is preconfigured with a set of configuration settings that improve performance. But there are still a few settings, not a lot, that we cannot set automatically in advance, because they depend on many things, for example the host that the VM is going to run on, things that are a bit problematic to calculate in advance. For that reason there is another layer, which is the manual settings, and for that we added a smart dialog that helps the user set them: it checks which settings are missing and advises the user how to set them, but the user still needs to set those settings manually. We will detail them later on. Those are the two phases for creating the new high performance VM type.

Two things that are important to mention: all of those settings, the manual and the automatic ones, are non-mandatory. Users can choose whether to apply all, some, or none of them. And of course, whatever you can do for a new VM, you can do for an existing VM: you can take a Server VM and switch it to High Performance, that is also supported.

Now, you can ask me a very logical question: if it's so easy to create a high performance VM and gain the highest possible performance, why not declare all VMs as high performance? The answer is that, as with all other good things in life, there is a price, and the price in this case is the flexibility you need to give up. There is a balance between flexibility and performance: the more performance you gain for the VM, the less flexibility you will have, and I will explain that later. For example, if you create a high performance VM with the default settings, you cannot have USB, you cannot have a graphical console, migration is a bit more limited, the guest operating systems you can run are limited, et cetera. So there are a few limitations, and flexibility is hurt because of that.

Now let's drill down to the actual settings that I talked about, and we'll start with the automatic settings. First of all, in the console area, we automatically enable headless mode, meaning that there is no graphical console, but on the other hand we also enable the serial console, so there will still be some kind of console to manage the VM. In the devices area, we automatically disable a list of devices: QXL, SPICE, USB, the sound card, the smart card, the memory balloon, the watchdog, and the tablet. All of those are disabled. Again, you can of course re-enable any of them if you want, as the short sketch below shows. When they are disabled, the performance is of course increased, because there is no need to manage those devices, but on the other hand the functionality of those devices is missing; you won't have USB, for example.
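Since these automatic settings are only defaults, here is a small sketch of switching one of them back on; it again assumes ovirtsdk4, and the VM name and the exact usb/soundcard fields are my assumptions, not quoted from the talk.

```python
# Sketch: re-enable one of the automatically disabled devices (USB) on an
# existing high performance VM. The VM name and the usb/soundcard_enabled
# fields are illustrative assumptions.
import ovirtsdk4 as sdk
import ovirtsdk4.types as types

connection = sdk.Connection(
    url='https://engine.example.com/ovirt-engine/api',
    username='admin@internal',
    password='secret',
    ca_file='ca.pem',
)

vms_service = connection.system_service().vms_service()
vm = vms_service.list(search='name=hp-vm')[0]
vm_service = vms_service.vm_service(vm.id)

vm_service.update(
    types.Vm(
        usb=types.Usb(enabled=True, type=types.UsbType.NATIVE),  # bring USB back
        soundcard_enabled=False,                                  # keep the sound card off
    )
)
connection.close()
```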
The next thing is networking. We enable multi-queues per virtual interface, meaning that each virtual interface will have more than one queue, so networking requests are handled in parallel and performance is increased. On the other hand, you need to remember that each one of these queues needs a thread to handle it, and that means there will be fewer threads left for processing, for the CPU. So keep in mind that if you have a VM with a lot of networking activity, it's good to leave this as is, but if you know that the application running on the VM will mostly need processing, with a processing-bound load and less networking activity, then maybe you should disable it. Again, we'll see this along the way: the balance between flexibility and what you really want to use for increasing your performance.

We also have entropy: we enable the virtio random number generator, so the same source as used on the host is used for the VM, and of course performance is increased. And for the CPU we enable the level 3 cache: we have level 1 and level 2, and we add another cache level, so again performance is increased.

Continuing the automatic settings, regarding storage I/O: first of all we set the disk interfaces, because we tested and saw that in most cases performance is increased; storage allocation will be preallocated instead of thin provisioned, because that way write and read operations are quicker; and the VM is set to non-stateless, because that way it doesn't need to keep a stateless state, so performance isn't decreased by handling the delta. Another thing, we enable an I/O thread, meaning that there will be a dedicated thread for the VM for serving I/O operations. The default value is 1, because we tested and saw that this is the usual value that is fine for most cases, but again you can change it.

Regarding the host pinning area, we do two things automatically. We enable CPU pass-through, which means that the CPU model and CPU flags the VM uses are exactly the same as those of the host the VM is running on, so of course performance is increased, and we automatically enable pinning of the I/O and emulator threads. And the last thing that is done automatically is that we enable live migration. This has nothing to do with increasing performance, but it's a feature that we really wanted to have, and now we have it. And of course, as I mentioned before, you can apply all of them or none of them; it all depends on what you really want to achieve.

Now I want to talk a bit about migration, because we had a lot of problems with that. You need to keep in mind that if you migrate a high performance VM, performance may be decreased, because you take a VM running on one host, with all the pinning configuration and all the settings tuned for that host, and try to run it on a second host, another host. Maybe performance will decrease because the pinning changes and the host settings are different, so that should be kept in mind. Because of that, we decided that the default mode for migrating these high performance VMs will not be automatic migration, but rather manual migration. The default behavior is that once you have a high performance VM, you can migrate it only manually, by going to the VM and migrating it now, or by moving the host that the VM runs on into maintenance mode; a small sketch of that manual flow follows below. Nevertheless, we still support automatic migration.
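Here is what that recommended manual flow could look like from the API side: a hedged sketch, assuming ovirtsdk4 and a VM named hp-vm, where calling migrate() without a destination lets the engine pick a compatible host.

```python
# Sketch: trigger a manual migration of a high performance VM via ovirtsdk4.
# The VM name and the optional destination host name are placeholders.
import ovirtsdk4 as sdk
import ovirtsdk4.types as types

connection = sdk.Connection(
    url='https://engine.example.com/ovirt-engine/api',
    username='admin@internal',
    password='secret',
    ca_file='ca.pem',
)

vms_service = connection.system_service().vms_service()
vm = vms_service.list(search='name=hp-vm')[0]
vm_service = vms_service.vm_service(vm.id)

# Recommended: let the engine choose a compatible destination host itself.
vm_service.migrate()

# Less recommended: force a specific destination host.
# vm_service.migrate(host=types.Host(name='host2.example.com'))

connection.close()
```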
By automatic migration I mean that the VM is migrated automatically when the engine decides it is the right time to do that, because of, for example, load balancing decisions or the high availability mechanism when a host stops responding, and things like that. So we do support it, but you should remember that performance may decrease and the user has no control over that, so it should be considered with caution; maybe it's not worth setting it, it depends on the user.

Another thing is selecting the destination host for the migration. We decided that the source and destination hosts do not have to be identical: if they are identical, performance won't be hurt at all, but it's not a requirement. What is required is that the source and destination hosts are compatible, compatible in everything: in the number of CPUs, CPU pinning capacity, memory, huge pages, of course, because otherwise the VM won't run. Because of that, the limitation we currently have is that the VM can be migrated not to every host in the oVirt cluster, but only to a subset, the ones that are compatible with that VM. So that is the limitation. Another thing is that among all those compatible hosts, the performance results can differ, because it depends on the status of the host: what the load on the host is, how many pinned VMs are currently running on it, et cetera. So the recommendation is that the user uses manual migration, but lets the engine automatically decide on the destination host. We have that feature; it's actually the default mode. The user can choose which host to migrate the VM to, but it's less recommended, because maybe you will choose a host that results in worse performance.

Now let's talk about the manual settings. There are four manual settings that the user is recommended to set; a sketch of applying two of them through the API follows after this list. The first one is CPU pinning, meaning that each one of the virtual CPUs of the VM is pinned to a physical CPU on the host; of course performance is increased. The second is setting virtual NUMA nodes for the VM and pinning them to the physical NUMA nodes of the host. NUMA means that you take the CPUs and locate them close to the chunk of memory those CPUs are using, so that setting alone increases performance, but we say more than that: not only declare virtual NUMA nodes on the VM and physical NUMA nodes on the host, but pin the virtual NUMA nodes to the physical NUMA nodes on the host. That way we increase performance even more. The third thing we recommend the user to do is set memory backing with huge pages. As we all know, memory is divided into pages, and for managing those pages there is the TLB, and managing it costs performance. If you declare huge pages, meaning that the page size is larger, there are fewer pages to manage and performance is increased, so that's what we recommend the user to do. And the last bullet that we ask the user to set manually is disabling KSM, kernel same-page merging. The kernel has the ability to keep only one copy of identical memory pages across all the VMs in the system. This of course has advantages, but from a performance aspect it's less preferred, so we disable it. And of course the user can apply all, some, or none of these.
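To make the manual settings a bit more concrete, here is a hedged sketch, again assuming ovirtsdk4, that applies two of them to an existing VM: a CPU pinning map and huge page backing. The pinning layout, the VM name and the 'hugepages' custom property (value in KiB) are illustrative assumptions, not values quoted from the talk.

```python
# Sketch: apply two of the recommended manual settings via ovirtsdk4 --
# CPU pinning and huge page memory backing. The pinning map, the VM name
# and the 'hugepages' custom property (size in KiB) are illustrative.
import ovirtsdk4 as sdk
import ovirtsdk4.types as types

connection = sdk.Connection(
    url='https://engine.example.com/ovirt-engine/api',
    username='admin@internal',
    password='secret',
    ca_file='ca.pem',
)

vms_service = connection.system_service().vms_service()
vm = vms_service.list(search='name=hp-vm')[0]
vm_service = vms_service.vm_service(vm.id)

vm_service.update(
    types.Vm(
        # Pin each virtual CPU to one physical CPU on the host,
        # leaving pCPUs 0 and 1 free for the I/O and emulator threads.
        cpu=types.Cpu(
            cpu_tune=types.CpuTune(
                vcpu_pins=[
                    types.VcpuPin(vcpu=0, cpu_set='2'),
                    types.VcpuPin(vcpu=1, cpu_set='3'),
                    types.VcpuPin(vcpu=2, cpu_set='4'),
                    types.VcpuPin(vcpu=3, cpu_set='5'),
                ],
            ),
        ),
        # Back the VM memory with 1 GiB huge pages (value given in KiB).
        custom_properties=[
            types.CustomProperty(name='hugepages', value='1048576'),
        ],
    )
)
connection.close()
```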
Now, about huge pages, a few things are important to remember. First of all, we set the huge page size; the recommended sizes are written on the slide. We also recommend setting the virtual machine's huge page size to be the same size as on the host the VM is going to run on, and if there are a few sizes, to the largest one. And of course, make sure there are enough free huge pages for the VM to use, otherwise it won't work. Another thing that we set automatically is that the huge pages are preallocated when the VM starts to run, so there are no dynamic allocations, because again performance would be decreased. Another limitation regarding huge pages is that when you set the high performance virtual machine's memory size, you need to calculate how many free huge pages the host has, so that there will be enough both for the memory and for the virtual NUMA node sizes, because each virtual NUMA node size should be a multiple of the selected huge page size. And two other limitations regarding huge pages: there is no memory hot plug or unplug, and the memory resource is limited per host. That means that if I have a host running a high performance virtual machine that uses all the huge pages the host has, there are no free huge pages left, so other high performance VMs running on that host will not have huge pages, because the resource is shared among all the VMs.

Now I want to talk a bit about our pinning issues. We talked before about the automatic I/O and emulator thread pinning, and I want to explain the algorithm behind that. The algorithm says that the first two physical CPUs of one of the physical NUMA nodes will be pinned to the I/O and emulator threads of the VM. So if all the virtual CPUs fit into one of the host NUMA nodes, then it is easy: the first physical CPUs of that node, usually 0 and 1, are pinned to the I/O and emulator threads. But if the VM spans more than one NUMA node, then we choose the NUMA node that has the most virtual CPUs pinned to it, take the first physical CPUs there, and pin them to the I/O and emulator threads. That's the algorithm.

Now let's take a use case as an example, because it's much clearer that way. We have a host with two physical NUMA nodes, each one set with its memory size and its physical CPUs. Then we declare a high performance virtual machine and set two virtual NUMA nodes, node 0 and node 1; for each one of them we set the memory and the CPUs, and in addition to that we also enable the I/O and emulator thread pinning, as we said before. Now we want to start the pinning. It is most preferable to do the virtual NUMA node pinning so that it is compatible: as we can see, virtual NUMA node 0 is most recommended to be pinned to physical NUMA node 0 of the host, and the same for node 1. After we finish that, we now need to set the CPU pinning. There are a few options for how to set it, but the most recommended is this way: virtual CPU 0 to physical CPU 2, 1 to 3, 2 to 4, et cetera. This is most recommended because if you do cross pinning between the virtual NUMA nodes, performance will be decreased. And another thing to remember is that physical CPUs 0 and 1 should be left free and not pinned to any virtual CPU, so that the I/O and emulator threads can be pinned to them. So that's the way we recommend doing it; the small sketch below walks through this same layout.
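Here is the same layout as a small standalone Python sketch, with made-up but representative numbers (not taken from the slide): two physical NUMA nodes with CPUs 0-7 and 8-15, a VM with two virtual NUMA nodes of four vCPUs each, and physical CPUs 0 and 1 reserved for the I/O and emulator threads.

```python
# Worked example of the recommended pinning layout, with illustrative numbers:
# a host with two physical NUMA nodes, a VM with two virtual NUMA nodes of
# 4 vCPUs each, and physical CPUs 0 and 1 left free for the I/O and emulator
# threads.
host_numa_nodes = {0: list(range(0, 8)),   # physical node 0: pCPUs 0-7
                   1: list(range(8, 16))}  # physical node 1: pCPUs 8-15
vm_numa_nodes = {0: [0, 1, 2, 3],          # virtual node 0: vCPUs 0-3
                 1: [4, 5, 6, 7]}          # virtual node 1: vCPUs 4-7
reserved_for_io_emulator = {0, 1}          # first two pCPUs of the most-pinned node

# Pin each virtual NUMA node to the matching physical NUMA node, then pin
# vCPUs one-to-one to the free pCPUs of that node (no cross-node pinning).
numa_pinning = {vnode: vnode for vnode in vm_numa_nodes}
cpu_pinning = {}
for vnode, vcpus in vm_numa_nodes.items():
    pnode = numa_pinning[vnode]
    free_pcpus = [c for c in host_numa_nodes[pnode] if c not in reserved_for_io_emulator]
    for vcpu, pcpu in zip(vcpus, free_pcpus):
        cpu_pinning[vcpu] = pcpu

print('virtual NUMA node -> physical NUMA node:', numa_pinning)
print('vCPU -> pCPU:', cpu_pinning)
# Result: vCPU 0 -> pCPU 2, 1 -> 3, 2 -> 4, 3 -> 5 on node 0,
#         vCPU 4 -> pCPU 8, 5 -> 9, 6 -> 10, 7 -> 11 on node 1,
#         with pCPUs 0 and 1 kept for the I/O and emulator threads.
```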
And a quick word about future improvements. We said that some settings are not automatic, so the future improvements will be to set the virtual CPU and NUMA pinning and the virtual huge pages automatically, to enable affinity rules management for managing those high performance VMs and hosts, and of course to continue tuning the high performance VM solution according to future benchmarks. That's all.