Hello, welcome. My name is Andre. I work for ARM in Cambridge, UK, in the ARM kernel team, where we mostly do architecture enablement; my focus is KVM and, lately, the virtualization side more broadly. This talk is about kvmtool: can it be a QEMU alternative? That's what I'm going to talk about. First an overview of what kvmtool is, for those of you who don't know it, then a very brief history, some comparison to QEMU, usage, some interesting numbers I have to show, and finally an outlook on where the journey might go. So what is kvmtool? Quoting the README, it is a lightweight tool for hosting KVM guests. It runs Linux guests under KVM, and Linux plus KVM is really its focus — it cares only about those two. It requires KVM, it does not emulate any CPU, and at the moment it targets Linux guests. That doesn't mean one cannot run other guests, but as far as I know nobody has really tried, and there is no explicit support for them. It provides network and block devices to the guest, but only via virtio: it does not emulate real hardware, it just uses virtio, the standard paravirtualized interface. It does emulate a few platform devices, mostly because on x86 you can't really get away without some: an 8250 UART, a PS/2 mouse and keyboard controller, a real-time clock, and some graphics. That's not real VGA — more of a VESA framebuffer thing — but it gives you default screen output on x86 (not on ARM, but on x86). And kvmtool is portable: it runs on 32-bit and 64-bit x86, on 32-bit and 64-bit ARM, on MIPS — as far as I know the port was for 64-bit MIPS; I'm not sure whether it runs on 32-bit MIPS — and PowerPC is supported as well.
What kvmtool explicitly does not do is instruction emulation. There is not a single line of code about it, in contrast to QEMU, which goes a really long way to efficiently emulate different instruction sets. kvmtool only runs guests with the same instruction set as the host it runs on, because it relies on KVM, which has that limitation. Without instruction emulation you cannot run ARM on x86 or x86 on ARM — it's only about virtualization, not emulation. It also doesn't emulate existing platforms: you cannot say "please emulate a Raspberry Pi for me" and get all the peripherals that board has. It just does the bare minimum to load Linux and make it happy; for ARM, for instance, it just presents the "virt" platform, and that's it, but that's probably sufficient. It cannot run Windows — I personally haven't tried, but since kvmtool does not support ACPI, I think it just won't work. And it cannot run on machines without KVM: you need working Linux KVM support in your kernel, otherwise you can't use it. That's why it's called kvmtool, right? One more thing it no longer does: it no longer lives in a kernel tree. It used to sit in the tools directory of a kernel tree, but not any longer. So if you came for any of the items in the bottom half of the slide, this is your chance to leave the room — sorry, too late, the doors are locked now. So that's what it does and doesn't do. A brief history: if you search for kvmtool you will find presentations from around 2011 which give a more detailed history of its early life. I think the first posting about it came in 2010, and it was meant to live in the tools/kvm directory of the Linux tree — so not a separate repository, but part of a Linux tree.
The idea was similar to what perf does: you have a userland counterpart for the kernel features that get implemented. So whenever there was a new KVM kernel feature, you'd want some userland tool to exercise it — normally that would be QEMU, but QEMU is out of tree. The idea was that if you co-develop a feature in userland and in the kernel at the same time and commit them together, you make sure the ABI and the interfaces are actually right, and you also give people an example of how to implement it. Personally I don't like that idea so much; I'll come back to this. At that point it was x86 only. At the end of 2012 it gained ARM support — ARM32, and just a few months later ARM64. Some refactoring was needed for that, because I think ARM was the first foreign architecture, or maybe not, but you still see code that was originally written for x86 and later ported to other architectures. In 2014 it gained MIPS support, probably MIPS64 only, meant to drive KVM on the Cavium Octeon processors. And in 2015 it was separated from the kernel tree — by your very presenter, in fact. As I said in the beginning, I didn't like having it in the kernel tree. Firstly, it wasn't really upstream: there were efforts to get it merged, but that was eventually shot down by Linus himself, so the chances of it ever being upstreamed were very slim. So it lived in a kernel tree from Ingo Molnar, I think, for some time. Checking it out was easy enough for kernel developers, because they had a kernel tree anyway — just git remote add the repository, fetch it, and you're fine. But it was still a bit weird to have this userland thing in the kernel.
So I didn't like it, and I thought it shouldn't be too hard to split it out. It was a bit harder than anticipated, but eventually it worked, and now it's a separate repository. It lives on kernel.org under Will Deacon's repos, because he is the maintainer now, and it can easily be checked out and compiled without fetching the whole kernel source tree. Let's start with the disadvantages. kvmtool is missing a lot of features — that's kind of the idea of it, right? It was meant as a counterpart to QEMU, which has become very bloated and supports a lot of features that probably not everyone uses, with lots of refactoring going on and all these different instruction emulations and architectures — a lot of code, a lot of features. kvmtool instead takes a keep-it-simple-stupid approach and focuses on a few things. Platform emulation is missing; ACPI and UEFI are the prominent examples. That has the interesting effect that you can't power off from kvmtool, on x86 at least: I spent an hour digging into the kernel to see how we power off on x86, and it is basically only via ACPI. You can reboot via other methods, but power-off goes only via ACPI. So if you type poweroff in a guest, it kind of shuts down, but you don't get your prompt back easily; you have to break out of it. Non-Linux guests aren't really supported either. The original idea was more of a hacker tool for Linux developers to test their new kernel work quickly: you compile a kernel, point kvmtool at it, it boots it up, and you're good to go. So nobody has really tried anything else. As I said in the beginning, Windows is probably not supported on x86 because ACPI is missing, along with a lot of other things Windows needs. I'm not sure whether some BSDs would work; maybe I should give it a try.
Also, it's much less tested. It is really a hacker tool at the moment; some people use it, but they have very specific use cases, so it's only properly tested in that space. QEMU gets a lot of testing, so if you depend on that, either do your own thorough testing or just use QEMU, which is properly covered there. When we do kernel features, we have to support both anyway: we at ARM use kvmtool, so we implement things there first — I'll explain why in a minute — but eventually the users use QEMU, so the feature has to be pushed into QEMU as well. We do exchange ideas, and most developers do both kernel and userland development at the same time, so there's huge overlap. I think there are at least two concrete examples where fixes have been ported from one tool to the other. Some convenience features are missing, too. When I wanted to run some tests I found a few things that could be improved; for instance, you can't simply put kvmtool in the background with an ampersand, because it reads from standard input and stops, and you have to foreground it again later. So I ended up running everything in a lot of screen sessions. That could easily be fixed, but there it is. Now, on the plus side: kvmtool is really small. If you check out the whole archive, it's 1.6 megabytes, and it's 35,000 lines of code, at least according to my counting. It builds very quickly — in seconds — and it's easily cross-compilable. Well, "easily" with a twist: for ARM, for instance, you need libfdt, but of course the ARM build of libfdt, so you need to put the library somewhere into your cross-compiler's sysroot, which is a bit challenging if you haven't done that before. If you can use the Debian cross-compilers it works nicely, but if not — that's the most common complaint we get.
People say, "I can't cross-compile it because of libfdt." I've put some step-by-step instructions in the README on how to do this, but apparently there are still issues. Still, if you've ever tried to cross-compile QEMU, you will find that kvmtool is very easy to cross-compile, because apart from this one small library there are no dependencies on anything fancy — in contrast to QEMU, which requires glib and everything else. So, as I said, a very small dependency chain. You can also easily build a static binary, and for some people that's a real killer argument: you can compile it into a single binary with no dependencies on external libraries at all, roughly one megabyte in size, and that's all you need. If you work at a hardware company and need to pass something on to people who are not so savvy with Linux and software, that's the way to go, right? You compile it, hand it over — you can even send it by email — they run it, and it works. That's a big advantage over QEMU, where you need the shared QEMU data directory and all the dependencies installed; if you have a small image running on an FPGA, for instance, that's sometimes a real issue. It's also easily hackable, which is what many people love. In QEMU it can be hard to find the right place — everything is wrapped up in some device model and callbacks, and it's hard to get an idea of how things should be done. kvmtool is more straightforward: you can easily find the place where everything happens. And one important thing is the license. kvmtool, since it was part of the kernel, is purely GPLv2. QEMU is not, and that is a very real issue for ARM in particular, because there may be patent implications: if you contribute to GPLv3 code, for instance, you basically grant the project a patent license to use your patents. That's the concern.
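To make the libfdt point concrete, here is a rough sketch of how an ARM cross-build of kvmtool could look. The toolchain triplet, the sysroot paths, and the static-link invocation are my assumptions for illustration — check the kvmtool README for the canonical steps, and on Debian the prebuilt cross packages can replace the manual libfdt build:

```shell
# Hypothetical sketch: cross-compiling kvmtool for 64-bit ARM.
# Assumes an aarch64-linux-gnu- cross toolchain is installed;
# directory locations depend on your toolchain layout.

# Build libfdt (part of the dtc project) with the cross compiler:
git clone https://git.kernel.org/pub/scm/utils/dtc/dtc.git
cd dtc
make CC=aarch64-linux-gnu-gcc libfdt

# Copy the library and headers into the cross compiler's sysroot:
sudo cp libfdt/libfdt*.a /usr/aarch64-linux-gnu/lib/
sudo cp libfdt/*.h       /usr/aarch64-linux-gnu/include/

# Now kvmtool itself cross-compiles with a one-liner:
cd ../kvmtool
make CROSS_COMPILE=aarch64-linux-gnu- ARCH=arm64

# Optionally, link statically for the single-binary deployment
# discussed above (flag placement is an assumption):
make CROSS_COMPILE=aarch64-linux-gnu- ARCH=arm64 LDFLAGS=-static
```

The payoff of the static variant is exactly the "send it by email" scenario: one self-contained file, no shared-library or data-directory dependencies on the target.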
And that's basically a no-go for ARM, because ARM's business is based on patents: we sell licenses and we protect them with patents. If there is any slim chance that someone could threaten those patents because you contributed — "ARM gave us a patent license for this particular feature" — then that's a very real danger. That leads to the fact that we at ARM cannot contribute to QEMU, full stop. That's what we've been told. There are ways around it: usually we send people via Linaro, which is a different entity, and Linaro people can contribute. That's why the main QEMU ARM maintainer, Peter Maydell, works on ARM things but is at Linaro, so he can contribute — and we can't. So when we do architecture development and kernel features, we can't just do the QEMU part as you would expect; instead we use kvmtool for it. That's the main reason we jumped on it. And it's probably also the reason for what happened next: as I said, around 2011 there was a lot going on, then it got a bit quiet over the years, and we needed to push it forward. So Will Deacon, one of the ARM64 maintainers, who also contributes to the KVM parts of the kernel, asked the original maintainer, Pekka Enberg, who transferred the maintainership to him. Now Will Deacon is the maintainer of kvmtool, and that's why it lives in his repository on kernel.org. But it is by no means an ARM-only thing now: everyone is very welcome to contribute, and we really make sure that it doesn't break for anything other than ARM — that it still works for x86, for instance. Now, usage. As I said, it's an easy tool. If you have come across QEMU command lines, you'll see that this is really easy: you just say run.
Because it was written by kernel hackers and originally lived in the kernel tree, it uses the newer verb-style interface where you say the tool name and then what to do — like git or perf, basically. So it's lkvm run -k bzImage: you point it at a kernel, probably the kernel you just compiled because you were hacking on it, and it just runs it. It does a lot under the hood, of course. It loads the kernel and sets up a 9pfs — there is 9p file system emulation in kvmtool, using virtio as the transport — to give you a root file system. It connects the guest's serial console directly to your terminal, so you see the guest output on your terminal, and what you type there ends up in the guest. And it even provides user-level networking: the usual scheme where all packets the guest sends out are transmitted on behalf of kvmtool. Every packet effectively originates from this userland application, which leads to the fact that although you are root in the guest and you can type ping there, ping will not work, because the ICMP packet would have to come from the userland application, and that is not allowed to send ICMP packets by default. So unless you run kvmtool as root (or with the necessary privileges), you cannot ping over user-level networking. The rest works — HTTP and everything, normal TCP and UDP — but you can't ping. Alternatively, as I said, you either run it as root or you use TAP networking, where ping works, but that needs root privileges as well. And then you get a shell prompt. That's basically what people love: a small command line, you type it, boom, a second later you get a shell prompt. Or you see the kernel crash, and you can just scroll back, copy and paste it into an email — that's the nice feature.
That was the main motivation for creating kvmtool originally: an easy way of making this possible. But all the rest is of course supported too. Down there you see a more elaborate command line. It also starts with run -k and the kernel; -i is the initrd; -d is the disk image, a raw image that gets presented to the guest as a virtio block device; -p is the kernel command line, so you can pass lots of command-line parameters; -c is the number of CPU cores the guest should see; -m is the memory. And --tty 0, for instance, is one example of redirecting the console output, which normally goes to your terminal, to a virtual terminal — some /dev/pts device — which you can then connect to with screen, for instance. Now, memory usage. As I said, we do this mostly for the license reasons and because it's hackable. But about a year ago, maybe a bit more, some people approached me and asked: if kvmtool is so much lighter than QEMU, can you load more guests onto a host with it, because it has a smaller footprint while QEMU is quite bloated and takes much more memory? I thought that was an interesting question worth spending some time on — that's basically why I made this talk, to get some bookable time to find out. The first thing to realize is that in a normal, big-guest virtualization use case, memory usage is of course dominated by the guest's RAM demand. The sweet spot for a guest would be one or two gigabytes of memory, so anything in the tens of megabytes doesn't really matter: if you have four guests of four gigabytes each, it doesn't matter whether the userland tool takes 40, 50, 60, or 70 megabytes. That's not an issue.
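In writing, the two command lines from the slides look roughly like this. The file names are placeholders, and the short-option spellings are as I recall them from the kvmtool help text, so double-check with lkvm run --help:

```shell
# Minimal invocation: boot a freshly built kernel into a 9pfs-backed shell.
lkvm run -k bzImage

# A more elaborate example with explicit resources:
lkvm run -k bzImage \
         -i initrd.img \        # initrd to load alongside the kernel
         -d disk.img \          # raw image, shown to the guest as a virtio block device
         -p "root=/dev/vda1" \  # kernel command line passed to the guest
         -c 2 \                 # number of CPU cores the guest sees
         -m 512 \               # guest memory in megabytes
         --tty 0                # redirect the console to a /dev/pts terminal
```

With the --tty variant, kvmtool prints which pseudo-terminal it picked, and you can attach to it with screen instead of having the guest console tied to your current terminal.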
Also keep in mind that the guest's RAM consumption — what the guest actually does with the memory — is of course driven by the guest. The guest allocates and uses the memory, and the fulfilment is done by the host kernel, which takes care of mapping physical memory behind it. Userland is not really in that chain: it merely allocates the memory, which more or less just reserves some virtual address space, and hands it over to the kernel, saying "look, here is the memory the guest should see". So, if only for performance reasons, you would hope that userland is not involved in any kind of memory operation. There was also the argument that QEMU is quite big — the binary is big, it uses a lot of libraries and a lot of memory. But if you start multiple guests, that shouldn't matter, because all of that memory is actually shared: the binary is loaded once and then referenced multiple times, so it doesn't use up more memory, and the same goes for the libraries. Nevertheless, if you're still interested in how many guests you can run, you should ask yourself the following question: what are the guests supposed to do? What I did in my experiments was to start as many guests as I could, but they were all idle — they booted to a shell and then did nothing — which is a nice micro-benchmark for how many you can squeeze onto a host. But once the guests actually do something, it's a whole different picture, of course: chances are you run into another limit and memory is not really your concern, because if the guests really do work, you may end up exceeding your CPU capacity.
Especially if you overcommit CPUs, the guests will slow down and eventually reach a point where they are not really useful any more, or they swap each other out and you get non-deterministic behaviour, which is bad. The same goes for storage and network: if the guests are all hammering the network, they all have to go through the single — or maybe multiple, but still limited — network capacity of the host; and if they all read and write like crazy, that eventually has to funnel into something as well. So chances are that with really busy guests you hit one of those other limits first. And if you're still at it, maybe ask yourself whether you should use virtualization in the first place — perhaps containers or some other lightweight approach would be worth looking into. Nevertheless, here are the numbers from /proc/PID/status, which gives you memory information about a process. The left column is kvmtool, the right one is QEMU, both running a 512 MB guest with two cores. An interesting part, the first thing that springs to mind: wow, kvmtool is a real virtual address hog — it allocates almost 50% more virtual address space than QEMU. That's the first two lines, the virtual address space: just how much is allocated, not how much is actually used — in userland it only costs address space. So there kvmtool is actually much bigger than QEMU, and the same for the data segment, which depends on the first number. Then you see what you would probably expect: the executable size, just 180K for kvmtool versus 7.5 megabytes for QEMU, and the same with the libraries — just four and a half megabytes of libraries against almost 15 for QEMU.
But as I said on the previous slide, that really doesn't matter if you start multiple guests, because those green numbers are shared — you only pay for them once, and if you don't have the 22 megabytes for that, you probably have other problems. What is interesting, and what will actually bite us, is the red number: the resident set size. That's the memory that is actually paged in, where the host really provides physical memory behind those userland addresses. I ran several experiments and the numbers differed a bit, but there was always a huge difference between the two: in this case almost a factor of four more resident memory for QEMU. That is something I wouldn't expect, and I have the feeling it should be fixed in QEMU — I'm not sure whether that's easily possible; I haven't investigated what causes it, I only tracked it down to anonymous memory, so malloc or mmap. Something to find out. And the point is that it will bite us in the following numbers, because my tests depended heavily on KSM — which is why I thought it would be a good idea to tell you about KSM. KSM is Kernel Samepage Merging. The idea is a kernel thread that scans physical memory — not all of it, certain registered regions — looking for identical pages. If it finds some, it deallocates one copy and lets the other users point to the remaining allocation, so they share it. It was originally conceived as a means to put more guests on one host, because the observation was that it's pretty common to run multiple copies of the same guest at the same time: you run, for instance, Red Hat Enterprise Linux, multiple instances of it, and all of them contain the same code.
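The numbers on the slide come straight from /proc/PID/status, and you can reproduce the comparison yourself for any process. A small example, run here against the current shell as a stand-in for an lkvm or qemu PID (the field names are the ones current Linux kernels use):

```shell
# Show the memory fields discussed on the slide for one process.
# VmSize = allocated virtual address space, VmRSS = resident set size,
# VmExe  = executable size, VmLib = mapped libraries, VmData = data segment.
pid=$$   # substitute the PID of lkvm or qemu-system-* here
grep -E '^Vm(Size|RSS|Exe|Lib|Data)' /proc/$pid/status
```

Running this once for an lkvm guest and once for an equivalent QEMU guest gives you exactly the two columns from the slide: shared cost in VmExe/VmLib, per-instance cost in VmRSS.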
The kernels in there are the same, the libraries are the same, the binaries, and maybe even the data if it's initialized the same way. So you end up with a lot of memory that is basically identical but held in multiple copies, one per guest. The idea is that you can save that memory, because glibc, say, is the very same in all the guests, so they can just share it — and in the case of glibc it's binary code, it doesn't change, it's effectively read-only, so it can easily be shared. KSM is the kernel feature that does that. When it finds identical pages and links them together, it marks them read-only and enables copy-on-write on them. KSM, being a generic mechanism, cannot know what the guest actually does with a page — it cannot see that this is part of glibc and will never change; it could be any other memory as well. So as soon as a guest writes to such a page, the copies are split again and memory gets allocated, and there is no chance that one guest can interfere with another. To use it you have to enable CONFIG_KSM in the kernel config, which should be on in most distribution kernels, and because it's a feature that costs CPU, you have to enable it explicitly: you echo 1 into /sys/kernel/mm/ksm/run and then it starts scanning. Two things to keep in mind if you do this. Yes, it's a nice feature, but it takes CPU cycles to scan the memory, and there are sysfs knobs in that same directory to tune it: how often it should wake up and look for new pages, and how big the runs are — how many pages it should examine in one go before going back to sleep, and so on.
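As a sketch, enabling and tuning KSM through sysfs looks like this. It needs root and CONFIG_KSM; the knob values below are examples, not recommendations:

```shell
# Turn KSM on: 1 = run, 0 = stop, 2 = stop and unmerge all merged pages.
echo 1 > /sys/kernel/mm/ksm/run

# The tuning knobs mentioned above live in the same directory:
echo 100 > /sys/kernel/mm/ksm/pages_to_scan    # pages scanned per wake-up
echo 20  > /sys/kernel/mm/ksm/sleep_millisecs  # sleep between scan runs

# Check how well it works: pages_sharing divided by pages_shared gives
# the average number of users per merged page.
grep . /sys/kernel/mm/ksm/pages_shared /sys/kernel/mm/ksm/pages_sharing
```

Note that KSM only considers memory that applications have opted in with madvise(MADV_MERGEABLE), which both kvmtool and QEMU do for guest RAM — that's why this experiment works at all.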
Also, if you overcommit memory and it works — you have successfully squeezed many guests onto one host and they share a lot of pages — you may run into this issue: they may share pages that are not really read-only. KSM doesn't know that, so it merges them anyway, and eventually the guests start actually using that memory and write to it. KSM says "wait, this was read-only and the guest wrote to it, let's split it up", and then it needs to allocate memory — but it may turn out that there is no memory left. So when you start the guests everything is fine, and once they do real work you can end up in a genuine out-of-memory situation. Then the OOM killer goes around looking for the biggest process, and the biggest process is probably the guest that just did something, right? All the other guests are idle and don't have much memory resident, but the one that just did some work appears as the biggest guest, so that's the one the OOM killer looks at and kills. It's not one of the idle guests, which you could probably live without — it's the one that was actually doing something. So be aware: if you overcommit memory, be very careful with this. Now, to show you the effect. As with the following charts, one axis is memory in kilobytes — sorry, the labels on this one are swapped: this axis is the memory, about 1.6×10^7 kilobytes in total, and this one is the number of guests started. I wrote a script that starts a guest, waits five seconds, starts another guest, and so on, and always logs all the values you see. The red line — hidden behind the blue one here — is the number of guests actually running.
And you see that it goes up linearly, as you would expect: with every guest you start, there's one more guest running — that's natural. But it flattens out at some point, when, as you can see from the green line, the free memory runs out. The scale there is kilobytes, so 1.6×10^7 kilobytes, which adds up to 16 gigabytes: eight gigabytes of physical memory plus eight gigabytes of swap. Basically, the free memory goes down linearly and then stops at some point, because a userland process isn't allowed to exhaust all the memory, and from that point on no more guests get created: the process was killed because it simply couldn't allocate memory. It tried to allocate the guest memory, the kernel said "sorry, no memory", and QEMU said "too bad" and quit. That's expected behaviour. Now the same thing with KSM enabled: you see that the memory goes down much more slowly now, and you can also put many more guests on the machine. Without KSM it maxed out at about 200 guests on this particular machine, and with KSM you eventually get to something like 450. Sorry — this is the number of guests running; I unfortunately swapped the labels; I hope the next slide is right. So that's how KSM works, and it works: you can put twice as many guests on the host. And these were minimal guests — just the 9pfs sandbox shell — so there wasn't actually much memory to share: the guests didn't allocate much, but there also wasn't much that could be shared. I mean, the kernel was the same, so the kernel text was shared and so on, but no big userland program was started — maybe not even much of a library. If you run real things on it, you would expect even more to be shared, because they're all the same.
All the libraries are the same, and the binaries, and everything you start. So the test was to run as many guests as possible, keeping the guests really small: just two cores, and the lowest amount of memory I could get down to, which was 40 megabytes, working for both x86 and ARM; and the 9pfs shared file system, because that's what kvmtool offers — and you can rebuild that on a QEMU command line as well. I didn't want to use a block device, because then I would have had to use several disk images, which would consume buffer cache and spoil the whole idea of saving memory; and an initrd would have to be loaded and takes up memory too. I was running the latest kvmtool head and QEMU 2.7.0, on a 64-bit Intel Ivy Bridge box and on a Calxeda Midway, which is a 32-bit quad-core ARM machine. It has eight gigabytes of RAM — pretty interesting for a 32-bit ARM machine — and decent storage: I had a SATA SSD on it, so it could do some swapping. I originally planned to use a Juno as an ARM 64-bit machine as well, and I did some experiments, but they didn't show any behaviour different from the 32-bit machine. The Juno is also a bit complicated because it's big.LITTLE, which QEMU doesn't support, so you have to do some tricks that make it less comparable; that's why I refrained from it. Also its disk is connected via USB 2, so swapping runs at something like 30 megabytes per second, which is not really what you'd want. So I didn't collect separate numbers — as I said, they weren't any different. The idea was to launch guests until the number of running guests no longer increases: the point where either a new guest could not be created, or for every new guest created an old one was killed by the OOM killer — there was a bit of a cat-and-mouse game there. And don't forget to enable KSM for this kind of test.
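My launcher script did essentially the following. This is a simplified reconstruction: the guest command line, the use of lkvm list, the log format, and the stop condition are all assumptions for illustration, not the actual script:

```shell
#!/bin/sh
# Reconstruction of the test driver: start one small guest every five
# seconds and log free memory plus the number of running guests, until
# newly started guests no longer survive. All names are illustrative.
i=0
while true; do
    i=$((i + 1))
    # 2 cores, 40 MB RAM, 9pfs root -- the minimal guest from the slides
    lkvm run -k bzImage -c 2 -m 40 --name "guest$i" >/dev/null 2>&1 &
    sleep 5
    running=$(lkvm list 2>/dev/null | grep -c running)
    free_kb=$(awk '/MemFree|SwapFree/ {sum += $2} END {print sum}' /proc/meminfo)
    echo "$i $running $free_kb" >> guests.log
    # Stop once the running count falls well behind the started count,
    # i.e. the OOM killer is reaping guests as fast as we start them.
    [ "$running" -lt "$((i - 50))" ] && break
done
```

Plotting the columns of guests.log over time gives charts like the ones on the slides: started guests, running guests, and free memory (RAM plus swap).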
So this is the outcome — with the right scales and right labels this time. Here is the free memory, swap and physical memory added together, and you can see it go down. There's a little kink here where the physical memory runs out and swap takes over, with some jitter. The red line is the free memory with QEMU, and the green line, which is again partly hidden, is the number of guests. You can see that at a bit more than 400 guests — which matches the other chart — it maxed out and couldn't create any more, and I stopped it at that point, because the memory was basically gone. And you can see that kvmtool — the purple line for the number of guests and the blue line for memory consumption — consumes much less memory, and you end up with more than 1600 guests created on the very same machine. I didn't even reboot in between; I just ran the tests one after the other. So in this regard, yes: kvmtool can run many more guests. The other question you should ask is: is that useful? This is one particular micro-benchmark, there basically to demonstrate one thing. If you wanted to run this in production — at this point there is basically no memory left, and every action in any guest would probably lead to some other guest, maybe even the very same guest, being killed by the OOM killer, because it's so tight. And if you run 1600 guests at the same time and they all want to do something, you will not be happy on a quad-core machine. Here is just another chart showing what you would expect from the memory usage. This is just QEMU; the green line is the available physical memory, and you see it gets exhausted pretty quickly, then swap takes over and eventually gets exhausted as well.
And when both of them are gone, when there is basically no memory left, you have to stop creating new guests. That is just more proof that the whole assumption about memory consumption works as you would expect. So, those were the numbers; now some known issues of KVM tool. For quite some time we have been chasing a virtio bug. KVM tool is pretty sophisticated when it comes to its virtio implementation: it uses multiple threads, actually many more than QEMU does. And somewhere in this maze of super shiny parallel programming there is some deadlock; well, not a deadlock, probably some kind of race issue, which leads to the guest waiting for something, for an interrupt to arrive, I think. So it waits for a transaction to end and cannot continue. It does not manifest if you use a UP guest, so one virtual processor, and also pin it to a single host core, which points to it really being a race condition. It can be observed, for instance, if you try to boot via NFS root; that's how you see it. And it depends a bit on your setup: I have another machine where I can easily reproduce it with virtio-blk, for instance. We spent some time on debugging, found some issues and fixed them, but the original issue is still not fixed; maybe it has been pushed away a bit. It is something nasty, so if anyone is willing to earn some merit here, you are welcome to contribute fixes. There are also some security issues someone found around file path passing and string functions. Just yesterday a set of patches was sent to me which should fix them. That is basically what I said in the beginning: QEMU is always in the front line and everybody sees it, and if something happens, patches get sent and merged. With KVM tool, nobody much cares at the moment, so this can be a bit slower, but it is being addressed.
Some messages have double line endings, a nasty overlap between traditional userland error-message functions like perror(), which don't take a trailing newline, and kernel-style ones like pr_info(), which do. Because KVM tool was living in a kernel tree, this got mixed: some people wrote userland-style error messages and some kernel-style, some added a newline and some did not, so we get some weird output. And it is astonishingly nasty to fix; I have a patch set ready, but it is about 15 patches, and I am not sure it is really worth it just to fix double line endings. What I observed during my tests is that the CPU load of idle guests can be quite high. With QEMU it is basically nothing: if the guest idles on a shell, it doesn't consume any CPU. But with KVM tool I could see that it uses, in my case, 7 percent. You may say, so what, it's only 7 percent, but if you run 400 guests, it really matters. It actually led to the problem that if you run that many guests and each of them uses 7 percent, your box becomes really, really sluggish, and you run into the other issues. So I chased it and tried to debug it a bit, since I have some x86 background as well, and it looks like it may be some timer issue. What KVM tool does to avoid some invalid MSR writes from the guest is to not insert a well-known CPUID vendor string. It is not Intel or AMD; it is basically "LKVMLKVMLKVM", which means the kernel denies a lot of fancy features that you have. Among those features are some timestamp-counter things and timers, so the kernel selects different clock sources depending on whether it runs on an Intel machine or not, on whether it detects certain features. And some features, although the CPUID bits are there, are not honoured, because the vendor string does not match. So this is something one should look at.
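One quick way to see this effect is to compare which clocksource the guest kernel actually picked. The sysfs paths below are the standard Linux clocksource interface; this is a diagnosis aid under the assumption that the unknown vendor string pushes the kernel onto a slower clock source, not a fix.

```shell
#!/bin/sh
# Inside a guest, show which clocksource the kernel chose and which ones
# it considered. Comparing this output between a kvmtool guest and a
# QEMU guest can show whether CPUID feature filtering changed the choice.
d=/sys/devices/system/clocksource/clocksource0
if [ -d "$d" ]; then
    echo "current:   $(cat "$d/current_clocksource")"
    echo "available: $(cat "$d/available_clocksource")"
else
    echo "no clocksource info in sysfs here"
fi
```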
It is worth looking at whether it would be beneficial to just pass on the host CPU vendor string and take those MSR writes, simply living with them: they only trigger a warning in the kernel, so it is not really a problem. QEMU has the same issue: if you use -cpu host, you will see the same warnings. So those were the known issues. Now the outlook. KVM tool works for us right now, and bugs do get fixed; sometimes you see that people actually seem to use it, because they find bugs and send patches. And what we do regularly is add new features. For instance, I am just about to send my patches to support the ARM ITS interrupt controller, which recently gained emulation support in the kernel, and now the userland part is being submitted. As I said, that is the usual development model: we co-develop both sides, and the userland part just happens to be sent out later. Then there is the sandboxing feature: you saw how many guests you can run, and if they just run some simple command, they run in a sandbox; this is the kind of use it was probably conceived for. Apparently it uses much fewer resources than QEMU, and some people say it starts faster. So it may be interesting to invest something there and make it more sophisticated, because it could be worthwhile and has advantages. Adding new features: do we want new emulated devices? No. The idea is that KVM tool does not emulate devices, so if someone says, oh, but I need this Chinese SATA controller because I want to use this thing: no way, we won't do that. ACPI emulation? Interesting. We are actually looking at this at ARM, because there was an original argument against KVM tool that by not emulating ACPI you cut off a good chunk of the kernel code and don't test it. If you run the kernel with KVM tool and it doesn't use ACPI, then none of the ACPI kernel code gets exercised.
And you don't know if there is a bug in there, which severely limits it. We have the same issue now running ACPI on ARM as well, so we are looking into this. Then maybe new KVM kernel features: if you have something implemented in the kernel, you are welcome to send patches for it to KVM tool. In general, please send bug fixes and reports, and also tell us what you do with it, because it is not really obvious, and it would be interesting to see in which direction KVM tool should move. So, conclusion. KVM tool is a different implementation against the KVM ABI. We now have two userland programs driving KVM through the KVM ABI, which I think is a very interesting and worthwhile thing, because it prevents us from cutting corners or removing support from the KVM ABI just because QEMU doesn't use it for some reason; maybe KVM tool uses it. It also does things a bit differently: the order of initialization is different, for instance. It is all within the KVM ABI, within the specification of KVM basically, so it is all valid, but we have to live with both and make sure the kernel copes with it. I think it makes the kernel code better. KVM tool is an easier application to handle in terms of cross-compilation: you can have a static binary and you can easily compile it. The question is whether that matters if you have Debian packaging or whatever; you just apt-get QEMU, and yes, it pulls a lot of stuff in, but eventually it works, and if you ship it to a customer, that is maybe not an issue. I also have to say that KVM tool is no real competition to QEMU and probably never will be. Because if it were real competition, it would eventually become like QEMU, right? It would run into the same problems QEMU ran into, because it would need to support a lot of features, the code size would grow, the memory consumption would grow, and you would basically end up with the same problems. So the idea is that it stays as it is, with a limited feature set.
And yes, it can show advantages in some cases. If you have hosts with very low memory or otherwise limited resources, it may be interesting. If you use it for sandboxing, or as something container-like, that may be interesting as well. And it is good for prototyping: as I said, it is quite easy to add a feature or change something; you just recompile, run it and see what it does. It took me under a minute to change the vendor string, to put Intel in there or to pass on the host one, then recompile, and done. With QEMU that would take much longer. That being said, please contribute to QEMU instead, because that is what people actually use, and help fix it. And I am actually very happy if KVM tool helps to point out problems QEMU has, for instance with memory consumption. I think there were already two cases where something KVM tool was doing better was then fixed in QEMU, so QEMU just catches up there. That's my talk. Any questions? "So what are the use cases for KVM tool? You are using it for testing?" Yes, for implementing new stuff. It may also be interesting as a sandbox tool, to run applications with a different kernel, for instance, or just in a separate kernel that is hard to break out of, because it uses hardware virtualization, which is meant to be more secure than containers. You can use it there; it still lacks some features, but I think it is a good base to develop on top of. And if you want to compare how different kernels behave with an application, that works too. I think some people even tried to put Firefox into it, to have a sandbox and make sure the browser is contained.
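As a rough sketch, the sandbox use just described boils down to a one-liner. The `-k`, `-m` flags, the `--` separator and the kernel image path are assumptions here (check `lkvm sandbox --help` on your build), and it is printed as a dry run since actually executing it needs a KVM-capable host plus a guest kernel image.

```shell
#!/bin/sh
# Run a single command in a throwaway KVM guest via `lkvm sandbox`.
# The guest's root is the host filesystem, shared read-only over 9p,
# so the command sees a familiar environment it cannot modify.
cmd="lkvm sandbox -k ./bzImage -m 128 -- /bin/uname -a"
echo "$cmd"
```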
KVM tool easily allows that. It can be done with QEMU as well, but with a more complicated command line, which KVM tool merely hides in its code, so that is admittedly a bit of a moot argument. It just lets the host file system shine through, read-only: /bin, /usr and everything else looks like a read-only copy of the host, and only the things you would expect to change, like /etc and so on, are separate. That is very easy to set up with KVM tool, because someone invested the time to put it into code. As I said, besides the run command there is a sandbox command, where you just give it a shell script and it starts that. And I think there is an example somewhere on the web where someone put Firefox into it and used X11 forwarding over the network to display it on their own box. In general, I am very happy if someone gives me ideas about what to do with it. Yes? Sure, if you have questions which are not suited for the big audience, I am happy to take them in the hallway as well. "I meant more from a time standpoint." Okay, yeah. Oh yes, we are running over. As I said, find me. Thanks.