OK, so, good. As you can see, I'm Dario, and we are going to talk about performance benchmarking tools. From the title I may have given the idea that there are quite a few of them already, as we probably all know. So, I mean, I hope you don't think that with this talk I am going to propose using yet another one. Will I? Of course I am, and, well, that's actually not entirely true, and we will see about that a little bit later.

But let's introduce the subject of performance benchmarking by thinking about how you would assess the performance impact of, let's say, a change in the code. For example, let's say that the change is in the Linux kernel, but really it could be anywhere. What you do for figuring out whether this change makes any difference in performance is: you run a benchmark without the change applied, then you run the same benchmark (actually, most of the time more than one benchmark) with the change applied, and then you check the difference in the results.

Now, this is what you do for, let's say, bare metal benchmarking. What about if we put virtualization into the picture? Then it turns out that things become a little bit more complicated, because there are quite a few more cases that you have to think about and actually go and check. This is just an example: you have to consider all the cases where the change that you are interested in is either nowhere, so neither in the host kernel nor in the virtual machine kernels, or it's only in one of these places, only on the host or only in the guest, or it's in both, and you have to run the same benchmarks in all these cases and figure it out. Of course, this depends on your use case, on the workload that you are interested in, but in general it could be like this. Also, one of the other differences is that in this case the benchmarks still run, but they run inside the virtual machines; quite obvious, but let me just point it out.

Let's now focus on, for example, these two situations, where we want to see the difference between no change whatsoever and the change applied on both host and guest. Is it really always enough to check the performance when running only one virtual machine? Maybe, but maybe not; maybe we also care about use cases where there are multiple virtual machines running on the host. And so now what happens is that you have to rerun your benchmarks, not only in all these different use cases, but in multiple virtual machines, and most of the time at the same time. So basically what you would need is some tool that would help you, as, for example, the developer of such a change, the change that we are talking about.
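To make that matrix of cases a bit more concrete, here is a minimal sketch of the combinations you end up having to cover once virtualization is in the picture; run_benchmark here is a made-up placeholder for whatever actually drives the run, not an MMTests command:

    # Sketch only: "base" is the kernel without the change, "patched" the kernel with it.
    # run_benchmark is a hypothetical placeholder, not a real script.
    for host_kernel in base patched; do
        for guest_kernel in base patched; do
            # boot the host with $host_kernel and the guest(s) with $guest_kernel,
            # then run the same benchmark(s) and save the results for later comparison
            run_benchmark "host=$host_kernel" "guest=$guest_kernel"
        done
    done

And, as mentioned, each of these cases may in turn need to be run with one guest or with several guests at the same time.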
A tool that would help you run benchmarks inside virtual machines, that's the first thing; but you probably want to be able to run benchmarks inside multiple virtual machines, and you may want to run them inside multiple virtual machines at the same time. And if they run at the same time, and you are interested in the results of all these benchmarks in all these virtual machines, then you also have to think about a way of making sure that the benchmarks run in a synchronized fashion, because you want all the CPU benchmarks to start at the same time; and if each one of these benchmarks does, for example, multiple iterations, then the single iterations inside the benchmark also need to be synchronized, otherwise the results that you will get are usually not so interesting, and maybe not the ones that you want.

So, let's speak briefly about this tool called MMTests. It's not really a new tool: it's basically a benchmarking suite which has been out there for quite a while. It was born as a tool for testing and benchmarking memory management code in the Linux kernel; the original author, and still the maintainer and main developer, is my colleague Mel Gorman, but it has evolved a lot since those days and it's not memory-management only anymore. It lives at that GitHub address; you can go try it, and if there is something that doesn't work for you, or you have questions, you can open GitHub issues, but we are not yet monitoring those very actively, so the best thing is to email either Mel or me.

And what is it? It's a collection of Bash and Perl, and what it does is it fetches, configures, builds and then runs one or more benchmarks. It also collects and stores the results on the file system, and it lets you compare the results of different runs. There are statistics that you can look at, it can also plot, and it supports what we call monitors. This is a very nice feature, according to me at least: you can configure things in such a way that, while the benchmarks are running, the tool will sample for you stuff like top, vmstat, mpstat, and then you will have those numbers as well. And it's also integrated, at least up to a certain extent, with perf and ftrace, both as monitors and for tracing, if you want.

There are quite a few benchmarks that are already available and pre-configured there, if you just clone the sources and want to try one of them. Each one of them comes with multiple, different configurations: there is a directory in the code where all the configuration files are stored, and there are multiple configuration files for most of these benchmarks. And there are also scripts that generate more variations, so even more configuration files, so even more diverse configurations, for the various benchmarks. This is an example of a configuration file: it's just a collection of Bash exported variables. There are some, like this one here, which are more or less common to all the benchmarks; this is the monitors thing that I was showing before, where you can choose which ones you want active while running the benchmark. And there are some others which are specific to the benchmark. Now, to fully understand those, you have to know at least a little bit about what the benchmark does and how it works.
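For illustration, such a configuration file could look roughly like the one below; the exact variable names vary per benchmark and per MMTests version, so treat these as indicative rather than authoritative:

    # Illustrative MMTests-style config: exact variable names may differ
    # from what the real files under configs/ use.
    export MMTESTS="stream"                           # which benchmark(s) to run
    export MONITORS_GZIP="proc-vmstat mpstat top"     # monitors sampled while running
    export MONITOR_UPDATE_FREQUENCY=10                # sampling interval, in seconds
    # Benchmark-specific knobs (STREAM in this example)
    export STREAM_SIZE=$((1048576*512))               # working set size
    export STREAM_ITERATIONS=5                        # iterations per run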
But once you do that, it's pretty easy to, for example, change the number of iterations that you want STREAM to run for each run, or the placement policy for OpenMP, or the build flags, or whatever.

So, that's how you use it. You use the script which is called run-mmtests.sh: you specify the config file and a name for that particular run. Then you do another run after you have applied your change, changed the configuration, or whatever it is that you want to measure the difference between. And then you can compare the results: you can use the compare-kernels.sh wrapper script, which takes some parameters, or you can directly use these other facilities which MMTests provides, and which I'm showing you in a little bit more detail.

This one, for example, is for seeing the actual results and the comparison. In this case I ran netperf with different packet sizes: these are the absolute numbers from the first run, these are the absolute numbers from the second run, and these, in parentheses, are the percentage differences for each packet size. You can also get an overall comparison of the two runs: since, as we saw, the benchmark was run with different packet sizes, is there a way to aggregate all these results into one overall view of how the benchmark went? Well, there is, and it's this one. Basically we take the geometric mean of all the ratios; we use the geometric mean because people who understand statistics better than me think it's a good type of mean to use here, so I agree with them.

And these are the monitors. Again, you can ask MMTests to print the comparison between the values it has sampled for these monitors during the two runs. This is another example of monitors: this one was just monitoring the duration, which is very trivial, and this other one was actually using perf for collecting these metrics, so migrations and context switches. And, as I said, you can plot.

Okay. So, a couple of "beware of"s. The first one is that it pretty much requires root. You can even run it without root privileges, but, well, it depends on the benchmarks and on what you want to do. For example, you can ask MMTests to change the CPU frequency governor to performance while running the benchmark, and that needs to be done with root privileges; or to change other parameters, like enabling or disabling transparent huge pages, all these things that may be of interest to you and which do require root privileges. It tries very hard to undo everything it has done, so to leave your system in the status it was in before, but really this is meant to be used on test machines, which ideally you would be able to reinstall or redeploy easily after having run the benchmarks, rather than on your super precious workstation, which could then get messed up a little bit, as hard as we try not to. It also fetches stuff from the internet and then runs it as root, which may not be the most secure thing you want, especially, again, on something which is not a cattle machine that you will redeploy as soon as possible. Internally, the way we use it is that we don't fetch from the internet: we fetched the benchmarks once, checked what we got, and put them in a mirror.
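Putting the workflow together, a typical baseline-versus-change session could look roughly like this; the config file name and the comparison invocation are from memory and only illustrative, so check the scripts' help output for the precise syntax:

    # Run the benchmark described by the config, labelling the run "baseline"
    ./run-mmtests.sh --config configs/config-network-netperf-unbound baseline

    # ...apply the change (new kernel, new package, different tuning)...

    # Run the very same config again, under a different run name
    ./run-mmtests.sh --config configs/config-network-netperf-unbound with-change

    # Compare the two runs (illustrative invocation of the comparison facilities)
    ./bin/compare-mmtests.pl --directory work/log \
        --benchmark netperf-udp --names baseline,with-change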
And it's pretty easy to configure MMTests in such a way that it will fetch from a mirror, so you can set up your own mirror and use it like this, which is a little bit less scary, let's say.

So, I said I had virtualization in the title, and I spoke about benchmarking virtualization at the beginning. Can MMTests be used as that tool, the one which allows you to run benchmarks inside virtual machines, multiple virtual machines, synchronized, blah, blah, blah? It can, because I am working on extending it so that this becomes possible, and what you do is use a different script: instead of run-mmtests.sh you use run-kvm.sh. Actually, it doesn't even have to be KVM, because it uses virsh and libvirt, so it could be anything else as well, but that's how it is called for now. And it's pretty much the same as before, with the difference that (these two parameters, just ignore them for now) you specify, in addition to the config file and the name of the run, the name of a VM.

Again, for now this is pretty much a work in progress, as I will say a few times during this presentation, because it's in active development. So, for now, the virtual machine has to exist already, and has to be known to libvirt already: basically, what MMTests does is virsh start with the name of the virtual machine you have provided, and if it doesn't exist, bad things happen. Also, the VMs have to be set up in such a way that the host can reach them over the network and can SSH into them passwordless. So, what it does is: it starts the virtual machine, copies MMTests inside the virtual machine, runs the benchmark inside the virtual machine, stores some logs about both the host and the guest, then fetches the results from inside the guest back onto the host and lets you run the comparison tools. And besides adding the virtual machine name as a parameter, all you need is to add one additional variable in the config file, which is basically the IP address of the host.

I also said that I wanted a tool which can run benchmarks inside multiple VMs, and yes, it can also do that: instead of specifying just one, you specify two, three, four, however many you need. And it does exactly the same thing with multiple VMs: it starts the multiple VMs, copies MMTests inside all of them, runs the benchmarks inside all of them, blah, blah, blah.

Then there is the synchronization that I also mentioned. This is implemented, and, as I said before, host and guests need to be able to talk over the network; and it is over the network that a synchronization protocol has been implemented, between the run-kvm.sh script running on the host and the run-mmtests.sh script running inside the guests. This synchronization protocol right now happens over the network; in the future it could happen using other means, like virtio-vsock or virtio-serial, but let's see what's more convenient. It's based on exchanging messages through the nc tool. It could be gRPC, it could be anything more, I don't know, solid; this is what it is right now. And it's a token-based protocol: basically, the benchmarks inside the VMs know that they can only go so far, let's say until the point where they want to start an iteration of one of the benchmarks, and when they get to that point, they send a token to the host.
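As a rough idea of what that looks like in practice, assuming two pre-existing libvirt guests; the flag spellings and the name of the host-IP variable are illustrative and may not match the actual scripts:

    # Run the same config as before, but inside two pre-existing libvirt guests.
    # Flag names and syntax are illustrative, not guaranteed to match run-kvm.sh.
    ./run-kvm.sh --vm vm1,vm2 --config configs/config-network-netperf-unbound multi-vm-run

    # In the config file, the guests need to know how to reach the host
    # (variable name is illustrative):
    export MMTESTS_HOST_IP=192.168.122.1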
The run-kvm.sh script running on the host collects all the tokens; it knows how many virtual machines there are, and as soon as all the virtual machines have sent the proper token, it realizes that they are all at the same stage, at the same place, and that they can continue, so it sends a token of its own to all the virtual machines to let them continue. So it's a barrier protocol working like that.

Right, and basically, what you see here in this screenshot... Of course I could have shown the logs with the timestamps, with the various phases aligning and happening at the same time inside the virtual machines, but it would have been, I guess, hard to read. So this is a simplified view, but you can try and check the timestamps and see that it actually works. You can also see from here, I hope, or at least get an idea, that yes, in these two virtual machines the benchmarks are actually running synchronized, because this is the CPU load caused by the benchmarks running there, and you see that it matches pretty much. This is probably the point where one iteration ended, and here another one started. This is a very simple example with only two virtual machines; I have an example with 30, but it would probably have been as hard to read as the timestamps, if I had shown it.

And yeah, so this is a tool which right now is used for development, let's say: basically the use case is kernel work, but you could be doing development somewhere else; you are interested in benchmarking the effect of the changes that you are making to the code, so you use it and you get the results. Can it also be used inside a CI loop, for example, to automatically monitor and perhaps automatically report regressions of new kernels, new packages and stuff like that? As a matter of fact, yes; as a matter of fact, it's what's happening within SUSE, because the performance team is doing exactly that. They are using another tool on top of MMTests, which is the one implementing the actual CI loop, and it's called Marvin. I'm not going into details now; there is this talk, this is a link to the recording of a talk which you can watch for more details.

What we plan to do is pretty much the same thing, with some variations, in the virtualization team, which is where I am working, because we already have an instance of Jenkins which is building our packages for all our distros, then installing them on different machines, each one running a different distro, and then already running some functional tests. So what we would like to do is to plug MMTests into this infrastructure, have performance benchmarking on top of that as well, and be able to check for performance regressions and such.

So, as I said already, this is an ongoing development effort. There is a lot which I am doing, or plan to do, specifically on the, let's call them, virtualization extensions to MMTests, but there are also areas of improvement in MMTests itself. This is something which has been public on GitHub for a while, but until now it has only been considered and regarded as an internal tool, and to a certain extent it still is, while my idea would be to start making it a little bit more usable and available for people even outside of SUSE and our teams, which is why I am giving this talk, after all. And, yeah, this is a list of things that I plan specifically for the virtualization extensions: running stuff not only in VMs, but maybe also in containers.
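Conceptually, the barrier exchanged over nc between the host-side run-kvm.sh and the guest-side run-mmtests.sh can be pictured with something like the following; the port numbers, messages and variable names are made up for illustration, this is not the actual protocol:

    # Guest side (sketch): before starting an iteration, announce readiness
    # and then block until the host says "go".
    echo "READY $(hostname)" | nc "$MMTESTS_HOST_IP" 5000   # send our token to the host
    nc -l -p 5001 >/dev/null                                # wait for the "go" token

    # Host side (sketch): gather one token per VM, then release them all.
    for i in $(seq "$NUM_VMS"); do
        nc -l -p 5000 >>tokens.log                          # one VM has reached the barrier
    done
    for vm_ip in "${VM_IPS[@]}"; do
        echo "GO" | nc "$vm_ip" 5001                        # let every VM continue
    done

The real implementation does the same kind of collect-then-release dance at every point where the benchmark phases need to line up across guests.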
Another thing: I want the VM starting and stopping process to be more parallel. Right now it is a little bit too much of a sequential affair, which is not nice if you have to run the benchmarks inside 30 or 60 virtual machines. I also want to have some VM management stuff. Then there is documentation: yeah, that's pretty much all the documentation you will find about it for now, but it will get better. And if you really go there and try it, and miss the documentation, which you will, and maybe also find issues, I do encourage you, as I said before, to either interact with us on GitHub or email either Mel or me, and we will see what we can do to improve things. Which is pretty much what I said: give it a try, tell us what you think, whether it works, whether you think it could work for you, whether you think you would need something different, and we will be happy to see what we can do about that. And here you can find a little bit more information about myself, and with that I am happy to take any questions.

So, two hands; we have one here. So, it depends... Yeah, yeah, right, I have to repeat the question. The question was why the synchronization is necessary. So, it depends on what you want to do, on what your use case and your needs are. Here I am thinking about a scenario where, for example, let me just go here... This is an actual, real case where I used this tool: I was running STREAM inside four virtual machines, and they were pinned to different physical CPUs. So, in theory, what I would expect is to see the same, or similar, STREAM performance in each virtual machine, because the goal was to check whether the isolation between them was working, by pinning them, by partitioning the resources. For doing that, I need to run the benchmark inside all the virtual machines, and I care about the results inside all the virtual machines, and I cannot afford results from a run where, say, STREAM did its five runs at different times in different guests. If I don't do any synchronization, what guarantees me that, inside the first virtual machine, the first run starts, then it finishes, then the second run of STREAM starts, while in the meanwhile the other virtual machines aren't running different stages of the same benchmark? So, I need the gating. If this is not what you want, if you just want to, I don't know, have some noise around, run a benchmark inside one virtual machine, and you only care about the performance of that virtual machine, then you don't need this, and you don't have to use it; you can just use the tool without this synchronization. But there are use cases where it's important, at least according to me. And perhaps I haven't looked hard enough, but I haven't really found this particular feature of gating and synchronization in many of the other tools which are available out there.

So, the next question is why we are using full-fledged VMs instead of other solutions, like creating VMs dynamically out of the host system and stuff like that. We are just not there yet. It was easier to focus on the part which implements this synchronization thing and, for having the benchmarks actually running in VMs, to just start something existing and build on that.
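For context on that STREAM scenario, the kind of partitioning it checks can be set up, for instance, with libvirt's vCPU pinning; the domain names and CPU numbers below are just an example, not the actual test setup:

    # Pin each guest's vCPUs to a disjoint set of host CPUs, so the four VMs
    # should not interfere with each other (example domains and CPU numbers).
    virsh vcpupin stream-vm1 0 0
    virsh vcpupin stream-vm1 1 1
    virsh vcpupin stream-vm2 0 2
    virsh vcpupin stream-vm2 1 3
    # ...and so on for stream-vm3 and stream-vm4 on CPUs 4-7.

With that partitioning in place, running STREAM synchronized in all four guests is what tells you whether the isolation actually holds.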
As I said, ideally I would want to be able to continue developing this, and to be able to use it for running benchmarks not necessarily in VMs, but in containers as well. And I want to add some VM management capabilities, which include being able to define VMs from an XML file and auto-provision them. And absolutely, even something like that would be more than useful; this was just the easiest way forward, a place to start from, basically.

So, this is actually currently in use, as I said, inside the performance team, and what they use it for is testing and identifying regressions that may be happening in our products. Basically, the commercial version of our product is SUSE Linux Enterprise Server, so we test different versions of the operating system and see whether, between version n and version n plus one, we are introducing regressions. They also (actually "they", because I am doing development on this, but it's not strictly my team) track whether there are regressions between mainline and our kernel, which carries all the patches that we include. And it's also used for doing kernel development and figuring out whether, as I said, changes have an impact. But the kinds of guests which are automatically tested right now are these ones, so our products, basically. Nothing prevents using others, though: it's, again, a matter of adding some virtual machine management, provisioning, whatever, at which point, if you are able to install Windows and start a Windows VM, then you can use it. Can you use it? Yeah, sure.

Sorry. The question was how we boot a specific kernel that we want to test, whether we use direct kernel boot or something else. So, this is what this -k parameter does. Again, that's where I started: my use case was not so much about testing different kernels, but about running a benchmark inside multiple virtual machines with different configurations, so for that use case I don't care; I'm fine with the virtual machine booting the kernel which was installed when I provisioned it, when I installed the operating system on it. So, if you use this option, -k, which is "keep kernel", it basically just boots the virtual machine with whatever kernel is inside the image of that virtual machine. But if you don't use that option, and use the other options for providing a kernel, then it uses direct kernel boot for booting your kernel, because you have your changes there and you want to test them.

That's it. Thanks.