First up, I'm a senior engineer at Cofing. I work on a variety of open source projects, from Linux to, now, QEMU as well, because I absolutely love it, which is part of the reason that I'm doing this talk today. I'm releasing the slides under CC BY, so hopefully I will have a copy of this version of the slides up by the end of the conference, because I've uploaded a slightly earlier version and I've been editing during the conference. And the best part is I've managed to bring my clicky slide management device. Great to see you all here, great to be back in person.

We'll start with a quick introduction to what the subject is. The aim is a basic overview of QEMU and what it can do, and then we'll go through some examples of QEMU in real-world projects, from stuff that we've done in-house and elsewhere. QEMU is a very big project, so I'm not going to say that we will go into detail on everything.

So, a quick introduction for those of you who don't know. QEMU is a flexible open source emulator that goes from virtualisation, like, say, container-type stuff, to full system emulation of any of its supported architectures. It supports an awful lot of them: x86, both sorts of Arm, RISC-V, pretty much anything you can think of. The next important thing is that it is user-space based, so it can be run without any privileges. There are also bits for things such as acceleration, but they're not necessary; they'll just speed your system up. The codebase is GPLv2. I think some of the sub-projects may not be, but you don't need them to build QEMU yourself. It runs on a really wide range of systems, from Unix-like things such as Linux and BSD all the way through to Apple or Windows. Most of the time your average distribution will ship it by default. A brief bit of history: the 0.1 release was, I believe, on the 23rd of March 2003, so it's been around for a long time.
I think at the time of writing we were up to v7.1.0, and my basic code metrics, which are basically a git log and a couple of bits of processing, said there were around 2,000 contributors, ranging from one or two patches up to, I believe, at least one with 9,000. So the codebase is quite diverse and well supported.

People may have used QEMU in what we call user-mode emulation. We'll just touch on this with a quick single slide. Here we're not creating a virtual machine, we're running foreign code on our host system. So QEMU is doing CPU code translation as well as system call and signal translation. Your host operating system is providing all the resources, and it looks basically like you're translating just the code. This example, I don't know if you can see it, is integrating with a chroot. I'm basically saying I've got an AMD64 laptop; I can chroot into a Debian ARM64 chroot and run the same command again, and I suddenly look like I'm on an ARM64 system. You can do something very similar with binfmt, so that you can just run foreign binaries on your native system as if they were native code. It's very useful for things like OS bootstrap, but we're not going to talk too much about it today.

The more interesting point is complete system emulation. In this case we're not only emulating a completely different CPU, we can also create a fake memory bus and devices. So we can go from something with a normal memory model, like we might have on our laptop, to emulating even a NUMA system, which is definitely not what we generally have on our laptop. We can do different boot flows, which we'll talk about later, from direct software loading to a BIOS style. You can suspend and migrate machines, but we're not going to talk about that because it's a big topic on its own. It even has a Tiny Code Generator, so that it has its own method of accelerating supported systems.
Not all systems are supported, but again, we're not going to touch on that because it's quite a large subject on its own.

So there's a bit more on the introduction. QEMU emulates pretty much everything your normal machine will do: input, display, character devices, audio, block, networking. It even provides methods of passing secret data in if you need to do that. Some of these subsystems have multiple connection options. For instance, your display can be shown as a window or it can be exported over something like VNC.

There's also a variety of bus emulation, from simple bus systems like memory-mapped I/O to more complex systems like PCIe or USB. In the case of complex buses like USB, QEMU provides core system management and core configuration, so that you can configure and create devices from the command line. And there's a whole other list of buses, like SCSI and I2C, that you might find in embedded machines, but this is not going to be an exhaustive list. It's illustrating that pretty much every modern bus is available. There is a notable exception at the moment: QEMU does not have I3C emulation, which is slightly annoying, but there are not a lot of systems out there with I3C yet.

The more interesting bit of the emulation is device emulation. QEMU has what it calls device models, because "drivers" didn't quite sound right. These models are either built in or can be extended externally. Now, QEMU does not strive to be 100% accurate, and we will run into this a little bit later. The models also may not be bug free. A lot of them are well used, but some are only used by one or two machines, and it gets difficult in some cases. There are also too many to list here, because a lot of devices are supported. We'll talk a little bit about how the machine model attaches the standard devices next, but you can also attach devices via either the config or the command line.
So if you're booting an x86 system and you want an extra USB device, you can just throw it on the command line and pass it in. Also, a lot of devices, and things like CPU cores, are hot-plug capable. So, as long as you have them defined, you can take them in and out of service.

I hope you can all read that. It's a quick example of launching a QEMU system emulation on, well, I just do not love the AArch64 term. It's ARM64, get over it. In this case, we're going to launch an ARM64 virtual machine. It's the standard virtual machine with no graphics, two gig of memory and two CPU cores. In the next few lines we give it a kernel, an initial RAM disk and kernel arguments. Then in the last three lines we tell QEMU that we want USB. We first need a device to say what our USB controller is, so that's the qemu-xhci. We create a drive called "usbstick", because that's what we want. We created the image earlier; it can have whatever you like on it. And then we pass that into QEMU's USB storage by having the last -device option that says usb-storage and the drive. I'm not going to show you that working because I didn't actually get it sorted. It's an exercise for you later, right?

Often you'll be working with a standard machine model; in the previous example that was the "virt" standard virtual machine. The machine model basically attaches your CPUs and standard devices and makes sure they have the right I/O resources, interrupts and memory maps. In some cases it might just create black holes, because the software you're emulating might not need everything your real machine has. The model may also specify what boot methods you're allowed to use. I'll talk a little bit about boot methods and boot flow in a bit. We'll go very quickly over networking.
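The launch described on that slide can be written down roughly as follows. This is only a sketch of the command line as the talk describes it, not the slide itself: the file names, the `console=ttyAMA0` kernel argument and the `-cpu cortex-a57` choice are assumptions added here so the example is complete.

```python
import shlex


def build_virt_usb_command(kernel, initrd, usb_image, append="console=ttyAMA0"):
    """Return the argv for an ARM64 'virt' machine with a USB stick attached.

    The -cpu value and the default kernel arguments are assumptions for
    illustration; the talk only mentions the machine, memory, cores,
    kernel, initial RAM disk and the three USB-related options.
    """
    return [
        "qemu-system-aarch64",
        "-machine", "virt",          # the standard virtual machine model
        "-cpu", "cortex-a57",        # assumed 64-bit CPU model
        "-nographic",                # no graphics, serial console only
        "-m", "2G",                  # two gig of memory
        "-smp", "2",                 # two CPU cores
        "-kernel", kernel,
        "-initrd", initrd,
        "-append", append,
        "-device", "qemu-xhci",      # the USB controller
        "-drive", f"if=none,id=usbstick,format=raw,file={usb_image}",
        "-device", "usb-storage,drive=usbstick",  # storage backed by the drive
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(build_virt_usb_command("Image", "initrd.img", "usbstick.img")))
```

Building the command as a list rather than one long string keeps quoting out of the picture and makes it easy for a test script to append extra `-device` options.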
QEMU has a virtual network system which is connected to a host backend. That can either use a real network device, say using a real device interface, or it can emulate you having a network via the socket interface. It has the usual network protocols like IPv4 and IPv6. You can do raw Ethernet if you're using a real device. And there is also some CAN networking if you need to do vehicle work.

Boot flows are very useful if you want to change the way your machine boots. These boot flows change depending on your architecture and system type, because PCs do PC BIOS, and a lot of embedded systems have some weird system that the manufacturer thought was a good idea at the time. The simplest boot flow, as we saw previously, is that you just give QEMU a kernel. It doesn't have to be a kernel; it could be some other generic code to start. QEMU will just start that, maybe with entry parameters for Linux or whatever. So you can do a very simple boot, or you can invoke higher-level boot code. On some systems you might want to run something like Arm TrustZone firmware, or OpenSBI on RISC-V. You can load those instead of the BIOS and then tell them that the kernel is available. And if you really happen to love slow booting, you can completely emulate the boot flow. On the PC that would be a BIOS. On an embedded device that might be a boot ROM. You can just pass an emulated device in, and the machine model will emulate loading the first stage from whatever device and get it going. So you can then, say, test booting the U-Boot main code, or even the U-Boot SPL on some boards. It's slow, but it's doable. And if that's what you need to do, it's really useful to be able to test that flow without having to use a real machine.

Now, obviously it wouldn't be useful in real-world cases if there was no way to control QEMU. There is the QEMU Machine Protocol (QMP), which is a JSON-based system provided over a socket interface, either over a network or a standard local socket.
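As a rough illustration of what talking QMP looks like, here is a minimal sketch of building the JSON messages a script would send. It assumes QEMU was started with something like `-S -qmp unix:/tmp/qmp.sock,server=on,wait=off` (the socket path is hypothetical, and older QEMU versions spell the options `server,nowait`). The helper names are made up for this example; `qmp_capabilities`, `query-status` and `cont` are standard QMP commands.

```python
import json


def qmp_command(name, arguments=None):
    """Encode one QMP command as a line of JSON ready to write to the socket."""
    message = {"execute": name}
    if arguments:
        message["arguments"] = arguments
    return (json.dumps(message) + "\r\n").encode("utf-8")


def resume_script():
    """Commands a script might send to a VM that was started paused with -S.

    QMP requires 'qmp_capabilities' first, after the server's greeting;
    'query-status' reports whether the VM is running and 'cont' resumes it.
    """
    return [
        qmp_command("qmp_capabilities"),
        qmp_command("query-status"),
        qmp_command("cont"),
    ]


def is_success(reply_line):
    """Return True if a reply line is a success ('return') rather than an 'error'."""
    return "return" in json.loads(reply_line)


if __name__ == "__main__":
    for line in resume_script():
        print(line.decode("utf-8").strip())
```

A real client would also have to read the greeting and skip asynchronous `event` messages between replies; that is left out here to keep the sketch short.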
QMP allows you to query pretty much everything that QEMU is currently doing. There's an internal object model that you can look at and then say, OK, I'll change that, or I need to know that. It's very big and well documented. There's also a D-Bus interface, because why not have several control methods. This can do pretty much everything that QMP can, and has a method for migration. If you're the sort of person who just likes to run up a console, then QEMU has a console. You can interrupt the console with Ctrl-A, go in, and run commands just as you would with the QMP interface.

A quick note about debugging, because everybody will need to do this at some point. Not only can you debug QEMU with the standard tools, you can also debug into your emulated system by asking QEMU to create a GDB server. So you can just use GDB to connect to your QEMU instance, stop the virtual core or cores, and ask what they've done. There is also an internal tracing system, if you are doing development of QEMU itself or trying to find out what, if anything, you did wrong with your machine model. There's a command-line trace option, and I think it also integrates with other things if you need it to, but the built-in events can be enabled. So there's an example here: we can trace all the m25p80 events for the flash driver. There's so much output I didn't really think it was worth showing here. If you really want lots of output, you can ask it to trace the memory system operations, and you will get a lot of output because it will show you every read and write that the virtual machine does. You'll get megabytes before you've even got out of the boot ROM.

So that was a brief overview. We haven't gone into the complete depths of QEMU, but hopefully that shows you what it can do. Let's go through to the actual projects now.
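The debugging and tracing options mentioned above can be sketched as extra command-line arguments. This is an illustration under assumptions rather than the slide's exact content: the trace patterns `m25p80_*` and `memory_region_ops_*` are believed to match event names in current QEMU sources, but check the `trace-events` files for your version, and the helper names are invented for this example.

```python
def debug_args(gdb_port=1234, wait_for_gdb=True, trace_patterns=()):
    """Return extra QEMU arguments for a GDB server and optional tracing.

    '-gdb tcp::PORT' opens the GDB server, '-S' keeps the CPUs stopped
    until a debugger (or the monitor) continues them, and each '-trace'
    option enables the trace events matching a pattern.
    """
    args = ["-gdb", f"tcp::{gdb_port}"]
    if wait_for_gdb:
        args.append("-S")
    for pattern in trace_patterns:
        args += ["-trace", pattern]
    return args


def gdb_connect_command(gdb_port=1234, host="localhost"):
    """Return the command to type into GDB to attach to the emulated machine."""
    return f"target remote {host}:{gdb_port}"


if __name__ == "__main__":
    # Flash driver events only; add "memory_region_ops_*" for every
    # guest read and write, which produces a very large amount of output.
    print(debug_args(trace_patterns=["m25p80_*"]))
    print(gdb_connect_command())
```

Keeping the trace patterns as a parameter makes it easy to start narrow, with one device model, and only widen to memory operations when that is really needed.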
So the first project I've been involved with was testing the Linux kernel x86 entry code, which we wanted to modify for certain secret projects that the customer wanted to do. Firstly, we can use QEMU to emulate a higher privilege level than we have from our command line. The kernel code obviously needs to execute at a higher privilege, and we can do that just by running it under its own completely isolated machine, so we can test the custom kernel. Obviously we don't want to leak a lot of our host's information into this test environment, so we must be able to control the CPU features that the virtual system can see. That is very useful, because when you're debugging this sort of low-level kernel code it's really not very easy to do on real hardware. I don't know if anybody has tried to debug a standard x86 PC board, but they don't really give you a lot of good debug features. Also, if you're having to reboot your kernel often because you messed something up, real-world x86 boards can take minutes, and that's not much fun when you're sitting around waiting for your, say, Intel Atom to reboot because you need to do another test run. So QEMU here gives us a very useful, scriptable, easily controllable, easily monitorable system that we can use to test. Even better, we can script these tests and we can parallelise them if we need to.

This is probably something that people are going to be more interested in. In the next few slides we'll look at integrating QEMU into your CI system. The first real-world example, which again you can go and look at yourself from the slides, is that SUSE use QEMU in their test system using openQA. Of course there are other test environments you can use if you want to, but openQA does seem to be the popular one at the moment. This allows them to test releases straight out of their CI pipeline: run an installation, a boot flow, and test that the actual system comes up and does what it's meant to.
So for instance they'll boot a media player and check that the media player actually works. And they can do this without even installing it on a real machine, which makes it really quite fast to turn around and very scalable, which is something that we have found very useful and will talk about in a slide or two. Again, we've taken something very similar in-house and applied it to customer projects, as well as doing LAVA and more KernelCI work. It's very flexible and you can write your own scripts around it.

Several of our customers didn't think about any sort of virtualised testing. They were very much of the view that you get your software, you put it on a USB stick, you stick it into a real piece of hardware, and you have employees sitting in a lab whose job it is to stick this into a rig, get it programmed and test it. Well, that's fine if you like employing test people, but it doesn't scale. It also means that you can only really test software releases. You might test a release a day, and you might have several branches merged into this release, so you don't know what was in it. Well, you hope you know what was in it. You haven't tested it, though.
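The scripted, parallel test runs mentioned earlier could be driven by something as small as the following. This is a hypothetical sketch, not the in-house tooling: it assumes each job is a command (for example a wrapper script that boots QEMU and checks the guest) that exits with status 0 on success, and the function names are invented for illustration.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_one(name, argv, timeout=300):
    """Run one test command; report (name, passed). A timeout counts as a failure."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
        return name, proc.returncode == 0
    except subprocess.TimeoutExpired:
        return name, False


def run_parallel(jobs, workers=4, timeout=300):
    """Run a dict of {name: argv} test commands in parallel and collect the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(run_one, name, argv, timeout) for name, argv in jobs.items()]
        return dict(future.result() for future in futures)


if __name__ == "__main__":
    # Stand-in jobs; real ones would be wrappers around a QEMU command line.
    demo = {
        "boots": [sys.executable, "-c", "raise SystemExit(0)"],
        "media-player": [sys.executable, "-c", "raise SystemExit(1)"],
    }
    print(run_parallel(demo, workers=2, timeout=60))
```

Threads are enough here because each worker only waits on a child process; the emulators themselves run as separate processes.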
So by taking the testing virtual, we can now not only test branches, we can actually get down to commit-level testing. It's not going to be a 100% test, but you'll get much quicker feedback, and this feedback allows you to identify problems much sooner. It's also a lot easier to scale a software system than a hardware one. Your customer may have only made 20 test units. You might have been given five. You hope that UPS didn't lose three of them in the post for a couple of weeks, not that that's ever happened. It costs a lot of money to make a hardware rig. With software, assuming you're not paying AWS millions of pounds, you can scale your testing system a lot quicker. You've also now freed up your test people to do the more important tests that you can only do on your hardware. We'll run into this a little bit in the next slide.

However, I'm just going to give some examples of tests that do work very well in a virtual environment. Say, insertion of a USB stick, like we saw earlier. A lot of use cases in our customers' projects are: stick a USB stick in, which probably has some media on it that the customer wants to play. We can test, on inserting it, does the media player turn up? Is the menu system correct? Does it draw things in the right order? You can use openQA in your scripts to do all this sort of testing. Other instances: a lot of these systems have input stimulus which comes from elsewhere. This might be USB, serial, network or CAN. We can take those inputs from our scripts, feed them into the virtual machine and check that the system actually does what it was supposed to do. Again, it's very useful if these tests are repeatable, in case somebody comes along and changes something. We have had at least one instance where a designer came in and decided that those fonts are the wrong fonts, and we don't like that colour scheme, we'll change it. It broke the media player, because now the fonts don't render properly or the sort order has changed, although why
you care about a sort order on a media player is slightly beyond me. But then I wonder why people keep putting all this software in cars, when they quite clearly want to pay us to help them with that, so I can't complain. More software in cars means more revenue.

Some of the experience from these testing changes is that not only are there improvements, there are issues. Firstly, we can't make a perfect test, and there's probably not a good customer case for making these tests perfect, because really either you can't emulate the latest iPhone completely, or it's going to cost you far too much money to say to Apple, "Can we have a virtual iPhone, please?" There are some inaccuracies in the emulation, and we may not be able to do all the test cases or test all the hardware, but as I said, we will get close.

As an example, one of our developers managed to break systemd. This virtual test environment makes for a much easier diagnostic flow, when you can see and debug into the system without having to go to a real rig and make sure you've connected it all up properly. You just look into the system on your virtual setup and see what happened. And, as I say, because we're now doing closer to commit-level checks, this can be found much faster than, say, messing up five or six testers' day by giving them software that was broken from the base.

Performance: well, we're not going to get the same performance as the real hardware. You can put some very powerful x86 boxes in there, but it's probably still never going to get to the quality of the system that you want. In our experience, though, CPU power is generally not the problem, especially when we have things like graphics pass-through with virtio-gpu and GL. They're efficient and well supported, and we found that Wayland and Mesa support this out of the box as a first-class citizen. That mitigates a lot of the CPU issues; we can just do the graphics by substitution. Actually, just to note,
you will probably have to substitute the odd device when you're doing these sorts of tests. You may not be able to get exactly your customer's hardware into this virtual environment. Generally you can get away with substituting an approximation that is close enough for your cases.

One of my colleagues, while working on this system for a customer, found an interesting edge case. All of these boards have multi-monitor systems, unlike your PC, which is generally single monitor. We have an issue that multi-monitor systems don't work at boot. The guest only gets told about the graphics card's displays when it starts up, and that is only done for the first display, so you only get the first head when the system boots. Which is great, except if your Linux kernel wants to go away and initialise two other heads that you're going to use very quickly. We ran into a few issues here. The documentation wasn't overly helpful about this; it's obviously a case that people haven't quite thought about. And the source code can be confusing in places: we found that "monitor", "head", "display" and those sorts of terms were used fairly interchangeably, which was a bit annoying. We haven't quite fixed this properly yet, but our workaround was to update the source to allow direct connection of heads. We start the QEMU instance paused, our scripts go in and connect the two more heads that we need, and then we just let it go, and the system has all the heads connected as we expect. We'll look at trying to see if we can fix this better at some point, but it works for us.

The last project I'm going to look at, which is another one that I've been involved in, is that as a software house we might be asked to emulate some new hardware. Our first problem is that because we are getting software ready for something
that's going to be, say, released next year, there's no silicon, because nobody's made the silicon yet, because they haven't finished their shiny, very expensive FPGA emulation. As a side note, if you are employing people to do this for you, it might be a good idea to make sure that when the employment contracts are done and started, the access contract is done at the same time. And not to say, "Oh no, we didn't think about getting you access to those, we need to speak to our lawyers now. We didn't buy you that expensive $80,000 FPGA kit, we're going to have to wait for six weeks." So that was great. So what do we do? We start using QEMU, and we'll go through a little bit about what that gets us and how we dealt with it.

Our first problem: not all the IP that goes around this new system might be modelled. We have about three choices, and a caveat that's still on this slide that should have been edited out. First, we can spend the time and add device models. It's well documented, and the interfaces are all there. I think on average it took us about 7 or 8 days per IP core to get something that was good enough. I say good enough, not complete; we'll talk about that later. Second, you can replace it with something else. If it turns out that you don't need to model everything in your system, say the flash chip, replace it with something else; there's probably a close enough device model. Or third, maybe you just don't need it. If you want to boot a system, you may not need your trusted platform module at start-up. You may not need a PCIe bridge, because you may not be booting over that. These notes also apply to your CPU cores. It might be that your vendor wants to try some shiny new CPU core. Well, there might be something close enough in QEMU, and you don't care if it doesn't quite work or doesn't have every feature you want, because you can charge them for adding some new features later, hopefully.

For our process, we're going to start with adding a machine model. This gives us the binding for all
our devices, our CPU cores and our memory. In the case of the system that I'm working on, I think we're talking about a thousand lines of code, about 500 of which I copied from a previous system, because vendors tend to avoid being too original when they are moving along. We have things like our bootstrapping options, and we can emulate pretty much all of those if we need to. We might not want to do that at the start, but you can.

Not only are we going to have to add more IP models, we may also run into bugs in the existing IP. I would say some device models are not well tested. They may only be used by one or two machines, and people apparently don't run them very often, otherwise they would have found bugs like the fact that they haven't worked in Linux for nearly two years. I'm going to try and fix some of these at some point. We will talk in a bit about some of the problems you run into when you start trying to do that.

So, the advantages. We've reduced access contention on the very expensive single FPGA system. We'll have a machine model ready for the vendor when they've got their boards together, whenever that happens. I don't know when this is going to be released, so I can't talk about exactly what I'm doing or who I'm doing it for, but I can talk in general about what we've been doing. So we'll have a QEMU machine model ready for the hardware release, which is great. Apparently the vendor did not think about that and went, "Ooh, that was a bonus, thank you." But we've also still not had actual real silicon yet, so we've been sitting on stuff we've had ready for a month or so now, and the vendor is going, "Well, mmm, yeah, we should have thought about that."

We've also talked about some of the disadvantages. Accuracy: again, we can't get 100% accurate. We'll never be cycle accurate, and some of the internal states may not have been thought about. Also upstreamability: fixing existing IP models is all very well and good, but if
you're writing new ones, and we've written three or four new ones at this point, upstream don't want to look at taking these until they've got a machine model that uses them. And we can't release the machine model until we've got a release date for whatever we're doing, and of course, given all the delays, we don't have a release date, which is a bit of a bummer. We might see if we can work around this at some point, but that's going to be talked about in the future. This is also effort that the client may not want to put in. A lot of these companies will go, "Well, here's a GitHub repo with our QEMU in it." That's all very well and good, but it will often just rot away, which is sad.

The experience that we've had so far: working at both ends of the driver stack, from Linux drivers all the way down to the QEMU device models, is not always great. If anybody here has had to deal with manufacturer data sheets, they're not always accurate. You will also possibly make mistakes due to duplication of understanding: even if there are two people doing the code, they may well have made the same mistake. Again, the core may not cover all cases. When we were looking at the Unmatched board, the RISC-V atomic memory operations do not work on all memory types in the real world. That was fun, because they work perfectly everywhere on the emulation. So you need to pick the closest core to your use case, and 90% of the time it's going to work. We've also talked about timing accuracy: things like SPI bus timings and internal hardware states are not going to be emulated properly.

As an example, we've been working on a new device model for a CAN controller. It turns out that some of the tests do not work. In fact, they fail on the real hardware in just the same way. But it's like, where is this problem? Is it QEMU? Did we write a device model badly?
No, Linux sucks, because it decides to check the receive case before it checks the transmit case, so you get a warning that you received stuff before you sent it. Software testing is not fun sometimes. I would have loved to have dug deeper into some of the issues we've had.

Conclusions. It's not totally accurate, but it's generally close enough. You're going to get an improvement by doing this. You're not going to solve all your problems, but you'll probably get good coverage. It's much easier to scale software. People are falling over themselves to sell you some sort of cloud computing that you can shove this into and have it work, which is generally going to be cheaper than actual real hardware. And the scriptability is great: it means that you get a reliably repeatable set of tests. You don't have to worry about whichever human was doing the tests and whether they made a mistake or not.

So thank you very much. This has actually run slightly longer than I was expecting, possibly because I've slowed down, because I'm losing my voice from having done too many talk practices. I can recommend preparing in advance and not having your colleagues drop stuff on you at the last minute that you have to spend all weekend on. I think there is a short time for some questions, if anybody has any. I don't know if there are any online questions; I think the online sessions are lagging by about half a minute to a minute. But it's been great being back here in person. I will upload new versions of the slides later. I should probably upload a version of the speaker notes as well.

Yes, Sam. So the question is, what are good resources for getting started? For me, I started by looking at the Unmatched board. I had one of those in real life. I just looked at the QEMU documentation on how that starts, built a Buildroot image and a kernel for it, and used it to do some testing on my own, so that I could prove that I could do it for a fairly simple, known case. And the examples on the Unmatched
were fairly good. They showed all the bootstrapping options you could use. So not only could you just go through the kernel boot and boot into an initial RAM disk, it then went into attaching a real device, and then emulating the boot flow of actually having the system load U-Boot from a real device, which is pretty much most of what you want to do. So if you want to make any changes to U-Boot, you can use that to build a new U-Boot image and test it from that level. Other than that, I think there is generally a lot of good documentation around QEMU itself.

We're getting fairly close to the end, I think. So thank you very much for coming. I will be around the conference until Friday. Please don't ask me any really technical questions, but I'm sure that we're in... Thank you.