Hello, everybody. My name is John Snow. I'm a software engineer with Red Hat and the ATA devices maintainer for QEMU upstream, and this is how to write a legacy storage device emulator, and maybe why you would want to.
As an overview for this presentation, I'm going to cover what QEMU is a little bit. Most of you probably know, but some of you may not. I would like to cover why legacy device emulation is important, and talk a little bit about what exactly a legacy device emulator is. I would then like to cover my favorite legacy devices, ATA and friends, and to conclude the talk I would like to do a whirlwind tour of the bits and pieces of code necessary to implement an emulator upstream.
So what is QEMU? QEMU is an open source, general purpose machine emulator and virtualizer. It supports a ton of different architectures. Conceptually it's similar to MAME or MESS, if anybody has played around with those: multiple different kinds of emulators all packaged together. But unlike MAME, which is targeted towards arcade machines, QEMU is targeted towards literally everything. So I guess QEMU is more like VMware and VirtualBox, but MAME is a lot more fun to talk about, so I'd rather do that.
What can QEMU do? It supports a variety of CPU architectures, like I said, and it can run on a ton of them. It can run under a ton of operating systems. It can emulate dozens of processors, boards, and chipsets. It supports hardware virtualization via KVM, which is available for a number of architectures. It's capable of emulating hundreds of devices, anything you want: NICs, hard drives, serial ports, USB devices, etc.
How is it used? The most common usage is virtualization. You'll be hearing a lot about that over the next couple of days. You can run, of course, x86-64 guests on x86-64 hosts to maximize your resource usage. For gamers, a lot of them have been using VFIO GPU passthrough lately.
If you check in on Reddit, you can see everybody's very excited to run their favorite Windows games at near-native speeds on Linux hosts. But emulation is also a really common usage for QEMU. You can run legacy applications and such on brand new hardware. Or you can run your favorite Mac OS 9 app on your modern machine, if you really, really want to do that. QEMU is also used a lot for debugging and development of various kernels, drivers, hardware, and so on. But it has some fun uses, too. This was actually a port of QEMU that somebody has turned into a working Xbox emulator, and I was so excited by this that I had to put that screenshot up there.
Anyhow, why would you want to emulate legacy hardware? The enterprise and maybe home entertainment have different goals or objectives with emulation. In the enterprise, we're very interested in performance, in utilizing our resources, and in having flexibility. We're very interested in devices that maybe have never physically existed, because they give us the best possible performance. But on the entertainment side, maybe it would be really nice to play TIE Fighter again, which only works on Windows 95 unless you have a ton of hacks for it. So sometimes an emulator is really useful for that, and sometimes there's a focus on legacy environments and things like that. The enterprise objective is sometimes a little different from what open source users want to see.
So what would an enterprise want with any kind of legacy hardware? Sometimes there are infrastructure applications that were written a long time ago, and then that guy retired and nobody knows what that program does, but it definitely only runs on a machine that stopped working 10 years ago. Sometimes emulation can help us out there. Then there's flexibility: we don't know what people are going to do with this. We write this code and people do surprising and upsetting things with it.
So we just want to make sure that everything works as well as possible in as many different scenarios as possible, because we can't anticipate what people are going to use things for. And sometimes you simply want a CD-ROM, even though it may not be useful for massive deployments. Sometimes you just think, ah, if I just had a floppy disk. Well, maybe that's not as common anymore.
I would like to talk a little bit about preservation of media and hardware. Flash ROMs will degrade over time. So will CD-ROMs, your peripherals, and the hardware they connect to. Everything will eventually stop working. There are a lot of organizations out there dedicated to archiving media. We have archive.org and the U.S. Library of Congress, which are working on preserving books, movies, and media. People are preserving software, primarily video games: TOSEC, MAME, No-Intro, organizations like that. But who's archiving the hardware? Us, I guess, kind of. To keep these archives useful, we would like to provide emulators that can run all of this old software that we've been hoarding for decades. Video game consoles are pretty well covered by a lot of emulators that are very narrowly focused and receive a lot of attention. Home computers are perhaps a bit underrepresented, but they're just as important as the video games, I think. Less fun to talk about, I suppose.
So what is a device emulator? When we talk about an emulator for an entire computer, it's a collection of various emulators that are focused on different aspects. The device emulator itself is a program, a module, or a plug-in that's dedicated to a single core purpose. It's kind of useless on its own. It only works in conjunction with other emulators. And it's the inverse of a device driver. The device driver tells the guest operating system how to interact with the hardware, but the device emulator is kind of doing the opposite thing.
You're allowing the computer to talk to a device driver. So if you wanted to look at, for instance, the Linux source for libata and then compare it with our emulator, they provide the complete opposite ends of that interaction.
So we want to turn devices back into ideas. They started as ideas, then people made them manifest, and now we want to turn them back into code. Where do we start with that? Common places: standard specifications. For ATA, we have ATA8 as the latest one. Sometimes there's not a standard. You might have vendor specs instead, like for floppy drives. If you are interested in looking up antique device emulation, the OSDev Wiki is a great source of information for all kinds of standards and specs and how to code for older systems. There are emulation development forums and lists. Of course, there's ours, qemu-devel@nongnu.org. And maybe the best way to figure out how to turn a device into an emulator is to examine the real device.
So this is my cube at work. This is a physical Q35 machine. I guess it's the latest PC chipset we've added to QEMU. I use this to investigate the real behavior of the CD-ROM, the floppy, and the southbridge chipset to make sure that our emulation is on point. It's not fun to use this, but there it is.
For observing a real machine, there are a few key techniques. A custom Linux kernel is really invaluable because you can observe the device behavior. Then there's the old standby of printfs, which are dumb, but they always work. You can do device passthrough on a machine that supports it; the Q35 machine there actually doesn't. But if you can pass the device through, you can use a debug build of QEMU to see what the device is doing, and there are ways that you can investigate the behavior there. You can also look at existing driver code to interpret how a device should be working.
But you have to be careful, because if you use open-source drivers as your only source of information for how to write the device emulator, any bugs that existed in that driver are now going to be bugs in your device emulator. So you should have multiple sources of information.
For specifications, there can often be multiple layers of things. Everything uses a spec from something else, and this list could probably extend indefinitely in either direction. But if you were interested in implementing a particular device, you would want to look a little bit above and a little bit below your spec to understand how it's expected to be used and how it expects to use adjacent devices. Not only are there multiple kinds of specs involved with any one device, but they often have different flavors. AHCI, for the SATA controller, is just a spec, but real devices have specific quirks. The Intel implementation on the ICH9 southbridge has all kinds of magical registers they've added for specific flavor. And QEMU mimics the real devices, not the abstraction, so we have to make sure that we're following real behavior and not imaginary behavior.
So how does this all fit together? I'm going to talk about writing just an IDE emulator without all of the other components that would be necessary to make it work, but I will show the APIs that we plug into to get that running, which hopefully keeps it simple.
I'm going to talk super, super quick about exactly what IDE is. It originally stood for Integrated Drive Electronics. It was a drive that plugged straight into the ISA bus, which kept it very simple, but time has moved on since then. Later, ATA-2 added EIDE and DMA, which started to complicate the design a little bit and started the buildup of specifications on top of specifications.
ATAPI was an extension to ATA which allows you to send SCSI packets to an ATA device, which is how you communicate with CD-ROMs, so that adds yet another layer to the emulation. SATA is a superset of ATA. It defines a couple of new commands, and it supports most of the old ones. It has a new message protocol. It was actually designed to use an external host bus adapter, unlike the original IDE, but unfortunately it's backwards compatible, so it kept all of the baggage from the original spec, which complicates matters, but it's okay. AHCI is the Advanced Host Controller Interface. The host bus adapter is, by design, no longer internal to the drive. It got rid of the direct CPU I/O access, and everything is done through PCI memory-mapped registers, more or less, so the design has evolved a bit.
As a super quick recap, we have all of these specifications that we have to be aware of when we're writing this emulator. With good object-oriented programming, hopefully you wouldn't need to worry about all of them, but real life and expectations sometimes diverge.
So why care about ATA specifically? It is kind of slow. It's inefficient. There are a ton of specs and acronyms. But the case for it: it's extremely widely supported. It's often used in bootstrapping better devices later if the drivers aren't included with your installation media, and it is modeled after real physical hardware, so it has very good support in guest operating systems.
So let's talk about how exactly you would do this. It is a bit messy, so I'm going to skip over quite a bit of detail and try to focus on some of the key portions of exactly how we would implement this in QEMU. If you are curious enough, it's always open source. You can read the code yourself. You can ask me questions later, and I could help somebody get started if they were interested. We're going to use QEMU's QDev to define the device.
It's our own implementation of our object-oriented device tree. We're going to describe what the parent bus is like so that QEMU knows what type of things to plug this device into, and we're going to define a couple of user-configurable properties for the device.
Good, it's readable. Okay, so we don't use C++. Maybe that surprises some people. Maybe other people are very excited by it. But we have our own object-oriented system for describing devices, as you can see here. I've named this device the DevConf device, but I intended it to be an IDE device. The parent is our generic device, and it inherits all of the generic device properties. We're going to describe the size of both the class and the instance. This is stuff you wouldn't normally have to worry about in C++, but welcome to C. Here as well, under devconf_register_types, we're going to register our TypeInfo structure, and then, using a bit of compiler magic, we're going to hook into the rest of QEMU so that QEMU knows about this new type we've created.
Then over here, you can see that we're actually defining the structures. I've kept them simple here. I haven't added the extra information, but you're going to describe a device, and the first member is going to be the parent device. The same goes for the class: the first member is going to be the parent class. This is how we do our manual object-oriented bits. Then in the class initialization, we're going to describe the category of the device, the name of it, and give it a human-readable description. Also during the class initialization, we're going to explain what the parent bus type is, which actually isn't that important for IDE because IDE devices are not hot-pluggable, except in Xen, I guess. But for newer devices there's hot-plugging, and the bus descriptions become very important for knowing where you can plug in these devices.
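As a rough illustration of the pattern just described, here is a toy model in plain C. It does not use QEMU's headers or its real object model: every name in it (ToyTypeInfo, ToyClass, ToyDevice, toy_type_register, toy_device_new, the "toy-ide-bus" string) is hypothetical and exists only for this sketch. It shows the three ideas from the slide: a type description carrying instance and class sizes, parent-as-first-member struct embedding, and registration from a constructor function, which relies on a GCC/Clang attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-ins for an object model; these are NOT QEMU's real types. */
typedef struct ToyClass {
    const char *desc;      /* human-readable description */
    const char *bus_type;  /* name of the bus this device plugs into */
} ToyClass;

typedef struct ToyTypeInfo {
    const char *name;
    const char *parent;    /* recorded only; this toy never walks the hierarchy */
    size_t instance_size;
    size_t class_size;
    void (*class_init)(ToyClass *klass);
} ToyTypeInfo;

typedef struct ToyDevice {
    const ToyTypeInfo *type;
    ToyClass *klass;
} ToyDevice;

/* The derived type: the parent is the FIRST member, so a pointer to the
 * derived struct can be cast to a pointer to its parent and back. */
typedef struct DevConfClass {
    ToyClass parent_class;
} DevConfClass;

typedef struct DevConfDevice {
    ToyDevice parent_obj;
    unsigned char status;  /* placeholder for real register state */
} DevConfDevice;

#define TOY_MAX_TYPES 16
static const ToyTypeInfo *toy_types[TOY_MAX_TYPES];
static ToyClass *toy_classes[TOY_MAX_TYPES];
static size_t toy_type_count;

void toy_type_register(const ToyTypeInfo *info)
{
    assert(toy_type_count < TOY_MAX_TYPES);
    toy_types[toy_type_count++] = info;
}

/* Create an instance by type name; the class is built once, on first use. */
ToyDevice *toy_device_new(const char *name)
{
    for (size_t i = 0; i < toy_type_count; i++) {
        const ToyTypeInfo *info = toy_types[i];
        if (strcmp(info->name, name) != 0) {
            continue;
        }
        if (!toy_classes[i]) {
            toy_classes[i] = calloc(1, info->class_size);
            if (!toy_classes[i]) {
                return NULL;
            }
            if (info->class_init) {
                info->class_init(toy_classes[i]);
            }
        }
        ToyDevice *dev = calloc(1, info->instance_size);
        if (dev) {
            dev->type = info;
            dev->klass = toy_classes[i];
        }
        return dev;
    }
    return NULL; /* unknown type name */
}

static void devconf_class_init(ToyClass *klass)
{
    klass->desc = "DevConf example IDE-style device";
    klass->bus_type = "toy-ide-bus";
}

static const ToyTypeInfo devconf_info = {
    .name = "devconf-device",
    .parent = "toy-device",
    .instance_size = sizeof(DevConfDevice),
    .class_size = sizeof(DevConfClass),
    .class_init = devconf_class_init,
};

/* The "compiler magic": a GCC/Clang constructor runs before main(). */
__attribute__((constructor))
static void devconf_register_types(void)
{
    toy_type_register(&devconf_info);
}
```

Calling toy_device_new("devconf-device") from main() returns a zero-initialized DevConfDevice whose class already carries the description and bus name. The sizes in the type description are what let a generic allocator build a derived object without knowing its layout, which is the part a C++ compiler would otherwise do for you.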
So, device properties. Similarly, during class initialization, we can point to our property struct, and we can define a couple of interesting things, like the version, the serial, the model, things that the real ATA device would have to report to the guest. It's just a simple structure of the names and where in the object these things are stored. We link that into the class, and then QEMU will handle a lot of the rest for us. Of course, in the device we need somewhere to store these properties, so you would put them in this structure. There's an ellipsis up there: I've omitted a lot of the actual register state, where you would begin putting the real IDE values, but I'm trying not to scare people.
So, device realization. Just like in C++, we have what are essentially our constructors. You set them in the class initialization function again. It runs at boot, and you can choose your realization function for a device. Here's where things get really hand-wavy. Here you would normally check the properties that the user provided. Did they ask for a serial number that was just complete garbage? Did they type a bus that didn't exist? You can check those things here, and then you would generally pass the initialization of the device struct itself on to a helper init function. This is your chance to initialize all kinds of register values, initial state, things like that.
So we have a basic device. We want to make it talk to stuff. The basic IDE devices use CPU I/O, port I/O, not memory-mapped I/O, which is configured by the machine instead of the device, and by convention these are the ports that we use for old IDE drives. So to set up CPU I/O, you can see that this is the ISA bus.
This is the definition for the ISA bus, and as user properties we've described iobase and iobase2 as our conventional ports. Then during the IDE bus initialization, we call ide_init_ioport with these properties as configured by the user, but they default to those values. In the initialization here, we pass this off to the isa_register_portio_list function, which is going to use these two structures, ide_portio_list and ide_portio2_list, which are defined over here. This is how we hook in when the guest operating system requests to talk to our device. We have all of these functions that we've defined: read and write for the base registers, read and write for the data register, and read and write for the status register. We describe the structure of all these read and write functions, hook them into the port I/O register list, and then QEMU will handle the rest for us. If the guest asks to talk to these ports, we will get calls to these functions, and we can write the command handlers accordingly.
So the guest can talk to us. Next, we would like to talk back to the guest, so we need to register an IRQ. Modern host bus adapters handle it for us, but IDE is old, so it doesn't. Similarly, in the ISA bus we have another property to define the IRQ. It defaults to 14, but you can configure it. Then another function here, isa_init_irq, registers our IRQ to be used. And if we want to talk out, this is the actual, real, unabridged ide_set_irq function. It's dead simple: if interrupts are not disabled, we just raise the IRQ. So now we can get messages in, and we can send messages out.
And this is the last part: transferring data, the exciting and good part of the IDE device.
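Before moving on to data transfer, the port I/O registration and interrupt logic described above can be modelled with a small toy in plain C. This is a sketch under stated assumptions, not QEMU code: ToyIoBus, ToyPortRange, ToyIde, toy_inb, toy_outb and the other toy_ names are hypothetical stand-ins for the real port list and IRQ machinery. The conventional primary-channel values (ports 0x1f0 and 0x3f6, IRQ 14), the command/status register at offset 7, and the interrupt-disable bit 0x02 in the device control register are standard for legacy IDE; everything else is simplified.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy port I/O bus; a stand-in, NOT QEMU's real portio API. */
typedef uint8_t ToyReadFn(void *opaque, uint16_t offset);
typedef void ToyWriteFn(void *opaque, uint16_t offset, uint8_t val);

typedef struct ToyPortRange {
    uint16_t start, len;
    ToyReadFn *read;
    ToyWriteFn *write;
    void *opaque;
} ToyPortRange;

#define TOY_MAX_RANGES 8
typedef struct ToyIoBus {
    ToyPortRange ranges[TOY_MAX_RANGES];
    size_t count;
} ToyIoBus;

int toy_register_portio(ToyIoBus *bus, ToyPortRange range)
{
    if (bus->count == TOY_MAX_RANGES) {
        return -1;
    }
    bus->ranges[bus->count++] = range;
    return 0;
}

static ToyPortRange *toy_find(ToyIoBus *bus, uint16_t port)
{
    for (size_t i = 0; i < bus->count; i++) {
        ToyPortRange *r = &bus->ranges[i];
        if (port >= r->start && port - r->start < r->len) {
            return r;
        }
    }
    return NULL;
}

/* What the "guest" does: byte-wide port reads and writes. */
uint8_t toy_inb(ToyIoBus *bus, uint16_t port)
{
    ToyPortRange *r = toy_find(bus, port);
    return r ? r->read(r->opaque, (uint16_t)(port - r->start)) : 0xff;
}

void toy_outb(ToyIoBus *bus, uint16_t port, uint8_t val)
{
    ToyPortRange *r = toy_find(bus, port);
    if (r) {
        r->write(r->opaque, (uint16_t)(port - r->start), val);
    }
}

enum { TOY_REG_STATUS_CMD = 7, TOY_CTRL_NIEN = 0x02, TOY_ST_DRDY = 0x40 };

typedef struct ToyIde {
    uint16_t iobase, iobase2;  /* user-configurable "properties" */
    int irq;
    uint8_t status, control, command;
    int irq_raised;
} ToyIde;

/* Raise the interrupt unless the guest has disabled interrupts. */
void toy_ide_set_irq(ToyIde *s)
{
    if (!(s->control & TOY_CTRL_NIEN)) {
        s->irq_raised = 1;
    }
}

static uint8_t toy_ide_cmd_read(void *opaque, uint16_t offset)
{
    ToyIde *s = opaque;
    if (offset != TOY_REG_STATUS_CMD) {
        return 0;
    }
    s->irq_raised = 0;  /* reading Status acknowledges the interrupt */
    return s->status;
}

static void toy_ide_cmd_write(void *opaque, uint16_t offset, uint8_t val)
{
    ToyIde *s = opaque;
    if (offset != TOY_REG_STATUS_CMD) {
        return;
    }
    s->command = val;         /* a real device would dispatch on this */
    s->status = TOY_ST_DRDY;
    toy_ide_set_irq(s);
}

static uint8_t toy_ide_ctrl_read(void *opaque, uint16_t offset)
{
    (void)offset;
    return ((ToyIde *)opaque)->status;  /* alternate status: no IRQ ack */
}

static void toy_ide_ctrl_write(void *opaque, uint16_t offset, uint8_t val)
{
    (void)offset;
    ((ToyIde *)opaque)->control = val;
}

void toy_ide_set_defaults(ToyIde *s)
{
    *s = (ToyIde){ .iobase = 0x1f0, .iobase2 = 0x3f6, .irq = 14 };
}

int toy_ide_realize(ToyIde *s, ToyIoBus *bus)
{
    ToyPortRange cmd = { s->iobase, 8, toy_ide_cmd_read, toy_ide_cmd_write, s };
    ToyPortRange ctrl = { s->iobase2, 1, toy_ide_ctrl_read, toy_ide_ctrl_write, s };
    if (toy_register_portio(bus, cmd) < 0) {
        return -1;
    }
    return toy_register_portio(bus, ctrl);
}
```

A guest-side toy_outb(&bus, 0x1f7, cmd) lands in toy_ide_cmd_write with offset 7, which is the whole point of the table: the device describes offsets and handlers, and the bus does the routing. Writing 0x02 to port 0x3f6 first suppresses the interrupt, mirroring the "if interrupts are not disabled" check mentioned in the talk.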
For this, the API we're going to use is QEMU's block layer, which you can find an infinite number of talks on if you are interested in the nuanced way that it works. It's sometimes a little difficult to follow. This is literally the chart I made for myself when I started working on the ATA device to help myself understand the flow of everything, but we don't need to worry about that. We do have simpler versions that we can use.
So for instance, here is the simplified version of what's happening with an IDE device. The guest is going to ask for data via I/O port 0x1f7 using the read command. The IDE emulator will intercept that via the port I/O list that we registered. We're going to buffer it using blk_aio_preadv, currently, which is going to read from our block layer, the actual host storage, and it will buffer that data in the IDE state for us. Then the guest is going to read that data back using CPU I/O input. In between, of course, there will be an IRQ to let the guest know that the data's ready. It's reasonably simple once you cut out all of the error checking and so on. In our emulator we're going to use QEMU's blk_aio_preadv to actually get the data from the disk. Then, once the data is done being read, the callback is going to set our device state and issue an IRQ. Okay, I've got to hurry. So it's going to read the data, and eventually the guest will read it using CPU I/O input. And writing data looks a bit like this, but I'm out of time.
So if you need any more information, please feel free to reach out to me. I will post the slides online soon. I encourage you to reach out if you are interested at all in writing device emulators. Thank you.
Oh, are we good for a question? Okay, they're giving me the out-of-time signs. I'm like, ah, crud. So does anybody have any questions that I could help you with? I know that was super, super, super quick.
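For reference, the read path from the talk (read command, asynchronous read from storage, completion callback that updates device state and raises an IRQ, then the guest draining the buffer through the data port) can be modelled with a toy in plain C. This is only a sketch: ToyBackend, toy_aio_pread and toy_aio_poll are hypothetical stand-ins for an asynchronous block layer and are not QEMU's API, and the deferred completion is simulated by an explicit poll call. The status bit values (BSY 0x80, DRDY 0x40, DRQ 0x08, ERR 0x01) are the standard ATA ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_SECTOR_SIZE 512
enum { TOY_ST_ERR = 0x01, TOY_ST_DRQ = 0x08, TOY_ST_DRDY = 0x40, TOY_ST_BSY = 0x80 };

typedef void ToyCompletionFn(void *opaque, int ret);

/* Toy asynchronous backend holding at most one pending read. */
typedef struct ToyBackend {
    const uint8_t *image;
    size_t image_len;
    int pending;
    size_t offset, len;
    uint8_t *dest;
    ToyCompletionFn *cb;
    void *opaque;
} ToyBackend;

/* Queue a read; nothing is copied until toy_aio_poll() runs. */
void toy_aio_pread(ToyBackend *be, size_t offset, uint8_t *dest, size_t len,
                   ToyCompletionFn *cb, void *opaque)
{
    assert(!be->pending);
    be->pending = 1;
    be->offset = offset;
    be->dest = dest;
    be->len = len;
    be->cb = cb;
    be->opaque = opaque;
}

/* Complete the pending read, if any, and invoke its callback. */
void toy_aio_poll(ToyBackend *be)
{
    int ret = 0;
    if (!be->pending) {
        return;
    }
    be->pending = 0;
    if (be->offset > be->image_len || be->len > be->image_len - be->offset) {
        ret = -1;  /* read past the end of the image */
    } else {
        memcpy(be->dest, be->image + be->offset, be->len);
    }
    be->cb(be->opaque, ret);
}

typedef struct ToyDisk {
    ToyBackend *backend;
    uint8_t io_buffer[TOY_SECTOR_SIZE];
    size_t data_pos, data_end;
    uint8_t status;
    int irq_raised;
} ToyDisk;

/* Completion callback: update device state, then signal the guest. */
static void toy_disk_read_complete(void *opaque, int ret)
{
    ToyDisk *d = opaque;
    if (ret < 0) {
        d->status = TOY_ST_DRDY | TOY_ST_ERR;
    } else {
        d->data_pos = 0;
        d->data_end = TOY_SECTOR_SIZE;
        d->status = TOY_ST_DRDY | TOY_ST_DRQ;
    }
    d->irq_raised = 1;
}

/* Called when the guest issues a one-sector read command. */
void toy_disk_read_sector(ToyDisk *d, uint32_t lba)
{
    d->status = TOY_ST_BSY;
    toy_aio_pread(d->backend, (size_t)lba * TOY_SECTOR_SIZE, d->io_buffer,
                  TOY_SECTOR_SIZE, toy_disk_read_complete, d);
}

/* Called for each 16-bit guest read of the data port. */
uint16_t toy_disk_data_read16(ToyDisk *d)
{
    uint16_t val;
    if (!(d->status & TOY_ST_DRQ)) {
        return 0xffff;  /* no data is ready */
    }
    val = (uint16_t)(d->io_buffer[d->data_pos] |
                     (d->io_buffer[d->data_pos + 1] << 8));
    d->data_pos += 2;
    if (d->data_pos >= d->data_end) {
        d->status = TOY_ST_DRDY;  /* buffer drained: clear DRQ */
    }
    return val;
}
```

The design point worth noticing is that toy_disk_read_sector returns immediately with the device marked busy; all of the interesting state changes happen in the callback. That split is what lets an emulator keep the guest running while host storage is slow, and it is also where most of the omitted error handling would go.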
Yes, I can, right after this. It's not fully functional, and there are some cheats that I did. The real IDE device as implemented in QEMU has all of these nasty, hairy backwards compatibility bits that we had to put into it. So for the sake of posting neat little snippets of code that fit on a presentation slide, I had to cut a lot of that stuff out. I do actually have a working example device, but there are some hacks in it, like direct casts to other types so that I didn't have to reimplement 40 other classes. But I can post the code and explain where the shortcuts were taken, and the slides will be up soon too.
Yes, we do have some... I stuck with one that I liked, but yeah, I can post that as an amendment with this if you'd like. Okay. Anybody else? We good? Okay, I think we're good. Oh, hi.
Sure, okay, so the question is that the legacy IDE I was talking about was mostly CPU I/O and not memory-mapped access. There are similar functions. What I showed was that you basically create a structure of the offsets you're interested in and the functions you would like to register with those offsets. There are analogous structures for memory-mapped access, and a lot of that is handled by the PCI device above you. The AHCI device extends the PCI device, and there are structures you can register that say: if somebody writes to this memory-mapped BAR of this PCI device, please call this function. You define your function, and from there you can define whatever structures you would like for dealing with that request. Yeah, it's not too much different. It's just that 25 minutes is shorter than I thought it was going to be.
Yes? In the floppy device. ATA does have some... I should repeat the question, sorry. The question was: did we run into any problems with the timing of the devices for these? And yes, we did, for the floppy device in particular, because the timing is a lot more stringent on those.
So there are sometimes delay functions that you have to write, and one of them caused us a CVE, which we fixed, so it's okay. ATA, I think, is a little more lenient and forgiving, because you can always just signal the interrupt when the data is ready, so the data flows a little better. The floppy spec, from what I saw of it, was a lot more like: yes, you issued this request, and then you have 400 nanoseconds, and then you are allowed to submit a thing again. The timing there is a little fussy, but for ATA it wasn't so bad. Okay, we're good.
I'd just like to ask for a little bit of your attention: if you could rate the presentations and speakers in the mobile app or through the web interface, we would appreciate any kind of feedback we could get. Thank you.