All right. Well, it is time to start. Our next talk is "Improving Guest Management via the QEMU Guest Agent", and Michael Roth from the IBM Linux Technology Center will be giving the talk. So thank you, Michael.

Thank you. So the QEMU guest agent, in a nutshell, is a system-level daemon. It runs inside the guest, and it executes commands inside the guest on behalf of the host. All of the communication between the host and the guest is done over a paravirtual communication channel, so there's no networking required, and that's actually an important point, because you might ask: why not just use SSH? You run an SSH server in the guest, you connect to it from the host, and you can run any commands that you want inside the guest. But the problem with a network-based guest agent is that it causes network isolation issues. You generally don't want your guests to have access to your host management network, because when you do that, all of the services that you have running on the host, or in that same network, are now potentially exploitable by guests. So it's an additional attack vector, and that's the main issue from a host perspective.

From the guest perspective, just the notion of asking users to run a network-based service that provides root access to clients from outside the network is kind of a hard sell. You might have confidence in the security of your host network, but from the perspective of somebody that's just using your host to run their guests, providing root access over the network is a big deal. Maybe your network isn't so secure, and any other guests running in that network or on that node could potentially exploit your guests. So it's a security concern, and because of that, a non-network-based solution is the way to go.

So here's just an overview of how all the pieces fit together. We have QEMU. It's executing the guests.
Within the guests, we have the QEMU guest agent running. QEMU also provides the QMP management interface, so when node-level management wants to do management tasks, it will generally communicate with QMP. QMP provides interfaces for doing things like attaching disks, rebooting guests, etc. With qemu-ga, there's now an additional interface that libvirt or other node-level management tools can connect to, to extend the types of management capabilities that are available.

We also have "future QEMU" here, and the idea there is to take the qemu-ga client and build that directly into QEMU, such that we can expose all of the guest agent commands over the existing QMP management interface. By doing that we can greatly reduce the complexity behind managing multiple guests at the node level, and that also lets us use the guest agent from within QEMU, which opens up some potential use cases.

Now, the communication protocol is JSON-based RPC. Commands are sent to the guest in the form of a JSON-encoded request, and responses are received in the form of a JSON-encoded response. This is actually the exact same protocol that QEMU's QMP management interface uses, and we have a side-by-side here for comparison. It was done that way on purpose, because existing tools like libvirt already know the QEMU management protocol, so it's easy to plug in to the guest agent if we stick to using that.

Here's a list of the supported commands. Most of the commands are also supported on Windows. Notable exceptions would be the file access interfaces.
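
As an illustration of the JSON request and response shape described above, here is a minimal Python sketch of encoding a command and decoding a reply. The helper names encode_request and decode_response are made up for this example, and treating each message as one JSON object terminated by a newline is a simplifying assumption, not something stated in the talk.

```python
import json


def encode_request(command, arguments=None):
    """Serialize one guest agent request as a single line of JSON."""
    request = {"execute": command}
    if arguments:
        request["arguments"] = arguments
    return json.dumps(request) + "\n"


def decode_response(line):
    """Return the "return" payload, or raise if the reply carries an "error"."""
    response = json.loads(line)
    if "error" in response:
        # QMP-style errors are objects; "desc" is used here if it is present.
        raise RuntimeError(response["error"].get("desc", "unknown error"))
    return response["return"]
```

Because the shape matches QMP, a tool that already builds {"execute": ...} requests for QEMU could plausibly reuse the same encoder for the guest agent and only change which channel it writes to.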
I do have some experimental patches that add file access support on Windows; we just need to get those patches upstream, but that support should be forthcoming. Some of these other interfaces may not be so important in terms of Windows support, but things like fstrim possibly are.

So, basic usage. On the host, the only thing you really need to configure is the communication channel that the guest agent is going to use. The -chardev argument is basically how you configure how that communication channel is exposed on the host. In this case, we're saying that we want to create a Unix socket at /tmp/qga.sock, and in doing so you can connect to the guest agent just by connecting to that Unix socket on the host. All the other bits are how we expose the communication channel inside the guest. The most important parameter here is name=org.qemu.guest_agent.0, which is an identifier that some guests will use to determine where exactly to create the character device that represents the communication channel on the guest side, that is, what path that character device should be set to. In the case of Fedora, with this argument you'll end up with /dev/virtio-ports/org.qemu.guest_agent.0, and inside the guest you basically just point the guest agent to that communication channel.
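
The host-side and guest-side setup described above can be sketched in Python as two argument lists, plus a small client for the Unix socket. The chardev id qga0, the helper names, and the newline-terminated framing are assumptions for illustration. The server,nowait spelling follows older QEMU releases; newer ones may expect server=on,wait=off, so check the documentation for your version.

```python
import json
import socket

PORT_NAME = "org.qemu.guest_agent.0"


def host_qemu_args(socket_path="/tmp/qga.sock", chardev_id="qga0"):
    """QEMU arguments exposing the agent channel as a Unix socket on the host."""
    return [
        "-chardev", f"socket,path={socket_path},server,nowait,id={chardev_id}",
        "-device", "virtio-serial",
        "-device", f"virtserialport,chardev={chardev_id},name={PORT_NAME}",
    ]


def guest_agent_command():
    """Command run inside the guest, pointing qemu-ga at the virtio-serial port."""
    return ["qemu-ga", "-m", "virtio-serial", "-p", f"/dev/virtio-ports/{PORT_NAME}"]


def connect(socket_path="/tmp/qga.sock", timeout=5.0):
    """Open the host-side socket; the timeout guards against a silent agent."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    sock.connect(socket_path)
    return sock


def agent_call(sock, command, arguments=None):
    """Send one command over a connected socket and return the decoded reply."""
    request = {"execute": command}
    if arguments:
        request["arguments"] = arguments
    sock.sendall(json.dumps(request).encode("utf-8") + b"\n")
    reply = bytearray()
    while not reply.endswith(b"\n"):  # assumed framing: one JSON object per line
        chunk = sock.recv(1)
        if not chunk:
            raise ConnectionError("agent channel closed before a full reply")
        reply += chunk
    return json.loads(reply.decode("utf-8"))
```

With a guest started using host_qemu_args() and the agent running inside it, agent_call(connect(), "guest-ping") would be the simplest round trip to try.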
From there, that's all you need to do to begin using it.

Now, the guest agent binary is available on most Linux distributions, mainly because it ships as part of the QEMU source code. So any distribution that has a QEMU package will also have the guest agent there in some form. For RHEL-based distros there's actually a separate package for the guest agent, which is kind of nice, because generally it doesn't make much sense to install the QEMU package inside of a guest. I mean, you can do that and you could use it, but it'll be pretty slow. Generally, inside the guest is where you want to use the guest agent, and on the host is where you want to use QEMU, so it's nice to have the separate packages for RHEL. But the important thing is that the packages are readily available. You don't need to put together a custom package to get most of your guest images over to using the guest agent.

So why another guest agent? There are other guest agents out there, and there are even other guest agents that communicate over a paravirtual communication channel. The oVirt guest agent does that. OpenStack, I'm not too sure if they use a paravirtual channel or not, but I would assume that's a potential thing that would be added in the future. VMware may seem kind of odd to mention here, but they do actually have an open-source guest agent. It also communicates over a paravirtual communication channel, via VMCI sockets, and those were actually recently added to the Linux kernel.
So we could theoretically add support for VMCI to QEMU and use the VMware guest agent. But there are two main reasons why we did a QEMU guest agent. The first is that we're going to end up needing one either way for some of the more advanced usability features that are offered by solutions like VMware and VirtualBox. Things like clipboard synchronization, where if you copy a block of text inside of a guest and then you want to paste that into an editor you have open on the host, that type of functionality requires a guest agent. Things like smart desktop resize, where if you resize your guest window, instead of having all the icons and text being really small, you can change the resolution so that everything's still legible. These are all things that are supported by a lot of competitors to QEMU, and ideally we'd want to close that gap eventually, and to do that we'd need a guest agent. For that type of functionality you need tight coupling between QEMU and a guest agent.
It doesn't really make sense to rely on something like the oVirt guest agent, and to have to install this huge management stack on top of QEMU, just so that you could support small, common use cases.

Which brings us to the next part, which is some use cases that qemu-ga provides, that guest agents in general can provide, that aren't currently available.

All right, so the first use case is live image snapshots. Now, a guest disk image is basically just a file on the host. Snapshots are just point-in-time backups of those files, and all of the common use cases for backups also apply in the context of virtualization. If you have a guest that was running software, and it had an issue where it ended up wiping out a bunch of data and you need to recover from that, you could use a snapshot to recover to an earlier state. On the hardware side of things, to address things like hardware failures, you can create snapshots and store those on multiple storage nodes so that you have some level of redundancy. So even if the drive goes bad, you could still recover the guest.

That type of functionality is easy to provide if we shut the guest down beforehand, because then to make a snapshot you just need to make a copy of the guest disk image on the host. But we generally want to do this while the guest is running, while it's live, and that's actually a very common use case. Because of that, QEMU has a pretty rich set of interfaces to handle live snapshots. These are all commands that are available through the QMP management interface that I mentioned earlier.

Just a brief overview. There's blockdev-snapshot-sync, which is a point-in-time snapshot of the guest image. One thing to note there is that that command is synchronous. So technically, when you run this command the guest is still live.
It's still running. But if the guest ends up getting into a state where it needs QEMU to do some work on its behalf, it could end up blocking while this command completes. So even though the guest is still technically running, if you have an I/O-intensive workload you could still get downtime running that command. To address that, the drive-backup interface was recently added. It basically does the exact same thing, but it does all the work in the background so that we don't block guest I/O while we're creating that snapshot.

drive-mirror mirrors all the writes that a guest makes to its image to one or more additional images, and that basically maps to the notion of a continuous snapshot where, instead of having a point in time that you can revert to, every time the image changes you take another snapshot. So if you ever have a failure on one snapshot you can fail over to the other one and not have any data loss.

I mentioned redundancy earlier, and you can achieve that through these interfaces if you use a network-based file system, so you can expose multiple storage nodes to a host and you could save those snapshots to multiple storage nodes to get that type of redundancy. There are also things like the network block device protocol that you can use to point QEMU to a remote image that's not on the local node, so you can get redundancy there.

But all of these interfaces suffer from a very common issue when it comes to backup solutions, and that's data inconsistency. On Linux and Windows there's an in-memory write-back cache for disk reads and writes. Generally, when a process writes to disk, that write doesn't actually go straight to disk; it gets written to memory. That's nice, because if you have a process that's doing a bunch of writes, it doesn't have to sit there and wait for those writes to complete before it goes off and does other work.
So it just writes straight to memory, and then at some opportune point in time in the future, that's when we actually sync it to disk. That could be during cache eviction: if we run out of room in the cache and we need to remove some data from there to make room for new data, we'll sync that to disk. You can configure time intervals, where if data has been sitting in cache for more than a certain amount of time the operating system will automatically flush that to disk. And you could also control that on a per-process basis, where if a process wants its data written to disk, it can tell the operating system to do that explicitly with the sync call.

But regardless of that, there are points in time where data is sitting in guest memory but hasn't been committed to disk, and because of that you can get data loss and corruption when you do your live snapshots. If you have, say, 10 megabytes of dirty pages sitting in the cache and you do a live snapshot of the guest image, then if you end up having to restore, that's 10 megabytes that you lose. Depending on where that data was originally going to be written, it could cause lots of problems or you may not notice it, but you could get data loss.

Furthermore, you could also get data corruption. Say Linux is in the process of flushing a file from the cache to the disk, and it's only halfway done flushing out that file, and at that point in time the snapshot completes. Then when you try to restore from that snapshot, what you could have is a file where half of the data is new and half of it is stale. Depending on how that data is encoded, if that was an encrypted file, say, then you could lose that data permanently. So that's a big issue, and as I said, this is not an issue that's
specific to virtualization. This will happen in any case where you try to back up a file system that's in use.

So to deal with that, one mechanism that Linux provides is the fsfreeze system call. What fsfreeze will do is, when you make the call, it'll flush any outstanding writes to disk, and then it will block any new writes that any process attempts to do. Windows has something similar, which is the flush-and-hold interface that's provided by the Volume Shadow Copy Service. By issuing this command prior to doing a snapshot, you gain two assurances. One, since you do the flush beforehand, any outstanding writes are going to be committed to disk. Two, since no new data will be written to disk after fsfreeze was called, you'll never have an issue where fully or partially overwritten data ends up in the snapshot. With those two things we can provide data consistency in our snapshot files.

qemu-ga supports issuing the fsfreeze inside of a guest via the guest-fsfreeze-freeze and guest-fsfreeze-thaw interfaces. It supports it on both Linux and, via VSS, Windows. And libvirt will actually use that interface automatically: if you specify the --quiesce option to the libvirt snapshot command, it'll automatically freeze the guest file system prior to doing the snapshot.

So the next use case is shutting down guests, which seems like a simple task, but there's actually a lot of considerations there. It seems like a simple task because we're used to shutting down a guest from within the guest. When you're within the guest, or when you're on a bare-metal machine, you can power down, halt, hibernate, suspend, all that good stuff, and the machine will basically do what you tell it to do. The behavior is mostly predictable. You might have to wait for Windows updates to install or something, but if you tell your machine to shut down, it will shut down eventually.
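
Going back to the snapshot use case for a moment, the freeze, snapshot, thaw ordering described there can be sketched in Python as follows. The agent_call and take_snapshot parameters are hypothetical callables supplied by the caller (for example, a function that sends a command over the agent channel and one that triggers a QMP snapshot); only the two guest-fsfreeze command names come from the talk.

```python
def snapshot_with_freeze(agent_call, take_snapshot):
    """Freeze guest filesystems, take the snapshot, and always thaw afterwards.

    agent_call(command) sends one guest agent command and returns its result.
    take_snapshot() performs the host-side snapshot while writes are blocked.
    """
    frozen = agent_call("guest-fsfreeze-freeze")
    try:
        take_snapshot()
    finally:
        # Thaw even if the snapshot fails, so the guest is never left frozen.
        agent_call("guest-fsfreeze-thaw")
    return frozen
```

The try/finally is the design point here: a guest left frozen because the snapshot step raised an error would block every writer inside it, so the thaw is issued on every path.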
We also have well-defined programmatic interfaces, so you can shut down machines from the command line, from a process, from widgets, etc.

But this is the interface that QEMU sees. When you tell QEMU to shut down the guest, all it can do is the same thing that you do when you press the power button on your machine, which is send an ACPI shutdown request to the guest. From there, what the guest ends up doing is just completely unpredictable. The guest might shut down the machine. It could suspend it, hibernate it, or it could halt it, where if you're on Windows you'll get the "you may now safely power off your machine" message, in which case you'll never know if the machine actually shut down. Or, worst of all, it might just completely ignore it. If the guest doesn't support ACPI then it won't respond to that request. In Windows you can tell Windows to ignore ACPI shutdown requests, so it will just be configured to not do anything.

So the obvious solution is, instead of using the power button, to use this rich set of interfaces that are available within the guest, and that's provided by the guest agent via the guest-shutdown command. That's also supported on Windows.

Hibernate and resume. Hibernate, as you all know, writes the guest memory state to disk so that when you power down your machine you can still restore your working state. This is useful for guests as well. If you're using your guest VMs for doing development work, or any type of interactive work where you tend to accrue a lot of working state that you want to keep around, but working state that isn't necessarily going to be committed to disk anytime soon, if you've got a bunch of windows, a bunch of terminals open, development environments, things like that, it may be a useful feature to hibernate the guest so that you can resume it later.
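
To make the difference from the ACPI power button concrete, here is a small Python sketch that builds explicit power requests for the guest agent. The function name power_request and the "hibernate" action label are made up for this example. The mode values for guest-shutdown, and the use of guest-suspend-disk for hibernation, reflect the guest agent protocol as I understand it; check them against the protocol reference for your QEMU version.

```python
import json

# Modes accepted by guest-shutdown, to the best of my knowledge.
SHUTDOWN_MODES = ("powerdown", "halt", "reboot")


def power_request(action):
    """Build the JSON request for an explicit guest power action."""
    if action in SHUTDOWN_MODES:
        request = {"execute": "guest-shutdown", "arguments": {"mode": action}}
    elif action == "hibernate":
        request = {"execute": "guest-suspend-disk"}
    else:
        raise ValueError(f"unsupported power action: {action!r}")
    return json.dumps(request)
```

Unlike the power button, each request names the outcome the host wants, so the result no longer depends on how the guest happens to be configured to react to ACPI events.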
There are some additional use cases for something like hibernate that are specific to virtualization. One of them is servicing hosts, servicing hardware nodes. Sometimes you need to do tasks that require you to evacuate all the guests from a node: you need to do a kernel update, or you need to update, say, the QEMU binary, or you need to replace the hardware, etc. So you need to get those guests off of the host, and the simple solution is just to ask your users to shut the guests down by a certain date. Then when that date rolls around, all the guests will be shut down and you can do whatever you want. But in practice you'll tend to have to force shut down guests, because it's quite often the case, if you have a lot of customers, that they're not going to respond to every email message you send them. And when you do that, you risk data loss.

There is actually one way to address this problem that's almost a magic bullet, so it's worth mentioning, and that's live migration. If you support live migration in your environment, then all you need to do is live migrate your guests off to another node, and at that point you can do whatever you want to the node.
There's minimal downtime for the guests. Everybody's happy. But that's not always a feasible solution. If you haven't carefully architected your solution with live migration in mind, you could potentially run into issues that are hard to work around. For instance, if your environment supports local persistent storage, then all that local storage will need to be migrated along with the guest memory. For one guest that may not be too bad; maybe they have up to 300 gigabytes of data, and if you have a fairly fast network, that's not too bad. But if you're evacuating the host, then multiply that by 16 or 20 or whatever. And if you actually have a lot of nodes, then that could be just a tremendous amount of data, and it could become intractable to deal with transferring all that data over your network to service nodes.

There are also a couple of other situations. When we add new features to QEMU, we don't always add those features with migration support included, which makes sense, because it's hard to maintain migration compatibility. If you don't give new features time to stabilize, then you may end up breaking migration down the road. So a lot of times we'll have new features inside QEMU that people want to use but that may not necessarily support migration. For instance, currently there's vhost-scsi and ivshmem. These are features where if you're using them for your guests, you won't be able to do live migration.

So if live migration is not a solution that's available to you, you basically have two other options.
You could shut down the guests or you could hibernate the guests. Both will lead to a loss of service, but obviously the pro for hibernating the guests is that you don't lose guest data. When they restore, all the data that was there in memory is still going to be present. Additionally, if you hibernate the guests so you can service the node, once you're done servicing the node you can resume the guests, so that could potentially reduce downtime for your users.

But there are certain cons to that approach. For instance, if that guest was doing something like processing credit card transactions and you hibernated it, there's a good chance that while it was hibernated that credit card transaction was done through some other means. Then when you resume the guest, you could potentially end up double charging a customer or something. So depending on the application that the guest is running, a user may opt to just have the guest shut down as opposed to hibernated. But there are a lot of use cases where users would prefer hibernate, so it's a nice option to be able to expose, and you could do that via the guest-suspend-disk command.

How are we doing on time? Okay, so some other use cases: watchdog and health monitoring.
So a watchdog is a way to track whether a certain process or a certain machine is still executing properly. This may be a nice feature to have for a virtualized environment where, generally, when you start up a guest you'll be able to indicate that that guest is running or stopped, but there's no indication there of whether that guest is running but failed to boot, or it's running but there was a kernel oops and it's completely frozen. So being able to provide some type of feedback on the actual health of the guest is a useful feature to have, and the guest agent provides a guest-ping command so that your management stack can periodically ping a guest to see if it's still responding. If it's not responding, you can notify users or provide some way to make that information available to users, so they can better keep track of what's going on. Now, of course, for intensive applications, determining whether or not the guest is functioning properly is going to require the customer to implement a more targeted solution, but just as a general feature that might be a nice-to-have.

There's also synchronizing and correcting guest clocks. Generally we rely on Network Time Protocol to correct timing issues, and that works most of the time, but there are certain situations where it may not. For instance, on Windows, if the jump in time is beyond a certain threshold, Windows will basically give up. It'll stop trying to set the clock according to NTP, and your guest will be completely out of sync time-wise forever. So there are the guest-set-time and guest-get-time commands, which allow the host to explicitly configure the time in the guest. And if you're in an environment where there's really tight integration between your management solutions and your guest images, to the extent where you provide, say, a management
interface that can do things like set the guest clock or set the guest time zone, etc., then the guest agent could also be used to provide those types of interfaces.

IP discovery. It's a fairly common issue where you start up your guest and you don't know what the IP is, so you can't really connect to it. If you're doing that locally, you can VNC into it, or you have a local UI, and you could also VNC into it in some cases in the cloud as well. But generally, to communicate with your guests you're going to have a network-based management interface, and if the guest, for whatever reason, doesn't come up with the IP that you configured it to come up with, then figuring out how to deal with that issue could be a problem. So the guest agent provides interfaces to discover what IP the guest actually leased. If the guest leased a different IP for whatever reason, you can determine that and provide that information via the management interface, so that a user can recover from that and connect to the new IP address.

Trim and hole punching. Trim support is an operating system feature, mostly for SSD drives. Normally, on a hard drive, when you delete a file you don't need to tell the hard drive that you deleted it. When you need to use that space, you'll just overwrite whatever's in there. On an SSD drive, if you overwrite data, that operation is actually slower, because the delete operation is slow. To work around that, there's a feature called trim, where an operating system can tell the drive explicitly that it's deleted a file. In that way, instead of having that delete operation occur in a critical path, when you're trying to write data to disk at a fast rate, it can do it
It can do it At more opportune times via some type of garbage collection mechanism qmu recently added support for exposing the trim command to the host so in the case of virtualization when a host operating system issues a trim command qmu could actually use that to resize the disk image that represents a guest's disk And use that to reduce the the amount of memory that the image takes up So if the guest creates an 8 gigabyte file and then it deletes it Since you know, we might grow the the file the the disk image by by 8 gigabytes But because of the the trim notification will know that we can shrink that back down and remove those That actually pretty much happens automatically You don't really need a guest agent for that but if you um, if you have a guest where There's a lot of data that has been deleted in the past then The the the trim command to tell the host that that data has been deleted will never get issued It's only going to apply to any rights that that happened in the future any deletes that happen in the future So the guest agent actually supports a guest fs trim command Which will say go through the entire file system and send the delete Notify the the underlying storage device that you know for every file. 
that's not in use, every block of data that's not in use. So you could use that to reduce the size of a file system image.

I mentioned guest file access. There's a pretty wide range of things you can do when you can access files in the guest from the host. You can tweak the guest configuration, you can do automatic performance tuning, things of that nature. So that's a pretty broad range of use cases there.

Okay, so future work. I mentioned clipboard synchronization earlier. We are actually working on that, and at the moment we have two Google Summer of Code students that are completing their project to implement clipboard synchronization via the guest agent. guest-exec is potentially a pretty useful command, because if you can exec commands in the guest, then you can basically do anything. There are actually experimental patches that add guest-exec support for both Linux and Windows, so that's a pretty extensive interface there. In general, when you add new features you want to have a nice, well-defined interface, but as a stopgap it's nice to have something like guest-exec so you can handle anything that's not covered there. So that's something we're also working on. And that's pretty much it, if there are any questions.

So you said that it has trim features. With writes that are large in size, and deletions that happen at some frequency, does it compensate for that? Does it know that the disk image has to be cleaned up at certain intervals? I'm trying to figure out if it would cost performance to grow that image.

Yeah, potentially. It certainly would cause latency if you're doing hole punching on every trim command, so I think that's actually the reason that it's disabled by default in QEMU. It could cause performance issues
So it's yeah, it's kind of a trade-off there So are there any plans to add? Kind of when you merge it into qmp. Are there any plans to kind of have some kind of knowledge of that? There's a guest agent listening in the guest Uh, yeah, that would be handled by qmu. It's just that the issue is It's it's hard to do that right right because if if at any point in time you say there's a guest agent You know alive and well The user could then just go and then shut down the guest agent and that's no longer the case so basically the best we can do is Try to execute those commands And give them a timeout So i'm speaking from the like a live bird sense So we we always have to send a guest ping because there's a bunch of commands that Could take an indefinite amount of time. That's a liver. It's going to block but You know, are we blocking on a command that? We might never get a response. So Like the hack that was put into the bird Is to say well ping it first And then that command will that second command will block until it responds But the question there is like if there could be like a state That says that guest agent went away in the middle It was live birds basically I'm not too sure that's a difficult one To work there. I have a good question. Good question. Yep. We'll talk after Hi, my question's about uh backup and uh You mentioned three types of backups. 
You had your synchronous, your asynchronous, and then you had a drive-mirroring backup. I'm assuming that's basically writing to virtual image files on the host. So my question is around the drive mirroring. One of the things that people are concerned about is corrupted data, whether it's from the user or otherwise. Is there any mechanism to prevent corrupted data from transferring from one drive image to the other in that continuous mirroring process?

Basically the only way to do that is to have a point in time you could recover to. So while you're doing the mirroring you could also take snapshots at a certain interval, so that if something does go bad, you can still restore to a previous state.

Many times we get requirements where we need to configure the guest before actually starting it live. If the guest contains multiple interfaces, then assign all the IP addresses, assign the default gateways. So just before actually starting it you can have all the network configuration, like the SSH access or the telnet access. Someone wants to configure those kinds of things. So what is the best way to achieve this?
Is it through guest-exec, or through the JSON interface, or like...

So, to actually configure a guest before it starts? Yeah, if it's before it starts, then the guest agent is basically out of play. It kind of depends on your solution, because you can have activation scripts that run on an offline image to configure it. Since you have direct access to all the files on the image, and you can execute any commands that you want on the host against those files, you could in some cases have a more powerful interface than what you get with the guest agent. The guest agent is more catered toward guests that are already running. If there's any configuration you could do prior to that, then, yeah, it doesn't make too much...

Could you add some interface to send a command from the guest to the host? For example, to implement clipboard synchronization, such an interface doesn't exist, so the host needs to query periodically whether the guest clipboard is updated or not.

Yeah, so actually it's part of the clipboard synchronization stuff. There is a guest-to-host channel that's being added. We need that because if you want to support something like copy-paste, if there's a copy in the guest, the guest needs a way to notify the host about that. So yeah, there is going to be a guest-to-host interface in the future. Any more questions?

So currently, are any of the guest agent commands exposed through libvirt? And if not, what's the plan to do that? Does it have to wait for the QMP integration, or?
Yeah, my plan was to just piggyback off the pass-through interface that's already available for QMP.

Libvirt does use it, but most of the uses are for specific commands internally, rather than being exposed. You can do, on the command line, virsh qemu-agent-command, and you have to pass in a JSON string.

Oh, okay, libvirt does that.

You can pass that by hand, and it's going to spit the JSON back at you. There are no pretty interfaces to most of those commands. The most recent one that's being added, that might appear in the next libvirt, would be the IP address query.

Okay. Yeah, it's nice to know. I saw some mention of the guest agent interface, and I was poking through the source code to see if it was actually exposed through virsh. I just didn't get a chance to confirm, so it's good to know.

So does libvirt support configuring the guest agent, or do you have to do everything by hand?

No, I mean, you'll have to have it in the XML. You have to have it defined to spit out the correct command line.

Do you have to define, like, a serial port?

Yeah, so there's an XML syntax for it, and if you have the example XML syntax, it spits out those commands Michael put up there. Basically, look at the domain XML configuration.

Oh, thank you, Michael. Thank you.