Hello, I'm Eric Blake. I am a senior software engineer at Red Hat, where I have been working for the last few years on the QEMU project as well as libvirt, and also with the NBD protocol, working on incremental backups.

Today's talk is "Bitmaps and NBD: Building Blocks of Change Block Tracking." It's a mouthful of words, but ultimately it boils down to the premise that when we're doing backups, we want to optimize. An incremental backup says: rather than backing up everything, if I can figure out which blocks changed since my last backup and only back up those blocks, then I have a more efficient backup for each following day, and I can reconstruct the overall image by layering those incremental backups. In this talk I will go over how qcow2 has exposed dirty bitmaps as its change block tracking mechanism, and how the NBD protocol is used alongside libvirt to implement incremental backups.

Change block tracking is a term that's been around for multiple years. It was introduced by the backup industry as a way of figuring out which portions of the disk have changed since a certain point in time. There are two common approaches to doing change block tracking. One is with a generation tag: every time you write to a cluster, you also write a generation ID as metadata somewhere. The idea is that you can then say: at this point in time,
I was at ID XYZ. All blocks that have an ID greater than XYZ have changed since that point in time; therefore, those are the blocks I care about. It does require a fair amount of metadata per cluster, but the nice part is that you can track multiple points in time, all with the same amount of metadata.

The other common approach is with a dirty bitmap. Instead of having a 64-bit ID number, you have a single bit that says: has this block changed since a given point in time? You do need one bitmap per point in time that you're tracking, so it's a little less granular in what you can compare against. But there are also some optimizations: because there's less data being written per bitmap, you can store things in memory and flush only at strategic times.

An important point to remember with backups: a full backup is always correct. As long as your guest data has not been corrupted, a full backup may be slow, but it will be correct. Change block tracking is merely an optimization. So if something ever goes wrong with your change block tracking, if you lose a bitmap or corrupt your generation tag ID, your fallback is to do a full backup. You don't lose the guest data; you have just lost efficiency in handling that guest data. And because change block tracking is an optimization, there are some interesting things we can do where we're a little more cavalier in how we handle the data than we would be if it were guest-visible data.

qcow2 is QEMU's preferred image format, and we've had it around for years. QEMU first used in-memory bitmaps for a block job command introduced in 2012. Over the years we decided, hey, bitmaps are so cool that we will use dirty bitmaps as our change block tracking mechanism, and we first exposed a persistent dirty bitmap as part of the qcow2 image in 2016. It's done with an image header extension, where we say: here is the location of the bitmaps within the qcow2 file.
They're not guest visible, just host visible. Within each of those bitmaps we track a granularity: the default is one bit per cluster of the image, but you can also go finer grained or coarser grained depending on your needs. Right now qcow2 as a file format will support any size of bitmap, but QEMU, as the program using those bitmaps, insists that the bitmaps be the same length as the disk itself. We also took care, when we added persistent bitmaps, to add an autoclear feature bit. If you ever operate on a qcow2 file with an older program that doesn't understand bitmaps, a newer program will then see that the bitmaps are probably incomplete and treat them as corrupt. Once again, as I said earlier, that just means we have to do a full backup instead of an incremental one.

QEMU has also been involved with the network block device (NBD) protocol over the years. We first introduced NBD client support in QEMU back in 2008, along with the program qemu-nbd, which acted as a rudimentary server as well as a hook to call into the kernel's nbd.ko module for accessing a block device served by an NBD server. Over the years we have added more. In 2012, we made it possible to export a QEMU image as an NBD drive while QEMU is running. This helps with live migration of your storage. On the source, we set up all the pieces we'll need to migrate. On the destination, we then start QEMU's NBD server.
The source then says: do a mirror job, so that everything I write locally I will also mirror over to my NBD server, which is on my destination. When the mirror job is complete, the destination has a copy of my storage. Now I can do the live migration of the memory and stop the NBD server, and at that point we have copied the disk data over by using an NBD connection.

In 2016 we added TLS support, and in 2018 we added block status. Block status lets you query an NBD drive and ask which portions of the image have which properties. An interesting part of the NBD protocol is that block status has multiple contexts, and you can define your own. QEMU has done just that: we defined the qemu:dirty-bitmap context to expose a persistent dirty bitmap, in addition to the NBD standard base:allocation context, which says which portions of your image read as zeros.

So with all of that introduction, let's go ahead and play with the files. We'll need a guest, so I'm going to use virt-builder to grab a Fedora 32 image. I'm going to inject a root password so that we can SSH into it. Actually, with Fedora 32 you can't SSH with a root password; you have to use a key file, and that's what the --ssh-inject portion of the command line does. It takes a little less than a minute, so I will continue talking. We're going to use this image for the demos in the rest of this talk. We're going to play both directly with QEMU, to see what happens with the bitmaps, and with libvirt, to see what happens when I use libvirt commands to drive an incremental backup. The image is nearly ready. Just a few more seconds, and there we go.
It took 45 seconds and created a six gigabyte image containing Fedora 32. I'm also going to create a 100 megabyte secondary disk that we can play with directly. With those disks in hand, my next task is to create a libvirt domain using them, so I'll use the virt-install command, pointing to those two disks. As you can see, the image is ready to go, and we're booting it now. I'm going to finish the install by logging in as root, and then we'll do the rest of our work through SSH; there's no need to have multiple windows here. virt-install is a fun little tool for building everything libvirt will need. Using the password I injected in the previous step, we can now tell the image to shut down, which gives us a clean state to start with.

Now I'm going to run it in the background. While the image is starting, you'll remember there were a few seconds at the beginning where GRUB does a countdown before it actually starts the boot, so we'll wait for those few seconds. I need to grab the IP address of my guest first: there's my guest, running at 122.10. We're going to SSH into the guest, make a file system on my b drive, the secondary 100 megabyte disk, touch a single file on it, and then exit. Yes, I do want to connect to my guest; it's brand new, which is why I have to accept it. For now, I will shut the image back down.

Now that we have an image in place and all primed to use, qemu-img info tells me that my secondary image is 100 megabytes wide and has no bitmaps associated with it. Let's give it one. I'm going to use the new qemu-img bitmap command, which I added earlier this year, to add a bitmap named bmap0 to the base qcow2 file. We repeat our info command, and you can now see that there is a bitmap. I told you earlier about the auto flag; it says the bitmap is enabled, so any writes that I do to the image will update the contents of the bitmap. So let's do some writes.
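To make the effect of writes on an enabled bitmap concrete, here is a minimal sketch of a dirty bitmap with one bit per 64 KiB cluster. The `DirtyBitmap` class and its method names are hypothetical, invented for illustration; this is not QEMU's implementation or the qcow2 on-disk format.

```python
class DirtyBitmap:
    """Toy dirty bitmap: one bit per `granularity` bytes of disk."""

    def __init__(self, disk_size, granularity=64 * 1024):
        self.disk_size = disk_size
        self.granularity = granularity
        nbits = -(-disk_size // granularity)  # ceiling division
        self.bits = [False] * nbits

    def mark_write(self, offset, length):
        """Set every bit whose cluster overlaps [offset, offset + length)."""
        if length <= 0:
            return
        first = offset // self.granularity
        last = (offset + length - 1) // self.granularity
        for i in range(first, last + 1):
            self.bits[i] = True

    def dirty_extents(self):
        """Return (offset, length) runs of consecutive dirty clusters."""
        extents = []
        start = None
        # The trailing False is a sentinel that closes a run ending at the
        # last cluster.
        for i, bit in enumerate(self.bits + [False]):
            if bit and start is None:
                start = i
            elif not bit and start is not None:
                begin = start * self.granularity
                end = min(i * self.granularity, self.disk_size)
                extents.append((begin, end - begin))
                start = None
        return extents


bitmap = DirtyBitmap(disk_size=100 * 1024 * 1024)
bitmap.mark_write(offset=4096, length=512)            # stays inside cluster 0
bitmap.mark_write(offset=3 * 65536 - 10, length=20)   # straddles clusters 2 and 3
print(bitmap.dirty_extents())
```

Even a 512-byte write dirties a whole 64 KiB cluster in this model, which is why a small change in the guest can show up as several dirty clusters.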
I'm going to create a file with the contents "hello" by using guestfish. I'm going to mount that image, list what's currently there, and then upload the contents "hello" into a new file named /b. And yes indeed, the previous contents were just the file a; now there's also a file b. And believe it or not, that was adding just a single file.

We're going to run qemu-nbd with -B bmap0 to serve that bitmap, and nbdinfo, a new command this year from the libnbd project, to map that dirty bitmap from the NBD server that we are running with qemu-nbd. And there you see it: when I added a single file, it touched four different clusters of 64k each. That makes sense: the superblock, the directory containing the file, and the file itself all add up, since each of them requires a cluster to be touched.

With that, we'll move on to another thing we can do with qemu-nbd. I'm going to create an overlay file over my image. One of the benefits of qcow2 is that it is designed with backing chains in mind: each file is sparse, and whatever you don't get locally comes from a backing file. Checking out the backing chain, there is the bitmap that we created in the base file, but the overlay does not have a bitmap. So one of the things we had to implement over the last year was the libvirt commands to manage bitmaps under the hood in a saner way than what QEMU could do automatically.

The next thing I'm going to do is touch some data in the overlay. I don't have a bitmap tracking it, but now I've touched file c, and we can use the -A option.
We can say what is allocated in my overlay versus what is allocated in my backing file, again with the nbdinfo command, but this time we're going to map qemu:allocation-depth instead of qemu:dirty-bitmap. When we do that, we can see that the same four clusters have been touched locally to add my file c; everything else comes from the backing file. And, as usual, ext2 has a bunch of superblocks scattered throughout, so you're going to have a repeating pattern of allocated and unallocated blocks. With that, I'm done with bmap0, so I'm going to remove it, and we'll move on.

When we first added the drive-backup command back in 2015, our original thought was that QEMU would drive everything. We'd create a single bitmap at the time you do a full backup, and future incremental backups would use that prior bitmap state to create an external file, all under QEMU's control, and reset the bitmap for the next one. Graphically: we start with our image, which is partially dirty. We do a full backup and create our brand new bitmap, all empty, to say what's dirty at this point: nothing. Then over time some bits do get dirty.
We write some new content, and the bitmap tracks those areas. We say it's time to do another backup: we take the dirty area, copy it into a new incremental backup file, and clear the bitmap, all under QEMU's control. Time progresses, we write some more data, and the pattern repeats. The problem with having a single bitmap is that you can only track incremental backups; you can't do differential ones.

So in 2018 we modified things. I gave a presentation at KVM Forum two years ago where we demonstrated what pull mode would entail: instead of QEMU writing out the file, let's expose the file over NBD for third-party access. Also at the time, our thought was that we would track multiple bitmaps for multiple points in time, to let us do a differential backup. So we start with the same dirty data, start our initial bitmap for our first checkpoint in libvirt, and create our full backup. Then, as time elapses, data gets written into that bitmap. Then we do an incremental backup, and we mark b0 as disabled. Now b1 has been created to track all changes since incremental backup one; it starts out empty. More changes happen, and notice that b0 remains unchanged: we only touch b1. And now it's time for a differential backup.
We can say: I want to do a backup from point zero, where we took the full backup, rather than from point one, where we did the incremental backup. To do that, I have to create a temporary bitmap that merges in all bits from b0 and b1, and then expose my temporary bitmap over the wire to get the portion of the data that I need for the differential. Or I could do just an incremental backup, where I look at b1 directly.

But it turns out that turning bitmaps on and off can turn into a lot of management overhead, tracking which bitmaps have to be enabled where. It gets especially messy when you have external snapshots or a backing chain. So ultimately, when libvirt finally did implement incremental backups, we changed our mind yet again, and now all checkpoints have a live bitmap at all times. So at our first checkpoint, we create our first full backup and we create a bitmap. Over time, we add bits. Then we create an external snapshot. You'll notice that once we create an external snapshot, bitmap zero in the base is no longer changed, because we're no longer writing to the base file. But now overlay.qcow2 will be written.
We write some data, and there's the bitmap zero in overlay.qcow2, which is different from the bitmap zero in base. But between those two bitmaps we still have a record of everything that's changed since our full backup. So now we do an incremental backup. Once again, we need a temporary bitmap to merge b0 from our overlay and b0 from our base into what we expose for the incremental, and we also create b1 to start tracking changes since our incremental. Then we get rid of our temporary bitmap, and we write more data. You'll notice that writing the data modifies both bitmaps at the same time. Whether we do this by actually writing data in QEMU into the qcow2 files at that time, or optimize it by saving it in one place in memory and then merging it out at the last minute, is an optimization that can be done under the hood.

Then, when we do a block commit, we say: I want the overlay's data to be merged back into base. libvirt takes care of the details: base needs to have a b1 bitmap to track everything that was in b1 before the commit, as well as all the data, and b0 becomes the merge of the bitmaps. If we do another incremental backup at this point, for ease libvirt always creates the temporary bitmap, even though at this point it's just a single copy of b1. We create our incremental backup, and we're good to go.

So let's see this in action. I'm going to clear my screen. There we go: demonstration with libvirt.
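Before the demonstration, the bookkeeping in this model (every checkpoint keeps a live bitmap, and a backup uses a temporary merge across the backing chain) can be sketched with plain Python sets. The helper names `guest_write` and `backup_bitmap` and the cluster numbers are hypothetical; this only illustrates the idea and is not libvirt's or QEMU's code.

```python
def guest_write(active_layer, cluster):
    """A guest write dirties `cluster` in every live bitmap of the active layer."""
    for clusters in active_layer.values():
        clusters.add(cluster)


def backup_bitmap(checkpoint, chain):
    """Temporary bitmap for a backup since `checkpoint`.

    `chain` is the backing chain, base first. Each layer maps a bitmap
    name to the set of cluster indexes dirtied while that layer was active.
    """
    merged = set()
    for layer in chain:
        merged |= layer.get(checkpoint, set())
    return merged


# Full backup creates b0 in base; the guest then writes clusters 1 and 2.
base = {"b0": set()}
guest_write(base, 1)
guest_write(base, 2)

# External snapshot: writes now land in the overlay, which gets its own b0.
overlay = {"b0": set()}
guest_write(overlay, 5)

# Incremental backup creates b1; later writes update both b0 and b1.
overlay["b1"] = set()
guest_write(overlay, 7)

chain = [base, overlay]
print(sorted(backup_bitmap("b0", chain)))  # everything since the full backup
print(sorted(backup_bitmap("b1", chain)))  # only changes since the incremental
```

Because every bitmap stays live, a backup since any checkpoint only needs the union of that one bitmap name across the layers, with no enabling or disabling of bitmaps along the way.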
I'm going to start my domain that we created earlier. It'll take a few seconds to come up, but we can already attempt a backup. Oh wait: incremental backup is not supported yet. I am testing with QEMU 5.1, and there are a few features in there that we did not quite have ready, which libvirt refuses to use until they are polished. One of those is that when we are doing a block commit, we have to modify the backing file to merge the bitmaps, and right now that takes the x-blockdev-reopen command. QEMU 5.2 will rename that to blockdev-reopen, without the x- prefix, and that will be when it's supported. In the meantime, we have a little hack. I can take my XML describing the domain and, on the domain type line, add an XML namespace for QEMU. In that namespace I'm going to add the capability incremental-backup, which says: use QEMU even though it has to use the x- prefix. That tells libvirt everything it needs to know to use incremental backups. So let's try this again: start the domain and let it boot up.

I'm going to create a backup. I'm going to use push mode, where QEMU writes the file. I want to create a backup named full.qcow2; I'm using the qcow2 file format for ease of use. I also want to create a checkpoint at the same time. I'll name it check1; I'm pretty boring today. I will also pre-create my destination image for the sake of file permissions. With that pre-created image, I run virsh backup-begin, reusing the external image I just created, and using those two XML files to describe what I'm doing. There's my backup operation under way; we can see it going. QEMU finishes the job automatically, and as soon as the job finishes, I no longer have a backup XML to dump. And with that, we have a backup. Looking at the file sizes, I did a backup to full using the contents of base1, and the two are about the same size right now because it was a full backup.

Now, in libvirt, I'm going to create an external snapshot. I want to track differential changes on top of the base.
So I'm going to create a new overlay on top of base1. There's my snapshot create. Let's see what images are in use: I'm still using my temporary base2, which we're going to ignore, and I'm using the overlay image for my vda. And if I look at the backing chain of the overlay: oh dear, permission denied. That's a good thing. QEMU is running on F32, and we don't want to be messing with the disk at the same time that QEMU is messing with the disk. What we can do is use sudo and the -U option to say: I want to open the image in spite of it already being in use, and hopefully my information is not too bad. We'll do that, and there's my backing chain. The overlay has no bitmaps, and base1 has the bitmap that we started earlier.

I just said libvirt does everything needed to create bitmaps on the fly, but we don't see one. Why is that? Because I also said QEMU stores as much in memory for as long as possible. So I'm going to shut down the domain. With the domain finally shut down, we're going to try that again; this time I don't need sudo, because now there's no QEMU process using the image. And there we go: my overlay now has a bitmap, written out at the last minute as part of QEMU shutting down.

Let's get things started back up for the next part of the demo, where we're going to do an incremental pull. In my second backup XML, I'm going to ask for an incremental backup, using check1 as my starting point. I want to pull, which means I need an NBD server, so I want QEMU to open an NBD server on localhost port 10809, which is the NBD default. I also want to create a new checkpoint, check2, in addition to the check1 that I already have. I'm going to grab the IP address so we can SSH in. I'm going to touch a file before we start the backup; that should be included. I'm going to begin the backup.
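The two XML descriptions used for this pull-mode backup are not shown in the transcript. As a hedged sketch, they can be generated as follows; the element and attribute names reflect my understanding of libvirt's documented backup and checkpoint XML formats and should be checked against the libvirt documentation, and the helper function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def pull_backup_xml(since, host="localhost", port=10809):
    """Backup description: pull mode, incremental since checkpoint `since`.

    Assumed layout: <domainbackup mode='pull'> containing <incremental>
    and a TCP <server> element.
    """
    root = ET.Element("domainbackup", mode="pull")
    ET.SubElement(root, "incremental").text = since
    ET.SubElement(root, "server", transport="tcp", name=host, port=str(port))
    return ET.tostring(root, encoding="unicode")


def checkpoint_xml(name):
    """Checkpoint description naming the new checkpoint to create."""
    root = ET.Element("domaincheckpoint")
    ET.SubElement(root, "name").text = name
    return ET.tostring(root, encoding="unicode")


print(pull_backup_xml("check1"))
print(checkpoint_xml("check2"))
```

Each string would be saved to a file and passed to virsh backup-begin, in the same way as the two XML files in the earlier push-mode step.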
I'm also going to touch a file after; that should be excluded. The point in time of my backup, even though I haven't copied anything yet, is when I ran the backup command. Looking at the backup job, it is running as an NBD server over TCP. I can also see that it created some temporary files, named overlay.qcow2.check2, for storing the changes that are needed to get me the point in time at which I started the backup, no matter how slow I am at copying it. I'm also going to list what that NBD server is showing. It is showing two exports, vda and vdb, and each one of them has an allocation map and a dirty bitmap describing which parts of the disk are interesting.

With that, I can now proceed to create my backup. I'm going to map that dirty bitmap to see which portions of the disk are dirty. The output goes on. I also have a little helper script here. I'm going to create a temporary file backed by my original image. I'm then going to run the same map as I had here and pipe that through a while loop, and for every dirty line in that while loop, run qemu-io with capital -C for copy-on-read and little -c to run the command "read this portion." That will populate just the dirty portions of my bitmap into my destination file. At the end, my destination file is then rebased on top of the final file. Run that script. You can see there are quite a few chunks; most of them are the ext4 superblocks. Now I can finish my incremental backup job. I'm going to shut the domain down at this point.
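The helper script itself is not shown in the transcript. The core of its loop can be sketched as follows, assuming the map output is four whitespace-separated columns (offset, length, numeric type, description) with bit 0 of the type meaning dirty. The sample text, the file name `inc.qcow2`, and the function names are hypothetical, and the sketch only builds the qemu-io command lines rather than running them.

```python
# Illustrative map text only; the real output comes from mapping the
# qemu:dirty-bitmap context of the NBD export.
SAMPLE_MAP = """\
         0      131072    0  clean
    131072      262144    1  dirty
    393216       65536    0  clean
    458752       65536    1  dirty
"""


def dirty_extents(map_text):
    """Return (offset, length) for every extent whose type has bit 0 set."""
    extents = []
    for line in map_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        offset, length, flags = (int(f) for f in fields[:3])
        if flags & 1:
            extents.append((offset, length))
    return extents


def copy_commands(extents, dest):
    """One qemu-io call per dirty extent.

    -C enables copy-on-read, so each read through `dest` pulls the data
    from its backing file into `dest` itself.
    """
    return [
        ["qemu-io", "-C", "-f", "qcow2", "-c", f"r {offset} {length}", dest]
        for offset, length in extents
    ]


for command in copy_commands(dirty_extents(SAMPLE_MAP), "inc.qcow2"):
    print(" ".join(command))
```

Each command list could be handed to `subprocess.run` once the destination file exists with the right backing file; that setup, and the final rebase, are left out here.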
We could leave it running, but there's no point. I'm going to use guestfish to inspect my incremental backup and make sure that it is what I expect. So I'm going to connect it, mount it, and list the root directory, and there we see the file "before" but not "after." So we did indeed grab a snapshot of the drive while the guest was running, with just those contents.

Looking at the file sizes, the incremental backup is slightly smaller than the overlay file, which is good, because the overlay file changed after we took the incremental backup. The full backup is slightly smaller than the base, because the base was modified before we took our external snapshot. But the really important part is that the incremental backup is much smaller than a full backup, so we did save ourselves quite a bit of effort by doing an incremental backup instead of a full backup.

And with that, you have now seen how NBD exposes the dirty bitmap of qcow2, and how libvirt pulls that all together and manages multiple bitmap commands under the hood to do incremental backups. If you have any questions, now would be a good time. Thank you for listening.