Hello, I'm Vladimir Sementsov-Ogievskiy. I'm a software engineer at Virtuozzo, and I mostly work on the block layer of QEMU. This talk is about my recent proposal for a new backup architecture.

Let's start with what we have now. The aim of backup is to create a consistent point-in-time copy of the VM's hard drive. On this first simple slide, note my notation: on the left there is a block graph, where solid arrows show parent-child relations and dashed arrows show the backup job. On the right side are the QMP commands needed to produce this picture. So, to make a backup, you add a target block node with the blockdev-add command and then start a backup job with the blockdev-backup command.

The non-trivial thing about backup is that the active disk is, well, active. The guest may write to the disk during the backup job, and we have to do something about that. To handle these writes, the backup job inserts a backup-top filter node above the source node. In turn, backup-top works simply: on any guest write, it first copies the original data from the active disk to the target, and only after a successful copy is the new data written to the active disk. These are called copy-before-write operations.

On top of this, we have a working image fleecing scheme. It is used for so-called pull backups, or external backups. What is it? We set up a backup to a temporary local qcow2 image, and at the same time the active disk is set as the backing of this temporary image. This way, a reader of the temporary node sees a kind of snapshot: data that has changed since the start of the backup is read from the temporary image, and unchanged data is read from its backing. Note the sync=none parameter of the blockdev-backup command; it means the backup job shouldn't do any background copying, only copy-before-write operations.

The one thing this scheme lacks is bitmap support. So if we are doing an incremental external backup, copy-before-write operations will cover non-dirty areas as well, which means a lot of extra write operations and wasted disk space. We could support a bitmap argument when the sync parameter is none, but there is a better way. Note that the backup job is essentially a no-op here: it only adds and removes the backup-top filter, which does all the work. This leads to the solution: let's make the backup-top filter public, so that the user can insert and remove it without a backup job, and of course support a bitmap parameter when creating the new filter node.

Note the incompatible changes: the filter is renamed to copy-before-write, to describe what it does rather than how it is used, and it is now based on a file child, like all other public filters in QEMU. Still, these changes shouldn't break anything, since before this the filter was only created internally by the backup job, so I think a good name and consistency were worth the small risk. At the time of preparing this presentation, the corresponding patch series is in the maintainer's queue.

Let's look through the API. First we prepare the temporary node. Next, instead of starting a backup job, we prepare the filter node by hand. And finally, to insert the filter, we replace the original disk with the filter; here I use the qom-set command to do this. After that we can start an NBD export, or attach any other reader, to our temporary node. The changeable drive property is part of the proposed patch series; still, I have a better API in mind, and in patches, for inserting filters, and I will return to it at the end of the presentation.
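As a rough QMP sketch of the walkthrough above, assuming invented names ("disk0" for the active disk node, "tmp" for the temporary image, "bitmap0" for a dirty bitmap, "vblk0" for a virtio-blk device): the copy-before-write arguments and the writable drive property follow the proposed series, so the exact syntax is an assumption rather than released API.

```
# 1. Add the temporary qcow2 node; the active disk "disk0" is its backing,
#    so readers of "tmp" see a point-in-time view of the disk.
{ "execute": "blockdev-add",
  "arguments": { "driver": "qcow2", "node-name": "tmp",
                 "file": { "driver": "file", "filename": "tmp.qcow2" },
                 "backing": "disk0" } }

# 2. Add the copy-before-write filter by hand instead of starting a backup
#    job; "bitmap" restricts copy-before-write operations to dirty areas.
{ "execute": "blockdev-add",
  "arguments": { "driver": "copy-before-write", "node-name": "cbw",
                 "file": "disk0", "target": "tmp",
                 "bitmap": { "node": "disk0", "name": "bitmap0" } } }

# 3. Replace the original disk with the filter via the writable "drive"
#    property proposed in the series (hypothetical QOM path for "vblk0").
{ "execute": "qom-set",
  "arguments": { "path": "/machine/peripheral/vblk0",
                 "property": "drive", "value": "cbw" } }

# 4. Export the temporary node so the external client can pull the backup.
{ "execute": "block-export-add",
  "arguments": { "type": "nbd", "id": "exp0", "node-name": "tmp" } }
```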
OK, that's good, but that's not the end: with this scheme we still have some shortcomings. First, assume the external client has already finished backing up some part of the disk, and then the guest wants to write to the same range. The copy-before-write filter starts copying to the temporary image for nothing, because the fleecing client is not interested in this data anymore.

Another problem is mostly about push backup with fleecing. Push backup with fleecing is when the fleecing client is just another backup job. It's very useful when the final target is slow: with a simple backup job, copy-before-write operations become slow too, and guest write performance suffers. With the fleecing scheme the guest is not disturbed, because the target of copy-before-write operations is a fast local file. But what if at some moment the throughput of the final target improves, or the guest write frequency drops? At that moment we keep doing extra writes to the local disk when it would actually be possible to transfer the data to the final target without saving it in the intermediate image. So we may have some extra writes to the temporary image which could theoretically be avoided.

Return now to the extra copy-before-write operations. We need a way for the client to say that it doesn't need some region anymore. Naturally, that should be an unmap or discard request. But on this discard we should inform the copy-before-write filter that the region shouldn't be copied before write anymore. That leads us to the need for an additional filter driver on top of the temporary image to handle the client's discards. And, as you'll see, this new block driver brings a lot more improvements to the fleecing scheme.

Welcome the new fleecing block driver. What does it bring us? It of course solves the first problem: on discard, it informs the copying process that the region is not needed anymore. Note also that doing discards after reads on an NBD export is a good idea anyway, since it saves disk space on the server.

Next, think about incremental backup. Assume we restricted copy-before-write operations with some bitmap. Originally, if the fleecing client happens to read something not covered by the bitmap, it will read unpredictable data, possibly already changed since the backup started. It is much better to handle such a wrong read with a clean error, and the new fleecing block driver does exactly that.

In the fleecing scheme we also need to synchronize guest writes with fleecing client reads. Now that we control fleecing reads in the new driver, we can do this with less blocking. Next, if we handle reads ourselves and in any case have to maintain additional bitmaps to cover the previous points, we don't need the backing feature at all: the fleecing driver knows when to read from the temporary image and when from the active disk. So the temporary image may be a simple raw file without any backing support.

Finally, to avoid extra writes to the intermediate image for push backup with fleecing, we can implement a write cache in the fleecing driver, so that writes don't go to the temporary image immediately but are kept first in the cache. This way the fleecing client has a chance to read and discard this data before it is flushed to the temporary image, provided the throughput is sufficient at the moment. All of this, except for the caching, comes with my patch series "make image fleecing more usable".

Now look at the API. Nothing unusual here; we just need to blockdev-add one more block node. Note also that the target of copy-before-write is the fleecing node, not the temporary image; that is needed to implement the cache in the fleecing block driver. Now, what if we want to run push backup with fleecing? For this case, the backup job gets one more parameter: with the new argument immutable-source set to true, the backup job will not insert any filter but will instead assume that its source is immutable. Remember that normally the backup job inserts the copy-before-write filter above its source node, which we don't need here.
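A minimal QMP sketch of this improved scheme, again with invented node names; the fleecing driver's name and child options, and the immutable-source argument, come from the unmerged proposal described here, so treat the exact syntax as an assumption.

```
# 1. The temporary image can now be a plain raw file, no backing needed.
{ "execute": "blockdev-add",
  "arguments": { "driver": "raw", "node-name": "tmp",
                 "file": { "driver": "file", "filename": "tmp.raw" } } }

# 2. The proposed fleecing node sits on top of the temporary image and knows
#    which active disk it snapshots (child names are assumed).
{ "execute": "blockdev-add",
  "arguments": { "driver": "fleecing", "node-name": "fl",
                 "file": "tmp", "source": "disk0" } }

# 3. Copy-before-write targets the fleecing node, not the temporary image,
#    so a write cache can later be added inside the fleecing driver.
#    (The filter is then inserted above the disk with qom-set, as before.)
{ "execute": "blockdev-add",
  "arguments": { "driver": "copy-before-write", "node-name": "cbw",
                 "file": "disk0", "target": "fl" } }

# 4a. Pull backup: export the fleecing node; the client reads, then discards.
{ "execute": "block-export-add",
  "arguments": { "type": "nbd", "id": "exp0", "node-name": "fl",
                 "writable": true } }

# 4b. Or push backup with fleecing: a second backup job to an already-added
#     final target node, treating its source as immutable (proposed argument).
{ "execute": "blockdev-backup",
  "arguments": { "device": "fl", "target": "final", "sync": "full",
                 "immutable-source": true } }
```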
That's the last thing about backups; now a bit of information about inserting filters. In the latest release, the blockdev-reopen command gained the ability to change file children of block nodes and became stable, so it can be used to insert filters, as shown on the right side. The drawback of this command is that it can't change children of block backends, so we can't append a filter above the topmost image node in the graph, but we do need that for backup.

So in the published copy-before-write filter series, I improved the qom-set command so that it can change the drive property of a disk device. But this method allows us to append filters only above topmost image nodes in the graph. Another drawback is that qom-set is unrelated to the blockdev command family and is unlikely to ever be usable inside a transaction.

So we need a good API that allows inserting filters anywhere in the graph in a generic way, and that is my last patch series to present: blockdev-replace. The new command allows inserting a filter into almost any specific block graph edge, including children of block backends and block exports. Parents may be selected by their identifiers, which may be a qdev ID, a block export name, or a node name. It also supports what we need for backup: replacing a node in all of its parents, except where that would create loops in the block graph. The backup job works the same way when it inserts the copy-before-write filter.

To summarize, I have presented two new public block drivers: copy-before-write, which may be used on its own, and the fleecing driver, which makes sense only in a pair with copy-before-write. I propose a new fleecing scheme based on these drivers. Accompanying things are the immutable-source mode for the backup job and a new API for filter insertion. Now, if you have any questions, I am ready to answer them.
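For reference, the two insertion methods mentioned above look roughly like this in QMP. The first uses the stable blockdev-reopen command to swap the file child of a format node below the root; the second is only a sketch of the proposed blockdev-replace command, so its name and arguments are assumptions based on this talk (node names "fmt0", "flt", "cbw" and qdev ID "vblk0" are invented).

```
# Stable since QEMU 6.1: change the "file" child of format node "fmt0" so
# that an already-added filter node "flt" is inserted below it.
{ "execute": "blockdev-reopen",
  "arguments": { "options": [
      { "driver": "qcow2", "node-name": "fmt0", "file": "flt" } ] } }

# Proposed (not merged): replace the disk attached to qdev device "vblk0"
# with the copy-before-write filter "cbw"; argument names are hypothetical.
{ "execute": "blockdev-replace",
  "arguments": { "parent-type": "qdev", "qdev-id": "vblk0",
                 "new-child": "cbw" } }
```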