Hi everyone. My name is Senthil Rammurthy, and I am part of the FoundationDB team at Snowflake. Today I'm going to present the instant backup and restore feature that we have been developing. I'll start by explaining the current backup and restore implementation and some of the challenges we run into, then the new approach we are taking, how it solves some of those challenges, and what its caveats are. I'll conclude with a quick live demo of the backup feature.

The current backup implementation works at a logical level. A backup here is defined as a consistent copy of all the key-values in the database, and that consistent copy is obtained by reading the entire database through the FDB stack. There are many more details to the current backup implementation that I'm not going into because of time constraints. Restore works by playing the backed-up key-values back into an FDB cluster, again through the FDB stack (a minimal sketch of this logical path is shown further below). As you can see, reading every key-value in the database through the FDB stack consumes resources like CPU, network, and disk across the FDB cluster, and that impacts the foreground workload. The current restore mechanism also consumes resources, and on top of that it is slow: the FDB 3 version that we use is extremely slow, and while later versions are supposed to get better, they still may not meet our performance requirements.

Here is the impact of the current backup. This is a slide that Marcus, one of my teammates, already shared. The pink bar represents the IOPS driven purely by the backup process; this is from one of our production deployments.

Now let's switch over to the new instant backup and restore approach. With instant backup and restore, we don't look at backup from a logical standpoint; we look at it from a physical standpoint. A backup here is defined as a consistent copy of all the disk images. Specifically, FDB takes an application-level consistent snapshot of the persistent data of all the FDB processes across the cluster. This collection of disk images forms the backup: you can copy these disk images over, attach them to another set of machines, and restore the cluster instantaneously.

In this picture there is a cluster one with a set of FDB processes; the gray boxes represent the FDB processes. Some of them have persistent data, some of them don't. When you issue a snapshot command, which we introduced as part of this feature, it creates a copy of the disk images. The blue storage boxes that popped out are the snapshot disk images produced by the snapshot operation; if you can get a copy of those blue disk images, they are the backup. Cluster two is a vanilla cluster with no disk images attached to it. You can copy the disks over, attach them, and bring up the cluster. Of course, you need to modify fdb.cluster and foundationdb.conf, etc., so a little bit of tooling is necessary, but the restore is pretty much instantaneous. For this feature to work, you need either a snapshottable file system or snapshottable block storage. The next slide is pretty much the text of what I just explained, so I'll skip it.
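[Editor's note: to make the cost of the current, logical approach concrete, here is a minimal sketch of what dumping and replaying the whole keyspace through the client stack could look like with the FDB Python bindings. This is not the real fdbbackup implementation; the cluster-file path, chunk size, and API version are assumptions for illustration, and the naive chunked scan does not even give a single consistent version across chunks, which the real backup has to do extra work for.]

```python
import fdb

fdb.api_version(630)  # assumption: a 6.3-era client; the calls are similar across versions

src = fdb.open()                                     # source cluster (default fdb.cluster)
dst = fdb.open('/etc/foundationdb/restore.cluster')  # hypothetical target cluster file

def dump_all(db, chunk=10_000):
    """Scan the whole keyspace in chunks; every read goes through the full FDB stack."""
    begin, end = b'', b'\xff'
    while True:
        tr = db.create_transaction()
        kvs = list(tr.get_range(begin, end, limit=chunk))
        if not kvs:
            return
        yield kvs
        begin = kvs[-1].key + b'\x00'   # resume just past the last key returned

for kvs in dump_all(src):
    tr = dst.create_transaction()
    for kv in kvs:
        tr[kv.key] = kv.value           # "restore" = replay the key-values, again through the stack
    tr.commit().wait()
```

Even this simplified version makes the point: every byte travels through the storage servers and the client twice (once out, once back in), which is exactly the kind of load shown in the IOPS slide.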
So how does FDB create a consistent snapshot across the FDB cluster? For that, let us go through the flow of the snapshot create operation.

This is the standard FDB cluster architecture, and there are really three sets of persistent data in the cluster: the coordinators, which store information about where the master, the transaction logs, and so on are; the transaction logs, which keep the mutations; and the storage servers, which keep the final storage of all the key-values. I'll go through this quickly since I've already covered it. The client first sends a request to snapshot the coordinators to the proxy, which forwards it to the coordinators. When the coordinators see this request, they snapshot their own file system or block storage, depending on the configuration; the blue boxes that emerged are the snapshot disk images of the coordinators. Once that is complete, the client sends a snap create command to snapshot the transaction logs and the storage servers. This snap create is nothing but a specialized transaction, except that it is different in two ways. First, the snap create command goes to every transaction log and every storage server in the system. Second, when the transaction logs and storage servers see this special command, they respond by creating a snapshot of their disks or file systems. So in this picture you see the request go to the transaction logs, which snapshot their disk images and then send a success back to the client; shortly after, the storage servers pull the snap create and snapshot themselves. All of these blue disk images put together make up the snapshot, and you can use them later to restore a cluster. Since the snap create is a transaction, there is a version attached to it, and the transaction logs and storage servers all execute the snap create at exactly the same version of the database. That gives a point-in-time consistent image of all the disk images in the system.

Now, there are three parts to this. First, the snap create itself is very cheap; as we saw, it is about as cheap as committing a key-value to the database. Second, copying the disk images does take some resources, but it is significantly cheaper to copy them at the lowest level, at the file system or disk level, than to go through the entire FDB stack, and the copy can be highly parallelized, which reduces the impact on the user workload. On top of this, many cloud providers give you the option to snapshot the disk images or block storage and have a copy available without spending any additional resources on the production cluster. Third, restores are instantaneous: because you already have the disk images, you just attach them to any set of machines, and the cost is essentially that of recovering a cluster. The disadvantages are the dependency on a snapshottable file system or block storage, and that the production and backup clusters have to be somewhat homogeneous.
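[Editor's note: as a rough illustration of what "snapshot their own file system or block storage" can mean in practice, here is a tiny sketch of a snapshot hook that each process could invoke when it sees the snapshot request. The choice of ZFS, the script name, and the argument order are assumptions made for this sketch; the actual hook and how fdbserver invokes it depend entirely on how the deployment is configured.]

```python
#!/usr/bin/env python3
"""Illustrative snapshot hook: tag a ZFS snapshot of the data volume with the backup UID."""
import subprocess
import sys

def snap_create(uid: str, dataset: str) -> None:
    # e.g. dataset = "tank/fdb/storage-1"; the snapshot is named after the UID so that
    # all disk images belonging to one backup can be found and copied together later.
    subprocess.run(["zfs", "snapshot", f"{dataset}@{uid}"], check=True)

if __name__ == "__main__":
    # Hypothetical invocation: snap_create.py <uid> <zfs-dataset-for-data-dir>
    snap_create(sys.argv[1], sys.argv[2])
```

Tagging every disk image with the same UID is what lets the later copy step collect all the pieces that belong to one backup, which is exactly how the demo below identifies what to copy to the second cluster.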
Now I'll switch over to the demo. This is instant backup, really. I have a cluster one; I'm going to start a workload on it and watch the workload to make sure that transactions are happening. There is a display that shows the read rate, transaction rate, and so on, and now I can see the transactions are progressing. Next I switch over, start a CLI, and create a snapshot. This responds with a UID, and all the disk images are tagged with that particular UID. Then I invoke a cluster copy with that UID, which copies these disk images to cluster two.

While it is copying: this workload is self-describing, so we can validate the data on the second cluster by running a script. There is nothing else to do here, because the disk images have already been copied; I'm just going to start the cluster and watch for it to come up. This takes a few tens of seconds. The workload is self-describing in the sense of the cycle workload the FDB developers use: the set of key-values forms a cycle, so you can verify that the integrity of the database is still there. Once it turns healthy, I'm going to run a quick check. Okay, it's taking its own time. Okay, it turned healthy. Now I do a quick getrange, and then a cluster verify, which will take a while, so I'll leave it there while the integrity of the database gets verified.

That's the conclusion of my talk, and we're excited to have an alternative backup implementation for FDB operators to leverage.

Okay, thank you, Senthil.