Hello everyone. I would like to welcome our three speakers, John Strunk, Parul Singh and Ryan Cook, for the talk on making copies with VolSync: your data where you want it. I will be sharing the recordings. Thanks for coming to our talk today. I'm John Strunk, and today my co-presenters, Parul Singh and Ryan Cook, and I will be talking about data mobility in Kubernetes using the VolSync operator. Whether you're responsible for a single app or an entire Kubernetes fleet, it's critical to be able to easily move applications between clusters. Multicluster management tools and GitOps-based workflows make it easy to assign and reassign apps to clusters, whether for disaster recovery or just cost and performance optimization. Well, easy, that is, unless your apps are stateful. Multicluster workflows are great at managing and deploying Kube YAMLs, but they do little for persistent data. One of the main ways data is moved between clusters today is the traditional technique of relying on individual storage vendors. Storage-based replication has been the go-to for DR in legacy environments, but it's not well suited to Kubernetes. Kube clusters are deployed across a varied set of infrastructure and resource environments. Since storage-based replication typically requires the same underlying storage on both sides, that leads to inefficiencies. It should be possible to use the storage that's best suited for each environment, whether bare metal, cloud, or edge. Additionally, there are more reasons to move or replicate data than just DR, so the methods of replication need to be flexible as well. And that is why we created the VolSync operator. VolSync is a Kubernetes operator designed to provide data mobility. It provides storage-independent mobility, so you get the same set of capabilities no matter where your clusters are running and what storage they're using.
You're free to choose a storage system based on the axes that matter for the specific environment instead of making one choice globally just to get migration. Data mobility is really many use cases. It encompasses traditional IT's main use cases of DR and backup, but it also enables workflows that are particular to multicluster Kubernetes, such as cross-cluster data distribution and app migration. VolSync's capabilities are built around the notion of data movers. At the highest level, VolSync defines sources and destinations for data movement via its ReplicationSource and ReplicationDestination custom resources. These CRs define common items such as which data to replicate and how often. Below that, actual data transfer is handled by a set of data movers that are tailored to different uses. There's an rsync-based mover for one-to-one replication, such as application migration or disaster recovery. The Rclone-based mover supports one-to-many data distribution for things like sending data to a set of edge clusters. And the Restic-based mover supports creating simple backups of persistent data to complement GitOps workflows. Following up on that overview, we have three quick demos showing some of the capabilities of VolSync. I'll start by showing application migration using the rsync mover. Then Parul will follow that up with a demo of the Rclone mover for data distribution, and Ryan will finish up by showing how the Restic mover provides PV backups. In the first demo, we're going to show how VolSync can be used to migrate a stateful application between clusters. Here we have an application running on a primary site and we want to move it to the secondary. To move the data, we have the VolSync operator deployed in both clusters. Once all the data has been sent, the destination creates a snapshot of the volume to preserve a point-in-time record of the data. Subsequent synchronization iterations just send updates and generate new snapshots.
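As a rough sketch of what that top-level custom resource looks like, here is a minimal ReplicationSource based on VolSync's v1alpha1 API. The namespace, PVC name, and schedule are illustrative, not from the demos:

```yaml
# A ReplicationSource says which PVC to replicate, how often,
# and which data mover to use (rsync, rclone, or restic).
apiVersion: volsync.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: wiki-source
  namespace: wiki             # hypothetical app namespace
spec:
  sourcePVC: wiki-data        # hypothetical PVC holding the app's state
  trigger:
    schedule: "*/10 * * * *"  # cron-style: synchronize every 10 minutes
  rsync:                      # exactly one mover section is specified
    copyMethod: Clone         # clone the PVC to get a point-in-time copy to send
```

The mover section (`rsync:` here) is where mover-specific settings such as addresses, SSH keys, or repository secrets go.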
When the time comes to actually move our application, the latest snapshot on the destination is used to create a PVC for the application. Now that we've seen a description of how it works, let's see it in action. This is a quick demo showing how VolSync can be used to migrate an application between two clusters. The first thing that we're going to do is deploy a simple wiki application onto a kind cluster that I have running here on my laptop. We'll create a namespace for it and then apply the manifests to launch the wiki. All right, let's watch it spin up. What we have here is a deployment for the wiki, and it has a persistent volume claim to handle its persistent state. Excellent, now that the wiki is ready, we can access it. We'll set up port forwarding so that we can get to the web endpoint, and hop over to Firefox and access it. Okay, so here we have a blank wiki. Let's add a page. Here we can put some content onto the main page, and there it is. All right, now let's get to work at migrating this over to a different cluster. The first thing that we're going to do is set up replication of that persistent state over to our other cluster. In order to do that, I'm going to start by creating a config file for VolSync. The details on this are in the documentation. And now we'll tell VolSync to start that replication after we create a namespace for it over on our destination cluster. We're replicating in this case from a kind cluster on my laptop to a Kubernetes cluster that's running in AWS. What we see here is VolSync creating a ReplicationDestination over on the destination cluster, waiting for the operator to populate the keys and endpoint address, and then it's going to copy that back over to our local kind cluster to set up the replication. And if we look at what we now have on our source cluster, we can see that we have a ReplicationSource object.
And as a result of that, the VolSync operator has cloned the wiki's persistent data and has started up a pod here to replicate it over to our destination cluster. Now taking a look at the destination cluster, we'll see some corresponding objects over there. Here on the destination cluster, what we have is a ReplicationDestination object. As a result of that, we have a PVC that's going to be receiving the incoming data, and the VolSync operator has started a pod to accept the transfer of incoming persistent data. It's also created a load balancer to serve as a network endpoint for that incoming transfer. Okay, now that we have a synchronization that has completed from the source over to our destination cluster, we can see that we have a volume snapshot over there, and that is holding a point-in-time copy of the data that was replicated from our source cluster. In a moment, we'll use that to spawn our wiki over here on the destination cluster. All right, so now that we've got that, let's actually go ahead and migrate our application. The first thing that we're going to do is quiesce our wiki application by scaling its deployment down to zero, and let's watch everything shut down here. We see that our wiki pod just disappeared, so everything is shut down, and we can tell VolSync to proceed with the final synchronization. This is going to run one last synchronization iteration to make sure that all of the final data makes it over to our destination cluster. And now that that's done, the other thing that VolSync did for us was it took that most recent snapshot on the destination cluster and turned it into a PVC for us, so that we can use that to spin up our wiki on the destination cluster. So what we're going to do is take that PVC name and adjust our manifests so that the wiki uses that PVC whenever it spins up, and then we will apply those manifests to the destination cluster.
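The destination-side objects described above come from a ReplicationDestination CR. A minimal sketch, assuming VolSync's v1alpha1 API, with names and sizes made up for illustration:

```yaml
# The ReplicationDestination receives incoming rsync data and
# snapshots it after each completed synchronization.
apiVersion: volsync.backube/v1alpha1
kind: ReplicationDestination
metadata:
  name: wiki-destination
  namespace: wiki                # hypothetical namespace on the destination cluster
spec:
  rsync:
    serviceType: LoadBalancer    # expose a network endpoint for the transfer
    copyMethod: Snapshot         # take a VolumeSnapshot after each sync completes
    capacity: 2Gi                # size of the temporary PVC that receives the data
    accessModes: [ReadWriteOnce]
```

The operator reports the generated SSH keys and the endpoint address in the CR's status, which is the information the demo copies back to the source cluster.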
And now that the wiki should be starting up on the destination cluster, we can tear down the replication relationship between the two, since we're done with that. And let's watch everything start up. All right, what we see here is that VolSync is still cleaning up a little bit, so it is terminating its transfer pod on the destination cluster as well as getting rid of its temporary PVC. And we see that the wiki is starting up, here's its deployment, and it is using the PVC that VolSync created for us. All right, now that our wiki is ready, what we're going to do is copy the address over in AWS, hop on over here to our web browser, and see if we can access it. And there you go. There's our wiki, all migrated over to our second cluster running in the cloud. Now let's discuss another use case of VolSync: high fan-out, one-to-many data replication. You can use VolSync to distribute data from a primary or central hub to many edge sites. To do that, we deploy the VolSync operator on both clusters. Then, when you deploy the ReplicationSource CR on the primary site, the operator picks it up and initiates the data distribution. It first creates a point-in-time copy, or snapshot, of the source PVC and uses a temporary volume to move the data. The VolSync operator launches a job that uses the Rclone data mover to copy the source data from the temporary volume to an intermediate storage unit, like an object bucket. Similarly, the operator on the destination site triggers the Rclone data mover job on a schedule to pull the data from the intermediate storage to the destination cluster, which is an edge site. VolSync will copy the data from the object bucket to a temporary volume on the destination. It will then create a point-in-time snapshot of the copied data and update the ReplicationDestination CR with the volume snapshot image. To update the application, the latest snapshot on the destination is used to create the destination PVC for the application.
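The source half of that Rclone flow can be sketched as a ReplicationSource with an `rclone:` mover section. This assumes VolSync's v1alpha1 API; the bucket path, config section name, and schedule are illustrative:

```yaml
# Rclone-based source: push the PVC's data to an intermediate object bucket.
apiVersion: volsync.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: mysql-source
  namespace: source                     # the primary/hub site namespace
spec:
  sourcePVC: mysql-pv-claim             # hypothetical source volume
  trigger:
    schedule: "*/15 * * * *"            # push updates every 15 minutes
  rclone:
    rcloneConfigSection: aws-s3-bucket  # section name inside the rclone.conf
    rcloneDestPath: volsync-demo-bucket # bucket/path used as intermediate storage
    rcloneConfig: rclone-secret         # Secret holding the rclone.conf contents
    copyMethod: Snapshot                # point-in-time copy of the source PVC first
```

Because the intermediate bucket decouples the source from the destinations, any number of edge clusters can pull from the same path, which is what gives the one-to-many fan-out.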
Now, I'm going to demonstrate how you can use VolSync to do one-to-many, high fan-out data replication. For the purpose of this demo, I already have a kind cluster running and I've deployed the VolSync operator. Now, let's start with the demo. The first step is installing a target application, which is a MySQL database, on the source site. We will be copying the content of this application from the primary to the secondary, or edge, site. So let's go and create a namespace called source that will act as the primary site. Once the namespace is created, we are going to create the MySQL database in that namespace. As you can see, MySQL is up and running, and it uses this volume to store its content. Let's go and create a dummy database in this application. I'm going to create a dummy database called synced. When the replication is finished, we will verify that this dummy database is also present on the edge site. If it's present, that means that VolSync was able to do the data replication successfully. Once we have created this dummy database, I'm going to create the Rclone secret. VolSync uses the Rclone data mover to copy the data from the primary to the intermediate storage, which is an S3 object store deployed on AWS. So I have the Rclone secret. Now I'm going to deploy the ReplicationSource CR. Once the ReplicationSource CR has been deployed, the operator picks it up and starts the replication process. Let's go and see what resources the operator creates for the replication to happen. The operator creates a job pod that copies the content of the target application's volume onto this temporary volume created by the operator. It then copies the content of the temporary volume onto the intermediate storage unit, which is S3 in our case, using the Rclone secret we created previously.
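The Rclone secret mentioned here is, as I understand it, a Kubernetes Secret that wraps an ordinary `rclone.conf` file. A sketch with placeholder credentials, assuming an S3 remote; the section name and values are illustrative:

```yaml
# Secret wrapping an rclone.conf; the mover mounts this to reach the bucket.
apiVersion: v1
kind: Secret
metadata:
  name: rclone-secret
  namespace: source
stringData:
  rclone.conf: |
    [aws-s3-bucket]           # must match rcloneConfigSection in the CR
    type = s3
    provider = AWS
    env_auth = false
    access_key_id = <ACCESS_KEY>
    secret_access_key = <SECRET_KEY>
    region = us-east-1
```

The same secret (with the same section name) is created on every destination cluster so the pull side can read from the bucket.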
The second step is to initiate the replication process on the destination site. To do that, first we need to create the destination namespace. Then we are going to create the Rclone secret, as we did on the primary site, which will be used by the operator to copy the content from the intermediate storage onto the destination site. So we have the Rclone secret created. Next, I am going to deploy the ReplicationDestination CR. As soon as you apply it, the operator picks it up and creates the other resources to initiate the replication on the destination. On the destination, the VolSync operator creates a job, and the job pod is running, as you can see. What this job pod does is use the Rclone secret to copy the data from S3 onto this temporary volume, which I will highlight over here. After it has copied the content onto this temporary volume, it creates a point-in-time snapshot of this volume, which is listed here. Once the state of this point-in-time snapshot is ready to use, the operator updates the ReplicationDestination CR with the name of the snapshot. Next, what we need to do is extract the latest snapshot image name from the ReplicationDestination CR, which is listed over here. To finish this replication, we need to create the MySQL database application on the destination site and ensure that the PVC that is created for this application is mirrored from this snapshot image. So the last step is updating the destination application with the latest snapshot image. All I'm going to do is update the PVC YAML so that the destination database application creates its PVC from this volume snapshot name. Now I'm going to create the MySQL application on the destination site. Let's get into the MySQL pod and verify that the dummy database synced is present.
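The destination-side CR and the snapshot-backed PVC described above might look roughly like this, assuming VolSync's v1alpha1 API. The snapshot name shown is a placeholder for whatever the ReplicationDestination reports in its status:

```yaml
# Rclone-based destination: pull from the bucket on a schedule, then snapshot.
apiVersion: volsync.backube/v1alpha1
kind: ReplicationDestination
metadata:
  name: mysql-destination
  namespace: dest
spec:
  trigger:
    schedule: "*/15 * * * *"            # pull from the intermediate bucket periodically
  rclone:
    rcloneConfigSection: aws-s3-bucket  # same section as on the source side
    rcloneDestPath: volsync-demo-bucket
    rcloneConfig: rclone-secret
    copyMethod: Snapshot                # snapshot the received data when complete
    accessModes: [ReadWriteOnce]
    capacity: 2Gi
---
# The application's PVC is then created from the latest snapshot
# reported in the ReplicationDestination's status.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pv-claim
  namespace: dest
spec:
  dataSource:
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
    name: <latest-snapshot-image>       # substitute the name from the CR's status
  accessModes: [ReadWriteOnce]
  resources:
    requests:
      storage: 2Gi
```

Pointing the PVC's `dataSource` at the VolumeSnapshot is what makes the destination application start with the replicated data already in place.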
As you can see, the MySQL application on the destination, the edge site, also has the dummy database that we created on the primary site. And that's how VolSync does one-to-many, high fan-out data replication. Thank you. The final replication component we will talk about is Restic. Our implementation of Restic is used primarily for recording a point in time and then allowing us to recover back to that moment. VolSync allows us to perform this action either manually or on a schedule. This means you can run the process manually before the rollout of a new release, or just have Restic running at designated times against your PVC to ensure a copy always exists. Restic operates similarly to Rclone in that persistent volume claim assets are stored in object buckets. A secret containing information about our object storage and the PVC are mounted by the ReplicationSource pod. The ReplicationSource pod executes the Restic binary using the configuration secret and PVC to send the persistent data to the object store. The restore process operates in reverse of the backup. The PVC is mounted by the ReplicationDestination pod, and then optionally we can create a snapshot or clone of the data to create a new PVC which will be used by our application. Let's see this process in action. I called the demonstration "deploying with a backup plan." Like you saw earlier, we have a wiki application that uses a PVC. Here you can see the message, and you can also see that our ReplicationSource has been defined and it is currently backing up our DokuWiki application. We have our application YAMLs already saved on our system or in Git, but that doesn't really help us when a mistake like this happens. I accidentally just deleted our application, which has our persistent data and volume snapshots. So this is a bad thing. The good news is that I can redeploy my application, but the bad news is I'm not going to have application data. My PVC and my volume snapshots are gone.
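A scheduled Restic backup like the one running against the DokuWiki PVC could be sketched as follows, assuming VolSync's v1alpha1 API. The PVC name, schedule, and retention numbers are illustrative:

```yaml
# Restic-based source: back up the PVC to a restic repository on a schedule.
apiVersion: volsync.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: wiki-backup
  namespace: wiki
spec:
  sourcePVC: dokuwiki-pvc     # hypothetical PVC to protect
  trigger:
    schedule: "0 * * * *"     # hourly; a manual trigger can be used before a rollout
  restic:
    repository: restic-config # Secret naming the repo location and passwords
    copyMethod: Clone         # clone the PVC so the backup is point-in-time
    pruneIntervalDays: 14     # how often to prune old data from the repository
    retain:                   # how many snapshots to keep at each granularity
      hourly: 6
      daily: 5
      weekly: 4
```

The retention policy is what makes this usable as an always-on safety net: old backups age out automatically instead of accumulating in the bucket.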
This is where VolSync and Restic are going to save us. To recover from this, first we're going to deploy our namespace. Next we're going to deploy our PVC, and then we're actually going to create the Restic config secret. This links back to our S3 bucket and has the credentials that we need to pull the data back. And then finally, we're going to create our restore. Taking a look at this file, what we see here is just a very simple restore. We're going to pull back into the destination PVC of DokuWiki. We're going to run the manual restore process, and we're using the Restic config secret that we defined earlier. So looking at our pods now, we will see that in a moment this container will create and run. Just like that, our container ran the Restic process and populated our PVC. So now we can start the remaining portion of our application and create the deployment. We can recreate the required service. We will create our required secret, and then we'll wait for the application to begin. So what we'll do is get pods and see where we are in our deployment, and our application has started. Now let's use the service address to access our DokuWiki. Going back to the browser, we will see that our application is back with the message that we expected. So what we did was delete the namespace and everything contained below it, then used Restic to recover all of that data for us, and then we redeployed our application. Now that you've seen how VolSync works, you may be wondering what's next for the project. There's more to moving an app than just replicating data, as we saw in the rsync demo. Integrating with other tools, such as multi-cluster management frameworks, will be key to providing a good experience. We also saw the CLI in the demos. Everything that VolSync can do is controlled by custom resources, but the CLI makes creating and manipulating them easier.
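The restore steps above can be sketched as a Restic config secret plus a manually triggered ReplicationDestination, assuming VolSync's v1alpha1 API and restic's standard environment variables. All names, paths, and credentials are placeholders:

```yaml
# Secret pointing at the existing restic repository in S3 (values illustrative).
apiVersion: v1
kind: Secret
metadata:
  name: restic-config
  namespace: wiki
stringData:
  RESTIC_REPOSITORY: s3:s3.amazonaws.com/backup-bucket/wiki
  RESTIC_PASSWORD: <REPO_PASSWORD>
  AWS_ACCESS_KEY_ID: <ACCESS_KEY>
  AWS_SECRET_ACCESS_KEY: <SECRET_KEY>
---
# A one-shot restore into a pre-created PVC.
apiVersion: volsync.backube/v1alpha1
kind: ReplicationDestination
metadata:
  name: wiki-restore
  namespace: wiki
spec:
  trigger:
    manual: restore-once          # run a single manual restore, not a schedule
  restic:
    destinationPVC: dokuwiki-pvc  # restore directly into this existing PVC
    repository: restic-config     # the secret above
    copyMethod: Direct            # write into the PVC rather than snapshotting
```

Once the mover pod finishes, the PVC holds the recovered data and the application deployment can be recreated against it, which is the flow the demo walks through.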
For example, it has context about both the source and the destination, so it can move configuration information between the clusters. We plan to continue developing the CLI with the goal of making the common operations easier. You've also heard about the three movers that currently exist in VolSync. When we started, there was only rsync. The others were added as needs arose: Rclone for edge, and Restic for PV backup to support GitOps. In the future, we'd like to incorporate more. The most interesting right now seems to be Syncthing. It provides the possibility of cross-cluster, active-active data sharing. The eventual consistency won't be for everyone, but we believe it could be useful for a certain class of applications. If you're looking for more info, here's where you can find us. You can come check us out on GitHub. Our documentation can be found over on Read the Docs. And if you want to give it a try, our Helm chart is available on Artifact Hub. Just search for VolSync. Thank you. Hey, folks. We're here. Thanks for coming to the talk today. If you've got any questions, just put them in the chat, and we'll be happy to take a stab at them. If not, and you're just looking for more info, we've got a couple of links over in the chat to the GitHub page as well as the documentation site. Yeah. Thank you very much, John. It was a wonderful talk and demo. I have put the breakout link in the chat as well, so people can hop over there and discuss other things with you. Yeah. Thank you. Great. Sounds great. Sounds good.