All right, everybody. Happy Monday. Welcome back to OpenShift Commons. Today, as we like to do on Mondays, we have an upstream project with us and many of the team leaders, and we're going to make them tell us all about their project. And if I get this right, it's asynchronous data replication, which is what Scribe does for us. And we have John Strunk, Ryan Cook, Parul Singh; I see Scott Creely somewhere in the background, and Guy Margallet. I'll let you introduce your team, John, and there's going to be some live-ish demos here. So ask your questions in the chat, per usual, and we will try and get to them all in the end and have live Q&A. So take it away, John. Awesome. Thank you, Diane. So yeah, we're here today to tell you a little bit about Scribe, and there's a number of folks working on the project right now. As Diane said, I'm John Strunk, and we've got Ryan, Parul, Guy, and Scott that are also helping out. So let's start with a quick overview of what we're going to cover today. We're going to start with a few intro slides on what exactly Scribe is about, and then we've got three demos, because everybody likes demos. So we've got three demos queued up for you, and then we'll finish it off with a little bit of Q&A. Let's get to it. So let's talk a little bit about your data. Kubernetes and GitOps work really well for stateless applications. If your pod crashes or you lose a node, Kubernetes is more than happy to reschedule your pod somewhere else in the cluster. And in the event that your cluster goes down, if you're managing your application and configuration via GitOps, you have all that information, and it's easy enough to just apply it to your backup cluster. And again, you're back up and running nice and easy. But if you've got stateful applications, it's not really that simple. You may have the configuration for your app ready to apply, but the hard part is getting your data over to that secondary cluster. Right? That's the difficult part. And that is why we decided to build Scribe. So what Scribe is, is a Kubernetes operator that is designed to do cross-cluster asynchronous data replication. And it does this in a storage-system-independent way. So you don't actually need the underlying storage system to support the data replication, right? We handle it all on top. And one of the nice bits about that is that you're not forced to run the same storage system on all of your sites. So for example, if one of your clusters is running in the cloud, you can use a storage system that is optimized for that. Whereas if you've got some small cluster running at an edge site, you can use something that's customized to that resource-constrained environment. And you can still replicate your data between those sites. Scribe makes use of the CSI capabilities of clones and snapshots if they're available. We use those in order to create point-in-time copies of your data to replicate. But if your storage driver or storage system doesn't support them, that's okay. We can still copy your data without them. As well, Scribe is designed around an extensible architecture, so that if the storage system does support optimized replication natively, that could also be integrated with Scribe. So when we think about where we might want to use Scribe, probably the first thing that comes to mind is disaster recovery, right? Replicating your application's data from a primary cluster to a secondary cluster. But it's also useful for data distribution scenarios.
So perhaps you have a central cluster that's generating some data and you want to replicate that out to a number of edge sites, right? You can use Scribe for that. It's also useful for, say, data migration within your cluster. So if you want to swap out your storage system, maybe change vendors, that sort of thing, you could use Scribe to move your data that way, as well as for migrating data between cloud and on-prem environments. You could also use it for off-site analytics, or even just replicating your production data for dev and test scenarios. So Scribe is built around this notion of data movers. We have one data mover that is based on rsync, and that's really optimized for one-to-one volume relationships. So for example, that asynchronous disaster recovery scenario, right? Where you're trying to replicate a volume from a primary to a secondary cluster. And then we have an Rclone-based data mover that is optimized for one-to-many volume relationships, for data distribution scenarios. And both of these support in-cluster (cross-namespace) as well as cross-cluster replication. And all of Scribe is built on top of just Kubernetes and CSI primitives, which means that Scribe is really well positioned to take advantage of some of the upcoming data management enhancements that are coming in Kubernetes, like volume groups, volume populators, and ContainerNotifier as well. So in terms of where we are today with Scribe: we made our first release, I guess, about two weeks ago, and that has both the rsync and the Rclone data movers in it. So I encourage you all to go and check that out. And the way to do that is to head on over to artifacthub.io. We have a Helm chart up there, and you can grab that Helm chart and install it on both Kubernetes and OpenShift. So we'd love to have your feedback on what you think of the initial release. And in terms of where we're going, high up on our list is getting Scribe packaged for OperatorHub so that it's a little bit more front and center in OpenShift. As well, we are in the process of adding a third data mover that is based on restic, and that is to handle more archive-type use cases. We're working on adding metrics to the operator so that it's easy to keep track of the current status of the replication relationships. And we're also working on adding some helper programs to make it a little bit easier to replicate data into and out of Kubernetes environments as a whole, right? Because we realize that not everybody's IT environment is 100% Kubernetes at this point. And then finally, where to find us: you can go check out the documentation, it's at scribe-replication on Read the Docs, or you can check out our code on GitHub, and we'll put the slide back up again at the end. So that is the overview. And now what we're going to do is we've got three demos that are queued up for you. The first one is going to show off the rsync-based data mover, specifically in a disaster recovery scenario. It takes a look behind the scenes a little bit at what Scribe is doing in the background in order to replicate the data from one cluster to another. Then there is a second demo that shows off the Rclone data mover, and that is to show how it could be used for data distribution. And then we've got a quick third demo, and that one is really about how Scribe can be integrated with Red Hat Advanced Cluster Management to really simplify the management of your stateful apps.
So as we get into the first demo, what I'm going to do is provide a little overview here on this slide of kind of what you're about to see. So what we have is two different clusters. So we've got a primary site that is running an application that has some data volume. And then we've got a secondary site. And we want to replicate that data over to the secondary site so that we could move our application. And so what we're going to do in the demo is we're going to create a custom resource, a replication source over on the primary side that points at that data volume to replicate. And then on the secondary side, we're going to create a replication destination that provides a target for us to replicate the data to. Once those are in place, what Scribe is going to do is create a data pipeline from the primary site over to the secondary. So with each sync iteration, it's going to take the application's volume, clone it, and give that to the rsync data mover. That data mover is going to push the volume contents over the network to a volume on the secondary site. And then once everything makes it across, Scribe will take a snapshot to preserve a point in time copy of that data. And then based on the schedule in the replication source, that process repeats and updates the snapshot on the secondary site. Then whenever it comes time for us to move the application across, what we do is just take that most recent snapshot, turn it back into a volume, and spin up the application. So that is what you're about to see in the demo. And so what I'm going to do now is stop sharing my slides and have Diane play the first demo for us. This is a quick demo of cross cluster replication using Scribe. What we see here is I have two clusters. I have a primary cluster in this first window running in US West. And over here in the other window, I have a secondary cluster running in US East. And what we're going to do is use Scribe to ensure that our data is replicated so that we can move an application between clusters if necessary. So let's start by taking a look at our primary cluster. So what we see in this cluster is that I have a simple wiki application that's running and here's the pod for the wiki. And the data is residing in a PVC also in this namespace. And what we need to do is replicate this data over to our secondary cluster so that we could fail over our application if necessary. If we take a look at the secondary cluster, what we'll see over here is that we have the same application deployed. However, it's currently scaled down to zero. And you'll also notice that it doesn't have any PVC associated with it because we haven't replicated our data over here yet. So like we saw earlier in the slides, let's go ahead and set up Scribe to do the replication. So first thing that we're going to do is set up the replication destination over here on the secondary cluster. When we take a look at the Scribe CR, what we see is that we're asking it to create a 10 gig volume for the incoming data and then with each sync iteration to preserve a point in time image via snapshot. So let's go ahead and add that to the cluster. Now that we've inserted that into the cluster, let's go and check out our namespace again. Now what we see is here's that replication destination that we just created. And the operator has taken that and it's working on setting up the infrastructure necessary to accept the incoming transfers. 
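For reference, here is a minimal, hedged sketch of roughly what that ReplicationDestination looks like, based on the Scribe documentation from around the time of this talk (the scribe.backube/v1alpha1 API group); the metadata names are placeholders rather than the exact names used in the demo, so check them against the current release:

```yaml
# Hedged sketch of an rsync ReplicationDestination: provision a 10Gi PVC
# for incoming data, snapshot it after each sync, and expose the rsync/SSH
# endpoint through a LoadBalancer service.
apiVersion: scribe.backube/v1alpha1
kind: ReplicationDestination
metadata:
  name: wiki-destination         # placeholder name
  namespace: wiki                # placeholder namespace
spec:
  rsync:
    capacity: 10Gi
    accessModes: ["ReadWriteOnce"]
    copyMethod: Snapshot         # preserve a point-in-time image each iteration
    serviceType: LoadBalancer    # endpoint the source cluster will connect to
```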
So the first thing that we see is that the operator has created a PVC in the namespace that's going to receive the incoming data. It has also set up a load balancer that's going to act as the endpoint for our source to eventually connect to. And in a minute, once this PVC gets finished binding, we should see the rsync pod start up. So there we have our PVC, and we have the rsync data mover pod that is currently spinning up. And that's actually going to accept the connection from the remote side. All right, so now that that's done, let's take a look at the custom resource. And what we see here is that Scribe has added to the status field the connection parameters necessary for us to configure our source so that it can transfer data to this location. That information consists of the address to connect to, right? So that's our load balancer. As well, it has given us a set of SSH keys that exist in a secret here in this namespace. And we need to transfer that secret over to our primary cluster. So what I'm going to do is save that secret out to a file. And now let's go over to our primary cluster and set up the replication source. So over here, the first thing that we're going to do is insert that secret. And now let's edit the replication source CR. Okay, what this is going to do is define what data we need to replicate. So the first thing that we'll notice here is we're specifying a source PVC to replicate, and that is our wiki's PVC. We've set up a schedule to replicate once every two minutes to our secondary site. And then we need to specify the SSH keys, which were in that secret that we just inserted. And then there's the one thing that I need to change, which is the actual address to connect to. So we're just going to go and copy that from what the replication destination provided. And we should be all set. So let's insert that into the cluster now. And now if we go and take a look at our primary namespace, I'll see that the operator has started to work here. And so it is just spinning up the first replication of data from our primary over to the secondary. And what it has done is it has created a volume snapshot from the application's PVC. And currently we're waiting for that snapshot to become ready. But once that finishes, it will be turned back into a PVC for use by Scribe. And then that will get picked up by the source-side rsync data mover and sent over to the other side. In the meantime, let's take a look at our application. So this is the endpoint for our application running on our primary cluster. Here we see it's just a simple wiki. We can come in and we can edit the data. And we see that makes changes. And eventually those changes will get replicated over to the other side. So it looks like right now Scribe is in the process of replicating data over to the remote site. While that happens, let's go and see what's going on over here. So we are awaiting that first transfer still. Okay, so the first transfer has completed. And what we see over here is that there's now a volume snapshot on the secondary cluster that has a copy of that data that was replicated. And if we look back here at the source, what we see is that we are going to be starting another replication iteration very shortly. And it's probably this next iteration that's going to copy our little wiki change across. So we will check back in a minute. Okay, here we see that our next sync iteration has started. So Scribe has taken a snapshot of the application's volume.
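For reference, the ReplicationSource configured a moment ago might look roughly like the following hedged sketch; the PVC name, secret name, and address are placeholders standing in for the values copied from the destination cluster's status, and the field names should be verified against the Scribe docs for your release:

```yaml
# Hedged sketch of an rsync ReplicationSource: replicate the wiki's PVC
# every two minutes to the address reported by the ReplicationDestination.
apiVersion: scribe.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: wiki-source              # placeholder name
  namespace: wiki                # placeholder namespace
spec:
  sourcePVC: wiki-pvc            # placeholder: the wiki's data volume
  trigger:
    schedule: "*/2 * * * *"      # replicate once every two minutes
  rsync:
    sshKeys: scribe-rsync-dest-keys                   # placeholder secret name
    address: a1b2c3d4.elb.us-east-1.amazonaws.com     # placeholder LB address
    copyMethod: Snapshot
```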
And we're currently waiting for that to be processed into a usable snapshot. Once that succeeds, we'll again get a new persistent volume that will be used by the rsync data mover to update the volume on the remote side. Okay, so our snapshot is ready to use. So this PVC should bind shortly. There we go. And now the rsync data mover container should start and send the data across. There we go. Now over here on the remote site, it has updated the volume snapshot to contain the most recent data. But let's go and take a look again at our replication destination, and again take a look at the status. And what we see is this latestImage field. And that always tells us what the most recent volume snapshot is, so that if we want to spin up our application, we can do that. So we're going to take the name of that volume snapshot, and we are going to add that to our kustomization and use that to scale up our application over here on the secondary site. And what happens with that kustomization is it goes and creates a new PVC from that latest snapshot, as well as scales up the wiki deployment. So if we go back and look at the namespace again, what we'll see is we now have a PVC for our wiki. It was created from that snapshot whose name we copied in. And we have scaled up the deployment. We're currently waiting for it to become ready. Here's the pod for it. Again, we're waiting for that. But as soon as that becomes ready, then we'll be able to head back over to our browser, and we should be able to see the edit that we made to the wiki back over on our primary site, except it'll be here on our secondary. Okay, our pod is ready. So I'm going to copy the address here for our secondary site. This was the primary with our little edit. I'll open a new tab and paste in our secondary site. And here we are. So notice this one is over in US East. And there's the edit that we had made back on our original cluster. So that's a quick overview of how Scribe works. Thanks for watching. There we go. Do we want a few more slides here, or go right into demo two? Oh no, we have... I have the control, I'm driving this one. Okay, take it away. Thanks. Okay, so to wrap up the key points from the first demo: we saw the replication of a wiki application from primary to secondary. And to do that, we saw how the Scribe operator replicates by using a point-in-time copy of the application data, and it preserves that image name in its CR on the secondary site. To restore the application on the secondary site, all we need to do is ensure that the destination, or secondary, PVC is restored from the snapshot that is preserved in the CR. Okay, so we saw how we do rsync-based replication. Now we have a second demo that shows how to use Scribe for a wide fan-out replication, which we believe has potential use cases in edge scenarios, where you need to replicate or distribute data from a primary site to multiple edge sites. The pipeline is more or less the same as the rsync one, which John explained in the previous slides. So we have an application that uses a PVC to store its data, and we have a replication source CR running on the primary site. At the start of each replication iteration, the operator creates a snapshot or a copy of the volume using the CSI driver if available, or it does so in a non-CSI fashion if those capabilities are not available in the cluster. Once it has created the snapshot, it moves that data onto intermediate storage; in our case, it's an S3 object store.
It creates a Scribe Rclone-based data mover job that pushes the data onto the object store. On the edge side, you have a similar Scribe data mover job that pulls this data and creates a temporary PVC to store it. It then creates a snapshot or a copy of that data and stores that image name in the replication destination CR. To ensure that your edge application is up to date with the data on the primary, all you need to do is restore the PVC from the image name that is stored in the replication destination CR. So you can see that the Rclone-based data mover uses a push-and-pull mechanism, where the central hub or the primary site pushes the data onto intermediate storage and the edge cluster pulls the data from the intermediate storage. For the demo that I'm going to show next, I have a kind cluster with two namespaces: the source namespace, which acts as the primary site containing the source of truth, and the dest namespace, which acts as the edge site that will pull the data. The application I am going to use is a MySQL DB, the PVC is provisioned by the hostpath provisioner, and we are going to use the snapshot feature of the CSI driver. Can you play the video please? So for the purpose of the demo I already have a kind cluster running, and I'll deploy the Scribe operator, which you can see over here. Next I'm going to create two namespaces, the first called source, which will act as the primary site, and the second called dest, which will act as the edge site. So that is my first namespace and this is my second namespace. So in my primary namespace, or my primary site, what I'm going to do is deploy a MySQL database application. Let's see if the application has been deployed or not. Okay, so it is creating the container. Let's give it a few seconds to get up and started. Okay, as you can see, the MySQL application is up and running. Let's see what PVC it is using to store its data. So if you see here, it is using a PVC called mysql-pv-claim, and what Scribe will do now is create a point-in-time snapshot of this PVC and copy that data onto the intermediate storage, which is our S3 bucket. So now that we have verified that the database is running, let's go ahead and create a new database in this application. As you can see, right now we have four databases inside this application. Let's create a new database called synced, and we will verify whether this particular database shows up on the edge site or not. So we have a new database, and we will verify if this new database is present on the edge site once the replication is completed. So now that we have created a new database, it's time for us to deploy the replication source CR on the primary site. But before we do that, we need to create a secret, which is called the rclone secret, and the operator will be using this secret to push the data onto the intermediate site, which in our case is an AWS S3 object store. So I am going to go ahead and deploy the secret. So you can see that a new secret has been created. Now it's time to deploy the replication source CR. So we have deployed the replication source CR. Let's go and evaluate this CR. As you can see, it has a status field and it has a trigger. What that means is it is scheduled to run every five minutes to push the data onto the intermediate storage, and the source PVC that it is going to use is mysql-pv-claim, and you can see that this data mover pod is running. So let's go and evaluate what's happening behind the scenes.
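For reference, a hedged sketch of roughly what that Rclone-based ReplicationSource could look like. The field names (rcloneConfig, rcloneConfigSection, rcloneDestPath) follow the Rclone mover's documented options at the time, and the secret, config section, and bucket names below are placeholders, not the demo's actual values:

```yaml
# Hedged sketch of an Rclone ReplicationSource that pushes a point-in-time
# copy of mysql-pv-claim to an S3 bucket every five minutes.
apiVersion: scribe.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: database-source          # placeholder CR name
  namespace: source
spec:
  sourcePVC: mysql-pv-claim
  trigger:
    schedule: "*/5 * * * *"      # push every five minutes
  rclone:
    rcloneConfig: rclone-secret          # secret holding rclone.conf credentials
    rcloneConfigSection: aws-s3-bucket   # placeholder config section name
    rcloneDestPath: scribe-demo-bucket   # placeholder bucket/path
    copyMethod: Snapshot                 # CSI snapshot for the point-in-time copy
```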
So we see that the Rclone-based data mover job is running, and the operator has also created a PVC called scribe-src-database-source, and this PVC is created, or mirrored, from this snapshot, which is over here. So basically the Scribe operator creates a snapshot from this PVC, it creates a temporary PVC called scribe-src-database-source, and the Rclone-based data mover job uses this PVC to push the data onto the intermediate object store. So let's wait for the operator to finish moving the data. Okay, as you can see, the Scribe operator has finished, and it's now time to verify whether the replication has been initiated on the destination site. To do that, I am in my destination namespace, and just like on the source site, you have to create a secret, which you can see over here, and now I'm going to deploy the replication destination CR. Okay, so I've deployed the replication destination CR on the edge site, and now I'm going to watch what the Scribe operator is doing behind the scenes. So we see that we have a new PVC, scribe-dest-database-destination, and a new volume snapshot, which is this new Scribe snapshot over here. So let's go and verify what the Scribe operator has done on the destination site, and to do that, let's see what is happening in the replication destination CR. So again you see that it is trigger-based, scheduled to run every five minutes, and in the status field you can see that the first sync iteration is complete, and over here you can see that it took the data from the object store and created a snapshot out of it, and that snapshot image name is preserved in the latestImage field of the replication destination. So now, to sync your database application on the edge site, all we need to do is ensure that the database application restores its PVC using this snapshot name. Let's do that and see if the database that we created, called synced, is present on the destination site or not. So I will extract the latest point-in-time snapshot image, which is this, and I am going to create a PVC out of it, which the edge database application will point to. So I've taken this snapshot name, substituted it in, and now I'm going to create the PVC from it. So now let's just go and deploy the MySQL database application on the edge site. As you can see, it has deployed; it is creating the container, so let's wait for the MySQL pod to be up and running. Okay, as you can see, the MySQL pod is up and running. Next I'm going to verify if the synced database is present in this MySQL application or not. Great, so we see that the synced database that we created on the primary site has been replicated to the edge site as well. So that's all for the Rclone-based replication demo. Back over to you. Okay, so back to me. So we saw the Rclone-based replication that Scribe uses, and we think that this has potential use cases in edge scenarios. What we did in this demo is a replication of a MySQL-based application from one namespace to a different namespace, and the Scribe operator uses intermediate storage like an S3 object store. The primary site pushes the data to the intermediate storage, while the edge sites pull the data from the intermediate storage.
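And here is a hedged sketch of the matching pieces on the edge side: a ReplicationDestination that pulls from the same bucket, plus the PVC restored from the snapshot named in status.latestImage, which is the same restore pattern used at the end of the first demo. Again, the field names follow the docs of the time and every name below is a placeholder:

```yaml
# Hedged sketch: the edge-side ReplicationDestination pulling from S3.
apiVersion: scribe.backube/v1alpha1
kind: ReplicationDestination
metadata:
  name: database-destination     # placeholder CR name
  namespace: dest
spec:
  trigger:
    schedule: "*/5 * * * *"      # pull every five minutes
  rclone:
    rcloneConfig: rclone-secret          # secret holding rclone.conf credentials
    rcloneConfigSection: aws-s3-bucket   # placeholder config section name
    rcloneDestPath: scribe-demo-bucket   # placeholder bucket/path
    accessModes: ["ReadWriteOnce"]
    capacity: 10Gi
    copyMethod: Snapshot         # snapshot name lands in status.latestImage
---
# Hedged sketch: recreate the application's PVC from the latest snapshot
# reported in the ReplicationDestination's status.latestImage field.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pv-claim
  namespace: dest
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi
  dataSource:
    name: scribe-dest-database-destination-snapshot   # placeholder snapshot name
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
```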
So far I have been talking about how this wide fan-out replication has potential for edge scenarios, but I didn't actually show you how you can move data between different clusters, did I? All I showed you is how to move from one namespace to another namespace. So to prove my claim, we have a third demo coming, and Ryan will show you how you can integrate Scribe with Red Hat Advanced Cluster Management to easily scale applications. Over to you, Ryan. All right, so as Parul said, this last demonstration is going to kind of glue all of the pieces together. John and Parul both talked about and demonstrated how Scribe creates a Kubernetes-centric way of managing replication. Pretty much the easiest way to say it is there are YAML files to control your replication. So the really cool part about that, and John mentioned it early on, is that we can use GitOps tooling such as Red Hat Advanced Cluster Management or Argo CD to manage our application placement, and then with the addition of Scribe we can handle our replication placement. So with both of those combined we can actually just scale out as we see fit. You can bring a cluster up and be in business within 30-40 minutes. So for the final demonstration I'll be showing scaling out clusters with Red Hat Advanced Cluster Management. Red Hat Advanced Cluster Management will do the applications and Scribe will do our data. So as you see from this page here, we'll start out with a primary cluster, which is the RHACM hub. It's labeled within RHACM as local-cluster, and this will be running OCS as a storage class. Then we're going to actually scale out to US West 2, where we're going to use the gp2 CSI storage class. And then from there we're going to bring in a bare metal cluster, and we're going to use hostpath storage for our DokuWiki application. And the whole key takeaway of this architecture, and even this demonstration, is that regardless of the storage class underneath, Scribe is going to manage the data, and through the use of GitOps tooling we're going to effortlessly scale out. So revisiting Parul's slide from earlier, you'll see our three clusters. We have our local cluster, we have our AWS cluster, and we have our bare metal cluster. Our application will be created and updated here. And then any changes to that application will get a snapshot, Scribe will move the data to a bucket, and then we'll actually be able to update both clusters at the same time. Or even if one of those clusters is on a boat or on a plane, whenever that cluster comes back into connectivity with ACM it will be replicated. The new data will be there, any application updates will happen. So as you see, combining both of these technologies is just a huge strength, because you're no longer having to go to each cluster and kind of poke it. So like I said, at our primary site, any changes that we make are going to be sent to these other clusters. And with that, Diane, I think I'm ready for the video. So jumping into the RHACM console, we will take a look at our clusters. Currently, there is only our source cluster that we showed earlier, which by default is named local-cluster. RHACM handles application placement based on labels. Shown here, you will see storage=ocs and site=headquarters, as well as various auto-generated labels. The storage label and the site label are used to determine what storage class to use and whether the location is the replication source or destination.
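As a rough illustration of how those labels can drive placement, here is a hedged sketch using RHACM's ManagedCluster and PlacementRule objects. The API groups are the ones RHACM used around this time, and the names, namespaces, and label values are placeholders rather than what the demo actually applied:

```yaml
# Hedged sketch: labels on an imported cluster indicate its site role
# and which storage class its Scribe pieces should use.
apiVersion: cluster.open-cluster-management.io/v1
kind: ManagedCluster
metadata:
  name: metal-cluster            # placeholder cluster name
  labels:
    site: remote
    storage: hostpath
spec:
  hubAcceptsClient: true
---
# Hedged sketch: a PlacementRule that selects every remote site, so the
# DokuWiki app and its ReplicationDestination land on those clusters.
apiVersion: apps.open-cluster-management.io/v1
kind: PlacementRule
metadata:
  name: dokuwiki-remote-sites    # placeholder name
  namespace: dokuwiki-app        # placeholder namespace
spec:
  clusterSelector:
    matchLabels:
      site: remote
```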
To save some time, the Scribe components, the storage class modifications, our replication source and destination objects, which Parul and John showed earlier, and our DokuWiki site have already been defined within RHACM. Here's our DokuWiki page with a simple hello message. Now it's time to bring up the remote sites. We will create an AWS cluster and specify the labels site=remote and storage=gp2. This will automatically configure storage class changes, place a replication destination object, and deploy our DokuWiki application. As you can see, the DokuWiki site has been deployed on our new cluster automatically, thanks to RHACM and Scribe, with the same hello message that matches our primary deployment. Now we will import a metal cluster. This cluster was too small to deploy OpenShift Container Storage, so we will use hostpath as our storage class. We will use the labels storage=hostpath and site=remote. When the metal cluster becomes ready within RHACM, our DokuWiki application will deploy. We will now update our DokuWiki page. The changes will be synchronized to the remote clusters, and when we are ready to update our DokuWiki site, we will update our PVC definition and deployment within our Git repository. RHACM will then reconcile these changes, and as you see, our remote clusters show the new message. First, we will see the changes sent to the metal cluster, and then the AWS cluster. That was a very short video. Was it supposed to be that short, Ryan? Yeah? Okay, good. There we go. With this scenario, we showed that... So actually, that's great that you asked that. I mean, if we can scale out clusters when it's that simplistic to do, that shows the strength of not only RHACM, but Scribe. So that is perfect. The parts that we did definitely cut out of that: if anybody has spun up an OpenShift cluster before, you know how long it takes to spin up a cluster. So we did cut those parts, but it definitely shows that by implementing a mature process, adding a GitOps tool like Argo CD or RHACM, and then adding Scribe, you could just have this beautiful scenario. And if you're talking about, like, a factory, you could just spin up new factory locations as you see fit. If you're a restaurant, you can pop up new locations faster; you can imagine the possibilities. So the ability to scale, and not have to figure out a new way to get your data there, is exactly what we wanted to share today. Yeah. Well, even with the video editing, it was still incredibly quick. That's the key. That's a good thing. You know, keeping it simple definitely is going to make everybody's lives easier. All right. Well, are we ready for a little Q&A and conversation about this now, John and Parul? Sounds good. So, I'm going to unmute a few folks here, and all of you can join in as well. This seems to be a very early-stage project. And someone is sharing their screen with me still, and I'm getting their Slack channel. So that's all right, it happens all the time. It seems to me a very new project. When you guys approached me to do this talk... how long has Scribe been around? So I guess we only really got started back around, what, maybe October or something of last year. It's something that we'd wanted to get started for a while, but just in terms of scheduling and that kind of thing. So it is still really, really new, but we're excited about the potential.
No, it seems huge. And lately, no surprise, we've been having all these edge conversations. So when you walked through, at the beginning of the talk, this data distribution problem: solving that problem for the edge and IoT devices that are out there is huge, and it seems like it's a big part of what we're going to need to move forward successfully in the edge space anyway. So that part is really amazing and cool, and I can see lots of applications for it. Are you working with people who are deploying things on the edge? Is that one of the major reasons for building Scribe, or was it just the data replication that drove you to start it? I think John can answer what the motivation was, but while we were developing this project, it was a time when we were getting a lot of emails and people were talking about edge clusters and single-node OpenShift, and we kind of thought, how can we have a solution that is storage independent? Because with Scribe you don't have to be dependent on the underlying storage system you're using. So let's say one cluster is using gp2 and someone else is using hostpath; you can use Scribe to replicate data irrespective of the underlying storage. So as we started to build Scribe, we thought, oh, this is a really cool case that can be applied to edge scenarios as well, and that's how we started investing more time in it. So yeah, everybody's talking edge these days, and that's sort of the new hotness right now, I think, in my little sphere of the world, so it's amazing to see this done and I totally appreciate it. So the code is in GitHub, correct? Yes. I saw a URL go up there. How are you, besides things like this OpenShift Commons briefing, how are you interacting with the community? Is this something that you're going to try and grow a big community around? Is this just a piece of a bigger project? How do you situate yourself, and how do you want people to interact with you? Anybody that's interested, please come visit the GitHub repo, open issues, start discussions, whatever. That's kind of our primary way right now. We are trying to get Scribe into various forums, give talks, and that sort of thing. But yeah, definitely try it out, check it out on Artifact Hub, and then send us your feedback. Open issues; I'm sure there's a bug or two in there somewhere, but we'll get it. Never a bug. Yeah, it seems there are a couple of SIGs, and maybe if Paul's around, it would be great to get this in front of the CNCF, so I can see a lot of interest coming from that, and maybe this is something we might want to throw into a sandbox in the CNCF in some not-too-distant future, because I think that's a great way to get other people to participate in a project, as well as to get other Kubernetes folks there. So yeah, the Artifact Hub Helm charts, the operator stuff sounds like it's coming soon to a theater near me. So that's good. Is there an operator, a section of the repo itself, where you're working on that in public, or is that still behind some firewall? Yeah, no, it's all there in the Scribe repo. So there's the Kubernetes operator, and that's one container whenever you build it, and then there are the data mover containers, right? So one for rsync, one for Rclone, and then the one that we are working on for restic. Perfect. Have you gotten any outside contributions yet, or is it mostly the team so far? It's just the team so far.
Yeah, it was a new topic to me, Parul. So that's why I was really interested in getting you guys on, because it's like, whoa, where did this come out of? And this is because we've had, in the OKD working group, a number of conversations with people, especially from the Fedora IoT and the CoreOS teams, doing kind of interesting stories, especially around bare metal and edge stuff. So it's definitely something we want to take to the OKD working group and have you show off as well. So we'll share this with them. But I think this has got a very broad reach, so it'd be very interesting to see how other people respond to this. Is there anything out there that competes with this or is similar to this? Any other projects? So the thing is, right, asynchronous data replication, or, I mean, just data replication in general, is something that has traditionally been done as a part of the storage system, right? Because in traditional IT environments, it was all up to the vendor to do that replication. And this was actually kind of one of the reasons why I thought it was important for us to build this operator to be a cross-vendor replication engine, right? Because there's kind of that lock-in of relying on the storage vendor to do the replication, and not all of them support it, and that sort of thing. Whereas Kubernetes is really good about abstracting away the underlying hardware and environment that you're in. And so we thought that it was really important to be able to provide those advanced data management capabilities also in a way that allows you to move across different footprints and environments. So, yeah. The question that I always ask myself is, what overhead does it add to an application? And have you taken a look? I noticed the reference to Prometheus and other monitoring things, but I'm wondering what overhead it adds to my application. Yeah, so I would say directly to your application there really isn't any overhead today, but I mean, it is going to use some resources in your cluster, and I'll be perfectly up front: if your storage system itself supports replication natively, it's going to do that more efficiently than anything that the Scribe operator is going to be able to do. But the data movers that we've chosen, like rsync, do go in and calculate just the changes that have been made in the volume. And so it does try to minimize the amount of traffic that goes over the network, that sort of thing. And so we've tried to make it fairly lightweight in that way. And so we think that for a very broad spectrum of use cases this will be a good solution. Awesome. So, any questions out there in the chat room? I'm not seeing any yet. I'm wondering if anyone's actually deployed Scribe yet in production. Is this still so new that we don't have customers or end users giving you feedback yet? Yeah, it's still pretty new. But we're talking with folks, trying to convince them to give it a shot. I think you've got a really good shot. I can think, right off the top of my head, of about two or three folks that have been talking about this problem with me. So I'm going to definitely hit them up. And we've got one question coming in: could the traffic traveling over the wire be compressed, thus reducing the traffic? Yeah, absolutely. So for the case of the rsync data mover, for example, it's just the rsync protocol over an SSH tunnel between the two sides. And so the SSH connection itself does compression. Rsync is doing the deltas of the files. So yes. Definitely. Great. Good to know.
Not a question, but I just wanted to say, before we get too far afield of this topic, that I think the demos and presentation you gave today would be great to show to CNCF SIG Storage when you feel like you're ready. Yeah, definitely. Yeah. We can set you up with that. That's what I was trying to figure out, which one; definitely Storage, but there are a few others too. I think there's even an edge SIG coming around soon. So there's another question coming in: is it possible to encrypt the data? So the data, whenever it's going over the wire, at least with the rsync protocol, is going over an SSH connection. And in the demo there, we had to copy that secret from one side to the other, and that was basically moving the SSH keys from one side to the other so that both sides could authenticate each other and properly encrypt that traffic. So I hope that was what they were after with that question. Looks like it. We'll give everybody a few more minutes to post any other questions. Any final words from Ryan or Parul or anyone else? Can you throw back up that final slide with the links on it, so people know where to get ahold of y'all and find you? Any limits on the number of edge sites that can pull right now? So one of the... sorry, I'm just trying to put up the slide again. So one of the benefits of using that intermediate site is that it takes the load off of your central cluster, right? Because your central cluster is really only pushing it to one place, which is the S3 bucket in the case of the demo. And so then all of your edge sites can simply hit that S3 bucket and not overload your central site or your central cluster. John, I have a follow-up question. Does that mean that system only works if you have a CSI snapshot? If you're using something like the rsync mover, does it still go to an intermediate location, or does it go directly to the destination cluster? Right. So I guess there are two things to separate there. One, the intermediate site is used for the Rclone mover only, whereas the rsync mover goes directly site to site, right? Because it's just a one-to-one replication relationship. So there's that side of it. Now, in terms of the clones and snapshots with CSI, there is a configuration both on the source side and on the destination side that allows you to basically enable or disable whether you want to do snapshotting. So on the source side, you've actually got three options. You can get your point-in-time copy of the source volume either via clone or via snapshot. The most efficient way to do it is to just directly clone the volume on the source side and hand that to the data mover. In the demos that we did, we were using the EBS CSI driver, which doesn't actually support clone. So we had to use the snapshot mode, where it takes a snapshot and then restores that snapshot as a volume in order to get point-in-time copies. But then there's the third mode, which is basically to just use the live volume and replicate that on a schedule. And so that gets you around requiring CSI snapshots or clones, but you lose the instantaneous view of the volume when you do that. So you can potentially get a little bit of skew in there whenever it's being replicated. And then, over on the destination side, you have your choice of whether to snapshot after each sync iteration or to just leave the volume as is, which you could do if your storage provider on the destination doesn't support snapshotting. Does that help?
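In CR terms, the choices John describes map onto the copyMethod field on each side. Here is a hedged sketch of the source-side options; the exact value names (especially the live-volume option) have changed across releases, and the names below are placeholders, so check the docs for the version you are running:

```yaml
# Hedged sketch: copyMethod controls how Scribe obtains the data it replicates.
apiVersion: scribe.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: copy-method-example      # placeholder name
spec:
  sourcePVC: app-data            # placeholder PVC
  rsync:
    # Clone    - CSI clone of the source volume; most efficient, needs clone support
    # Snapshot - CSI snapshot restored to a temporary PVC; needs snapshot support
    # None     - replicate the live volume directly; no CSI required, but the copy
    #            is not an instantaneous point-in-time view (some skew possible)
    copyMethod: Clone
    address: replication.example.com     # placeholder destination address
    sshKeys: rsync-ssh-keys              # placeholder secret name
```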
Did that answer your question, Sean? I think so. I think I just have to think some more. But thanks, John. Okay. More in the chat: is it reasonable to use for huge volumes? Have you done benchmarks or tests? Right. So we haven't really done a lot of benchmarking of it yet. Obviously, the bigger your volume, the higher your change rate and stuff like that, the more latency you can potentially see in terms of how long it's going to take to replicate your data. In terms of doing large volumes, I would say I'm not all that concerned about just having a big volume in terms of data. It could take a while to get the first copy of that over to your secondary site. But as the replication is ongoing, it's really just sending the deltas, right, what has changed for each iteration. And so I expect it to be much more based on what your overall data change rate is, as opposed to how big your volume itself is. So benchmarking will be interesting when we get to that stage and figure out what it is we should actually be benchmarking, because everyone will have a different scenario. So that's going to be interesting to figure out, what the best thing to benchmark is. All right, so we're at almost the end of the hour. Anyone got any more questions? We'll give them a second. Otherwise, we're going to say thank you, and we're going to have you back in a few iterations. So we'll see what comes out in the next releases. And if folks are interested, please do go to the GitHub repo and reach out to John or Parul or Ryan or anybody on the storage team over here at Red Hat. You can get ahold of them all. And we'll definitely keep you posted, because I know this is something near and dear to a number of folks' hearts, and I'm really pleased to see this solution. So thank you for taking the time today, everybody, and for the wonderful demos. And especially the very, very short ACM/RHACM one. That was impressive, even with the minor edits of launching the clusters. That was great. So thanks very much. And thanks, everybody, for joining us today. And we will keep you all posted on Scribe's progress and see what we can do about getting you in front of SIG Storage and other places to get the word out. So thanks again. Awesome. Thank you, Diane. That was a good presentation. Thank you. Thank you. Bye.