Hi, I'm Annette Clute with Red Hat. Today I'm going to show you how to do automated disaster recovery failover and failback. The diagram shows what I am going to do. I have two OpenShift clusters, called managed clusters. One of them is the primary, where the application starts, and the second is the secondary cluster. In the middle I have the hub cluster, which is where Red Hat Advanced Cluster Management (ACM) is installed, and all three of them run components of OpenShift disaster recovery, supplied by OpenShift Data Foundation (ODF), alongside the OpenShift clusters themselves.

We start with an administrator initiating a failover on a per-application basis. That populates metadata for the persistent volumes into the secondary cluster. We then demote the storage on the primary side and promote it on the secondary side. Then ACM can hydrate, or redeploy, the application on the secondary cluster. At that point we redirect the IOs to the secondary cluster and stop or delete the application on the primary cluster.

If we look at what's in ACM on the hub cluster, I have cluster one and cluster two. These are AWS OpenShift clusters: one is in us-west-2 (Oregon) and one is in us-east-2 (Ohio), so they're basically across the country in the US. If I look at how they're connected, I can go to cluster sets. I have one cluster set created, and under it I'll go to Submariner. We're using the Submariner add-on available in ACM, and with it we've connected the two clusters across the AWS regions; we can see that the connection is healthy. Submariner connects the cluster network and the service network of the two OpenShift clusters.

Now let's look at what is on the managed cluster in terms of operators. This is cluster one: we have the operators OpenShift Data Foundation, the OpenShift DR cluster operator, and Submariner.
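The cluster-set and Submariner wiring described above is driven by a couple of hub-side custom resources. This is a hedged sketch, not taken from the demo: the names are illustrative, and the API versions vary by ACM release.

```yaml
# Hypothetical hub-side resources for grouping the two managed clusters.
apiVersion: cluster.open-cluster-management.io/v1beta2   # version varies by ACM release
kind: ManagedClusterSet
metadata:
  name: dr-clusterset          # illustrative name
---
# The Submariner add-on is then enabled per managed cluster, in that
# cluster's namespace on the hub:
apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ManagedClusterAddOn
metadata:
  name: submariner
  namespace: cluster1          # hub-side namespace for the managed cluster
spec:
  installNamespace: submariner-operator
```

A matching `ManagedClusterAddOn` would be created in the `cluster2` namespace so that both clusters join the tunnel.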
I've installed these in order to set up the disaster recovery capability. The second cluster is exactly the same: OpenShift Data Foundation, the DR cluster operator, and Submariner. The OpenShift DR cluster operator is available via OpenShift Data Foundation with the Advanced subscription, and upstream it's called Ramen.

The third cluster is the hub. Let me switch to all projects. Looking at the operators here, I have Advanced Cluster Management; the ODF Multicluster Orchestrator, which is used to set up the mirroring between the ODF storage clusters; and the OpenShift DR hub operator, whose custom resources we're going to use. That operator, again, comes from the upstream project Ramen.

Now let's look at the application. I've already installed it. It's called RBD loop, and all it really does is write a 4 KB block every second to the storage so that we have IOs running continually. If I look at its topology, we can see that it's currently on cluster one, with one pod and one persistent volume.

Let's look at it in the terminal now. On the top I'm logged into OpenShift on cluster one; on the bottom, I'm logged into OpenShift on cluster two. We can see the application running on cluster one, and if we look down at cluster two, it's not running; there's nothing in the namespace.

What we want to do now is fail over the application, so let's go back to the console. It's very easy to fail over. We'll go to Installed Operators and open the OpenShift DR hub operator. We need to be in the right namespace, so I'll go to the rbd namespace, where I've installed the RBD loop application, choose this operator, and then go to DRPlacementControl. DRPlacementControl is a new custom resource.
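The RBD loop workload described above can be approximated with a tiny script: write one 4 KiB block per second so the volume always has fresh data for the mirroring to pick up. This is a minimal stand-in, not the demo's actual app; the output path and block count are illustrative, and the real app loops forever against its persistent volume.

```shell
# Stand-in for the demo's "RBD loop" app: append one 4 KiB block per second.
# OUT and BLOCKS are illustrative; the real app writes to a PVC indefinitely.
OUT=${OUT:-/tmp/rbd-loop.dat}
BLOCKS=${BLOCKS:-3}
: > "$OUT"                                       # start from an empty file
for i in $(seq 1 "$BLOCKS"); do
  dd if=/dev/zero bs=4096 count=1 2>/dev/null >> "$OUT"
  sleep 1
done
wc -c "$OUT"                                     # 3 blocks -> 12288 bytes
```

In the demo this continuous write stream is what makes the recovery point objective visible on the Grafana dashboard: the gap in IOs after failover shows how much data was lost.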
It is how, on a per-application level, you can fail over and fail back. I'm going to choose it and edit the YAML. But first, let me quickly show you the Grafana dashboard that goes with this application. Currently we can see on the left-hand side, which is cluster one, that we have IOs running, and it has a green bar, meaning that's the active cluster right now. When we fail over, we'll see exactly how long it takes to go from the primary cluster to the failover cluster, and how much data we lost. That is the recovery point objective on a failover: how much data are we going to lose?

Let's go back to our hub cluster now. All I'm going to do to fail over is change the action here to Failover. The failover cluster is already set; it's cluster two. So I'll go ahead and save this. As soon as I save it, it starts to take down the application on the primary cluster, cluster one, and move the application to cluster two. We can watch that in the terminal, and we can also watch it on our dashboard.

In the terminal, let's just start a watch. And we can see it has already started running. If we go back to the dashboard, we see it just switched over, and the gap was only 39 seconds, so we must have caught it right after a snapshot. The way the mirroring works is that, at the interval I've configured, every five minutes, the data from cluster one is replicated on a per-volume level over to the second cluster, whichever is the alternate cluster where the application is not active. I can tell by the timing that it had probably just taken a snapshot when I did the failover. So in this case we lost very little data; we could lose up to just under five minutes' worth, since it's asynchronous replication and we're replicating every five minutes.

Now that we've failed over, let's go back and look at our application. As I moved here, it updated to show that the application is now on cluster two.
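The edit made above corresponds to setting one field on the DRPlacementControl custom resource. The sketch below follows the upstream Ramen API; the resource name, DRPolicy name, and label selector are illustrative assumptions, not values shown in the demo.

```yaml
# Hypothetical DRPlacementControl for the RBD loop app; names and labels
# are illustrative. Setting spec.action to Failover triggers the failover.
apiVersion: ramendr.openshift.io/v1alpha1
kind: DRPlacementControl
metadata:
  name: rbdloop-drpc           # illustrative name
  namespace: rbd               # the app's namespace, as in the demo
spec:
  preferredCluster: cluster1   # where the app normally runs
  failoverCluster: cluster2    # where it moves on failover
  action: Failover             # the field edited in the console
  drPolicyRef:
    name: dr-policy            # hypothetical DRPolicy with a 5m sync interval
  pvcSelector:
    matchLabels:
      appname: rbdloop         # hypothetical label on the app's PVCs
```

The referenced DRPolicy is where the five-minute replication interval mentioned above would be configured, which bounds the worst-case data loss.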
So we have now successfully failed over, and as this shows, we are taking IOs. The main thing is that we lost only a very little bit of data.

Now what we want to do is fail back, and failing back is about as easy. Let's go back to the hub cluster. I'm going to reload, because there have been some changes in the configuration. To fail back, I just have to set the action to Relocate, and I'll go ahead and save that. Again, as soon as I save it, it starts to move the application from cluster two back to cluster one.

Let's go to our terminal. We can see on the bottom that it's terminating, so let's start a watch on the top, because we're moving it back to cluster one now. We'll just wait until we see a change in the top terminal. We can see here that it's still running on the failover cluster. If we go back to the dashboard, it's now moving to cluster one. Back in our terminal, we don't have any resources yet in cluster one... and up there, they're starting to come in now: the container is creating, and the PVC was created using the snapshot. Now if we go back to the dashboard, we see that we've just failed back.

Just to finish up, we can see that the application is again running on cluster one. So we've done all the steps here, both failover and failback, for automated disaster recovery. Thank you.
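The failback step above is the same DRPlacementControl edit with the action flipped. Assuming the hypothetical resource names used earlier, the change amounts to:

```yaml
# Same DRPlacementControl as before; only the action field changes.
spec:
  action: Relocate   # moves the app back to preferredCluster (cluster1)
```

Relocate, unlike Failover, is an orderly move: the workload is stopped on the current cluster before storage is promoted back on the preferred one.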