Hello. Our data is important, and we want to make sure it's safe and protected from failures. These could be hardware failures, from disks to hosts, software bugs, and even environmental failures. In today's session, I'm going to go over the different features of OpenShift Container Storage for data resilience. My name is Orit Wasserman. I'm an OpenShift Container Storage architect at Red Hat, and I've been working on and developing storage for quite a while. Some of you may know me from my previous role at Red Hat, where I was a Ceph RGW core developer working on Ceph object storage. We'll go over a bit of what OpenShift Container Storage and Ceph are, discuss our backup and high availability features, and finish with disaster recovery, making sure our data is protected even from a total data center loss.

OpenShift Container Storage, or OCS for short (because it's a very long name), provides persistent data services for OpenShift applications, containerized applications. OpenShift is Red Hat's platform-as-a-service offering based on Kubernetes. We run wherever OpenShift runs, basically across the whole hybrid cloud: from public clouds like AWS and Azure, to on-premise virtualized environments like VMware and OpenStack, to bare-metal installations. We're completely integrated with OpenShift; we even sync our release cycles. We are an operator, and we use the operator framework for packaging, deployment, and management. Our underlying storage is Ceph, and another important part is Rook. Rook is a CNCF graduated project; it allows us to deploy, manage, and administer Ceph in a Kubernetes environment. We also have a Multi-Cloud Gateway providing multi-cloud object storage, based on NooBaa.

Ceph is an open source project; it's my favorite project. It's software-defined storage, which means it's a software-only solution that runs on any commodity hardware: commodity servers (standard x86 servers, POWER, and even mainframe), standard IP networking (TCP/IP or RDMA), and standard disks (hard disks, SSDs, NVMe, and so on). We take all this unreliable commodity hardware and create a very reliable storage cluster. In a single storage cluster, we provide object storage, block storage with RBD, and a distributed file system with CephFS. Ceph is highly available and very resilient; it includes self-healing to automatically protect the data in case of a failure. It's very scalable, to tens of petabytes. And it's very elastic: you can expand the cluster very easily while everything is running, without any effect on your workloads.

Since we're going to talk about OCS and we integrate with Ceph, I'm going to discuss two important Ceph components. There are others, but these are the ones I'll discuss today. First we have the monitor. The monitor is basically our cluster manager: it holds the cluster view and state, and it is the central authority for authentication, placement, and policy. All the other cluster components get the cluster state from the monitors. To be highly available, we have several monitors in a cluster: three, five, or seven. It's always an odd number, so we have a quorum, a majority, in case of a failure. The monitors communicate among themselves with Paxos, the consensus protocol; it's quite famous. Ceph implements its own Paxos. You're probably familiar with Raft, another consensus protocol, whose Go implementation is used in Kubernetes (in etcd).
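To make the quorum idea concrete, here is a minimal sketch of how you could inspect the monitors on a running cluster. In an OCS deployment these commands would typically run from the Rook-Ceph toolbox pod; the namespace and deployment names below are the usual OCS defaults, so treat them as assumptions for your environment.

```sh
# Open a shell in the Rook-Ceph toolbox (usual OCS defaults; adjust as needed).
oc -n openshift-storage rsh deploy/rook-ceph-tools

# Overall cluster health, including how many monitors are in quorum.
ceph -s

# Detailed quorum view: which monitors exist, who is in quorum, who leads.
ceph quorum_status --format json-pretty

# Short monitor roster summary.
ceph mon stat
```

With three monitors, the cluster stays writable as long as two of them can still reach each other; lose two and the cluster blocks rather than risking a split brain.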
Then we have the OSDs, the Object Storage Daemons. We have tens to ten thousand OSDs per cluster, basically one OSD per disk. In some cases, when the disks are really large, you can partition a disk and have an OSD serve a partition. The OSDs provide all the I/O to the clients; the whole data path goes through the OSDs. The OSDs communicate among themselves with a Ceph-specific peer-to-peer protocol. They handle the data replication, making sure we have enough copies of the data to be resilient to failures. They handle failures and move the data in case of an OSD failure, and when you expand the cluster, they rebalance the data to make sure we use all the OSDs in the system.

So, first of all, we want to protect the data from accidents. An accident can happen when you delete your data by mistake (which happens to me), when the data gets corrupted, or when the hardware causes corruption. Backup is infrequent: we usually do it once a day, like nightlies, maybe several times a week, or even just once a week. Restore is manual, meaning that when something bad happens, I may have to ask the admin to restore my data; or maybe I'll have an API, but it will take time. The data has to be stored remotely, so it won't be affected by what's going on in the system. From my experience, it's very important to check your backups and make sure they're functional. I still remember, early in my career, we came in on Sunday (in Israel we start the work week on Sunday) and found out we'd had a power loss during the weekend and lost all our storage. And then, to make things much more complicated, we found out the backup hadn't been running for the last two months. We could not restore all the data. So please, check your backups.

OCS backup is based on incremental snapshots: we take a point-in-time copy of your data and only store what changed since the previous snapshot, which reduces our space usage. These snapshots are copied to an external object storage; it could be public cloud object storage, or it could be another OCS. This is also good for archiving. We have a new community operator called OADP, OpenShift APIs for Data Protection. This operator provides APIs for backup solutions to integrate with OCS: OCS provides the storage functions, and the backup solution provides the backup scheduling and management policies. We use the Velero community operator, so thank you, Velero team. Integrations have already been done with IBM Spectrum Protect, Trilio, Kasten, and many more common backup solutions.
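To give a feel for the OADP/Velero integration, here is a hedged sketch of a Velero Backup resource that backs up one namespace and snapshots its volumes. The namespace names, TTL, and storage location are illustrative assumptions, not something from the talk; OCS supplies the snapshot capability underneath.

```sh
# Hypothetical Velero Backup for a "my-app" namespace (illustrative names;
# the OADP/Velero install namespace varies by version).
cat <<'EOF' | oc apply -f -
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: my-app-nightly
  namespace: oadp-operator
spec:
  includedNamespaces:
    - my-app
  snapshotVolumes: true      # snapshot the app's PVs via the storage layer
  storageLocation: default   # a pre-configured object storage location
  ttl: 720h0m0s              # keep the backup for 30 days
EOF
```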
So backup is great, but it's very infrequent and very manual. We want a way to protect the data and make sure it stays available in case of failure. Here come the high availability features of OCS. When we discuss high availability, we also discuss the failure domain: which section of the infrastructure we can lose and still keep functioning. In storage systems, the lowest unit is usually the disk. Here Ceph's data replication saves us from a disk failure, by making sure that each replica of our data is stored on a separate disk. So in case we lose a disk, if you use, for example, replica-3, you still have two copies of the data; we are still resilient and can serve and update the data. The next failure domain is the host: we want to keep functioning when we lose a host, a server. This is the default failure domain for Ceph: Ceph makes sure not only that each replica is on a different disk, but that it's on a different disk in a different host. Again, when we lose a host, we still have two copies on two other hosts and the data is safe. We can also provide a rack failure domain. In this case, you need to have at least three racks, and then we make sure each replica is on a different host in a different rack. Again, we are safe.

But sometimes we want even higher protection. An availability zone is a fault-isolated section of a data center. In AWS, for example, you have at least three availability zones in a region. Each availability zone has its own power source, so in case of a power failure, only one availability zone is affected; the same goes for networking and other infrastructure. In the next few slides, I'm going to discuss OCS availability zone protection, and in the disaster recovery section, I'll discuss what we do at the even higher failure domain of a whole data center.

The classical setup is when we have three availability zones. This is the default OpenShift and OCS installation in a public cloud, where we always have three availability zones, and our failure domain here is the availability zone. OpenShift makes sure that each of its master nodes is in a different zone, and OCS makes sure that each Ceph monitor is deployed in a different zone. So in case of a zone failure, OpenShift still has two masters and can function as usual, and Ceph has two monitors and continues to function as usual. OpenShift can additionally provide your application high availability and make sure your pods are moved to a different availability zone. But if your application uses persistent data, you need to make sure that data is available in those zones. EBS, for example, is limited to a single availability zone, meaning that if you lose a zone and your application moves to a different zone, it can no longer use the EBS volume. OCS, on the other hand, provides multi-zone access: we make sure that each replica of your data is in a different zone. So if you lose a zone, you still have two copies of the data that you can access from any surviving zone, and you are safe from the failure. This is done with Ceph's synchronous replication, so there's no data loss, and everything is automatic, both the failover and the recovery.

But on premises we don't always have three availability zones; in many cases, we only have two. New in OCS 4.7 and Ceph Octopus, we have the arbiter (stretch) cluster. We have a new kind of monitor, the arbiter. Its main job is to allow us to keep a quorum, a majority, in case of a zone failure. Unlike the regular monitors, it doesn't communicate with the OSDs. If we look at our setup, we have two availability zones, which we call data zones since the OSDs run only in them, and the arbiter runs somewhere else. In a pure Ceph deployment, the arbiter can even run in the public cloud; with OpenShift, we are still limited by OpenShift's network latency requirements. Each OSD in its data zone communicates only with the monitors running in that zone, so it won't be affected by the other zone's failure. Because of that, we decided to use five monitors when deploying in arbiter mode, unlike the usual three: when we lose a zone, we want to be left with more than a single monitor, so the cluster stays highly available. With four regular monitors and the arbiter, after losing a zone we're left with two functioning monitors plus the arbiter to hold the quorum. Replica-3 won't work here, because it's odd and doesn't divide evenly between two zones. If we had chosen replica-2, then when we lose a zone we'd be left with a single copy of the data, and it's very risky to allow updates when you only have a single copy, because in a large cluster we do expect additional faults. So we enforce replica-4: we keep two copies in each data zone, and when you lose a zone, you still have two copies of the data and the data is safe.
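In OCS, arbiter mode is requested in the StorageCluster resource. Here is a minimal sketch, assuming the OCS 4.7-era field names (spec.arbiter.enable and spec.nodeTopologies.arbiterLocation) and a zone label of zone-c for the arbiter; verify the exact fields against your version's documentation.

```sh
# Fragment to merge into a full StorageCluster manifest before creating it
# (field names assumed from OCS 4.7-era docs; verify for your version).
cat <<'EOF' > storagecluster-arbiter-fragment.yaml
apiVersion: ocs.openshift.io/v1
kind: StorageCluster
metadata:
  name: ocs-storagecluster
  namespace: openshift-storage
spec:
  arbiter:
    enable: true              # two data zones plus an arbiter monitor
  nodeTopologies:
    arbiterLocation: zone-c   # topology label of the zone hosting the arbiter
  # storageDeviceSets and the rest of the usual spec omitted for brevity
EOF
```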
So we are protected from a zone failure with the arbiter, but what should we do when we lose our whole data center? For that, we have disaster recovery. Disaster recovery provides us a failure domain of a data center, for when we lose a full data center, a site, or a region in AWS. Our recovery site needs to be remote so it isn't affected too, for example by a natural disaster, a flood, or maybe a statewide power outage like the one that happened in Texas this week. So it has to be remote. The fact that it's remote means we have high network latency, so here we cannot use synchronous replication, and we use asynchronous replication instead.

We are using RBD async snapshot mirroring, which is new in Ceph Octopus. The previous mirroring for RBD was journal-based and was only available when using librbd. In OCS we use the kernel RBD client, krbd, so we can only use the snapshot mirroring. Another benefit is that it's snapshot-based and not journal-based, so it doesn't add load on the clusters. We have a new rbd-mirror daemon in each cluster. Since it's snapshot-based, the primary cluster, cluster A, takes a snapshot at the interval you configure; then the secondary cluster, cluster B, pulls the snapshot and brings its copy of the data up to date. We support failover: when cluster A is down for some reason, cluster B can become our primary and serve the data. When cluster A comes back, it needs to resync whatever data was updated while it was down. Then comes failback: once cluster A is completely synced, it can serve as primary again. You can enable mirroring per volume, and you can also set the snapshot interval per volume.

So let's look at how this comes together with OCS. We have two data centers connected with a slow network, a WAN. In the first data center, the first site, the OpenShift cluster is active: your applications are running, they have their persistent volumes, and they are writing to them. Then we have the other data center, with a different OpenShift cluster running OCS. This is the standby cluster: the applications are not running here, and the persistent volumes are replicated to this cluster but are not active; you cannot write to them. In case of disaster, we promote the second cluster to active, and only then can you use the PVs in that cluster. The PVs are replicated using the Ceph RBD snapshot mirroring, and you can set the snapshot interval depending on your network bandwidth and your rate of data change. This is tech preview in OCS 4.7.
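Under the hood, these are plain Ceph operations. The sketch below shows roughly what enabling snapshot-based mirroring and a per-image schedule looks like at the rbd CLI level, with placeholder pool and image names; in OCS this is driven for you, and the exact syntax can vary between Ceph releases.

```sh
# On the primary cluster: enable per-image mirroring on the pool.
rbd mirror pool enable mypool image

# Enable snapshot-based mirroring for one image (one PV's backing image).
rbd mirror image enable mypool/myimage snapshot

# Take mirror snapshots automatically every 30 minutes for this image.
rbd mirror snapshot schedule add --pool mypool --image myimage 30m

# Check the replication status of the image.
rbd mirror image status mypool/myimage

# Failover: on the secondary cluster, force-promote when the primary is down.
rbd mirror image promote --force mypool/myimage
```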
What else is coming for OCS DR? First of all, in OCS we believe in ease of use, in making the user's life easier, and the current implementation is very CLI-driven, so we are going to add a user interface with much more automation: many operations will happen automatically and won't require manual intervention. The current granularity is per PV; we want to provide higher granularity, at the OpenShift namespace or workload level. And as you noticed, I discussed only RBD, only the block storage. We also want CephFS support; that's coming in the Ceph Pacific release and will then come to OCS, probably in version 4.9. We are working on improving performance, by looking at compressing the network traffic, and on increasing our scalability, the number of concurrent PVs we can replicate. And in DR, it's not enough to only make sure the data is replicated; the application configuration also needs to be handled. This won't be handled by OCS but by a different operator: we're working on a new operator called Ramen. Hopefully at the next DevConf I can present Ramen and how it works with the whole stack.

So today we discussed the different data resilience methods OCS has. We started with backup and OADP. Then we moved to high availability, with three availability zones, and with two availability zones plus the arbiter. And you can protect your data from a data center failure using our DR feature with RBD snapshot mirroring. Lots more is coming in the future. Thank you, I'll be happy to answer any questions.

Alright, we have a question from Sean here. Sean asks: will destination PVs replicated during RBD mirroring get the same name, so we'll be able to use them seamlessly in destination deployments that are deployed using Argo CD or ACM? The answer is yes: we are going to keep the PV names, so you don't need to change the application configuration. And we are actually doing an integration with ACM, that's Ramen, and maybe also a GitOps model. We haven't finalized the design for Ramen yet, so we're still working on this.

Thanks for answering that question. Another question: Merix asks, does the snapshot mirroring work for both file and block storage the same way? In the current implementation we only have block, and the file work is still in progress. But yes, it's snapshot-based, so it's basically similar, just for the file system.

Sure, I think we have a couple of minutes left, so we can take one or two more questions. Okay. Another question is from Tom. He asks: what's the difference between NooBaa-provided and Rook-provided buckets? OCS provides multiple storage classes to choose from, and it's confusing to him. So NooBaa provides object storage, and so does Rook. For object storage we don't really use storage classes in the CSI sense; it's not CSI. You can use OBCs, Object Bucket Claims, which are a similar mechanism, or you can use the regular S3 protocol in your application. For persistent volumes you use CSI, and then you have storage classes: ReadWriteOnce storage classes are usually backed by RBD, either as a raw block device or as a local file system, depending on the storage class; or it could be CephFS, and CephFS storage classes are usually ReadWriteMany, since it's a shared file system and you want to share the data between the different PVs. I hope that answers the question.
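To illustrate that answer, here is roughly what the two PVC flavors look like, using the default OCS storage class names (ocs-storagecluster-ceph-rbd for RBD and ocs-storagecluster-cephfs for CephFS); if your installation renamed them, adjust accordingly.

```sh
cat <<'EOF' | oc apply -f -
# ReadWriteOnce volume backed by RBD: the usual choice for a single-pod app.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-db-data
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi
  storageClassName: ocs-storagecluster-ceph-rbd
---
# ReadWriteMany volume backed by CephFS: shared between multiple pods.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-assets
spec:
  accessModes: ["ReadWriteMany"]
  resources:
    requests:
      storage: 50Gi
  storageClassName: ocs-storagecluster-cephfs
EOF
```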
We have one more, from Tomas. He asks: does OCS support RWX volumes, because AWS volumes do not, and as far as he understands only NFS supports RWX. So yes: as I just mentioned, CephFS volumes are ReadWriteMany. And for OpenShift Virtualization, KubeVirt, we actually also support RBD PVs that are ReadWriteMany, but only for KubeVirt, because for block storage it's generally not good to do sharing.

I see. And I didn't fully answer the earlier question about the storage classes. Rook buckets and NooBaa buckets are a bit different, because NooBaa is a multi-cloud gateway: you can have a bucket that is on AWS and a bucket that is on Azure, and access both through NooBaa. Rook usually provides buckets that are backed by RGW, the Ceph object storage, so it's usually only those buckets. For example, if you install OCS on bare metal, you can have buckets backed by RGW, and you can access them either directly through RGW or through NooBaa, and then get the multi-cloud options, sharing bare metal with AWS.
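To close the loop on buckets, here is a hedged sketch of Object Bucket Claims against the two providers discussed: the NooBaa multi-cloud gateway and direct Ceph RGW. The storage class names are the usual OCS defaults (openshift-storage.noobaa.io and, on platforms with RGW such as bare metal, ocs-storagecluster-ceph-rgw); verify them in your cluster. Each claim yields a ConfigMap and Secret carrying the S3 endpoint and credentials for the application.

```sh
cat <<'EOF' | oc apply -f -
# Bucket via the NooBaa multi-cloud gateway.
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: my-mcg-bucket
spec:
  generateBucketName: my-mcg-bucket
  storageClassName: openshift-storage.noobaa.io
---
# Bucket backed directly by Ceph RGW (e.g. on a bare-metal install).
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: my-rgw-bucket
spec:
  generateBucketName: my-rgw-bucket
  storageClassName: ocs-storagecluster-ceph-rgw
EOF
```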