Hello everyone. Today, Xiang Qian and I will give an introduction and update on the Kubernetes data protection working group. My name is Xin Yang. I work at VMware in the cloud storage team. I'm a co-chair of CNCF TAG Storage and a co-chair of Kubernetes SIG Storage, and I also co-lead the data protection working group with Xiang Qian. Hi everyone. Thanks for coming. My name is Xiang Qian. I work for Google, and I currently co-lead the data protection working group with Xin. Next slide, please. Today, Xin and I will go through these topics; this is the agenda. We will go through what motivated us to establish this working group, who the parties involved in the data protection working group are, and what we think data protection means in the Kubernetes context. We will then go through the existing building blocks, or modules, that allow application owners or cluster owners to protect their stateful workloads, and what exactly the gaps are that this working group is looking to fill or propose solutions for, to help provide data protection in the Kubernetes context. At the very end, Xin will go through how to get involved in this working group. Next, please. Many of you may be very well aware that the day-one operations for stateful workloads are actually pretty well supported in the Kubernetes context. There are the persistent volume operations, including provisioning a volume and attaching a volume to a specific node to hold your workload's data, and the app orchestration APIs, the workload APIs like Deployments and StatefulSets, support using those persistent volumes in an application context. This has been there for a couple of years and is very stable. With that, the direct impact is that more and more stateful workloads, like databases, message queues, et cetera, are moving into the Kubernetes environment to take advantage of those constructs, as well as Kubernetes features like scalability. However, the day-two maintenance operations, for example how do I protect my stateful workload within the Kubernetes context, are still very limited at this moment, for various reasons. Kubernetes users today use GitOps to protect their stateful workloads' config files, but the story around how to protect a stateful workload as a whole, including the configs as well as the volume data, is still yet to be worked out and supported by the community. Next slide, please. With all this reality, there are a lot of companies supporting this initiative to provide data protection in the Kubernetes context. The companies listed here actively attend the data protection working group meetings on a bi-weekly basis. If I missed any company here, please let me know and we can add your name. Next slide, please. Moving on. One of the charters for this data protection working group is to define what exactly it means to protect a stateful workload in Kubernetes. The main purpose of this group is to propose and design modules to ensure an application like a database, including its config and data, can be restored to a previously preserved state, meaning a backup, in case any disaster happens to that application, such as losing the cluster or wrongly deleting the whole namespace where the application is running, et cetera. In the Kubernetes context, this basically involves two pieces. One piece is the API resources that describe the application.
The other piece, which is more critical, is the persistent volume data the application writes to disk. This is a very complicated and layered problem. It includes backup and recovery at different levels: the persistent volume level, the application level, or even up to the cluster level, where, when the cluster is actually gone, you want to restore the whole thing into a new cluster. Next slide, please. As I mentioned previously, part of the working group's charter is to define a list of Kubernetes-native constructs to enable those workflows, the day-two operations to protect your stateful workload. That includes how you provide constructs to protect persistent volumes: we have volume snapshot, volume backup, and, of course, volume restore when restoration is needed (a minimal example of the existing volume snapshot construct is shown below). The application level is more about what composes an application, or what the resources are, the StatefulSet, the Secrets, and the Services that the application exposes, et cetera. Another tricky piece is that a lot of applications want to achieve application-consistent snapshots or backups. That means the application needs to be able to quiesce, basically freeze itself from writing, before the backup is taken, and unquiesce itself after the backup is taken. Those are the quiesce and unquiesce hooks I was talking about. Lastly, what does the orchestration look like at the namespace level and the cluster level? Next slide, please. Having talked about all the definitions, let's drill down a little bit. What are the common use cases? There are three listed here, and there are definitely more, but these three are very typical and cover different layers. The most typical one is that you, as a MySQL database owner, want to protect your MySQL database to cover common failure scenarios, for example a bad rollout: you have a bug in your system that corrupted your data, and you want to roll back to the previous state. The other scenario is more about migration or geo expansion, meaning you have a database running in Europe, and now you want to expand your business into Asia; in this case, you want to migrate the data from one area to another. Those are the common use cases for a SQL database owner or application owner. If you move up a little bit to the cluster owner's role, typically a namespace owner or a cluster administrator, they want to enforce protection over the namespaces they own in the cluster. That does not necessarily mean they understand every single workload within the namespace, but they have the desire to protect the cluster as a whole. Should disaster happen, the cluster owner can seamlessly restore the cluster as a single unit, or a namespace as a single unit, from an existing backup. Moving further up, big organizations, especially enterprise businesses, typically have a data protection administrator who wants to enforce organizational RTO and RPO policies over every single workload running on their Kubernetes clusters. They typically do not understand the details of the workloads running on those Kubernetes clusters. The only thing they want to enforce is: you have this application running on a production Kubernetes cluster, and I want it to follow a certain policy so that it is backed up on a certain schedule and can be restored to any given point in time, in order to satisfy the company's compliance requirements, et cetera. All these common use cases, and there are a lot more, of course, drive the working group to deliver the desired components to achieve those goals. Next slide, please.
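To make that volume-level construct concrete, here is roughly what the existing VolumeSnapshot API (the snapshot.storage.k8s.io group, which is GA) looks like. The names below are illustrative and not taken from the talk.

```yaml
# A minimal VolumeSnapshot: protects a single persistent volume by snapshotting
# the PVC named in `source`, using the CSI driver behind the VolumeSnapshotClass.
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: mysql-data-snap                      # illustrative name
  namespace: prod-db
spec:
  volumeSnapshotClassName: csi-snapclass     # assumed class provided by the CSI driver
  source:
    persistentVolumeClaimName: mysql-data    # the PVC holding the database files
```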
With this, I want to walk you through what exactly a backup workflow looks like within the Kubernetes context. The workflow starts on the left and ends on the right. Basically, a user kicks off a backup process. This process, as mentioned previously, includes two pieces. One is the Kubernetes resources that the backup contains; it may be scoped to a certain application, for example. In many cases these are most likely just simple API resources, YAML files, that get exported to some remote repository that has a lifecycle independent of your cluster. The other major piece is the so-called data backup. There are many ways you can do data backups. One of them is using an application-native data dump, with typical tools like mysqldump or a Kafka dump, that kind of thing. You basically dump the data into a backup location, and then some local process copies this dump data to a remote location. Again, this remote location has to have a lifecycle independent of your cluster so that you can restore from it (a rough sketch of this path is shown below). Another way is the so-called controller-coordinated approach. This is a module that the community is looking to provide in the future as well. Basically, it is agnostic to what kind of application it is. What it does is that it first goes ahead and quiesces the application so that no more writes are accepted, then it takes a volume snapshot using the volume snapshot API, et cetera, does the volume backup, and then unquiesces the application. After all this is done, the volume data gets transported to an external location so that it has an independent lifecycle. That is really a thousand-foot view of the backup workflow. Moving to restoration, next slide, please. It's kind of the reverse, right? Basically, the user starts a restoration, and the first thing is to import the backup from a remotely maintained storage system, typically object storage, into your Kubernetes cluster, and that again has two pieces. One piece is the Kubernetes resources: basically, re-create the Deployment or StatefulSet, re-create the Secrets, re-create the Service configs for your application. Then, for the storage layer, it re-creates the PVC, and from there it forks into two branches as well. In one branch, once the volume is there, the application-native tools can read from the remote location and restore data from the data dump into the volume, with the application orchestrating the whole restoration process. Or, in a more Kubernetes-native way, the PVC can be rehydrated from a previously stored volume snapshot or volume backup (see the sketch below). Next slide, please. Given the workflows I just described, what are the modules needed to accomplish them? There are a couple of existing building blocks in the Kubernetes community that already support part of this. In the application layer, we have StatefulSets, Deployments, and DaemonSets, which are really well-established workload APIs, and those workload APIs can easily tell you what resources belong to an application by doing a simple label query. There is also the Application CRD, which tells you what components compose a particular application. In the storage layer, there is volume snapshot, which is a GA feature; it is a CSI feature built on top of the CSI driver. Next slide, please.
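As a rough sketch of the application-native dump path described above, a scheduled dump can be as simple as a CronJob that runs mysqldump and copies the output to a remote, cluster-independent location. Everything here, the image, the upload tool, the credentials, and the bucket, is an assumption made for illustration, not something defined by the working group.

```yaml
# Illustrative only: a nightly mysqldump copied to an object store bucket whose
# lifecycle is independent of the cluster.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: mysql-dump-backup
spec:
  schedule: "0 2 * * *"                      # nightly at 02:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: dump
            # Hypothetical image bundling mysqldump and an object-store upload client.
            image: example.com/mysql-backup:latest
            command: ["/bin/sh", "-c"]
            # "upload-tool" stands in for whatever client copies the dump off-cluster.
            args:
            - >
              mysqldump -h mysql -u root -p"$MYSQL_ROOT_PASSWORD" --all-databases > /tmp/dump.sql
              && upload-tool /tmp/dump.sql s3://example-backups/mysql/$(date +%F).sql
            env:
            - name: MYSQL_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: mysql-secret         # assumed Secret holding the database password
                  key: password
```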
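And on the restore side, the Kubernetes-native rehydrate step mentioned above is already supported today: a new PVC can be pre-populated from an existing VolumeSnapshot through the PVC dataSource field. Again, the names are illustrative.

```yaml
# Restore path: a new PVC whose contents are populated from an existing VolumeSnapshot.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data-restored
  namespace: prod-db
spec:
  storageClassName: csi-standard             # assumed CSI-backed StorageClass
  dataSource:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: mysql-data-snap                    # the snapshot taken earlier
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 20Gi
```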
Let's take a look at this picture. The workload APIs, like StatefulSets, and the SIG Apps Application CRD solve part of the problem of figuring out which Kubernetes resources a backup process should be backing up. Next slide, please. The volume snapshot feature really fits the data piece in the case where there is no application-native data dump. The controller can go ahead and create the volume snapshots from a PVC and then unquiesce the application. With that, next slide, please. There is a lot more. In the restoration process, the volume snapshot can be used natively to rehydrate the PVC from it. Next slide, please. With this, there is still a whole bunch of missing building blocks: volume backup, and change block tracking, which allows very efficient volume backup; the data populator, which reads the volume backup and repopulates the PVC; the remote place to store your backup, such that it has a lifecycle independent of the cluster itself; and the quiesce and unquiesce hooks that allow application-consistent snapshots, et cetera. All of these are yet to be built and fitted into the workflows. Next slide, please. This gives you an overview of the missing components in the community. The green boxes are existing pieces, the yellow boxes are ongoing efforts, and the orange boxes are yet to be designed. Xin will give more details on these. The next slide talks about restoration, and it is similar: the yellow box is COSI, which is basically an effort to provide an object storage interface, the green boxes are existing pieces, and the orange boxes are yet to be developed or proposed. With that, I am going to hand it over to Xin to continue with a deep dive into the missing building blocks. Thanks, Xin. Thanks, Xiang Qian. So the first missing building block we identified is the volume backup. We need this because we need to extract data to secondary storage. We have already got a volume snapshot API, but there is no explicit definition made in the design to have snapshots stored on a different backup device, separate from the primary storage. For some cloud providers, a snapshot is actually a backup that is uploaded to an object store in the cloud. However, for most other storage providers, a snapshot is stored locally alongside the volume on the primary storage. Without a volume backup API, the alternative is for backup vendors to have two solutions: for storage systems that upload snapshots to an object store automatically, a snapshot is a backup; for storage systems that only take local snapshots, use the volume snapshot API to take the snapshot, and then have a data mover upload the snapshot to a backup device. We are at a very early stage of discussions about this one. So let's take a look at this diagram. Volume backup is next to volume snapshot here. We put it in an orange box to indicate that it is a missing Kubernetes component. We have started discussions about it, but there is no concrete design yet (a purely hypothetical sketch follows below).
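Purely to make the gap concrete: a VolumeBackup resource, if the working group ends up defining one, might be declared against a local snapshot and a backup repository along these lines. This is an editor's hypothetical sketch; no such API exists and the actual design is still under discussion.

```yaml
# HYPOTHETICAL sketch only: the API group, kind, and fields below are made up
# to visualize the missing volume backup building block. They are not part of
# Kubernetes or of any working group proposal.
apiVersion: backup.example.io/v1alpha1
kind: VolumeBackup
metadata:
  name: mysql-data-backup
spec:
  source:
    volumeSnapshotName: mysql-data-snap          # local snapshot to move off the primary storage
  target:
    backupRepositoryName: offsite-object-store   # repository with a lifecycle independent of the cluster
```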
The next one is CBT, change block tracking, and the changed file list. Without CBT and changed file lists, backup vendors have to do full backups all the time. That is not space efficient, takes longer to complete, and needs more bandwidth. Another use case is snapshot-based replication, where you take snapshots periodically and replicate them to another site for disaster recovery purposes. So what are the alternatives? Without CBT, we can either do full backups or call each storage vendor's API individually to retrieve the changed blocks, which is highly inefficient. We are currently working on a design for this feature. Let's take a look at this diagram. CBT is next to volume backup and volume snapshot, as it is used to make backups more efficient. It is in a yellow box, indicating it is work in progress. The third missing building block is the backup repository. A backup repository is a location, or repo, to store data. This can be an object store in the cloud, an on-prem storage location, or an NFS-based solution. There are two types of data to back up that we need at restore time: the Kubernetes cluster metadata and the local snapshot data. We need to back them up and store them in a backup repository. Currently there is a proposal for an object store backup repository: the proposal for object bucket provisioning, or COSI. It proposes object storage Kubernetes APIs to support orchestration of object store operations for Kubernetes workloads, bringing object storage in as a first-class citizen in Kubernetes. It also introduces the Container Object Storage Interface, COSI, as a set of gRPC interfaces that object storage providers can write drivers against to provision object stores. Kubernetes COSI is currently a subproject in SIG Storage. It has weekly meetings, and it is targeting alpha in the 1.23 release. Let's take a look at this diagram. We can see that COSI is in a yellow box, indicating it is a work-in-progress Kubernetes component. This is the object store backup repository. It can be used to export a backup and store the data, and at restore time COSI is used to import the backup data. The next missing building block is the volume populator. Without a volume populator, we can only create a PVC from another PVC or from a volume snapshot. But what if the backed-up data is stored in a backup repository such as an object store? The volume populator feature allows us to provision a PVC from an external data source such as a backup repository. In addition, it allows us to dynamically provision a PVC with data populated from that backup repository and to honor the WaitForFirstConsumer volume binding mode during restore, to ensure the volume is placed on the right node where the pod is scheduled. There is an AnyVolumeDataSource alpha feature gate, which was introduced in the 1.18 release and had a redesign in the 1.22 release. There are repos for a shared library for volume populators and for a controller responsible for validating the PVC data source, and we have already got the first releases from those repos. This feature is targeting beta in the 1.23 release. Let's take a look at the diagram. We can see that the volume populator is needed at restore time. It is in a yellow box, indicating that it is a work-in-progress Kubernetes component. It is used to rehydrate the PVC from a backup repository during restore (a rough sketch is shown below).
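To illustrate the populator path: with the AnyVolumeDataSource feature, a PVC can point its dataSourceRef at a custom resource that a volume populator controller knows how to hydrate the volume from. The dataSourceRef field is real; the API group, kind, and name below are assumptions made for illustration.

```yaml
# A PVC whose contents are populated by a volume populator controller from a
# backup repository. The referenced custom resource is hypothetical.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data-from-backup
spec:
  storageClassName: csi-standard       # assumed; WaitForFirstConsumer binding is still honored
  dataSourceRef:
    apiGroup: backup.example.io        # made-up API group for the populator's custom resource
    kind: BackupRestoreRequest         # hypothetical resource a populator controller understands
    name: mysql-nightly-restore
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 20Gi
```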
The next one is the quiesce and unquiesce hooks. We need these hooks to quiesce the application before taking a snapshot and unquiesce it afterwards, to ensure application consistency. We investigated how quiesce and unquiesce work in different types of workloads; they have different semantics. We want to design a generic mechanism to run commands in containers, but we want to mention that application-specific semantics are out of scope. We currently have a proposal called container notifier. The KEP has been submitted and is being reviewed. We are targeting alpha, well, actually, we are now targeting alpha in the 1.24 release, because we did not make it into 1.23. So let's look at this KEP. In phase one, we propose to introduce several API changes: adding an optional field, notifiers, which is a list of container notifiers, to the Container type; adding an inline ContainerNotifierHandler type, which defines a command; adding a core API type, PodNotification, which defines the request type to trigger the execution of container notifiers in a pod; and introducing a new feature gate, ContainerNotifier, to toggle this feature. A single trusted controller, the pod notification controller, will be introduced to watch PodNotification resources, execute the command, and update their statuses accordingly. In phase two, we propose to add a core Notification API type and a controller that processes Notification resources, add an inline definition for signals in the pod, and allow the API object to send a request to trigger delivery of those signals. We also propose to move the logic from the pod notification controller into the kubelet: the kubelet watches PodNotification objects, runs the command, and updates the statuses of the PodNotification objects accordingly. In phase three, a probe may be added, if needed, as an inline pod definition to verify whether the signal is delivered, or whether the command was run and resulted in the desired outcome. So, as shown in this diagram, the container notifier is mainly used at backup time, to quiesce before taking the snapshot and unquiesce afterwards. The next one is the consistent group snapshot. We talked about the container notifier proposal, which tries to ensure application consistency. But what if you cannot quiesce the application, or the quiesce is too expensive, so you want to do it less frequently, but you still want to be able to take a crash-consistent snapshot more frequently? Also, an application may require the snapshots of multiple volumes to be taken at the same point in time. That is when the consistent group snapshot comes into the picture. There is a KEP on volume group and group snapshot. It proposes to introduce a new VolumeGroup CRD that groups multiple volumes together, and a new VolumeGroupSnapshot CRD that supports taking a snapshot of all volumes in a group, to ensure write-order consistency. This KEP is being reviewed. So let's take a look at the diagram. We do not have a container notifier to quiesce here, but we have a consistent group snapshot that facilitates the creation of snapshots of multiple volumes in the same group, to ensure write-order consistency. We have snapshot APIs for individual volumes, but what about protecting a stateful application? There is a KEP submitted that proposes a Kubernetes API that defines the notion of a stateful application and defines how to run operations, such as snapshot, backup, and restore, on those stateful applications. This is still at a very early design stage. As shown in this diagram, the application backup handles the backup of a stateful application. It can leverage the container notifier to quiesce and use COSI as the backup repository. Similarly, we can have an application restore that handles the restore of a stateful application. So these are all the missing building blocks that we have identified and are working on. Next, I will talk about how to get involved. As discussed in the previous slides, this working group is working on identifying the missing functionality for supporting data protection in Kubernetes and trying to figure out how to fill those gaps. We are also working on a white paper on the data protection workflow. We have bi-weekly meetings on Wednesdays at 9 AM Pacific time. If you are interested in joining the discussions, you are welcome to join our meeting. We also have a mailing list and a Slack channel, as shown here. So this is the end of the presentation. Thank you all for attending the session. If you have any questions, please don't hesitate to reach out to us.
Thank you.