Hello, everyone. Today, Xiangqian and I will give an introduction and deep dive into the Kubernetes data protection working group. My name is Xin Yang. I work at VMware on the cloud storage team. I'm a co-chair of the CNCF Storage TAG and a co-chair of Kubernetes SIG Storage, and I co-lead the data protection working group with Xiangqian.

Welcome, everyone. My name is Xiangqian. I work in Google's storage department, and I lead the data protection working group with Xin. Next slide, please.

Today, Xin and I will go through all the items on this agenda. We'll talk about why we started this working group, which organizations are involved, and how we are building the data protection concept in Kubernetes. Then we'll talk about what exactly we want to achieve for data protection in Kubernetes, and what existing building blocks allow backup vendors or application owners to protect their valuable assets. Next, we'll talk about the missing building blocks the community is looking to define or provide, to further enable backup vendors or application owners to protect their workloads. And lastly, we'll get into how you can get involved. Next slide, please.

The day one operations for stateful workloads in Kubernetes are reasonably well defined as of today. First of all, there are very mature persistent volume operations, supported either via CSI or via in-tree plugins, that allow stateful workloads to use the persistent volume layer. On top of that, there are workload APIs that use persistent volumes to define workloads: Deployments, StatefulSets, and so on. With the strong desire to move stateful workloads onto a container-orchestrated architecture comes a deep requirement: how do I protect my data should a disaster happen? Unfortunately, at this moment, the day two operations for protecting those workloads in Kubernetes are still insufficient. Some organizations and backup vendors use GitOps, but GitOps can only protect the application definitions, basically the YAML files, not the persistent volume data that is typically managed by the underlying storage system. Given all that, we formed the data protection working group to try to define a solution for this problem. Next slide, please.

So who is involved? There are many, many organizations actively participating in this data protection working group, and these are the companies. If you feel I missed yours, please feel free to reach out and we will add it. Next slide, please.

So what exactly is data protection in Kubernetes? First, let me explain the main goal at a really high level. The main goal is to enable an application owner, a cluster administrator, or a backup administrator (in a big organization there is typically a dedicated backup administrator) to restore any valuable assets running in the Kubernetes environment to a previously preserved state, should any disaster happen: loss of the cluster, loss of the storage system, and so on. Defining that in the Kubernetes context mainly involves two pieces. One is the API resources that compose an application. The other is the persistent volume data that the application relies on. This is a complicated, layered problem: it includes backup and restore at the persistent volume level, at the application level, and further up at the namespace and cluster level. Next slide, please.
So part of our working group's charter is to define what components are needed to support all these backup and restore operations, at the different levels. At the persistent volume level: volume snapshots, how to back up a volume, and how to restore a volume. At the application level: what resources are needed by an application? In some cases an application wants to be application-consistent when a backup is created, so a mechanism to quiesce and unquiesce an application, such that in-flight writes are flushed and new writes are held during that process, also needs to be supported. And at a bigger scope, how do we do this at the namespace and cluster level? Next slide, please.

Here are some common use cases we gathered in the data protection working group. A typical one: as an application owner, I want to protect my application against common failures. For example, I rolled out a bad software version and I want to roll it back with exactly the same configs I used to have, as well as the data I used to have. Another example for that persona is migration: between namespaces, between clusters, and even across different availability zones. Another one comes from the cluster administrator's point of view: they want to protect the namespaces running in their cluster as single units, so they want to provide data protection at the namespace level. And finally, another big case: as a data protection administrator in an organization, I don't know the details of what kind of workloads are running in each Kubernetes cluster, but I still want to enforce organization policies around those workloads or around those clusters. So providing the utilities that let backup vendors step in and build those solutions is also critical. Next slide, please.

So we've talked about the definition; now let's see what a typical application backup looks like. This is a really high-level view. Again, there are two pieces: the Kubernetes resources and the persistent volume data. The user kicks off a backup, and two things happen. The first is to gather all the resources that compose a specific application, namespace, or cluster and ship them externally, typically to a secondary object storage system or another secondary storage system. The other piece is to back up the persistent volume data. There are many, many applications, and they have different flavors of doing things, so this can well be an application-specific native data dump, for example a MySQL dump or a Postgres dump. Or it can rely on existing Kubernetes components, like the controller-coordinated volume snapshot or volume backup process, with quiesce and unquiesce hooks so it can achieve application consistency. After this is done, the volume data, or a volume snapshot, is also copied to an externally managed backup repository. At that point, the application owner can safely claim that their data and config are stored in an externally managed system that is not tied to the cluster the application is running on. The snapshot step of that flow is sketched below. Next slide, please.
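For reference, here is a minimal sketch of that controller-coordinated snapshot step, using the GA VolumeSnapshot API. The class name, PVC name, and namespace are hypothetical, and any quiesce and unquiesce hooks would still have to run out of band today, since there is no built-in hook mechanism yet.

```yaml
# Minimal sketch: snapshot one PVC with the VolumeSnapshot API (GA).
# All names are hypothetical; substitute your own CSI snapshot class and PVC.
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: mysql-data-snap-1
  namespace: prod
spec:
  volumeSnapshotClassName: csi-snapclass    # must map to your CSI driver
  source:
    persistentVolumeClaimName: mysql-data   # the volume being backed up
```

A backup tool would then copy the snapshot contents out to the external backup repository, which is exactly the volume backup gap discussed later in this talk.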
Now let's go through how the restore happens. The user starts a restoration process, which, if needed, imports a backup, including data and config, from the externally managed storage system. This process also takes place in two pieces. One is to restore the Kubernetes resources that constitute the application. The other is to restore the PV/PVC data. Of course, this needs a specific PV/PVC data restoration process, and it also depends on the nature of the application. For example, with the application-native route, the application provides its own tools to restore the data into the format it wants; with the Kubernetes-controller-managed route, it is backed by simply restoring a PV/PVC from a snapshot or a volume backup. Next slide, please.

In order to achieve all this, we need a lot of building blocks, and some already exist in the community. To figure out what resources an application needs to back up, there are the workload APIs like StatefulSet, Deployment, and so on, and there are also application CRDs. To back up and restore a volume, components like the volume snapshot feature, which has already gone GA, can be used as well. The thing is, we are still missing a lot of building blocks; we'll cover those soon. Next slide, please.

Now let's see how this is used. Recall that we need to define a group of resources that constitutes an application; that's where the workload APIs or the SIG Apps Application CRD can help. Typically this is achieved with a label query, or within a namespace, to get all the resources that make up the application. Next slide, please.

For the volume backup piece, the volume snapshot feature plays a good role here, because volume snapshots are typically really, really fast; for many storage vendors the implementation is just an internal marker of the current state. That allows the application to freeze and unfreeze itself in a very short period of time, at the millisecond level. So the volume snapshot feature can be plugged into the volume backup process. Next slide, please.

In the restoration process, in a very similar way, a volume snapshot can be used to rehydrate a PVC, such that the PV contains the data from when the volume snapshot was taken (there is a small sketch of this at the end of this part). Next slide, please.

That said, we're still missing a lot. For example, volume backup: how do we do a volume backup? A volume snapshot, in many implementations, is still stored locally on the local storage system. How do we export the volume snapshot out? We really need a volume backup module. Where do we store the backups: the config, the application API resources, and the volume data? How do we trigger a quiesce against a specific application? And how do we coordinate the application snapshot or application backup process, which includes backing up the config and backing up the volume, with perhaps a quiesce before and an unquiesce after the volume backup? Next slide, please.

So, putting things together, this is, again, a really high-level view. The blue pieces are the workflow, the green pieces are what is already there, the yellow pieces are in progress, and the orange pieces are still to be designed or discussed. Given this whole picture, we need something that can coordinate an application's resource backup as well as its data backup. We need a place where the backup will be stored; it's called a backup repository, and I put the name over there. And we need something to really back up the volumes, which is the volume backup component. For the quiesce and unquiesce hooks, there's an ongoing KEP called container notifier; Xin will get into more details on that. In the restore workflow (next slide, please), it's a similar picture: the backup repository comes into play when a backup is imported, an application restore API coordinates or orchestrates the whole restoration process, and finally the volume backup module imports the backed-up volume data into the Kubernetes cluster.
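As promised, here is a minimal sketch of that rehydration building block: a new PVC restored from a VolumeSnapshot through the PVC dataSource field. The names carry over from the hypothetical snapshot example earlier.

```yaml
# Minimal sketch: rehydrate a new PVC from an existing VolumeSnapshot.
# The CSI driver provisions the volume pre-populated with the snapshot data.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data-restored
  namespace: prod
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: csi-standard             # hypothetical storage class
  resources:
    requests:
      storage: 10Gi                          # at least the snapshot's size
  dataSource:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: mysql-data-snap-1                  # the snapshot taken at backup time
```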
With that, I will hand it over to Xin to discuss all the missing components.

Thanks, Xiangqian. The first missing building block we identified is volume backup. We need this because we need to extract data to secondary storage. We already have a volume snapshot API, but nothing in its design explicitly requires snapshots to be stored on a separate backup device apart from the primary storage. For some cloud providers, a snapshot actually is a backup that is uploaded to an object store in the cloud. However, for most other storage vendors, a snapshot is stored locally, alongside the volume, on the primary storage. Without a volume backup API, backup vendors have to maintain two solutions: for storage systems that upload snapshots to an object store automatically, a snapshot is a backup; for storage systems that only take local snapshots, they use the volume snapshot API to take snapshots and then run a data mover to upload the snapshots to a backup device. We are at a very early stage of discussion on this one. Let's take a look at the diagram. Volume backup sits next to volume snapshot here. We put it in an orange box to indicate that it is a missing Kubernetes component: we have started discussions, but there's no concrete design yet.

The next missing building block is CBT, changed block tracking, along with changed file lists. Without CBT and changed file lists, backup vendors have to do full backups all the time. That is not space efficient, it takes longer to complete, and it needs more bandwidth. Another use case is snapshot-based replication, where you take snapshots periodically and replicate them to another site for disaster recovery purposes. So what are the alternatives? Without CBT, we can either do full backups or call each storage vendor's API individually to retrieve the changed blocks, which is highly inefficient. We are currently working on a design for this feature. Let's take a look at this diagram. The CBT box sits next to volume backup and volume snapshot, as it is used to make backups more efficient. It is in a yellow box, indicating this is a work-in-progress component.

The third missing building block is the backup repository. A backup repository is a location, or repo, to store data. This can be an object store in the cloud, an on-prem object store, or an FS-based solution. There are two types of data to be backed up that we need at restore time. The first is the Kubernetes cluster metadata. The second is the local snapshot data. We need to back them up and store them in a backup repository. Currently, there is a relevant proposal for an object store backup repository: the proposal for object bucket provisioning, or COSI. It proposes object storage Kubernetes APIs to support orchestration of object store operations for Kubernetes workloads, thereby bringing object storage in as a first-class citizen in Kubernetes, just like file and block storage. It also introduces the Container Object Storage Interface, or COSI, a set of gRPC interfaces against which object storage providers can write drivers to provision or provide access to object stores. Kubernetes COSI is already a sub-project of SIG Storage. It has weekly design meetings, and it's targeting alpha in the 1.23 release.
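As a purely illustrative sketch of how a backup tool might request a bucket for its backup repository: the COSI API was still being designed at the time of this talk, so the kind and field names below are assumptions based on the claim-style shape the proposal was converging on, not a final API.

```yaml
# Illustrative only: COSI was pre-alpha, so treat every name here as an
# assumption. A claim references a class, which maps to a COSI driver.
apiVersion: objectstorage.k8s.io/v1alpha1
kind: BucketClaim
metadata:
  name: backup-repo-claim
  namespace: prod
spec:
  bucketClassName: s3-backup-class   # hypothetical BucketClass
  protocols: ["S3"]                  # protocol the workload will use
```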
Let's take a look at this diagram. COSI is in a yellow box, indicating that it is a work-in-progress Kubernetes component. This is the object store backup repository; it can be used to export a backup and store the data. Now let's look at restore: COSI is used to import the backup data at restore time.

The next missing building block is the volume populator. Without a volume populator, we can only create a PVC from another PVC or from a volume snapshot. But what if the backup data is stored in a backup repository, such as an object store? The volume populator feature allows us to provision a PVC from an external data source, such as a backup repository. In addition, it allows us to dynamically provision a PVC, have its data populated from that backup repository, and honor the WaitForFirstConsumer volume binding mode during restore, to ensure the volume is placed on the right node, where the pod is scheduled. There is an AnyVolumeDataSource feature gate, which was introduced in 1.18 and redesigned in the 1.22 release. Repos were created for a shared library for volume populators and for a controller that is responsible for validating PVC data sources. We will have our very first release from those repos very soon, and this feature is targeting beta in the 1.23 release. (There is a small sketch of a populated PVC at the end of this part.) Let's take a look at this diagram. We can see that the volume populator is needed at restore time. It is in a yellow box, indicating it is a work-in-progress Kubernetes component; it is used to rehydrate a PVC from a backup repository during restore.

The next one is quiesce and unquiesce hooks. We need these hooks to quiesce an application before taking a snapshot and unquiesce it afterwards, to ensure application consistency. We investigated how quiesce and unquiesce work in different types of workloads; they have different semantics. We wanted to design a generic mechanism to run commands in containers, but we want to mention that application-specific semantics are out of scope. We currently have a proposal called container notifier. The KEP has been submitted and is being reviewed, and we are targeting alpha in the 1.23 release.

So here are some details of the container notifier KEP. We are doing this in phases. In phase one, we propose to introduce several API changes: adding an optional field called notifiers, which is a list of container notifiers, to the container; adding an inline type, container notifier handler, which defines the command; adding a core API type, pod notification, which defines a request to trigger execution of a container notifier in a single pod; and introducing a new feature gate, ContainerNotifier, to toggle this feature. A single trusted controller, the pod notification controller, will be implemented to watch pod notification resources, execute the commands, and update their statuses accordingly. In phase two, we propose to add a core API notification type and a controller that processes notification resources, add an inline pod definition for signals, and allow an API object to send a request to trigger the delivery of those signals. More of the logic moves into the kubelet: the kubelet watches pod notification objects, runs the commands, and updates the statuses of the pod notification objects accordingly. In phase three, a probe may be added, if needed, as an inline pod definition to verify that the signal was delivered, or whether the command ran and resulted in the desired outcome.
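Going back to the volume populator for a moment, here is the promised sketch: a PVC populated from a custom data source via the redesigned dataSourceRef field from 1.22. The backup.example.com VolumeBackup kind is entirely hypothetical; a real vendor would ship its own CRD plus a populator controller that knows how to fill the volume from the backup repository.

```yaml
# Minimal sketch: a PVC populated from an arbitrary custom resource.
# Requires the AnyVolumeDataSource feature gate plus a populator controller
# for the referenced CRD. The VolumeBackup kind below is hypothetical.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data-populated
  namespace: prod
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: csi-standard   # WaitForFirstConsumer binding is honored
  resources:
    requests:
      storage: 10Gi
  dataSourceRef:                   # 1.22 redesign; allows non-core sources
    apiGroup: backup.example.com
    kind: VolumeBackup
    name: mysql-backup-001
```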
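And to make the phase one proposal above a bit more concrete, here is a hedged sketch of what the API could look like, based only on the description in this talk. The KEP was still under review, so every field name here is illustrative, and fsfreeze is just one example of a quiesce command (it would also need suitable privileges in the container).

```yaml
# Illustrative sketch of the proposed, not yet merged, container notifier API.
# Field names follow the phase one description above and may well change.
apiVersion: v1
kind: Pod
metadata:
  name: mysql-0
spec:
  containers:
  - name: mysql
    image: mysql:8.0
    notifiers:                     # proposed optional list on the container
    - name: quiesce
      handler:
        exec:
          command: ["fsfreeze", "--freeze", "/var/lib/mysql"]
    - name: unquiesce
      handler:
        exec:
          command: ["fsfreeze", "--unfreeze", "/var/lib/mysql"]
---
# Proposed core type: a request to run one named notifier in a single pod.
apiVersion: v1
kind: PodNotification
metadata:
  name: quiesce-mysql-0
spec:
  podName: mysql-0
  notifier: quiesce
```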
As shown in the diagram, container notifier is mainly used at backup time, to quiesce before taking the snapshot and unquiesce afterwards.

The next one is the consistent group snapshot. We just talked about the container notifier proposal, which tries to ensure application consistency. But what if we can't quiesce the application, or quiescing the application is too expensive, so we want to do it less frequently, while still being able to take crash-consistent snapshots more frequently? Also, an application may require that snapshots of multiple volumes be taken at the same point in time. That's where the consistent group snapshot comes into the picture. There is a KEP on volume groups and group snapshots. It proposes to introduce a new volume group CRD that groups multiple volumes together, and a new group snapshot CRD that supports taking a snapshot of all volumes in a group, to ensure write-order consistency. The KEP is being reviewed. Let's take a look at this diagram. We don't have a container notifier doing the quiesce here, but we have a consistent group snapshot that facilitates the creation of a snapshot of multiple volumes in the same group, ensuring write-order consistency.

We have snapshot APIs for individual volumes, but what about protecting a stateful application as a whole? There is a KEP that proposes a Kubernetes API defining the notion of a stateful application and defining how to run operations such as snapshot, backup, and restore on those stateful applications. This is still at a very early design stage. As shown in this diagram, application backup handles the backup of a stateful application; it can leverage the container notifier to do the quiesce and use COSI as the backup repository. Similarly, here we have application restore, which handles the restore of a stateful application.

So these are all the missing building blocks that we have identified and are working on. Next, I'm going to talk about how to get involved. As discussed in the previous slides, the data protection working group is working on identifying the missing functionality for supporting data protection in Kubernetes and trying to figure out how to fill those gaps. We are also working on a white paper on the data protection workflow. We have weekly meetings on Wednesdays at 9 a.m. Pacific time. If you are interested in joining the discussions, you are welcome to join our meetings. We also have a mailing list and a Slack channel, as shown here. This is the end of the presentation. Thank you all for attending the session. If you have any questions, please don't hesitate to reach out to us. Thank you. Bye.