Hello, everyone. Thank you for coming to the session today. Dave and I will be giving a deep dive on the Kubernetes Data Protection Working Group. My name is Xing Yang. I work at VMware on the cloud native storage team. I'm also a co-chair of Kubernetes SIG Storage and of the Data Protection Working Group, working with Dave.

I'm Dave Smith-Uchida. I'm a technical leader at Veeam working on the Kasten K10 product.

Here is today's agenda. We will give key updates from the working group. We will talk about who we are and what the motivation was for establishing the working group. We will discuss some of the projects that we are working on and, finally, how to get involved.

Here are some of the key updates. We wrote a white paper on data protection workflows in Kubernetes, and here is a link to the annual report and also to some of the previous talks at KubeCon. Here you can see the companies who are supporting this working group. If your company is also supporting this working group but is not showing on this list, please let us know and we can get you added.

In Kubernetes, the data operations for stateful workloads are well supported. We have PersistentVolumes and PersistentVolumeClaims for volume operations. We have workload APIs such as Deployments and StatefulSets that you can use to manage your workloads. According to the 2022 survey by the Data on Kubernetes community, more and more stateful workloads are moving to Kubernetes. There are different types of workloads: there are database workloads, and there are machine learning, streaming, and other types of workloads as well. These stateful workloads are moving to Kubernetes to take advantage of Kubernetes' self-healing ability, portability, scalability, agile deployment, and so on.

On the other hand, data operations such as data protection are still limited. The GitOps workflow has limitations for stateful applications: Secrets, ConfigMaps, and your precious data stored in persistent volumes are not in Git. So we still need to figure out how to better support data protection in Kubernetes. That's why we formed this working group. We do work with other groups: SIG Storage and SIG Apps are sponsors of this working group. We also work with TAG Storage on data protection related topics.

This shows the backup workflow with existing and missing building blocks in Kubernetes. The blue color shows the process, the green color shows existing building blocks, the yellow ones are work in progress, and the orange ones are missing building blocks. When you take a backup of an application, we need to back up both the Kubernetes metadata and the volume data. To back up volume data, there are mainly two ways: you can use a native data dump, such as mysqldump, or you can use the controller-coordinated approach, which takes a volume snapshot or a volume backup. And before taking a snapshot, we first want to quiesce the application, and after taking the snapshot, we want to unquiesce the application. Both the metadata and the data are exported into a backup repository. The backup repository is a repo or location that you can use to store the data and metadata. We have some existing features in Kubernetes. SIG Apps has the Application CRD. We also have volume snapshots, a feature that has been GA since the 1.20 release (a small sketch of requesting one follows in a moment). And we also have the CSI consistent group snapshot and volume mode conversion, which I will go over later. This shows the restore workflow with existing and missing building blocks.
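Before getting into the restore side, here is a minimal Go sketch of the controller-coordinated backup path just described: quiesce, request a CSI VolumeSnapshot, then unquiesce and export. The kubeconfig path, PVC name, namespace, and snapshot class name are assumptions made up for the example, not anything from the talk.

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from a local kubeconfig (the path is an assumption).
	cfg, err := clientcmd.BuildConfigFromFlags("", "/path/to/kubeconfig")
	if err != nil {
		panic(err)
	}
	client, err := dynamic.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	// VolumeSnapshot is served by the snapshot.storage.k8s.io/v1 API (GA since 1.20).
	gvr := schema.GroupVersionResource{
		Group:    "snapshot.storage.k8s.io",
		Version:  "v1",
		Resource: "volumesnapshots",
	}

	// Request a snapshot of an assumed PVC "mysql-data" after the application
	// has been quiesced; the snapshot class name is also an assumption.
	snap := &unstructured.Unstructured{Object: map[string]interface{}{
		"apiVersion": "snapshot.storage.k8s.io/v1",
		"kind":       "VolumeSnapshot",
		"metadata": map[string]interface{}{
			"name":      "mysql-data-snap-1",
			"namespace": "default",
		},
		"spec": map[string]interface{}{
			"volumeSnapshotClassName": "csi-snapclass",
			"source": map[string]interface{}{
				"persistentVolumeClaimName": "mysql-data",
			},
		},
	}}

	created, err := client.Resource(gvr).Namespace("default").
		Create(context.TODO(), snap, metav1.CreateOptions{})
	if err != nil {
		panic(err)
	}
	fmt.Println("created VolumeSnapshot:", created.GetName())
	// A backup controller would now wait for status.readyToUse to become true,
	// unquiesce the application, and export metadata and data to the repository.
}
```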
To restore the application, we first need to import the backup from the backup repository. Then we need to restore the Kubernetes metadata, including PVs and PVCs, and we need to restore the volume data. Depending on how the data was backed up: if it was backed up natively, then we restore from the native data dump; otherwise, we rehydrate the PVC from a volume snapshot or a volume backup.

There is a beta feature called the volume populator, which is very useful when we are doing a restore. It allows you to rehydrate a PVC from an external data source such as a backup, not just from a snapshot or another PVC. It also supports the WaitForFirstConsumer volume binding mode, and it allows the PV to be created and populated with data while making sure that the PV and PVC are bound.

When you create a PVC from a volume snapshot, you can change the volume mode from Filesystem to raw Block, but it is possible that this could introduce a vulnerability to the kernel. On the other hand, doing this volume mode conversion is a valid use case, because we want to do efficient backups and we want to be able to retrieve changed blocks. That's why we introduced this feature to prevent unauthorized volume mode conversion. We added a source volume mode field in the VolumeSnapshotContent and also an annotation on the VolumeSnapshotContent called allow-volume-mode-change. When you take a snapshot, the external snapshotter (the snapshot controller) will populate the source volume mode field based on the PVC's volume mode. When you create a PVC from the volume snapshot, the external provisioner will check the source volume mode against the volume mode of your new PVC. If they are different, it will check whether that annotation has been added on the VolumeSnapshotContent. If it is not there, the operation will be rejected; otherwise, it is allowed. This feature is targeting GA in the 1.30 release, and the feature flag is enabled in both the snapshotter and the provisioner. So if your application relies on this workflow, then action is required: you must make a change in your code, otherwise your application will fail when this feature moves to GA. The GA blog will come out soon after the 1.30 release.

Now let me talk about the backup repository and COSI. A backup repository is a repo that you can use to store your data and metadata. This can be an object store or an NFS location, and it can be either on-prem or in the cloud. There is a project called COSI, the Container Object Storage Interface, which is trying to introduce object storage into Kubernetes. It provides APIs to provision buckets and also allows the buckets to be used by pods. There are a few COSI components. There is a COSI controller manager that validates and binds the COSI-created buckets to bucket claims. There is a sidecar that watches the COSI Kubernetes API objects and calls the COSI driver to provision buckets. And there is a COSI driver that communicates with the storage backend and provisions buckets on the storage backend. There is also a set of APIs introduced in Kubernetes. The relationship between Bucket, BucketClaim, and BucketClass is very much like that between PV, PVC, and StorageClass. We also have BucketAccess and BucketAccessClass, which allow your pod to consume the bucket. We also introduced new gRPC interfaces for provisioning the buckets. This feature is alpha right now and we are trying to move it to beta. We have weekly meetings, so join the meeting if you are interested in this feature.
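To make that API shape a bit more concrete, here is a small Go sketch that just prints what a BucketClass and a BucketClaim might look like. COSI is alpha, so the exact field names may change; the driver name, class name, and namespace here are invented for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	// Rough shape of the COSI objects described above (objectstorage.k8s.io/v1alpha1).
	// COSI is alpha, so treat these fields as illustrative only.
	bucketClass := map[string]interface{}{
		"apiVersion":     "objectstorage.k8s.io/v1alpha1",
		"kind":           "BucketClass",
		"metadata":       map[string]interface{}{"name": "fast-s3"},
		"driverName":     "example.objectstorage.k8s.io", // assumed driver name
		"deletionPolicy": "Delete",
	}

	// A BucketClaim is the namespaced request for a bucket, analogous to a PVC.
	bucketClaim := map[string]interface{}{
		"apiVersion": "objectstorage.k8s.io/v1alpha1",
		"kind":       "BucketClaim",
		"metadata": map[string]interface{}{
			"name":      "backup-repo",
			"namespace": "backups",
		},
		"spec": map[string]interface{}{
			"bucketClassName": "fast-s3",
			"protocols":       []string{"S3"},
		},
	}

	for _, obj := range []map[string]interface{}{bucketClass, bucketClaim} {
		out, _ := json.MarshalIndent(obj, "", "  ")
		fmt.Println(string(out))
	}
}
```

A pod, such as a backup agent, would then consume the provisioned bucket through a BucketAccess object, which is the piece that wires credentials into the workload, and use it as the backup repository.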
I mentioned earlier that you want to quiesce the application before taking a snapshot and unquiesce it afterwards. But sometimes it is not possible to quiesce the application, or it is just too expensive to do so, and you still want to be able to take a crash-consistent snapshot. Also, the application might require snapshots to be taken of multiple volumes that are part of the application at the same point in time. That's when the consistent group snapshot comes into the picture. We introduced new Kubernetes APIs. There is the VolumeGroupSnapshot, which is a namespaced object; it represents the user's request for a group snapshot. We have the VolumeGroupSnapshotContent, which represents a group snapshot on the storage system. And we have the VolumeGroupSnapshotClass, which defines the type of group snapshot you want and is defined by the admin. We also made CSI spec changes: we introduced a new group controller service, and we introduced new gRPC interfaces to create, delete, and get volume group snapshots. This feature was first introduced in the 1.27 release, we continued to work on it, and we finally finished the implementation in the 1.29 release (a rough sketch of a group snapshot request appears a little further below). So now let me hand it over to Dave to give a deep dive on CBT.

Thanks, Xing. So, change block tracking. Change block tracking is a complement to the existing CSI snapshots. If we take a snapshot today, after we've snapshotted the volume, often we'll want to export the data. What we do is clone the volume, and then you have to either mount it as a file system, work through the file system, and find changes or compare against what's in your backup repository, or you can try to do a block mode export. But again, you don't know what's changed; you're just reading blocks. You can do some de-dupe on the back end, but it requires a lot more IO. Many of the storage systems that Kubernetes is running on top of will actually keep track of what the differences are between two snapshots. So our change block tracking API is a way to standardize and expose those interfaces from different storage vendors through CSI up to a Kubernetes backup application.

As you can see in this diagram, say we took the T1 snapshot, then we ran for a while and blocks two, six, eight, and nine got modified. Then we take another snapshot, and we can query the system: hey, what changed? And it'll give us back that list of changed blocks. That's where we started from. This gives us much faster backups, because obviously we don't have to pull all of the data out, we don't have to de-duplicate it, and we don't have to store it; we know what's changed. We obviously want a more generic interface so that backup providers don't have to do storage-system-specific integrations. That's where we've been up to this point: we've seen a number of different backup applications doing storage-system-specific integrations, but it doesn't cover everybody, and as things change, you have to go back and update the backup applications. And we've got other things we have to do in here, like full backups and integrating with the storage system. So this has been one of the major projects in the Data Protection Working Group for about two years now. We've gone through a lot of discussion and design, and we've had a lot of community involvement, which has been fantastic.
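Before going deeper on CBT, here is the rough sketch of the group snapshot request Xing described a moment ago. This uses the alpha groupsnapshot.storage.k8s.io API, so treat the exact fields as best-effort; the class name and the label selector are invented for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	// Approximate shape of a VolumeGroupSnapshot request (alpha API, so field
	// names may differ by release). All PVCs in the namespace matching the
	// selector get snapshotted at the same point in time.
	groupSnap := map[string]interface{}{
		"apiVersion": "groupsnapshot.storage.k8s.io/v1alpha1",
		"kind":       "VolumeGroupSnapshot",
		"metadata": map[string]interface{}{
			"name":      "db-group-snap",
			"namespace": "default",
		},
		"spec": map[string]interface{}{
			"volumeGroupSnapshotClassName": "csi-groupsnapclass",
			"source": map[string]interface{}{
				"selector": map[string]interface{}{
					"matchLabels": map[string]interface{}{
						"app": "my-database",
					},
				},
			},
		},
	}

	out, _ := json.MarshalIndent(groupSnap, "", "  ")
	fmt.Println(string(out))
}
```

The admin-defined VolumeGroupSnapshotClass plays the same role for the group that a VolumeSnapshotClass plays for an individual snapshot.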
So we wanted to define change block tracking for CSI, and we also had to discuss the security implications of this. We looked at various approaches. In a worst-case scenario, you can wind up with, say, five gig of change block tracking data on a one terabyte volume as the difference between two snapshots, so we had to find a way to get that out without overloading things like the API server. So far, we haven't defined a data path for directly accessing snapshot data; that may come in the future. Our data path is still going to be attaching a clone of a snapshot, a volume created from the snapshot, to read the data. And we're not yet doing things like being able to tell which files in a file system have changed.

We went through a number of different approaches. We started with: let's just put all the data in as custom resources. When you ask, we can have the CSI driver populate the custom resources, but that rapidly ran into space problems in the API server. As I said, there can be up to around five gig, which is a lot more than you want to put into your API server, especially when this is for each pair of snapshots that you're comparing. We looked at an aggregated API implementation, where we would have the CSI driver provide virtual resources through the API server, but then again we ran into concerns about how much data was going to be pushed through the API server, even though we weren't storing it there anymore. So where we wound up was more of an imperative interface, where you can connect directly to the CSI driver and query it. Where we're going is a gRPC service that you can attach to, with authentication via the API server.

This is a little diagram of how things would work. You've taken a couple of snapshots and now you're looking to export data. Your backup system goes ahead: step one is to get a token from the API server. Then it executes a gRPC call against the interface of the CSI driver, which we've exposed. That then verifies the token with the API server and verifies the snapshot IDs. When you do the get-delta call, for example, you'll give two snapshot IDs to compare against; we verify the snapshot IDs and then go ahead and call into CSI. We've added some new calls to the CSI spec down below to support this. The data comes back and we stream it back to the backup system, which can then use it to make decisions about which blocks to actually copy and back up. You'll notice in step two there are actually two different calls. One is called get delta, which gets the differences between two snapshots. The other is called get allocated, which simply returns the list of blocks that have been used so far. For example, on your first backup of a disk, traditionally we'd do a full backup, but quite often disks are not fully used. So if the storage system supports it, it may tell us: yes, you want to do a full backup, but in reality only these 10 gig of storage in your terabyte volume have actually been written, so you can just start from that. That's another one of the features in there. (A sketch of how a backup tool might drive these two calls follows.)
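Here is a hedged Go sketch of that flow. The interface and type names below are invented stand-ins for the gRPC service described in the diagram, not the actual CSI proto definitions, and the token handling is reduced to a plain string.

```go
package main

import (
	"context"
	"fmt"
)

// BlockRange describes a run of changed or allocated bytes on the volume.
// These types are hypothetical stand-ins that mirror the flow described in
// the talk; they are not the actual CSI proto messages.
type BlockRange struct {
	Offset int64 // byte offset into the volume
	Size   int64 // length of the extent in bytes
}

// SnapshotMetadataClient is a hypothetical view of the gRPC service exposed
// alongside the CSI driver: one call for deltas between two snapshots, one
// for the blocks allocated in a single snapshot.
type SnapshotMetadataClient interface {
	GetDelta(ctx context.Context, token, baseSnap, targetSnap string) ([]BlockRange, error)
	GetAllocated(ctx context.Context, token, snap string) ([]BlockRange, error)
}

// incrementalBackup sketches the diagram: the backup system has already
// obtained an audience-scoped token from the API server (step 1), then queries
// the driver's metadata service (step 2) and copies only the extents that
// changed between the two snapshots.
func incrementalBackup(ctx context.Context, c SnapshotMetadataClient, token, prevSnap, curSnap string,
	copyExtent func(offset, size int64) error) error {

	changed, err := c.GetDelta(ctx, token, prevSnap, curSnap)
	if err != nil {
		return fmt.Errorf("querying changed blocks: %w", err)
	}
	for _, ext := range changed {
		// Read these ranges from a volume cloned from curSnap and write them
		// to the backup repository; full-volume reads are no longer needed.
		if err := copyExtent(ext.Offset, ext.Size); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	// Wiring up a real client would need the driver's advertised endpoint and
	// CA bundle; omitted here since this is only an illustration.
	fmt.Println("see incrementalBackup for the sketched flow")
}
```

A first full backup would call GetAllocated instead of GetDelta, which covers the "only 10 gig of the terabyte was ever written" case mentioned above.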
So where are we today? We've got a cross-company, community-driven team. It was originally started by a person from Dell EMC who has since left the Kubernetes work; Ivan Sim, Prasad Ghangal, and Carl Braganza are from Veeam. We've had input from pretty much everybody in the DP working group: people from Red Hat, people from Dell EMC, people from HP, IBM. I think it's been a really good collaborative effort. We're still targeting getting the KEP merged in 1.30; it may or may not make it, we'll see, but it's under review and we're still taking comments on it. We have made the modifications to the CSI spec, gotten those merged, and actually merged the code to support them. We have a prototype of this working at this point, and we're planning to set up a new repo for the CBT work under kubernetes-csi, because this is going to be alpha for a while and we didn't want to merge it in with the existing GA code.

So, back to the working group in general: we're starting to wrap up on change block tracking, which has been a major effort, and we're looking for places to go next. One of the areas of discussion right now is replication, and how we can leverage storage system features that would allow us to control replication of volumes between storage systems; there are some proposals on that. We also want to look a little higher up the chain at things like how we can replicate Kubernetes resources or hook up applications so they can replicate. Our first white paper was on the need for backup and data protection in Kubernetes: when you need it and why you need it. Now we're thinking about putting together another white paper that would explain how you make your application backup- and restore-aware. There are some tips and tricks there, because often you'll be recovering from something like a crash-consistent state, and your application has to recover from that. So we're thinking about putting together a white paper that lays out what the tricks are and the things you can do.

We'd love to have people get involved. This is not just for people who work on data protection software or storage systems; we'd very much like application developers and end users to give us input. What do you need? What do you see as directions we can be going in? We have our home pages here. We do a meeting every two weeks on Wednesdays at 9am Pacific. We have the mailing list and the Slack channel. And I think, right, we're supposed to have a QR code. I think that's it for our presentation. We're happy to take any questions you might have at this point.

Thanks, over here in the white. Going to your disaster recovery: are there any specific projects that you're looking at or that you're using? I was thinking of, you know, Kanister, or also, you know, Kasten K10, for example. How does that come into play? Or is there still a huge menu that you can use that you're looking at?

So we are not prescribing backup systems from the working group. We have vendors from multiple different backup applications, Dell EMC, Veritas, ourselves at Veeam with Kasten, so it's not really appropriate for us to say, hey, this is the best one. Obviously, what I work on is best. So yeah, there are a number of products out there. We do have open source projects like Kanister and Velero as well that can handle this. You're certainly welcome to come by and discuss what would work for you, or we can give you some recommendations. With the usual grain of salt, right? I mean, if I'm talking about K10, obviously I'm going to talk up K10.
Where do you find that white paper you mentioned for storage? Where do you find that white paper?

It's on our homepage, right? On the repository, the GitHub repository. Yeah, it's actually checked in as a Kubernetes thing in Git, but you can get it from there.

Great, thank you.

Yeah, thank you. Any other questions? Over here? Is that optional? So the question was: will the changes in the CSI driver to support CBT be required or optional? Those are optional. If you look at the CSI spec, actually, there is a very small number of APIs that are actually required; I think it's only mount and unmount. The rest of them are optional. Yeah, this is definitely an advanced feature.

Okay, well, we can go ahead and wrap up then. Thank you very much for coming. If you do have stateful workloads, we'd advise you, we'd ask you, to come check out the group, take a look at the white paper, and see if it makes sense for you. If you have comments, we're open to them, and we'd love to see you come by the working group and just tell us what you need or what you think about what we're doing. And thank you so much. Thanks.