Hello, everyone. Today, Michelle and I will give an intro and deep dive of Kubernetes SIG Storage. My name is Xing Yang. I work at VMware on the cloud native storage team. I'm also a co-chair of SIG Storage. Hi, I'm Michelle. I'm a software engineer at Google, and I am a SIG Storage tech lead. That's Michelle.

Here is today's agenda. First, we'll talk about who we are and what we did in the 1.28 release. Then we'll talk about what we're working on in the Kubernetes 1.29 release and what features we are designing and prototyping, and finally, how to get involved.

In SIG Storage, Saad Ali and myself are co-chairs. Michelle and Jan are tech leads. Beyond the SIG leads, we also have many other contributors in SIG Storage. There are more than 5,000 members in the SIG Storage Slack channel, and we have several other Slack channels as well. We have about 30 unique approvers for SIG-owned packages.

What we do in SIG Storage is defined in our charter. SIG Storage is a special interest group that focuses on how to provide storage to pods running in your Kubernetes cluster. The most notable features in SIG Storage include persistent volume claims (PVCs) and persistent volumes (PVs), storage classes, and dynamic provisioning. SIG Storage owns the volume plugins. In addition to persistent volumes, which persist data beyond the pod's lifecycle, we also support ephemeral volumes, such as secrets, config maps, and others, that can provide scratch space for pods and are coupled to the pod's lifecycle. We also support the Container Storage Interface (CSI), so that a storage vendor can write a driver and have it work in Kubernetes and other container orchestration systems. CSI is for block and file storage. We also have a related project, COSI, that supports object storage.

So let me talk about what we did in the 1.28 release. We have two GA features in 1.28. The first one is retroactive default storage class assignment. This feature allows a PVC without a storage class name to be updated to use the default storage class when one becomes available.
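As an illustrative sketch of this feature (the claim, class, and driver names below are made up), a PVC that leaves `storageClassName` unset can be retroactively assigned once a default StorageClass appears:

```yaml
# Hypothetical PVC with storageClassName deliberately omitted (not "").
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-claim        # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  # storageClassName intentionally unset
---
# Once a StorageClass carrying the default annotation is created,
# the retroactive assignment feature fills in spec.storageClassName
# on the pending claim above.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: example-default-sc   # hypothetical name
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: example.csi.vendor.com   # hypothetical CSI driver
```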
The feature differentiates the case where the storage class name is set to an empty string from the case where it is not set at all in a PVC. With this change, a PVC that has its storage class name set to an empty string can be bound only to PVs that also have their storage class name set to an empty string. However, a PVC with a missing storage class name can be updated later once a default storage class becomes available. If the PVC gets updated, it will no longer bind to PVs that have their storage class name set to an empty string.

The second GA feature in 1.28 is non-graceful node shutdown. This feature was introduced in Kubernetes 1.24, moved to beta in 1.26, and now it is GA. Note that non-graceful node shutdown is different from graceful node shutdown. A node shutdown is graceful if the shutdown is detected by the kubelet, and the kubelet ensures that the pods follow the normal pod termination process during the shutdown, so the node is effectively drained. Graceful node shutdown relies on the systemd inhibitor locks mechanism. However, this mechanism is not supported on every system. If it is not supported on a system, a shutdown command will not trigger the inhibitor lock and the kubelet will not detect the shutdown. There are also two grace period parameters that need to be configured in the kubelet for graceful node shutdown to work. If these are not configured properly, graceful node shutdown will not be triggered either.

When a node is shut down but the shutdown is not detected by the kubelet's node shutdown manager, pods that are part of a StatefulSet will be stuck in terminating status on the shut-down node and cannot move to a new running node. This is because the kubelet on the shut-down node is not available to delete the pods, so the StatefulSet controller cannot create new pods with the same names. If there are volumes used by the pods, the VolumeAttachments will not be deleted from the original shut-down node, so the volumes used by these pods cannot be attached to a new running node. As a result, the application running in the StatefulSet cannot function properly.
If the original shut-down node comes back up, the pods will be deleted by the kubelet and new pods will be created on a different running node. If the original shut-down node does not come back up, these pods will be stuck in terminating status on the shut-down node forever.

The non-graceful node shutdown feature allows stateful workloads to move to another running node if the original node is shut down unexpectedly or ends up in a non-recoverable state. To use this feature, you need to apply the out-of-service taint (`node.kubernetes.io/out-of-service`) to the node that is shut down. After that, the Pod GC controller will forcibly delete the pods, and the attach-detach controller will forcibly detach the volumes and allow the VolumeAttachment objects to be deleted.

In 1.28, we also have two features that are staying in beta, where we fixed some bugs. The first one is SELinux relabeling with mount options. We have an existing feature that allows the volume ownership and permission change to be skipped during mount to speed up pod startup time; however, that feature does not apply to SELinux-enabled systems. This SELinux relabeling feature tries to address that gap by mounting volumes with the correct SELinux context to get the same speedup. The second is robust volume manager reconstruction. This feature is a refactoring of the volume manager. It allows the kubelet to populate additional information about how existing volumes are mounted during kubelet startup, so that it is easier to rebuild and clean up the volumes.

In 1.28, we also worked on some other features. Recovery from resize failure is a feature that was introduced as alpha in the 1.23 release. It allows users to cancel previously issued volume expansion requests, assuming they have not yet succeeded or have failed. It also allows users to retry an expansion request with a smaller value than the originally requested size in the PVC spec resources, again assuming the previous request has not yet succeeded or has failed. In 1.28, we made some API changes.
We renamed the resize status in the PVC status to allocated resource statuses to make it more general, and made it a map so that it can be used for other cases, such as the new volume attributes class feature that Michelle will discuss later.

In 1.28, we also added the PV last phase transition time alpha feature. With this change, the persistent volume status now has a lastPhaseTransitionTime field. It holds a timestamp of when the volume last transitioned to its current phase. For newly created volumes, the phase is set to Pending and the last phase transition time is set to the current time.

CSI migration is something we have been working on for several releases. In the 1.25 release, the core CSI migration feature moved to GA. CSI migration for OpenStack Cinder, Azure Disk, Azure File, AWS EBS, GCE PD, and vSphere has all moved to GA now. Some in-tree plugins are already removed; others are targeted for removal. In this table, we see some in-tree drivers that are getting removed. These in-tree drivers do not go through CSI migration. The GlusterFS in-tree driver was removed in the 1.26 release. Both the Ceph RBD and CephFS in-tree drivers are deprecated in 1.28, and we are targeting code removal in the 1.31 release.

That's all I have. Now I will hand it over to Michelle to talk about what we are working on in the 1.29 release.

Thank you, Xing. Yeah, so for the upcoming 1.29 release, we are targeting a few features to be promoted to beta. First is the PV last phase transition time, which Xing talked about. Another feature we are going to promote is the honor PV reclaim policy feature. This addresses a long-standing issue where, if you deleted the PV object before the PVC object, the volume could be leaked even though it had the Delete reclaim policy. With this feature, we are addressing that issue, and the order of deletion between the PV and PVC objects won't matter. Next slide.
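As an illustrative sketch of the lastPhaseTransitionTime field mentioned above (the name and timestamp are made up), the new PV status looks roughly like this:

```yaml
# Hypothetical PV status showing the lastPhaseTransitionTime field;
# it records when the volume last entered its current phase.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv           # hypothetical name
status:
  phase: Bound
  lastPhaseTransitionTime: "2023-11-01T10:00:00Z"   # illustrative value
```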
In addition to these beta features, there are a few new features that we are targeting for alpha in 1.29. First is changed block tracking. This feature provides a common API to get the incremental blocks that have changed between two volume snapshots. The feature is intended for block-based storage and is going to be used by backup software to take efficient backups. If you're interested in this feature, please join the Data Protection Working Group to learn more and get involved.

Another new feature that we plan to introduce is modifiable PVCs. This lets you modify certain volume attributes that are supported by the storage provider, such as IOPS and throughput. There will be a new object called VolumeAttributesClass that is very similar in concept to a storage class, but it has the property that you can modify the PVC object in order to change to a different volume attributes class. So let's see an example of how this will look. Next slide.

Here in this example, we define two different volume attributes classes, silver and gold, and each of them specifies different IOPS parameters. The PVC is first created with the silver class. Some time later, it turns out that we need more performance, so we update the PVC object to the gold class. When this happens, Kubernetes will make a ModifyVolume call to the CSI driver, and the CSI driver will then update the underlying volume with the new attributes. So be on the lookout for this new feature; we would love to hear your feedback. Next slide.

Beyond these features, we have more projects that are in the prototyping and design phases. We would love to get your early thoughts and help on these features. First is volume expansion with StatefulSets. This is a long-requested feature: being able to update the PVC template in the StatefulSet in order to trigger volume expansion.
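As a sketch of the silver/gold example walked through above (the API group, driver name, and parameter keys are assumptions; the parameters are driver-specific):

```yaml
# Hypothetical VolumeAttributesClass objects; parameter keys are made up.
apiVersion: storage.k8s.io/v1alpha1
kind: VolumeAttributesClass
metadata:
  name: silver
driverName: example.csi.vendor.com   # hypothetical CSI driver
parameters:
  iops: "500"
---
apiVersion: storage.k8s.io/v1alpha1
kind: VolumeAttributesClass
metadata:
  name: gold
driverName: example.csi.vendor.com
parameters:
  iops: "1000"
---
# The claim starts on silver; updating volumeAttributesClassName to
# "gold" later triggers a ModifyVolume call to the CSI driver.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-claim        # hypothetical name
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 100Gi
  volumeAttributesClassName: silver
```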
Today, the only way to resize the volumes being used by a StatefulSet is to modify the PVCs directly.

The next feature is storage capacity scoring. This is an extension of the storage capacity tracking feature, which is primarily designed to make the scheduler aware of local volume provisioning capacity. This enhancement will add priority rules to the scheduler so that you can configure the scheduler to do either bin packing or spreading for dynamic provisioning of these volumes.

Lastly, we have an ongoing investigation into consolidating our CSI sidecars into a single component. This has many benefits, such as simplifying our release process so that we can get releases out quicker and do regular patch releases of these sidecars, which is not something we can easily do today across roughly ten different repositories. Combining the sidecars into a single binary can also improve the overall resource consumption that these sidecars require today: we can do things like share watches and caches and use shared informers to reduce the overall resource usage. There's going to be a lot of work needed to accomplish this, so we're going to need a lot of help. If anyone is looking for ways to contribute to the SIG, this would be a good project to get started on. Next slide.

Yeah, so as you can see, there are a lot of different projects and features that we are working on in the SIG, and we always welcome help. If any of these projects sound interesting to you, please reach out to us through our Slack channel or join one of our SIG meetings, which happen every one to two weeks. Alright, that concludes our presentation today. Thank you for watching, and we look forward to seeing you in the SIG. Thank you.