Hello everyone. Today we are going to give a presentation about Kubernetes SIG Storage. My name is Xing Yang. I work at VMware in the cloud storage team. Hi, I'm Michelle. I am a software engineer at Google. So here are the SIG Storage leads. Saad Ali from Google and myself are co-chairs. Michelle from Google and Jan from Red Hat are tech leads of SIG Storage. So we are going to talk about what SIG Storage is, what we did in the 1.19 release, what's coming in 1.20, and our future plans. We'll talk about cross-SIG projects and working groups, and finally how to get involved. So what is SIG Storage? SIG Storage is a special interest group that focuses on how to provide storage to pods in your Kubernetes cluster. SIG Storage's scope is the storage control plane. It provides a way for containers in pods to consume block or file storage. This can be persistent, long-term storage that lives beyond the pod's lifecycle, or it can be ephemeral, temporary storage which becomes available when the pod is started and goes away when the pod goes down. SIG Storage is responsible for the lifecycle of volumes used by pods. This includes provisioning a new volume, attaching the volume to a node, and mounting it so that the pod can use it; unmounting, detaching, and deleting the volume when it is no longer needed; and taking snapshots so that they can be used to restore the volume if the original volume is corrupted for some reason. SIG Storage also looks at how to influence scheduling decisions based on topology information, checking whether the storage is accessible from a node and making sure the pod is scheduled to a node which has access to the storage. SIG Storage is also responsible for managing storage capacity, managing quota based on capacity or number of resources, and providing the ability to expand a volume if it runs low on space. So here are some of the SIG Storage features. SIG Storage owns the persistent volume and persistent volume claim feature.
This allows a storage vendor to create a volume and persist data in this volume, which can be preserved even if the pod goes away. We have a storage class concept: a storage class provides a way for administrators to describe the classes of storage they offer. Different classes might map to different quality-of-service levels. In dynamic provisioning, the storage class is used to find out which provisioner should be used and what parameters should be passed to the provisioner when creating the volume. Kubernetes volume plugins include in-tree plugins, out-of-tree FlexVolume, and CSI drivers. Both in-tree plugins and FlexVolume are deprecated; CSI is the recommended way to write plugins. The Kubernetes implementation of CSI has been GA since the 1.13 release. SIG Storage has been working on migrating from in-tree plugins to out-of-tree CSI drivers. New features are only added to the CSI drivers. We will be talking about the status of CSI migration in later slides. CSI is supported by multiple container orchestration systems and storage vendors. Other than persistent volumes, there are also ephemeral volumes. An ephemeral volume is specified directly in a pod spec. It's mounted in the pod as a directory, and data can be stored in files under that directory. Ephemeral volumes include secrets, config maps, CSI inline volumes, and so on, and their lifecycle follows the lifecycle of the pod. At the bottom of this slide, you will see the SIG Storage home page. From this page, you can find lots of information about the SIG. So now I'm going to talk about what we did in the 1.19 release. There were a few features that went beta in 1.19. Both Azure Disk and vSphere CSI migration were promoted to beta in 1.19. This is part of an effort to move in-tree cloud provider plugins to out-of-tree CSI drivers.
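The dynamic provisioning flow described above can be sketched with a minimal manifest. This is an illustrative example only: the provisioner name `csi.example.com` and the `type` parameter are hypothetical placeholders, not something from the talk.

```yaml
# A StorageClass names the provisioner and the parameters it should
# receive when creating volumes (driver name and parameter are hypothetical).
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
provisioner: csi.example.com
parameters:
  type: ssd
---
# A PVC requests storage from that class; the control plane dynamically
# provisions a matching PersistentVolume and binds it to the claim.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-claim
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: fast
  resources:
    requests:
      storage: 10Gi
```

A pod then references the claim by name in its `volumes` section, without needing to know anything about the underlying storage system.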
To use this feature, users must also deploy the equivalent CSI driver, and enable both the CSIMigration feature flag, which is on by default in the 1.19 release, and the CSI migration feature gate for the corresponding cloud provider. In 1.19, CSI on Windows was also promoted to beta. CSI Windows support was introduced as an alpha feature in the 1.18 release. It relies on a CSI proxy to perform privileged operations, such as mounting and formatting disks, because Windows containers are not privileged. CSI drivers communicate with the CSI proxy through a gRPC API. Supported protocols include block, SMB, and iSCSI. The other feature that got promoted to beta was immutable secrets and config maps. This allows secrets and config maps to be marked read-only. Both volume expansion and volume snapshots stayed in beta in the 1.19 release, but we made lots of improvements. For volume expansion, we added offline volume expansion support detection, so a CSI driver can return a failed-precondition error code if only offline expansion is supported but the user has requested an expansion while the volume is online. This way, the Kubernetes CSI resizer sidecar will stop calling ControllerExpandVolume. For volume snapshots, we added a validation webhook to validate API objects, and moved the snapshot APIs and client library to a separate Go package. In 1.19, we also introduced a few new alpha features. We introduced CSI storage capacity tracking, adding a new CSIStorageCapacity API object to the storage API group. Without this feature, pod scheduling was done without considering that the remaining storage capacity might not be enough to start a new pod. With this feature, a CSI driver can report available capacity associated with node topology and storage class through the GetCapacity CSI function, and the Kubernetes scheduler will take this information into account when choosing a node for a pod. This feature is a stepping stone for supporting dynamic provisioning for local volumes.
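As a quick sketch of the immutable secrets feature mentioned above, setting `immutable: true` on a Secret (or ConfigMap) marks its data as read-only; the value shown here is purely illustrative.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: app-credentials
# Once set, the data can no longer be updated; the object must be
# deleted and recreated to change it. This also lets the kubelet
# skip watching the object, reducing API server load.
immutable: true
stringData:
  password: example-password   # illustrative value only
```
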
The second alpha feature introduced in 1.19 is generic ephemeral volumes. Kubernetes has volume plugins whose lifecycle is tied to a pod and which can be used as scratch space; for example, we have the built-in EmptyDir volume type. Other plugins of this kind can be used to load some data into a pod; for example, we have the built-in config map and secret volume types, or CSI inline volumes. This new generic ephemeral volume is different. It allows any existing storage driver that supports dynamic provisioning to provision an ephemeral volume with the volume's lifecycle bound to the pod. This can be used to provide scratch storage that is different from the root disk. All the parameters used in a storage class for volume provisioning are also supported by generic ephemeral volumes, and all features supported with persistent volume claims are supported, including storage capacity tracking, snapshots, and resizing. The third new alpha feature introduced in 1.19 is CSI volume health. Without this feature, Kubernetes does not check whether a volume is healthy after it is provisioned. With this feature, CSI drivers can share abnormal volume conditions from the underlying storage system with Kubernetes so that they can be reported as events on PVCs or pods. This feature serves as a stepping stone towards programmatic detection and correction of volume health problems by Kubernetes. The last new alpha feature introduced in 1.19 is the CSI driver policy for fsGroup. We added a new alpha-level field to the CSIDriver object, so CSI drivers can now specify whether they support volume ownership and permission modifications. That's all the beta and alpha features we added in 1.19. Now I'm going to hand it over to Michelle to talk about what's coming in the 1.20 release. All right. Thank you, Xing. In 1.20, we have a lot of good things planned that are graduating. First is the CSI volume snapshot feature. We are targeting GA in 1.20.
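As a rough sketch of the generic ephemeral volume idea (alpha in 1.19, so the exact API shape could still change), a pod embeds a PVC template inline, and the resulting volume is provisioned when the pod starts and deleted when it goes away. The storage class name here is a placeholder.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: scratch-pod
spec:
  containers:
    - name: app
      image: busybox
      command: ["sleep", "3600"]
      volumeMounts:
        - name: scratch
          mountPath: /scratch
  volumes:
    - name: scratch
      # The embedded claim template is provisioned like a normal PVC,
      # but its lifecycle is bound to this pod.
      ephemeral:
        volumeClaimTemplate:
          spec:
            accessModes: ["ReadWriteOnce"]
            storageClassName: fast   # placeholder class name
            resources:
              requests:
                storage: 1Gi
```
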
This feature gives you the ability to take snapshots of a persistent volume claim and then create new PVCs from those snapshots. This is very important for supporting backup and restore use cases in your applications. The next feature that we are planning to graduate from alpha to beta is non-recursive volume ownership, specifically for the fsGroup setting. This feature gives you the ability, in your pod, to specify that you only want to change the ownership of a volume if the fsGroup setting has changed. This is an important performance enhancement if you have a volume with a lot of files on it and the ownership doesn't change very often; it will help your pods start up faster. The third feature that we are targeting for beta in 1.20 is related: the CSI driver policy for fsGroup. This is the ability for a CSI driver to specify whether it supports fsGroup modifications or not. This is planned to graduate to beta in 1.20. The last feature graduation is Azure File CSI migration, and this is targeting beta in 1.20. I believe after Azure File graduates, that covers all of the in-tree cloud provider plugins, which include GCE PD, AWS EBS, vSphere, Azure Disk, and OpenStack. After 1.20, all of these CSI migration features will be beta. That's it for the graduating features. As for new features being introduced, we are targeting a new feature to pass pod service account tokens to CSI drivers. That's targeted to be alpha in 1.20. This is a really important feature for enabling CSI ephemeral volumes, such as cert-manager and secrets store drivers, that act on behalf of pods. Being able to get the pod's service account token and make queries on behalf of the pod is an important capability for those types of CSI plugins. That's all that we have planned for 1.20.
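The snapshot-and-restore flow described above can be sketched as two objects; the class and claim names are placeholders, and the example uses the `v1beta1` snapshot API that was current around the time of the talk.

```yaml
# Take a snapshot of an existing PVC.
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  name: data-snap
spec:
  volumeSnapshotClassName: csi-snapclass   # placeholder snapshot class
  source:
    persistentVolumeClaimName: my-claim
---
# Restore: create a new PVC using the snapshot as its data source.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-claim-restored
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: fast                   # placeholder storage class
  dataSource:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: data-snap
  resources:
    requests:
      storage: 10Gi
```
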
Moving on to beyond 1.20, we have a number of different features in the pipeline. First are the graduating features. First is volume expansion. This has been in beta for quite a few releases now. We are working on trying to iron out all the remaining design issues, and we're hoping to target GA of this feature in 1.21. An important change we're making is that we are deprecating the online/offline volume expansion CSI plugin capability. Online and offline expansion are still supported; it's just that the actual CSI capability will be deprecated, and instead specific error codes will be used to determine whether a plugin supports online or offline expansion. The next major feature that we're planning on graduating within the next couple of releases is CSI migration. This is the effort to move all in-tree volume plugins to their CSI counterparts. Once all the plugins are beta in 1.20, the next phase is going to be targeting on-by-default in 1.21. Assuming all goes well, then GA of the feature in 1.22, followed by the removal of the in-tree plugin code. The important thing to take away here is that if you are managing your own Kubernetes distribution and you are using one of these in-tree cloud provider implementations, then you will definitely want to pay attention to the CSI migration feature and work towards deploying and using it in your environments, as we plan to eventually remove these plugins from the core of Kubernetes starting around the 1.22, 1.23 timeframe. The last graduating feature that we have targeted is CSI Windows, which is support for the Windows OS in CSI drivers. I believe the target here is GA in 1.21. Some of this is a little bit up in the air, because there is a new effort going on in SIG Windows to allow privileged containers on Windows.
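The volume expansion feature discussed at the start of this section is driven by editing the claim itself: if the claim's StorageClass sets `allowVolumeExpansion: true`, raising the PVC's storage request triggers the resize. The class name and sizes below are placeholders.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-claim
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: fast    # the class must set allowVolumeExpansion: true
  resources:
    requests:
      storage: 20Gi         # raised (e.g. from 10Gi) to request an expansion
```

Whether the resize can happen while a pod is using the volume (online) or only when it is unmounted (offline) depends on the CSI driver, which is exactly what the error-code-based detection mentioned above is meant to signal.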
That might adjust some of the timelines here, but I think for the most part the main plan we have right now is to target GA of at least CSI proxy Windows support in 1.21. Those are all the graduating features. We have a number of other features in the design and prototype phase, so we should see new implementations in the upcoming releases. First up is recovery from volume expansion failure. Right now, the way the volume expansion feature works, you can't actually undo a volume expansion request once it has started, even if it is failing. There is a design and prototype effort going on to figure out how we can implement recovery behavior in a safe way, and also have it aligned with some of the other expansion efforts going on in other SIGs; for example, there is a pod resizing effort, and we will want to make sure our APIs are consistent with those. Another feature currently under design is support for non-recursive SELinux relabeling and the corresponding CSI driver configuration. This is very similar to the effort currently going on for fsGroup, but applied to SELinux. Another feature is adding more plugin support for CSI migration. The initial phase of CSI migration targeted the in-tree cloud providers, and the next phase will target some of the non-cloud-provider implementations, starting with Ceph. That is currently under prototype. Another feature that we are actively designing and discussing right now is volume groups. This is the notion that a group of PVCs are related to each other in some way; for example, all the PVCs that are replicas of a particular stateful set. This concept is going to enable various use cases, including snapshot consistency across multiple volumes, and failure domain spreading across volumes in the same group.
Another feature being actively worked on right now is the generic data populator feature. This feature offers the ability to create third-party populators that can take a newly provisioned PVC, inject it with some data, and then hand it off to the actual pod that's using that volume. Some use cases include being able to populate particular data sets into a volume before giving it to a user, and being able to restore data from third-party backup solutions. So that's generic data populators. And then the last major feature that's currently in prototyping is the container object storage interface. This is going to follow the patterns that CSI has developed, in that the container object storage interface is being designed as a portable interface layer across different object storage implementations. This effort is currently under prototype, and there are weekly design meetings for it. So if you are interested in these efforts, please join us in SIG Storage, where you can participate in the discussions for all of these features. All right. In addition to the projects that the SIG is directly working on, we are also involved with other SIGs in Kubernetes on some cross-SIG projects. First up is the data protection working group. This is a working group run by both SIG Storage and SIG Apps. One of the major features the working group is focusing on right now is how to implement quiescing and unquiescing to do application-level snapshots. There is a feature the working group is working on with SIG Node called the container notifier, and this will basically provide a way for another application or controller to run commands in a container to pause and resume file system writes while snapshots are being taken.
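As a sketch of the data populator idea (still in prototype at the time of the talk, so this API shape is illustrative only), a PVC points its data source at a custom resource understood by a third-party populator instead of at a snapshot; the API group, kind, and names here are all hypothetical.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prepopulated-claim
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi
  # Hypothetical custom resource owned by a third-party populator;
  # the populator fills the volume with the referenced data set
  # before the claim is handed to the consuming pod.
  dataSource:
    apiGroup: example.com
    kind: DataImport
    name: sample-dataset
```
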
So that's a very important effort that is currently under design and prototype. In addition, we're working with SIG Apps on a number of features related to stateful sets. First is supporting volume expansion for stateful sets through the stateful set volume claim templates. The other feature we're working on is being able to clean up PVCs when a stateful set is deleted or scaled down. These are both efforts that we're co-working on with SIG Apps, so be on the lookout for those. And then with SIG Scheduling, we are working on a new alpha feature to prioritize pod scheduling based on the available volume capacity on a particular node. This feature is targeting alpha in 1.20. All right. As you can see, there are a ton of different features and projects that we are working on in SIG Storage, and we would definitely appreciate all the help we can get. So if you are interested in getting involved, please go to our SIG Storage community page. There you can find the details for our bi-weekly meetings, at 9 a.m. Pacific time every other Thursday. Please join the mailing list so that you can get access to the calendar. The site also has links to our Slack channel, which you can join; feel free to ask questions there, or if you want to get involved with anything, ask in Slack as well. One of the best ways to start getting involved is to just start looking at the code. One of my favorite methods is to start at the main function and walk down from there until you get to the part you need. It's a great way to run through the code, starting at the top level and seeing where it goes. We have a lot of different bugs and issues filed every day, and we could definitely use help fixing them. So please, if you're interested, look through the open SIG Storage issues in the kubernetes/kubernetes repo, along with the Kubernetes CSI repos. Look for the help-wanted labels.
There's also a good-first-issue label for those who are getting started. And if you're not able to find anything there, feel free to just ask in Slack, and someone can help you out and try to find something for you. All right. Another way folks can help out is to help write features. As you can see, we have a lot of features in progress, and we can definitely use help implementing various parts of them. So again, if any of the features we're working on interests you, please attend one of our SIG Storage meetings; go ahead and ask in the chat, and we can connect you with the right people to get started on writing these features. At every SIG Storage meeting, we go through our SIG Storage planning spreadsheet, which is where we track all of the features we're working on for a specific Kubernetes release. And at the beginning of every quarter, we plan the features that we want to work on for the next Kubernetes release. So this is definitely a great place to get started. And if there is a new feature that you're interested in that isn't being actively worked on and you want to work on it, please also attend the SIG Storage meetings to talk about it, so that we can get it onto our planning spreadsheet, make sure all the right people are involved, and make sure the feature will be appropriately tracked by the Kubernetes release team. All right. So that's how you can get involved with the SIG. I guess this session is the last session of KubeCon, but if you're interested in seeing the other storage sessions that were going on, here are the links to a lot of the other storage presentations that have already happened. I think all of these videos should be available on YouTube in the next couple of days. So if you see something that is interesting to you, please feel free to check out those other sessions. All right.
So I think we are done with the presentation here, and I guess we'll go to Q&A now. Thank you.