Welcome, everyone. Today we're going to be talking about how we can connect two different ecosystems, OpenStack and Kubernetes, and the different projects we have to do this. Who are we? We are part of the Red Hat OpenStack Storage team. Today we'll have Christian talking about Ember CSI and Manila CSI; he's an OpenShift contributor and Ember CSI contributor. We'll have Eric Harney talking about Cinder and Cinder CSI; he's a Cinder contributor. We will also have Tom Barron on the Q&A station afterwards; he's a Manila contributor, former Manila PTL, and Manila CSI contributor. And there's also me; I basically contribute to Cinder and Ember CSI.

Before we can discuss the different projects that connect these two worlds, we have to understand some basic concepts that I will be introducing, and then we'll go over the different projects. I always like to start from the beginning, back when everybody was saying that there was no need for storage in containers because everything was microservices and stateless. As a reminder, it's good to know that local persistent volumes didn't exist in Kubernetes until version 1.7, they didn't reach GA support until 1.14, and there was no dynamic provisioning for them back then. This shows what a long process adding storage to Kubernetes has been. In this process there have been in-tree drivers, and then out-of-tree drivers, called FlexVolumes, came along.

That is what happened in Kubernetes, but from a vendor's point of view there are many container orchestrators: we have Docker Swarm, we have Mesos, we have Cloud Foundry. This creates problems for vendors, because they have to maintain a lot of different drivers, but also for users, because if they're using a specific storage backend in one platform, they may not be able to change to another platform that doesn't have the driver. There's also a problem with code reviews: if the drivers are in-tree in Kubernetes, then review bandwidth is consumed looking at that driver code. So Kubernetes wanted to move those drivers out of tree, but the FlexVolume interface was not enough to do everything they wanted.

So what's the solution? The solution is a new standard, obviously. This standard is called the Container Storage Interface, and we all know how the story about standards goes: there's always a new one that tries to resolve and merge all the previous ones. How is this standard different from the others? First of all, it's an open standard that everybody can contribute to, and it's a joint effort between container platforms and storage vendors, so everybody is aligned on getting this done and supported everywhere in the most standard way.

The CSI specification defines, first of all, architectures. Here is one of the architectures it defines, where you have your controller nodes, or masters in Kubernetes, and your minions, or worker nodes, and you also have your storage. Within this architecture, the storage can have different networks, one for management and another for data. On the controller or master nodes there would be a controller CSI plugin that implements a specific set of gRPC calls, and this service communicates with the orchestrator via Unix sockets. There's also a different plugin service, the node CSI plugin, and this one only needs to connect to the data network.
And it also implements a different set of operations to interact with the orchestrator. What CSI does not define is how we deploy this whole thing. How do we deploy the controller CSI plugin? How do we deploy the node CSI plugin? That is outside the scope of the spec. Another thing the spec does not define is how the orchestrator must architect its code to interface with these gRPC calls; that is up to the different orchestrators.

For example, in Kubernetes the implementation is a set of sidecar containers that interface with the Kubernetes objects. The external-provisioner sidecar, for example, watches the PVCs, PVs, and storage classes and makes the calls to the controller CSI plugin to create or destroy PVs. On the worker side, we have the kubelet, which is the one that interfaces with the different node CSI plugins running on a specific node, and the sidecar container that is relevant here is the node-driver-registrar; this one makes sure to inform the kubelet that there's actually a plugin running there. Most CSI plugins currently do not discriminate between masters and minions, and they deploy everything on the worker nodes.

From a vendor's point of view, the benefits of CSI are clear: you only have to develop one driver and you will support all the different container orchestrators. But what about from a developer's point of view? From a developer's point of view, you just create a persistent volume claim, and that works for mostly every platform, I mean, for every Kubernetes platform. You can use the same YAML; you only need to change the storage class name when going from your public to your private cloud. So you are decoupling all the storage information from the application, just like Kubernetes does, so it stays in line with the philosophy of Kubernetes.

There are two important things to remember in Kubernetes, and those are the differences in volume access types and access modes. CSI defines two access types, block and mount, and several access modes, five different ones. In Kubernetes they are named slightly differently: block is still block, but mount is called filesystem, and this is the default for Kubernetes. The access modes are reduced to three: ReadWriteOnce, ReadOnlyMany, and ReadWriteMany. One important aspect to highlight here is that ReadWriteMany means something completely different for block than for filesystem. For block, it requires that the software using the volume knows that the block device is actually being shared, which is the case for some databases and for virtualization, for example; whereas with a filesystem, the application doesn't really need to care about that, for example if you're running Nginx or something like that.
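To make that concrete, here is a minimal sketch of a persistent volume claim of the kind described here; the claim name, storage class name, and size are placeholders, and volumeMode and accessModes are where the block/filesystem and RWO/ROX/RWX choices are expressed:

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: my-app-data                       # hypothetical claim name
    spec:
      storageClassName: my-csi-storage-class  # only this needs to change between clouds
      accessModes:
        - ReadWriteOnce                       # or ReadOnlyMany / ReadWriteMany
      volumeMode: Filesystem                  # the default; use Block for a raw block device
      resources:
        requests:
          storage: 10Gi

Moving the same application between clouds is then mostly a matter of pointing storageClassName at a different class.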
Now, Eric will continue with the cloud providers. Thank you, Gorka. Yes, Kubernetes cloud providers are modules that provide integration with the underlying infrastructure provider. The amount and types of integration they provide vary, and integration is optional, so this depends on what provider you're using and how you're deploying it on your cloud. These providers unlock features like node address and zone discovery, cloud load balancers for services that you've configured as type LoadBalancer, IP address management, and cluster networking via VPC routing tables, which can be helpful in combining multiple sets of nodes over larger networks.

Where is the code for these providers? In-tree, we have the kube-controller-manager, and out-of-tree providers are used via the cloud-controller-manager. Cloud Provider OpenStack is the provider that connects Kubernetes to your OpenStack cloud. This enables you to run hundreds of independent container clusters on an OpenStack cloud. And since you have Keystone as part of OpenStack, this provides hard multi-tenant separation for multiple Kubernetes clusters with independent ownership. You then have Kubernetes admins that are regular OpenStack users, not OpenStack admins, but they can request resources from the OpenStack cloud to scale their deployment as needed on their own. So this is kind of the whole picture of it. As I was saying, the in-tree providers are managed by the kube-controller-manager, while out-of-tree providers are managed by a custom cloud-controller-manager that interacts with the underlying cloud infrastructure. The OpenStack provider was in-tree and is now an out-of-tree provider. It integrates with OpenStack services like Octavia, Cinder, Keystone, Manila, the Barbican key management service, and Magnum, and some of these integrations require pods to run on your Kubernetes minions.

Cinder, for those of you that are not familiar, is the OpenStack block storage service. Traditionally it provides volumes to VMs. Cinder supports dozens of different storage drivers for different hardware backends, and delivers storage over protocols such as iSCSI, Fibre Channel, RBD, and NFS to provide block storage. It has a full set of features like snapshots, cloning, migration, quality-of-service management, Fibre Channel zoning, and pool management. Cinder CSI takes these features provided by Cinder and lets you use them for storage when running Kubernetes on OpenStack. This used to be part of the in-tree OpenStack cloud provider; now it's out of tree as the Cinder CSI plugin. On the Kubernetes front it supports topology, snapshots, volume cloning, volume expansion, inline volumes, and raw block volumes; filesystem volumes are provided by attaching a block device, formatting a filesystem onto it, and mounting it. You can also do RWX multi-attach volumes with the block interface, or with a filesystem if the containers are on the same node.

I'll walk through an example of what this looks like. In OpenStack we have our compute nodes, where we're running VMs, and controller nodes, where everything else runs. When we deploy Kubernetes on OpenStack, we have the Kubernetes master running on a VM with the Cinder CSI controller plugin, and then minions running in other VMs where we run the Cinder CSI node plugin. So when we do a volume attachment, Kubernetes expresses a need to have a volume present, and the CSI controller pod reacts to this and tells Nova to do the attachment; this is an OpenStack Nova API call. Nova will call Cinder, which will then talk to the backend and map the volume on the storage array, and then Nova will attach that volume to the VM. At this point the CSI controller part is finished, and Kubernetes will call the CSI node plugin, which will present the volume where Kubernetes asked for it to be attached.
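As an illustrative sketch of how this is wired up on the Kubernetes side, a storage class for the Cinder CSI plugin might look roughly like this; cinder.csi.openstack.org is the provisioner name used by the plugin in cloud-provider-openstack, while the class name and the volume type and availability zone parameters here are assumptions that depend on your particular cloud:

    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: cinder-gold               # hypothetical class name
    provisioner: cinder.csi.openstack.org
    parameters:
      type: gold                      # a Cinder volume type defined by your cloud admin
      availability: nova              # availability zone; adjust for your cloud
    allowVolumeExpansion: true

A persistent volume claim that references this class then triggers exactly the Nova and Cinder attach flow described above.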
And I think now I'll hand it over to Christian to talk about Ember. Alright, thanks Eric. So Eric introduced Cinder CSI, and Ember CSI also uses Cinder, or part of the code base of Cinder, to provide storage access to pods within Kubernetes and OpenShift clusters.

Ember CSI reuses the existing Cinder driver code and os-brick, but at the same time you don't need to run the Cinder services, nor RabbitMQ or MariaDB. All the metadata required to run it is stored in Kubernetes, in custom resource definitions, and this makes Ember CSI pretty lightweight, much more lightweight than Cinder itself, and it supports bare metal deployments. So in comparison to Cinder CSI, which requires a running OpenStack deployment, Ember CSI can provide storage access for bare metal OpenShift and Kubernetes clusters without needing anything else from OpenStack itself. This is done basically by using a new library called cinderlib, which Gorka has been working on, and it provides basically the same set of features, or many of the features, that come with Cinder, for example snapshots, cloning, expansion, block and filesystem support, and read-write-many block support, as well as topology support within Kubernetes.

Similar to what Cinder CSI does, Ember CSI also uses infrastructure nodes and worker nodes. The infrastructure nodes typically run the CSI controller, where Ember CSI is running, as well as a couple of sidecar containers, for example the attacher, the provisioner, and, if needed and wanted, the snapshotter and resizer. These sidecars are standard CSI containers which are typically used by many different storage backends. Every node that wants to provide storage access for the pods running on it also needs to run a DaemonSet, a CSI driver DaemonSet, and again Ember CSI is running here using cinderlib, and this DaemonSet, or Ember CSI to be more specific, accesses the backend storage directly. There are multiple ways to do that for different drivers; in the future there will be, for example, different container images depending on the driver someone wants to use as a storage backend. The communication between the Ember CSI pods, controllers, and nodes is typically done using Unix domain sockets.

From a deployment perspective it's pretty easy, because as a Kubernetes administrator you don't need to deal with any Cinder-specific settings; you only need to know about your storage backend. For example, if you're using the OpenShift web interface and want to create a new storage backend, let's say for an LVM volume like in this example, you only need to provide a couple of settings that are storage specific but not Cinder specific. So as an administrator you don't need to learn about Cinder and all the details that come with it. From an application developer's point of view, as Gorka mentioned earlier, it's again just creating a persistent volume claim: you define the storage class, and by doing so you automatically get access to the volume or storage that Ember CSI provides for you.
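As a rough sketch of that developer experience, assuming an Ember CSI backend has already been deployed and a storage class has been created for it (the class and claim names below are purely hypothetical), a raw block, read-write-many claim could look like this:

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: bare-metal-data            # hypothetical claim name
    spec:
      storageClassName: ember-csi-lvm  # hypothetical class created for the LVM backend
      accessModes:
        - ReadWriteMany                # Ember CSI supports RWX for raw block volumes
      volumeMode: Block
      resources:
        requests:
          storage: 5Gi

Because it is a raw block volume, the pod consuming this claim would reference it under volumeDevices with a devicePath rather than mounting it as a filesystem.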
Now, sometimes there is also a need for not only block storage but file system storage, and that's where Manila comes into play. Manila is the shared and distributed file system service within OpenStack, supporting multiple backends, as of today approximately 35 different storage backends; some of the more popular ones are CephFS and NFS. It provides self-service provisioning of file shares for compute workloads: if you run multiple virtual machines within OpenStack and you want every virtual machine to have access to the same set of files for your workload, you would use Manila.

It's multi-tenancy aware, meaning that a file share from one tenant is not accessible by other tenants, which is a really valuable property on public and private clouds. In general it's pretty hard, pretty tricky, to provide consistent persistent storage for replicated pods and virtual machines. So with Manila providing this for compute workloads, why not use the same code base, or the same approach, to provide it for containers? This is where Manila CSI comes into play.

Very similar again to Cinder CSI and Ember CSI, you have a couple of controller and node plugins running, with additional protocol-specific deployments if needed. You have the usual sidecar containers running, the provisioners and attachers, and this provides access to persistent file system volumes across multiple nodes and across multiple pods running on those nodes. The deployment itself consists of DaemonSets, StatefulSets, and role-based access controls, and there are basically two ways to deploy it. There's the upstream approach, which uses Helm charts, and in OpenShift there's a way to use the so-called operator, which makes it pretty easy to install Manila CSI because it handles everything for you. So when you deploy Manila CSI as an administrator in OpenShift, you just select the operator from the operator catalog, and it manages and creates storage classes per share type for your later usage. As an application developer, again, like I mentioned, you just need to define a persistent volume claim with a storage class name that references your Manila CSI file share backend, and you're done. This is a great approach, actually, for all these CSI implementations: as an application developer I don't need to deal with storage-specific settings; I just need to know my storage class name, and everything else is handled by somebody else.

So let's summarize what we've heard so far about the different CSI implementations and CSI itself. Currently, storage code is being removed from the Kubernetes code base and migrated out of tree, and the way this is done is by using a standardized interface, the Container Storage Interface, which gives the developers of any storage backend the opportunity to add a driver for it. There are differences between the access modes, for example read-write-many, and between the block and file system modes. If you need to decide which project to use, there are basically two questions to answer. The first one is: are you running Kubernetes or OpenShift on OpenStack? If so, then you decide between Cinder CSI and Manila CSI: Cinder CSI is the easiest way of providing read-write-many access to block storage, and Manila CSI provides read-write-many access to file system, or file share, storage. On the other hand, if you're not running Kubernetes or OpenShift on OpenStack, but just a bare metal deployment, then Ember CSI comes into play and provides, for example, read-write-many block storage for your deployment without the need for OpenStack, Cinder, or any other related services. This ends our talk. We hope it was useful and informational for you, and if you have any questions, please let us know.