Hi. Good morning, everyone. Welcome. I am Sudhir Prasad. I'm a Product Management Director for Container Storage at Red Hat. I have with me Annette Clewett. She is a senior architect, very well-versed in Kubernetes and storage, and she will lead the demo for us. We also have Michael Adam, who is the engineering manager and lead for CNS engineering, and Jose and Erin, who are active contributors and developers in the CNS world. So if you have any deeper questions, they will definitely be staying here after the session to talk about them.

I think most of you are familiar with Kubernetes and how you can run your stateful applications on it. Today we'll look at whether we can run the storage itself on the Kubernetes platform. Why are we treating storage differently? Can it be treated like any other service and become a part of the platform?

If you look at persistent storage, I categorize it into three broad categories. The first one is the platform infrastructure needs: registry, logging, metrics, and other management applications — persistent storage for running the Kubernetes platform itself. The second is your application's persistence needs: the input data you need, the results of your analytics, and things like that. And the last one is mostly local or ephemeral storage. Traditionally, for persistent storage, we have focused on the first two: can we run the platform, and can the application get persistent storage? There is some work going on to make local storage persistent as well. It's not there yet, but it will be shortly.

There are various solutions for persistent storage — every storage company has some persistent-storage-for-containers story — and it's extremely confusing. I'm sure they all have unique value propositions: some do better for specific applications, some do well for infrastructure needs. What we'll discuss here is the design patterns they follow, the trends, and the advantages and disadvantages at a broad-category level.

The first pattern is very simple; this is how everything started. Kubernetes came along as a platform, people wrote a driver or a plug-in, and they had legacy storage, existing storage, or storage from a cloud provider. Through the plug-in, containers on the Kubernetes platform can get persistent volumes. The key thing to note here is that you already had the storage; it's a separate, independent subsystem, and you're trying to make it work for your containers. That's the biggest advantage: you have the storage, you can leverage it, and you can focus on the new platform. The disadvantage of this approach is that it's still a different subsystem. It's not part of your platform, and you're not taking advantage of the newer paradigm with your storage subsystem.
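To make that first pattern concrete, here is a minimal sketch, assuming a pre-existing NFS share at a hypothetical address: the platform just points a PersistentVolume at storage that is provisioned and managed outside of it.

```yaml
# First pattern, sketched: the storage (an existing NFS export here) lives
# outside the platform; a PersistentVolume simply points at it.
# Server name and path are illustrative.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: legacy-nfs-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: nfs.example.com      # pre-existing, separately managed storage
    path: /exports/app-data
---
# The container side claims it like any other volume.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
```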
Let me talk about CSI for a moment here. You may have heard about the CSI interface. There are different orchestration engines, and these plug-ins don't work interchangeably between them. So the idea is: let's make it a standard, and this is the CSI initiative — all these drivers should talk through one interface, so you can use Kubernetes, you can use Mesosphere, and things like that. But the general principle stays the same: I want to create a plug-in for my existing storage subsystem, or a newer one, and make it work.

The second pattern is basically: can we make it API-driven? We don't really need a storage admin to hand developers storage every time they need it. So this one is about how we can publish an API, through a service broker or a service, where developers can go and get their own storage. As long as the admin has given them the privilege or the quota, they can access the storage through the APIs and do their work without asking the storage admin for anything. The key thing here is that although it's API-driven, it's still a different subsystem. You manage it as a different subsystem, but the two can interconnect in a more automated way.

The third pattern is much more interesting, and this is where we will deep-dive and do a live demo. Can we merge the storage subsystem into the Kubernetes platform itself — make it invisible, but an integral part of the platform? Here you run the storage on top of the Kubernetes platform. You have multiple nodes in one cluster, and some of those nodes, just as they host applications, also host your storage service. So in this paradigm you have one cluster, and storage is just like any other service: just as you have a router service, you have a storage service, managed and maintained by the Kubernetes platform itself.

I'll go a little deeper here. You have a Kubernetes cluster with multiple nodes and multiple pods, and some of those pods are running your storage. Here I've taken the example of Red Hat Container-Native Storage — I'm more familiar with that because I'm the product manager for it. You will have a number of nodes depending on how much scale you need, but the key is that just as your application runs on top of the Kubernetes platform, your storage runs on top of the Kubernetes platform.

You get many advantages with that, because you're taking full advantage of Kubernetes orchestration itself. It will maintain the state; it will scale the storage just as it scales other applications. You have one control plane, one management plane. Storage is a little bit special, because it's attached to the data and it's a little heavier, but it becomes an integral part of the platform. And because it runs on top of Kubernetes, it can run anywhere Kubernetes runs: private cloud, public cloud, virtual, bare metal, hybrid, whatever you pick. That's the advantage of fully leveraging the platform: if you can run Kubernetes on an infrastructure, the storage can run there.

The other advantage is that since the storage runs as a container, it inherits the container value proposition itself. Imagine deploying storage as a standalone appliance — it always takes some time. Here you can do it in minutes, because it's just another container: you can deploy it and upgrade it like one.
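As a rough sketch of what "just another container" means here — CNS runs its Gluster containers as ordinary Kubernetes workloads on designated nodes. This is not the exact product manifest; the labels and image name below are illustrative.

```yaml
# Storage as a platform workload, sketched as a DaemonSet: one Gluster pod
# per labeled storage node. Labels and image reference are illustrative.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: glusterfs
spec:
  selector:
    matchLabels:
      glusterfs-node: pod
  template:
    metadata:
      labels:
        glusterfs-node: pod
    spec:
      nodeSelector:
        storagenode: glusterfs         # only nodes labeled for storage
      hostNetwork: true                # the pod shares the node's IP
      containers:
      - name: glusterfs
        image: rhgs3/rhgs-server-rhel7 # illustrative image reference
        securityContext:
          privileged: true             # needs access to the host's block devices
```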
You get full isolation, and you can do rolling upgrades, things like that. Some customers have also co-located storage nodes and application nodes for various reasons. You can do that, because you can run the storage pods and the application pods as separate pods on the same host.

Let me summarize this approach: take the whole platform, make storage an integral part of the platform itself, run it on top of the platform, and make it invisible — though storage is a little bit special, no doubt. But let's exploit the full potential of Kubernetes, the full potential of containers, and of course the software-defined approach you get with it. We'll do a demo of Red Hat Container-Native Storage, and Annette will take it from here.

Yep. Okay. Can you hear me now? Yeah. Okay. Thank you, Sudhir. I want to first walk through the demo that I'm going to do, and then I'll actually do it. It will be a live demo. I've set it up so that it can't go wrong, right? But you'll be able to see it — it's pretty cool. So let me move on here.

We are going to go through what Sudhir called the pattern where we're actually using the platform for the storage. From Kubernetes' point of view, the storage is just another workload — and just to be clear, I'm going to be using an OpenShift environment to show this — which is pretty good, because it has all the advantages of being on the platform.

I want to make it simple and clear that there's no magic here: we still need real storage somewhere. The storage that we're going to consume is attached to the OpenShift or Kubernetes nodes that we want to create the storage cluster with. That storage can be any of the storage types you can put into a piece of hardware: SSDs, hard disks, NVMe devices, whatever. The main thing is that for each cluster you create — and this is a Gluster cluster — you want to use similar storage.

So it's conceivable that I could offer developers choices for dynamic provisioning. Think of AWS: I could have a fast cluster made of io1 EBS volumes, and another cluster made of gp2 or a slower storage type. I could offer both, and based on their rights, developers could decide which storage they want to use. (I'll sketch what those two classes might look like in a moment.)

The other thing to point out here is that this is a simple Gluster architecture called replica-3. It requires three OpenShift or Kubernetes nodes. At this point in time you can only have one of the storage containers on each node, so that's why we need three. If I wanted a second cluster — which is shown here with the fast and the slow — that would take six nodes to deploy.

So, the demo I'm going to do: I've been watching a lot of demos this week, and they're good, but sometimes you can get a little lost in the CLI. So I'm going to do a demo that's going to be CLI, but I wanted to give you some context first.
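Hypothetically, those two tiers would surface to developers as two storage classes pointing at different Heketi cluster IDs. The names, route, and IDs below are made up for illustration:

```yaml
# Two tiers, same provisioner, different Heketi clusters (IDs illustrative).
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: cns-fast               # e.g. backed by io1 EBS volumes
provisioner: kubernetes.io/glusterfs
parameters:
  resturl: "http://heketi-storage.apps.example.com"
  clusterid: "fa9716f2aa5f768255b8f6e3557e2d73"
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: cns-slow               # e.g. backed by gp2 or slower storage
provisioner: kubernetes.io/glusterfs
parameters:
  resturl: "http://heketi-storage.apps.example.com"
  clusterid: "0b12ec8a6e52303de5343ba21a68eb0a"
```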
My OpenShift/Kubernetes environment here has a master node and an infra node. The infra node houses the registry and the router, and then I have three app nodes, and those app nodes are where I'm going to deploy CNS. You're going to see that there will be Gluster pods, and then — very important for the service that Sudhir discussed — the Heketi API. There will be a pod that Heketi runs in. The Heketi pod is not tied to any one node, and that's very useful, because if the node that the Heketi pod is on goes down, the Heketi pod will recreate itself on another available node and remount its database. So we see here how we're going to create the cluster: we're going to deploy pods onto the three app nodes.

In terms of Heketi, what makes this a really useful approach, I think, is that from the developer's view, they just see the ability to create dynamic storage, and they do that via something called a storage class resource. This storage class is really the glue between OpenShift and the storage provisioning. You see in there that the provisioner is GlusterFS — Kubernetes has that provisioner — and you also see that there is a route there. That is the route to the Heketi service. You also see a cluster ID; that is a Heketi ID that uniquely identifies the cluster. So if I wanted different storage tiers, fast and slow, that uniquely identifies each cluster. And then we absolutely want to secure the API, because we don't want people coming in and just creating PVs — persistent volumes — on our storage. So we secure it via, again, an OpenShift resource called a secret, and that makes up the storage class. This, again, is completely a Kubernetes/OpenShift resource, and we're going to use it.

To the left there, you just see that Heketi is in the middle. When OpenShift calls in for a new persistent volume, it goes through Heketi. When I delete that project, the volume is deleted with the project, and Heketi does that work too. We'll also see that Heketi is really useful for inspection, as well as day-two activities: adding more storage, adding more clusters — all of that is done via the Heketi CLI. And this we've pretty much already gone through, but just to reiterate: three OpenShift or Kubernetes nodes per cluster are required because of the replica-3.

So the lifecycle — again, I'm going through these slides so you know what's happening when I do the demo. A developer, via a template — let's say they want to create a MySQL or Postgres application — will request a persistent volume in that template. In that persistent volume request there will be the defined storage class. That's really the glue, again: I want a persistent volume from this storage class, of a particular size. That's submitted via the storage class and the Heketi route to the storage back end. It goes through Heketi; Heketi goes into Gluster, provisions the volume, comes back up, and then that volume is mounted to whatever the mount point was in the template.

So let's get into the demo. Any burning questions before I get started here? I don't know how much time we have; we should have a little bit of time afterwards if you have questions. Okay. The first thing — I didn't know it was going to be at the bottom of the screen, but that's okay. In the OpenShift world, instead of using kubectl, you use oc.
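If you're more used to kubectl, the mapping is one-to-one for everything in this demo; a few illustrative equivalents:

```
oc get nodes                     # ~ kubectl get nodes
oc get pods -n myproject         # ~ kubectl get pods -n myproject
oc create -f storageclass.yaml   # ~ kubectl create -f storageclass.yaml
oc rsh <pod>                     # ~ kubectl exec -it <pod> -- /bin/sh
```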
You can tell how often I've used the other one — oc is essentially equivalent. So I say: what do I have? I already did this a couple of hours ago, creating an OpenShift cluster here, and we have the "m" for the master and then the three nodes.

So the first thing I need to do — or I can do; I could put this in a namespace that already exists, but in this case, just to make it clean — is create a new namespace for my deployment, and I call it container-native-storage. Now, one thing I do need to do: we're going to see that the Gluster pods use host networking, so we need to give the daemon set privileged access. That's done via that command.

And then I just want to show you this quickly. This is the topology file — I'm sort of starting from the bottom, but if we look here, this is a node. We put in the node, and we put in the IP address. That IP address is actually equal to the host's, and the Gluster pod, when it's created, is going to have that IP. And then we give it the devices. An important thing on the devices: in this case we're actually on AWS, so this is an EBS volume, but you can put as many storage devices in there, comma-separated, as you want. There are reasons for that. Let's say you wanted one terabyte for your cluster: you may want to do it in 250-gigabyte chunks, so that if you need to add more storage to your cluster, you just add another 250 gigabytes. So it is useful — it's not technically required to have the devices the same size (they absolutely have to be the same kind of storage), but it helps to keep the sizes the same too. All of these are, in this case, only 50 gigabytes. I'll show a sketch of the file's shape in a moment.

So let's continue; we're actually going to do the deployment now. I should have mentioned it, but topology.json is a Heketi input file; Heketi uses it to create the cluster. Going into the UI here: here's my new project, and I'll start to see the Gluster pods coming up. There will end up being one on each of the three app nodes I showed you. Once they are available, it'll go on to the Heketi deployment. And I'll mention this now because I don't want to go back and forth too much: remember, on this slide I showed you the Heketi route. When we're done with the deployment, it will automatically create that route, which we're going to use in our storage class.

Let's see how we're doing here — got one up. It should go pretty quickly. There we go: we've finished the Gluster pod deployment, and now we're deploying an interim pod called deploy-heketi that will be used to create the final Heketi service. This does take a little bit of time, but think about what it's doing here. This cns-deploy is a Red Hat script — essentially just one very large script, but very useful.

I'll also mention that today we're just going to be looking at volumes of type file. The latest CNS images — essentially the Gluster images that Heketi can use — also support iSCSI targets, as well as an S3 API. We started with file because that is the most commonly used. So: we're ready to create and provision Gluster volumes.
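Here's the shape of that topology.json, sketched with a single node entry — the hostname, IP, zone, and device path are illustrative, and the devices array can list as many devices as you like:

```json
{
  "clusters": [
    {
      "nodes": [
        {
          "node": {
            "hostnames": {
              "manage": ["node0.example.com"],
              "storage": ["10.0.0.10"]
            },
            "zone": 1
          },
          "devices": ["/dev/xvdf"]
        }
      ]
    }
  ]
}
```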
Let me quickly go back here and show you what we have now. It looks like it's still killing off one of the interim pods here, but I just want to point out this route. Okay, so that's the route. It didn't paginate really well because it's just a little too big, but you can see now, when I list the pods, that I've got three Gluster pods and a Heketi pod. And if you look at it, each of the Gluster pods is on one of the IP addresses that was in the topology JSON file.

To create our storage class, we need the Heketi route, so we're going to use that. It's created as a combination of the namespace and the domain name. This next view is a little busy, but it's just to show you some Heketi output. We can see the cluster that we just created, and we can see that we actually have one volume already: heketidbstorage. The Heketi DB storage is how the PVs are kept track of. And like I said, if the node the Heketi pod is on goes down, it will recreate itself on a new machine — a new availability zone, even — and remount this database. In that situation you'd be headless while it was being recreated — you couldn't create or delete any PVs — but as soon as Heketi came back up and remounted this, you'd be good to go again. The other thing we see here: we already have our three nodes, and we've used two gigabytes of our total of 49.

Very important: I just want to show you what the YAML looks like for the secret. That key — I didn't point it out, but when we ran cns-deploy, we gave it an option called admin-key. So we set that password, and this is the Base64 encoding of that password. We're going to go ahead and create that resource; it's an OpenShift resource. So now we have a CNS secret.

We also need our cluster ID for our storage class. So that's our cluster ID — again, this is a heketi-cli command, very useful for listing what you have. And this is our storage class. I showed you an example earlier, but this is the one for this deployment: we have our route, we have our cluster ID — if you were doing this, you would have had to copy that cluster ID and paste it in, but I did it beforehand so we don't have to — and we have the CNS secret. So we're ready to go on that: we create that resource, and again, it's a Kubernetes/OpenShift resource with a storage class name.

Just one thing on storage class names: even if you're just doing evaluations — maybe it's just me — try to make the name explain what the storage is. If you're on AWS, maybe put gp2 into the storage class name. If you just put "gluster", that doesn't say much about what kind of storage it is.

And the other thing I didn't point out when we were looking at it: there's an annotation you can add that will make a storage class the default. What that means is that in my template — you'll see the template I'm going to show you does have the annotation in there — you don't actually need it if you have a default storage class. You can only have one default storage class per OpenShift or Kubernetes instance, so if you add another cluster, it would not be the default and you would have to specify the name.
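Putting those pieces together, the secret and the storage class from this part of the demo look roughly like this — the route, cluster ID, and key are illustrative:

```yaml
# The Heketi admin key, Base64-encoded, in a secret of type glusterfs.
apiVersion: v1
kind: Secret
metadata:
  name: cns-secret
  namespace: default
type: kubernetes.io/glusterfs
data:
  key: bXlzZWNyZXRwYXNzd29yZA==    # base64 of the admin-key set at deploy time
---
# The storage class glues OpenShift to Heketi: route, cluster ID, secret.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: cns-gold
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"   # optional: make it the default
provisioner: kubernetes.io/glusterfs
parameters:
  resturl: "http://heketi-container-native-storage.apps.example.com"
  clusterid: "672bfcaa6e52303de5343ba21a68eb0a"           # from heketi-cli; illustrative
  restauthenabled: "true"
  secretName: "cns-secret"
  secretNamespace: "default"
```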
So what we're going to do now is use this storage class and create some storage for a MySQL deployment. I created a namespace for it just now; there's nothing in it yet, and we're going to watch things land in it. We're going to deploy via a template, so let me show you where the storage shows up in the template.

All we have to do as developers now is include this in our template. In our case, because I have a default storage class, you wouldn't even have to put this annotation in here — though I think it's good practice, because it's pretty easy for a storage class to stop being the default for some reason, so at least you have it. The other thing we have is ReadWriteOnce. An important concept: Gluster file volumes support ReadWriteMany — you could have many pods writing to one volume — but this is ReadWriteOnce because it's a database. The access modes are ReadWriteOnce, ReadWriteMany, and there's another one that I don't remember right now that isn't used very often. You specify that, and then, very importantly, you specify the size. In day-two tasks you can keep track of the usage on the volume, and via Heketi you can expand it.

Okay, so we create this now. We see that there are some resources now: we're creating the secret — this is all in the template — we're creating the service, we're creating a persistent volume claim, and we're creating the deployment. If we go back to the console, we start to see the deployment, and if we go to storage, we see that we already have our PV. It might be sort of small on screen — I'll show it to you a different way in the CLI — but it's already been claimed via the cns-gold storage class and it's ready to be used.

Go back to our pods: it looks like we're still running this; usually the MySQL deployment goes pretty quickly. If we hit return here, you see that my PV is bound, and it tells me the name of the PVC. In OpenShift and Kubernetes you have a PV resource and a PVC resource. This is the PVC resource that we just created, but it automatically creates the PV. And what's cool about that is that when you delete the PVC, the PV is gone too — it's not hanging around, it's completely gone. I know in some cases, like with NFS, the PV sort of hangs around and you have to clean it up manually. Not true with this.

So our MySQL deployment is done, and we should be able to see that here. Yeah, it's going. Let's go ahead and see what we have. We log into it via the oc rsh command, and what I see — again, it's not paginating great — is that I have a volume mounted on my MySQL data directory.

Now what I want to do is track this volume that I see mounted in the pod all the way back to Gluster. So I'm going to log into Gluster: we're back in our container-native-storage project, and we log into the first Gluster pod. If you've ever used Gluster, you can run the ordinary Gluster commands here. OpenShift has no idea about Gluster's internals, and Gluster has no idea about OpenShift or Kubernetes, so these are exactly the same commands I would run in a Gluster cluster that had nothing to do with what we're seeing today.
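Concretely, the trace looks roughly like this from the CLI — the pod and volume names are whatever your deployment generated, so these are placeholders:

```
oc project container-native-storage
oc rsh glusterfs-1abc2            # shell into one of the Gluster pods
gluster volume list               # heketidbstorage, plus our new volume
gluster volume info vol_5a3b...   # shows replica 3 and the three bricks,
                                  # one brick per node
```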
So we do see our Heketi volume up at the top — the very first one, heketidbstorage — and then we see the volume that we just created. We also see that there are three bricks. Brick is Gluster's terminology for the storage, and each one is on a different node.

The last activity we're going to do is use Heketi again to get more detailed information about that volume. We get its volume ID, its cluster ID, where it's mounted, and that it's replica-3. For day-two activities, all of this is important to you — if you want to increase the storage in your cluster, this is how you get that information. (If you add another cluster, it has a different cluster ID; this applies when you're working on the same cluster.) The end.

So anyway, it looks like we have about three minutes. Questions? Yes. [Audience question.] Do you have anything more to elaborate on that, Jose? Yeah — can you repeat the question? The question was: does Red Hat plan on packaging this into a Helm chart? And the answer is yes.

[Audience question about performance.] Yeah, I can speak to that. All this year we have been doing performance benchmarking, and the thing you need to realize is that we are doing replica-3 with absolute consistency, so compared to a single write, you're writing three times. And that's good and bad. If I'm in an AWS environment, say across three different availability zones — and I can tell you, because I've done the latency testing, those zones are not in the same data center — in some cases they can have about a millisecond of delay, which is about 20 to 30 miles. In that case your slowest write is going to control your write speed. In general, reads are no problem at all, because a read is a single read, not replica-3, but for writes we have a maximum 5-millisecond latency budget across the diameter between the replicas. And running it on top of the platform versus running it standalone, we see not much difference — 2 to 5 percent, depending on what kind of infrastructure you have. It's not statistically different.

Other questions? Yes. [Audience question: in a single site, why do you need replica-3; why couldn't you do a single replica?] Let me take that. We chose replica-3 as the default for now, and we do have options. We fully intend to support the other replica counts; right now we're just not publicizing them. The challenge is that if you're running on top of the platform, Kubernetes can reschedule pods and nodes can go down, and we wanted to protect against that. There are options like a replica-2-plus-arbiter kind of approach, and we will make that available as an option, because Gluster already supports it. And just in general, you're looking for high availability of your storage: if you have a single replica and that replica is gone, your pods can't come up on a different host or in a different AZ.

Okay, I think we're going to get booted out. We'll be outside if you have additional questions, including the CNS engineering team here. Thanks, everyone. Thank you.