Hello, my name is Pavel Snagovsky. I'm a software engineer at Quantum Corporation, and I'm here to talk about using persistent storage in the public cloud. First we'll discuss what the current options are and how they're used with Kubernetes. Then I'll show an example of using a storage orchestrator together with the local storage of public cloud instances to make your storage more cloud native, let's put it this way. Then we'll look at some examples, do questions and answers, and wrap up with conclusions.

We all know and love the persistent disks provided by our cloud providers. They're extremely durable, we use them all the time, and they've practically become the default way of obtaining persistent storage in the public cloud. They're highly durable, they have very convenient snapshot functionality, they're fairly performant, and there are quite a few options to get the performance you actually want. They're elastic at this point on both Google and AWS, so you can scale them dynamically; they provide encryption, they provide dynamic provisioning, all the good stuff.

However, there are a few things about the current offerings that don't quite work for cloud native applications. The provisioning time of a persistent disk or EBS volume is significant; persistent disks take a while to provision. They're not instant, and when you fail over, that actually matters: your failover times will be significant. These offerings are also expensive; we've probably all had our eyes opened by the EBS charges when using something beyond general-purpose storage. And they have locality limitations: volumes are zone specific. A volume lives only in one specific zone, and to move it you have to go through a snapshot and recreate the volume. They're also, of course, proprietary.

There is another option, though: local storage, which I feel is often overlooked; people just go with the default and use persistent disks. So what's good about local storage? This is the instance storage, the disks connected directly to the node that is hosting your VMs. It gives you very low latency and very high I/O performance. It's very inexpensive: disk is cheap, and comparatively speaking, against the managed persistent storage services, local storage is inexpensive. It has options for both transactional and streaming I/O: you can get spinning disks, or SSDs over SCSI or NVMe. And it delivers very consistent performance. The problems with local storage, for the most part, are the lack of dynamic provisioning, which is not quite true for Google but very true for Amazon, and data durability: that storage is usually considered ephemeral.

The claim I'm making, that instance store is faster than EBS, can be backed up with this graph. What I did: I took an instance in AWS, attached two different EBS volumes, one with provisioned IOPS and one plain gp2, and I also used the local NVMe disks. What we're looking at is I/O completion latency in microseconds for 16K blocks, and I tested reads and writes, both sequential and random. If you look at the last four bars, those are all for the local disks: operations usually complete within 50 microseconds. On any of the network-attached storage, the EBS volumes, I/O completion latency is practically ten times that.
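The exact benchmark command isn't shown in the talk, but a latency comparison of this shape is commonly run with fio. Here is a minimal sketch, assuming fio is available on the instance and that /dev/nvme0n1 is the local NVMe instance-store device; point --filename at the EBS device to compare:

```sh
# Hypothetical fio invocation: measure I/O completion latency with 16K blocks,
# direct I/O (bypassing the page cache), queue depth 1, for 60 seconds.
# Repeat with --rw=read, --rw=write, --rw=randread, --rw=randwrite to cover
# sequential and random reads and writes, as in the graph.
fio --name=latency-test \
    --filename=/dev/nvme0n1 \
    --rw=randread \
    --bs=16k \
    --direct=1 \
    --ioengine=libaio \
    --iodepth=1 \
    --runtime=60 \
    --time_based
```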
So local storage really does provide very high performance at low cost. The only problems are durability and lifecycle management. If we can solve those problems, we get very performant, really excellent storage at very low cost.

I participate in the development of the Rook project. It's an open source storage orchestrator that is fully integrated into Kubernetes and uses Kubernetes functionality to deliver a cloud native storage experience. It leverages CRDs, it generates a storage class, and it's backed by Ceph. It provides fully automated lifecycle management for your storage, with self-healing and monitoring functionality built in, and it gives you block, file, and object storage.

Let me do a quick demo. Can we see my shell? Should I make it bigger? Is that better? Okay. I have a cluster running in Google Cloud, using Google's managed Kubernetes service. This cluster has three workers, and I created it ten minutes ago, right before this presentation, so it's a bare cluster; there's nothing on it. I'd like to demonstrate how easy it is to install Rook and start creating PVCs that can be consumed by your pods. Out of multiple devices, Rook can create a storage pool and provide replication, so it solves exactly the problems we were just discussing for local storage: it uses the local disks, giving you that amazing performance, and at the same time it provides durability by replicating your data across multiple nodes. It removes the obstacles that stand in the way of utilizing local storage at low cost and high performance.

Rook uses the operator pattern of management: it's a set of external controllers that watch for events that are meaningful to Rook and take action on them. The first thing we need to do is deploy the operator. The operator puts all its containers into the rook-system namespace. What we'll see here is the operator itself running, and also a DaemonSet for the Rook agent: Rook uses the FlexVolume plugin, and there's an agent running on each node that actually handles the RBD and Ceph commands.

At this point we have the operator. Next, you need to define your cluster and install it. Let me install the cluster, and then I'll show you its CRD and discuss a couple of options. The Rook cluster has a CRD in which you can define which devices you'd like to use for your Rook storage pools. You can supply, for example, a regex or some other criteria by which to select devices, and there's a separate set of criteria to specify which nodes should be part of the cluster. Once you define the cluster, the Rook operator watches it for any changes and automatically takes action, giving you fully automated lifecycle management. In my example here, I'm not doing anything fancy at all.
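For reference, the two demo steps so far look roughly like this. It's a sketch assuming the early rook.io/v1alpha1 API from around the time of this talk; the manifest file name, the device-filter regex, and some field names are illustrative and differ across Rook versions:

```sh
# Step 1: deploy the Rook operator (manifest name is illustrative).
# This creates the operator deployment and the rook-agent DaemonSet
# in the rook-system namespace.
kubectl create -f rook-operator.yaml

# Step 2: declare the cluster as a CRD; the operator reacts to it.
cat <<EOF | kubectl create -f -
apiVersion: rook.io/v1alpha1
kind: Cluster
metadata:
  name: rook
  namespace: rook
spec:
  # Keep Ceph data under a path on each node's filesystem,
  # as in the demo (raw devices work too).
  dataDirHostPath: /var/lib/rook
  storage:
    useAllNodes: true         # or list node-selection criteria explicitly
    useAllDevices: false
    deviceFilter: "^sd[b-z]"  # example regex for picking raw devices
EOF
```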
As that shows, you can use raw devices for your Rook storage, or you can use a file system; just a path on the file system is also an option, and in this case that's what I'm using. So at this point we have the cluster defined, which really determines which devices we'll use for our storage pool. Next, we need to create the pool itself and define the storage class for Rook; that storage class is what provides us block devices. Once your storage class is defined, it gives you the same dynamic volume provisioning as other well-known storage classes: you can dynamically provision volumes by creating PVCs. Let me create a PVC very quickly. At this point we have a fully running Rook cluster; that took a couple of minutes, and I was talking more than I was actually doing, so it took literally two commands. Now I'll create a PVC, we'll see it get bound, and we'll return to the presentation. Let me actually show the PVC itself; it's very, very simple. We're using the rook-block storage class that we just created a second ago, with the devices we defined for the pool, and we specify the storage size and a few other options. At this point the PVC is bound, and it can be consumed by any pod. The pool, the storage class, and the PVC are all sketched at the end of this section.

Let's talk about performance. We've established that local storage, instance store in AWS terms, is very high performance. What does it really mean when you put Rook on top of these local instance stores across multiple instances? You get a significant increase in performance. In this specific scenario, I took medium-sized AWS instances, each of which had some local disk, and that local disk was used to back Rook. I created a test pod, and within it this pod had one Rook volume and one EBS volume, gp2 in this case, and I ran a few tests directly against these volumes; the test pod in the sketch below has this shape. This is the IOPS I'm getting with 16K block sizes. You can see that local storage really gives you very nice read performance. Write performance, however, is on par with persistent disk or EBS, because you have to replicate over the network, so you run into the same basic constraint you have with EBS: you're not talking to your devices over a local controller, you're talking to them over the network. In Google I used very different hardware, so the numbers are different; however, performance is again significantly higher with the local disks.

What else is important to note here? When you use EBS, the typical pattern for persistent volumes is to create one EBS volume or persistent disk per pod, and I'd like to argue that this is not a good practice if you expect your application to be resilient. It takes a few minutes for a pod using an EBS volume or persistent disk to fail over, because of what needs to happen: device attach and detach, mount and unmount, API calls for provisioning, dependencies the devices themselves have on other resources, and so forth. So there are a lot of penalties to the one-EBS-volume-per-pod pattern.
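Before we get to how Rook avoids those penalties, here for reference is a sketch of the remaining demo objects, plus a test pod of the shape described above with one Rook volume and one EBS-backed volume side by side. This again assumes the early rook.io/v1alpha1 API; the names replicapool and rook-block follow common Rook examples, and the pod, its image, and the ebs-pvc claim are my illustrations:

```sh
cat <<EOF | kubectl create -f -
# A replicated pool: data written to a Rook volume is replicated across
# OSDs on different nodes, which restores durability on top of ephemeral
# local disks.
apiVersion: rook.io/v1alpha1
kind: Pool
metadata:
  name: replicapool
  namespace: rook
spec:
  replicated:
    size: 3
---
# A storage class backed by that pool, enabling dynamic provisioning.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-block
provisioner: rook.io/block
parameters:
  pool: replicapool   # parameter names vary across Rook versions
---
# The PVC from the demo: request a block volume out of the pool.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rook-pvc
spec:
  storageClassName: rook-block
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
# Illustrative test pod mounting a Rook volume next to an EBS-backed one
# (assumes a second PVC, ebs-pvc, bound through a gp2 storage class).
apiVersion: v1
kind: Pod
metadata:
  name: storage-test
spec:
  containers:
  - name: tester
    image: ubuntu:18.04
    command: ["sleep", "infinity"]
    volumeMounts:
    - name: rook-vol
      mountPath: /mnt/rook
    - name: ebs-vol
      mountPath: /mnt/ebs
  volumes:
  - name: rook-vol
    persistentVolumeClaim:
      claimName: rook-pvc
  - name: ebs-vol
    persistentVolumeClaim:
      claimName: ebs-pvc
EOF
```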
By using something like Rook, you don't get any of those penalties. Failover for a pod that uses a Rook-backed persistent volume happens practically instantly; there's no penalty whatsoever. You have one large storage pool, you create virtual volumes out of it, and you deliver them to your pods. I think that's significant.

Another comparison we can make concerns zones; and when I say EBS or persistent disk, practically everything I'm saying applies to both Google and Amazon. Each volume is tied to the zone it's in, and since we all want resilient Kubernetes clusters deployed across multiple zones, this creates a problem if you're using persistent disks. When a zone goes away, when your zone has failed, the pods that had volumes within that zone cannot be rescheduled into a new zone, because their volumes aren't there. You can move the volumes manually, of course, but it's a significant amount of effort; you can write code to automate it to some degree, but that is also significant effort. So with a multi-zone cluster, losing a zone means you cannot fail over the pods that use persistent disks. I think that's a significant problem, again for the commonly recommended pattern of one volume per pod.

Compatibility. Rook is wonderful in that it takes the entire management of Ceph away from you, and everything just magically works; but it's also great as an abstraction layer for your persistent storage within Kubernetes. If you're using Rook, your PVs interact with the Rook storage class, and everything that consumes them, all your user code, doesn't have to change. Whether you go to AWS, go to Google, go on-prem, or run it on your laptop, your code doesn't have to change. It's very portable: you're abstracting your storage behind one interface that everything else talks to. That gives you portable code and no vendor lock-in; you don't need to know anything about the proprietary storage underneath. It supports multiple environments, so your testing works the same way on your laptop, in the public cloud, in a private cloud, or on bare metal. And it avoids provider-specific functionality.

Cost. This talk's title says we'll be talking about economics, so cost is important here. I put together a very small, simple example in Google Cloud using the Google Cloud cost calculator. In it, I'm provisioning an instance with 16 virtual CPUs that already has two local disks on it, two disks of 375 gigabytes each. And yet the cost of this instance, local disks included, is still lower than provisioning a persistent disk. Note that the persistent disk here is 300 gigabytes and costs $16 per day, and the instance with two local disks costs less. Isn't that crazy? That's how cheap local storage is. The situation is similar on Amazon, even though local storage is treated differently in Google and Amazon; in Google, for example, you can dynamically provision local storage to some degree. Here's an example of two different AWS instance types that have all the same compute resources; the only difference between these two instance types is the availability of local disk.
One of them doesn't have any local disk attached, and the other has eight NVMe SSDs, and the cost difference is just 40 cents. Percentage-wise, the cost difference between these two instances is about 20%; different instance types will have different cost deltas, and there's a lot to choose from, so this is just one example. In short, local disk is inexpensive, very performant, and significantly cheaper per gigabyte than the block service from your cloud provider. Another benefit: it's always more efficient to utilize your storage when you're slicing up a bigger pool of storage. That gives you far more flexibility to utilize your storage well, and it also reduces your cost; a more efficient way to utilize your storage saves you money.

Rook also lets you do a lot of other things. You don't have to use local storage to back Rook; you can of course use EBS or persistent disks as well. It's just a block device, and Rook will take any block device, or even a path on your file system. So you could, for example, take a large gp2 EBS volume, size it up to the maximum to get the maximum IOPS, put Rook on top of it, and slice it up any way you want. That can also provide cost savings.

As I mentioned, Rook provides an object store, and you could use an object store that's internal to your Kubernetes cluster if that fits your needs. You could theoretically replace S3, remove it from your bill, and use the Kubernetes-internal object store. It gives you quite a bit more flexibility, because this is an object store that you manage: you manage your own custom domains, and everything happens completely independently of AWS. With S3, in order to use a custom domain, you'd have to move your domain through Route 53 and use their certificates and everything else; here you have all the flexibility in the world, any domain you like, any user management you like. It makes things portable as well: you can run and use this object store in your testing environments, elsewhere, or at another cloud provider. It not only saves you money, it gives you huge portability.

And of course you can create multiple clusters for different workloads. You can create one cluster backed by very fast drives, local storage, to serve, for example, your databases; and then, when you need to store logs somewhere, you can form another cluster with spinning disks underneath it, and use different clusters for different types of workload. A sketch of that two-tier setup follows at the end of this section. And if you want to take it even further and save even more money: now that you have reliable, portable, persistent storage, you can deploy some of the services you currently pay Amazon for inside the cluster, reliably and portably, and reduce your bill. If there's a service that is, for example, self-healing and self-replicating and everything else, and somebody wrote a Helm chart for it, you just deploy that and use it instead of the managed service. It will save you money and make your applications more portable.
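Returning to the multi-cluster idea for a moment, here is a sketch of that two-tier setup: a second, HDD-backed Rook cluster and a storage class for log workloads next to the fast one. The namespaces, pool names, and device-filter regex are hypothetical, and field and parameter names again assume the early rook.io/v1alpha1 API:

```sh
cat <<EOF | kubectl create -f -
# Second Rook cluster built on the nodes' spinning disks (illustrative regex).
apiVersion: rook.io/v1alpha1
kind: Cluster
metadata:
  name: rook-logs
  namespace: rook-logs
spec:
  dataDirHostPath: /var/lib/rook-logs
  storage:
    useAllNodes: true
    useAllDevices: false
    deviceFilter: "^sd[c-d]"   # hypothetical: the HDDs on each node
---
# A pool and a storage class in that cluster, dedicated to log data.
apiVersion: rook.io/v1alpha1
kind: Pool
metadata:
  name: logpool
  namespace: rook-logs
spec:
  replicated:
    size: 2
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-block-logs
provisioner: rook.io/block
parameters:
  pool: logpool
  clusterNamespace: rook-logs  # points the provisioner at the second
                               # cluster; parameter name varies by version
EOF
```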
To conclude: Rook is awesome. And it's not only awesome; it also gives you significantly better flexibility and can reduce the overall cost of running Kubernetes clusters with persistent data, when you need applications to persist data. You'll find that it gives you more flexibility at lower cost. We went over compatibility and how Rook makes your code portable. We talked about performance: there's a very clear performance gain when you use local disk, and if you overcome the downsides of local disk, namely that it's ephemeral and not dynamically provisioned, by putting Rook on top of it, you get both: durability and dynamic provisioning. One more thing on multi-zone setups: if you're using something like persistent disks, you will not be able to fail over your pods if the availability zone holding a persistent disk dies, and that affects provisioning and failover times as well. So I'd recommend against one persistent disk per pod, meaning don't use one EBS volume per pod: your failover times will be in minutes. You'd really be creating a setup in which, when you need to fail over fast, you can't. I want to fail, and I want to fail fast, very fast; with EBS, it will be literally minutes. And of course: cost. That concludes my talk, and I'm ready for questions.

Excellent question. Let me repeat it so everyone can hear it, and correct me if I paraphrase it poorly. The question is: Rook provides all this wonderful automated management of Ceph's lifecycle for you, but what happens if it can't handle something, or you want something really unique and want to execute a Ceph command yourself? Well, let's do it. I have a command here, for example. Let me explain first: in Rook, every container deployed by Rook has the ceph binary in it, and you can talk to it if you provide the right configuration. That's exactly what I was trying to demonstrate. Ah, okay, I changed the namespace here; let me add the namespace and I'll show you how it actually works, which is an excellent question. I think the namespace is rook. Yes, I need to go here and add the namespace. Okay, sorry, this wasn't intended to go this way, but I assure you this is the way it actually works; you are able to do this.
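What that fumbled demo was driving at looks roughly like this; the pod name is a placeholder, and depending on the Rook version you may need to point the ceph client at the right configuration explicitly:

```sh
# List the Rook pods, then exec into one and run a raw Ceph command
# (use the rook or rook-system namespace, depending on the pod you pick).
kubectl -n rook get pods
kubectl -n rook exec -it <some-rook-pod> -- ceph status
```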
Yes, go ahead. No, Rook by itself does not do that. However, you can use node affinity or anything else and manage it yourself; so you can do it. Or, if you feel it's something Rook should be doing, Rook is an open source project: please create an issue, let's discuss it, and let's get it in.

It actually differs per provider; Google and Amazon treat local disk very, very differently. In Google, they offer you two different interfaces: you can use either SCSI or NVMe, and you can actually choose between the two. But the local disks there can be attached to any instance, so you don't have to pick a specific instance type like in AWS, and the disk sizes are fixed: each is 375 gigabytes, and the maximum is 3 terabytes, so that's how much disk you can attach to each node. With Amazon, it's a completely different scenario: they really expose to you what they have as hardware. They have a node in the data center, and that node has a storage controller with disks attached, and they literally hand you those disks raw. They tell you what type of disk it is, and it varies by instance type. In the example I showed, that specific instance had eight NVMe disks, I think close to 2 terabytes each, something like that, so the total usable storage was somewhere around 15 terabytes. Different instances give you different types of disk, even different interfaces and different disk sizes. So with Amazon, it's all different. Any other questions?

Actually, it's not quite true; it depends on the number of replicas you have. Under Ceph, and this is really about Ceph, each write has to go to multiple OSDs, and those writes are based on the number of replicas you want. All it needs to do is match the replica count you specified: if you have three replicas, a write goes to a total of three OSDs, as I said.

Yes, it does. Let me give you a slightly different example. The Rook operator watches for events, and if an event matches the criteria it has, it acts. For example, if I add a new node and it matches those criteria, the same criteria for nodes and for devices, the operator will automatically add it. If something is removed, it will automatically start rebalancing; if it sees the same blocks elsewhere, it will rebalance them back. There's a lot of self-healing functionality embedded in that set of controllers that makes up the Rook operator.

All right, my time is up. Thank you very much. Please go to rook.io, where there's more information about Rook, and it's on GitHub; it's open source. Participate, use it, and have fun.