Hello. Welcome to the Rook Intro and Ceph Deep Dive. I'm Travis Nielsen, one of the Rook maintainers, and I work for Red Hat. Let's get going.

For our agenda today, we're going to start off by talking about the storage challenges you might have with Kubernetes, what Rook is, and what Ceph is, just to cover some basics. Then we'll get into the key features Rook provides, the new features in our latest 1.9 release that just came out in April, and a demo that Blaine will give. If you have questions, we'll save some time at the end to answer some of those.

Okay. So what are your storage challenges? Kubernetes was really a platform built to manage distributed applications, ideally stateless ones. And if they need storage, well, storage is kind of an afterthought. There's a way to plug storage into Kubernetes, but storage is not a native component of Kubernetes. So if you rely on external storage, it's either not portable or it's a burden to deploy. If you're in a cloud provider, you're locked into that vendor. So how do you get out of that vendor lock-in? Some of those questions are where we started with Rook, and we wanted to bring storage to Kubernetes in a native way.

So that brings us to the question: what does Rook actually provide? Rook brings storage inside your Kubernetes cluster and manages it for you. It makes it so your applications can consume storage just like any other Kubernetes storage, with storage classes and persistent volume claims. The way we do this is with a Kubernetes operator and CRDs. You tell Rook how you want Ceph and the storage layer managed, and Rook goes and automates that for you. Rook will deploy, configure, and upgrade Ceph for you, so you don't have to worry about all the details of managing that storage layer. Rook is open source, Apache 2.0 licensed, and we try to have an open community; we want to do what's best for the community.

As you're getting started with Rook, we want to make sure you know about all the resources we have: the website, the documentation, and our Slack for questions. At the bottom of the slide, we want to point out that we're just starting to release some new training videos on getting started with basic Rook concepts, on kubebyexample.com. Check them out, and I hope they're useful for you.

Now, what is Ceph? Many of you, I'm sure, have heard of it. Ceph is an open source, fault-tolerant storage service. It provides three types of storage: block storage, a shared file system for when you need to share storage among pods, and object storage that's S3 compatible. Ceph favors consistency, and it was first released in July 2012, so we're coming up on the 10th anniversary of Ceph running in production storage clusters. We're really excited about that anniversary.

So what basic architecture layers does Rook have? Rook, again, deploys and manages everything you need for storage with Ceph. The Ceph CSI driver provides the plugin layer with Kubernetes, so you can provision and mount Ceph storage into your application pods just like you do with any other CSI driver. And then Ceph itself provides the data layer, so when you're reading and writing data, it goes straight to that data layer for optimal performance.
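To make that concrete, here's a minimal sketch of the kind of CephCluster manifest you hand to the operator. The namespace and image tag are illustrative; a real cluster would start from the example manifests in the Rook repo.

```yaml
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph                 # namespace the Rook operator watches
spec:
  cephVersion:
    image: quay.io/ceph/ceph:v17.2.0   # illustrative Ceph Quincy image tag
  dataDirHostPath: /var/lib/rook       # where Ceph keeps its config on each host
  mon:
    count: 3                           # three monitors for a healthy quorum
  # plus a storage section telling Rook which disks or PVCs to consume;
  # sketches of that section appear later in this talk
```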
So here's a view of what management looks like from Rook's perspective, looking at what pods get created. There are a lot of pods that need to be created, but again, you don't have to worry about this; Rook creates all of them. You start with the Rook operator, and you tell Rook, I want to deploy Ceph, and this is how I want it configured. Rook then starts the Ceph mon pods, one on each of three nodes. Then it goes and looks for all of the devices. Say you have devices on your nodes that you just want to consume as storage in your data center; Rook will create OSDs on each of those devices. The OSD is the fundamental storage component of Ceph. All the red pods here are basically Ceph daemons, and the green pods are the CSI driver pods that help you provision and attach the storage. So that's, in a nutshell, the idea of what pods you're going to get with Rook and Ceph.

Now, when your application is ready to consume that provisioned storage, we've got this picture of the three different types of storage. Here on the left, we start with block storage: if your app needs block storage, you create a ReadWriteOnce volume claim. You define a storage class which uses Ceph RBD, and the CSI driver for Ceph will create one of these Ceph RBD volumes and attach it to your application pod. Okay, so that's block storage.

In the middle, we have an example of file storage, where you've got two applications, or two instances of an application, that need to share a claim. So you create a claim in ReadWriteMany mode. It uses the CephFS storage class, which allows the volume to be shared, and then it's mounted with the CephFS driver. All right, so that's file storage.

And then if you have an application that needs to use the object storage REST endpoint with an S3 API, we follow much the same pattern you see with PVCs. You create a bucket claim to say, I want a bucket to read and write data to. You define a storage class for objects, and then we've got a bucket provisioner that makes that bucket available to your pod. This is a precursor to the new COSI implementation that's coming in Kubernetes soon, and we're looking forward to supporting that implementation as well.
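Putting those three consumption patterns side by side, here are minimal sketches of the claims an application would create. The storage class names match the ones in Rook's example manifests, but they're just names you define in your cluster.

```yaml
# Block: a ReadWriteOnce PVC backed by an RBD storage class.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  storageClassName: rook-ceph-block    # storage class using the RBD provisioner
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 5Gi
---
# File: a ReadWriteMany PVC that multiple pods can mount via CephFS.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data
spec:
  storageClassName: rook-cephfs        # storage class using the CephFS provisioner
  accessModes: ["ReadWriteMany"]
  resources:
    requests:
      storage: 10Gi
---
# Object: an ObjectBucketClaim that provisions an S3 bucket.
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: app-bucket
spec:
  generateBucketName: app-bucket       # bucket name prefix; a suffix is appended
  storageClassName: rook-ceph-bucket   # storage class pointing at an object store
```

The pod then references the PVC in its volumes as usual; for the bucket, the provisioner writes the S3 endpoint and credentials into a ConfigMap and Secret named after the claim.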
Now here's a view of what the data path looks like with Ceph. This assumes we've already got the storage provisioned and attached to your pods. When your application is ready to write data, you write to your volume inside the pod just like you write to any other local storage. That write will go through the Ceph RBD kernel driver, which then talks to the Ceph daemons running in your cluster. You don't need to know anything about where they're running or how they're running; Ceph just takes care of that for you. No matter what type of storage, whether it's block, file, or object storage with the S3 client, those clients know how to connect to the Ceph storage.

Okay, let's get into some of the basic features Rook brings to the table with Ceph. First of all, we've made installing Ceph in Kubernetes simple. That's been one of our goals from the start of the project: make Ceph deployable for the majority of clusters without too much management overhead. We've got some sample YAML manifests where you just say, go create all of those, and Rook will create and configure Ceph as desired. The last one, cluster.yaml, is the one pictured here on the right. That's where you tell Rook what settings to deploy with, like how many mons, and how to consume the storage Rook finds on those nodes to create the storage cluster. We also have Helm charts that make this deployment even simpler for those who are using Helm.

All right, the CSI driver brings a number of standard CSI features. We've got dynamic provisioning for block and file storage, volume expansion, snapshots and clones, and others. This enables all sorts of applications to do failover, work across data centers and clusters, mirroring, and a number of other scenarios.

So what environments can you deploy Rook in? Primarily, where we started was bare metal. You've got your own hardware, and on bare metal you don't even have storage options unless you plug in some external appliance. But if you have your own hardware, you can deploy Rook there to provide that storage platform. And if you're running in a cloud provider, cloud providers have limitations in their storage, and you can deploy Rook there too, to provide a consistent storage platform.

Let's talk a little more about that. Rook in a cloud environment can overcome some shortcomings. For example, the cloud provider might limit storage to a single availability zone, while Rook can span AZs. Failover times can be long in cloud providers; with Rook you can fail over in seconds instead of minutes. You can have a basically unlimited number of PVs per node, instead of the limit of around 30 per node some providers have. You can also get a better performance-to-cost ratio: you provision large PVs from the cloud provider, put Ceph on top of that, and get better performance because of those large underlying PVs. In this environment, Ceph can use PVCs as its underlying storage; you just tell Rook which storage class to provision the storage from.

Rook works well for many cluster topologies. You can tell it to work across zones, across racks, or whatever your data center configuration is, or in the cloud. You can spread the Ceph daemons across failure domains to make sure you don't depend on a single AZ; even if one AZ goes down, your other two AZs keep running the applications. And you can tell Rook how to deploy based on node affinity and taints and tolerations. It's very flexible.

When you're ready to update Rook or your data layer with Ceph, Rook can handle everything. Rook can update and patch all the Ceph daemons to the latest release, and even Ceph major upgrades are automatic. Our upgrade guide can look a little scary, kind of a long document, but really it's just being extra careful to make sure you feel confident that your cluster is healthy before, during, and after the upgrade.

Another key feature of Rook is that the CSI driver allows you to connect to Ceph running external to your Kubernetes cluster. Maybe you already have a Ceph cluster running, or maybe you just don't want to run your storage on the same hardware as Kubernetes, or in the same environment. So you can run Ceph independently, deployed usually with cephadm, and then connect your cluster to that.
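In external mode, the CephCluster resource mostly just points Rook at the existing cluster; the Rook docs describe an import script that creates the connection secrets, so this hedged sketch only shows the shape of the cluster resource itself.

```yaml
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph-external
  namespace: rook-ceph-external
spec:
  external:
    enabled: true     # connect to a Ceph cluster managed outside Kubernetes
  crashCollector:
    disable: true     # no local Ceph daemons for Rook to watch
```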
Again, as mentioned briefly already, for using buckets with object storage, the OBCs, the object bucket claims, let you provision buckets easily, and we're looking forward to COSI, the container object storage interface, that will be coming with a Kubernetes enhancement soon.

So, Rook 1.9 was just released in April. Let's talk about some of the new features that are just out. First of all, Ceph Quincy, v17, is the latest major release of Ceph, which was also just released in April on its annual release cycle. With Quincy comes the latest and greatest storage layer. We won't get into all of its features here, but Rook does support that latest major release.

The CSI driver has had some good updates in the 3.6 release, which Rook now deploys by default. For example, FUSE mount recovery: the driver can detect corrupted CephFS FUSE mounts and remount them automatically. There's AWS KMS encryption, and many other fixes and updates.

Another major feature is NFS provisioning. NFS is still useful in some scenarios, for example when you're migrating a legacy workload into Kubernetes. You can create NFS exports via PVCs now: the CSI driver will provision them, and the Kubernetes NFS CSI driver will mount them for you. That community NFS driver is available today, and the Rook documentation explains how to work through that and get it working.

In this release we also have a new CRD for creating RADOS namespaces. A RADOS namespace is a concept in Ceph that gives you isolation and multi-tenancy without needing to create separate pools. A pool is kind of a heavyweight entity inside Ceph, so this gives you isolation within pools.

There are some network features too, about what happens on the wire with Ceph communication. You can now have encryption on the wire, and compression on the wire as well. They do require a recent kernel, 5.11 or newer.

And then there's much more, of course; lots of fixes with each update to Rook. The admission controller, for example, is enabled by default if we find that cert-manager is available. We support Multus networking now with our latest release. And there are updated Prometheus alerts and much more; those alerts will really help you make sure your Ceph storage cluster stays healthy.

All right, now we'll turn the time over to Blaine for a demo.

Thanks, Travis. We also want to show you what it's like to run Rook on an everyday basis, so we have a demo prepared for you. The environment I'm going to use for the demo today: I'm running on OpenShift, which is running Kubernetes 1.22, on Amazon Web Services. I have three control plane nodes for Kubernetes and three worker nodes. I've chosen m5.8xlarge nodes here; this will allow us to run the final size of the Ceph cluster we're going to get to, while leaving about 50% of each node for user applications, which is just a rough estimate. I'm going to be using slow but pretty cost-effective gp2 volumes. And this is using what is, at the time I'm recording this, the latest version of Rook, 1.9.0, and the pre-release version of Ceph Quincy, version 17 of Ceph.

Before getting into the demo, I want to briefly talk about the two basic types of Rook Ceph clusters that we talk about: host-based clusters and PVC-based clusters. For host-based clusters, Rook just looks at the node itself and picks up disks to use for Ceph OSDs.
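A hedged sketch of what the storage section of a host-based CephCluster spec might look like, in both the simple form described in a moment (use all nodes, use all devices) and the more explicit per-node form; the node and device names are illustrative.

```yaml
# storage section of a host-based CephCluster spec
storage:
  useAllNodes: true          # simple case: every schedulable node...
  useAllDevices: true        # ...and every empty device Rook discovers
  # The per-node form, for heterogeneous hardware, is more verbose:
  # useAllNodes: false
  # nodes:
  #   - name: worker1            # must match the node's kubernetes.io/hostname
  #     devices:
  #       - name: nvme0n1        # illustrative device name
  #   - name: worker2
  #     deviceFilter: ^sd[b-d]   # regex selecting which devices to consume
```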
In a PVC-based cluster, we instead instruct Rook to use PVCs. These might be dynamically provisioned, like from gp2 today, or they might be local persistent volumes that you've created yourself, representing physical disks on the hardware but claimed via Kubernetes-native mechanisms.

The host-based cluster is suitable for a simple cluster, especially a proof-of-concept cluster: we can say use all nodes, use all devices, and it becomes pretty easy. This starts to get complicated when we don't use all nodes or all devices, when we're using heterogeneous hardware, or if we want to customize the device layout on a per-node basis for whatever reason. The PVC-based cluster on the surface seems a little more complicated, but we don't need to describe the hardware configuration, and it becomes pretty easy to expand: we can increase the count of disks used for Ceph OSDs, or we can increase the size of those disks by increasing the storage request size in the storage class device set.

Jumping into the demo, I want to break down what we're going to see. First, creating the Rook operator; from there, what it's like to create the Rook Ceph cluster. Then we have something we've been working on for a little while, and it's a little new for KubeCon: we're going to be using a krew plugin we've created for Rook Ceph to see some of those cluster details, and we'll use it throughout the rest of the demo when we expand the Ceph cluster's OSD size as well as its OSD count. For this demo we're also using recommended configurations for production, and we'll provide these files to you as well, so you can reference what a best-practice cluster looks like.

So first off, we talked about creating the Rook operator. This is really just as simple as creating a deployment, a pod that runs on some Kubernetes node. I've depicted it here on worker one, although it might be any available and schedulable worker in your Kubernetes cluster. Let's jump over to my terminal and see what it's like to install the operator. The first thing we want to do is install some prerequisites: the CRDs that Rook will use for taking your configuration of the cluster and add-ons, as well as the role-based access control (RBAC) that gives Rook permission to create the storage it needs. Once that's done, we can create the operator itself. Because I'm running on OpenShift, I'm using the OpenShift flavor of the operator manifest, although if you're running plain Kubernetes, operator.yaml is the one you want. And we see on the left side that it gets scheduled and starts running pretty soon after.

From here, I'm going to go ahead and start the cluster installation. You can see that Rook starts scheduling some resources already. This is going to take about 10 minutes, so while it's running, we'll jump back to the presentation. These first resources are ancillary resources, including the CSI driver, but what I want to drill down into are the core Ceph components. You can see here, spread across our worker nodes: three monitors, three OSDs, and what's called a manager. The monitors are what I like to think of as the brains of the Ceph cluster. The manager provides CLI and API services for the Ceph cluster. And the OSDs are what provide the backing storage, the nitty-gritty details.
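In this cluster, those OSDs are backed by PVCs from a storage class device set. A hedged sketch of the storage portion of the demo's CephCluster spec, sized the way the cluster starts out; the set name is arbitrary.

```yaml
storage:
  storageClassDeviceSets:
    - name: set1                       # arbitrary name for this set of OSDs
      count: 3                         # number of OSDs, one PVC each
      portable: true                   # OSDs may be rescheduled across nodes
      volumeClaimTemplates:
        - metadata:
            name: data
          spec:
            storageClassName: gp2      # cloud storage class backing the OSDs
            volumeMode: Block          # OSDs consume raw block volumes
            accessModes: ["ReadWriteOnce"]
            resources:
              requests:
                storage: 10Gi          # 3 x 10Gi = the 30GB raw seen shortly
```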
The way that this cluster is configured, the monitors should be spread across the nodes, and the OSDs should be as well.

All right. A little over 10 minutes has passed. Most of this time was spent setting up the Ceph monitors, which seems to be a result of the PVC provisioning for gp2 being a little slow. So from here, let's look at the krew plugin and how it gets used. The first thing is to install the rook-ceph krew plugin, so that's kubectl krew install rook-ceph; I've already got this installed. From here, let's check out some of the commands we have in this version. We have a basic overall Rook status command, which shows us the status of this Rook Ceph cluster. We can see that it was created successfully, and we can also see that it's in HEALTH_WARN state right now. So if I suspect something is wrong, I might want to set the Rook log level to debug to get more information out of the Rook operator logs. We can say kubectl rook-ceph operator set ROOK_LOG_LEVEL DEBUG, and the plugin will do that for us. I don't really want to run through debugging right now, but we can also run Ceph commands. We can just say kubectl rook-ceph ceph status, and this gives us the overall Ceph status. We now see the health of our cluster is OK; it just took an extra minute for the cluster to stabilize and become ready. In addition, we see that we have three OSDs that are up and ready, and we have 30 gigabytes of raw capacity. Each of our OSDs is 10 gigabytes, so this is really what we expect.

I also want to talk about a slightly more advanced Ceph command called ceph osd tree. This shows us the hierarchy of Ceph OSDs as Ceph understands it. We can see that we have OSDs running in one region, us-west-1, and within this region we have two zones, us-west-1b and us-west-1c. I have two nodes in 1b and one node in 1c, so the OSDs are spread two in 1b and one in 1c.

Now, 30 gigabytes is not really a lot of capacity. This is great for just testing things out, but at some point we definitely want to increase it. We can do that by editing our cluster manifest. Here I'm going to increase the storage size to 100 gigabytes for each of these three claims, and apply those changes. We should see that Rook begins reconciling the change, and in a few minutes we should have 300 raw gigabytes of storage. What's happening here, visually, is that we're increasing the size of the persistent volumes attached to the OSDs. Right now we're increasing them tenfold, but that could be whatever amount is right for your cluster. Jumping back to see what our cluster is doing, and skipping a few minutes ahead, we can see that our OSDs have each re-initialized. And jumping back to our krew plugin, we can get the Ceph status, and we should see, and indeed do see, that we have 300 raw gigabytes of capacity.

At some point, we will reach the IOPS limit of these gp2 volumes with scale-up; increasing them in size more won't really get us more IOPS. So we may want to scale out the cluster instead, to create more OSDs, which effectively gets us better performance along with the size increase. Again, we're going to edit the cluster manifest, and here, instead of changing the size, we're changing the count. I'm going to change this from three to six, which doubles the number of OSDs and should double the available capacity to 600 gigabytes once I apply it.
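Both expansions are small edits to the same device set sketched earlier; here's what the demo's values might look like, with comments marking what changed.

```yaml
storage:
  storageClassDeviceSets:
    - name: set1
      count: 6                         # scale out: was 3; Rook adds three OSDs
      portable: true
      volumeClaimTemplates:
        - metadata:
            name: data
          spec:
            storageClassName: gp2
            volumeMode: Block
            accessModes: ["ReadWriteOnce"]
            resources:
              requests:
                storage: 100Gi         # scale up: was 10Gi; each OSD PVC expands
# applied with something like: kubectl apply -f cluster.yaml
```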
To look at this visually, we are adding three new OSDs, shown here in blue. Rook is going to create these and try to spread them evenly across the nodes, so we should see one extra OSD running on each node. Skipping forward a few minutes until that's done, we can again go back to our krew plugin and see the status: we now have six OSDs and 600 gigabytes available. This is also where I'll come back to our osd tree command. We can see all six OSDs, and now there are four OSDs in the zone with two nodes and two OSDs in the zone with one node.

I do want to make one small note. With this scale-out, I talked about potentially doing it for performance reasons. Another option is to create a new storage class device set rather than expanding an existing one. With a new storage class device set, I might use different, faster storage, like io2 instead of gp2, and I could provide that as a backing pool of faster storage for some user applications, if I have different users with different storage speed needs. Obviously the faster storage is going to cost me more, so I'd probably want a little bit less of it. Thank you so much. I'll pass it back to Travis to wrap things up.

All right. Thanks, Blaine, for that demo. Again, I'll refer you back to all the resources we have for getting started with Rook: the website and the docs. Please join our Slack, it's a great place for asking questions, or go to our GitHub. Join our community meeting if you want to talk to us over a call. And again, check out our training videos to get started. Thanks for joining.