Hello, welcome to the talk. We'll do an intro and a deep dive into what Rook is, but basically Rook is a Kubernetes storage platform. I'm Travis Nielsen, from Red Hat, and my colleague Satoru, from Cybozu, will also be speaking with us. Today I want to talk about, first of all, the Kubernetes storage challenges that everyone has. Then we'll get into what Rook is, what Ceph is, and how they work together. Then we'll get into some features: what features Rook has always had, and what features are in our latest release, 1.8. Satoru will then show a demo of what it's like to use Rook, and at the end we'll be happy to take your questions. During this presentation you can type your questions in the chat, or you can save them for the end.

So what are your Kubernetes storage challenges? Everyone who has Kubernetes really needs applications that store things. Kubernetes is a platform to manage distributed applications, but these applications are traditionally stateless: they don't store anything themselves, so you're relying on some external storage to back your applications. If your applications have storage that's outside Kubernetes, it's difficult to make them portable, and it's a burden to deploy because the storage may be different everywhere you deploy. And then, at the end of the day, who is managing the storage? Do you rely on a cloud provider that manages that storage? If so, you may run into vendor lock-in challenges, where you don't want to be tied to any particular storage platform.

Here's where Rook came in. We were looking at Kubernetes and we thought: how do we bring storage to Kubernetes in a way that's natively built into Kubernetes and works well with it? Rook creates and provides a data platform inside the same Kubernetes cluster where you're running your other applications. You then consume the Rook storage just like any other storage, using storage classes and PVCs. Rook is implemented as a Kubernetes operator with custom resource definitions. Using the custom resource definitions, you tell Rook how you want the storage configured, but at the end of the day the power that Rook provides is automating the storage platform: deploying it for you, configuring it for you, and managing upgrades for you, so you don't have to know all the details of how the storage platform works. Rook is open source, under the Apache 2.0 license, so it's very flexible to deploy in your environments.

I just want to stop for a moment and say happy fifth birthday to Rook. At KubeCon Seattle five years ago, we first went public and open sourced Rook. We're excited for all the community support and the growth we've had over the past five years, helping Rook become production ready and deployed in many production environments.

Let's get into what Ceph is. Ceph is the storage platform that Rook deploys. Ceph is also open source, and it provides three types of storage: block storage, shared file system, and object storage, which is S3 compatible. Ceph favors data consistency, so you never have to worry about whether your data is safe. Ceph has been around for almost ten years now in production environments; it was first released in July of 2012, and you can refer to the website ceph.io for more information.
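To make that "just like any other storage" point concrete, here is a minimal sketch of how an application consumes Rook storage. The storage class name rook-ceph-block is the one used in Rook's example manifests; adjust it to whatever your cluster defines.

```yaml
# Sketch: a PVC backed by Rook-provisioned block storage.
# Assumes a "rook-ceph-block" StorageClass as in Rook's examples.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: rook-ceph-block
```

A pod then references app-data in its volumes section just as it would any other PVC; nothing about the pod spec itself is Rook-specific.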
Architecturally, Rook really has three layers to consider. The top layer is the management layer, which Rook owns: Rook deploys the operator and manages the storage platform. Rook also deploys a CSI driver, the Ceph CSI driver, which is the second layer; the CSI driver dynamically provisions and then mounts the storage to your application pods. The third layer is Ceph itself, the actual data layer. Any time your pod reads or writes data, it goes directly to Ceph for that storage; Rook is not involved in the data layer. Rook is really only the management layer.

I've got some diagrams now that will help us see these different layers and how they're separated. First, the Rook management layer. Rook owns the deployment of all of the pods, services, endpoints, and other resources that make up the Ceph cluster and provide that data platform. This is a diagram of pods that might be running on three different nodes. The pods in blue are the Rook pods, the Rook operator and the Rook discovery daemon, which provide the management. The green pods are the CSI driver: the CSI provisioner, the RBD plugin, which is for block storage, and the CephFS plugin, which is for the shared file system storage. Each of the red pods is a Ceph daemon; the Ceph pods are the ones that actually provide the data layer. There are a number of different daemons that Ceph runs in that layer, the mons, the OSDs, the manager, and so on. We won't get into exactly what all of those Ceph daemons are, but here we see that Rook is managing the deployment of these pods and deciding how to configure the Ceph daemons.

That brings us to the second layer. After Rook and Ceph are deployed, how do you attach your application pods to consume that storage? You first need to provision it. Your application has a volume claim, a PVC, and that PVC consumes storage from a storage class, either a Ceph RBD storage class for block storage or a CephFS storage class for shared storage. The CSI driver provisions the storage for that block or file system volume. On the right-hand side, with the purple boxes, we see object storage: an S3 interface provides access to object storage, and we provision it in a pattern similar to PVCs, with what we call a bucket claim. An object bucket is backed by storage behind an S3 endpoint, and that endpoint comes from a storage class which declares where in Ceph to configure that storage.

At the third layer, after the storage is provisioned, how does the application write to the data layer? For block storage, once it's mounted, the RBD kernel driver writes to the Ceph cluster. If you have a shared file system, the CephFS kernel driver writes to the Ceph cluster and manages the communication between all of those different Ceph pods. Or, if you have an S3 client, the client connects to the Ceph RGW daemon to read and write the object storage in the buckets. You'll see in this diagram that when your applications are reading and writing data, Rook is not in that path at all; Rook is only in the management layer, when you set up or provision the storage.
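Since the bucket-claim pattern may be less familiar than PVCs, here is a sketch of an ObjectBucketClaim. It assumes an object-store StorageClass named rook-ceph-bucket, as in Rook's example manifests.

```yaml
# Sketch: requesting an S3 bucket the same way you'd request a PVC.
# Assumes a "rook-ceph-bucket" StorageClass backed by a Ceph object store.
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: my-bucket
spec:
  generateBucketName: my-bucket   # Rook appends a suffix to keep bucket names unique
  storageClassName: rook-ceph-bucket
```

When the claim is fulfilled, Rook creates a ConfigMap and a Secret of the same name, holding the S3 endpoint, bucket name, and access keys, which the application can consume as environment variables.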
Let's get into some of the key features that Rook has had from the start, or for a long time. First of all, deploying Ceph is simple with Rook; we've tried to make it as easy as we can, and that was one of the goals of the Rook project from the start. We provide several manifests for installing Ceph in a default configuration, and you can change those settings for more complex configurations, but at the end of the day we've made Ceph simple to install. Here we show the few kubectl commands you need if you're creating these manifests directly, and we also have two Helm charts that will help you install the operator and create the Ceph cluster.

We have a CSI driver; as already mentioned, it dynamically provisions the file and block storage. It allows you to expand volumes, and it also implements snapshots and cloning, so you can have some backup functionality.

So what environments can you run Rook in? Rook is primarily run in production in two types of environments. First, bare metal environments, where you have your own hardware or your own virtual environments and you install Rook yourself. This is the main scenario where we expected Rook to be used: where there is no cloud provider storage to back your applications. The second common environment is cloud providers. Even though cloud providers have storage you can build on, Rook with Ceph has several benefits that overcome some of the limitations cloud providers have. In a cloud environment, Rook allows you to have storage that spans availability zones. You can have faster failover times, seconds instead of minutes, for your pods to fail over. You can have a greater number of PVs per node than the cloud provider's attach limit, which is often capped around 30. You can use storage with a better performance-to-cost ratio. And Rook really gives you a consistent platform no matter where you're installing, whether you're deploying in AWS or Azure or Google Cloud or any other environment; it's a consistent storage platform everywhere. When you're running in these cloud environments, Ceph uses PVCs as its underlying storage, so you can connect Ceph to the underlying storage platform in a consistent way, just like your other Kubernetes applications. There's no need for direct access to local devices, or to mount them in advance, when you're running in the cloud; it's all dynamically provisioned.

Another aspect of Rook is that you can configure it for any cluster topology. If you have just a flat level of nodes, Ceph will divide the storage across the nodes to keep the data highly available and durable. Or if you have multiple racks or zones or any other complex topology, Ceph and Rook will work well in those environments to spread your data across the appropriate failure domains, so you can rely on your data surviving a domain failure.

Now, when it's time to update Rook or Ceph, Rook handles everything: you tell Rook what version you want to run, and Rook will upgrade all of the Rook components, both the Rook operator and the Rook daemons. Rook will also update the Ceph daemons, all of the mons and OSDs, in a rolling fashion, so that your storage remains active, available, and online during the upgrades.
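As a sketch of what that upgrade experience looks like: the Ceph version is just a field on the CephCluster custom resource, and changing the image tag is what triggers the rolling update. The exact tag here is illustrative.

```yaml
# Excerpt of a CephCluster CR; bumping this image tag asks Rook to
# perform a rolling upgrade of all Ceph daemons (mons, OSDs, mgr, ...).
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: quay.io/ceph/ceph:v16.2.7   # illustrative tag; pick the release you want
```

Upgrading Rook itself is similar: you update the operator deployment to the new Rook image, and the operator reconciles the rest.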
Another feature: let's say you already have a Ceph cluster running outside of Kubernetes. Using the CSI driver and the Rook operator, Rook makes it simple to connect your CSI driver to that external Ceph cluster. There's no need to run the Ceph cluster inside Kubernetes if you want to connect to your existing Ceph cluster.

Now, if you want access to object storage with an S3 endpoint, you can provision object buckets. This is done with an object bucket claim, a pattern very similar to using a PVC: Rook creates a bucket when requested, gives the application access to the bucket with a secret, and your buckets are provisioned dynamically. This is very similar to the new Container Object Storage Interface, or COSI, which is coming soon, and we'll be implementing that in Rook as well when COSI is available.

So let's talk about some Rook 1.8 features that are in the latest release. Rook 1.8 was just released, a few days before this KubeCon China. First, object bucket notifications. Notifications can be sent to a configurable endpoint, triggered by a put, get, copy, or delete of objects in the S3 store. You can specify filters for those notifications, whether on object name prefixes and suffixes or regular expressions, or you can filter on the metadata of those objects. And you can send them to an endpoint such as HTTP, Kafka, or AMQP.

Another feature is disaster recovery for applications. DR is very important where applications need to span across Kubernetes clusters: let's say you have a whole site go down; your application needs to continue running in the other site. The Ceph CSI driver supports mirroring the data with Ceph across the clusters, and we provide tools and documentation to support your applications in that failover.

There are also a couple of network enhancements in the latest release. First, encrypted connections across the wire: any data transferred over the network will be encrypted with Ceph's msgr2 protocol. Encryption at rest has already been available for a number of releases. You can also compress the data across the network, to reduce the throughput required over the wire if you have compressible data; this does require an experimental version of Ceph, Ceph Quincy, which will be coming out in early 2022. And Multus networking: Multus is a network plugin that provides more flexibility for different connections, and Rook fully supports Multus in 1.8. The CSI driver primarily needed an update to complete that Multus support.
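To show roughly what that looks like, here is a sketch of the network section of a CephCluster CR with Multus enabled. The NetworkAttachmentDefinition names are hypothetical and would point at attachments you've defined for your environment.

```yaml
# Excerpt of a CephCluster CR using Multus for the Ceph networks.
# "public-net" and "cluster-net" are hypothetical NetworkAttachmentDefinitions.
spec:
  network:
    provider: multus
    selectors:
      public: rook-ceph/public-net     # client-facing Ceph traffic
      cluster: rook-ceph/cluster-net   # OSD replication traffic
```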
A little tool that we've released is a krew plugin; perhaps you've seen other krew plugins. Krew is for making kubectl extensible with new commands. So, for example, you can say kubectl rook-ceph ceph status, and you'll see the Ceph status as if you were running it in the Rook toolbox, but it's much simpler and shorter to type. That's just a convenience tool, but we're really excited about it.

Now, a few important notes on 1.8 if you're already a Rook user and are considering upgrading. We require Kubernetes 1.16 or newer; older versions are not supported, due to old CRD versions. The Ceph versions supported are Octopus and Pacific, with Quincy experimental, coming out early next year; support for Ceph Nautilus was removed. And finally, if you've been using Rook for a really long time, from before the CSI driver: the Flex driver support was removed in 1.8, and we have a tool that will help you convert your volumes to CSI. And now we turn the time over to Satoru for a demo.

I have a demo of creating a simple Rook Ceph cluster, using these two pieces of software. There are two types of Rook Ceph clusters: the first is a host-based cluster, and the second is a PVC-based cluster. A host-based cluster is suitable for a simple cluster, especially if you use all nodes and all devices for the Rook Ceph cluster. But the CephCluster resource gets complicated if not all nodes and devices are used; at worst, you would have to list all the nodes and all the devices used for the cluster. With a PVC-based cluster, you are free from describing hardware configurations like this, and you only need to specify two fields: the count field and the volumeClaimTemplates field. The count field is the number of OSDs, and the volumeClaimTemplates field is a template for the PVCs used for the OSDs. A PVC-based cluster is very easy to expand: you just need to increase the count field. If you increase this field from one to two, the number of OSDs is also increased to two.

So let's create a simple PVC-based cluster. I'll use one Kubernetes cluster consisting of one node. This node has two local empty block devices and the corresponding persistent volumes. The demo consists of three steps, and you can get all the scripts and manifests from this project.

Step one is to create the Rook operator: kubectl apply -f operator.yaml. The operator is created; with kubectl -n rook-ceph get pod we can see the Rook operator pod is already running.

The second step is to create the Rook Ceph cluster. The manifest is cluster-on-pvc.yaml. It's a CephCluster custom resource, and it says the number of monitor pods is one and the number of OSDs is also one. Let's apply this manifest and watch the progress of creating the cluster with kubectl -n rook-ceph get pod. The Rook operator already exists, along with the pods for the Ceph CSI drivers. Now the Ceph monitor pod is running; the next step creates a manager pod, and then the rook-ceph-osd-prepare pod, which initializes the data structures on top of the local block device. Finally, the rook-ceph-osd pod is created, which manages the OSD.

With kubectl -n rook-ceph get pvc we can see that this persistent volume claim was created by Rook; it's bound to a persistent volume corresponding to one of the local block devices, and the PVC is consumed by the OSD pod.
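For reference, here is roughly what the storage section of that PVC-based CephCluster CR looks like; the storage class name and size are placeholders for your environment. The expansion in the next step is just a matter of raising the count field.

```yaml
# Excerpt of a PVC-based CephCluster CR (storageClassDeviceSets).
# "local-storage" and the 10Gi size are illustrative placeholders.
spec:
  storage:
    storageClassDeviceSets:
      - name: set0
        count: 1                      # number of OSDs; raise this to expand
        volumeClaimTemplates:
          - metadata:
              name: data
            spec:
              accessModes:
                - ReadWriteOnce
              resources:
                requests:
                  storage: 10Gi
              storageClassName: local-storage
              volumeMode: Block       # OSDs consume raw block PVCs
```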
Now let's confirm the status of the Ceph cluster. I apply another pod, the Rook toolbox, and exec into rook-ceph-tools; the ceph -s command shows the status of the Ceph cluster. Okay, the Ceph cluster has been created: the number of monitors is one, the number of managers is one, and there is one OSD.

The third and last step is to expand this cluster. It's very easy, as I said before: just edit the CephCluster custom resource. Here is the count field, which means the number of OSDs; let's increase it from one to two. Now let's confirm with kubectl -n rook-ceph get pod. A second rook-ceph-osd-prepare pod will be created, so wait for a while. Okay, the OSD prepare pod is created and running, building the second OSD's data structures, and now the second OSD pod is created and running. Let's run ceph -s again: the number of OSDs is now two, which means the cluster was expanded correctly.

There are advanced configurations for PVC-based clusters. The first is creating persistent volumes for OSDs on demand: in this demo I prepared the two persistent volumes beforehand, but if you use a CSI driver with dynamic volume provisioning, you can omit that step. The second configuration is spreading OSDs evenly among all nodes; for that, you can use the topology spread constraints feature in Kubernetes. If you're interested in these configurations, please refer to this blog post.

All right, thank you Satoru for that demo, and thank you all for joining us today to learn about Rook. Here are a few links to our website and our documentation. Come join our Slack and ask questions; we'd like to build the community in an open way. We do take contributions, and we're excited to have you join the community. Now we'd like to take your questions.