Hello and welcome to the Rook intro and Ceph deep dive. My name is Blaine, and I'll be joined today by my colleague Satoru. Also keep your eyes open for presentations by other Rook maintainers, Travis and Sebastian.

I want to start with the goals of our talk today: a background on Kubernetes's storage challenges, what Rook is, the background of Rook with Ceph, and the key features of Rook and Ceph. We also released Rook 1.7 recently, so I'll be talking about some of the new features there. Satoru will give you a demo, and then we'll conclude with the Q&A.

Starting with background: what are the challenges that Rook hopes to solve? In Kubernetes, we have a platform that is used to manage distributed applications. These are ideally stateless, but in practice that's pretty rare; something somewhere requires storage. If we rely on external storage, it is not portable, deployment can be a burden, and for day-two operations we need someone to manage it. It may therefore make sense to go to a cloud provider's managed service, but then we may be faced with vendor lock-in. Rook aims to help solve these issues.

But what is Rook, and how does it help solve these issues? Rook makes storage available inside of your Kubernetes cluster. You consume this storage like any other Kubernetes storage: you make a storage class for it, and then users create persistent volume claims to get access to that storage for their applications (a quick sketch of that follows in a moment). Rook is an operator. It has many operators available, but generally there will be one operator running at a time, along with custom resource definitions to define that storage and set its parameters. Rook provides automated management, including deployment, configuration, and upgrades of the storage, and it is fully open source and available to anyone who wants it.

Rook has three active storage providers right now. Stable, we have Ceph, which is the upcoming deep dive. We also have NFS and Cassandra in the alpha phase. With some recent changes, we're now able to release all of these storage providers independently, so they don't all have to be released at the same time; if some features still need to bake a little for NFS, for example, there's that time and freedom available.

Also, some exciting news: Rook's fifth birthday is coming up. Rook went public with version 0.1 in November of 2016 at KubeCon Seattle, and in those five years we've had 110 more releases. So much of this is due to community support, feedback, and pull requests, and we really can't thank everyone enough who has come in and contributed to the project.
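As a quick sketch of that consumption model: assuming a Rook-managed block storage class named rook-ceph-block (the name used in Rook's example manifests), requesting storage is just an ordinary persistent volume claim.

```yaml
# A minimal sketch of consuming Rook storage, assuming a storage class
# named "rook-ceph-block" backed by a Rook-managed Ceph block pool.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-app-data
spec:
  accessModes:
    - ReadWriteOnce           # a block volume is mounted by one node at a time
  resources:
    requests:
      storage: 5Gi            # size of the volume Ceph will provision
  storageClassName: rook-ceph-block
```

The application pod then mounts my-app-data like any other claim; nothing in the pod spec is Rook-specific.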
Now, getting into some background for the Ceph deep dive, I want to cover what Ceph is, what the architectural layers of Rook are, and how it works with Ceph.

Well, Ceph is our cephalopod-themed storage service. There are a lot of words on this slide, but the TL;DR is that Ceph keeps your data safe through scale. It provides the three most common storage types that you might want: block, shared file system, and S3-compliant object storage.

For the architectural layers, we have Rook, Ceph CSI, and Ceph. Starting with the Rook layer: it owns the deployment and management of everything else, which is Ceph and Ceph CSI. Ceph CSI itself is the thing that dynamically provisions storage in Ceph and then mounts it into user application pods. And with Ceph we have our data layer, which does all the data protection and data movement; it is really most of the brains of the operation.

I don't want to talk about this slide too deeply. The real takeaway is that Rook is just this operator that you see at the top, in the middle, in blue. All of the stuff in red is Ceph, and there are a lot of components of Ceph running at any given time. Even Ceph CSI, which is a lot simpler, still has a lot of components running at once; these are what you see in green here. Rook also has some helper daemons of its own, such as Rook discovery as presented here, and the Rook operator manages deployment of those as well. Again, Rook manages all of this; the Rook operator is really the only thing that the user needs to create.

With CSI, this is the driver that actually creates the storage inside Ceph and then connects that storage into user application pods. I also won't go too deep into everything you see here. The important takeaway, expanding a little beyond the CSI view from before, is that neither Rook nor Ceph CSI is in the data path. Once CSI creates the storage and mounts it into an application pod, that application connects directly to a kernel driver, which in turn is connected to the Ceph cluster.

I know that was quite a lot, a slow ramp-up to talking about some of the key features that I want to highlight in Rook and Ceph. These are the features that are noteworthy and most commonly useful to people.

I'll start with installation, which we've tried to make as simple as possible. Really, there are four steps to setting up a Ceph cluster, and the first three could as easily be done on one line as on three different lines; very rarely do these need any modification at all. The creation of the CephCluster resource that we have in Rook is the only thing that most people really need to customize to get things started, and at its simplest it can be the 13 lines of YAML that you see off to the right.
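Those steps can look roughly like the sketch below; the file names and the Ceph image tag follow the examples shipped in the Rook repository and may differ for your release.

```yaml
# Steps 1-3: create the CRDs, common resources, and the operator
# (one line or three, as noted above):
#   kubectl create -f crds.yaml -f common.yaml -f operator.yaml
# Step 4: create the CephCluster resource. At its simplest it can be
# roughly the 13 lines below.
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: quay.io/ceph/ceph:v16.2.6  # the Ceph release to deploy
  dataDirHostPath: /var/lib/rook      # host path where mons keep state
  mon:
    count: 3                          # three monitors for a healthy quorum
  storage:
    useAllNodes: true                 # place OSDs on every node...
    useAllDevices: true               # ...using every empty device found
```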
We also would have precious little without our Ceph CSI driver. This is the thing that makes all of our hard work available to the user when they want block or file storage. The key features of Ceph CSI are volume expansion (if you make a volume five gigabytes and you need 20 gigabytes later, that's possible) as well as snapshots and clones. Those are still in beta, but they are rapidly becoming more and more stable.

Also, a note: we are very soon deprecating the old, old, old FlexVolume driver. As part of this deprecation, we're working on a tool to migrate those FlexVolumes, our persistent volumes created with the FlexVolume driver, to CSI, or alternately to migrate in-tree driver volumes to CSI, so that users don't have to migrate things manually.

Rook also runs in really any environment you could want. These commonly break down into two basic umbrellas: bare metal, or inside of a cloud provider. With bare metal, it may be that you have your own hardware, or you may have a shared hardware situation in a data center. We commonly get asked, especially by people used to running Kubernetes on bare metal, why you would want to run Rook in a cloud provider, and the responses here I find quite fascinating. Cloud providers do actually have several shortcomings for a lot of users. Storage types are not always available across availability zones; one availability zone may have object storage where another does not, for example. Some cloud provider storage also takes a long time to fail over, where Rook can very often fail over in seconds versus minutes. Some cloud providers also limit the number of PVs you can have per node to 30, which is sort of pathetically small for a lot of users. And it may simply be that users want to save money: they can still use a cloud provider, and even cloud provider storage, but use something with a better cost-to-performance ratio and aggregate it together into something usable with Rook.

The bottom line, especially for multi-cloud situations, is that a consistent storage platform like the one Rook provides can be used anywhere Kubernetes is deployed. Again, the interface is just persistent volume claims; there's nothing extra or fancy, just make a claim for storage. And in a cloud environment there's no need for direct access to local devices, since in many cases we use the cloud provider's PVCs as the underlying storage.

Also, expanding on environments here: it's possible to configure Rook for any cluster topology. This is customizable across failure domains or within failure domains, toward the end of providing highly available and durable storage, and it is achieved by spreading Ceph daemons and data across failure domains as much as possible. A lot of users also want to keep their application pods separate from their storage pods, with a subset of nodes for storage and a subset of nodes for applications, and that's possible pretty easily using node affinities and taints and tolerations, as the sketch below shows.
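A sketch of what that separation can look like in the CephCluster spec, assuming you have labeled your storage nodes with role=storage-node and tainted them with a matching storage-node taint (both names are illustrative):

```yaml
# Excerpt of a CephCluster spec that pins all Ceph pods to dedicated
# storage nodes and lets them tolerate the taint applied there.
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  placement:
    all:                              # applies to every Ceph daemon Rook runs
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: role               # illustrative node label
              operator: In
              values:
              - storage-node
      tolerations:
      - key: storage-node             # illustrative taint key
        operator: Exists
```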
Also key with Rook is that updates are automated. I mentioned this a little bit before, and we can think of it in two parts: there are upgrades to Ceph, and there are upgrades to Rook. With Ceph, updates and even major upgrades are fully and totally automated; Rook handles everything. I can go from a major release of Ceph, version 15 Octopus for example, to version 16 Pacific, and Rook just does it. Patch updates to Rook, going from, say, version 1.7.4 to 1.7.5 or 6 or 7, are also fully automated within Rook. But if we're talking about Rook minor upgrades, going from Rook version 1.6 to Rook version 1.7 does sometimes require manual work. This is just because sometimes we need to update the permissions that Rook has, or any number of things, or we have feature deprecations which require a small amount of manual work if users are using those features. A notable feature deprecation I just talked about is the FlexVolume driver, which is very little used anymore. Feel free to read the latest upgrade guide here at the link.

Another cool thing that we've had in Rook for a while is the ability to connect to an external Ceph cluster. Ceph has been around for a long time; a previous slide mentioned that it has been around since 2012, and many users may already have a Ceph cluster up and running that they simply want to use within Kubernetes. Rook allows connecting to that external Ceph cluster so that Kubernetes has access to it, and just as if Rook itself were running the Ceph cluster, users can still simply request the storage they want and it gets created in Kubernetes. There's no fiddling necessary, no extra administration necessary.

We also have been really interested in some of the upstream work to provision object storage buckets. Currently, what this looks like is that users or administrators can create a storage class for Ceph object storage, and then users can create an object bucket claim (OBC). This is very similar to a persistent volume claim, just for a bucket rather than a volume; I'll show a sketch of one in a moment. Whenever Rook sees that OBC come in, it creates a bucket and then gives the user access to the bucket via a Kubernetes secret. There's also a Kubernetes enhancement proposal called COSI, the Container Object Storage Interface, which we've been following pretty closely and are waiting to get into Kubernetes so we can also provide that option for users. COSI aims to be CSI, but for object storage, if I were to put it in a very short number of words.
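Here is a minimal sketch of an OBC, using the ObjectBucketClaim type that ships with Rook (the storage class name rook-ceph-bucket follows Rook's examples and is an assumption about your setup):

```yaml
# A minimal ObjectBucketClaim sketch. Rook provisions a bucket for it and
# exposes the endpoint in a ConfigMap and the S3 keys in a Secret, both
# named after the claim.
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: my-bucket
spec:
  generateBucketName: my-bucket       # Rook appends a random suffix
  storageClassName: rook-ceph-bucket  # object storage class from the admin
```

The application reads the bucket endpoint and credentials from that ConfigMap and Secret rather than anything Ceph-specific.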
Finally, we have the features from the recent Rook version 1.7 release, which was from August of 2021. The three notable updates I can highlight are these. First, stretch clusters are now stable; I'll talk about stretch clusters in more detail in a minute. Second, we protect user data whenever a Ceph cluster is being deleted: we don't allow a CephCluster to actually be deleted if any other Rook resources exist. This is to prevent an administrator from accidentally deleting user data that may still exist, maybe because they just missed it, or a user missed the deadline for backing up their data, or whatever it is. Third, we have continued to add support for mirroring file systems from one Ceph cluster to another. We now have full support for this, at least as far as the Ceph project is concerned, though we do note that this is still a newer feature in Ceph and is still considered under testing.

So, I promised I would get back to stretch clusters. Many users want to run Ceph in multiple failure domains, which allows the better disaster protection I mentioned. Most commonly this is either two zones or three zones; we really don't see more than three zones terribly often. The problem with clusters in two zones is: how do we maintain a quorum if a zone fails? By definition, a quorum has to be more than 50 percent of the daemons in it, and for Ceph those daemons are the Ceph monitors. The solution the Ceph project has come up with is to have a third zone acting as a minimal tiebreaker, and I would encourage you, if you're interested in these scenarios, to look at the Ceph documentation about this. But I will also briefly talk about what this looks like visually.

We have two primary zones, which in our example is where all of our user applications live. This is also where all of our Ceph storage lives, and where most of the Ceph monitors are going to live. We have the third zone on the right, which holds a monitor that exists as a tiebreaker, just for the case of a whole zone going down, so that we can still have a majority quorum. That's what it looks like during normal operation, but what this design really cares about is failure. So imagine that one of your primary zones goes offline. Maybe you live in Houston, Texas, and there's a hurricane, and now your data center is under three feet of water. True story: this actually happened to me. With this tiebreaker zone, there is still a majority quorum of three out of five monitors, and Ceph has protected all of your data: the OSDs in zone two still have all of the data that exists, and it's still available. Kubernetes, because it's Kubernetes, will reschedule any applications from zone one into zone two once it realizes they've failed, and after some surprisingly short amount of downtime, ideally less than 15 minutes, all of your applications are running again. Most people in the outside world are not even aware; they don't have to know that a data center in Houston just got flooded. A sketch of how this stretch layout is configured follows below.
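Here is a sketch of how the stretch layout can be expressed in the CephCluster spec, modeled on the Rook documentation (the zone names are placeholders, and the failure domain label assumes your nodes carry the standard Kubernetes zone topology label):

```yaml
# Excerpt of a stretch cluster spec: two data zones plus an arbiter zone
# holding only the tiebreaker monitor.
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  mon:
    count: 5                          # two mons per data zone + one tiebreaker
    allowMultiplePerNode: false
    stretchCluster:
      failureDomainLabel: topology.kubernetes.io/zone
      zones:
      - name: zone-a
        arbiter: true                 # the minimal tiebreaker zone
      - name: zone-b
      - name: zone-c
```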
Okay, I think I've talked long enough for now. I'm going to pass to my colleague Satoru, who's going to give you a demo of what it's like to install Rook.

Thank you, Blaine. I'll give a demo creating a simple Rook Ceph cluster, using these two pieces of software. There are two types of Rook Ceph clusters: the first one is a host-based cluster, and the second one is a PVC-based cluster. A host-based cluster is suitable for a simple cluster, especially if you use all nodes and all devices for the Rook Ceph cluster, but the CephCluster resource gets complicated if not all nodes and devices are used; at worst, you have to list all the nodes and all the devices used for the cluster. As for a PVC-based cluster, you are free from describing hardware configurations like this; you specify only two fields, the count field and the volumeClaimTemplates field. count means the number of OSDs, and volumeClaimTemplates is used as a template for the PVCs used by the OSDs. A PVC-based cluster is also very easy to expand: you just need to increase the count field. If you increase this field from one to two, the number of OSDs is also increased to two.

So let's create a simple PVC-based cluster. I use one Kubernetes cluster consisting of one node. This node has two local empty block devices and corresponding persistent volumes. The demo consists of three steps, and you can get all the scripts and manifests from this project.

Step one is to create the Rook operator: kubectl apply -f operator.yaml. The operator is created, and in the rook-ceph namespace, kubectl get pod shows the rook-ceph-operator pod is already running.

Step two is to create a Rook Ceph cluster. The manifest is in cluster-on-pvc.yaml; it's a CephCluster custom resource, and it says the number of monitor pods is one and the number of OSDs is also one. So let's apply this manifest and watch the progress with kubectl get pod. The rook-ceph-operator pod already exists, and these new pods are for the Ceph CSI drivers. Now the Ceph monitor pod is running; the next step is the manager pod; and then the Rook Ceph OSD prepare pod, which initializes the data structures on top of the local block device. Finally, the Rook Ceph OSD pod is created, which manages the OSD. Running kubectl get pvc, we can see a persistent volume claim that was created by Rook; it is bound to the persistent volume corresponding to one of the local block devices, and this PVC is consumed by the OSD pod.

Let's confirm the status of our Ceph cluster from the toolbox pod. The ceph -s command shows the status of the Ceph cluster: the cluster was actually created, the number of monitors is one, the number of managers is one, and there is one OSD.

Step three, the last step, is to expand this cluster. As I said before, it's very easy: just edit the CephCluster resource. count: 1 here means the number of OSDs, so let's increase it to two and confirm with kubectl get pod. The second OSD prepare pod will be created soon, so wait for a while... Okay, the OSD prepare pod is created and running, creating the second OSD's data structures. And now the second OSD pod is created and running. Let's run ceph -s again: the number of OSDs is now two, which means the cluster was expanded correctly.
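For reference, the PVC-based cluster from this demo looks roughly like the sketch below, modeled on Rook's cluster-on-pvc.yaml example (treat local-storage as a placeholder for whatever storage class your persistent volumes use):

```yaml
# PVC-based cluster sketch: OSDs consume PVCs instead of raw host devices.
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: quay.io/ceph/ceph:v16.2.6
  mon:
    count: 1                          # a single monitor, as in the demo
  storage:
    storageClassDeviceSets:
    - name: set1
      count: 1                        # number of OSDs; raise to 2 to expand
      volumeClaimTemplates:
      - metadata:
          name: data
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 10Gi           # size of each OSD's backing volume
          storageClassName: local-storage  # placeholder storage class
          volumeMode: Block           # OSDs want raw block volumes
```

Changing count from 1 to 2 and reapplying is the whole expansion step shown in the demo.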
There are advanced configurations for PVC-based clusters. The first is creating persistent volumes for OSDs on demand: in this demo I prepared the two persistent volumes beforehand, but if you use a CSI driver with dynamic volume provisioning, you can omit that work. The second is spreading OSDs evenly among all nodes, which you can do with the topology spread constraints feature in Kubernetes. If you are interested in these configurations, please refer to this blog post.

Thank you, Satoru, for the demo, and thank you all for watching and for your interest in Rook. I'm going to leave this last slide here, with some links to our website and documentation and anything else you might be interested in about the Rook project, and I'm going to attempt to leave these open while we go to the Q&A portion of our presentation.