All right. So after some technical difficulties, we have this presentation ready to go: hyper-converged persistent storage for containers with GlusterFS. Please welcome our speakers, Jose and Mohamed Ashiq. Jose is a software engineer on the Red Hat storage team. So please welcome them for the talk.

All right. For the record, I'll have you all know I had a very bad morning and I only got here about five minutes before this presentation, so bear with me. As mentioned, my name is Jose, and this is Ashiq over here. We're presenting a relatively new storage solution that will allow you to do hyper-converged storage natively on your cluster nodes with GlusterFS. All right, next. Ashiq, say a little bit about yourself.

Hi, everyone. I work on Gluster containerization and on the hyper-convergence of Gluster on OpenShift and Kubernetes, so whatever components are required for hyper-convergence, that is where I contribute.

And as mentioned, I'm Jose. I'm a member of the Samba team and of the Kubernetes Storage SIG, and I focus mostly on the integration code that brings all of this together into an easily deployable scenario. So we're going to go over why we're doing this and what we're using for it, then a rundown of what work it took to get it all to this point, in particular the containerization of GlusterFS, and then, hopefully, since I just got here, show you a demo of it working in action.

Okay, so why are we doing this? The basic setup is that containers are ephemeral and a lot of applications are not. A lot of applications want persistent storage that lasts longer than the life cycle of a typical container: if the application goes down for some reason, it wants to be able to come back up and have the data it stored still be there. Given the nature of the work we're doing, we were looking to provide something that would require a minimal hardware investment, so that you wouldn't have to buy any sort of external hardware appliance to do storage for your containers; something that is as simple and transparent as possible for both administrators and users who want to make use of the storage, as much as storage can be, anyway; and something that is free and open source, which is why we're here. Our target platforms are Kubernetes and OpenShift, because Red Hat, and in particular we see them as leading adoption of containers in general, in both open source and corporate environments. And our component projects are GlusterFS and Heketi. Lovely little logos there.
GlusterFS is a distributed, software-defined file system. Units of storage, called bricks, are selected on one or more nodes in a Gluster cluster and combined to form logical storage volumes. One of the biggest things about Gluster is that, since it's software-defined, it can run on commodity hardware, even a Raspberry Pi (there's a link in the slides), and it has a scale-out design, so it is easy to increase your storage simply by adding more bricks or more nodes. Gluster brings all of those together into coherent storage volumes and expands them as needed. It also provides good distributed features like cross-node and cross-site replication, usage balancing, and, as of recently, an iSCSI storage access method.

Heketi is the RESTful volume management interface for GlusterFS, meaning, of course, that you can use any REST client, even something as simple as curl on the command line, to do most of the necessary volume management operations for a Gluster volume. In particular, this allows it to be done programmatically by applications. It can manage multiple Gluster clusters from a single instance, and it was designed from the get-go to be lightweight, reliable, and simple. Simple being the key word here: it really is that easy to use.

And then, finally, we have a little project on GitHub called gluster-kubernetes that brings all these various projects and pieces together. We document how to put it all together yourself, we provide a quick-start guide so that you can bring it up very quickly, either in your own environment or in a virtual environment that we provide, and it has an easy-to-use deployment tool, gk-deploy. That's where most of my work went.

All right, creating hyperconvergence. Take it away, Ashiq.

So, as Jose told you, I work on Gluster containers and the hyperconvergence of Gluster on OpenShift and Kubernetes. The first task when we took up this solution was containerizing Gluster: we needed to make this user-space network file system run inside a container. That was our first task, and it now runs really well inside a container. This is the complete Docker command that is used to run the container. We have two base images maintained upstream, a CentOS Gluster image and a Fedora Gluster image, and we get a lot of pull requests and issues there which we are actively working on; it is really good to see the community coming together around it.

So what does this command actually do? First, for configuration, we need all the Gluster configuration mounted from the host into the container: /etc/glusterfs, /var/lib/glusterd, and /var/log/glusterfs are all the configuration and state Gluster requires to run. These are provided from the host, we give the complete network stack of the host to the container, we use a systemd container, and it runs in privileged mode.

Let me split this up. This is how it looks now: it is a systemd container in privileged mode which bind-mounts the Gluster configuration from the host, uses the host's network stack, bind-mounts /dev, and has a startup script; I'll talk about the startup script in a bit. Why did we want systemd inside the container? Gluster has a lot of processes running along with it, and we wanted something there to clean up those processes after they complete.
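For readers following along, the kind of invocation being described would look roughly like the sketch below. This follows the upstream CentOS Gluster container image as best I can reconstruct it; the image name and exact flags are assumptions, not the literal command from the slide.

```sh
# Sketch of the described docker run: privileged systemd container,
# host networking, Gluster config/state and /dev bind-mounted from the host.
docker run -d --name gluster \
    --privileged \
    --net=host \
    -v /etc/glusterfs:/etc/glusterfs:z \
    -v /var/lib/glusterd:/var/lib/glusterd:z \
    -v /var/log/glusterfs:/var/log/glusterfs:z \
    -v /dev:/dev \
    -v /sys/fs/cgroup:/sys/fs/cgroup:ro \
    gluster/gluster-centos
```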
So we started with systemd containers, with the help of Dan Walsh, and I think most of you know him. With that we needed privileges: previously, to run a systemd container, you needed a privileged container, because without privileges systemd was not able to run properly inside the container. Now it can run unprivileged in a container with the OCI systemd hook on the host. We still need privileged mode, though, because we are storage providers and we need the host's /dev inside the container to create the logical volumes.

These are the three directories Gluster requires to maintain its configuration. We didn't want the Gluster configuration to change when a container respawn happens, so we bind-mounted these directories from the host into the container. Why a bind mount? Because the configuration is tied to that host anyway: the devices are only located on that host, so a bind mount was a good choice, since we can't move a Gluster configuration from one node to another when the devices are attached to that node.

We also chose host networking because, from Gluster's point of view, the IP should not change. We could have done it by giving the container a constant hostname, but we thought it would perform better with host networking, so we gave the host's network stack to the container.

And /dev, as we all know, is where all the storage devices show up. We need the host's /dev inside the container so that all the devices Gluster needs to create volumes are available from the host's /dev. We have it running perfectly fine now; we had a few issues with udev, which we have solved in our containers.

Then the startup script. The startup script is really important when you are containerizing something this big, a system-level kind of application. What the startup script does is make Gluster start the way it starts in an RPM installation: the configuration that is needed gets copied back into the bind mounts from the host. We also needed a way to upgrade Gluster; in an RPM installation, upgrades are taken care of by hook scripts, so we needed some place in the container to do that work. The startup script also remounts the logical volumes when a container respawn happens, so all the logical volumes that were mounted get remounted on restart. So that's all about containerizing Gluster.

This is about containerizing Heketi. Heketi, as Jose said, is a RESTful Gluster volume management tool. What Heketi actually does is take control of the Gluster nodes. Say you have three nodes with two devices on each node: Heketi manages those three Gluster nodes and their devices. When you give Heketi a command to create a volume, Heketi goes in and creates the LVM physical volume, volume group, and logical volume for that device, for that brick. A brick is the minimal unit of storage in Gluster, so Heketi creates everything required for the brick, and then the gluster volume create command takes all those brick entries and creates a Gluster volume that can be consumed over the network via the native mount. That's basically what it does with those commands: it gives you the whole thing as an abstraction, rather than an admin having to go and create all these bricks behind the scenes and run the volume create commands by hand. So Heketi manages all the Gluster volumes when you run Gluster hyperconverged inside Kubernetes or OpenShift.
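To make that concrete, here is a rough sketch of the manual steps Heketi is automating for a single replicated volume. The device names, sizes, and paths are made up for illustration, and the real tool does more than this (thin pools, bookkeeping in its database, and so on); this is only the shape of the work.

```sh
# What Heketi automates, roughly, on each node for one brick:
pvcreate /dev/sdb                          # LVM physical volume on the raw device
vgcreate vg_brick1 /dev/sdb                # volume group to carve bricks from
lvcreate -L 5G -n lv_brick1 vg_brick1      # logical volume sized for the brick
mkfs.xfs /dev/vg_brick1/lv_brick1
mkdir -p /bricks/brick1
mount /dev/vg_brick1/lv_brick1 /bricks/brick1

# ...then, once each node has its brick, stitch them into one Gluster volume:
gluster volume create demo-vol replica 3 \
    node1:/bricks/brick1/brick \
    node2:/bricks/brick1/brick \
    node3:/bricks/brick1/brick
gluster volume start demo-vol
```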
Heketi itself was easy to build, since it is a small application and not nearly as complicated as Gluster. The thing we had to work out was how Heketi gets access to all the Gluster nodes, because it has to be able to reach every node. Before hyperconvergence came into the picture, Heketi used SSH exec to get inside the Gluster nodes and create the logical volumes and the Gluster volumes on them. Now we can't use SSH, because we are in a Kubernetes environment where SSH everywhere can be dangerous, so we use kube exec instead. To use kube exec we need privileges, a kind of authenticated identity, so we need a service account in OpenShift or Kubernetes that can access all the Gluster pods in the namespace. Heketi uses that service account to reach all the Gluster nodes, create the LVs, and create the Gluster volumes from them.

Then there's BoltDB. Heketi uses a BoltDB database to record all the operations it has performed and all the devices attached to the nodes it manages. In containers we don't actually have any persistence, so on a Heketi respawn that DB would be lost and would have to be rebuilt from scratch. The way Heketi handles this in the hyperconverged case is that it initially takes the Gluster cluster and creates a volume out of it, and uses that volume for its own persistence: it creates a Gluster volume, the BoltDB lives on that Gluster volume, and so the DB stays available to Heketi.

Now let's talk about provisioning in the Kubernetes world. Kubernetes has its own persistent volume framework which can be used by any pod, and the backend storage can be anything; it gives users and administrators an abstraction over volumes. Usually the admin has to go to the backend, create a network storage volume, and create a persistent volume object for it. In our case, the admin would create a Gluster volume on the backend and expose it as a persistent volume for users to consume. The persistent volume claim is where the user comes into the picture: if I am a user who wants to run an application and wants 5 GB of persistent storage, I create a persistent volume claim for 5 GB, and that claim gets bound to a persistent volume that was created by the admin. I will talk about storage classes in a moment.

So this is how persistent storage in the Kubernetes or OpenShift world looks: the admin creates the network storage volume and exposes it as a persistent volume, then the user requests persistent storage with a persistent volume claim. The request is bound to the persistent volume, and you can see the bound status on the claim; the PVC is then used by the pod. When a pod goes down and comes back up, all the data stored in this storage is not lost: the volume is mounted again and the application picks up from the state it was in when it was killed or died. That is how it works.

The PVC is mounted using volume plugins in Kubernetes, and Kubernetes supports lots of volume plugins. Basically, the persistent volume that the claim is bound to carries the mechanism for mounting that particular kind of volume; in the case of Gluster, that's the native FUSE client.
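To make the "admin creates a PV, user creates a PVC" flow concrete, here is a minimal static example for a Gluster-backed volume. All names and sizes are invented for illustration; the endpoints object it references is exactly the piece described next.

```yaml
# Admin side: a pre-created Gluster volume exposed as a PersistentVolume
apiVersion: v1
kind: PersistentVolume
metadata:
  name: gluster-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  glusterfs:                 # in-tree GlusterFS volume plugin
    endpoints: glusterfs-cluster   # endpoints object listing the Gluster node IPs
    path: demo-vol                 # name of the Gluster volume
    readOnly: false
---
# User side: a claim that binds to a matching PersistentVolume
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: gluster-claim
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
```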
In the case of Gluster, what the volume plugin actually does is mount the Gluster volume on the host where the pod is going to run, and then bind-mount that mount path into the container, so that everything written to that path is persistent across respins of the application pod. That's how the Gluster volume plugin works. And the PV is released only when the PVC is deleted, not when the pod goes down.

For Gluster to serve as persistent storage in a Kubernetes or OpenShift environment, these are the things that are required. We need a service account, which is used along with the endpoints, and the endpoints file is where you mention the IPs of the Gluster nodes: if you have a three-node Gluster cluster, you list all three node IPs, so that when a volume is mounted, the plugin checks the endpoints, takes an IP, and mounts the volume on the node running your application pod. Then there's the persistent volume file. There is a Gluster-specific persistent volume plugin inside Kubernetes, and the plugin name you mention in the PV makes it use that plugin. A persistent volume claim for a Gluster volume then maps to the Gluster persistent volume, and application pods, the users, consume the Gluster volume through it.

To remove all the steps where the administrator is required for every volume request, Kubernetes has dynamic provisioning, and we now have a dynamic provisioning plugin for Gluster inside Kubernetes. You don't have to worry about the endpoints anymore, you don't have to worry about the service anymore, and you don't have to worry about PVs being created by the admin on request. Only the persistent volume claim is needed to get a Gluster volume out of Kubernetes; this works whether you are running hyperconverged Gluster or not.

These are the parameters of the storage class. A storage class allows more user-specific volume creation in Kubernetes or OpenShift, so let me explain them in Gluster terms. This is the Heketi data. The REST URL is where Heketi lives; you talk to Heketi through the storage class. Then there is the cluster ID. Heketi can manage more than one Gluster cluster, so say there is one Gluster cluster of three nodes backed by HDD devices, and another Gluster cluster with faster devices: you then have two clusters in Heketi. If you want to provision faster storage for a privileged user, you can mention that cluster's ID in the storage class, and volumes will be created only from the faster cluster, not from the slower one. The REST user is the username for the Heketi user, and the secret name is used for Heketi authentication. gidMin and gidMax are a security option that hyperconverged Gluster provides: a GID is set on the volume so that only that GID can access the volume; without that GID you will not be able to access it. That's a security feature we support with hyperconverged Gluster.
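A sketch of what such a storage class might look like follows. The URL, cluster ID, and secret names here are placeholders, and the apiVersion and exact parameter spellings may vary by Kubernetes or OpenShift version.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gluster-heketi
provisioner: kubernetes.io/glusterfs
parameters:
  resturl: "http://heketi.example.com:8080"      # where Heketi is reachable
  clusterid: "630372ccdc720a92c681fb928f27b53f"  # optional: pin to a specific Gluster cluster
  restuser: "admin"
  secretName: "heketi-secret"                    # secret holding the Heketi key
  secretNamespace: "default"
  gidMin: "2000"                                 # GID range applied to provisioned volumes
  gidMax: "2147483647"
```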
So, as mentioned, we have now achieved full hyperconvergence with Gluster and Kubernetes. Just to go over some of the quick high points: this hyperconvergence scenario still requires some administrative changes, especially if you have an existing Kubernetes cluster, because you need to open up a couple of ports and such that Gluster requires. But otherwise this greatly reduces the hardware cost, because if you don't have any existing storage, all you have to do is attach a couple of SSDs or similar to your Kubernetes nodes and you're good to go. Applications now have native access to Gluster-backed storage via Heketi. It's also fairly flexible, in that the Gluster containers don't have to run on every node of your Kubernetes cluster: if there is only a certain subset of nodes that you can or want to add devices to, just spin the containers up on those nodes and you can go from there. And, keeping the promise of Gluster, this is relatively easy to scale out.

Now let's see if this works. Don't worry. Can you change your font? Easily, step one: make it big. So let's see here. Let's start by... All right, so I have an effectively four-node Kubernetes cluster running, and I have a couple of pods running there already, although the screen is smaller than I anticipated. Thank you. So we have a couple of pods running here, all in the default namespace. Let me edit something real quick, because this will be important later on. Okay, much better. So we have some endpoints, a service account, and some secrets. Nothing else running just yet.

So then, using our handy deploy tool... oh wait, first of all: we're going to use our deploy tool there, but as part of setting up Heketi we need something called a topology file. Let's run less on it. This topology file lays out the topology of your storage devices and where they lie on the various Gluster nodes. You can see here we have the node hostnames and addresses, the list of devices that are on that node, and then we also provide a field for something called the zone. This is basically just a number, and you can give it any number you want, but each number logically represents a different management zone, so that Heketi knows about it if you want to balance storage between different zones. This is useful for something like geographic replication from one site to another. This is a Heketi file specifically, and it's formatted in JSON, my condolences.

We don't want Heketi scanning for devices on its own, because Heketi assumes full control of all the storage devices that you give it. That means it will format them and put PVs and LVs on them by itself, so if it scanned for storage devices, it might overwrite your root file system. Thankfully it does not actually do that, but you could, by mistake, specify the storage device that has your main file system on it, so be careful. Yes, of course. Of course, yes. Sorry, like I said, I just got here 20 minutes ago, so my brain is not all where it should be.

Let's see if I can do this right. So we run the deployment tool, we specify our topology file, and because we don't have a Gluster cluster running just yet, in this case we specify -g to tell the deployment tool to deploy GlusterFS as well. And for the purposes of this demo I'm going to run it in verbose mode, just so that we don't sit there waiting too long on empty dots.
All right. It brings up some considerations and requirements for doing this deployment: in particular, that you of course need administrative access to your Kubernetes cluster, and that you need to have the command-line Heketi client, heketi-cli, installed. Your nodes will also need this list of ports open, and you can see the last ports there; how many you need depends on how many bricks you're going to have in your cluster, but in general we recommend about a hundred or so. That should be fine for most deployments.

So we've deployed the GlusterFS pods and we're waiting for them to be created. I've already downloaded the Docker images ahead of time, so we don't have to wait for them to download; it should take about 12 seconds, I think. Nope. Should take less than a minute. Let's see here. They're running. They're running. So what we're doing right now is just waiting for all the resources of the container to attach to said container. Why this is taking so long, I'm not sure, unfortunately. Let's see here. You can do it. Nope. We are going to run over a minute here. I already pulled the Docker images, that's the thing. Woo-hoo! That is slightly over a minute, yes; Kubernetes stops reporting at the granularity of seconds and then just reports every couple of minutes or so.

Here's where some of the magic happens. Heketi actually deploys twice in this scenario. First it deploys a deploy-heketi pod that takes the topology file and adds all the storage devices and all the nodes. Then it creates a job to take its database, create a new Gluster volume, and copy its database into that Gluster volume. Then we destroy that pod and create the final Heketi pod, which now comes up pointed at the Gluster volume as its main database storage. And we're running. Let's see here. So we have our Heketi service, and if I just curl its /hello endpoint: "Hello from Heketi."

Now let's do something interesting with this. All right, so here I have a storage class. Actually, I need to edit this briefly, because we have our endpoint, the Heketi storage endpoints; as Ashiq mentioned, that is a pointer to all the Gluster nodes that are currently in the cluster. And then we need the REST URL for actually getting to Heketi, which I need to put in there. For the purposes of this demo we're running with very minimal security, so we don't need to enter a real REST user or REST user key. Created.

Then we have a Gluster persistent volume claim. Now, this is something that you would submit as a user, so the work of the administrator is now done, and I'm technically transitioning over to the role of a Kubernetes user. I want to request a 5 gigabyte volume of the class gluster-heketi.
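The claim being submitted here would look something like the following sketch. Only the size and the storage class name follow the demo; the object name is a guess, and newer Kubernetes versions would use spec.storageClassName instead of the annotation.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: gluster1
  annotations:
    volume.beta.kubernetes.io/storage-class: gluster-heketi  # selects the Heketi-backed class
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
```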
I just created the storage class gluster-heketi as the administrator, and now, as the user, I'm going to do kubectl create -f. You see the claim created, and as we can see here, the PVC has been bound. You see up there where it says pvc gluster1: that's the PVC I just submitted, you see the storage class, and here you see the persistent volume that was dynamically generated from the PVC.

Then we will spin up an nginx pod. This is a fairly basic pod, about the most basic we could get for the purposes of this demo: it just sets up an nginx instance on port 80 and, in particular, puts its main HTML storage on the Gluster volume we just created. For this one I also prefetched the container image, so this one should hopefully run in less than a minute; we'll see. You can see the nginx pod there is currently in a ContainerCreating state... and it's running. It also created an nginx service here, which I'm going to need. So we're getting data now.

I'm going to do some slight command-line magic here: a kubectl exec. I'm going to go into the nginx pod and echo some data to index.html; this will effectively write the data onto the Gluster storage volume. And then, if we do a curl again, we see "hello world from GlusterFS". Now, just to show you as much as I can that this actually did something, I'm going to go into the container itself, into one of the Gluster containers. Let's see here, my context changed on me, okay. And we're going to look for mounts which are managed by Heketi. All right, so the first one here, I just know, is the volume mount for the Heketi database, so we're going to go to the second one, cd into it, ls, go into the brick, ls... I did pick the Heketi DB one. Let's see here: mount, grep heketi... where is it now? Let's do another container, then. There we go, that's more like it: cd into the brick, ls, cat, and there, that is the string I just wrote onto Gluster storage. And that's it. So let's see, we go here, that's the end of the demo, and thank you very much. Do I have time for questions?

All right, any other questions? Right there. The question is: what is the point of having all this in Kubernetes and in containers at all, if you can't move them? The point itself isn't to bring the abilities of containers to Gluster; it's to bring Gluster to the other containers. Other containers can move around inside Kubernetes, and they can use Gluster volumes natively within Kubernetes as Kubernetes storage, and they can do that from anywhere in the cluster. Convenience. Also, in the case of an Atomic Host kind of environment where Kubernetes is running: in that case you can't actually install any RPMs on the Atomic Host, right. Yes, so it's shared infrastructure, and there's the fact that you might be running on something like Atomic, in which case you can't just install Gluster on the host itself; you can instead install it as containers on the nodes.

Yes, you're talking about snapshots, that kind of thing, for a volume? For example, also backing up the data? If I'm thinking about this kind of environment, I need to have a way to get the data out. All right, the question is: what about backups, how do we get the data out? So that is snapshots: you can take snapshots of the Gluster volume. Can you speak more to that?
Yeah, so you can take snapshots of the volumes, and there are also replica 3 volumes, so in case one pod goes down, a Gluster pod goes down, it's okay, you can still serve from the other two nodes. All the volumes are created as replica 3, so one Gluster node going down will not harm your mount point or the applications using the volume.

All right, and as far as getting the data out: any backup solution that can access a file system can easily extract the data. So I would think of GlusterFS as my backend storage, but it really depends where I want to take the data, because the mount points are not easily guessable, so you need to have at least some way to get at the data and connect it to something; I see that it takes a little bit more work to make a backup solution. Yeah, in that case, if you want to go down to the raw storage and extract your data that way, then yes, I think that gets a little complicated. But if you wanted to, you could FUSE-mount it, or do the mounts using the Gluster API library from some other kind of system, and just copy the data out. You would still need the Gluster layer on top of it for that, so that you don't have to go and guess all the underlying storage, like you said.

Gluster has a feature for this: if you want a snapshot, it's just a command in Gluster to create a snapshot. But for now we don't support it in Heketi, and Heketi is the layer through which you access the Gluster volumes. We'll have it in the really near future, but right now it's awkward: you have to manually go to the Gluster container and run the Gluster snapshot command, which takes the snapshot of the volume, and then you can use that to restore your data. So yes, there are ways of doing it, just not through Heketi yet.

All right, let's jump over here. We don't have the sub-directory feature in Gluster yet, so, Kaushal knows. Yeah, so the question was about that: we haven't implemented it yet in Gluster. There have been some technical challenges, but we're still looking into it, I believe, is the short answer.

All right, do we have time for one more? All right, here we go. Yeah, you have that feature inside the storage class: when you create a volume you can specify whether the storage class will create replica 3, replica 2, or distributed-replicated volumes; those are options. Yes, by default it's replica 3; you can change it, I think the patch is there. No, at that point you're doing it with Heketi REST commands or using the Heketi command-line tool; you can add or remove devices and nodes that way. So yep, it's just for initialization. The question was: do we have to mess with the topology file every time we want to change the topology of the cluster? The answer is no.

Anything else? I saw one right there. Are you talking about rebalancing the storage? Yeah, these volumes are mostly replicated, so heal happens without any command on the cluster, in both the container and non-container case. If you have replica 3 and a cluster node goes down and something is written to the volume, then when the node comes back up, the heal happens automatically. So as far as the storage is concerned, all the healing happens automatically on the Gluster backend. Not rebalance; rebalance doesn't happen automatically, but heal does. We prefer replica volumes in this scenario, so the rebalance doesn't happen automatically. Can you go into the backend and force a rebalance? In this release we are making the
expansion part even cleaner, and that will take care of the rebalance. Yeah, next one. All righty. Yes, can you come up here? And don't forget to send in feedback, please. Thank you.

So the question is: if one of my containers that is part of the cluster goes away, how do I introduce another container and have it join? We are working on some solutions for that, so that you can replace the node, or replace a device which was already being used, in case something goes wrong; the container itself will take care of all the bricks that were created and will rebalance the data from the other nodes. There are separate commands for that. Is that available now? Not yet; it's in progress.
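For reference, the "separate commands" the speakers refer to for growing or changing a cluster after the initial topology load are heketi-cli operations. A rough sketch follows; exact flags can vary between Heketi versions, and the IDs, hostnames, and device names are placeholders.

```sh
# Initial topology load (gk-deploy does this during setup):
heketi-cli topology load --json=topology.json

# Later, grow the cluster without touching the topology file again:
heketi-cli node add --cluster=<cluster-id> --zone=1 \
    --management-host-name=node4.example.com \
    --storage-host-name=192.168.10.104
heketi-cli device add --name=/dev/sdd --node=<node-id>

# Volumes can also be created directly through Heketi:
heketi-cli volume create --size=5
```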