You go to Paris or France, they're just there all the time — like, take your euro and go to Paris. Okay, leave. You've got to leave. I know, I know, just a couple more entries. Oh, I'm sorry, yeah, send me this. Hello. Welcome back to the sessions. A reminder about the feedback: we're looking to get feedback, so be nice. After the talks there'll be the lightning talks if you want to participate; I don't know if you can still propose topics, but go there and find out. In the meanwhile, let me introduce Luis, who will tell us how to better manage our Gluster clusters.

How you doing? My name is Luis Pabón and I work for Red Hat Storage; I'm a principal engineer there. This is actually a picture that my sister sent me. I guess somebody saw me and put me in the Big Hero 6 movie. So I'm there in Big Hero 6, and I look just like this, except I'm a little younger there and my hair is not as white.

So let's start with a little history to understand the need for Heketi and why it was created. When we first started with GlusterFS, we wanted to integrate it with OpenStack Manila. If you know what Manila is, it's trying to provide file shares as a service. You have Cinder, for example, in OpenStack, which provides block. You have Swift, which provides object. So the new project, Manila, wanted to provide file, and to do that we wanted to be able to use GlusterFS as a backing store for Manila. That came with some new areas and new models that GlusterFS needed to adhere to, and that's where the Heketi project was created.

Before we go ahead and start talking about Heketi, what it does, and what it's trying to solve, let's first try to understand what GlusterFS is and how it is set up. How do you go ahead and create a volume? First, what is GlusterFS? Is there anybody who doesn't know what GlusterFS is? Anybody else? As you probably know, GlusterFS is a distributed, reliable file system that you can access from multiple protocols. You can access it from NFS, from Samba, you can access it as object through Swift, and you can access it over FUSE or through the API. The unit of usage is really a volume. When you want to access GlusterFS, the very first thing you have to deal with is the volume: you have to create this entity out of your storage, and that's how you end up using GlusterFS.

But then, how do you create a volume? The very first thing you have to do when you want to start using GlusterFS is decide which systems you want to use for it. Once you decide the systems, you decide what storage in those systems you would like to use. Once you decide that, you start dividing it up: you take one system and join it with another through what's called peer probing, and then you extend your trusted pool. So two nodes become a trusted pool, and as you add more and more nodes with storage, it becomes a larger trusted storage pool. And then, not only do you have to decide what storage to use, you also have to set up a file system on that storage.

So, before I go to the next one: GlusterFS has a simple model. It has a volume, and a volume is a collection of bricks. When I first started with GlusterFS at Red Hat Storage, I actually asked this question: what is GlusterFS and how do you set it up? And they said, well, you know, it's a volume, and you set up bricks. And I said, well, hold on a second.
What is a brick? And they said, well, it's a directory. They just call it that. Anyway, it's kind of confusing, but a brick is really just a directory, and you can join many bricks together to become a volume. Now, how many bricks you need all depends on the durability model that you choose. For example, you could say: I want replica 2, or I want EC. Those kinds of requirements on your volume determine the number of bricks that you need to actually create it.

So for example, here we have three nodes, and on each one we created three directories. Those three directories may be on a RAID set, they could be on a single disk; it depends on how you decide to actually create that storage. Then, to create the volume, you almost unify these: you create a volume out of the many bricks you have created across all those systems. I want to point out that here we have set up a durability model of replica 2, and in this model we have six bricks. It's also really interesting to see that we have chosen that for the first brick on server 1, its replica is the brick right after it, and for the second brick, its replica is again the one after it. So you have to be very careful how you order these on the command line (the full sequence is sketched below). Lastly, you start the volume and then you can mount it, and when you mount that volume over GlusterFS you get the nice unified storage of the volume.

But there's an issue here. Can anybody tell me what the issue is with this model in the cloud? Yes, exactly: somebody has to type all of this, and that does not scale. It only works at a low rate of requests. For example, if you get one request per week to create a volume, this is great. GlusterFS works great in the NAS model, where you create a massive amount of storage and the volume stays available for a long time. But in the cloud, you've got volumes that are small, volumes that have a short lifetime; they come and go. Each one has different durability models: some are EC, some are replica 3s and 2s. And we want those volumes now, as soon as we press the button on the request. This is the face you make when somebody has to type all of that every single time. So we need a new way to achieve this, and that is what Heketi is trying to bring.

Heketi itself is a REST service with an intelligent allocation algorithm inside that works from the topology of your data center. It takes the information that you provide about your data center and then goes ahead and creates volumes according to the requests that are coming in. And again, Heketi is not really supposed to be used by administrators directly. It's supposed to be a service for something that provides multi-tenancy: something like Kubernetes, something like OpenStack Manila, or any other system — maybe a hospital or something else that runs only a private cloud.

This is the architecture of Heketi today. It has an HTTP framework on top. It has a middleware layer where we can add more services in the future; today it only has authentication. It has a database to be able to track all your storage across the entire data center. And then it has the executors at the bottom that actually apply the commands to the nodes.
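As a rough illustration of the manual workflow described above, here is a minimal sketch of what an administrator has to type by hand, assuming three hypothetical servers (server1 through server3) that each already have brick directories on formatted storage; the volume name and paths are made up for the example:

```bash
# form the trusted storage pool (run from server1)
gluster peer probe server2
gluster peer probe server3

# create a distributed-replicated (replica 2) volume from six bricks;
# bricks are paired in the order given, so each pair must span servers
gluster volume create demo replica 2 \
  server1:/bricks/b1 server2:/bricks/b1 \
  server2:/bricks/b2 server3:/bricks/b2 \
  server3:/bricks/b3 server1:/bricks/b3

# start the volume and mount it from a client
gluster volume start demo
mount -t glusterfs server1:/demo /mnt/demo
```

Getting the brick ordering wrong here is exactly the kind of mistake the talk says Heketi is meant to take off the administrator's hands.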
So today, Heketi is still very young software, so we have just enough in there to have the allocator. The allocator decides where to put the bricks: in which cluster, on which node, in which zone — and we'll get to zones in a second. It's one algorithm for now; we can probably come up with more efficient ones in the future and just drop them in. And then we have an SSH executor. Heketi actually has to go into a system and run setup commands on that system; today we just SSH into it, but there's nothing stopping us in the future from having a REST interface or something else there, or we could have Ansible or something else.

So let's take a look at the workflow. Let's say you have a data center and you want to start using Heketi for it. Before, with GlusterFS, you rolled in your racks of servers with their storage, you started deciding what to use and how to set up that storage — maybe you did RAID or not — and then you had to set up your file systems and so on. Now let's do it the Heketi way. You again roll in your racks of storage, and that's it. You have to install Linux on them of course, and set up Gluster, but that's pretty much it.

The next thing is to tell Heketi about my new storage systems. First I create a cluster; a cluster is just a collection of nodes. On that cluster entity that I created, I add my first node. But now we're giving it more information than we did before: we're not just adding a node, we're saying this node is in this zone, this failure domain. So as we add more nodes, we divide them across failure domains. Before, with GlusterFS, you had to think about that information yourself: you had to understand where to place a brick and make sure its replica was not in the same failure domain as your other replicas. In this model, Heketi does that for you. You just have to explain it: these nodes are sharing the same switch or the same power supply, and these other ones are sharing a different switch — so we have two different zones here. And as we add more and more nodes, Heketi itself creates and grows the trusted storage pool.

The next thing we do with Heketi — we added the nodes, we defined the zones — is add the storage. On each node we add each one of its disks or devices; they could be RAID devices. In other words, we're telling Heketi the topology of our system (roughly the sequence of calls sketched below). And Heketi can handle more than one cluster. When people think of GlusterFS, they normally think of one cluster, one trusted storage pool — I'll just call it a cluster from now on — but Heketi can handle any number of clusters. That means that where before you maybe had constraints on how large a single GlusterFS cluster could grow, instead of thinking of one massive cluster you can think of many clusters and an endless number of volumes.

Okay, so now let's take a normal request to create a volume. What's the very first thing Heketi does? It has an algorithm that determines, first, how many bricks it is going to need to satisfy this request. It calculates all of that. Then it says: all right, let me go see which of the devices can satisfy each brick that I need to place.
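For reference, the cluster, node, and device registration just described maps onto client calls roughly like the following. This is a sketch using current heketi-cli syntax (flag names may differ from the version shown in this talk), with made-up hostnames, placeholder UUIDs, and an assumed server address in HEKETI_CLI_SERVER:

```bash
# point the example client at the Heketi service (assumed address)
export HEKETI_CLI_SERVER=http://localhost:8080

# create an empty cluster; the command prints the new cluster's UUID
heketi-cli cluster create

# register a node in failure zone 1 of that cluster
heketi-cli node add \
  --cluster=<cluster-uuid> \
  --zone=1 \
  --management-host-name=storage0.example.com \
  --storage-host-name=192.168.10.100

# hand one of the node's raw devices over to Heketi
heketi-cli device add --name=/dev/vdb --node=<node-uuid>
```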
So it goes looking in the topology for devices it can use. As you can see in this replica 2 example, we have a device chosen in zone one and a different device chosen in zone two, and that's the replica pair — and it did that automatically. And if this is a distributed volume, it will go ahead and create the other pairs the same way, one brick in one zone and one in the other.

The really cool thing about Heketi, too, is that when you make the request to create a volume, you don't have to specify the cluster, and it will just find room for the volume somewhere for you. Or you can specify the cluster explicitly: I want a cluster on that — I'm sorry, I want a volume on that cluster. Maybe the request is coming from somebody who's paying more, for example for SSD, or paying less, for just SATA. Or you can say: I want it from this subset of clusters; maybe you have many clusters of SSD and many clusters of SATA.

So what other features does it support? You can say to Heketi: create a replica 3 volume, create an EC volume — it's very easy for Heketi to do all of this for you, and it's all available from the API. It also supports volume expansion, which is kind of neat, because after you create a volume you may need to increase the amount of storage, and it will go looking again for new bricks to satisfy that request.

[How do you ask for replica 3?] It's on the request — the requester asks for that: I want a volume of replica 3. In the API request, in the JSON, you can specify the durability type (a sketch of such a request follows below). There is "none", which means nothing special, just distributed; there's replicated; and there's EC. You specify that in the request.

[How does it know about the storage devices?] Going back here: when the administrator first rolls in those machines, the very first thing he or she does is tell Heketi all the topology information. That is: failure domains, nodes, devices. That's all you need to tell it, and then Heketi takes care of the rest.

[That's one way of thinking about it, but there can be more than 100 devices in a node. Telling Heketi about all those devices...] No, that's not difficult, because normally these machines come already set up, and all you have to do is send that information to Heketi. And remember, it's not from the command line; this is a service, so some piece of software will tell Heketi about all these devices. And it's concurrent and parallel, so you can send Heketi a hundred requests at a time.

[But remember, 60 devices means you have to enter 60 device names. How do you handle that?] Yes — so I guess you're ready for the demo.

All right, so let's go to the demo. This demo is available on the Heketi website, so you can go ahead and try it out yourself. It is based on Vagrant and Ansible. I already ran the Ansible against the Vagrant machines, and I haven't done anything else; from this point on I'm just following the instructions that are on the website. And as we go, you can ask questions and we can type things and try things out. So here we go.

The very first thing we're going to do is enable the Heketi service. Actually, I'm going to talk about this in a little bit, but Heketi is available in Fedora, it's available in EPEL, it's going to be available in the CentOS Storage SIG, it's available as a container, and it's available in Red Hat Gluster Storage as a tech preview.
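To make the "you specify the durability type in the JSON" answer concrete, a request against Heketi's REST API might look roughly like the sketch below. It assumes authentication is disabled and the service is listening on localhost:8080; the field names follow later Heketi API documentation and may differ from the version being demoed:

```bash
# ask Heketi for a 1 TiB replica-3 volume with room for snapshots
curl -s -X POST http://localhost:8080/volumes \
  -H 'Content-Type: application/json' \
  -d '{
        "size": 1024,
        "durability": {
          "type": "replicate",
          "replicate": { "replica": 3 }
        },
        "snapshot": { "enable": true, "factor": 1.5 }
      }'
```

In that schema a durability type of "none" gives a plain distributed volume, and the EC case is requested with the "disperse" type instead of "replicate".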
[It's available in RHGS? And in EPEL?] Yes, yes. So the very first thing we're going to do here is look at the demo... where is it... here we go. The very first thing we do is copy a configuration file. Heketi uses a configuration file in JSON that describes what port to listen on, what authentication to use, what user to use, and then at the bottom some information it needs to be able to SSH: just like when you use Ansible or any other such system, you need to provide your private key for all your systems. The requirement right now is to have the same public and private key pair for all the systems you want to manage. So here we SSH in and copy heketi.json to /etc/heketi.

Next step, we start our service over SSH. There it is; it is now running on this system. Okay, so now we can have some fun and get some information from it. Here's a REST client available for Firefox; you can try it out. What we'll do is an HTTP GET to fetch all the clusters on that server... looks like there's an issue with the networking. All right, I'll do it from the other side.

So let's go back here. The next thing we're going to do is use the command-line client. Heketi provides a command-line client as an example, and when you use it you may say, oh man, this is terrible, it's so specific to the API. But that's what it's supposed to do: it's a demo of how the API can be used, and hopefully services will combine those API calls into something more humanly usable. So here we have heketi-cli. It can talk about clusters, it can talk about nodes. The very first thing we'll do is ask for the cluster information: heketi-cli cluster list. And there are no clusters — there's nothing, because we haven't done anything yet; there's no topology yet.

And so, Ramesh, I'm going to answer your question now, because this is where services really help. I did add one convenience call to the Heketi client, because it could take some time for me to type in every device in the system, and I called it "load topology". What it takes is a JSON file that has the entire topology of my cluster (a sketch of such a file appears below): we have many nodes, each node has an IP address, has some devices, and describes what zone it is in. So we'll just run this: heketi-cli load with the topology JSON file. And there it goes.

Now, this is slowed down. I don't have to do it one device at a time; I could have just sent all the devices to it and it would have done them all in parallel. But — I think there's a problem here with my Vagrant setup after the machine went into standby — one second. So, this is slowed down: what it's doing right now is creating the nodes, then it went to the devices and created the volume groups, setting the storage up for use. And I probably can't even ping the machines... yep, that's what I thought.

[You mentioned LVM — so each brick is an LV on top of it?] Yeah. It uses LVM, and each brick will be an LV, in a thin pool, and then there will be a small LV on top of that. Yep. So let me get out of this, because when the laptop goes into standby it seems like... back to my question. Yeah, hold on a second. What? Oh, just halt it — halt it, let's see if that works. All right. So, what's your question?
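For reference, the topology file loaded a moment ago looks roughly like this. It is a sketch with made-up hostnames, addresses, and devices; the exact schema and the load subcommand have changed across Heketi versions (newer heketi-cli spells it "topology load", the version in the demo used "load"):

```bash
# topology.json -- hypothetical hosts, addresses, zones, and devices
cat > topology.json <<'EOF'
{
  "clusters": [
    {
      "nodes": [
        {
          "node": {
            "hostnames": { "manage": ["storage0.example.com"],
                           "storage": ["192.168.10.100"] },
            "zone": 1
          },
          "devices": ["/dev/vdb", "/dev/vdc"]
        },
        {
          "node": {
            "hostnames": { "manage": ["storage1.example.com"],
                           "storage": ["192.168.10.101"] },
            "zone": 2
          },
          "devices": ["/dev/vdb", "/dev/vdc"]
        }
      ]
    }
  ]
}
EOF

heketi-cli topology load --json=topology.json
```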
[The question is basically about the topology at this stage — does the topology have to be fed in through some API or something?] The topology of the system — yes, that's what I'm saying: the service takes care of that. The service that calls Heketi can either have a human enter it, or it can discover it. What I'm saying is that the service can discover that information; however it gets it, it then supplies it to Heketi. [Why not build that into Heketi?] I'll explain why: every data center in every place is different, and the services understand those data centers. The services themselves understand what's going on; Heketi doesn't know. So if the services can provide that information, however they access it, that's probably the right model.

All right, so let me try this again... shut down... let's try this again. There it goes. You know, live demos are always terrible. Let me go back to a recording I have. Can I access it? No, that's fine, I'll play it from the web. You know, when things happen... All right, here we go. It's right here — hopefully I have Wi-Fi now. Come on. There it goes. Oh, I don't have an Ethernet port, so I have Wi-Fi, or I should. Oh yeah, you're right, thank you.

All right, so we'll do the same thing. We set up the system; here it is running. We go into storage0, and we go and load the topology information. And on the other side, while the topology is being loaded, I'm showing the logs of it actually going into the systems and setting them up. Once the topology is set — I'm trying to get rid of that bar at the bottom, sorry — then we can say cluster info, and we should get information back. Now we can get information on a specific system, and everything in Heketi is done with UUIDs: every device, every node, every cluster has a different UUID, so everything can be referenced uniquely. So here we have the node information, and we can see all the devices that are available on that system; we also get its IP address and what cluster it belongs to. Then we start looking at the volume information, and in a second we can look at the create command and look at the help for it.

In create, as you can see from the example here, we have a lot of options for what you can actually request. You can request — I think somebody pointed out the durability type — but you can have every type; you can use EC. You can also specify how much storage you want for snapshots in this volume: for example, if you don't want snapshots at all, or you want a lot of room for your snapshots. So let's see here: we're creating a one terabyte volume. Looking at the logs on the back side, Heketi has already decided what systems and what devices it's going to use, and it went ahead and started setting them up: LVs, thin pools, mkfs, all that stuff. It joined them all into one volume for you. And now we can go to the client, and we can see that there's nothing mounted; we go ahead and mount that volume by its UUID. And...

[When do you create the LVs — when you create the volume?] It happens on demand, exactly right. Yes. So it creates an LV for every brick — actually, it first creates a thin pool on top of the VG, sized for the LV that's needed for the brick plus the room needed for snapshots. Then, once it does that, it runs mkfs on that brick, and that is done for every brick of the entire Gluster volume.
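What the logs are showing per device and per brick is roughly the following sequence. This is a sketch under assumed names and sizes (a 1 TiB brick on /dev/vdb with snapshot headroom), not the exact commands Heketi runs:

```bash
# turn the raw device over to LVM
pvcreate /dev/vdb
vgcreate vg_vdb /dev/vdb

# thin pool sized for the brick plus room for snapshots
lvcreate -L 1200G --thinpool tp_brick1 vg_vdb

# thin LV that will hold the brick itself
lvcreate -V 1T --thin -n brick1 vg_vdb/tp_brick1

# XFS with the inode size Gluster recommends, mounted as the brick
mkfs.xfs -i size=512 /dev/vg_vdb/brick1
mkdir -p /bricks/brick1
mount /dev/vg_vdb/brick1 /bricks/brick1
```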
[Was that in real time?] You have to ask that — it looked cool! Yeah, no, some parts were accelerated. And actually I tried to keep it to about five minutes.

So here, once we have the volume, we have the ability to expand it. We give it the UUID of the volume and we describe how much more space we want. This is really cool — man, I'm surprised QEMU sometimes has issues with networking when the laptop goes into standby — but I wanted to show you that you can create a volume, have it mounted, go back to Heketi, expand the volume, and actually watch the size increase on the mounted system. So it's pretty nice. So here we have a volume expansion, we go back to the client, and in a little bit you can see from the volume info that it's expanded. We go back to the client and mount the volume, and there it is at 1.2 terabytes — we added 200 gigs to it.

The last thing we're going to do is destroy it. Again, as a service, you have to be able to create, expand, and destroy; those are the three things (the three calls are sketched below). I can't get rid of this bar at the bottom, but I just typed heketi-cli volume delete with the UUID, and it goes ahead: first it stops the volume, then it deletes all the LVs, and then it puts everything back so that all that storage is deallocated and available to somebody else. We won't watch that again.

So let's go back here. Like I mentioned before, this is available right now and you can try it out — you can try it as a container really easily. When I showed you the architecture before, it has what are called executors to send commands to a system, but it also has something called a mock executor. When you download the container — actually when you download any of these — it's first set up with the mock executor, meaning you can create clusters, you can create nodes, and it doesn't send anything anywhere, but you can check out and try the API.

Lastly, what's next for the project? It's still kind of a baby; we started in about June or July. The next thing we have to do is conflict resolution, which is kind of a simple thing, but we just haven't had time: it's when you add a node and then you add the same node again — it should know that it already has it. Or the same disk. It wasn't that big a deal because this is not a human interface, it's for services, but we still need to make sure we have that feature. The really cool features come later: what to do when something fails. For example, let's say you have many, many volumes, and then one of the nodes dies. Heketi gets informed that the node with whatever ID died, and it can go ahead, allocate new bricks automatically, update all the volumes, and everybody will be fine. Even if you just want to say, I want to upgrade the amount of RAM on this system, you can tell Heketi this node failed, so it pulls everything away from it; you grab the system, put in the new RAM, put it back, and then tell Heketi it can use it again.

And lastly, I know there's a project called GlusterFS 4.0, and in that project there will be a new REST interface through GlusterD. Right now we have to SSH into the system to communicate with Gluster, but it would be great for Heketi to have that REST interface.

Okay, that's all I have, and this is it — all these slides are available up there at that link. Thank you. Any questions?
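For reference, the create/expand/destroy lifecycle from the demo maps onto heketi-cli calls roughly like this. It is a sketch: the flags follow current heketi-cli and may differ from the version demoed, and the UUID placeholder stands for the value printed when the volume is created:

```bash
# create a 1 TiB replica-3 volume; sizes are in GiB
heketi-cli volume create --size=1024 --replica=3

# grow it by 200 GiB, using the UUID printed at creation time
heketi-cli volume expand --volume=<volume-uuid> --expand-size=200

# destroy it: stops the Gluster volume, removes the LVs, frees the space
heketi-cli volume delete <volume-uuid>
```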
[A question about discovering the topology automatically?] Yeah — again, if you create your own program, your own discovery tool, maybe you can use UDP broadcast, for example, in your system to find the nodes. Maybe you have your own software where a node, when it's plugged in, automatically sends a UDP packet or registers with some registration system, and you can use that information. But again, I'm trying to keep it very simple, because every data center has different requirements for discovery. [And usually, within the same organization, every case is different but you often use the same configuration — so if you could say, for example, all the nodes in this IP range are in this zone, and the ones in that range are in another zone, then the tool would really know what it's doing.] Yeah — I guess one of the things we need to work on is understanding how that can be done.

[Do you know of any existing services that already try to discover the network like this?] Do I know of any such discovery services? It depends on the size of the data center. There's — I think it's SNMP, with MIBs, for discovery, that's one way; but a very easy one for small data centers is UDP broadcast, which is very simple. [I mean integrated with this tool.] Yes — something that just goes out, discovers everything, produces the JSON file, and hands it over. What we really need is to be able to use some discovery service with a cloud model, so that I can talk to the cloud and find out the nodes, or find out that information. [I was just wondering if one already exists.] I don't know — we can talk to that guy back there and find out. Right there. [Hold on, you're putting me on the spot.] I know.

That's a good question. So the question is: do I think that GlusterFS is robust enough to be deployed in the cloud in this model, creating many, many volumes that are small, creating many volumes at a time concurrently, and changing their sizes and so on. I do not know how to answer that. Let me put it this way: this is a new storage service, this is a new model for GlusterFS, and it would need to be investigated to see how well it can handle it. I don't know how well it can handle it, and I don't know how badly it can handle it; it's an unknown that we need to answer.

[What if you lose the database?] So that's a great question, actually. Let me go back — let me go back here. Today Heketi uses an internal database: Heketi is written in Go and uses a database called BoltDB, which is a really good database for simple applications. It is a key-value, transactional database, a copy-on-write B-tree kind of thing, which works really well when you're trying to do things and be able to back out of them. So you could do something, realize you failed halfway, and fail that transaction, and your database still stays clean; you can do many things at a time within a single transaction. Now, it is a single point of failure: if the database fails, you will lose all that data. Today you have to put that BoltDB file on something that is safe, but we can definitely change that in the future.

[Can you import existing Gluster environments into the database?] No, we cannot import — not for sure — existing Gluster environments into the system, because existing Gluster environments have an infinite number of combinations and permutations of how they were created.
And I'm talking about the hardware, too: one of the things Heketi does, for example, is make sure that everything is aligned when it creates the file system, so there is a lot it would need to do before it could actually take over an existing system. I'm not saying it's impossible, but it would be really hard — so, I'm sorry.

[What about conflict detection?] Yes — no, no, that's actually the next thing I'm going to do. Yeah, absolutely, that needs to be done. [And the other suggestion?] That, again, is an idea; it's not something we have settled on yet, but it's definitely something we can talk about. And lastly, everything is available here on GitHub. If you have any questions, you can go to Heketi on GitHub; there's a Gitter channel there, and you can ask all the questions you want where people can see them, and I'll be able to answer. All right, thank you very much.

[Are the slides available in PDF format?] No, they're done in Go present, and they're only available online. [So can they be exported?] They're only available there — let me help you with that in a second. Okay. This is yours. All right — what was I — I'm sorry. All right.

[Dude, one small comment: you're using the flag library.] What? [The flags, see —] Oh yeah, we've got to rip that out; that's the Go flag package. [Garbage, complete garbage.] You want me to do, like, go dash-dash-something, like that? [You have to use a proper library, like codegangsta.] I know it's crap. [No one uses it.] Oh, really? [Yeah, it's crap.] Right — no, I'm talking about the single-dash thing. [Yeah, that's what's garbage.] I didn't want to say anything about it. [No, it's the Go language's own — it's native, for simple tools. They have to have one, but no one uses it; everyone uses something like codegangsta, which is the one I'm using, but there's another one that's also just as good.] I'm sure — I'll use whatever, I'll change it. [No, man — I'm just telling you, it's a small thing, someone has to tell you. It's easy.] It's easy, I don't know, you could — I'll, I'll, I'll...