Welcome everyone to our second Gluster talk of the day. So please welcome with me, Hari. Good morning, everyone. I'm going to talk about the evolution of geo-replication in GlusterFS, and to start I'll give a brief idea of what Gluster is so that you can follow the rest of the talk. Gluster is a distributed file system with no single point of failure. We don't have a metadata server, and that's how we achieve no single point of failure. It provides easy scale-out and it can run on any commodity hardware; you don't have to buy special boxes to run Gluster. And it provides the other basic file system features like replication, erasure coding and so on. The terminology you need to know with Gluster: bricks are basically the disks you are going to use to store your data on the back end. A server is a collection of your bricks, and a volume is a logical entity made up of bricks across various servers. Then there's your client; that's where you write your data, and that data is transferred onto your bricks. And the Trusted Storage Pool is a collection of servers within which you can perform certain operations and manage them. So the overview is something like this: you have bricks on your servers, and these bricks together form a logical entity called a volume. That volume is mounted on a client, where you consume it. So, what about geo-replication? The basic requirement for geo-replication is disaster recovery. With replication your data stays on your site, in the same data center, and that's not so reliable when it comes to a natural calamity. So we go for geo-replication, where you can replicate your data across different data centers. The first requirement for geo-replication, then, is: how do we get a copy of the data from one Gluster volume to the other place?
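To make those terms concrete, a minimal sketch of creating and mounting a volume might look like this (the server names, brick paths and volume name are placeholders, not from the talk):

```shell
# On server1: add a second server to the Trusted Storage Pool
gluster peer probe server2

# Create a volume from one brick on each server, then start it
gluster volume create myvol replica 2 server1:/bricks/b1 server2:/bricks/b1
gluster volume start myvol

# On a client: mount the volume and start writing data
mount -t glusterfs server1:/myvol /mnt/myvol
```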
So, besides getting that copy across, the other requirements were checkpoints, to know up to what point your data has actually been replicated from the master site to the slave site, and making the copying as efficient as possible, to reduce the hit on the I/Os happening on the master volume. How we did it is basically with two volumes: the master volume, where you are actually writing your data, and from the master volume the data is copied onto your slave. So we basically created two volumes and made use of them. The other thing is that you have to record the changes going on at your master site, and these recorded changes are then transferred from the master site to the slave site. Recording the changes is where we came up with something called the changelog. The changelog is basically a translator; a translator in Gluster is something which gets a certain input and provides a certain output. The changelog translator sees the data written to a particular volume and records it as a change, and this change is later sent on for geo-replication to replicate to the other side. It sits on the brick side; I'll explain how this flow goes. Once you have the changelogs, they have to be synced from the master site to the slave site. Another benefit of the changelog is checkpoints: if you have 10 writes, and each write goes into a changelog, then you can say that once 5 changelogs are completed, 5 of your writes are done. That's how we came up with checkpoints. Also, if a particular write is failing for some reason, because it's in a changelog you can retry just that particular write and succeed. This reduces the overhead of doing the whole operation again. So as you can see, you have the master volume, which is one particular Gluster volume in some Trusted Storage Pool.
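For reference, the recording described here is controlled by per-volume changelog options; a hedged sketch (the volume name is a placeholder, and geo-replication normally switches this on for you):

```shell
# Turn on changelog recording for a volume
gluster volume set myvol changelog.changelog on

# How often the active changelog file is rolled over, in seconds
gluster volume set myvol changelog.rollover-time 15
```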
And the content from the master volume will be synced across to your slave volume. This is basically how geo-replication works as of today. We have three types of crawl, and the reason for crawls lies in how changelogs get recorded. If you look at this, you'll understand: that's one brick, which I explained, with the changelog in it. Your I/Os come in from the top and go down to the bottom, where you have the disk that stores your data. When an operation goes through the brick, it passes through the changelog translator, and the changelog knows, OK, now I'm getting a write operation. Such an operation is then written to a flat file, and it just keeps writing entries to the flat file based on the incoming operations. The rest I'll come back to in a few minutes. Now, these changelogs are available only if geo-replication was enabled when you created the volume. But there can be cases where you already have a volume and you want to sync its contents across to another volume; here you won't have changelogs. That's the reason we have the hybrid crawl. What the hybrid crawl does is, once you enable geo-rep on an existing master volume, it goes through your file system and creates pseudo changelogs, and these changelogs are later used to sync to the slave volume. The changelog crawl handles incoming traffic: it records the changelog and syncs it up to the slave volume. The history crawl is for when you have a number of changelogs recorded but geo-rep was stopped for some reason, and now it has started again because the slave volume is back up; all your recorded history will be synced across. How we do this is where this diagram comes into the picture: the changelog translator in the brick records the changes to the flat file, and then the agent process reads the contents from the flat file and gives them to the worker.
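The crawl types just described map onto a per-session option you can query; a sketch, with host and volume names as placeholders:

```shell
# Ask which change-detection mechanism a geo-rep session is using;
# values include "changelog" (live changelog crawl) and "xsync"
# (the hybrid file-system crawl used when no changelogs exist yet)
gluster volume geo-replication mastervol slavehost::slavevol config change_detector
```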
The worker then reads the content from the master and writes it onto the slave, and the worker and the agent are managed by the monitor: if one dies, the monitor spawns it again, and so on. Now, the disadvantages of the current approach. We have something called a GFID to identify a particular file in Gluster; it's unique, and it is used to construct the whole file system tree. So we replicate the GFIDs from the master to the slave as well, to construct this whole tree, and this one dependency has been giving us a number of issues: you can't replicate to a non-Gluster volume, you can have GFID conflicts between master and slave, and so on. You also can't manually copy a file from the master to the slave; right now it has to go to a Gluster volume through geo-rep. And other than that, if you have a create, a delete and a create, you have to repeat all these operations one after the other to end up in the right state. To avoid these issues, we came up with something called path-based geo-replication, and we are working on it. What it does is store the parent GFID in each file's xattr (extended attribute). This way we have a separate tool which can read this particular xattr, work out what the path is, and give the path to the slave, so that the slave can work on paths instead of working on GFIDs. This way we have removed the GFID dependency. The advantage is that you don't need a Gluster volume as the slave volume; you can write to any other type of storage you want. You can use additional tools to do the initial sync, and you can copy files manually from the master to the slave. And you don't have to replay a create, delete and create; we use rsync, which does it with a single create. These are the advantages of path-based geo-replication.
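As an aside, GFIDs can be inspected on a brick back end as extended attributes; a minimal sketch (the brick path is a placeholder, and the talk doesn't name the exact xattr path-based geo-rep uses for the parent GFID):

```shell
# On a brick, every file carries its GFID as an extended attribute
getfattr -n trusted.gfid -e hex /bricks/b1/path/to/file
```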
And I have a demo just to show how to set up geo-replication between two volumes in Gluster. So basically I'm doing a peer probe to create the Trusted Storage Pool I was talking about. If you see, I have two servers, and I'm creating a volume across the two servers, called the master volume, and I'm starting it. Now, this is an additional step for geo-replication: this is another volume which geo-replication uses internally. Similar to this, I would have created a slave volume on another machine so that I can set up geo-rep between the two; the master volume is here on this Trusted Storage Pool, and the slave volume is on a different Trusted Storage Pool, where you have to repeat the same process. And I'm starting this particular volume that geo-rep uses. After that, I'm mounting the master volume and this additional volume geo-rep is going to make use of. There is also another command you can use instead of creating this particular volume yourself; that command takes care of the work, but since I have already created the volume manually, the command fails here. Still, it's something worth knowing to reduce the work. Then, you will need passwordless SSH between one of the nodes of your master and your slave Gluster volumes, which I have set up. This step gets the SSH keys from the various nodes in the master volume and sends them to the slave volume, so that the transfer can actually happen over SSH. I'm doing that and the other configuration for geo-replication, so you see I'm setting certain configuration for geo-replication, and then I'm starting a geo-replication session here, and then I'm checking the status: it initializes, and at one point it actually starts to work. Now I'll SSH to the slave node and show you the mount on the slave node, to see what data is available there.
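The demo steps above roughly correspond to the following commands (host and volume names are placeholders for your own setup):

```shell
# On the master: collect SSH keys from all master nodes, then create
# the session, pushing the keys to the slave so transfers run over SSH
gluster system:: execute gsec_create
gluster volume geo-replication mastervol slavehost::slavevol create push-pem

# Start the session and watch it go from Initializing to Active
gluster volume geo-replication mastervol slavehost::slavevol start
gluster volume geo-replication mastervol slavehost::slavevol status
```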
We haven't written any data, so it should be empty there; you can see, one minute. So on the master, when I do an ls, you see nothing, because I haven't created any files here. Now I'm SSH'ing into the slave, and you can see the slave volume here, which is a different volume from the master volume, and when I do an ls on the slave it is again empty. I come back to the master cluster and create certain files here, and now you can see the files in the master volume. Going back to the slave volume and doing an ls again: the slave, which was empty, now has these 10 files replicated to it. So setting up geo-replication should be fairly easy in Gluster; this is how you do it. If you have any questions? Questions.

You said that you implemented path-based geo-replication that doesn't use the GFIDs, so is it possible to use both?

It's not done; it's a work in progress. Right now you have the normal geo-replication, which goes from a Gluster volume to another Gluster volume. Path-based geo-replication is something we are still working on; it's not yet complete.

So the whole syncing mechanism is new?

Once we complete it, we will mostly be relying on path-based geo-replication rather than supporting the old one.

So currently geo-replication always has to be done to a target that's also a Gluster volume? Yes. Okay, thank you.

The changelog translator: does it have a performance impact if it's enabled, and could it be used for other purposes? For example, we want to track all file operations, kind of like an audit log. So can the changelog translator be used for that purpose, and does it have a performance impact?

So the question is: does the changelog translator have a performance hit, and can it be used for auditing as well? And yes, the changelog will have a minimal impact, and we do give you options to tune it; for example, you can say that it should roll over to a new file every 15 seconds, or every 15 minutes.
So you have the ability to configure that time based on your volume, which is something you can use to tune it. And we are using the changelog for another tool, called glusterfind, which gives you the files that have changed from one checkpoint to another. So there again the changelogs are handy; you can make use of changelogs for auditing as well.

Are these changelogs complete? Can you trust that every file operation will appear in the changelog?

So the question is, can you trust the changelog? The changelog has been in Gluster for a while now and it's pretty stable. So yes, changelogs are trustworthy.

What can you advise for an active-active setup? It's master-slave right now, right? Is it possible to have, like, two master sites? I mean, for availability; right now it's for disaster recovery.

So you are asking if it is possible to get the data back the other way; the question is whether it's only from master to slave, or whether slave to master is also possible. Right now, once you stop your geo-replication session from master to slave, you can enable it the other way as well; this way you can copy your data back. But we don't have an active-active mechanism right now. Yeah. Thank you.

Thank you, Hari. Thank you. So, path-based geo-replication is work by Aravinda, and the geo-replication team at Red Hat comprises the people here; if you have any questions, I have the links provided in the slides you can make use of. Thank you.
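For the glusterfind tool mentioned above, a usage sketch (session and volume names are placeholders):

```shell
# Create a glusterfind session against a volume
glusterfind create mysession myvol

# List the files changed since the session's last checkpoint
glusterfind pre mysession myvol /tmp/changed-files.txt
cat /tmp/changed-files.txt

# Move the checkpoint forward once the output has been consumed
glusterfind post mysession myvol
```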