If you just joined us, let's give a warm welcome to Kaushal, speaking about what's new in Gluster.

Thanks everyone, good morning. I'm Kaushal, a maintainer in the GlusterFS project, where I maintain the distributed management daemon for GlusterFS, called GlusterD. Today I'll be talking about our upcoming release, Gluster 4.0. This is a major release, one of the biggest we have done, and it's coming up soon. So on today's agenda, I'll give a quick intro to Gluster for people who don't know what Gluster is, and I'll follow that up with what Gluster 4.0 brings.

So what is Gluster? Gluster is a software-defined storage system. It provides users with distributed, replicated or erasure-coded storage, and it's purely software driven. One of the salient features of Gluster is that we don't have a centralized metadata server; we are purely distributed and work without any central metadata servers. We work on commodity hardware, which means we don't need specialized storage appliances to run Gluster: we run on normal off-the-shelf x86 servers. And we are scale-out, as in, to grow a Gluster storage system you add more servers when you need them.

Gluster provides users with a POSIX-compatible file system. Gluster ships a native FUSE client, which is POSIX compatible, so applications can be pointed at the GlusterFS mount and you can reasonably expect them to work without any problems. In addition to the native GlusterFS protocol mount, we provide access via standardized protocols like NFS and SMB, through integration with other projects like NFS-Ganesha and Samba.

Some terms to help understand things later in the presentation. In GlusterFS terminology we have peers, nodes or servers; I use those interchangeably. A GlusterFS server is a machine with the GlusterFS server packages installed. Lots of Linux distributions carry GlusterFS packages in their default repositories, so you can install those packages onto your server and it becomes a GlusterFS server. We pool these GlusterFS servers together to form a trusted storage pool; this is what we call the GlusterFS cluster. And we have clients: these are the machines that access GlusterFS storage using the native protocol.

This picture gives an overview of the trusted storage pool. We have servers with GlusterFS installed on them, which form a trusted storage pool, and we have clients outside the trusted storage pool that talk to the GlusterFS servers. We have the native FUSE client, which talks directly, and we have external applications which use our GlusterFS API (gfapi), which lets us export GlusterFS volumes through projects like Samba and NFS-Ganesha. We also integrate with QEMU to give it direct access to GlusterFS storage.

There are some more GlusterFS terms. We keep talking about bricks in GlusterFS. A brick is the smallest unit of storage in GlusterFS. When we talk about bricks there are really two things, the brick directory and the brick process; a brick is the combination of the two.
A brick directory is an empty directory on your GlusterFS server. In the majority of deployments, this empty directory is the mount point of one of the disks on the server. And a brick process is the process that exports that directory out to clients. Then we have a volume: a volume is a logical collection of these bricks that provides you with distributed, replicated, redundant storage. So we have servers with disks, or mount points, on them, and we can combine different bricks to get volumes with the characteristics that we want. GlusterFS makes it really easy to do this and more.

GlusterFS is built using the concept of translators. In GlusterFS, all the file system features that you hear of are built using modular bits of code called translators. Each of these translators performs just one action; it provides one feature of GlusterFS. So there is a replication translator which just performs replication, there is a distribute translator which has the distribution logic, there are performance translators which do things like read-ahead and write-behind, and there are translators that do caching and things like that. We arrange these translators in a volume graph to give you a volume with the features you asked for. Each of the blue boxes here is a translator. We have a graph for the client side and a graph for the server side, for the brick process, and each box is a translator implementing a separate feature. We have the FUSE translator which talks with the kernel FUSE module, we have a distribute translator, and we have a translator that talks over the network, speaking the GlusterFS network protocol. Similarly, we have translators on the other side, and finally we have the POSIX translator, which actually writes to the underlying file system. This is a volume graph. This particular one is a very simple graph without any branches and without any distribution, but in reality you would have branches starting out from the distribute translator, so that you have multiple bricks, with replication and everything else available.

So how do you use Gluster? Using Gluster is really simple. Once you have installed the GlusterFS server packages on your servers, you form the trusted storage pool using the peer probe command; this is where you let each GlusterFS server know about the others. Then you create a volume using the gluster volume create command: you give a name, you specify the redundancy that you want, or if you want erasure coding you specify the erasure-coding parameters, and you give the list of bricks you want to use for this volume. Then you start the volume. And once the volume is started, from the clients you can go ahead and mount it with a simple mount command. So you have basically four steps to create a usable Gluster volume; a sketch of these commands follows below.
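To make that concrete, here is a minimal sketch of the four-step workflow. The host names, brick paths and the volume name are made up for illustration, and the exact create options depend on the layout you want:

    # 1. Form the trusted storage pool (run from server1)
    gluster peer probe server2
    gluster peer probe server3

    # 2. Create a volume -- here a 3-way replicated volume, one brick per server
    gluster volume create myvol replica 3 \
        server1:/bricks/brick1 server2:/bricks/brick1 server3:/bricks/brick1

    # 3. Start the volume
    gluster volume start myvol

    # 4. Mount it from a client using the native FUSE client
    mount -t glusterfs server1:/myvol /mnt/myvol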
Now, before going on to Gluster 4.0, a little bit of history, and I'll also talk about where we are right now. Gluster started as an attempt to create a Linux distribution for building a supercomputer out of commodity hardware. The founder of Gluster had worked with a US national laboratory on supercomputers, and they thought they could do the same thing using software. So Gluster started as such a project, and GlusterFS was the part of this larger project that provided the storage for the supercomputer. GlusterFS v1 was a part of that bigger project. By the time v2 came out, sometime in the mid-2000s, GlusterFS had taken on more importance within the Gluster project. And by v3, which was released in the late 2000s, GlusterFS had become the primary project for the Gluster community.

So let's talk a little bit about Gluster 3.x. This was a really important release series for us. Gluster 3.0 was released in December 2009, nine years back. This was the release in which our protocol was defined; the network protocol that we speak to this day was defined there, which is important for 4.0. Gluster 3.1 was released in October 2010. This added GlusterD, the distributed management framework, which makes Gluster really simple to use and which enables those four simple commands that we saw at the start. It also brought along things like a native NFS server and other features, but the big thing in Gluster 3.1 was GlusterD. We had a lot more releases after that in the 3.x series, and we brought in lots of features: erasure coding, features that give you something like a NAS appliance, features like snapshots. Then we brought in gfapi, which allows us to integrate with external projects better, and that lets us integrate with NFS-Ganesha and Samba to provide better NFS and better SMB access. More recently we started integrating much more with the container world, with Kubernetes and OpenShift in particular. We have a project called Heketi, which provides a layer on top of Gluster: it provides a REST API for managing Gluster, giving users a single call to create a Gluster volume. You don't need to specify the bricks and so on; with Heketi you just say, I want a volume of a certain size, and Heketi gives you back a Gluster volume of that size. So these are the major features that Gluster gained over the 3.x releases. Right now we are at Gluster 3.13, released in December 2017, which is the latest release. It's a short-term maintenance release and is supposed to reach end-of-life next month.

And now let's go on to the future, to Gluster 4.0. Gluster 4.0 is going to be a major release, one of the biggest releases since 3.0 for us. It's scheduled right now for late February, later this month; we branched about two weeks back, and the release should happen soon. This is again a short-term maintenance release, because it's the initial release of this branch: we know there will be some bugs, and there are a few things missing that make it not recommended for really long-term production deployments right away. It is also going to drop support for older distributions by default: CentOS 6 and other EL6 distributions won't be supported, and the corresponding older Ubuntu and Debian releases are also being dropped on the community side.

So what does Gluster 4.0 bring? Gluster 4.0 brings protocol changes; this is the first time we are making properly versioned protocol changes since 3.0. And it brings a new management layer for Gluster. These two are the big changes for us.
In addition to that, there are a few smaller features. We have instrumented the code a lot more, so that we can gather metrics and do performance analysis and other work in the file system path better. We have changed some of the hashing algorithms we use internally for certain things, to make sure that Gluster can run on FIPS-compliant systems; Gluster won't be FIPS validated, but Gluster can run on a FIPS-enabled system. And there are a lot of performance enhancements, particularly around directory operations, with a lot more performance work in the queue. If you're interested, we had a talk yesterday in the performance track in the main tracks where we spoke a lot more about what we're doing with performance for Gluster.

Those are the 4.0 features, but we have a lot more coming. We have been talking about a lot of other features for the Gluster 4.0 release that are not landing in 4.0; they'll come somewhere down the 4.x release line. We have this thing called RIO, a new distribution algorithm that should handle distributing large directories, or directories with very many entries, much better. There is journal-based replication, which pushes replication to the server side and brings improvements there, performance improvements as well. There is this thing called GFProxy (some of these features are already available, but not fully baked yet), which tries to provide users with a lightweight client. Right now in Gluster, clients have all the intelligence and they're pretty heavy: they take care of the distribution, they have the distribution intelligence, the replication intelligence, and so on. GFProxy tries to provide a lightweight client and move some of that intelligence out to the servers. There is Halo replication, which should allow active-active replication across data centers. We already have geo-replication, which does replication across data centers, but it's asynchronous and works in a master-slave mode where you can write only to the master. We have a feature called arbiter, which tries to solve some split-brain issues by replicating just the metadata a third time instead of the full data and metadata three times, to make better use of storage; an improvement to that is coming, called thin arbiter, which tries to make arbiter even better. And there is plus-one scaling, which should allow users of replicated or erasure-coded volumes to scale by adding one extra server. Right now with erasure-coded or replicated volumes, if you want to grow your volume you need to add servers in multiples of the replica count or of the erasure-coding parameters, which is not always convenient.

Along with that, we also want to bring in more automated management features to help administrators perform day-one and day-two operations much better. Again, like we talked about with Heketi, we want to pull some of Heketi's features into Gluster itself, to provide, say, automated volume creation where you don't need to specify bricks; a sketch of that kind of size-based request follows below.
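As an illustration of what such a size-based request looks like with Heketi today, here is a short sketch; the size and replica values are made up, the flags follow Heketi's CLI as I recall it, and the exact invocation may differ. The point is the shape of the request: you ask for capacity, not bricks.

    # Ask Heketi for a 100 GiB, 3-way replicated volume.
    # Heketi picks the servers and bricks itself and hands back a ready-to-use Gluster volume.
    heketi-cli volume create --size=100 --replica=3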
And then we also want to provide day-two operation support for things like replacing failed bricks and replacing failed servers automatically, and for doing migration and healing in a better, much more automated way.

Okay, so now let's talk about Gluster 4.0 in particular. As I mentioned, the protocol changes are one of the biggest things happening in Gluster 4.0, and one of the things driving us to move up a version number. What this means is that we have a new on-wire RPC version. We had the RPC protocol set at Gluster 3.0. We did make changes to the RPC since then, but they weren't done in a really nice way, and technically a Gluster 3.0 client should still be able to work with a Gluster 3.x server. Now we are bringing in a new on-wire RPC version that breaks this. What this means is that we have improved our on-wire RPC structures. Gluster speaks an RPC protocol on the wire with XDR data encoding. With the protocol changes we have improved the request and response structures we use, adding more explicit members to the structures, and we have improved the on-wire representation of our dictionary data structure. All Gluster on-wire operations, all RPCs, carry a member called xdata. It's a dictionary data structure that can hold arbitrary key-value pairs, where the value can be of any type. What happened after the 3.0 release was that we started putting more and more things into that xdata member as we implemented more features. That was one problem: because it was a dictionary, we always had to encode and decode twice when sending information across the wire, once for the xdata structure itself and once for the dictionary inside it. The other problem was that the dictionary encoding itself wasn't well built: all the values you put into the dictionary were encoded to strings on the wire, and when decoding they were read back from strings and converted back to their original data types. That wasn't really good. So there is a new on-wire dictionary representation which avoids this and makes dictionaries more performant as well. But even though we are making these protocol changes, the older RPC version is still available. It should remain for quite some time to allow older clients to work with Gluster 4.0; it may be deprecated sometime in the future, but I'm not sure exactly when we'll do that.

Those are the protocol changes. We also have GlusterD2, which changes a lot internally about how Gluster is managed, but to end users it should remain mostly transparent; it should feel similar to what you already know. GlusterD2 is the new management system for Gluster 4.0. GlusterD2 is not backwards compatible with GD1, the classic GlusterD, which means we can't have a hybrid 3.x and 4.x cluster. But GlusterD2 is backwards compatible with clients, so file system clients can still talk to GlusterD2 and perform their mounts; that still works. GlusterD2 is a from-scratch rewrite, written in Go. It aims to scale much better than GlusterD1, and it aims to improve our integration stories.
That means integration with external projects, and integration of Gluster features themselves into the Gluster management framework. It also aims to reduce the maintenance effort required going forward to maintain the management framework.

So what does GlusterD2 bring? GlusterD2 brings a more, what do you say, universal API for management. In GlusterD1 we just had a CLI; there was no other API available to manage Gluster. GlusterD2 gets a REST/HTTP API. It's not pure REST, but it's an HTTP API. This allows GlusterFS to integrate with other projects easily, as almost all languages have really good support for HTTP and JSON (a rough sketch of what a request could look like follows below). There is a new CLI tool which talks to this REST API. The new CLI, though it's new, will still retain most of the syntax of the existing tool, so there is not a lot to relearn. Internally, we have made a lot of changes to make GlusterD2 itself more flexible and more pluggable for future enhancements. We have improvements to our internal transaction engine; the transaction engine is what we use to perform operations across the storage pool. We have a better implementation of volgen; volgen is the piece of code inside GlusterD2 that builds the translator graphs, the volume graphs we saw, based on what the user requested. The REST interface itself is pluggable, so technically users can add their own plugins to provide their own REST endpoints. And we have a flexible events framework internally to allow asynchronous communication of events, so that plugins can be notified of what's happening in other parts of GlusterD2 without explicit connections between the two modules.

One of the biggest changes is that instead of trying to store and synchronize our cluster information manually, we are now depending on etcd to provide a centralized store for us. In GlusterD1 we used to store information as text files on each and every server in the cluster, and the way we did our transactions and operations was all specifically tuned to keep all of these stores on all the nodes in sync. That worked well when clusters were small, but as they grew to larger sizes, say 50s to 100s of nodes or beyond, it didn't scale well, because it essentially worked like a mesh where every operation reached every node and every node had to persist something. It was hard, and if something did fail, if we had a split brain, it was quite hard to recover from. So in GlusterD2 we are moving to etcd to replicate the data and keep the cluster information consistent. But we use etcd in such a way that Gluster users don't have to manage etcd themselves: GlusterD2 embeds etcd and automatically sets up and manages the etcd cluster. By default it creates a three-node etcd cluster automatically as you do peer probes; Gluster users don't have to do anything, and Gluster just uses that cluster. Gluster will automatically take care of promoting or adding new servers to the etcd cluster when needed: if one of the original etcd servers goes down, Gluster will automatically bring another server into the etcd cluster, so you always have the requisite number of etcd servers running. So this is what GD2 looks like right now; these are the basics of GD2.
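To give a feel for the HTTP API mentioned above, here is a rough sketch of what creating a volume over it might look like. The port, endpoint path, JSON field names and the CLI name shown here are assumptions made purely for illustration and may not match the actual GlusterD2 API; treat the shape of the interaction, not the details, as the point.

    # Hypothetical request against a GlusterD2 HTTP endpoint (names and fields illustrative only)
    curl -X POST http://server1:24007/v1/volumes \
         -H 'Content-Type: application/json' \
         -d '{"name": "testvol", "replica": 3,
              "bricks": ["server1:/bricks/b1", "server2:/bricks/b1", "server3:/bricks/b1"]}'

    # The new CLI wraps the same API and keeps syntax close to the classic tool,
    # so the equivalent could look something like:
    glustercli volume create testvol replica 3 \
        server1:/bricks/b1 server2:/bricks/b1 server3:/bricks/b1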
In 4.0, GlusterD2 will remain a technology-preview feature; we are not going to force everyone to move to it right away. In 4.0 we will have most of the GlusterD1 commands, the original Gluster CLI commands, implemented. There will be some commands missing, particularly related to tiering, to the internal Gluster NFS server, and to some integration pieces that we had with Samba, Ganesha and others; those will be missing in 4.0. There will be preliminary support for automatic volume creation: we are pulling some of the Heketi APIs into Gluster itself, so you'll hopefully be able to create a volume by just asking for a volume name and a size, and you'll get back a volume of the size you asked for. And in 4.0, GD2 is not going to support migration from 3.x. Users may be able to do the migration manually by themselves if they want to, but we won't have any scripts or tooling to help users migrate in 4.0 itself.

Beyond 4.0, with 4.1 and later: Gluster 4.1 is planned to be a long-term maintenance release. By that time we'll have stabilized more. We'll have full documentation on all the commands we support, on all the APIs, and on the different management workflows you can use them for. We'll bring in support for migration from the older release; at that point we'll provide scripts that help users migrate data from the old GlusterD store into the etcd store. We'll also be doing things like centralized logging, and we'll have tracing frameworks implemented so that users can visualize how the Gluster management commands, at least, flow across the cluster. We'll be improving our automatic volume management and bringing in things like plus-one scaling; plus-one scaling is going to be largely a management-layer feature. And maybe we'll do things like automatic cluster formation, where you don't even need to do peer probes for the cluster to form; Gluster servers should be able to discover each other. In addition to that, we'll again be bringing in more native APIs that help integrate Gluster with other projects and that provide a single API for different sorts of workflows, so you don't need to run multiple Gluster commands to do a node replacement, for example; it would just be a single API call.

Since I mentioned that GD1 is not compatible with GD2, there is going to be a specific upgrade path. With the upgrade path, what we're trying to do is avoid any file system downtime. That means we want the file system to always be available: any clients that have access to running volumes will retain that access continually throughout the upgrade process, and they can continue reading from and writing to the volumes as the upgrade progresses. One caveat is that this only applies to volumes that are redundant; purely distributed volumes won't get this capability, but volumes that are replicated or erasure coded can do this. To make this possible, we will be shipping both the new GlusterD and the old GlusterD in Gluster 4.0 and Gluster 4.1. What this means is that users will be able to upgrade from Gluster 3.13 right now to Gluster 4.0 or Gluster 4.1 using the old GlusterD, so you'll be able to do a live rolling upgrade. This is already a documented procedure; a rough sketch of the per-node steps follows below.
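To make the rolling upgrade concrete, here is a hedged sketch of what it could look like, one server at a time. The package manager command, service names and the volume name are placeholders that depend on your distribution and layout; the Gluster upgrade documentation is the authoritative reference.

    # On each server, one at a time:
    systemctl stop glusterd            # stop the management daemon on this node
    yum update glusterfs-server        # or the apt/dnf equivalent for your distribution
    systemctl start glusterd           # bring the GlusterFS services back up on this node

    # Let any writes that happened during this node's upgrade be replicated back to it
    gluster volume heal myvol
    gluster volume heal myvol info     # wait until the pending heal count drops to zero

    # Only then move on to the next server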
And then we'll provide you with ways to do a migration from the old GlusterD to the new GlusterD; I'll describe those steps in the next slide. Right now we plan to ship the old GlusterD in Gluster 4.0 and Gluster 4.1, and we currently plan to remove GlusterD1, the old GlusterD, in Gluster 4.2. During this time, GlusterD1 will not receive any feature updates, so it will essentially stay what we have right now in 3.13.

So these are the expected steps for upgrades, as I mentioned. Initially, users will need to perform a rolling upgrade from Gluster 3.x to 4.0 using GlusterD1. The steps are as described here: you install the new packages on each server, one after the other. So you install on one server, restart the GlusterFS processes on that server, and do a heal of the volumes so that any data written to the volumes during the upgrade of that node gets replicated properly. Once the heal is completed, you move on to the next node. By doing this, clients always have access to the data and the volumes during the upgrade process, and you move from the 3.x release to the 4.0 release without any downtime for clients.

After this upgrade, we do the move to GlusterD2. We would then kill GlusterD1. This just kills the management layer; your file system is still alive, and connected clients can still access it. At the point when GlusterD1 is killed there are no management daemons running anywhere, so new clients won't be able to mount GlusterFS volumes, but anything already connected still has access. Once GlusterD1 has been stopped everywhere in the storage pool, we go ahead and start the new GlusterD2, and once GlusterD2 starts, we do the migration into GlusterD2. During the migration we form the new cluster using GlusterD2, then we'll have scripts to import data from the old GlusterD store into etcd, and once the data is imported, GlusterD2 should automatically pick up the running file system daemons and begin to manage them. At that point the migration, or the upgrade, is complete, without downtime for clients. And yeah, that's it. So, any questions? You can ask me now.

Yeah, so the question is, do we embed etcd in GlusterD2? Yes, we do. We have our own embedded etcd, but if users want to, they can actually connect to an already running etcd service. That's intended for use cases where Gluster runs in containers, in Kubernetes or OpenShift, where etcd is already running and managed. The internal etcd is managed by GlusterD2. People may be able to access GlusterD2's instance of etcd, but we're still not sure if we want to provide that access directly.

So the question was, do we have a timeline for 4.1? 4.1 is planned for three months after 4.0; GlusterFS follows a three-month release cycle. 4.0 is planned for the end of February, so counting March, April, May, the end of May or early June should be 4.1.

Okay, the question is, how far along is the Kubernetes integration with GlusterD2? Not much right now; we are just trying to implement the Heketi APIs directly. But we have additional projects in the GlusterFS community that provide you with Kubernetes provisioners, I think they're called, and things like that. Right now, those provisioners speak the Heketi API.
But since we are basically moving that API as it is over to GlusterD2, they'll similarly be able to work with GlusterD2 directly once it becomes stable. Any more questions?

Okay, the question is, what's the impact of running old clients against a new 4.0 cluster, right? Since the old protocol still remains, the old clients will still run, just with technically lower performance. Clients running the new protocol should have slightly better performance because of the improvements we made to the on-wire encoding. But old clients will still work.

Okay, the question is, is there a performance impact from the GFProxy implementation, right? So yes, there will be, because there will be two hops. But depending on where the proxy daemon is running, it can make things simpler: if the proxy daemon is within the trusted storage pool itself, clients only have to make one request to the proxy daemon instead of sending out multiple requests to multiple servers in the pool for replication and so on. So for some use cases there will be a bigger impact, for others not so much, I guess. I don't know that much about GFProxy and how it's going to work, but maybe we have some of the developers around. We have a stand in the K building, right? You can visit us there. I guess we have a developer around who can answer that question; I don't know it in particular. Does Nils know? Yeah. Okay. Any more questions?

Could you say a few words about your experience of moving to Golang? So you want a few words from me on my experience of moving to Golang? Yes. One of the things was that going to Golang allowed us to concentrate more on the challenges we actually had, like designing the frameworks, designing the pluggability and keeping things modular, instead of worrying about all the C-specific problems we used to have, like memory allocation and handling all of that properly. So we could concentrate more on the logic, and Go itself has built-in support for lots of things, like style tooling and tooling for CI; it's better than what we had with C. So it helped us get to GD2 much faster than it would have been if we had done it in C again. That's all I can say. Any more questions? Nope. So thanks everyone.