Welcome to another edition of RCE. This is Brock Palen. You can find us online, and find our entire back catalog, at RCE-cast.com. You'll also find links to Jeff's wonderful high-performance blog, as well as all of our Twitter handles and everything else like that. I also have, again, Jeff Squyres, one of the authors of Open MPI, who works for Cisco Systems. So, Jeff, thanks again for your time. Yes, apparently I have a wonderful blog that everybody should go read. Actually, there's been some fun stuff recently about process affinity and things like that. I love bringing up that stuff, because most people don't appreciate the complexity of the internals of your servers and how it actually affects the performance of your HPC jobs, and you don't even know it. So, go read them. They're good stuff. So, let's go ahead and roll right into our guest today. This one's been requested a couple of times and has taken a little bit to get lined up, but we finally have it. We have with us Jeff Darcy of Red Hat, and he's gonna be talking to us about Gluster, a distributed parallel file system, but I'll let him get into the details of what's unique about it versus the other offerings and things like that. So, Jeff, take a moment to introduce yourself. Hi, good to be here. I've been involved in distributed storage and HPC or semi-HPC for about 20 years, since I was working on NFS version 2 back at Encore. My most recent gigs have been working on Lustre at SiCortex, sadly missed, and now I am at Red Hat, where I started a project called CloudFS, which was based on GlusterFS, and since we acquired the company behind that, I've been one of the architects and, to a large degree, the public face for that project. Okay, so can you get into a little bit about the details of Gluster? Well, the idea behind Gluster is that it's scale-out storage, so it is based on the idea of getting a lot of commodity servers. It's largely a cost play. It's a convenience play.
It's a little different from some other systems in that it has no single metadata server, and in that it is extremely modular. I sometimes say that it's not so much a file system as a way of building file systems. We just happen to build one, but you can actually use the interfaces that we provide to build others. So that's actually one of the real key things about it: particularly in something like an HPC context, where you may have applications that can benefit from a very specific optimization, you can actually build that optimization in, because the file system as it is now may not perform that well for you, but with just a little tweak, it can really make a huge difference. Now, let me zero in on that. What exactly do you mean by an optimization? This is coming from a guy who doesn't know anything about how file systems work. So could you give me an example? Let's say I'm gonna be running a specific application and it has a specific file access pattern; how does that translate to an optimization that I can build into the file system? Well, there's a couple of examples that I've come up with. One, and this is a little bit far afield, is the case where you have a script that's pulling in dozens of other scripts that are spread across dozens of library directories or something. And the way that things work is, to ensure that we're getting current data, we actually have to go look in each of those locations just in case the file showed up there. And so we're actually doing a lot of lookups that fail, and we're not remembering the fact that we did a lookup that failed. So those kinds of operations and those kinds of workloads can perform very poorly. So I actually implemented a translator that remembers the failures. And if we failed within the last minute, then we'll fail again immediately, quickly. That can make a huge difference for those types of applications.
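The failed-lookup-caching translator Jeff describes can be sketched roughly as follows. This is an illustrative Python model, not Gluster's actual C translator API; the class and method names are invented, and the one-minute window matches the behavior he mentions.

```python
import time

class NegativeLookupCache:
    """Sketch of the 'remember failed lookups' idea: if a path lookup
    failed within the last ttl seconds, fail again immediately instead
    of repeating the expensive search across all the bricks."""

    def __init__(self, backend_lookup, ttl=60.0, clock=time.monotonic):
        self.backend_lookup = backend_lookup  # the real (slow) lookup
        self.ttl = ttl
        self.clock = clock
        self.failed = {}  # path -> time of last failed lookup

    def lookup(self, path):
        now = self.clock()
        last_fail = self.failed.get(path)
        if last_fail is not None and now - last_fail < self.ttl:
            raise FileNotFoundError(path)  # fast-path failure, no backend call
        try:
            result = self.backend_lookup(path)
        except FileNotFoundError:
            self.failed[path] = now  # remember the miss
            raise
        self.failed.pop(path, None)  # the path exists now; forget the miss
        return result
```

With a script probing the same nonexistent library path over and over, only the first probe actually reaches the backend; the rest fail immediately from the cache.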
Another one, which is probably more relevant, is something kind of like the parallel log-structured file system, PLFS, that was developed at one of the national labs. And what it does is, as you write out files, it actually writes them locally and then asynchronously spools them back to the main store later. So if you have a workload that's actually dribbling out little bits of data across a whole bunch of files, if you don't do anything special, that's gonna perform really terribly, because every one of those tiny writes is going over the network. But with something like this, they all get batched up into huge blobs that get transferred very efficiently. And with PLFS, the real thing, they've seen really significant speedups on a lot of real applications. And so you can do the same sort of thing in Gluster as a translator. So that caching everything locally and moving it back: one, don't you have a global file system consistency problem if one of those clients goes away, and two, is this completely transparent? Is this just like a standard feature or is this an add-on? This would be an add-on. In fact, it's one that's still in development, but it's an example of the kind of thing you can do. And yes, you're absolutely right, it's a consistency problem. And what we found with Gluster is that we have defaults that are very strongly biased towards consistency, but a lot of applications don't care about that level of consistency, that they're actually prepared and able to make a small sacrifice in consistency to get a huge performance improvement. So what we've done is, while we set the defaults very conservatively, we've provided a mechanism by which you can relax those defaults in ways that make sense for a particular application. So dig into that just a little bit more.
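The PLFS-style batching idea can be modeled the same way. Again this is an illustrative sketch, not PLFS or Gluster code; `remote_write` is an invented stand-in for the expensive over-the-network transfer, and the consistency caveat discussed above applies, since data is not on the main store until it is flushed.

```python
import io

class BatchingWriter:
    """Absorb many tiny writes into a local buffer, then ship one large
    blob over the network instead of one round trip per tiny write."""

    def __init__(self, remote_write, batch_size=1 << 20):
        self.remote_write = remote_write  # one call = one network transfer
        self.batch_size = batch_size
        self.buffer = io.BytesIO()
        self.transfers = 0

    def write(self, data: bytes):
        self.buffer.write(data)
        if self.buffer.tell() >= self.batch_size:
            self.flush()

    def flush(self):
        blob = self.buffer.getvalue()
        if blob:
            self.remote_write(blob)  # one big transfer for the whole batch
            self.transfers += 1
            self.buffer = io.BytesIO()
```

A workload dribbling out a thousand 100-byte writes ends up costing a couple of large transfers instead of a thousand small ones.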
To do this kind of translator thing, do I have to, as an application writer, and I'm writing my physics application, use like a Gluster IO library, or, when using these translators, can I stick to some sort of generic POSIX open, close, MPI_File_open, MPI_File_close, things like that? There's actually a lot of flexibility there. Translators are part of our core infrastructure, which is independent of what access method you're using. So they can run on the client, they can run on the server, they can be arbitrarily reordered relative to one another. So there's a lot of different things you can do there. You can use that core stack through several methods: you can use it through our own native protocol, which uses FUSE on the clients, you can use it through NFS, you can use it through CIFS via Samba, and we do actually have a library API as well. That's currently being used for our QEMU block storage, so it's bypassing all of the file system stuff, basically. That could be used in theory from something like an MPI library, although as far as I know, nobody's gone down that path yet. So a translator is our name for our plugin interface, which allows you to add functionality or change functionality within the file system. The term is actually taken from the GNU Hurd project, which is where one of the project founders came from. And the name comes from the fact that it translates an IO request from the user into one or several IO requests going down towards storage, but both of those are expressed in the same terms. They're the same API above and below. So that's how we can use those to stack them on top of each other in all sorts of arbitrary and perverse orders and get exactly the functionality that we need for a particular deployment. All right, so you told us what Gluster is. What is the scope of the project? What are the aims of the project? And who's involved?
Well, the project actually came out of a different background than a lot of other distributed file system projects. It wasn't a PhD thesis. It wasn't an academic or a labs project. It actually came out of a very immediate commercial need. It was a Venezuelan oil company. They needed cheap distributed storage that was better than tape; that was actually the performance bar. So it's evolved in a very practical set of directions, very user focused. And by user in this case, I mean a very generic user, not necessarily a specialist user. So historically it's been big in areas where they have very friendly, well-behaved kinds of workloads, things like media streaming, content delivery networks, things like that. Now, since Red Hat has picked it up, there's been a lot more interest in using GlusterFS as a substrate for virtualized storage for virtual machines. And really that's going to be a huge direction. There's also, from outside, a huge push towards accessing Gluster data through the object storage interface, which is compatible with Swift, similar to S3. I'm actually a bit surprised by how much that is driving a lot of customer and user engagements. So currently, those are the big directions we're going in. It is not entirely Red Hat that's behind this. There actually is an upstream community that is separate from the Red Hat product based on it. So there's GlusterFS, which is upstream, and then there's Red Hat Storage, which is the fully supported product based on it. And they are separate. I'm actually on the advisory board for the upstream project. And there are sometimes divergences, where the upstream project actually wants to go in a different direction that is more dictated by what the community wants. So really it's very driven by what people ask for. So S3, you mentioned that, and Swift. Those are the object stores. S3 is the Amazon object store and Swift is the OpenStack object store interface, right? Correct.
Okay, so you guys could stand up Gluster, install it locally, and point your applications that were written either for Amazon S3 or for OpenStack at it, and they could at least access storage off of this. Correct. What we've actually done is we have submitted some patches, which we're currently carrying ourselves but which hopefully will be incorporated as part of OpenStack eventually, which actually use the real live OpenStack Swift code and sort of fool it a little bit into thinking that all of the objects are local. Now, in fact, they're not, but they're accessed through the local file system interface, and then we take care of the distribution and replication and so forth. So by implementing the Swift API that way, we get S3 compatibility to the extent that Swift itself does, and we leverage all of the rest of that ecosystem as well. And the real key thing that we do that's a little bit different from some of the other people who are doing unified object and file storage is, in our case, it's actually the same bits: you can actually take something, write it as a file, and retrieve it as an object, or vice versa. With some of the others, it's a common storage pool, but they're separate objects within that pool. And our feeling is generally that we want to have the same data, but accessible through as many methods as possible. NFS and CIFS all fit into that as well. So it's really a very valuable thing. Hadoop fits into that also: you can write stuff through any application that can use a POSIX file system API and then use it directly in Hadoop without having to import it. And vice versa on the back side if you want to get data back out. So most of our listeners are HPC focused, or I assume that because that's my focus. And a lot of these systems traditionally use a POSIX-like interface, even for distributed parallel file systems.
Can you touch just a moment, for those of us who are ignorant of object stores, on exactly what an object store is and how it differs from, like, a POSIX interface? Well, they're basically similar to a file system interface, but fairly deliberately dumbed down, you might say, or simplified if you were feeling more charitable. I'm a bit of a file system bigot because I've been doing it so long. The idea is that there are certain things about POSIX that really inhibit scalability. And we saw some of this years ago from the PVFS people and others, who said having to conform to some of these strict atomicity and ordering and consistency rules really makes it hard to create something that scales. And the object store guys took that even further and they said, well, okay, we're gonna have a single directory hierarchy. We have buckets or containers, depending on which actual project you're using, and then you have objects within those. You're not gonna have arbitrarily nested directories. You're not gonna have the richness of attributes. You're gonna have a different permission model. You're going to have a different data access model, where it's pretty much limited to whole-object get and put. There are no single-byte or overlapping writes within the middle of an object. And all of this is done with the idea of making it simpler and making it easier to scale. So in some cases, you can use it the same as you would a file, and in other cases, you can't. And even in cases where you could, you do have to change code to use a different interface. So there's pluses and negatives. A lot of people like it; that's why we do it. But it's not quite as flexible as a regular file system interface. So you mentioned PVFS in there, and we've had them on the show. And so there's a parallel file system, which has multiple machines with multiple disks all presenting a single file system interface.
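To make the contrast concrete, here is a toy model of the flat object interface being described: one level of containers, whole-object get and put, no nested directories, no partial writes. It is loosely Swift/S3-shaped, but it is not either real API; all the names are invented for illustration.

```python
class ObjectStore:
    """Toy flat object store: containers hold whole objects, and the
    only data operations are whole-object put and get."""

    def __init__(self):
        self.containers = {}  # container name -> {object name: bytes}

    def create_container(self, name):
        self.containers.setdefault(name, {})

    def put(self, container, obj, data: bytes):
        # Whole-object replace -- there is no seek or partial write.
        self.containers[container][obj] = bytes(data)

    def get(self, container, obj) -> bytes:
        return self.containers[container][obj]

    def list(self, container):
        # Flat namespace: "run1/out.dat" is just a name with a slash in it,
        # not a nested directory.
        return sorted(self.containers[container])
```

Compared with POSIX, there is nothing here for the server to serialize or lock at byte granularity, which is exactly the simplification that buys scalability.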
And then there's the ability to do parallel IO, which is having multiple clients write to the same file simultaneously. Can you do that with Gluster? And if so, can you do it using the object interface, or do you have to use the POSIX interface? Well, since the object interface is largely whole-object, then essentially no, you can't do it that way. You can do it through the file system interface. We are nowhere near as sophisticated in handling that as the PVFS folks are. That's a focus for them. So I actually have worked with those guys. I was out at Argonne as part of my SiCortex duties at one point, almost got snowed in, in fact. It was pretty ugly. So I've had talks with them about this, and we're broadly in agreement that that's kind of a special thing. What we do, since we are very, very strict about maintaining the POSIX ordering and consistency guarantees, is a lot of locking and back and forth. One could imagine that we could evolve in a direction where we skip that locking for applications that have their own forms of concurrency control and get better results. There's probably other issues that we'd have to resolve before we were really good at that. Right now the workloads that we work best for are the ones where those sorts of things don't happen. They work, they just don't work well for us right now. All right, so let's take this concept of a translator and some of the ideas that you mentioned previously, like playing with locking and knowing what the application is gonna do and integration with MPI and so on. Obviously I have a large MPI bias since I'm an MPI guy. What do you imagine could be done, just as a thought experiment here, from an MPI perspective, where the goal is to have many processes simultaneously reading and/or writing to the same file to get some kind of speedup, get some kind of optimization? What do you think MPI should do to speed this kind of stuff up? What kind of hints can it give to Gluster?
Well, there's a couple of different directions that I think the communication can go there. On the one hand, if MPI could tell Gluster that it knows something about what it's doing when it's accessing a file concurrently, then we could use that as a signal to relax or elide some of the locking and synchronization that we do. So we pick up a big performance gain there. There's things just in the configuration that can happen. In my experience, a lot of the scientific users are pretty comfortable running their applications on top of scratch space and then moving their results and their data back and forth. So you can skip the Gluster replication, which actually has fairly serious performance implications; we're much faster without replication than with it, because in our case the replication is synchronous. And then there's the whole issue of data placement, where we already have hooks; these are used by Hadoop to find out where copies of a file actually went. So that's across replication, across striping. We'll give you all the information about where the bits of the file are. So that can actually be used by a job control system or anything like that to place jobs where the data is. On the flip side, we're looking at adding features where you could actually have the application or a library tell us where we should put the data. So if you have a job that's already running somewhere, then you can tell us where to put the data that comes out of it. The degenerate case is just to put it where we are; we actually have that already. But another case would be telling us to put it on some other node where we know the next stage of a computation is gonna run, or at least put it within the same rack, put it somewhere attached to the same switch, things like that. So there's all sorts of opportunities there for the application layer and the job manager and Gluster to communicate about data placement.
I think that's actually a really interesting research area that hasn't been adequately explored in a practical context. There's some academic work, but not a lot out in the real world. Okay, so it sounds like you wanna implement a lot of these functionalities by using translators and other things. These translators, are they something that I can, as a user, inject from user space, or is this something that has to be set by the admin globally? Generally, they are set by the admin globally. You can actually, as a user, add them on the client side. So as a user, you can't affect the server side, but you can certainly affect the client side, and you can cook up what we call a vol file, which is the volume definition that includes all of the translators, all of the options for the translators, how they're all hooked together: going from very dumb, very simple, what we call bricks on the servers that provide the raw storage, and then combining those in various forms, distributing across them, replicating between them, et cetera, et cetera, et cetera. And you can actually put your own translators into that vol file and then mount using that vol file. So you do actually have some control over that. Now, that works for translators that have already been written and built. As a user, you're gonna have a little bit of a learning curve to actually write a translator. It's a lot of domain-specific knowledge, but there's certainly the possibility there to do that. When I've developed things like the lookup-optimizing translator that I've done, I've done exactly that. I've hand edited my vol files and then mounted using those. Okay, so these vol files, do they just define which translators to turn on, and then settings for them, for translators that are already installed, or can I actually inject my own translator that the admin hasn't installed or enabled globally? You can inject your own.
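As a rough illustration of what such a hand-edited client vol file looks like, here is a hedged sketch in GlusterFS's vol-file syntax. The hostnames, brick paths, and the user-supplied translator name are all invented for the example.

```text
# Two bricks exported by two servers, reached over the native protocol.
volume brick1
  type protocol/client
  option remote-host server1.example.com
  option remote-subvolume /export/brick1
end-volume

volume brick2
  type protocol/client
  option remote-host server2.example.com
  option remote-subvolume /export/brick1
end-volume

# Replicate between the two bricks (done from the client side).
volume replicate0
  type cluster/replicate
  subvolumes brick1 brick2
end-volume

# A user-supplied translator stacked on top, loaded as a dynamic library.
volume negcache0
  type features/negative-lookup-cache
  subvolumes replicate0
end-volume
```

Mounting with this file instead of the admin-generated one is what injects the extra translator on the client without touching the servers.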
The vol file contains the names of translators, which are then actually found as dynamic libraries in our library directory. So the translators do have to go there, but it's all totally dynamic. So we can readily load translators that we GlusterFS developers never heard of before, that the user happens to have, as long as they can put it in the right place and name it in the vol file. In fact, when I started CloudFS, which later became HekaFS, I was at Red Hat and Gluster was a separate company. This was really the key to why Gluster was selected as the technology base: I could do pretty much everything I needed, things that the Gluster developers had never thought of, as translators. I could write them on my own, they could be packaged, they could be supported, they could be licensed separately from Gluster itself. And some of our users actually come back to us and say the same thing, that they actually want to write translators, and they have their own concerns, say around licensing. And the fact that it is a fully abstracted public API, and that they can build something that's physically, logically, legally separate, actually is what enables them to do what they want to do. So stepping back a little bit, you mentioned replication, and you also mentioned the distributed metadata namespace. Can you touch on some of the enterprise-like resiliency features that Gluster has? Well, there's two main things that we do to facilitate data protection. We have synchronous replication. This is a style of replication that is designed for use within a local fast network, so that provides protection against a single failure within that data center. Then we also have remote replication. We call it geosync, georep; I don't know what the current favored term is. That's asynchronous cross-site replication. Think of it like rsync, but it's actually using some highly optimized methods of finding which files need to be transferred.
So it's actually like rsync plus, you could say. And that's more of a disaster recovery, your-whole-data-center-blew-up or you-lost-connection-between-two-data-centers sort of need. So you can actually use both of those. We are looking at adding other things that play into that same data protection space. For example, bit rot detection: we're looking at adding checksums or something like them, possibly even erasure codes, as ways to further protect data against loss. And that's really what we see as the biggest thing that a lot of the enterprise crowd are looking for. It's often why they come to us instead of somebody else, that we have built-in replication, and that's really important to them. Just a side note, there is actually an erasure code translator, which provides protection against multiple disk or node failures with very good storage utilization. It's being developed not by us, but by a company in Spain, using that translator interface. I actually found out about this after they were kind of well into it, which was a major milestone: here's somebody actually developing stuff, and they didn't even have to come to me for the information first. So because of what you were just talking about with the replication built in, do people typically run with RAID, or is it just easier to run with boxes of JBODs? That I just have a billion individual disks and Gluster takes care of the replication for me, so there's really no need for any kind of RAID. You can do both. The most popular type of deployment is actually a mixture. What people will typically do is they'll run RAID 5 or RAID 6 on each brick, and then they will combine those bricks using our replication. So simple disk failures are then handled at the RAID layer, and they're very transparent, and the recovery is very performant, to the extent that RAID recovery is. At least it can be done largely in hardware if you have a hardware RAID controller.
And then node failures are handled at the higher level by our replication. So you're getting both kinds of protection. You're dealing with the most common failure mode in a very efficient way. You're dealing with the less common failure mode in a less efficient but still functional way. And then on top of that, you can have the replication to a whole other data center, so the even less common failure is handled in yet another way suitable for that need. So the metadata, though: in a lot of distributed parallel file systems, metadata either is mixed some way with the data store, or there's a separate host, or it can be on the same host but on a different disk. How do you guys handle that, and how do you handle your distributed metadata? Well, this is actually what sets us apart from both, say, Lustre, which has a centralized metadata approach, and Ceph, which has distributed metadata but does it in a different way. In the Ceph case, what they do is they actually have two different layers. They have an object layer, and then they have a file system metadata layer that scales separately; they use different protocols among themselves, different algorithms. And what we've done is actually pretty different. We found that the scaling separately is not actually all that useful. In our case, every server is both a data server and a metadata server. And both types of information are actually looked up using the same consistent hashing, or as we sometimes call it, elastic hashing method, which is based on placing both servers and objects around a conceptual ring using hash values. And then you find the server for a particular object by looking around that ring from the file's hash to the next server's hash. So that makes it a little bit similar to the way Ceph does things with CRUSH.
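The ring lookup being described can be sketched in a few lines. This is an illustrative model only; Gluster's real elastic hashing (and Ceph's CRUSH) differ in important details, and the hash function and server names here are invented for the example.

```python
import hashlib
from bisect import bisect_right

def ring_hash(name: str) -> int:
    """Place a name at a position on a conceptual ring (a 32-bit hash space)."""
    return int.from_bytes(hashlib.md5(name.encode()).digest()[:4], "big")

class HashRing:
    def __init__(self, servers):
        # Each server gets a ring position derived from its own hash.
        self.points = sorted((ring_hash(s), s) for s in servers)

    def server_for(self, filename: str) -> str:
        # Walk clockwise from the file's hash to the next server's hash.
        h = ring_hash(filename)
        i = bisect_right(self.points, (h, ""))
        return self.points[i % len(self.points)][1]
```

The point of the scheme is visible in the code: any client holding just the server list computes the same answer with no lookup table, and adding a server only remaps the files whose hashes fall in the new server's arc of the ring.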
It's a different algorithm, but it's the same basic concept: the client can figure out which server it should be talking to based on a small amount of data plus an algorithm, instead of having a big lookup table. We don't have to maintain some central directory that tells clients which server to talk to for which file. It's apparent from the server topology and from the hash of the file which one that should be. So if I'm running out of inodes or something like that, I can just stand up another couple of bricks, and now I've got more inodes in the file system, sort of set up. Is that basically what this translates into? Yes, we actually do keep track of both free space and free inodes. You can get information about that across your bricks, and you can see when you're likely to be running out. As for the process of adding new bricks: bricks generally will correspond to servers, although you can actually have multiple bricks served from a single server, and you can just add bricks. The process is a little less transparent than I personally would like. You do actually have a series of steps to go through to incorporate it into the topology, change some of that information that we use to determine which files are supposed to be on which bricks, and, if you want to, actually migrate data from the old bricks to the new ones. I would actually like to see a lot of that be completely autonomous, so that at worst all you have to do is say it's there and we'll handle the rest. But that's not the way it is. And part of the reason is, again, this is an enterprise-y thing. A lot of enterprise users actually do not want the system going off and doing these very heavyweight, high-IO repair, rebalance, maintenance types of operations on its own. They actually want to schedule that. They actually want to have control over when this operation takes place.
So they might actually add the new brick and do the part of integrating it into the system where we update all of our information about where files should go, but we don't actually start migrating data until they tell us to. And then they do that during their scheduled downtime, or during what they know is going to be a slow period. So we would like to have the option for it to be fully autonomous, but there will probably always also be an option to have it be under administrator control. So, just a side note here, I have to ask: where did you guys come up with the term brick? Because at least in my world, a brick means a machine that is dead. It is dead as a brick, it doesn't respond and whatnot. So I've been inwardly laughing every time you and Brock have said the word brick throughout this conversation. Well, yeah, it's not our best terminology decision; it's not our worst. I think our worst is actually the name of the project, because people see the G on the front and they think it has something to do with GNU, which it doesn't. They hear the whole name and they think it has something to do with Lustre, which it doesn't. So I constantly get asked about both of those. Obviously it's too late to do anything now, but I kind of would have wished for a different name. We've actually had people come into our IRC channel looking for Lustre, and I'm sure it's happened the other way too. Well, I mean, you look at Lustre and Gluster, and the -er and the -re, I mean, isn't it obvious? Terribly. Going back to brick: there was actually a bit of a tradition of using this. We use it because a brick is something that's very stupid, dumb as a brick. And that's actually one of our design principles, that the server should actually be providing a very base level of functionality, and that a lot of the intelligence that might be on the server in most other systems is actually pushed to the client. For example, replication: in most systems, that's a server responsibility.
In our case, the replication is actually done directly from the client. Distribution and hashing, similarly, are done directly from the client. So that's kind of the idea there. There are other projects that have used the same terminology, like FAB, Federated Array of Bricks. That was a project that I think preceded us. And I think it's shown up in other contexts as well. It is unfortunate that it has overlap with other uses of the term, though. So that's an interesting design decision, you say, that a lot of the intelligence is down at the client, and it kind of sheds light on something you said earlier about how replication has severe implications on performance, and I would assume that that's because it's happening at the client and not at the server. What are some of the other trade-offs that you have experienced by putting things at the client instead of the server? You just mentioned a bunch of the positive effects of that. What are some of the negative effects? Well, there's definitely a negative effect in terms of performance for replication. There's also a negative effect that I've seen a couple of times as I try to do new things: because things are at the client, if you want to do anything that involves a read-modify-write to update a fixed-size block of data, then you're talking about a read-modify-write between the client and the server, which is much more expensive than a read-modify-write just on the server. So we've hit this when we were doing encryption using block encryption methods. We've hit this with erasure codes. We're going to hit it again with deduplication. So these are all things where it would in many ways actually be better for us to put that functionality on the server. Now, through the translator interface, we can do that. We can implement the protocol or the algorithm in a translator, or a pair of cooperating translators, which may run one on the client, one on the server, but may also both run on the server.
And there's different trade-offs. Running it on the client means you're pushing functionality and work out to the most numerous components, which is often good for scalability. It also means that you have an availability advantage: the client already knows where the alternates are if it needs to do anything about a failure, and so forth. But then there's that performance negative. So if you're comfortable with dealing with availability some other way and you want the better performance, then you can actually move some of that stuff to the server side. Now, that's not necessarily supported currently by our CLI and configuration tools. The low-level infrastructure certainly can do it; I actually do that myself on a regular basis just for fun. But I think in the future, we're gonna start opening that up a bit more, so that moving functionality between the client and the server, which has historically been on the client, will actually be a fully supported and transparent kind of operation. Okay, so let's talk about performance a little bit. I guess there'd be two different ways to ask this question. One would be, how does the performance scale for a normal POSIX interface as you add bricks? And then, what's the performance of a single client in, you could say, the standard out-of-the-box config with the basic replication? Well, as I'm sure you know, talking about performance is a really good way to get yourself in trouble. I mean, we are generally not trying to be the performance king. So that's kind of a decision that has often been painful, but it's one that we made pretty early on. Scalability is excellent, at least in the IO path. So you can continue adding servers and see a very nearly linear performance increase for what we would call well-behaved workloads: large-block, sequential, streaming types of workloads. Some of our biggest users are media services, things like that, that have exactly that kind of workload.
And we have configurations with, I think the biggest, I can't say who it is, but I think it's a few hundred servers. Now, when you start getting into workloads that are more small-IO, small-file, random, synchronous, single-threaded, then we don't do so well. I'm actually a little bit surprised. I've done some comparisons myself. I think I've probably run more parallel and distributed file systems than, well, maybe than anybody. I've run, I think, 10 of them at this point, often on the same hardware. And I'm actually a little bit surprised that we generally come out pretty well, even for workloads that I wouldn't expect to. But it's not really our primary focus right now. The one thing that is really kind of awful for us, maybe I shouldn't even be saying this, but what the heck, I think people deserve to be warned, is the many-small-files type of workload. And I know these are actually very common in HPC. I've been burned by these many times. Or where you have tens of thousands, even up to millions of files in a single directory. That can be really painful, because directory operations are really kind of the only area where using FUSE to do file system stuff in user space in the native client really does hurt us a lot. A lot of people complain about FUSE. In most cases, it doesn't matter so much, but in that one case, that's why I do things like the lookup optimization and things like that, because that is definitely not our strong point. We really like the streaming large-file stuff a lot better. Yeah, I don't think you're alone in that space for any of these distributed file system things, where you've got multiple RPC-type requests going back and forth over the network all the time. Even regular NFSv3 is like, yeah, really don't give me a lot of metadata things to do.
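The many-small-files pain both Jeffs are describing is easy to see with a back-of-envelope model (an editor's sketch; the round-trip time, per-file operation count, and bandwidth figures are invented purely for illustration):

```python
# Why per-file metadata round trips dominate: compare moving the same
# amount of data as many small files vs. one large streaming write.

RTT_US = 200            # assumed network round trip per metadata RPC, in µs
BANDWIDTH_MBPS = 1000   # assumed streaming throughput, MB/s

def small_files_seconds(n_files, ops_per_file=3):
    # e.g., lookup + create + close per file, each paying one round trip;
    # data transfer time is ignored because the files are tiny
    return n_files * ops_per_file * RTT_US / 1_000_000

def streaming_seconds(total_mb):
    # one big sequential write is limited only by bandwidth
    return total_mb / BANDWIDTH_MBPS

# 100,000 x 10 KB files vs. one 1 GB file: roughly the same data,
# wildly different time spent on the network
print(small_files_seconds(100_000))  # seconds lost to round trips alone
print(streaming_seconds(1000))       # seconds for the single stream
```

With these made-up numbers, the small-file case burns 60 seconds on metadata round trips before a byte of data moves, while the stream finishes in about one second, which is the shape of the gap regardless of the exact constants.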
Absolutely, I think we still suffer, but yeah, it is to a certain extent a characteristic of being distributed, and at some point, the fact that you are physically separated across a network is going to show. So, kind of moving on from the performance thing, what's the strangest use of Gluster you've ever seen? Well, I actually spent a fair amount of time considering this one. So I'm tempted to say that NASA's use of Gluster is pretty far out, but that's just kind of obnoxious. I know there's a few porn sites using our stuff. I should hasten to add that I know that because I've seen references to it in technical material. Obviously, yes. The strangest use that I can actually think of, that I really know something about, is I actually have used Gluster myself, and used our replication component in particular, to do a file system migration just on one machine. I set up a translator arrangement where one half of the replication pair was the old file system and one was the new one. And that way I could do the migration from the old one to the new one while the file system was live. Cool. Here's a question I like to ask developers and projects out of sheer curiosity, just because I love to hear what people say about this. What version control system do you guys use for your source code, and why? Well, we use Git. I'm not entirely sure why it's Git rather than, say, Mercurial or Bazaar. I think mainly it's familiarity and being able to integrate with some of the other tools we use. So for example, we use Gerrit, we use Jenkins, we put code up on GitHub, we have robust internal support at Red Hat for Git repositories and for all those other things that are all part of a Git workflow. So it's just kind of a default position that we've never found a compelling reason to stray from. Personally, I like Mercurial better, but they're so similar that I actually don't have much problem switching between the two. Okay, so future features, you've mentioned a couple of things.
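The live-migration trick he describes maps onto GlusterFS's classic volume-file syntax roughly like this (an editor's reconstruction from memory, not a tested configuration; the paths and volume names are made up, and option names may vary between versions):

```
# Two local "bricks" on the same machine: the old file system and the new one
volume old-fs
  type storage/posix
  option directory /mnt/old
end-volume

volume new-fs
  type storage/posix
  option directory /mnt/new
end-volume

# Replicate across the pair; self-heal then copies old -> new
# while the mounted file system stays live and writable
volume migrate
  type cluster/replicate
  subvolumes old-fs new-fs
end-volume
```

The idea is that the replication translator treats the empty new file system as an out-of-date replica, so mounting the volume and walking the tree heals it from the old one without any downtime.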
I wanna know, what's the next exciting thing you're gonna be working on for Gluster? Well, it depends a lot on what kind of user you're talking about. I think there's currently a very large set of directions that we're going in as far as virtual machine storage, object storage, and big data storage. So we have a whole team of people working on ways to make the Hadoop-plus-Gluster combination really something more compelling. In other directions that we're going in, I think a lot of people in the enterprise space are really gonna like BitRot detection. A lot of people who are looking at this more as a cost play are gonna like the improved storage utilization of things like erasure coding instead of full-out replication, and deduplication and compression. Those are all things that are kind of in the works. Personally, the thing that I am most excited about, and I have to say I've been a little bit frustrated because I've wanted to work on this for years, so it's kind of why I came to Red Hat, in fact, and it's always kind of fallen down the priority list even though customer after customer and user after user has really gotten a light in their eyes when I mentioned it, is more robust wide-area, truly distributed replication. So we're talking about many sites. We're talking about multi-master, so simultaneously being able to write at multiple sites and have the conflicts resolved. We have some users who really, really, really want this, because they're currently using AFS at 50K-node scale. And they really would like to get away from it. But sadly, nothing has come along in, how many years is it, 25 years, that can really displace it. And I would really like to have this be the thing that can finally bring that same functionality that existed then into the modern era. That's funny, because we're trying to kill AFS over here, and we're running into so many places where it's like, oh, we don't have this permission thing, and we don't have this read-only volume thing.
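Multi-master replication of the kind he's describing needs some way to tell ordered updates apart from true write conflicts. One textbook approach, offered here only as an editor's illustration and not as anything Gluster has committed to, is vector clocks; the site names are hypothetical:

```python
# Minimal vector-clock sketch: each site stamps its writes, and
# incomparable stamps reveal a genuine multi-master conflict.

def merge(a, b):
    """Combine two stamps after a conflict is resolved."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in a.keys() | b.keys()}

def compare(a, b):
    """Order two stamps, or report that neither happened before the other."""
    keys = a.keys() | b.keys()
    le = all(a.get(k, 0) <= b.get(k, 0) for k in keys)
    ge = all(a.get(k, 0) >= b.get(k, 0) for k in keys)
    if le and ge:
        return "equal"
    if le:
        return "before"
    if ge:
        return "after"
    return "concurrent"   # a true conflict needing resolution

site_a = {"A": 2, "B": 1}   # site A saw its own 2nd write, B's 1st
site_b = {"A": 1, "B": 2}   # site B saw its own 2nd write, A's 1st
print(compare(site_a, site_b))   # neither version supersedes the other
```

When `compare` returns `"concurrent"`, the system knows it cannot silently pick a winner; that detect-then-resolve step is the hard part of writing at multiple sites simultaneously.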
And we don't, it's like, yeah, you're really right. Nothing really touches AFS in terms of a lot of the functionality it has, but man, AFS has some, yeah, okay, I won't go there. Well, it has some awesome things, but it also kind of shows the era in which it was written. It uses some algorithms and some RPC methods and other techniques that clearly have been improved upon since then. If you could take the core algorithms, the core ideas, and implement those on a modern infrastructure, I think you'd have something really killer. Well, and I guess I can say that I do have one more thing to add here, which is, anybody who sort of started to perk up and get interested when we started talking about that, please let us know. Make your voice heard, either on the community at gluster.org or at Red Hat. Let the people who actually plan the releases see that there is actually very significant interest in this particular thing. So I am being incredibly selfish here by asking people to help me push it, but what the heck? Why not? Okay, Jeff, thanks a lot for your time. And thank you. Jeff, this was great. Appreciate it, thanks. Talk to you guys later.