Hello, everyone. Welcome. I'm Carlos, a software engineer at Red Hat. I work on the storage team at Red Hat, more specifically on the Manila team, and today I'm joined by my fellow Red Hatter, Goutham Pacha Ravi. We both work on the same team and we maintain OpenStack Manila, as I said. Today we want to talk about natively scalable CephFS gateways with OpenStack Manila, and a couple of things we have been implementing over the course of the past months and releases in OpenStack. This is the agenda for the talk. First, about OpenStack Manila. At this point you might already know it, since we've had a couple of presentations about Manila over the course of the day. It can offer you self-service, POSIX-compliant shared file systems, which are elastic and secure. Elastic because you can grow and shrink them at any time, and it should be instantaneous; secure because you can control access: who can write data, who can only read, and all sorts of ACL rules. In addition, Manila also provides levels of multi-tenancy and service guarantees, so cloud administrators are able to control quotas, share types, and user access. It supports a variety of protocols as well, like NFS, CIFS, and CephFS. Besides all of the functionality you would expect from NAS shared file systems, Manila also gives users lots of self-service features and many ways to interact with their shared file systems. So it's pretty complete, and we have a lot of use cases and some very large deployments using Manila. Here is an overview of the results of surveys we conducted with both the upstream OpenStack and Ceph communities. There are some large deployments using Manila with CephFS on OpenStack. The last survey was conducted in 2022 and we got about 84 responses that reported usage of OpenStack Manila.
Almost 42% of those were already using Manila in production, and about 22% of those deployments were still in a testing phase, planning to move to production. CephFS, consumed either directly or through NFS, was the preferred solution in Manila, as you can see: that blue slice in the right-hand side graph is the usage of CephFS. The second survey got about 68 responses, and CephFS was pretty much dominant in the usage of Manila. So let's go over how we do native CephFS and CephFS through NFS in Manila. For native CephFS, it's pretty much CephFS subvolumes served as Manila shares. The way people usually do the deployment is with a storage provider network that extends the Ceph public network to the VMs. The VMs would be connected to the storage network and also to the public network, and with that the VMs would be able to mount the shares and the clients would be able to consume the data. The clients could be VMs, could be containers, you name it. The native CephFS driver works well in an environment with trusted end users on a private cloud. For CephFS via NFS, CephFS can also be served behind an NFS gateway, in this case NFS-Ganesha, a user-space NFS server. It's basically the same concept: we are letting the users create subvolumes in CephFS, and letting them mount and consume them. And how do we provide high availability and everything like that for the NFS servers? Well, this is how we do it: we would have Pacemaker managing the NFS-Ganesha instances. In this case, for example, we have three Ganesha instances, and Pacemaker would own a virtual IP, and it would work in an active-passive mode.
So there would be one active instance, and if it went down, Pacemaker would elect the next one, try to make that one active, and so on. The clients would keep consuming through the virtual IP; failing that IP over is something Pacemaker is already good at. So now, over to Goutham to talk about some of the changes. Thank you. As you can tell already, there are quite a lot of limitations here. If you're familiar with using CephFS, you expect a certain degree of performance from it, but layering an NFS gateway in front of CephFS means you're going to suffer a loss of performance. And besides the loss of performance, there is also the question of scale: you would be able to serve a lot more clients if you were accessing CephFS directly, as opposed to putting them all behind maybe one or a few of these NFS gateways. The performance hit is a harder problem to solve, but the scale hit was a more achievable problem, and we were able to work on it with the Ceph community as a whole, really: the CephFS folks, the folks who work on cephadm, and also the NFS-Ganesha community. So there's work that happened in the last couple of cycles that I'd like to show you. In this model, and OpenStack is hidden in this picture, those are OpenStack users consuming NFS shares through Pacemaker, which is controlling the uptime and the availability of the NFS server, and a bunch of these users are all going through the same NFS server. That's how you would picture this. And to orchestrate the NFS exports, all of this is done automatically with the help of the CephFS driver in Manila. The driver is capable of creating the access rules that are required. So if you are an end user, you tell Manila which client is going to end up mounting your CephFS volume, what its IP address is, et cetera.
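From the end-user side, that flow is just a couple of CLI calls. A hedged sketch follows; the share name, size, share type name, and client IP below are all invented for illustration:

```shell
# Create a 10 GiB NFS share backed by CephFS (names/values are examples only)
manila create NFS 10 --name my-share --share-type cephfsnfstype

# Allow a specific client, identified by IP, to mount it read/write
manila access-allow my-share ip 192.168.24.15 --access-level rw

# Grab the export location to mount on the client
manila share-export-location-list my-share
```

The driver translates that access rule into the NFS export machinery described next.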
And we, as in the Manila driver, would go create an export record for you, in NFS parlance. And there is stuff happening in the background so that the NFS server can recover from any sort of outage, by persisting these exports. We aren't really using any other storage for that; I mean, there is a storage system behind all of this already, so why not just use Ceph RADOS objects to persist those exports? All of that is happening in the background and you don't need to worry about it. That's how the Ganesha recovery happens: every time Pacemaker detects a failure in one of these nodes, it's able to bring up another NFS server elsewhere, and that server consumes those exports because they are stored on RADOS. And all of this is done by the Manila driver, hidden away from the users; end users have a familiar interface: I have my share, I have my IP address, give me access. That's it. As I told you, one of the issues is that this doesn't scale very well. The other is that Manila and Ganesha need to be in sync, and in order to send an export configuration to Ganesha, we would use Ganesha's D-Bus API. That kind of started locking us into an architecture of sorts: to run this in the best way possible, you would want the NFS server and the Manila share manager process to coexist so that they could share a D-Bus socket, because if you try to do anything else, like D-Bus over SSH or something, things start getting weird pretty soon and stop being as reliable. And I'm an engineer, I'm used to telling you that this is a bad solution, but it actually works very well in a lot of production scenarios, by the way; it is being used at scale as well. But I would like for the scale to be better, as other users would, and that's why we were working on the next approach, which is clustering Ganesha. NFS-Ganesha is pretty old at this point.
And it's been widely used, and a lot of the storage solutions that interface with NFS-Ganesha have been using it in a clustered manner. Ceph wasn't; or rather, there were users in the wild, you probably noticed them on the Ceph users list, who were trying to do this, but there weren't a lot of first-party things going on, not a lot of community-supported architectures. And that's what's changing. I will talk about cephadm, because that's what we've been using mostly in the OpenStack deployments that Red Hat supports, but a very equivalent thing is happening with ODF as well, and you may be aware of it. So what's happening here is that cephadm today is able to create an NFS service for you, and it is able to scale that NFS service: you can have active-active instances living directly as Ceph daemons would on your Ceph cluster. And it's able to put these active-active instances of the NFS server behind a well-known ingress service, called the Ceph ingress service. The Ceph ingress service is a combination of HAProxy and keepalived. So if you're scaling your NFS service, like this diagram shows, to three servers that are active-active-active, all of these servers are presumably on different Ceph nodes, and there is an HAProxy and a keepalived running in front of them. HAProxy is managing the load-balancing aspect, so that you can add more scale for these NFS requests, and keepalived is moving the VIP as it detects any failures. One other aspect of this: if you're used to Manila, you're used to client restrictions through IP addresses. As I told you, a client would come into Manila and say, here's my share, give me access for this particular VM, at this particular IP address. Now, off the bat, this wouldn't work, because we are terminating all the connections at this ingress service, and what Ganesha is seeing is traffic coming from the ingress.
So the NFS-Ganesha folks have now implemented support for a proxy protocol, which is kind of confusing to say: the HAProxy PROXY protocol. What it's essentially doing is passing the source IP address in a header that Ganesha can parse, so Ganesha understands that the traffic is coming through the ingress but is meant for a client with a different IP address. And so your client restrictions, saying this is the one that should have access, actually work, because you're looking at the source IP address in the header rather than at the address the ingress traffic arrives from. That's a huge number of small things that had to change to make that happen. And, like with any other Ceph daemon, you're able to scale this up and down; and as you're scaling it up and down, we are using primitives within NFS-Ganesha and within the NFS protocol, as well as mechanisms for persisting the recovery pieces onto RADOS, so that from Ceph objects we're able to recover the state, the NFS locks, and so on. I could go into way more detail on how these Ganesha servers recover from failures and such, but maybe that's for another talk. For now, though, all of this innovation happened for the Ceph Reef release, which is yet to be released. It's not complete yet; we're still testing this stuff and we know there are some corner cases and some bugs, but we are targeting for this to be available in the Ceph Reef release. As far as NFS-Ganesha goes, it's available with NFS-Ganesha version 5, which shipped a couple of months ago, and it's meant to be used with the Ceph Reef release to be able to use client restrictions with the HAProxy PROXY protocol enabled and so on. So if you're looking at this today and you get your hands on Reef, you should be able to do a `ceph nfs cluster create` like this, and specify where you want your NFS cluster to be created.
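Concretely, that command looks something like the following. The cluster name, hostnames, and virtual IP here are invented; the flags are the ones the Reef-era `ceph nfs` module advertises:

```shell
# Create a 3-instance, active-active NFS cluster on three hosts
# (placement syntax: "<count> <host1>,<host2>,<host3>")
ceph nfs cluster create mycluster "3 host-1,host-2,host-3" \
    --ingress --virtual-ip 192.168.24.100/24 \
    --ingress-mode haproxy-protocol

# Inspect what got deployed
ceph nfs cluster info mycluster
```

The `--ingress-mode haproxy-protocol` piece is what turns on the PROXY protocol handling described above.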
Provide a virtual IP address that you want to use as the front end, and specify the ingress mode to be haproxy-protocol. You can look at the help for this command; cephadm is capable of building the moon for you, so there are a lot more options in there. But this is how we would have our users set up their NFS solution. What this does is create this sort of a spec, and you can see the spec has some things about placement: where to put things, how many of these services to run, what the front end is, what the back end is, and all of that. Usually you don't need to worry about it unless you're debugging or troubleshooting something. So, time for a demo, if I can get it going. Here we're just logging into the Ceph cluster and printing out the version. As I was telling you, this is the latest, greatest thing: we built this demo yesterday, and it's using the release candidate that Josh was telling you shipped last week, I think. [Some trouble getting the recording on screen here; sorry about that.] I will try to make the slides available so you can actually click through this and probably even try it yourself. So that's what my `ceph orch ls` gives me; I can read it out for you. It has an `ingress.nfs.cephfs` service, which has a front-end IP and a back-end section as well. It's fronting with that IP address on port 2049, so it's pretending to be the NFS server, but it's actually the HAProxy service. And there's an NFS service running on that host, serving NFS on a different port. Which port doesn't matter, because the point is that the client doesn't need to use a special port; they can keep what they're familiar with.
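For reference, the generated spec for that ingress service looks roughly like this. The service id, count, and addresses are invented, and the exact field set may differ by release, but the shape follows cephadm's ingress spec: HAProxy fronting on 2049, with Ganesha on a separate back-end port:

```yaml
service_type: ingress
service_id: nfs.cephfs
placement:
  count: 1
spec:
  backend_service: nfs.cephfs    # the NFS service this ingress fronts
  frontend_port: 2049            # what clients connect to (standard NFS port)
  monitor_port: 9049             # HAProxy status page
  virtual_ip: 192.168.24.100/24  # the VIP that keepalived moves on failure
  enable_haproxy_protocol: true  # pass real client IPs through to Ganesha
```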
Their clients are configured to use 2049, so they can continue to use NFS the standard way. I'm also going to show you part of the OpenStack side, because in an earlier presentation I said TripleO is going away, or something like that, and that usually causes panic. The Red Hat installer is changing, and it's going to start deploying the OpenStack control plane on OpenShift and Kubernetes. That's what this looks like. I'm running the Manila service with that new installer, and I've configured two back ends, one that serves CephFS over NFS and another that serves native CephFS, both talking to the same Ceph cluster; a very common use case with Manila users. And that's what the spec file looks like. Looking at the back-end configuration, that's just the container image of the service that I'm using. The share back-end information will look very familiar to folks that have configured CephFS with Manila before. One useful thing in there is that it's now looking at a CephFS NFS cluster ID. I'm sorry, I confusingly named the NFS cluster "cephfs"; everything here is just named cephfs. You can name it what you want, but this is how the back-end driver identifies the NFS cluster within the Ceph cluster, so it knows which NFS cluster you're talking about. And on my other back end, I'm consuming native CephFS; a simpler configuration, just serving the native CephFS protocol. So two Manila share manager services will eventually come up; that's the idea. And this is the demo environment a little bit: we're distributing the Ceph secrets to Manila so that it's able to access the Ceph cluster, and we're also going to be using those two bare-metal nodes as clients at the end. Guess I didn't speed up this part. We're just showing you what you would usually see configured within manila-share to access the Ceph cluster: you need a keyring, you need the ceph.conf. Those are the Manila services that have come up.
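The back-end sections of that configuration look roughly like this. The back-end names, cluster id, and auth id are invented, but the option names are the ones the Manila CephFS driver uses:

```ini
[cephfsnfs]
share_backend_name = cephfsnfs
share_driver = manila.share.drivers.cephfs.driver.CephFSDriver
cephfs_protocol_helper_type = NFS
# Points the driver at the cephadm-managed NFS cluster by its cluster id
cephfs_nfs_cluster_id = cephfs
cephfs_conf_path = /etc/ceph/ceph.conf
cephfs_auth_id = manila

[cephfsnative]
share_backend_name = cephfsnative
share_driver = manila.share.drivers.cephfs.driver.CephFSDriver
# Native CephFS protocol, no NFS gateway in between
cephfs_protocol_helper_type = CEPHFS
cephfs_conf_path = /etc/ceph/ceph.conf
cephfs_auth_id = manila
```

Both back ends point at the same Ceph cluster; only the protocol helper and the NFS cluster id differ.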
And that's the pools information that Manila knows about. I'm going to create a share type to be able to use CephFS; I also threw in a NetApp one in there, since people do use them together. But otherwise I'm just creating a share, a regular process. What that's going to do is create a CephFS subvolume and export it over NFS, and you'll be able to grab the export location. The IP address in there should look familiar, because that's the front-end IP of the HAProxy service that we deployed, the Ceph ingress service. So we give access to the VM that we know, and we go ahead and try to mount it and write some data, regular stuff. And let's just go ahead and list the files; everything looks great. So this is just the regular workflow, with the client restrictions working as expected: even though you're talking to the HAProxy service, you're actually getting NFS out of it. And if you try to mount this on another VM which has no access, that's the user experience you're going to get: you don't have access to it; we don't know what that file or directory is. So let's go ahead and give access, and then we're able to mount it there as well. This is the part that would not work without the HAProxy PROXY protocol implementation that the NFS-Ganesha folks worked on. So we believe for sure that these primitives are way more extensible and are usable even outside of the OpenStack Manila context. This is a big change for Ceph and for NFS-Ganesha, and, as Federico was telling you, IBM is looking to productize an NFS piece, so I think this is one of the foundational pieces for that. I'm glad we were able to use and reuse these things. Again, all credit to the Ceph team and the NFS-Ganesha team for having figured all this out. All right, that's it for my demo; let me get back to my slides if I can.
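To make that last point concrete, here is a minimal Python sketch of the PROXY protocol v1 header that HAProxy prepends and that Ganesha now parses. Ganesha's real implementation is in C inside the server; the addresses below are invented:

```python
def parse_proxy_v1(header: bytes):
    """Return (src_ip, dst_ip, src_port, dst_port) from a PROXY v1 line."""
    line = header.decode("ascii").rstrip("\r\n")
    parts = line.split(" ")
    # A v1 header looks like: PROXY TCP4 <src> <dst> <sport> <dport>\r\n
    if len(parts) != 6 or parts[0] != "PROXY" or parts[1] not in ("TCP4", "TCP6"):
        raise ValueError("not a PROXY protocol v1 header")
    return parts[2], parts[3], int(parts[4]), int(parts[5])

# The ingress terminates the TCP connection, but the header still names
# the original NFS client, so IP-based access rules can still be enforced:
src_ip, dst_ip, src_port, dst_port = parse_proxy_v1(
    b"PROXY TCP4 192.168.24.15 192.168.24.100 49152 2049\r\n"
)
print(src_ip)  # the real client address, not the ingress address
```

The key point is that Ganesha checks the source address carried in this header, not the address the connection physically arrives from.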
So you'll see more of this, and some documentation written around it, pretty soon; everything is bleeding edge. So what do we plan to do to migrate people? Because we're also in the business of supporting people that have been using a solution for quite a while, and pretty successfully. One of the problems we have, and it would be an amazing ask for the Ceph community, is that there is no easy way to say: cephadm, adopt my NFS server. That doesn't exist. And so we have to invent different strategies for moving our NFS clients over. One thing we are thinking of with the Red Hat OpenStack product is to allow for an extended decommissioning period for the existing NFS server. This would allow a cloud service provider, perhaps, or an operator, to tell their clients that this NFS server will eventually be shut down: in a week, 10 days, even six months, whatever logic you apply. By that time, please start scaling down your workloads and changing your mounts. The data you're consuming is not getting deleted; it's just going to come from a different place, so start consuming it from the new place. A standard deprecation approach that is probably used in most data centers for stuff like this. So how would we do that? Well, Manila will handle that part. When the Manila driver starts up and you tell it, hey, there's a new Ceph NFS cluster, you can also say: this is where my old cluster was. And Manila will be able to present both export locations to its users and tell people which one is the preferred path. So if somebody is listing their export locations through Manila, they'll know what to mount and what not to, what's going away, basically. It's sort of helping that UX alongside the operator's communication.
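That dual-export-location idea can be sketched like this. The dict shape loosely mirrors what Manila's export location listing returns, and the paths and flags here are invented:

```python
def preferred_export_path(export_locations):
    """Pick the path flagged preferred, falling back to the first entry."""
    for loc in export_locations:
        if loc.get("preferred"):
            return loc["path"]
    return export_locations[0]["path"] if export_locations else None

locations = [
    # Old Pacemaker-managed VIP, being decommissioned
    {"path": "172.17.5.47:/volumes/_nogroup/share-42", "preferred": False},
    # New cephadm ingress VIP, what clients should move to
    {"path": "192.168.24.100:/volumes/_nogroup/share-42", "preferred": True},
]
print(preferred_export_path(locations))
```

A client listing their share during the decommissioning window would see both paths, with the new one flagged as preferred.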
Another thing we are working on, and we're still actively testing this so it may turn out to be possible, is to live-migrate the NFS consumers with minimal disruption, if any. What that would involve is clustering these NFS services together: the NFS service that originated with cephadm and the pre-existing, Pacemaker-managed NFS service on your OpenStack controller nodes. We'd cluster them, then start tearing down the old NFS servers and do a VIP migration. This is stuff that's still in flux, and one of the reasons we think this may not work in all situations is that you may not have the network flexibility for this sort of VIP migration. What we're essentially doing is moving the NFS service from OpenStack to Ceph, and we are not sure your pre-existing NFS network is that easily accessible to the Ceph cluster, because you may have architected your network in a different way, and so on. You will still have access to the previous solution, but this is something we are interested in attempting, because we have customers that are never going to unmount and remount their NFS shares just because we're trying to move them off of the old NFS server. So we'll see if this works; maybe we'll update you at a different opportunity. But if you're wondering what that solution would look like, that's what this slide is showing. One other thing that I should have mentioned: when you're moving off these NFS servers, this is also stuff that we did in the Manila driver, so you don't need to do any of this moving of exports yourself. Just tell Manila where the new NFS cluster is, and the CephFS driver will take care of replaying all the exports to the new place, mirroring all of that, and things like that. So that's the UX we expect to work, and we'll be testing it thoroughly. Awesome.
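The export-replay behavior can be illustrated with a toy sketch. The real logic lives in the Manila CephFS driver; the dicts here just stand in for the two clusters' export tables, and all the records are invented:

```python
def replay_exports(old_exports, new_exports):
    """Mirror every export record from the old cluster onto the new one,
    leaving any records that already exist on the new cluster alone."""
    for share_id, rules in old_exports.items():
        new_exports.setdefault(share_id, rules)
    return new_exports

# Invented export records: share id -> allowed clients
old_cluster = {
    "share-42": [{"client": "192.168.24.15", "access": "rw"}],
    "share-43": [{"client": "192.168.24.16", "access": "ro"}],
}
new_cluster = {}
replay_exports(old_cluster, new_cluster)
print(sorted(new_cluster))
```

The operator only supplies the old cluster's location; every share's access rules then show up on the new cluster without clients having to re-request access.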
That's actually all I had, and I think we have a couple of minutes for questions. No questions? Amazing. Oh, sorry, yeah. [Question about performance and scale testing.] I think the Ceph community has certainly done more performance and scale tests than we have, because placing Manila in the picture is not that interesting, right? Manila is mostly doing the management operations, so it's mostly Ceph, and depending on where you're running it, how you're running it, how you're tuning it, that's what might impact the performance. What we've done a little bit of is comparing this against the usual canned Red Hat OSP deployment, to try to stress-test the control plane and probe the performance a little bit that way, or to compare and see whether we're seeing the same patterns that the Ceph folks probably are: how native CephFS compares with NFS, how that compares across bare-metal versus container versus VM workloads, and so on, and to find out if there are improvements we can suggest in the network path and things like that. But no, we've not compared Ceph with vendor storage or anything like that in that regard. Yeah, no problem. Is fencing in use? Yeah, sure. Whether fencing is in use in our current solution is one thing I can speak to: we're using Pacemaker and we do test fencing. So if you wanted to do a planned failover, decommission a node, et cetera, you could actually test it by initiating a fencing operation; that's how we are QA-ing the product itself. But yes, that's another Pacemaker primitive that we are using: in case a node failure is detected, the node is actually fenced off and not allowed to rejoin the cluster automatically, and stuff like that. All right. Thanks, everyone. Thank you.