Hi there. Thanks for coming. This is Manila and Sahara: Crossing the Desert to the Big Data Oasis. I'm Ethan Gafford from Red Hat, a member of the Sahara core team, and I'll introduce the other presenters. Hi, I'm Jeff Applewhite. I'm a technical marketing engineer at NetApp, and I'm here to talk about Manila. Hi, I'm Malini Bhandaru from Intel, covering for Weiting Chen, and we hope you enjoy the session. Alright, great. To go over what we're talking about today very briefly: first we'll cover what Sahara and Manila are, respectively, for those of us who don't know. Then we'll go into the state of Sahara data processing prior to Liberty, and the state of it now; there's a lot more Manila. We'll then talk about three different approaches to Manila integration, two of which are available to you today, and one of which is still a work in progress, but it's pretty exciting. Then we'll conclude and do a brief Q&A. Okay, so OpenStack Sahara is the OpenStack data processing service. The fundamental problem we're trying to address is that big data clusters, whether Hadoop or Spark or Storm (it's not quite the Hadoop show anymore by any means, as it was a few years ago), are difficult to configure. They promise that you'll be able to use commodity hardware very inexpensively, but when those nodes fail they require expert maintenance to repair, which is reasonably expensive. The demand for data processing as a resource will naturally increase over time, so if you lock yourself into a fixed-size cluster, you're kind of doomed eventually. And finally, if you do have one bare metal cluster and it goes down, you can lose a lot of time, money, data, et cetera. So it's a lot harder to provision these clusters than it needs to be, and it hurts organizations. Our solution, of course, is to put all this in a cloud.
Allow you to use interfaces to create clusters, to scale your clusters, and to run elastic data processing jobs, whether you like to use Pig, Hive, Java MapReduce, Storm, Spark, whatever you like. We give you easy configuration options that just work, as well as very sophisticated, in-depth configuration management for those who know how to use it and want to. That's fundamentally what we do in OpenStack Sahara. We have two fundamental APIs: the cluster management API up top, and the elastic data processing (EDP) API down at the bottom. The Manila integration we're going to be talking about actually touches both of these at different points; we'll get to that in a bit. Our architecture is fundamentally very similar to most OpenStack as-a-service services. The service itself has a job manager and a cluster configuration manager, which works through our vendor plugins, whether those are native upstream (Vanilla) Hadoop, Spark, Hortonworks, Cloudera, et cetera, and reaches out to Heat to provision resources to build your cluster. Then we use those vendor plugins to configure the cluster and make it work together. In our elastic data processing, we also have data sources. Before now, we had Swift as a data source, and of course native HDFS within your cluster. And now we have Manila, which is really what we're here to talk about today. I'm going to turn it over to Jeff now for Manila, and much prettier pictures that were made by marketers instead of developers. Thanks, Ethan. So yeah, Manila, for those of you who are not familiar with it, is the file sharing service within OpenStack. It was actually initiated by NetApp, but we've invited a lot of people, even our competitors, to come and contribute code to it. It now has a pretty broad user base, many contributors, and many drivers that plug into it. So anything that's a file system can be exposed through Manila.
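As a concrete sketch of the cluster management API Ethan describes, a cluster-create request body might look roughly like this. The field names follow Sahara's REST API, but the specific values (plugin version, image and flavor IDs) are placeholders I've invented, not details from the talk.

```python
# Rough sketch of a Sahara cluster-create request body (values are placeholders).
cluster_request = {
    "name": "demo-cluster",
    "plugin_name": "vanilla",            # vendor plugin: vanilla, spark, cdh, hdp, ...
    "hadoop_version": "2.7.1",
    "default_image_id": "<glance-image-uuid>",
    "node_groups": [
        {"name": "master", "flavor_id": "<nova-flavor-id>", "count": 1,
         "node_processes": ["namenode", "resourcemanager"]},
        {"name": "worker", "flavor_id": "<nova-flavor-id>", "count": 3,
         "node_processes": ["datanode", "nodemanager"]},
    ],
}
print(sum(g["count"] for g in cluster_request["node_groups"]))  # total nodes → 4
```

Sahara hands this off to Heat to provision the instances, then the vendor plugin configures them into a working cluster.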
So let's just take an example, like a VDI deployment. We have virtual desktops, and those virtual desktops are virtual instances that need to share file systems. With Manila, you can actually do that. In this example here, you can see the R&D department has a share, and virtual machines one and seven are given access to it through Manila. The marketing department also has shares, and virtual machines six and eight are given access through Manila. So it's a way to provision file systems in a cloud and then share those out to instances. Typically the access is based on IP address, and I'll show you in a little bit the APIs that are used to control that. As I said, there are a number of people contributing. There's code in there now for an HDFS driver; NFS and CIFS are covered by the NetApp driver; GlusterFS is another option. If you go out and look at the documentation for Manila, there are a lot of drivers out there. Another thing I did want to correct: the recent user survey showed about 8% uptake in the Manila project, 5% in production. So this is what we now consider to be a mature, stable, sort of enterprise-1.0-ready service. This is just a high-level overview of the way you interact; it's very simple. For those of you who know Cinder, Manila has a very similar layout as far as the interaction. End users interact with the Manila API. Requests come in and go into the message queue. If there's persistent data, for instance metadata about shares, that needs to go into the database, it's stored there, obviously. Then the Manila scheduler at the high level is responsible for knowing about all the back-end shares available to it, knowing what those share capacities are, and then making provisioning decisions down to the back-ends. Your back-end could be NFS, it could be HDFS, it could be a number of different file systems, GlusterFS, GPFS as well.
I just want to highlight a couple of the APIs here, the ones most relevant to a provisioning scenario involving Manila and Sahara. The first would be the Manila create operation. Taking the NetApp driver as an example: when you do a manila create, calls are made to create a FlexVol on our NetApp controller. Then, when you do an access-allow, we modify access so that certain IP addresses can reach the share; it's essentially kind of like an exportfs command, for those of you who know Unix. So those are the two key APIs. Obviously, you can also delete a share, edit a share, list and show shares, and deny access. So how does this tie together? This is a high-level diagram of what you would see. In the top right corner, you see Sahara as the controller. Its responsibility is to spin up the nodes in the big data cluster so you can do your processing. Once a share is created, you can create templates. I actually have a technical report, and I'll give you a link at the end of the presentation, where all this is covered in great detail. But at a high level, you're creating a share, you're making the Sahara controller aware of it through a base template, and then when that template is spun up and these nodes you see here get created, the worker nodes and the master nodes, they come up fully enabled, with the share automatically mounted so that all those nodes can share the NFS data accessed by all of them. It's a great use case for where you want to pull files in or write files out. You can also do HDFS puts from those shared files into HDFS. And this should make life easier for big data operators. Okay. So what's the goal? We basically want Sahara and Manila to work together. We want our lives to get easier. Let's see how we do this.
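The two calls Jeff highlights can be sketched as REST payloads. This is illustrative only: the field names follow the Manila v2 API, but they can vary by API microversion (older releases used an os-allow_access action name), so check the API reference for your release.

```python
# Sketch of the two key Manila calls in this provisioning flow (illustrative).

# POST /v2/{tenant_id}/shares -- creates the share
# (on the NetApp driver, this creates a FlexVol on the controller)
create_share = {"share": {"share_proto": "NFS",
                          "size": 10,            # GiB
                          "name": "bigdata-share"}}

# POST /v2/{tenant_id}/shares/{share_id}/action -- grants access by IP,
# conceptually like an exportfs entry
allow_access = {"allow_access": {"access_type": "ip",
                                 "access_to": "10.0.0.7",
                                 "access_level": "rw"}}
```

Delete, list, show, and deny-access follow the same pattern against the same shares resource.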
Basically, we wanted to integrate with multiple different storage back-ends and protocols, and Sahara will make this possible with Manila. What do we have pre-Liberty, in Kilo? There are three patterns we support as of Kilo. In the first one, you have both your compute, the processing of the data, and the HDFS file system in the same VM; that's the first pattern we indicate over there. In the second, the data processing happens in one VM, and the data itself, your Hadoop file system, could be in another VM, or on bare metal: an external Hadoop file system. In the third pattern, you access the data through Swift: there's a driver for direct access from your Hadoop cluster to Swift, so you can process it in a continuous fashion. What do we have in Liberty? Patterns four and five; pattern six we'll get to in a little bit with Jeff. What does pattern four do? It uses the Manila file share service. Manila has an HDFS driver, and through your VM you can access the Hadoop file system through Manila. Pattern five is a little different. You have a local mount, and that mount can be backed by any data behind it. It could be an NFS driver, so you get access to NFS data in the back end; it could be GlusterFS. That's a very important contribution in Liberty. When you use a locally mounted share, there are many benefits; Ethan will cover that and Jeff will go further on this. So I'm going to talk a little more about pattern four here. Okay, so the Manila HDFS driver: how does it look? Think about a cluster in your cloud: multiple compute nodes, each of them maybe hosting multiple virtual machines, and they're all going to talk to your Manila file share service, which has the HDFS driver in it. That goes to a name node, which goes to different data nodes. All of that is accessible to your compute nodes out there.
So essentially you have an HDFS system that's external, and you're accessing it as a Hadoop file system. All that's good: it's out there, it's connected to your VMs, and Manila basically helps you share that data. The advantages: you have access to an external system, and you have centralized management of your HDFS file system. There are some limitations today. One of them is the way we access it. Normally in OpenStack you have Keystone, and each user, each tenant, has access to only their own resources. That part is not entirely translated into the HDFS file system through this driver, so it's a single access point, a single user. That's the limitation today, and it's something we'll have to address in the coming releases. How do you set this all up? You basically have to change manila.conf. You have to ensure that your user login name, access password, et cetera are all set correctly, and your service needs to use the HDFS user login. You do your usual mount, your usual share, all that kind of stuff, right through your Manila system, because all the driver capability exists. Once you've configured it, you need to restart your Manila service and you're good to go. Okay, so what did you get out of all this? You have the system set up, you have access to your data, and now it's external HDFS data. As I mentioned, you have a bit of a limitation on the user access side; there's more access than you really want, but it's a step in the right direction. And with that, I hand off to Jeff. Okay, so now we're going to talk about what was pattern five in Malini's part of the presentation: mounting NFS shares to your clusters within Sahara. This is new work within the Sahara project, primarily by the Red Hat team: myself, Trevor McKay, and Chad Roberts. We support a few use cases here. The feature is that we can now mount Manila NFS shares; it is NFS-only right now.
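For reference, the manila.conf setup Malini describes for the HDFS driver might look something like the fragment below. Treat the section and option names as an approximation from the driver's documentation rather than gospel; verify against your release's docs, and remember to restart the Manila services after editing, as she notes.

```ini
# Hypothetical manila.conf back-end section for the HDFS native driver.
# Option names are assumptions; check the Manila configuration reference.
[hdfs-backend]
share_backend_name = hdfs-backend
share_driver = manila.share.drivers.hdfs.hdfs_native.HDFSNativeShareDriver
hdfs_namenode_ip = 192.0.2.10      # your HDFS name node
hdfs_namenode_port = 9000
hdfs_ssh_ip = 192.0.2.10           # SSH access for the HDFS admin user
hdfs_ssh_name = hdfs               # service accesses HDFS as this single user
hdfs_ssh_pw = <password>           # or configure a private key instead
```

The single-user access here is exactly the multi-tenancy limitation discussed above: every share operation flows through this one HDFS login.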
The mechanism we created is very extensible to new file systems as we go: GlusterFS, et cetera. Right now, anything that exposes an NFS interface we can mount, and we'll hopefully be adding additional driver support in the future. You can mount those shares to every node in a cluster, or you can specify node group types. So if you only want shares mounted to your name nodes, your data nodes, your masters, your slaves, et cetera, you can do that as well and have that level of specificity. The API, which you can see to the right, is very simple. When you're creating a cluster or a node group template, you specify shares: a list that contains the ID of the share as registered in Manila. You can optionally provide a path at which to mount that share, and the access level at which you want your cluster to be able to access it, either read-write or read-only, as in Manila's API. It should be noted that this API is only required for users who aren't intending to use Sahara's EDP features. If you are using Sahara's EDP features, as we're about to go over, then when you create a job binary or a data source in Sahara, Sahara will automatically mount the appropriate share for you, and you don't have to worry about it. So the first use case we cover is binary data storage. This is the case where you've created your Spark job or your Hadoop job, et cetera; you have a jar, and you want to register it with Sahara so it can be run repeatedly within your cluster. These have a comparatively small size; they aren't really big data, so the initial location of these files is irrelevant to performance. Before Liberty, in Sahara, you could store these in Swift or in the Sahara internal DB. That last solution was never really a great idea; it was pretty much there from very early on, before users started adopting Swift in OpenStack very heavily.
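The shares field Ethan describes on a cluster or node group template can be sketched as follows. The share ID is whatever Manila reports for the share; the path and access level are the optional pieces.

```python
# The "shares" list on a Sahara cluster / node group template (illustrative).
shares = [{
    "id": "f0e1d2c3-share-uuid",   # the share's ID as registered in Manila
    "path": "/mnt/binaries",       # optional mount point on the cluster nodes
    "access_level": "ro",          # "rw" or "ro"; read-only suits a binary repo
}]
```

Attaching this to a node group template rather than the cluster template is what gives you the per-node-group specificity mentioned above.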
But now we have Manila NFS storage as well, which can really be a very great solution, because you can do your version control directly on your native FS. It provides very reliable long-term storage if you're using transient clusters. Clusters on different networks can all route to that share without being exposed to each other. And read-only access is really very appropriate for clusters in this case: there's no reason why a cluster should have write access to your binary repository; it should just consume it. So this is fundamentally what we're doing. You have a Manila share exposing an NFS driver; in this case, that NFS driver is actually backed by a GlusterFS file system. Your Sahara cluster at the top routes down through Manila to your NFS file system. We mount that, and it pulls the data back into your HDFS. Now you may say, okay, I'm losing data locality. That's true. Effectively, you're doing the equivalent of a hadoop fs -put from your NFS share in Manila back into your Sahara cluster. Nonetheless, having standard FS access to data is really nice in the job binary case. In the data source case, you can lose a little bit of performance, but it's still convenient in many cases, especially for small outputs. So fundamentally, this is the workflow for NFS binary storage and input data. You create a Manila NFS share and, just as Jeff was talking about, you place your binary file or your data on the share at an absolute path, wherever you like it. You can then create a Sahara job binary object with that path reference. Previously we had internal-db:// or swift://; now we have manila://, the share UUID, and the path. You utilize that job binary in your job template just as you have in the past. You can then create a Sahara data source using the precise same URL structure, pull it in, and run a job from the template using that data source and that binary.
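The manila:// URL form used for job binaries and data sources is simple enough to compose by hand; a tiny helper (my own, not part of Sahara) makes the structure explicit:

```python
def manila_url(share_id, path):
    """Compose the manila://<share-uuid></absolute/path> URL form that
    Sahara job binaries and data sources use (helper is illustrative)."""
    if not path.startswith("/"):
        raise ValueError("path on the share must be absolute")
    return "manila://" + share_id + path

print(manila_url("f0e1d2c3", "/jobs/wordcount.jar"))
# → manila://f0e1d2c3/jobs/wordcount.jar
```

The same URL works for both a job binary and a data source, which is what keeps the EDP workflow uniform.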
As I've noted, if you're using Sahara's EDP, you don't actually have to do anything new with the API to mount these shares, other than using that new URL form. Sahara will automatically check whether the share is mounted to your cluster; if it's not, it'll mount it for you and you're good to go, no additional work needed. In that case, it does use some defaults for permissions: it assumes you're going to need read-write, which you may not want, and it defaults the mount path to /mnt/{share UUID}. This slide is here primarily for reference for people reading the presentation afterwards, but we'll go into it a little bit; I'm not going to deal with all of it by any means. Fundamentally, this is how we make the sausage. When we check to make sure the shares are mounted, we first check that you have the proper utilities to actually mount a share. If they're not there, we download them. However, we want to make sure you can use your Sahara cluster in a non-networked capacity, so if they're already there, you don't need network access to reach out and download those utilities. We then call out to Manila to access-allow for each IP, which is what Jeff was talking about, and we mount. At that point, we just translate your paths from the Manila UUID form to a file system path and you're good to go. There are slightly different implementations for Oozie, Spark, and Storm; you can look at those if you like. I just want to shout out to the Manila team: as I'm saying this, doing that integration work was really very clean and painless. It's a nice API. Take a look at it if you haven't already. Seriously, it's kind of great. Okay, so some screenshots. This is creating a share: share name, share one; protocol, HDFS; et cetera. For these screenshots, we're looking at the external HDFS case Malini talked about. And then you've got your share, with a bunch of details on it.
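The path translation step at the end can be sketched like this. Sahara's actual implementation differs, but the default /mnt/{share-uuid} mount point is the behavior described above; the helper and its argument names are mine.

```python
def to_local_path(url, mount_points=None):
    """Translate manila://<uuid>/<path> to the file system path where the
    share is mounted on the cluster nodes. mount_points maps share UUIDs
    to custom mount paths; the default mount point is /mnt/<share-uuid>."""
    prefix = "manila://"
    if not url.startswith(prefix):
        raise ValueError("not a manila:// URL")
    share_id, _, rel = url[len(prefix):].partition("/")
    mount = (mount_points or {}).get(share_id, "/mnt/" + share_id)
    return mount + "/" + rel

print(to_local_path("manila://f0e1d2c3/jobs/wordcount.jar"))
# → /mnt/f0e1d2c3/jobs/wordcount.jar
```

With a custom path registered for the share, the same URL resolves under that mount instead of the /mnt default.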
At this point, you create a data source, with that dropdown for data source type: HDFS, Swift, NFS. Sadly, the UI patches for this particular feature are not in the official Liberty release. They're coming very swiftly; the Horizon patches haven't quite landed. So a lot of this is API-only at this point, not as great to screenshot as one might hope, but nonetheless very usable and tested. And now I'm going to turn it back over to Jeff to talk about the NetApp NFS connector and where we're going from here. Yeah, I'll switch to that in just a minute. One other use case I would highlight with Manila as an administrator: not only can you create a new share, you can actually manage an existing share. One of our use cases was an enterprise scenario where data might be sitting on an NFS server. You want to bring that in, make it available for processing, and then perhaps write data out to an NFS share, where you could then share it with your enterprise users. All that, as Ethan said, is now there in Liberty. I'll give you a link to how you can actually do that with the command line; and, as Ethan said, in Mitaka the GUI patches will land for all of the integrations I'm mentioning. The command line support is there today from Manila. So we've covered a number of use cases, and this is number six, the last use case, where instead of creating an NFS share and making it available through the cluster, we're going to access data directly through what we call the Hadoop NFS connector. This is a proposal that was actually made by Weiting Chen from Intel, so I also appreciate Intel's support for the HDFS driver as well as the future work they've committed to for integrating our connector. At a high level, the way this connector works is that it's a Java JAR file that you tell your Hadoop cluster about, and with that you can then reference NFS paths in your Hadoop job.
So you can run a TeraSort, as I'll get into in a minute, using NFS file paths. In the previous use case, use case five, there was a share available on all of your Hadoop nodes that you could then copy into HDFS. In this case, NFS is actually hiding behind HDFS, if that makes sense. The connector talks to files that are sitting on NFS and then exposes them up as HDFS, so the end user and the applications never know that there's actually NFS on the back end. It's a great use case for development: you can't have big data jobs without developers, and the goal here is to make it easy for developers to access their data and get started. If you've got data sets sitting on NFS, the connector is a great way to get started and expose them. So as I said, it's an NFS client, written in Java, completely open source; it's out there, and I'll give you a link in a little bit where you can get to it. It implements the Hadoop file system API, as I said, and doesn't require any changes to the Hadoop framework or to user programs. In this use case, we're actually eliminating copying data from NFS into HDFS: it just lives on NFS and is exposed as HDFS. A lot of the work in the plugin was actually done to optimize performance for NFS. There's a lot of pre-fetching for reads and a lot of optimizing of how writes get flushed out, because obviously HDFS is not a file system that was written with NFS in mind. So the connector does a lot of work to optimize performance, and on the read and write sides there are different modes: if you want to optimize for reads, you can do that; if you want to optimize for writes, you can also do that. And so this is the use case here. At a high level, you're going to have virtual machines, and they're going to be pre-configured.
We're working on actually building a virtual machine base image that has the NetApp Hadoop NFS connector built into it, similar to Amazon, where you'd have an AMI with the big data tools integrated into it. Those nodes would then be able to come up and speak NFS directly; they'd already know about that protocol at the HDFS level, within Hadoop. And then Manila comes into this where you could provision a share or, as I said, manage an existing share where data lives, and the NFS connector would talk down to that share. All this could be automated through APIs, and this is the proposal Weiting Chen is working on. There's a reference down there to the blueprint, as you can see; it's a little bit small, but it's there. So that really completes the picture as far as enabling NFS as a back-end for big data workloads. There are a couple of choices with the NFS connector; you can deploy it in two different modes. In the first mode, you're completely speaking NFS v3. The end user doesn't need to know anything about it, and it's just presented up as an HDFS file system. That's simpler: it's all NFS on the back-end. There's also a mixed mode with both HDFS and NFS. So you could have a case where, within HDFS, you do an "hdfs dfs -ls /" and there's a /nfs path residing within your HDFS namespace that's back-ended by the connector, if that makes sense. So you can have it mixed, where your HDFS data is on your data nodes as local data, and you supplement that with NFS-backed paths within HDFS. It's a bit confusing. One thing to keep in mind, as I said, is that this is open source, so you can go out on GitHub, get the connector, and play around with it. There are good directions on how to configure it, and this is really a benefit I'd like to highlight.
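Wiring the connector into a cluster is essentially a matter of putting the JAR on the Hadoop classpath and registering the nfs:// scheme in core-site.xml. The property names and class names below are assumptions drawn from the connector's GitHub documentation, so verify them against the repo before use.

```xml
<!-- Hypothetical core-site.xml entries for the Hadoop NFS connector. -->
<property>
  <name>fs.nfs.impl</name>
  <value>org.apache.hadoop.fs.nfs.NFSv3FileSystem</value>
</property>
<property>
  <name>fs.AbstractFileSystem.nfs.impl</name>
  <value>org.apache.hadoop.fs.nfs.NFSv3AbstractFilesystem</value>
</property>
```

Because this registers a new file system scheme rather than patching Hadoop, it's consistent with the claim above that no changes to the Hadoop framework or user programs are required.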
I mean, I'm a marketing engineer, so I get paid to talk about our value, and with NetApp, at least, you have the benefit of all the things you've come to know NetApp for: snapshots, the ability to do FlexClones. You can make zero-cost copies of your data. So in a development environment with big data, where you have a large data set but want to make a small change to it, you can FlexClone it, make the small change, and not really incur any additional space penalty. Also with Manila, SnapMirror is an option: you can filter on volumes that are mirrored to other locations for data recovery. And in Mitaka, with recovery groups, you'll actually be able to do full-blown disaster recovery, but not in Liberty. And this is the example I talked about, where you can actually run a Hadoop command, and in this example we're pointing it to an NFS path rather than an HDFS path. That's what the plug-in enables: it enables the cluster to understand the NFS semantics here. You give it nfs://, the host name or the IP address of the server where the NFS is, the port, which is 2049, and then the path to the file you want to serve as your input, and you can output it to /tera_out, which in this case, this being a mixed-mode example, is actually residing within HDFS. And similarly, the second command is just an example where both paths, the input and the output, reside on NFS. The references for the connector are at the bottom there, on our netapp.com site and also on github.com, where the connector lives. So I want to summarize a bit.
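The TeraSort invocation just described can be spelled out concretely. The helper below is mine, not part of the connector; it just assembles the mixed-mode command line with an nfs:// input and an HDFS output, and the examples jar name is a placeholder for whatever your distribution ships.

```python
def terasort_cmd(nfs_host, in_path, out_path, port=2049):
    """Assemble the TeraSort command with an NFS input URI, as described
    above (illustrative; jar name is a placeholder)."""
    in_uri = "nfs://%s:%d%s" % (nfs_host, port, in_path)
    return ["hadoop", "jar", "hadoop-mapreduce-examples.jar",
            "terasort", in_uri, out_path]

print(" ".join(terasort_cmd("192.0.2.20", "/tera_in", "/tera_out")))
# → hadoop jar hadoop-mapreduce-examples.jar terasort nfs://192.0.2.20:2049/tera_in /tera_out
```

For the all-NFS variant, the output path would simply be a second nfs:// URI instead of the bare HDFS path.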
Thinking back, this all really started with a groundswell of demand within the Sahara community, and a recognition that Manila could solve some of the problems big data administrators were experiencing: they needed easier ways to get files in and out of their clusters, and Swift worked great in some cases, but in other cases it may not be the best solution. So we enabled that with what Ethan was describing. Intel enabled the Manila HDFS driver so that you can use Manila and its APIs to provision and manage HDFS. And this is the link I told you about; if you scan that QR code, it'll take you straight to the technical report that goes into the details of use case five, right? Yeah. So the actual details of how you can either manage or create an NFS share, bring it in, have it automatically mounted on your Sahara cluster, and then access that data natively: that's there at the QR code, and it's TR-4464. And the last case is the one I just discussed, the NFS connector for Hadoop; that'll give you the link to find it on github.com. And I'll open it up for other closing comments from you all. How are we doing on time here? I think we're just about set for a good ten minutes of questions or so, which is good. Yeah. Just one closing comment, as one of the contributors to this joint effort: the three companies up here each really deserve to be up here in this effort. Seeing this sort of collaboration happen across really one feature set, during one cycle, by three fairly major companies has been great to see, and it's a real testament to the spirit of OpenStack that we're all trying to embody here. So that's all. Yep. And I'll also just point everybody to our documentation at netapp.github.io. This is our sort of one-stop shop at NetApp for any information around OpenStack.
So if you want to find out about Manila, the documentation, the Deployment and Operations Guide, is a great reference there. If you're new to deploying Manila and you're looking at it as a proof of concept, you can find that information there. There are blogs as well; there's one about the new technical report I mentioned. And do stop by our booth and get your passport clipped, and you could get a compute stick. So please stop by our booth. I want one. You can have one. So, questions. Does anybody have anything? Yes. Also, I think we have Ben Swartz in the room; I'll turn things over to him if you have questions that go beyond my knowledge of Manila. So we have good support for, the driver for, GlusterFS, right? And I think we have support for RDDs in Spark. So are there any drivers for Manila, or maybe in Sahara, for the Spark API, for RDDs? Sure. So Spark itself is effectively a plugin in Sahara, right? We have the Hadoop plugins, and then we have our Spark plugin. And that binary is a little bit orthogonal to the data source. You can use Spark with any number of data sources, and you can use Hadoop with any number of data sources. Spark we have wired up so you can use the NFS or external HDFS data sources, just as we've done with Hadoop. So how do you take real-time data? Like in Spark, we take real-time data for processing. Will your system take real-time data? Because nowadays we want the real data to be processed, right? So, in this case, with the NFS data, you'd give Spark a file system reference. And if your application is constantly capable of reaching out to that file system reference and pulling data in from it, then you could perfectly well use this NFS driver as it is. But suppose RDDs.
I don't know that we've done a proper reference architecture and testing on RDDs. Okay. So I don't want to definitively say that you can and that it's perfect, but it should be theoretically possible based on my understanding, yes. And one more thing: we have NFS support, right? Do we have NFS version 4.1 support also, like for SELinux and Kerberos, from the security point of view? I'm going to take that one. So with the connector that I spoke about in scenario six, it's NFS v3 only. But with Manila, a share on all nodes, that would be any protocol the back end supports, basically. Yeah. So yes, 4.1 or pNFS. It's supported, right? Yes. Okay. As a share; but the connector is limited to v3. So in the case where HDFS is accessing files sitting on an NFS share through our connector, that's limited to v3, but Manila itself can provision v3, 4, or 4.1. And we've also seen the dashboard, right? From the dashboard, we can pull the resources. But in my Hadoop cluster, if I front HDFS with Kerberos, will the API support that? Do we have some plugin to first go through Kerberos and then authenticate? I'm not an expert on the HDFS plugin, but my understanding is that it's based on fairly basic HDFS access through the HDFS user, and there are some limitations around security control with that driver. You want to add any color to that, Ethan? Yeah. I mean, you're touching on the point that the reference architecture Malini talked about does effectively have an HDFS file system with sort of Linux-level permissions on it: you go into your share and create the appropriate folders and such.
As such, and because some of those permissions need to be set in manila.conf, I don't know that that solution is really public cloud and multi-tenant ready; I'd only say that it's appropriate, security-wise, in a private cloud context. The NFS share mounting through Manila, while you lose some data locality, you can also use fine in a multi-tenant public cloud context: you have Keystone permissions on both sides of the equation. One last small question. In the last diagram, the architecture diagram, for distributed data processing we have YARN support, right, for the resource manager? Do we have support for Mesos also? If I change YARN to Mesos as the resource manager, is Mesos supported too, in Sahara or Manila? I think our connector has support for that. Is that what you're asking about, the connector? Yeah, I'm asking about the connector, yes. Yeah, but that would be next cycle, development work at least. Okay. So let's take other questions. Who's next? Him in the back? Okay, then we'll get to you. Are there plans to support locality, for example, for GlusterFS? So the NFS connector is great for data locality. At this point, with the share mounting mechanism, if you mount a GlusterFS share and then put a GlusterFS driver into your Hadoop cluster, so that it knows it can talk to GlusterFS and use data locality that way, that's certainly feasible in the future. It's not currently implemented, because we haven't specifically created a GlusterFS share mounter.
But there's nothing that I can see that forbids it, and thus it should be done. Yeah, the locality aspect is the relevant part, I guess: the IP mapping of which machines run on which hosts. Yeah, absolutely. And the work that Jeff's talking about for next cycle will also allow that, in native NFS and anything that can be expressed through an NFS interface.

And one thing I didn't mention is that the connector for Hadoop actually allows you to have multiple mounts. So you can have a large cluster of NFS servers and connections to multiple mounts on the back end. You're not limited to a single mount, or any single piece of hardware, by any stretch. So if you want to scale up and have multiple nodes available to serve that NFS data, that is an option for the connector. Thank you.

My question is, which component takes care of replication: Hadoop itself, Manila, or the native appliance? I'm sorry, replication? Oh, replication. Okay. Replication is something that's built in as a feature within Manila: you can create shares and enable replication basically in one operation. And also, as I said, disaster recovery. Ben, keep me honest here: was that in Liberty, or a future feature? Future. Okay. So disaster recovery, Manila DR, is a future release, perhaps Mitaka.

If you have an external HDFS that's provisioned by Manila, then that Manila machine set will take care of replication. If you provision HDFS within Sahara, then the VM set provisioned by Sahara will take care of it. So wherever your NameNode is, from there your replication will flow. Yep.

Is that another question? Yeah, I have a question. You just mentioned that we have three options: one is HDFS, another is a general NFS share, and another is the NetApp connector file system.
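The multiple-mounts point above can be illustrated with plain NFSv3 mounts: each Hadoop node can mount exports from several NFS servers at once, so no single back-end device is a bottleneck. The hostnames, export paths, and mount points below are hypothetical:

```shell
# On each Hadoop node, mount exports from two different NFS servers
# (NFSv3, as the connector requires), so data can be served from
# more than one back-end device
sudo mount -t nfs -o vers=3,proto=tcp nfs1.example.com:/export/data1 /mnt/data1
sudo mount -t nfs -o vers=3,proto=tcp nfs2.example.com:/export/data2 /mnt/data2
```

Scaling out then amounts to adding more servers and more mounts rather than growing one export.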
So I'm not sure: is there any plan to support a general NFS share, or do we still need to do more in Sahara to support that? A general share? So yeah, the NFS connector will work with any NFSv3-compliant NFS server. It doesn't have to be a NetApp; we obviously like it if it's a NetApp, but it could be anything. Does that answer your question? As far as the shares go, it comes down to the Manila drivers that plug into Manila, and as I said, there are many drivers out there, so it's really down to the vendor implementation. I can speak specifically about ours, but it really does come down to the driver.

Any other last questions? I think we're beginning to run out of time, but we have a little bit.

Just along the lines of replication, have you made any sort of accounting for the amplification situation? If you've got HDFS replicating and then the back-end file system replicating, does the Manila side try to address that in any way?

Sure. If you're using scenario five, NFS is going to be replicating its data, and then you're effectively pulling it into the local HDFS on your actual Hadoop cluster, the one provisioned by Sahara. That will perform replication for the length of the job on any intermediate data sets and then export the result, at which point you can certainly delete it if you're done with it. On the NFS side? Yes. As I said, it would be a driver-specific implementation. With the NetApp driver, if you enable replication, we use our SnapMirror technology to create a remote FlexVol that mirrors the source. So with NetApp, at least, it's a FlexVol that gets created and mirrored to a target, and any changes can be dynamically updated to that target based on how you set it up. There's no current configuration that would produce a sort of multiplicative replication, which is good. Yeah.
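As the speakers note, Manila share replication was still future work at the time of this talk; when it later landed (as an experimental feature in the Mitaka cycle), the workflow looked roughly like the sketch below. The share name is hypothetical, and it assumes a replication-capable share type and back end:

```shell
# Create a replica of an existing share on a replication-capable back end
manila share-replica-create hadoop-share

# List the replicas for the share and check their replica state
manila share-replica-list --share-id hadoop-share

# For disaster recovery: promote a replica to be the primary instance
REPLICA_ID=...   # id from the listing above
manila share-replica-promote $REPLICA_ID
```

This matches the "create shares and enable replication basically in one operation" framing above: the replica is managed by Manila, independent of any HDFS-level replication inside the cluster.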
HDFS replication would be outside of Manila completely. But even though the FlexVol is being replicated, what about the mount points? Say I lose one. That's a good question. No, the mount point would not carry over automatically. Well, in a case where you create a Manila share that is replicated and you need it read-only, potentially you could mount the target path as well; I'd have to think about that. It would be a manual step. Yeah, you could mount the mirrored target as well, as a read-only copy; it cannot be changed, obviously, because it's a mirrored copy of the base, but perhaps for scaling up, that might be an option.

I'm sorry, I didn't quite get that. Is there a question? Another one? Well, thank you all for coming. It's been great; have a great conference. Thank you.