You want to do... I don't have any notes in there, though. Oh, you don't? Okay.

My name is Joe Arnold. My name is John Dickinson. I work at SwiftStack and I'm the project technical lead for OpenStack Swift. And I'm one of the co-founders of SwiftStack, and we're here to talk about a new project that we're going to be introducing in OpenStack Swift, called ProxyFS. What ProxyFS is, is integrated file system access natively built into OpenStack Swift. The design goals are to support both object and file access for the same data, for the same workflow. This is, of course, going to be part of the OpenStack ecosystem; it gets open sourced. We're going to be diving into the details of what it looks like and talking about some of the use cases that we're hearing about and looking to support.

So the neat thing about Swift, and the reason that this is really great, is that we're building on top of Swift itself, which has a thriving and growing ecosystem and community. Over the last several years, well, since before OpenStack started, Swift has been deployed in production at massive scale all over the world. We've seen dozens of companies join in, we've seen hundreds of contributors offer code and review into the project itself, and all of the graphs are going up and to the right. Over the last few years, some of the things we've been able to be a part of are building out things like global clusters, storage policies, and erasure codes. We're currently working on encryption within the project. And we see something else happening in the ecosystem that's at least as big a deal, if not bigger. Recently there's S3 API support, which has been greatly enhanced over the past few months, and this is just another extension of things that we can integrate directly into the open source project to meet more workloads.

And so, just to give some perspective of
where we're coming from at SwiftStack: we are a commercial object storage provider, and we're the leading contributor to OpenStack Swift; John's the project technical lead. What we ship with our product is the core Swift project, unmodified. What we add, then, is things underneath Swift to do hardware monitoring and manage the underlying hardware infrastructure. Above that, we do monitoring of how much data is being used, so you can do chargeback, and we integrate into authentication environments. And then we've created a control plane, which we call the SwiftStack Controller, that makes it easier for people to deploy and operate multiple regions and multiple zones of their infrastructure. It's really about providing operations at scale and completing what people need to run Swift.

But we've had a number of conversations with customers about what they need in order to do more with Swift. They want everything John talked about with object storage: they want the architecture, they want the scale. But they still have existing applications and existing tools. So they're doing data generation: with video, we're moving to 4K, you're moving to longer episodic content, and you need a ton of data online. Medical imaging, genome sequencing: these are petabytes upon petabytes of data in a workflow. And then you have end users. We're putting lots of data into the system, but if the only way for them to get access to it is via an object API, that doesn't necessarily fit with the workflow, the software, and the tools that they're using. File is the way people are accessing it, and existing applications and workflows are generating it. But you have the volume increasing, so traditional file systems are starting to hit scale limitations.

And then you
have a group of developers who want to be building applications using an object API. You talk to any developer nowadays that's coming up, they're much more willing to consume an S3 or a Swift API because it's a REST API. It's much easier to do, and there are many more developer features around that. So you have applications being built, and there's this conundrum that needs to be solved: how do we use the existing things that are generating data, yet support the new applications that people want to build?

Okay. So if applications are going to be using it, and developers are going to be writing stuff against the object API, why are we spending time working on the old and busted file system access?

Okay, yeah, John's teasing an analogy I've been using out of me, right? Which is: okay, let's say the future is all going to be self-driving cars, right? So then do we not bother to get a driver's license? No, that's silly, right? So this is a way for us to support both of these needs at the same time.

So the first time I heard about ProxyFS... the first thing you want to do when you say "I'm going to provide file system access onto something" is: that's really great, use a gateway. There's a bunch of them out there.
Just pick one and use it. But the thing about gateways is that they are built with the fundamental design assumption that the storage you're making this gateway to is remote; it's someplace else. Even if it's not deployed that way in every instance, the actual software and the functionality that a gateway provides is built with that assumption. And when you build something with that assumption, you have to make certain choices. Like: I need to spool some data locally, so I need to have a little bit of storage locally in my gateway. And I can't trust this network link, because it's less trustworthy than something that might be in the same data center; these are high-latency, low-throughput links, because it could even be going to a different continent, or at least something that's a couple of time zones away.

But when we were starting down the design process here, the guys really went back to the drawing board. They were like: well, wait a minute, we're doing private deployments, right? Yeah, we're doing private deployments. Well, why are we making this assumption that the storage is somewhere remote? Why don't we take advantage of the storage that we have, the object storage itself, and build on top of that as an architecture, rather than assuming that we can only access it via the public API and so have to come up with some caching mechanisms? Let's not do that at all. Let's use the object storage as the persistence layer as we architect this.

So the summary is basically that yes, you can use a gateway, and gateways have their place. We're not saying that you shouldn't use a gateway.
What we're saying is that in these kinds of private storage deployments, which is where we see OpenStack deployed more and more, you can actually take advantage of that, because you have certain special knowledge that you have to assume you don't have with the gateway solution. That's why something that's tightly integrated is a better solution overall for the end user than just using an off-the-shelf gateway.

And then you have the other way, flipping this on its head: why not just have a file system underneath your deployment and then put object access on top? Well, then it's the opposite, right? Then you basically have an object gateway that's talking to a file system, and you're not really gaining anything at all. It's like the Venn diagram of the worst of both worlds: you have the restrictions of the object API, you have the scale limitations of the file system, and you have to manage and operate that file system, which is the whole reason why people are looking to object storage in the architecture to begin with. They want that object storage on the bottom for the operational scale.

So we have to build on top of Swift itself to be able to integrate with Swift itself. And there are three primary characteristics of Swift that we can rely upon. The first one is that it's durable storage, which means that if you put data into it, it's there and it's there for good. More than that, when you get a successful response back, it means that you're going to be able to read that. We make sure that we are handling a lot of different hardware failures and working around all of those things transparently. In the background we're taking care of those normal things that happen in a storage cluster: hardware failures, file system corruption, unexpected power cycles, all of those sorts of things. So we have a very durable system, and alongside of that
we also make sure that it's highly available, which means that yes, you can pull the power out of an entire rack and Swift keeps on working, and you can lose servers all the time and Swift keeps on working and works around that transparently. The really great thing is you can put these things together and you get stuff like being able to do rolling upgrades. Rebooting a server to upgrade it ends up being treated in the system just like a server that happened to go down momentarily, and that's a completely okay and normal operational concern. So it means that you can expand your cluster, it means that you can do rolling upgrades, and it means you don't have to wake up at 2 a.m. on a Saturday morning because you had a PagerDuty ticket that said that your hard drive failed.

The third thing that we need is this feature of Swift that is read-your-creates. When you create a new object in Swift, not an overwrite of an existing object, but a new object, and you get back a successful response, that means that you can read it, and it means that anybody else is going to be able to read that create right away. So I put something in, Joe tries to read it, everything's good. This is the building block for the fundamental design of what we're doing. We're going to go into a lot of detail about how that actually ends up working and about the individual components, because this capability allows us to build a log-structured data format inside of Swift.

This is a picture of what we're going to be walking through in the next few minutes here. There's a new component called ProxyFS, which sits alongside the proxy services, and the persistence of that file access is done as objects in the Swift object store. So that means there are no gateways, and we're going to be using a log-structured data format to persist the data into the system. So, log-structured files in Swift.
So what do they look like? There's a segment of data that comes in, and this is a new way of storing data. What you do with that segment is you then need a manifest, which sits at the tail of that segment and contains the map of that particular file. When you add additional segments into the system, those are connected together via those manifests into a whole object.

So if you've been around Swift for a while and you know what's going on, this is like the... well, we're not talking about that yet, are we? We'll get to that in a second.

Okay. So the segments have byte ranges associated with them, and those byte ranges exist inside of those manifests. Those manifests can point to different ranges, or they can point to other manifests, and those manifests form a tree that you can navigate. So when a file system client comes in and does a seek to a certain location, we can efficiently navigate to the segment and the byte-range offset that we need in order to get at the part of the file we need access to.

So then, when new data comes in, let's say you're modifying really anywhere in the file, adding data at the end, adding data at the beginning, or modifying some data, what you're doing is adding an additional segment. Then garbage collection comes along and deletes data that is no longer needed, that's no longer part of that manifest map, to reclaim the space.

Okay, so let me try to sum up that piece right there. We've got file system access: you do seeks, you do tells, you do writes. And when you seek to byte position 100 and you try to write a few hundred bytes of data, this is not something where we are seeking to the middle of some object in Swift, right?
Right, so we are creating a new object inside of Swift. Remember, we have those read-your-creates properties. Then we can create an updated manifest, which is really just "here are some offsets of where you need to go look." So we created a file with the file system access, and that created this one segment inside of Swift. Now we need to update it, so we create this new segment inside of Swift. The manifest says: you're going to go back to that first one and read this series of bytes, then you're going to go read this other series of bytes from the new segment over here, then you're going to go back to the first one and read the rest of the bytes to be able to get your data out. Or at least that's the basic logical idea of how that's working.

Yeah. So for that read plan, you go to that manifest; you have the location of where that manifest is, so you can build that read plan and read that data for a file system access. The other benefit of being able to put this in multiple segments is you get parallel read performance when you're servicing that file request. Again, those segments are just objects in Swift, so the reads can be parallelized out and you're hitting different parts of the system to read that file as you're serving it.

The other benefit of the strategy, and we're seeing this particularly in media workflows where you're starting to stream data in, is that as you flush, you create additional segments in the object storage system, and that allows a read to begin before the file has finished being written. So you can imagine broadcasting or security footage streams; that's very useful.

This also allows for object access to it as well. Object access is a log-structured access too; it's a manifest managed by Swift.
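The write-then-read-plan idea described here can be sketched in Go, the language the talk says ProxyFS is written in. This is a minimal illustration and not ProxyFS code: the `Extent` and `LogFile` types, the flat extent list (the talk describes a tree of manifests), and the segment names are all assumptions made for the example. It shows a write into the middle of a file becoming a new segment plus an updated map, with nothing modified in place.

```go
package main

import "fmt"

// Extent maps logical file bytes [Offset, Offset+Length) onto bytes of one
// segment object, starting at SegOffset. Segment names are hypothetical.
type Extent struct {
	Offset    int64
	Length    int64
	Segment   string
	SegOffset int64
}

// LogFile is a toy manifest: a sorted list of non-overlapping extents.
type LogFile struct {
	extents []Extent
}

// Write records that a newly created segment holds file bytes [off, off+n).
// Older extents are trimmed or split; old segment data is never modified.
func (f *LogFile) Write(off, n int64, segment string) {
	if n <= 0 {
		return
	}
	end := off + n
	added := Extent{Offset: off, Length: n, Segment: segment}
	var out []Extent
	placed := false
	for _, e := range f.extents {
		eEnd := e.Offset + e.Length
		if eEnd <= off { // entirely before the write
			out = append(out, e)
			continue
		}
		if e.Offset < off { // keep the part to the left of the write
			out = append(out, Extent{Offset: e.Offset, Length: off - e.Offset,
				Segment: e.Segment, SegOffset: e.SegOffset})
		}
		if !placed {
			out = append(out, added)
			placed = true
		}
		if eEnd > end { // keep the part to the right of the write
			start := e.Offset
			if start < end {
				start = end
			}
			out = append(out, Extent{Offset: start, Length: eEnd - start,
				Segment: e.Segment, SegOffset: e.SegOffset + (start - e.Offset)})
		}
	}
	if !placed {
		out = append(out, added)
	}
	f.extents = out
}

// ReadPlan lists, in file order, which segment byte ranges cover [off, off+n).
// Holes in sparse files are simply skipped in this sketch.
func (f *LogFile) ReadPlan(off, n int64) []Extent {
	end := off + n
	var plan []Extent
	for _, e := range f.extents {
		lo, hi := e.Offset, e.Offset+e.Length
		if lo < off {
			lo = off
		}
		if hi > end {
			hi = end
		}
		if lo < hi {
			plan = append(plan, Extent{Offset: lo, Length: hi - lo,
				Segment: e.Segment, SegOffset: e.SegOffset + (lo - e.Offset)})
		}
	}
	return plan
}

func main() {
	f := &LogFile{}
	f.Write(0, 1000, "seg-1")  // initial file contents
	f.Write(100, 300, "seg-2") // seek to byte 100, write 300 bytes
	for _, p := range f.ReadPlan(0, 1000) {
		fmt.Printf("file[%d:%d] <- %s[%d:%d]\n", p.Offset, p.Offset+p.Length,
			p.Segment, p.SegOffset, p.SegOffset+p.Length)
	}
}
```

The plan for the whole file has three pieces: the start of the first segment, the new segment, then the rest of the first segment. The 300 overwritten bytes of the first segment no longer appear in any extent, which is the kind of data the garbage collection mentioned in the talk would reclaim.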
You're just going through the proxy request, and again, because it's a manifest, you can use the surface area of the whole cluster to service the read request.

So when somebody tries to read or write, specifically in this case write, through the object API, and this is an object that has been enabled for bimodal access, at that point the ProxyFS software will be able to intercept that and write it down using this log-structured format. To me, one of the most exciting things about that is that now the file system clients and the object storage clients can coordinate. In a file system world, you may get a lease on a file, and you're doing things, and you want to be alerted when you need to refresh your caches and things like that. Then I come in and do a PUT into the system of a file that you, say, have a lease on, and the next time you go to read the data, you have been notified already, because those access methods are coordinated with one another.

So yes, this is what I wanted to talk about, because every time we start talking about this, this is what everybody thinks about: we kind of have something like this already in Swift. This is the way we do large objects inside of Swift. We basically have this manifest, which is one little object by itself, which you could think of as a read plan, like we've been saying, and you've got a whole bunch of segments. In this case we read the static large object manifest, and it says you're going to go read object A, and you're going to read the byte range, you know, one megabyte through one gigabyte from that one, then you're going to go to object Q, then you're going to go to object B, and then whatever. It is very similar, but there are a couple of important differences. One: the manifest in a large object inside of Swift is kind of a table of contents up front.
It says: here it comes, we're going to go read this manifest, and now we know exactly where to go. Just like a table of contents in a book, it comes up front and gives you this high-level bullet point of, you know, "chapter 2 starts on page 33." The log-structured file is kind of the inverse of that. It's more like an index in a book. It comes at the end and is generally going to be larger. It's going to have references that say we're going to go look for these particular ranges back in all of those other things over there. There's a structure to it; an index is alphabetical, right? And similarly, there's a structure to the way the manifest is represented. The advantage of this is that generally that index is going to be much bigger than the table of contents, and by putting it into new objects and keeping that manifest updated every time you add these new log-structured segments, you can continue to grow it to be quite large.

So then, okay, what's a file without a file system, right? We need to connect all of these files together in a file system tree. So we have to support a few things. We have to support both file trees and object URLs, and there are multiple layers in the hierarchy as you navigate down that file system tree. You need to be able to move items, and typically with the Swift API there's no move; there's no rename command implemented. But if you go into a file system and you rename some root directory, you're not expecting data to be moved around, just pointers to that data to be updated. We need to support hard links and symlinks. And typically, because of the types of use cases that bring Swift into an organization, there tends to be a large number of files, so a large number of files per directory, and that needs to be efficient as well.
So the solution is exactly the same as the file approach. Instead of a log-structured file, it's a log-structured directory. That directory segment contains data, so it's a special segment, and we understand that in there are the elements that are part of that directory: the name that it's referring to, and then the inode that it represents. Those children can be files, they can be directories, and they're just log-structured files like anything else.

So likewise, when you're adding new items to a directory... And by that you mean, like, I'm updating the listing? You're updating a listing. I do ls and I see 10 things, I do ls again and I see 15 things? Correct, because you've added new things in there. So you're adding new things in there, and an additional segment will be created as a new object inside of Swift, and the manifest is updated and becomes the new root for the tree that represents it. Then a read will happen to get the directory listing. The advantage of this is that we can do parallel reads in the Swift object storage system to get the contents of the directory, so that can be fanned out. The other advantage is that when we do an update to a large directory listing, we don't need to do the roll-up all the way to a root node, so it's more efficient.

And then the log-structured directories, right? So now it's kind of piecing these two things together. Each directory in the system has its own directory segment; there's one for each directory in the hierarchy, and those are all represented as unique objects. That allows us to translate a path and work through that hierarchy, to navigate down a file system tree and represent the file system in the export.

Inodes. This is the thing.
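As a rough illustration of the directory description above, here is a small Go sketch. It is not ProxyFS code: `DirUpdate`, `Dir`, and `FS` are hypothetical names, and replaying every appended segment to build a listing is a simplification of the manifest tree the talk describes. It does show the two properties mentioned: each directory is its own append-only log of name-to-inode entries, and renaming a directory only rewrites a pointer in its parent.

```go
package main

import (
	"fmt"
	"strings"
)

// DirUpdate is one entry in a directory segment: a name bound to an inode
// number, or a removal of that name.
type DirUpdate struct {
	Name    string
	Inode   uint64
	Removed bool
}

// Dir is a toy log-structured directory. Each element of segments stands in
// for one directory segment object; existing segments are never rewritten.
type Dir struct {
	segments [][]DirUpdate
}

// Append adds a new segment describing changes to the directory.
func (d *Dir) Append(updates ...DirUpdate) {
	d.segments = append(d.segments, updates)
}

// Entries replays the log, oldest segment first, to get the current listing.
func (d *Dir) Entries() map[string]uint64 {
	entries := make(map[string]uint64)
	for _, seg := range d.segments {
		for _, u := range seg {
			if u.Removed {
				delete(entries, u.Name)
			} else {
				entries[u.Name] = u.Inode
			}
		}
	}
	return entries
}

// FS holds one Dir per directory inode, as in the talk's hierarchy.
type FS struct {
	Root uint64
	Dirs map[uint64]*Dir
}

// Resolve walks a slash-separated path one directory at a time and returns
// the inode number it names.
func (fs *FS) Resolve(path string) (uint64, bool) {
	cur := fs.Root
	for _, name := range strings.Split(path, "/") {
		if name == "" {
			continue
		}
		dir, ok := fs.Dirs[cur]
		if !ok {
			return 0, false
		}
		next, found := dir.Entries()[name]
		if !found {
			return 0, false
		}
		cur = next
	}
	return cur, true
}

func main() {
	root, media := &Dir{}, &Dir{}
	fs := &FS{Root: 1, Dirs: map[uint64]*Dir{1: root, 2: media}}
	root.Append(DirUpdate{Name: "media", Inode: 2})
	media.Append(DirUpdate{Name: "a.mov", Inode: 3})

	// Rename /media to /video: one new segment in the root directory.
	// Nothing under inode 2 is touched or copied.
	root.Append(DirUpdate{Name: "media", Removed: true},
		DirUpdate{Name: "video", Inode: 2})

	inode, ok := fs.Resolve("/video/a.mov")
	fmt.Println(inode, ok)
}
```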
What we've described so far, we've gone real deep into. If you're keeping up right now, what we've got is a whole lot of individual objects inside of Swift, and we've got to tie these together in some way. We've got those manifests, which tie together individual subsets of these. So you're updating a file via the file system access, or via the object API for something that's enabled bimodally, and it will end up creating a bunch of different segments that are then tied together with this manifest. And each one of those segments, of course, we'll say it again, has a unique ID; they're unique objects inside of Swift, right?

But now let's say, for example, we have a file that's made up of three segments. We need to tie those together logically, and we're calling that an inode, because that's generally what they're called in the file system world. An inode is an indirection layer. In a traditional file system, you buy a hard drive and put a file system on it, and the inodes are the mapping between the logical file system path name and the actual physical geometry of the drive, in general, or the block device, the block address. You've got some more things inside the hard drive, of course, but the point is that you need this indirection layer to say, "I want to translate My Documents to, you know, go to this LBA," and to be able to have that mapping. So the inodes are the things that we need to keep up to date. Then when something moves around, we can just update the inode, and we don't have to move the actual data itself.

The inode is also a convenient place where we can put other types of metadata: the link count, modification time, access time, creation date, all the traditional things you have. What's also interesting is that we can put user-defined metadata in there as well. So for those who are putting data in through the object API and adding user-defined metadata, that can also go into
the user-supplied metadata on the file system side as well, for the file system clients that are able to consume that.

And so that gets down to the other major part: what's the actual magic behind this? We've now got millions and millions of objects inside of Swift that are this log-structured thing, we've tied them together with the manifests, and we have the inode abstraction layer. So now we need to know: how do we go find the right manifest to find the right segments to be able to read the data that you asked for? That's where you have to come in with something else, and we've got an inode mapper. The basic idea here is that it's a key-value store. It has to be strongly consistent, and this is where you do your lookups. So yes, this is essentially a lookup table: when a request for an object comes in, we can translate that to where the current manifest, the last one in that chain of segments, is located, with this unique object ID to go look at. Just like in a local file system, where you say "I need to go look for My Documents" and it says go look at this inode, which tells you the block address to go look at, here we're taking the requested object name or file name and using the inode mapper to look up the Swift object ID, the unique, distinct object there.

So that's the mapping, and that allows atomic updates. It allows us to create a series of transactions that the file system wants to present to the system, and then it interfaces with the inode mapper to go execute that transaction. The thing that we've done here is that we've isolated what needs to be updated transactionally to only the inode-to-root-object-ID mapper.
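One way to picture that single transactional step is a compare-and-swap on the inode's head pointer. The Go sketch below is an assumption-laden illustration, not the ProxyFS design: the talk only says the mapper is a strongly consistent key-value store, so the `InodeMapper` type, its `CompareAndSwap` method, and the use of an in-memory map guarded by a mutex as a stand-in for that store are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrConflict means another writer published a newer head first.
var ErrConflict = errors.New("inode head changed since it was read")

// InodeMapper maps an inode number to the object ID of its current head
// manifest. A mutex-guarded map stands in for the strongly consistent
// key-value store; an empty string means "no head yet".
type InodeMapper struct {
	mu    sync.Mutex
	heads map[uint64]string
}

// NewInodeMapper returns an empty mapper.
func NewInodeMapper() *InodeMapper {
	return &InodeMapper{heads: make(map[uint64]string)}
}

// Head returns the current head object ID for an inode, if any.
func (m *InodeMapper) Head(inode uint64) (string, bool) {
	m.mu.Lock()
	defer m.mu.Unlock()
	head, ok := m.heads[inode]
	return head, ok
}

// CompareAndSwap publishes newHead only if the head is still oldHead.
// Segments and manifests are written as new objects before this call, so
// this pointer flip is the only step that has to be transactional.
func (m *InodeMapper) CompareAndSwap(inode uint64, oldHead, newHead string) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.heads[inode] != oldHead {
		return ErrConflict
	}
	m.heads[inode] = newHead
	return nil
}

func main() {
	m := NewInodeMapper()

	// First writer creates the file: no previous head.
	fmt.Println(m.CompareAndSwap(42, "", "manifest-obj-1"))

	// A second writer that still believes there is no head loses the race
	// and must re-read the head and retry.
	fmt.Println(m.CompareAndSwap(42, "", "manifest-obj-2"))

	head, _ := m.Head(42)
	fmt.Println(head)
}
```

In this sketch a losing writer's segment objects simply stay unreferenced, which fits the talk's model where anything not reachable from the current manifest is left for garbage collection.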
So it constrains what needs to be part of this transactional update. And then, similarly, and John touched on this a minute ago, you can also feed it a file path or an object URI, and it translates that path to a root object ID, and then you look up that root object ID to get into the system. So for the object API access, you don't necessarily need to re-traverse the file system tree in order to go get at that individual file.

The target here is POSIX file system compliance: having a hierarchical directory structure, symlink information, being able to do atomic path updates, flush, stat, fstat, truncate. But also, I think pretty importantly, there's the permission side. The goal here is to be able to map the permissions at the share level to what is provided in Swift. By connecting those two things and integrating that into SMB, that simplifies how much management needs to be done by the operators.

And speaking of which, the first implementation is to be exported via SMB, and we're doing a couple of things here. First is clustered Samba, and above that, using DFS for load balancing across that clustered Samba. Then on the back end is Samba VFS. Samba VFS is an API that you can conform to instead of routing through FUSE down into userland and all the way back up again, so it's a more direct path into ProxyFS.
So everything stays in the same process. And then SMB compliance: it'll be SMB 2 and 2.1, with NTLM for authentication, and supporting leases. But as for byte-range locks, because of the workload that we're going for, which is this ingest workload, the initial implementation doesn't have byte-range locking.

So, the second benefit. Okay, did we say again that all these segments are objects in Swift? If we're doing that, that means we can take advantage of storage policies. From a deployment perspective, that means we can store those segments in different storage policies. In most of the deployments that we're involved in, we typically use a bit of SSD for account and container data. We'll also then use solid-state media for storing directory segments, which will cut latency for people who are navigating a file system and doing directory listings. It also means we can do things with the file segments: instead of a replicated policy, maybe, for someone who's using this for an archival use case, store them in an erasure-coded storage policy.

So, language: the ProxyFS code is being written in Go. The initial open-source release will be over the summer. Contact either John or myself for a preview or to get involved with the project.

So, overview. This is file access integrated into OpenStack Swift. The goal here is bimodal access to data within the same workflow. It's really a bridge for these data-intensive workloads, so they can continue to use file system access for the applications and tools that they already have, but start to build applications using object APIs. And it's still designed for the same workloads that Swift is already good at, which is high concurrency, so lots of users, lots of incoming connections, and high throughput rates. It still has the same
durability properties and availability properties. And so that's it. Thank you very much, and we're happy to answer any questions you have.

Is the inode mapping database stored as a Swift object and replicated as a standard Swift object?

No, it is an external key-value store.

Okay. So if you have ProxyFS and, like, Samba VFS running on multiple proxies, how is that replicated between them?

So in that sense, what that is going to have to be is a distributed key-value store that has its own consistency processes and coordination engine inside of it. And rather than distributing that out as an object into the whole system, we can create a pool of proxy nodes that are just servicing that. Just like we might have a pool of proxy nodes today, it's very typical for us to do a proxy pool that exists in a region, and then there's another region which has its own proxy pool. So you're only spreading that data across that smaller set of servers. Good question, though.

Hopefully I didn't miss this in the very beginning, but is there any interaction between the key-value store that stores all the mapping for the inodes and the ring itself?

No, not directly.

Two quick questions. The first one: what about authentication? Does it integrate with Keystone or stuff like that?

So the point is, right now Keystone is not a file system authentication thing. Swift can perfectly support Keystone and will continue to do so. One of the advantages, again, of the bimodal access and being able to expose some of these file system attributes with the Swift primitives that we already have, accounts, containers, objects, is that we can actually start translating the ACLs that we have in Swift to the SMB permissions that you would have there. Now, that being said, we haven't completely written that code yet. So how should it work?
I mean, these are things we'll explore.

Yeah, and we do have an Active Directory module, which will make that tie-in easier for folks that are working with us. But we'll need to figure out the mapping between what a Keystone user is and an SMB user.

So that being said, I think the last point here is that, for the time being, our decision is that the resolution of the bimodal access is at the Swift account layer. In that sense, you've got the accounts and account ACLs, with container ACLs inside of that, and that is then going to be wholly exposed as an export. So that's what SMB would be able to consume: that account and container metadata. We actually have some of the SMB stuff written.

Yes. Okay, last question: how does it integrate in an existing cluster where there is no log-structured stuff yet?

Good question. So this would be new. Think of it more along the lines of a storage policy. It won't be implemented with exactly the same mechanism, but yeah, new data coming into the cluster would have to come in through that path and be stored as a log structure for the bimodal access.

Yeah, correct. And that being said, in a lot of ways you can look at this as an application data structure. Swift is storing blobs of bytes, and that's what Swift is concerned with, but we've got the ProxyFS piece, which can intercept that and say that this is going to be in this sort of structure.
So when I start seeing this data stream, I can interpret it in this way. So it's going to be possible to add it to existing clusters, but it's not going to be that, for example in OVH's case, "my existing 75 petabytes are all now going to be 100% available over file systems." It means that we can enable it for this particular subset. And that's the key that actually makes this thing click for me and work: we can look at the workload that people have, look at the actual overall data flow of "I've got to do stages 1 through 15 of my pipeline," and say this set of the data right here needs to be available for bimodal access.

I mean, yes, there's a lot of complexity here, and that's not a problem per se, but it is something that we have to go into saying: yeah, we've got to look things up in a key-value store, we've got to be able to traverse these log-structured files, and all those kinds of things. Let's not pretend that this is going to be this stunningly performant file system used for high-frequency trading. This is something we want to have for the workflows that require both kinds of access, or at least where we've got a workflow in which we put stuff in with object and then need to get the same data out with file systems, or vice versa. Then we can do that, and we can enable it for that.

Yeah, so you can enable it. Again, I think it does bear repeating: it's the same thing that you would target for the object storage API. So it really is the high-throughput, high-concurrency workloads. Yeah, great. Thanks for the question.

Thank you. How do you expect those index structures to scale to potentially billions of files, billions of objects?

Yeah, so there are two questions there.
There are two things. One is that the index file, that manifest part, is only keeping track of the segments for that one particular file. There are a couple of strategies that we're using there: basically, we're not looking at the first hundred bytes, the second hundred bytes, and so on; we're looking at the actual file system write operations, which can be pooled at whatever size is appropriate for your tuning. Then that's going to be written down as an object, and then we can update that. The actual structure of that is a B+ tree; it's just serialized down there, so we get the nice scaling properties of a very wide and shallow tree. And the second part is, yes, there will be billions and billions of objects in Swift, and Swift is really good at that. The distributed key-value inode mapper is the one that's going to have to keep track of where the head is for a particular thing that you're adding to right now. That's going to be a separate system over here that you're going to build and scale separately.

Yeah, and then the new manifest is just going to be the root of that B+ tree.

Thank you.

So it sounds like you have an inode mapper that needs to point at the correct manifests at all times, so your write throughput is limited by how many operations you can push through this new consistent store. Is that accurate?

Yes.

And is there some expectation about how high that scales?
Yeah, that's a really good question, because we don't necessarily need to inherit that same constraint across all the object pools. Like in the previous question that was asked, we can create a pool for a particular export that is managing that inode mapper. So, sure, it's going to be constrained by the distribution and the updates on that pool, but you can create other ones in the system for additional exports as needed. That's a good question.

Thanks.

You mentioned the bimodal access several times with this solution. As far as I understand, with your layout of the file split into multiple segments, you cannot simply get a single object. If you want to access the file as an object, you'll have to reconstruct its content using some kind of external code. Is that correct?

Well, from a client point of view, it'll be transparent through the object API. When the read request comes in to the proxy node, it will ask for a read plan, and that creates the read plan in sequential order for all of the segments. Then, depending on how much buffering we're going to do and how parallel we want the retrieval to be, it will ask the object storage tier for all of those segments and stream them up. So from the client point of view it's going to be transparent, but you're correct.
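
As a rough illustration of the read plan described in this answer, here is a minimal Python sketch. It is not ProxyFS code: the names `Extent`, `ReadStep`, `build_read_plan`, and `stream` are hypothetical, the manifest is modeled as a flat list of extents rather than the serialized B+ tree mentioned in the talk, and an in-memory dict stands in for the object storage tier.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List


@dataclass(frozen=True)
class Extent:
    """Hypothetical manifest entry: a run of file bytes held in one segment object."""
    file_offset: int     # where the run starts in the file
    length: int          # number of bytes in the run
    segment: str         # name of the segment object holding the bytes
    segment_offset: int  # where the run starts inside that segment object


@dataclass(frozen=True)
class ReadStep:
    """One ranged read against a segment object."""
    segment: str
    offset: int
    length: int


def build_read_plan(extents: List[Extent], offset: int, length: int) -> List[ReadStep]:
    """Return the ranged segment reads, in file order, covering [offset, offset + length)."""
    end = offset + length
    plan = []
    for ext in sorted(extents, key=lambda e: e.file_offset):
        lo = max(offset, ext.file_offset)
        hi = min(end, ext.file_offset + ext.length)
        if lo < hi:  # this extent overlaps the requested range
            plan.append(ReadStep(ext.segment,
                                 ext.segment_offset + (lo - ext.file_offset),
                                 hi - lo))
    return plan


def stream(plan: List[ReadStep], store: Dict[str, bytes]) -> Iterator[bytes]:
    """Yield the bytes for each step in order; `store` stands in for the object tier."""
    for step in plan:
        yield store[step.segment][step.offset:step.offset + step.length]


# The file started as "hello world!" in seg-1; a later write put "HELLO"
# over the first five bytes and landed in a new segment, seg-2.
store = {"seg-1": b"hello world!", "seg-2": b"HELLO"}
extents = [Extent(0, 5, "seg-2", 0), Extent(5, 7, "seg-1", 5)]

plan = build_read_plan(extents, offset=3, length=6)
data = b"".join(stream(plan, store))
print(plan)
print(data)
```

The client only ever sees `data`; which segments were touched, and in what order, stays inside the proxy. A real implementation could issue the steps in parallel and buffer the results, which is the tuning trade-off mentioned in the answer.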
Yeah, there's a component in there that will understand that it's querying for a log-structured object and do the reassembling there.

All right, thanks.

That's instead of sending the request down to an object server, which does the seek for the single file and then streams it up.

So thank you. The last thing I would like to say here is one thing I love to say all the time, and if you've seen me speak before, you've heard me say this: my vision for Swift is that everyone will use it every day, whether they realize it or not. We're seeing that happen, because we're seeing thousands of deployments, we're seeing that grow, and we're seeing millions of users around the world using Swift every day today. That's really great. What we do to take that to the next level is continue to remove barriers to entry for people using Swift. ProxyFS is something in the overall OpenStack Swift ecosystem that removes barriers to entry; it's the way we continue to make sure that everybody uses Swift every day, whether they even realize it or not.

Thank you very much. Thank you.