Thank you for coming. My name is John Dickinson, and I'm the project technical lead for Swift. This morning I am going to give you an update on the state of Swift: what the project is doing, what the community is doing, what we have been working on, what we will be working on, and how you can help.

So first off, we're talking about Swift. So what Swift are we talking about? Well, I have a modest proposal for you. I want to tailor this presentation to meet your expectations and drive home the right messages, so you can bank on the benefits and, something something, on your phone. Obviously there are a lot of Swifts out there, and most of these Swifts came after our Swift. Some of them didn't, I suppose. But what we're talking about, of course, is OpenStack Swift.

But who are we? Swift is one of the two founding projects of OpenStack. We follow the cycle-with-intermediate-release pattern inside of OpenStack, which means that we do a release with every OpenStack cycle. So Newton, Ocata, Mitaka, Pike, Queens: we do a release at that time. But we also do releases between those. We don't release on a set time schedule, but it generally works out to about every two months. Just two weeks ago, on April 27th, we released version 2.14.0, and you should go download and use that. It's the best Swift we've ever released. And in a couple of months you should do the new one, because that'll be the best Swift we've ever released.

Swift has been running in production at large scale longer than any other OpenStack project, and I count the day it was launched into production as Swift's birthday. In fact, Swift's birthday is next Wednesday: Swift will be turning seven on May 17th. So it's really great.
I love that, and that's kind of where we've been over the past seven years. Recently, as you know, the OpenStack Foundation has given us all new mascots, so we've got this beautiful new logo, and it looks great.

So where do we use Swift? We use Swift anywhere that you need to store unstructured data that can grow without bound. It's great for static web content. It's great for user-generated data like videos and movies. It's great for big-budget movies with special effects and rendering, for video games with online distribution, saved games, and in-game assets, for doing backups, and for storing your machine images and your container images. I know of places where Swift is being used for all of these today in production. It gives you all of this with a simple web API. And overall, if you're storing data on the internet and you're not using Swift, you're doing it wrong.

So if we're turning seven next week, what kind of things have we done? The big thing about Swift, my thing about Swift, is that I couldn't be more proud of the community that works on Swift and that I have had the privilege of working with over the years. These are some of the major, visible changes that we have made over the last few years: stuff like encryption, erasure codes, storage policies, and global clusters. But there are also innumerable other big and small changes that have been done in the community and that have made Swift better.
So, looking at who the community is and what's going on: this is a look at our active contributors from 2015 until today, or I guess maybe last week. In the last OpenStack cycle, Ocata, we had 120 unique contributors to Swift, and we had a total of 45 brand-new code contributors in the past year. On this chart, the blue line, the one on top that you can very easily see, is the monthly active contributors, and the red line, slightly lower, is the weekly active contributors. These are how many unique people have contributed to the code, either by writing a patch and having that land or by doing code reviews. Those are equally important participants in our contributor community. Over the past two years you can in fact see that there's a rise and a decline.

But let's look at this over a longer time scale. This is the exact same data, just extended back to the very beginning of time for Swift, going all the way back to 2010. Currently we're at about 16 unique contributors every week and about 33 unique contributors every month. I'm not blind: I can see that the slope has been trending downwards recently. There are many ways we can try to explain this, but I think the biggest one, which has been obvious in OpenStack in the news and the press, is that several companies have shifted their focus away from either Swift specifically or OpenStack in general. That has hit the Swift contributor base hard. We've had people who have lost their jobs or have been reassigned to other tasks. So what are we going to do about that?
It is one of those things that keeps me up at night, so to speak. I think that it's obvious, but it's not going to be simple. Some of the things that we need to do are: in general, encourage new contributors; be more direct about asking for contributor support; work within our community itself to remove barriers to entry, so we make it easier to download and to install; and honestly, just do better storytelling about production features, because those are the things that get people extremely excited about what's going on. So, just be more active and more straightforward about community recruiting.

This is kind of some of the stuff we're doing. But if you look at the absolute numbers, relatively speaking we're still an active and large open source project, even on our own. I said that we had 45 new contributors in the last year. This is who they are. New contributors are the lifeblood of every project. And yeah, Swift is an absolutely great storage system, but it's not because I get up on a stage someplace and say that. The reason that Swift is an amazing storage system is the hard work of these 45 people and more than 600 of their closest friends who have also contributed to Swift over the years. So to those who have contributed, several of whom I see here: thank you very much. It is because of your work in the community and on the code base that Swift is a fantastic storage system. And to everyone else: come join us.

So you come join us, and what are you going to work on? Let's talk about what the community is actually working on. We've got big stuff, and I'm always asked, "What are you currently working on, and when is it going to be done?" And I know what we're working on.
I have no idea when it's going to be done. One of the privileges and curses of being the PTL is that I get to know what's going on, but I don't get to actually tell people to do things. So it's not like a business where we can rigidly schedule out our milestones to get to a particular goal. We can track that as best we can. But what I can do, and what I try to faithfully do for the community and for the entirety of the rest of the world, is to say: here's what we're working on, here's how it's going, here's how you can come help.

During every release cycle we have some big things we're working on, but there are a few constants that remain the same. I will commit to you that no matter what features land, no matter what improvements land, and no matter what things we defer to work on in the future, there are a few baseline things that we will always commit to as a project team. One, we're going to continue to do intermediate releases, to allow you to have access to the new features in a stable, production-ready release as soon as possible. Number two, we're 100% committed to a smooth upgrade. You should be able to take a very, very early version of Swift from five, six, seven years ago and upgrade to current master in one jump, and not have anything break with the existing data that's stored in that cluster. If anything happens, that is a bug.
We want to hear about it and we need to fix it immediately. The goal of these things is, number one, to make sure that you can continually get what we're working on, so that you have something that's better at solving your storage problems, and number two, to make it easy for you, the deployer, the operator, to get that and not have to freak out about stuff. It's a lot harder to upgrade every two years than it is to upgrade every two months or four months or something like that. So I want to say that as a team we are very strongly committed to these sorts of things. Use Swift now, continue to track the upstream releases, and you will get the latest and greatest as they are released.

So what is the latest and greatest going to be in the near future? What are we working on? There are a couple of big things and a slightly larger set of smaller things. I want to talk first about what I consider the two priority things, the things that are going to make the biggest long-term impact on Swift overall.

The first one is something we call container sharding. Container sharding is a way to efficiently distribute some of the metadata inside of Swift. Without going into a lot of details of the architecture of Swift, the basic idea is that we have a set of back-end storage nodes that are able to track groupings of objects. You store your objects inside of a container, and the container is then persisted in the cluster as a set of databases on disks that are replicated and distributed, durable and available. The problem is that the more stuff you store in one container on your Swift cluster, the larger that database gets. This causes operator problems. What happens if you have literally a billion objects, or you want to have five billion objects in one container?
Well, your container database is going to get really big, and frankly you're going to run out of hard drive space on the particular hard drive where it is stored. So it would be a great idea to split that up into smaller chunks and distribute those smaller chunks throughout the cluster. Not only will this ease the day-to-day life of operators who are running clusters like this, but it will also allow anybody who's using Swift, the applications and the end users, to not have to understand or worry about some of these back-end things.

Right now, when we talk to people about Swift, we say: if you want great performance, you need to spread your data across the entirety of the namespace. Use a lot of different containers, and you'll be able to do a huge amount of concurrent access to Swift. That's exactly what Swift is optimized for. But when somebody is migrating to Swift from another object storage system, say something like Amazon S3, they're used to just dumping everything into a single S3 bucket. They would like to not change their applications and not change their mode of thinking when it comes to using Swift. This work will allow us to embrace those people who are migrating to Swift much more easily. So it's going to be great for the end users and great for the operators, just removing headaches from both. I'm really excited about this work. Matt Oliver has been working on this most recently in the community, and it's going to be great.
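
[Editor's illustration] The idea of splitting one very large container listing into smaller chunks can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not Swift's actual sharding implementation: the function names `shard_names` and `find_shard`, the tiny shard size, and the choice to split by contiguous object-name ranges are all hypothetical and chosen only to show how a lookup can be routed to one small shard instead of one huge database.

```python
import bisect

def shard_names(names, max_per_shard):
    """Split a listing into contiguous, sorted name ranges (hypothetical helper)."""
    ordered = sorted(names)
    return [ordered[i:i + max_per_shard]
            for i in range(0, len(ordered), max_per_shard)]

def find_shard(shards, name):
    """Return the index of the shard whose name range would hold `name`."""
    # Each shard except the last is bounded above by its last name;
    # the final shard is open-ended and takes everything beyond.
    upper_bounds = [shard[-1] for shard in shards[:-1]]
    return bisect.bisect_left(upper_bounds, name)

# Hypothetical listing of ten objects, split into shards of at most four names.
names = [f"obj-{i:04d}" for i in range(10)]
shards = shard_names(names, max_per_shard=4)
print(len(shards), find_shard(shards, "obj-0005"))
```

Because each shard covers a sorted range of names, a listing across the whole container can still be produced in order by walking the shards one after another, while each individual shard stays small enough to store and replicate comfortably.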
So we could also use a lot of help pushing this across the line and making sure we have it all validated and correct. Come help us on this.

The other big thing that I wanted to highlight is one that is extremely hard to categorize in a small set of words. Despite that, I have tried, and I call it optimizing rebalances. The basic idea here is not just looking at capacity adjustments inside of the cluster. It's an overall effort that starts with capacity adjustments but extends to quite a few areas in the system that are all about efficient back-end operations on the data.

So we start with optimizing for rebalances. Think about this: when we first launched Swift into production in 2010, we were using massive two or three terabyte hard drives, and literally a couple dozen, like 24 drives, in a chassis. Today it's completely common to go buy eight or ten terabyte hard drives and put 60 or 80 of those in one server. That sort of growth in density, without commensurate growth in network connectivity or in the number of cores on the individual object servers, means that we're having to do a whole lot more work with the same amount of network and CPU resources. That means we need to get more efficient in order to keep performance up and to improve things overall for everyone.

So there's a lot of work here. There's work involved in appropriately scheduling the work and determining what needs to move where when you add new capacity to the cluster. When you roll in a new rack or a new server and plug it in, you need to move a proportion of the data onto it to ingest that newly available capacity, and we need to schedule that more efficiently. Once we have scheduled that more efficiently, we need to make sure that the server-to-server, inter-process communication is as efficient as possible. And we need to make sure that
Swift is in the data path, so that we can have appropriate back pressure on what's going on: we can slow down when we need to, we can speed up when we need to, and we can take advantage of all of the resources that we have available. Going further than that, there's also a large set of future work to figure out this: great, we have efficiently scheduled the work, and we've efficiently communicated to another server that here's some new data that needs to go there. So how do we then efficiently take the data off the network and put it onto the disk, taking advantage of all of the hardware resources we have on that server? This includes some of the Golang work that you may have heard us discussing over the past year inside of the Swift community.

So overall, while I categorize this as optimizing rebalances, really that's just a short two-word phrase to say that we have a massive amount of work, in progress and continuing, that is all about making the storage nodes, the back-end parts of Swift, extremely efficient. The goal is that a year from now, when I get up and do another one of these talks, we will be able to say: look at the performance improvements, the lower latency that customers are seeing, the more consistent latency on requests, the faster ingest times, the shorter consistency engine cycle times. That will help everybody out. So, more and better: that's what we're working on now.

Obviously this is a massive amount of work, and we need lots and lots of people to help out with it, because there are big things that need to be done, and many of them can be done in parallel with one another. It's been really exciting this week to hear about some of the other work that people are doing along these lines. So that's a mouthful.
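
[Editor's illustration] The point about moving only "a proportion of the data" when new capacity is added can be sketched with a toy partition map. This is a simplified sketch, not Swift's ring builder: the `rebalance` function, the device names, and the partition count are hypothetical, and it ignores replicas, device weights, and failure domains entirely. It only shows that when a fourth device joins three equally loaded ones, about a quarter of the partitions need to move rather than all of them.

```python
from collections import Counter

def rebalance(assignment, devices):
    """Move as few partitions as needed so each device in `devices`
    holds at least the floor of its fair share.

    `assignment` maps partition number -> device name. Returns the new
    mapping and the number of partitions that had to move.
    """
    fair_share = len(assignment) // len(devices)
    load = Counter(assignment.values())
    new_assignment = dict(assignment)
    moved = 0
    for dev in devices:
        while load[dev] < fair_share:
            # Take one partition from whichever device currently holds the most.
            donor = max(load, key=load.get)
            part = next(p for p, d in new_assignment.items() if d == donor)
            new_assignment[part] = dev
            load[donor] -= 1
            load[dev] += 1
            moved += 1
    return new_assignment, moved

# Hypothetical cluster: 12 partitions spread evenly over three devices,
# then a fourth, empty device "d3" is added.
before = {part: f"d{part % 3}" for part in range(12)}
after, moved = rebalance(before, ["d0", "d1", "d2", "d3"])
print(moved, Counter(after.values()))
```

In this toy run only 3 of the 12 partitions change device. The scheduling, transfer, and disk-write efficiency discussed in the talk are about making each of those unavoidable moves as cheap as possible.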
That's a lot of stuff, and I don't want to be intimidating by saying, look at this huge amount of stuff we need to do. Remember that despite these changes, we are still committed to being able to completely seamlessly upgrade your clusters. We're not going to lose data. We're not going to break your cluster and force you to, you know, forklift-upgrade your clusters. So be encouraged about what's happening, and if you have some thoughts on it, please get involved with the community and we can keep going.

Those are the two big things that in my mind are the high priority. If you need something to do, those are the places to do it. But there are a lot of other things that are also very important, that people are working on or talking about, and these are exciting, especially when you think about Swift being used in more enterprise use cases and adopting some of the more traditional enterprise storage features.

Going quickly through a few of them: the first is policy tiering. This is the idea that an object stored in one policy, say a replicated one, may over time be moved to an erasure-coded storage scheme. Then there's the idea of policy migration, which asks: if a container is stored with one storage policy, how do we change that storage policy itself? Maybe you put a bunch of objects into a single container for a single project, the project is done, and now you need to move that to a different place. It's subtly different from tiering, but both involve changing the way things are stored.
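
[Editor's illustration] Policy tiering, as described above, amounts to a rule that decides which storage policy an object should live in as it ages. The sketch below is purely illustrative: the policy names, the `choose_policy` function, and the 90-day threshold are assumptions made for the example, and the talk describes this feature as work under discussion rather than an existing Swift API.

```python
from datetime import datetime, timedelta, timezone

REPLICATED = "replicated"        # hypothetical policy name
ERASURE_CODED = "erasure-coded"  # hypothetical policy name

def choose_policy(last_modified, now, cold_after=timedelta(days=90)):
    """Pick a storage policy from the object's age (illustrative rule only)."""
    age = now - last_modified
    return ERASURE_CODED if age >= cold_after else REPLICATED

# Hypothetical objects: one written a week ago, one written a year ago.
now = datetime(2017, 5, 11, tzinfo=timezone.utc)
recent = choose_policy(now - timedelta(days=7), now)
old = choose_policy(now - timedelta(days=365), now)
print(recent, old)
```

Policy migration would differ in that the decision is made once for a whole container, rather than per object as it ages.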
We're looking at implementing erasure codes on a globally distributed scale. And we're also looking at improving the existing at-rest encryption functionality we have, adding some more knobs and integration points so we can get some really nice new features: integrating with other external key managers, per-account keys, maybe even stuff like key rotation. So those are some of the other things that are also being worked on now. On one of these specifically, the global erasure code clusters, there's been a lot of work recently, and I am hopeful that the majority of that work will be done soon. So that's some of the big stuff that's being worked on.

Let's talk about where Swift is being used in production. Swift is completely deployable on its own, in a standalone mode, apart from any other OpenStack services. This is in fact a very common deployment pattern that I see over and over again. Honestly, I was blown away by the keynote on Tuesday morning when Mark Collier got up and praised the work of the Swift team on maintaining this, saying that Swift is deployable in an independent way, apart from other OpenStack services. That being said, Swift works extremely well with other OpenStack services as well, both as a storage target for things like Cinder, Glance, and Sahara (I've seen several talks this week on using Sahara with Swift as a way to run distributed Hadoop jobs on data that's stored in Swift) and by using features of projects like Keystone and Barbican, as we're looking at with the encryption work. So it's usable both on its own and with OpenStack projects.

It's also been interesting over the past year or so to look at how the industry is changing and, as new things are being developed in the broader ecosystem, to see where Swift can fit into that. I really do believe that Swift is more important
and more needed now than ever. It's a great way to do storage, but it's also somewhat of a new way to do storage, different from the traditional file systems or SANs that people have been using for quite some time. But now we're talking more and more about, yes, the cloud and cloud-native applications, and then you get into stuff like containers. What does that mean? Obviously there's been a lot of talk about Kubernetes, and about OpenStack working with the Kubernetes community. Docker's obviously not left out of those conversations. People are writing applications that are being deployed in containers, and it's more than just thinking of containers as a convenient way to package something up, as a distribution method. If you have applications that are running in containers, one of the benefits is that you're supposed to be able to create your application and deploy it on a fleet of servers, and then things can come and go, and you get this really nice distributed execution environment in which your application can scale out.

But one of the hard parts about containerization and the container ecosystem as it exists today is that storage is hard. If your application is coming and going and being scheduled on different servers, where do you store your data? You don't have a hard drive plugged in. I mean, if it's plugged into this server and you get rescheduled over here, then how does the data move between them? And moving data is much, much more difficult than moving the application.
So there have been lots of efforts inside of the container ecosystem to try to solve these sorts of things. Shocking, I know, but I think Swift would be really great for this. If you've got a distributed compute environment, you need a distributed storage environment that is just as scalable. When your application is loading data, or creating and storing new data, the thing you should be doing is putting that inside of Swift. Then you have your compute environment and your storage environment, and the storage environment is designed to take that data with scalable, easy access, growing along with your compute environment. So yeah, I think Swift is the best way to persist data for these apps. It's a new way, as the industry continues to move on, that Swift is being used and can be used more and more.

Also, look at the change in the landscape, as it were, from when OpenStack started back in 2010 to today, over the last seven years. We originally started out saying that OpenStack is going to be this great thing and it's going to supplant all of these other proprietary cloud vendors. But you know what? There are massive cloud providers out there who are not running OpenStack, and there are a lot of people who are running OpenStack clouds. The reality now is: how do we work together in that? How do we make sure that Swift can work in a multi-cloud world?
Whether that's OpenStack or Amazon or Microsoft or Google or just anything else. "Hybrid cloud for storage" has always been one of those phrases that hasn't really worked well in my mind, because data doesn't move very easily. If you put a petabyte of data in your own local data center, you don't just copy that data to the public cloud to compute on it and then copy it back. That doesn't make sense, because it's extremely hard to copy a lot of data back and forth. But recently I've been hearing about some really cool use cases where you have multiple Swift clusters, or you have Swift working with other public cloud providers. I'm going to tell you a story of a place where I've seen this, and I think it's really exciting.

It's a movie company. They may be doing special effects, or animation rendering, or something like this, and the reality is that the compute requirements for that are enormous. You have some amount of storage that you need, and you measure that in a few hundred terabytes or petabytes or something like that. But the compute: you need tens of thousands, a hundred thousand cores just running constantly to do this. Building that infrastructure out yourself is expensive, and it turns out that things like AWS EC2 spot instances, for example, are really cheap, especially if you can just get them overnight. The problem is that they can come and go, and you don't know exactly when you're going to be able to execute, or when an instance is going to be killed because somebody else chooses to pay more money for it. You've got your data, and you would love to be able to churn on it constantly in your own data center. But how do you then take advantage of that cloud compute?
That's available someplace else. So the system that I've heard about is this: you've got your Swift cluster, which you're running in your own data center for cost and privacy reasons. Then, if you have something that you need to compute on, you can actually run a Swift node inside of that public cloud, just on one of their compute instances, and it participates as part of a global Swift cluster. When a spot instance comes up, all it has to do is talk to that local Swift endpoint inside of the public cloud. That Swift endpoint could be doing something like this: load the data from the primary location, the Swift cluster that may be the on-premises deployment, but cache it in, say, S3. What that gives you is that, as these spot instances come and go and need to compute on something, they don't always have to fetch it from the primary data center. You get this efficiency, and honestly a lot of cost savings on network transit, because network transport costs are the hard problem with the public cloud. So you get a lot of efficiency, you get a lot of cost savings, you're taking advantage of Swift's own feature of globally distributed clusters, and you start being able to make headway on this whole idea of distributed hybrid cloud storage. Seeing these things come together and seeing people start using them in production is really exciting to me, and I think it shows a very bright future for Swift.

So overall, the main story about containers, about hybrid clouds, and about increasing use cases for Swift is: how do we access Swift? Among the other ecosystem things that have been going on: obviously we've got the Swift API, and we are the implementation of the Swift API, but there are ecosystem projects that allow you to have an S3 API, and there are also people who are working on
file system access in different ways, to allow more traditional applications to talk to a Swift cluster. So the point here is that Swift is the place where data is stored. We've got this continued development in the ecosystem to remove barriers, so that people can more easily put their data into Swift and get their data out of Swift for compute or whatever they need, and we've got the features that we're working on in the community to help streamline that and make things more efficient. It's all really exciting. I'm really excited about what's going on.

If you've seen any of the other talks that I've given about Swift, these project updates, I'm about to say something I've said many, many times before: my vision for Swift is that everyone will use Swift every day, whether they realize it or not. We've seen that happen. You go look at Wikipedia and you look at pictures there, and they come from their own Swift cluster. You go watch movies; guess how the special effects on those movies were done. You go to your bank and look at your documents. You take a backup of your servers. Those kinds of things are happening. And as we facilitate better access to Swift, as we work in a multi-cloud world, as we embrace Swift being used in conjunction with container ecosystems, and as we make Swift itself more efficient, improving it and adding new features, this is happening, and it is absolutely fantastic. We see Swift being used by millions of users every single day, and I think that's only going to keep growing.

So that being said: help us, get involved. Come on! We've got a great community. It's the best people I've ever had the privilege of working with. The community, the people, are what make Swift great. I would love to have you come in, introduce you to everybody, and help you get involved, even if that's just a "Hey, I've got an idea." Great.
Let us know; that would be fantastic. If you want to come in and say, "I don't have any ideas, but I'd really love to help make some of these other ideas happen," great. We've got a lot of work that you can help us out with. So thank you very much for that.

We have plenty of time for some questions. Please come up and use the microphone here, and I'll do my best to answer those and repeat them for the video. Not everybody all at once.

Regarding the API, the reason I mentioned it: we've got the Swift3 API, the Swift3 project, as another project under the OpenStack umbrella that provides that S3-to-Swift translation. That's going very well and is continually improving. I would consider the Swift3 stuff part of the ecosystem too; it's not in the actual code base for Swift. The file system API stuff that I'm talking about is a few projects that I have seen, also in the ecosystem, not being directly developed inside of Swift. For example, a year ago I was on stage with my employer, SwiftStack, and we were talking about ProxyFS as a project that we're working on to add file system access. You can also see other FUSE-based ways to access this. There are strengths and weaknesses to a lot of these. Honestly, I'm kind of excited about the way that ProxyFS is going. But, you know, these are ecosystem things that people are developing on top of Swift so that we can help facilitate access to Swift.

If you do have any further questions, feel free to reach out on Twitter and IRC. I am notmyname, and you can find me pretty easily online. Thank you for your time, thank you for coming today, and let's all go use and build Swift.