All right, sounds like my mic is on, so I'm going to start a little early and give you a bit of an introduction. Most people should have a handout like this by now; if you don't, you should have one in a few minutes. The handout includes the second exercise we'll be doing today. If you want to follow along, you're going to need a terminal. If you're on a Windows machine, use PuTTY; if you're on a Mac or a Linux box, your standard terminal will do just fine.

On the very first page you'll see two stickers. The sticker on the left-hand side is the one we'll use in the first hands-on lab, and the one on the right-hand side is for the second lab. We only have 80 minutes, so we'll be cranking through this pretty fast. If you're not done with either lab when we move on to the next one, don't worry about it. Stay around afterwards; we'll hang around here and help you get it sorted out if you want. Just try to move along as fast as possible. We'll wait a few more minutes to make sure everyone has received a handout. If you don't have one, raise your hand in the air. Yeah, yeah, do that. That's great. Yeah, I brought my own pointing device here. It doesn't have a laser yet, but that's coming in the next revision, which is Swift-powered. Louder, please. Do we have a soft copy of that? We do, probably. I can put it up online after this workshop.

Okay, I think we should get going. It's 4:10, which is when we were supposed to start. All right, a quick introduction. My name is Martin Lanner. I'm an engagement manager with SwiftStack; we focus our business purely on Swift and deploying Swift. I have my colleague Hugo Kuo with me. He's a systems engineer, and he works out of Taipei, Taiwan. He'll be doing all the hands-on work, and I'll be guiding you along.

The agenda is as follows. We'll do a quick introduction to object storage and Swift, and look at how Swift works for those of you who don't know already. Then we'll launch into the two labs. First we'll build one Swift node more or less from scratch; Swift is actually already installed on the node, but we'll do the configuration, build the rings, and all that together. When we're done with that, in about 20 minutes, we'll launch into a SwiftStack installation, doing the same thing a different way, through a GUI. Then we'll take a look at how you operate, monitor, and manage Swift, and talk about what you need to think about, things like that. At the end, we'll look at failure handling, explain what Swift does, how it does it, and why, and explain how that benefits you as an operator. I should also add that the VMs you're running are all OpenStack-based machines, running on the Rackspace Cloud, so a little plug for Rackspace there.

So, why object storage? I read a study recently which says that data grows at about 50% per year, and that 50 to 75% of that data is unstructured and tends to be archival in nature. There's also been a large explosion of web applications that take advantage of unstructured data in object stores like Swift.
Swift and object stores in general use RESTful APIs, so you address them with plain HTTP requests. The nice thing is that it's a distributed, highly available system where you don't need to worry about things like RAID or the loss of nodes in your system. That lets you run very agile data centers and much leaner operations: you don't have to rush out and change a disk every time one goes bad in a RAID set. Swift doesn't use RAID at all, so it saves you from all of that.

Now, why Swift as opposed to something else? Well, you're probably all here because you like open source. Swift is completely, 100% open source, and it's been proven at scale. We know Rackspace runs very large clusters; they've gone out and talked about that publicly. There are lots of other clusters out there that haven't been talked about publicly, but they're also very, very large. So it's hard to beat Swift today in terms of proven technology, and since it's all open source, you can peer into what's going on underneath. The Swift project is today managed by about 15 core devs, and there are another 150 or so developers contributing from time to time. The core developers come from SwiftStack, Rackspace, Red Hat, eNovance in France, IBM, Intel — I've probably forgotten a few, but there are quite a few.

The benefit of running Swift is also that you can deploy a large object store in your own data center. Amazon S3 is very, very popular, and it's great for a lot of use cases, but it's not always an option if you're under restrictions about running your data on public clouds. And the price can get pretty high once you start loading a lot of data into it.

Now, Swift has some really unique features. A multi-region cluster is a single cluster that can span multiple data centers across states or even continents — America, Europe, Asia, South America, whatever you have — which means you truly can run a resilient infrastructure and have very high durability and availability of your data. Storage policies are just around the corner in Swift: a great new feature that lets you take advantage of different tiers of hardware or different replica counts, depending on what you need to do. Later this year, erasure coding will be added to Swift. A lot of companies are looking forward to that, because it's a great way of reducing the footprint of your data.

So, how does Swift work? The design goals are really to make it reliable, highly scalable, and hardware-proof. Reliable speaks for itself: like I said, it's very, very simple to deploy, and it has extremely high durability. A study — by Seagate, I think — a few months ago claimed that Swift would have twelve nines of durability, which is pretty amazing. Highly scalable: you only need to add machines or racks to the cluster and grow it out, as opposed to traditional storage, which tends to require a forklift upgrade when you need to move from one SAN or NAS to another. Hardware-proof: the nice thing about Swift is that you can run it on standard servers. Just select whichever hardware you like — if you happen to have a favorite vendor, buy the gear from them — deploy Swift on your standard Linux distribution, and off you go.

All right, so how do we actually address this? A typical URL looks like this.
So you have https://swift.example.com, then v1, the API version — it could be v2 — and then your account, container, and object. That's how you address every single object in the store. Very, very simple. And if you're a dev, you'll probably appreciate this, because it's incredibly easy to do over HTTP. What do you do if you want to put an object in? You write it with a PUT request. To read it, you do a GET. Very easy. I'll show a concrete sketch of those requests in a minute.

Okay, so now the Swift components. What do they do? We have a proxy server, an account server, a container server, and an object server, just like you saw in the URL before. Underneath we also have the rings, which are sort of the killer mechanism that makes it all work. The rings are built up of regions, zones, devices, and partitions, and that's how Swift knows where to place its data. When you're using Swift, the way it works is that a user sends a request in, it comes into the proxy, and the proxy forwards that request to the account, container, and object servers, which write the data or read it back out. Underneath, of course, you have the disks, and in this example we have three replicas, as you can see.

All right, so here's a write request. You have the user, you have the proxy, and you get three writes going down to three different nodes in the system and three different disks. Once you get two ACKs back from the system saying the object has been persisted on two disks — a quorum of two out of three — that gets sent back to the client, and that's considered a successful write. In most cases, all three writes happen at almost the same time and you'll get the third ACK anyway, but really all you need are those two. If you only get two and for whatever reason the third write doesn't land, the system has a built-in auditor and replicator — which I'll talk about later — that will automatically create that third copy for you. Read requests work much the same way, except you don't need to read three times: you just read from one of the nodes or disks that holds a copy.

So, durability and replicas. By default, and by our recommendation, Swift stores three replicas. It's a good middle-of-the-road setting: it gives you really high durability, and at a reasonable cost. You can change that, as I said, and it's determined by how you set up the ring. So how do we know the data is actually okay and hasn't suffered some bit rot? Swift uses MD5 checksums to verify every object, and because it has three copies of each object to look at, it can compare them to make sure the timestamps and everything match up. Now, some of you may be thinking: MD5 is a security thing, and it's been broken. We actually don't care, because this isn't a security use. All we do is compare the MD5 sums to make sure the file matches the object; the auditors and replicators just use the MD5 sum for comparison. If the auditor discovers that one of your objects doesn't match the other two, it will toss that object out, put it in a quarantine, create a fresh third copy, and replace the bad object with it. That way we keep the durability. All right — I kind of skipped ahead there, but that's, simply put, what's working in the background.
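As promised, here's what those requests look like on the wire. This is a minimal sketch: the hostname, account name, and token are made-up placeholders, and it assumes you've already obtained an auth token from whatever auth system is in front of the cluster.

    # write (PUT) one object, then read it back (GET)
    curl -i -T ./cloudcat.jpg -H "X-Auth-Token: $TOKEN" \
        https://swift.example.com/v1/AUTH_demo/cats/cloudcat.jpg
    curl -i -o ./cloudcat.jpg -H "X-Auth-Token: $TOKEN" \
        https://swift.example.com/v1/AUTH_demo/cats/cloudcat.jpg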
Now, you can use any kind of disk that you want. You can even run Swift with multiple disks of different sizes: one terabyte, two terabyte, three, four. They're all given a weight, which is relative to all the other disks in the system. In SwiftStack's case, we simply take a two-terabyte disk and give it a weight of 2,000; a three-terabyte disk gets 3,000.

Now, the nice thing is what Swift does with those disks: it creates partitions — Swift partitions — and lays them out per disk. These are not partitions in the sense you know them, like Linux disk partitions. They're Swift partitions, and they're really just directories on disk. If you go in, as I've done here, and do an ls on one of the disks under the objects directory, you'll see all these different partitions: just simple directories on disk. I'll sketch what that looks like in a second.

So we take those partitions and map them out over the cluster. With a disk weight of, say, 2,000, the partitions get evenly spread across the disks in your system. It kind of looks like this: in this example I just made up a node with eight disks and 16 partitions per disk, which would make 128 partitions in the cluster. That's an extremely low number — you'd never see that in reality — but more would have been way too many boxes on this slide to illustrate. You get the point. Now if I add a second node with equally weighted disks, like two-terabyte disks, I get something like this: eight partitions per disk per node, still 128 partitions in total.

So say you have two-terabyte disks in a rack, and down the road you find that four-terabyte disks are actually cheaper. What you can do is just start changing disks out in the system. As you replace the two-terabyte disks with four-terabyte ones, the partitions get reassigned based on the weight of the drive — a drive weighted 4,000 holds more partitions than one weighted 2,000 — which means that as data comes into the cluster, it gets evenly spread across the disks. It's sort of a smoothing-out effect. And with advances in technology, drive sizes going up, and costs going down, you can take the two-terabyte disks in a rack, swap them out for four-terabyte disks in the same hardware, and double your storage footprint.
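Here's the kind of thing you'd see with that ls. The mount path and partition numbers are illustrative — your numbers will differ — except 53180, which is the partition from the next slide.

    # each numbered directory under objects/ is one Swift partition
    ls /srv/node/d1/objects
    # 13071  2792  34210  53180  ...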
So these partitions I talked about — what do they do, and how does lookup work? Well, the ring I mentioned before is the magical thing that lets Swift find everything. The partitions are mapped kind of like an encyclopedia: if you're going to look up "Swift" in an encyclopedia, what do you do? You go A, B, C, D, and so on until you get to S, then you find Swift under S. Now, Swift uses hashes, not letters, so if you actually look up the hash for a particular object in the ring, it starts looking like this, and it gets sort of complicated — but it's very, very elegant in its design. Take a look at the object here. First, going back, that's a Unix timestamp for that particular object. The partition above it says 53180, and you'll see the same 53180 up here, right after "objects". Then you see the last three characters of the object hash, which come right after the partition, making the lookup really quick. Then you get the full hash, and inside that directory you find the file named with the object's timestamp. Swift just walks down that path: it takes the whole directory path, looks up the partition, and uses the hash to find the object. It makes it very simple — I'll show an illustrative path in a moment.

So we've got to build this ring, and the ring consists mainly of three values. First, the number of replicas: like I said before, usually three, and that can be changed. Second, the partition power, which is not a value you can change — it's static. So when you set out to build a cluster, you want to know roughly how big that cluster is going to get. Of course, that's sometimes impossible to tell, so you're better off going a little above what you think it will be. You don't want to say "I'm going to build a 10-petabyte cluster" and realize two years later that it's really headed for 300 petabytes. Choose carefully. The last value is min_part_hours. The default setting in most clusters — at least the ones we deploy — is 24 hours. Here's what it does: we have three copies of an object sitting in different partitions, and if we make a change to the ring, some of those objects are going to move somewhere in the cluster — off to a different disk, say. We can't move all the copies at the same time and just shuffle them around, because if a request comes in for that object mid-shuffle, the proxy won't know where to find it. So min_part_hours locks two of the partitions — and the objects in them — down so they can't move while another one is in flight. Once that one finds its destination, another one is allowed to move.

Another really cool thing about Swift is that it automatically places the data in locations that are as unique as possible. What that means: if you only have a single node with three disks, the copies will be placed on those three disks. If you grow your cluster to three nodes with multiple disks in each, you are guaranteed to have one copy on every node. Stepping further up, into zones: maybe you have three racks, and you assign each rack as a zone. Guess what — you're going to have one copy in each zone. So if you lose power to one of the racks, or its top-of-rack switch goes down, you can still access the copies in the other two zones. Likewise, a multi-region cluster does the same thing: with two different data centers, you are guaranteed to end up with one copy in each data center, and the second copy in one of them. This is all to make sure availability is really high, and that durability is there to protect the data. Say our headquarters are in California, and we have a lot of earthquakes; if a California data center floats out into the ocean, all that data would be lost, but if we had a region here in Atlanta, we would be guaranteed at least one copy here in Atlanta. And that is still just one cluster — it's not two clusters. That's an important notion.
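Putting those lookup pieces together, a single object on disk ends up at a path shaped like this. The hash and timestamp below are made up for illustration; 53180 is the partition from the slide.

    /srv/node/d1/objects/53180/f2a/6ad4...89df2a/1397908956.63849.data
    # device mount ... /srv/node/d1
    # partition ...... 53180
    # hash suffix .... f2a (the last three characters of the hash)
    # object hash .... 6ad4...89df2a (truncated here)
    # data file ...... named by the object's Unix timestamp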
All right, so hopefully you have kind of a sense for how Swift works at this point. Question. So the question is: can I add multiple proxies behind a load balancer? Absolutely. As a matter of fact, you should — you should never have fewer than two proxies. There are lots of different ways of deploying a Swift cluster. You can tier it into separate proxy machines, separate account/container machines, and separate object boxes, or you can collapse them and run all the services on every single box — what we call PACO: proxy, account, container, object. If you have ten PACO boxes, you have ten proxies on top of them.

Next question — I'm going to answer this really quickly and come back to it at the end of the presentation. The question was: what's the best practice if, let's say, you have a 100-petabyte cluster? Would you deploy it all PACO? Probably not, because from a price point perspective that would get kind of expensive. You can tier out your cluster in a smarter way and make it fit your use case really well. At that scale it's probably more of a financial, economical question than anything else.

The weight? So, if I heard you correctly: can I talk about the weight factor and how it interacts with disks of different speeds, different RPMs? Most clusters, to be honest, actually run disks of the same speed. But if objects end up on both a 5,400 RPM disk and a 7,200 RPM disk, for example, you're probably going to get data back faster from the 7,200 RPM disk, and because of how the proxy works, reads will tend to favor it. That's roughly how it would work.

All right, great. I do have a slide later on about tiering and breaking things out, and we'll also stay around to take questions afterwards. But in the interest of time, let's go ahead and do the lab — the manual, Swift command-line way. Take a look at your cheat sheet. On the left-hand side, you have a demo username and password, an IP address, and the SSH command to get into that machine. As most people have already gone into the machine — question. The password doesn't work for you? demo, password? Give him a new sheet. Anyone else having problems logging in? Give him a new sheet, too.

All right. Hugo is going to drive this part. When Hugo types here, he'll type slowly so you can actually follow along. There's a lot of typing, and you need to type it exactly like Hugo does. If you make typos, a few of you will probably end up with something that's not quite functional; if you really want to fix it up, we can deal with that afterwards. Is that large enough for everyone to see? All right, good. A little bit bigger? Yeah. So the question is: is there any difference between the Workshop 1 VM and the Workshop 2 VM? The answer is yes. The basics of Swift are already installed in Workshop 1, and if you try to do the Workshop 2 install on the Workshop 1 VM, you'll have a problem. You can get a new one if you want.

All right, so let's start this up. First of all, let's check the available disks inside your virtual machine. Please use blkid to list out all your disks.
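That step is simply the following (run as root; on these lab VMs the data disks show up as /dev/mapper devices):

    # list block devices with their labels, UUIDs, and filesystem types
    blkid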
All right, so you can see down there you have a couple of /dev/mapper disks. Those are virtual drives we've added to this machine in addition to the OS disk. Say that again? Yes — those are not actually mounted, and that's because we'll make Swift do that later.

So the first thing you're going to do here is make some directories. Those will live under /srv/node, and we'll make directories that match the disks — disk one through five. And, hang on — the question is: are we supposed to follow this? You don't have to, but if you want to, do it. You'll build the whole machine, and you'll understand how everything works under the covers if you follow along. It's the left sticker you want to use. Yes. No — this is very tedious typing. It is building a node by hand, creating everything you need to get a Swift node up and running with several disks — in this case five of them — manually.

I think the question down there was: how is this different from the instructions on the sheet? The first workshop, like I said at the beginning, is basically creating a Swift node by hand: creating all the directories you need, making each directory a mount point, mounting the disks onto them, then starting Swift manually, building the ring, and doing all the things that we'll later do via the sheet, which is sort of the SwiftStack way of doing it. The reason we're doing this is to give you a flavor for what's actually involved in creating a Swift node and building all that out; then we'll do it in a different way and show how you can scale out a cluster much more easily. Does that make sense to everyone? And, as you can tell, a fair amount of stuff goes into building a Swift node.

So let's see — where are you now? Hugo just tried to format all the disks we have here, and I want to make sure everyone is caught up. So what happened here: Hugo created the directories /srv/node/d1 through d5, then formatted the disks as XFS with an inode size of 512, and put a label on each disk — d1 and so on — to match the directories he created previously. And now he's gone through, what, two of them? Yes. See, it's very easy to mess this up. The instructions on the paper are not for this part; this part isn't printed out, but we can make these instructions available from the presentation. Unfortunately, we were hoping to have two screens here so we could show both, but we don't.

So d1, d3 — isn't that one? Next disk. The last one, d3. Next. All right — now that Hugo has shown how easy this is to get confused by, he's going back and fixing his mistake. We have five disks, labeled d1 to d5. Okay, so these disks now — if you look at the mount points and the devices, the labels on these devices now match up to the mount points we created earlier with the mkdir command. Go back here — can you go up to the previous command? See it? Yeah, it should be: this device is supposed to match d1 and that one d2. Yes, the second one was wrong; I already fixed it in the newer command. All right, so let's keep on going. So, can I move it forward? Yeah. Okay, thanks. If you don't all complete it, it's okay. We can work through this afterwards.
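To recap the disk-prep steps so far, here's a sketch. The /dev/mapper/disk1 through disk5 device names are an assumption for this lab VM — substitute whatever blkid showed you.

    # one mount-point directory per disk
    mkdir -p /srv/node/d1 /srv/node/d2 /srv/node/d3 /srv/node/d4 /srv/node/d5
    # format each disk as XFS with 512-byte inodes and a matching label
    for i in 1 2 3 4 5; do
        mkfs.xfs -f -i size=512 -L d$i /dev/mapper/disk$i
    done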
I just want to make sure that you get a flavor for what's involved in actually building this, and how much orchestration you need to make it happen. So don't feel bad if you can't follow it all. Try to keep up, but if we don't get to the end of this, no big deal, all right? We're here, we're not going anywhere afterwards, and we're happy to help.

So now we need to mount all the devices onto the proper directories. What Hugo's doing here is mounting each XFS filesystem — by its label, d1, d2, d3, and so on — onto the mount point we created earlier for that disk. Question? Okay, this is a great question: is it required to use XFS as the filesystem? Swift does require a filesystem with extended attributes. You could use something like ext4 if you wanted to, but XFS is a very proven, reliable filesystem. It works well, it's very fast, and it's sort of the go-to filesystem for Swift today. That's why we're using it; we consider it best practice. What are the alternate filesystems? ext4 is probably the other one you'd consider, but really, XFS is going to give you the best option. We can talk a little later about something called DiskFile, which is an abstraction layer in Swift — afterwards, if we have time. Question? Yes, we use them as raw block devices and put XFS on top — does that answer your question? Okay, and a question about LVM: it's kind of a long story. The reason we're doing it this way here is that otherwise we'd have been in this room all day adding tons of virtual drives to every one of these machines.

The last thing you do after this is make sure those /srv/node mount points are owned by the swift user, so that Swift can write to them. So what Hugo did here was change the ownership recursively on all those directories.

Now that we have the basics of the disks laid down on the system, we need to invoke the Swift ring builder. The ring builder takes the partition power we talked about earlier — Hugo is using a partition power of 14 here — a replica count of three, and, for this exercise, a min_part_hours of just one instead of 24, because it's a small node and we may want to play around with things; it'll be easier that way. Now, you have to build a ring for each of the account, container, and object servers. The account server — the account in Swift — keeps track of all the containers under the account, and the container keeps track of all the objects in the container, so you need a ring for each of them. Are you asking what the command is? It's swift-ring-builder — take a look at it afterwards.

Okay, so now that we've put in the values to create the rings, we also have to include the region, zone, node, and disks in those rings. What Hugo is doing right now is writing a small for loop, a few lines, to include all those devices in the ring. Once that completes, the ring is created with all the devices in it, and that's the artifact that then has to be put on every single node in the cluster. It's incredibly important that every node has the same ring. If the rings differ, you will end up having problems at some point.
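Put together, the mount, ownership, and ring-building steps look roughly like this. The ports are Swift's classic defaults (6002/6001/6000 for account/container/object), and the weight of 100 matches what we're defaulting to for these small virtual disks.

    # mount each labeled filesystem, then hand ownership to the swift user
    for i in 1 2 3 4 5; do
        mount -L d$i /srv/node/d$i
    done
    chown -R swift:swift /srv/node

    cd /etc/swift
    # create the three rings: partition power 14, 3 replicas, min_part_hours 1
    swift-ring-builder account.builder create 14 3 1
    swift-ring-builder container.builder create 14 3 1
    swift-ring-builder object.builder create 14 3 1
    # add each disk to each ring; omitting the r<region> prefix triggers
    # the "defaulting to region 1" warning discussed next
    for i in 1 2 3 4 5; do
        swift-ring-builder account.builder add z1-127.0.0.1:6002/d$i 100
        swift-ring-builder container.builder add z1-127.0.0.1:6001/d$i 100
        swift-ring-builder object.builder add z1-127.0.0.1:6000/d$i 100
    done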
"Defaulting to region one" — because we didn't specify a region, it just uses the default. In this command we did not include the region, so it defaults to region one, and that's why you're getting that warning. You don't have to add a region; it's okay.

Unfortunately, I can't switch back and forth, because I only have the one screen. But if you want to take a look at this again, we'll make it available — the whole PowerPoint will be uploaded right after the weekend, and we'll let you know exactly where. The question is how long the virtual machines will be available: let's keep them up until Monday. You can play around with them, mess around with things, reset them.

All right, so let's go ahead and run the ring builder and actually create the rings, based on all the input we've given, for each of account, container, and object. What you see here now is the devices included in each: the IP address — in this case localhost — the port, the disk name, and the weight, which is just the one you wanted to see. So, in this case, we're using these quasi-disks and just defaulting the weight to 100 for the sake of it. If these were real disks, our software would actually do this differently, based on the size of the disk it finds. So I guess the question is: you'd have to determine the weight of the disk yourself? If you do this by hand, you need some systematic way of laying it down properly, and in that case, yes, you would determine the weight yourself. You can do what we do, which is base it on the disk size — a 4-terabyte drive would be 4,000.

So the question is: if I want to replace drives, can I do two or three at a time, or do I have to do one a day? You don't have to wait; you can swap them out. We'll go through this in failure handling later, but Swift has ways of making sure that if you yank a disk, its data gets moved to another place.

Moving on. Which one have we done? We're doing the object ring; we created the ring and started to rebalance it. Yeah. So now we're going to rebalance these across the cluster. Right — so now you see that the balance is very close to zero, and that's exactly what we want. If you have an unbalanced cluster for whatever reason and you push out new rings, it will balance itself over time. What's that? Balanced basically means that the weights across the whole cluster are assigned in a way such that the cluster will spread the data evenly throughout, based on the weights assigned to the disks.

So we're getting pretty close to done here. What Hugo's doing now is listing out the rings he just created, and you can see that he has the account, container, and object rings, all gzipped, on this node. If we had more than one node — say we'd done this for ten nodes; we obviously wouldn't do this exercise on one of the nodes, we'd do it on a separate node or a separate management machine — then we would distribute these ring files out to the ten nodes. That's how we'd ensure that every single node has the same rings. And the last thing going on here is that Hugo is starting the Swift services.
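As a sketch, that rebalance-and-start sequence is:

    # rebalance assigns partitions to devices and writes the .ring.gz files
    swift-ring-builder account.builder rebalance
    swift-ring-builder container.builder rebalance
    swift-ring-builder object.builder rebalance
    ls /etc/swift/*.ring.gz   # these files must be identical on every node
    # start the proxy, account, container, and object servers
    swift-init main start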
And we're just going to take a look at the Swift log to see what's going on — it's outputting data, so we now have Swift actually running on the node. If you've been able to follow along so far, cd to /home/demo and you should see a cloud cat image, cloudcat.jpg, in that directory. You can now take that image and upload it into the cluster, using the Swift command-line client that's already on this node. The command is swift, then -U for the user — in this case the account that's been created is admin:admin — then -K for the key, which is the user's password, so to speak; that key is also admin. Then -A for the auth address — that's the URL we started out with earlier today, which here is http://127.0.0.1/auth/v1.0. So: swift -U admin:admin -K admin -A http://127.0.0.1/auth/v1.0 upload cats cloudcat.jpg — and upload is basically just a PUT request.

And... "container not found". It should have been created — if you do an upload and there is no cats container, it should be created automatically. Let's take a look at the log and see what's going on. Maybe I built a ring wrong. So, in the interest of time, I think what we've shown here is that building this by hand is actually quite complicated. Maybe everyone would like a better way to do it. Yeah, maybe you'd prefer to build this in a way that actually works, so we'll now move on to the better way. I can't hear you, sorry — in this case, it should be set up with /auth/v1. So someone got it working! Someone who knows how to type, compared to us. Thank you. Question: why did we change the permissions — who owns those directories? They're owned by swift, right? Anyhow, we got one lucky person who was actually able to do it. So let's move on, and we'll show you how we've implemented this at SwiftStack.

Go back to your handout now, and we'll do this the SwiftStack way and make it a lot easier. Take a look at the second sticker, on the right-hand side. In a web browser, go to try.swiftstack.com — you should have a WS-something value there — put in the password from that sticker, and log in to the controller. On the controller you'll have a curl command displayed; it should be curl https://try.swiftstack.com/install. Then log in via SSH to the new node we have prepared — just ssh to the IP address given on your sheet. Once you're in there, copy that curl command from the controller in your web browser and paste it right into your command line: curl https://try.swiftstack.com/install, piped to bash. What that does is grab an install script from the controller and execute it on the machine, and it installs Swift from the ground up on that machine immediately.
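Rendered as a command, that bootstrap step is just:

    # run on the fresh node; pulls the install script from the controller
    # and executes it (URL from the workshop handout)
    curl https://try.swiftstack.com/install | bash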
That should take about a minute or so. While it's happening: you'll get Swift, plus a few SwiftStack-specific daemons that keep track of what's going on on the node and talk back to the controller. And as you see there, a VPN is installed — the node will connect back to the controller over that VPN. So all that grubbing around we did before — the stuff we managed to so royally mess up — is baked in; the controller does all of it for you. It takes an inventory of the disks on the machine and creates the proper directories for you.

I don't know if you saw the screen — if you can flip back really quickly — take that claim URL, copy it, go back to the controller in your web browser, and paste it into the address bar at the top. When you do that, you should get a screen like this, and you say "claim this node". Once you do, the controller is going to try to establish contact with the node over that VPN — 30 seconds to a minute; in this case maybe it took 15. When the green box comes up, it means the node you just installed has successfully connected to the controller, and all you need to do at that point is claim it.

We don't have a cluster configured yet — we only have a node — so the first thing we need to do is create a new cluster. You can call it whatever you want; we're calling it "workshop", but it really doesn't matter. Once you've done that, go down to the network configuration screen. Because we only have one node here, we don't need a load balancer — we're not going to balance across any machines — so you can select "no load balancer". Take the IP address you just SSHed into and type it into the cluster API IP field. You don't need to set the cluster API hostname, because we don't have any DNS to map that IP address to. Below that, you have NTP. NTP is really important for Swift so that it can track time properly. Then expand the OpenStack Swift advanced options and take a look: you can change the number of replicas here if you want — I'd suggest keeping it at 3; that's the best way of doing it. At the bottom there's a table of partition powers with an estimate of the number of disks a given partition power would let you grow to, and the size that implies if you had 3-terabyte disks. Because we only have one machine, I would set the partition power to 10 if I were you — it's going to crank through faster on the controller side. Once you've done that, hit submit. Sorry — yes, set the partition power in both places to 10; we normally default to 16, but use 10 on both, then hit submit.

That takes you to a screen that says "enable node", and you can go ingest the node. What that does is take that long hostname and put the node underneath the cluster we just created. Did you hit ingest, was the question? It says "enable node". Did you install the node like we did before? Okay — you're not logged in, so you have to log in with that user for access. Log out and log back in if that helps. So — where are you, Hugo?
— just at the network interfaces. Okay. So once you have ingested your node, you'll see that it has three different interfaces. There's the outward-facing interface, where the proxy requests come in; a cluster-facing interface, where the proxies talk to the object store nodes; and a data replication interface, where all the replication between object nodes happens. If you only have one network, you can just leave these all the same — it doesn't matter. You can also split them out into separate VLANs and use that to control traffic across those VLANs. In this case we'll just leave it as one network and hit "reassign networks".

The next thing you want to do is add the drives. So now, the stuff we did before manually — the mkdir and so on — has all been done automatically in the background. There's one disk at the very bottom here that you should click, select "ignore", and hit "change", because we don't want to format that one — that's an artifact of how this demo works; it doesn't detect that drive properly. Once you've ignored it and hit change, go ahead and format all the remaining disks — for account, container, and object, all of them at the same time — and hit format. That should take a few seconds.

Now Hugo is just going back to the node to run a probe command we ship that basically lists all the disks for you, so you can see this maps to what we did earlier today, with all the mistyping problems. So Hugo has successfully added and formatted those disks, and now we're going to add them into the rings — the same steps we did by hand. We selected "add immediately", which takes the full weight of each disk and deploys it into the rings. If this were a new node being added to an existing cluster, we might not want to add the entire full weight of those drives immediately; we might want to fold them in slowly. If you had instead chosen "add gradually" — which I don't suggest you do now, because it would take too long — it would add 25 gigs of weight to those disks every single hour. That basically means we fill up those disks slowly, so we're not overloading the system and moving lots and lots of data around the back end. I'll sketch the manual equivalent of that in a second.

So the question is about the right-hand side — that's for ring operations. The two columns you have are account/container ring operations and object ring operations, and the reason for having two is that on large clusters with tons of activity — lots of containers, and lots of objects in those containers — it's really beneficial to run the account and container services on SSDs. Accounts and containers rely on SQLite databases that everything gets rolled up into, so when a request comes in, it goes into those databases to look things up. If you do that on spinning disks and you have thousands of users, those lookups can be slow; put them on SSDs and they're fast and speedy. In this case there's no need, because we're a small setup, but that's why they're split out like that. All right, so we should be good, and we should be able to enable that node now.
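For reference, the manual equivalent of "add gradually" is just stepping a disk's weight up across successive rebalances. A hedged sketch, assuming a hypothetical new disk d6 on this same node:

    # start the new disk at a fraction of its final weight...
    swift-ring-builder object.builder add z1-127.0.0.1:6000/d6 25
    swift-ring-builder object.builder rebalance
    # ...then raise it step by step (e.g. hourly) until full weight;
    # min_part_hours limits how often a rebalance can move partitions
    swift-ring-builder object.builder set_weight z1-127.0.0.1:6000/d6 50
    swift-ring-builder object.builder rebalance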
So it says "click here to deploy". Before we do, the last thing we actually need is a user, because we need a user to be able to do anything in the cluster. Go in here and create one — call it whatever you want. It says "user1" now — use "user1" instead of "user" — and set some password on it. That creates the new user. When that's done, go back to "deploy changes" and hit "deploy to cluster". The ring-builder commands and all that stuff we did earlier are now being run for you: all the rings are being generated and pushed out to the cluster — in this case just the one node, but it doesn't matter. If we'd done this across a hundred nodes to stand up the cluster for the first time, the rings would now be deployed to all hundred. So basically, that's what we've done as a company: made it simple to deploy a Swift cluster. And I, at least, appreciate that a lot, because clearly doing it by hand was not so successful.

I think the question here is: how do you use this with Glance or Nova or other OpenStack components? That's a great question. Say you have Nova running, you're relying on Keystone, and you have a Glance repository: you can point Glance at Swift to store your images, and you can use Keystone as the authentication in front of it.

That's actually the next slide I have. Swift comes with something called TempAuth, which, as you can imagine, means temporary auth — it's not really meant to be a production auth system. Keystone works great for certain things, but it's not great for everything, and a lot of our customers want an LDAP directory, or something simpler. So we built something called SwiftStack Auth, which functions exactly the same way as TempAuth, except it doesn't have TempAuth's drawbacks. TempAuth, for example, needs the proxy restarted every time you add a new user; with SwiftStack Auth you don't have that problem. TempAuth has clear-text passwords — that's a bad thing, so we don't do that. We also do a bunch of other things to make it a distributed auth system across the cluster, which is really helpful when you're trying to run HA kinds of operations.
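For context, here's roughly what stock TempAuth configuration looks like in /etc/swift/proxy-server.conf. This is upstream Swift's TempAuth, not SwiftStack Auth, and the admin:admin user matches the one from the first lab:

    [filter:tempauth]
    use = egg:swift#tempauth
    # format: user_<account>_<user> = <key> [group [group ...]]
    user_admin_admin = admin .admin .reseller_admin
    # note the clear-text key, and that adding a user means editing this
    # file and restarting the proxy -- the drawbacks mentioned above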
Question. So the question is: am I saying that Swift does not come with a production-ready auth system? That is correct. It's a piece any deployer needs to be concerned with.

Can you take any node and add it into the cluster? You'd basically need to reinstall Swift on it, but yes, you could take any node and add it in. If we just had another node, you could add it — you can go home and literally stand up another node in VirtualBox or whatever and add it, using the other sticker. So, this job finished — it's all there, it's working. Do you want to show the console? This console is a web interface we built on top of Swift, so you can see what it's actually doing. Hugo already uploaded the cloud cat image to the cluster, and you can see how it now shows up in the interface under the photos container. You can drag and drop stuff into the cluster using this — and there's the image he uploaded.

Question: what is the best way to back up the data? You don't really need to back up data in Swift, because you already have three copies of it. You have a distributed system, you have three copies, and you have consistency checks running in the system at all times. If you happen to have one bad object, it will be recreated from a good copy anyway, so it's not like you need a backup system for a Swift cluster.

No — so the question is: can we deploy Swift on ZFS? There's no reason to do so. As you've noted, we are not using any RAID whatsoever, and you actually don't want to use RAID. The problem is that Swift looks at objects on disk, and if you abstract the disks behind a RAID system so that Swift cannot see what's underneath, then when that RAID goes bad, Swift doesn't know it has gone bad. If a second disk dies in your RAID 6, or a ZFS RAID-Z2 or something, Swift has no knowledge of those disks going bad — so when the last disk goes and your parity is gone, so are your objects. You really don't want RAID at all in a Swift system. But ZFS itself is just a filesystem, right — hang on, I'll come back to that.

So the question is: what if you want to roll back a change — something like a snapshot? Swift has a versioning component. You can enable versioning per container, and once it's set on a container, every time you upload a new cloud cat image that's different, it creates a new version of the old one in a sort of shadow container that you set up. I'll sketch how to set that up in a moment.

The question over there in the back — so, now that we have this, you can see in the interface that we have lots of different stats: cluster proxy throughput, timing, account requests by status code, and you can drill down and make full-screen graphs of them, so you can see what's going on. Can you see the number of objects? Under Swift usage here we have utilization by account and the top accounts. As a company, we've created a utilization API where we actually track usage per account — how many objects, their sizes, how much network in and out users or accounts have been using — and you can query the controller for all of that.

So the question is: do you have to use the cloud service, or can you put the controller on-premises? You can do whatever you want — we support both, and we'll help you deploy the controller on-site.
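Picking up the versioning answer, here's a sketch of how that's set up using the X-Versions-Location header, the mechanism Swift shipped at the time. The container names are made up, and the auth values are the ones from the first lab:

    # create a shadow container to hold old versions, then point
    # "photos" at it; every re-upload of an object shadows the old copy
    swift -A http://127.0.0.1/auth/v1.0 -U admin:admin -K admin \
        post photos_versions
    swift -A http://127.0.0.1/auth/v1.0 -U admin:admin -K admin \
        post -H "X-Versions-Location: photos_versions" photos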
Next question: what is the licensing for the controller? I don't know if I actually said this, but the Swift we deploy on each node is the same bits you get directly from GitHub if you download it. The way we license is by usable capacity: if you have 3 petabytes of raw storage and use 3 replicas, that means you have 1 petabyte usable, and we charge for the usable petabyte under management. You can certainly talk to one of our sales guys here — I'm not in sales.

For an account, can you do it by container — you mean you want to list objects, from the CLI or the UI? You can do it from both. In the web console we showed before there was just one object in one container, but if you put in a hundred, there would be a hundred, and you could see all of them. The web console is something we've added; the Swift command line you just download from GitHub — actually, the python-swiftclient is standard open source, and it just uses the Swift API to get the data out of the cluster. It's basically: swift, this cluster, here's my username and password, stat or list — list with a container name if you want to list what's inside it. I'll show a quick sketch of that in a second.

You had a question? No — the question is: what do we use to deploy? It's not rsync, and it's not Puppet or Chef — those are too heavy-handed for what we need to do. So we've built our own very lightweight deploy mechanism. Every node connects up to the controller over a management channel: it creates a VPN back to the controller, secured, and inside that VPN the controller talks to each node. Yes, it's always on, and the nodes are checking in — every 30 seconds each node sends its vitals back to the controller, which is where all these graphs, and more, come from. On a normal deployment we get about 5,000 metrics per 30 seconds from every node.

A good question: if people have built this from the open-source bits themselves and kind of hacked it up, can we take that and just import it into a SwiftStack cluster? Theoretically, the answer is yes. It's complicated enough to do that it's a lot easier to build it from scratch — we don't have a simple "import this cluster" button yet. We've actually never had anyone get into the position of needing it, so we haven't built it. Purely from a theoretical perspective it's definitely possible, but it's not a piece of functionality in the core product today.

Yep — so the question is: if something gets moved to the quarantine, do we have to manually manage that quarantine and delete stuff out of it? It doesn't get deleted automatically — we don't just flush it, and Swift doesn't go back into that quarantine. Normally the quarantine isn't going to overflow with stuff; if it does, you have a bigger problem on that node or that disk anyway. You could surely go in there and delete everything out of it, but it should be a pretty small amount of data.
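That CLI listing, as a sketch with the standard python-swiftclient and the lab credentials from earlier:

    # list all containers in the account, then the objects in one container
    swift -A http://127.0.0.1/auth/v1.0 -U admin:admin -K admin list
    swift -A http://127.0.0.1/auth/v1.0 -U admin:admin -K admin list photos
    # "stat" shows account, container, or object metadata
    swift -A http://127.0.0.1/auth/v1.0 -U admin:admin -K admin stat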
All right, what else do we want to cover — what time is it now? We're kind of running out of time, but we'll stay around here. If you have questions, feel free to ask, come up and talk, and if you want to troubleshoot anything that didn't work on your machine, we're happy to help. Where can you get the slides? We will put the slides up — they'll be on the main site, but that will probably take a while. Hugo, just type it in a notepad or something: we'll upload them right after this to files.swiftstack.com, as Swift-Atlanta-2014.pdf. We'll just put it up there, and people can write that down. The presentation will go up on the main site as well, but it usually takes a week or so.

Do we have any plans to put out a mobile device client? We're not working on that ourselves, but there are plenty of clients already out there that can talk to Swift, so you're not at a loss there.

So, a question: do the disks need to be the same size? No, they can all be different sizes; we don't care. And is there a rebalancing if you push out a new ring with new capacity on it? Yes — a rebalancing happens in the cluster as you do that, and it spreads the data smoothly over the cluster, automatically, based on the weights of the disks.

In terms of the three copies — the account and container databases are also replicated three times, using the exact same algorithm, deployed and used the same way. Generally speaking, container databases are much smaller than objects, so if you follow the best practice of putting account and container on SSDs, a 100 GB SSD per account/container node is usually plenty — and 200 GB is maybe the smallest you can even buy today.

Did you have a question? Yeah, right — so the question, for those who didn't hear, is that ZFS is really just a filesystem, so can you use it the way I described earlier? You're 100% right, and there are things in ZFS that are truly valuable to take advantage of. But Swift on ZFS has not been tested very much. I know there are companies that have wanted to do it, but there's really no one in the community or among the core devs who thinks it's valuable enough to spend all the time it would take to actually do it and test it versus XFS, for example. If you're interested in doing that, I'd definitely recommend, first of all, going into the Swift IRC channel and talking to the core devs — they're on it all the time — and, of course, bring your agenda. The reality is that if something is truly useful for a lot of people and you can convince the community at large of that, then it will typically happen. That's how the community decides where the next feature or benefit of Swift is and where to take it.

I mean, people have said that Btrfs is going to be production-ready, and I'm not sure it's gotten there yet. Every time a new version of an OS comes out, they say it's going to be the default, and there's one OS I'm aware of that has it as the default right now. Maybe with time, who knows.

Yep — yes? So: bringing up a second region when you're 75% full on a cluster, and you have one region in the US and you're putting one in the UK or something — is that possible to do?
Absolutely. The reality is, if you have a really large cluster — say seven petabytes that you need to move across the Atlantic Ocean — the pipe is going to be slow. Replicating those seven petabytes inside your existing data center, shutting down the machines, putting them on a boat to the UK, turning them back on over there, and letting replication just sync the diff of what happened in between is probably faster, depending on how much bandwidth you have. Moving that much data across long distances without a lot of bandwidth isn't exactly flaky, but it's going to take a long, long time. But yes, it's definitely doable, and we've done it — though not with large amounts of data. When people have built regions like that, they've really stood up the two halves of the cluster next to each other, shut one half down, and trucked it — in the US, because that's faster.