Building applications with Swift. How many of you know how Swift works? Awesome. Great. Okay, so I can go really fast over some of this, which is good. For those of you who don't, I always want to start by setting expectations so that nobody is surprised. Swift is a distributed object storage system. The basic concept, which I think is the thing that makes it fundamentally new (it wasn't invented by Swift, but it's characteristic of this kind of storage system), is that you separate the data you're storing from the actual media you're storing it on. They're completely independent. So if you need to swap the media out, maybe because it broke or maybe because you have some next-generation technology, you can do that without having to worry about the data itself. The data could be pictures, documents, videos, images, backups, that sort of thing: unstructured data that can grow without bound. That's what Swift is really, really great at. So the point is, I can now store my backups and my videos and my user-generated content from mobile phones and all that kind of stuff, and that's all the application developer has to worry about. Over on the ops side, all they have to worry about is making sure that this cluster of servers and the drives plugged into it are working. If you need to swap them out for different sorts of drives or different technologies over time, or if something fails, or you need to do upgrades, you can do those things independently. So that's the difference. Swift is the system in the middle that disaggregates your data from your media. You no longer have to say, here is my data, I dropped it in the lake, so I can't read my data anymore. So that's why we're here. That's the cool thing about Swift.
From the application perspective, you just treat it like a utility, just like electricity. Utility computing, grid computing, cloud computing, this whole thing: the idea is that Swift is going to offload the hard problems of storage from the application developers. So what are the hard problems of storage that you want to offload? I don't want to have to deal with failures. I don't want to have to deal with where my data is. I don't want to have to deal with data placement. I don't want to have to deal with locking for concurrency. I don't want to have to deal with really chatty protocols that aren't good over a long distance. Things like CIFS and NFS, you don't want to run those across a continent. You don't want to have to deal with iSCSI. So what's the story? Where did we start and why are we here? The story is that you wrote an application and you need to store something. You've got a hard drive, so you put it on the hard drive. Then your hard drive filled up and you thought, what am I going to do now? Let's buy a new hard drive. Now you've got two hard drives, and you've got to remember: is it on the first hard drive or the second? So you write it down someplace, and when you go to read your data you look up where you wrote it down and figure out which hard drive it's on. Then you run out of space again, you buy another hard drive, and you continue the process. Eventually one of these hard drives fails, and then you're in a world of hurt, and you think, I know, I'll buy a RAID card. And then you basically haven't solved any of your problems, because your RAID volume fills up, and when something fails it's slow to rebuild. You haven't solved your problem, you've just made it bigger.
And so the difference is that when you have a storage system that is able to separate out those pieces, you can offload all of that to the ops guys who are sitting someplace else. You, writing the application (your mobile phone app, your web app, your internal IT department app that's doing document management, or whatever the case may be), now only have to think about it as: here are some bytes, please deal with them. And then later on: I'd like some bytes back, please, and you get the bytes back. That's it. It's storage as a utility. You don't have to think about what's going on in the background: how things are being durably stored, capacity management, the bookkeeping problem of writing down where all the data is stored, and things like that. That's what Swift is able to offload from the application developer, which means the application developer can focus on the thing that's actually important, namely making an awesome application. You don't hear about mobile phone apps that say, you know what, we sold for $19 billion to Facebook, and you know what our key technology is? We're really good at figuring out how to do storage on phones. That's not what they do. What they say is, I'm going to completely not deal with that. I'm going to offload it to some object storage system, and now I can focus on making the application awesome. Because the application, from the perspective of whatever company we're working in, is what's driving the value. So my job, writing Swift, is to make sure that you as application developers are able to make your applications awesome. Storage is the utility; we offload those hard problems from you. So let's do a live demo. These are always fun. Here's what we're going to do. I have a Swift all-in-one running locally in a VM on my laptop, and I have a set of scripts. I'll put the link up in a little bit when I'm done here.
They're in my Swift examples folder. So let's look at what we have here. I've got 10 little scripts I want to go through that show you some examples of using the API. We're going to walk through them: first let's just talk to the Swift cluster, and then we're going to do something that I think is kind of cool, putting some pieces together. I want to show you what this looks like to start with, and that's going to have to get bigger. Excuse me, just a moment. Again, I didn't have time to... We're in France, let's do Eiffel. And then I'm going to make that a lot bigger. That key combination, not that one. Full screen is Control-Command-F, of course. Okay, so here's my very first one. The Swift API is HTTP. It's a REST-based API, so it's standard HTTP verbs and standard HTTP response codes, which means we can talk to it with curl. That's what I'm going to do. Here's an example: let's get some info about the cluster. What kind of capacity, what kind of capabilities does this cluster have? I'm making a GET request to the /info endpoint. We can cat out 01 and you can see that's exactly what's going to happen. And I'm going to run 01. Okay, so I ran curl. Oh, haha. This is what I needed to do: let's turn on Swift. Seriously, there we go. Good. Again, I apologize, because I was running over from the other talk. Okay, so I reset my Swift. I'm going to start main. Okay, now I have Swift running. Great. Let's go back here and try that again. Ta-da, there we go. Now, if we look at this, we get a JSON object back that tells us what features we have available. What this means is that you as an application developer can query any Swift cluster and ask, is this feature supported? And if so, are there particular constraints on it?
So I can tailor my application: write the application once and actually talk to multiple Swift clusters, which is very useful. A couple of things that may be interesting: here's some rate limiting that is turned on for this particular cluster. We support the static large objects feature. And here's the... where is it? The max file size. Here's the maximum object size I can have on this particular cluster. It looks like I set this one to two gigs. So you can query what's going on in the Swift cluster, you can parse this programmatically, and there's documentation on how it works. So now you have the information. What do you need to do next? We need to do an auth request, because we need to get a token before we can make a request to Swift. So let's do that. In this case, we did the curl request and I gave it a username and password. This is using the old v1 auth API. Of course there's v2 and now v3. Swift definitely supports those and integrates with Keystone. In this case, on my Swift all-in-one, which is very lightweight, we have the v1 tempauth system. Now, there are a few things I want to point out here. Whoops, there we go. Notice you get back your standard 200 OK response, and I give back a couple of headers. The X-Auth-Token header is what I'm going to pass in on my subsequent requests, and I'm going to send those requests to this particular Swift account. So, next up: to that account, I'm going to make a request using that auth token and that storage URL. The capital -I says do a HEAD request and don't wait for a body on the response. What you get back is a 204 No Content. Notice there's nothing inside of my account at this point, and I get a couple of interesting things back from Swift. One of them I'd like to point out is the X-Trans-Id. This is a transaction ID that's returned with every Swift response.
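The capability check described for the /info response can be sketched in a few lines of Python. This is only an illustration: `SAMPLE_INFO` is a made-up, trimmed payload (a real cluster returns more sections and its own values), and the helper names `supports`, `core_limit`, and `fits_in_one_object` are hypothetical.

```python
import json

# Hypothetical, trimmed /info payload. The 2 GiB max_file_size mirrors the
# value mentioned in the demo; everything else is illustrative.
SAMPLE_INFO = json.loads("""
{
  "swift": {"max_file_size": 2147483648, "container_listing_limit": 10000},
  "slo": {},
  "ratelimit": {}
}
""")


def supports(info, feature):
    """Return True if the cluster advertises a section for this feature."""
    return feature in info


def core_limit(info, name, default=None):
    """Read a core constraint from the "swift" section, if present."""
    return info.get("swift", {}).get(name, default)


def fits_in_one_object(info, size_bytes):
    """Decide whether a payload can be stored as a single object."""
    limit = core_limit(info, "max_file_size")
    return limit is None or size_bytes <= limit
```

An application could run checks like these once at startup and, for example, fall back to segmented uploads when `fits_in_one_object` returns False.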
A little hint on that: if you've got a bunch of layers between you and your Swift cluster and you don't get back an X-Trans-Id, the response didn't come from Swift. So that eliminates one area to look at when you're debugging and points you at things like your caching layers. So now I've talked to my cluster. Let's create some things. Inside of Swift, I can create containers. I'm given an account. You saw that: I authed and I got back an account endpoint. Inside of an account, I can create a container. A container contains two things. One is a listing of all the objects inside of it. Two is some metadata: some aggregated stats about that particular container, and you can also set user-defined metadata there. So you can record that this container was created by Bob on Tuesday, or something like that. In this case, I created a container. You can see I did the -X PUT, which sets the HTTP verb to PUT, I gave it the token, and I created a container called C, because it's short. I got back a 201 Created and another transaction ID. Fantastic. Now, it's not too interesting until I actually put stuff into my container, so let's put some objects into the container. Let's see what we did here. I'm going to create 16 objects in this container, and in this case my object names have a slash in them. I didn't print that out, so let me just show you the file. That was 05, I believe. Here we go. This is exactly what I did. I used the token, and you can see here container C, and inside of that I've got object names 01 through 08 prefixed with A/. So this object name is A/08. That is, the slash is in the object name. I also created eight more that are prefixed with B/. I'm going to play with that distinction in just a little bit. So at this point, I have put in 16 objects, and inside of each one of these objects I've put one byte.
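As a rough sketch of the container and object PUTs just described, the requests can be built (without sending them) using only the standard library. The storage URL and token are placeholders you would get from the auth step, the helper names are hypothetical, and the one-byte bodies (digits for A/, letters for B/) are inferred from the output read back later in the demo.

```python
from urllib.parse import quote
from urllib.request import Request


def put_container(storage_url, token, container):
    """Build a PUT that creates a container under the account."""
    return Request(f"{storage_url}/{quote(container)}", method="PUT",
                   headers={"X-Auth-Token": token})


def put_object(storage_url, token, container, name, body):
    """Build a PUT that stores body as an object.

    A slash in name is simply part of the object name; quote() leaves it as-is.
    """
    return Request(f"{storage_url}/{quote(container)}/{quote(name)}",
                   data=body, method="PUT",
                   headers={"X-Auth-Token": token})


def demo_requests(storage_url, token):
    """One container plus 16 one-byte objects: A/01..A/08 and B/01..B/08."""
    requests = [put_container(storage_url, token, "C")]
    for prefix, contents in (("A", "12345678"), ("B", "ABCDEFGH")):
        for i, char in enumerate(contents, start=1):
            requests.append(put_object(storage_url, token, "C",
                                       f"{prefix}/{i:02d}", char.encode()))
    return requests
```

Each returned `Request` could be passed to `urllib.request.urlopen` against a running cluster; a successful create would be expected to come back as 201 Created, as in the demo.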
So very, very tiny objects, but it gives us enough to play with, and we have a little bit of structure across the names. So now, let's do some container listings. I'm going to cat that one out first. Okay, container listings. Here's what we're going to do next. I'm going to show you a few different things. First, we're going to do a plain, simple container listing that basically asks the container what's in it. We do a GET request to the container C. That's it. And now we can see everything that's in my container. Now, if you looked back at that info request at the very beginning, there was one entry for the container listing limit. Basically, we return listings in pages, and you can know that the page size is going to be that value. In this case, it's 10,000. Our 16 objects comfortably fit inside of that, so I'm only going to have to do one request. Next up, I'm going to do a marker request. A marker says: given the list of objects inside of my container, start here and give me everything after it. In this case, the marker is going to be B/. Remember, I didn't have an object called B/. That was just part of a name. So it should give us just B/01 through B/08. I think that's what it was. Then I'm going to do a prefix of A/, which matches anything that starts with A/, and we'll see what that does. Then we'll have an end marker, which is the converse of a marker: start at the top and stop when you get to this one. So that should return just the As. Then I'm going to give it a limit: give me a page size of 5 for this particular request. And lastly, I'm going to combine some of these. I'm going to ask for one particular object, with a limit of 1, and I'll ask for it in JSON format, and you'll see the little difference there. So that being said, let's do it. Absolutely. They're on GitHub.
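To make the listing parameters concrete, here is a small local simulation of how marker, end_marker, prefix, and limit narrow a sorted listing, plus a helper that builds the corresponding query string. It is a sketch of the behavior described in the talk, not Swift's implementation; `list_container`, `listing_url`, and `NAMES` are hypothetical names.

```python
from urllib.parse import urlencode

# The 16 object names from the demo: A/01..A/08 and B/01..B/08.
NAMES = [f"{p}/{i:02d}" for p in "AB" for i in range(1, 9)]


def list_container(names, marker=None, end_marker=None, prefix=None,
                   limit=10000):
    """Simulate a container listing over a collection of object names."""
    page = []
    for name in sorted(names):
        if marker is not None and name <= marker:
            continue            # marker is exclusive: start after it
        if end_marker is not None and name >= end_marker:
            break               # stop once we reach the end marker
        if prefix is not None and not name.startswith(prefix):
            continue            # keep only names with this prefix
        page.append(name)
        if len(page) >= limit:
            break               # page is full
    return page


def listing_url(storage_url, container, **params):
    """Build the GET URL for a listing with the given query parameters."""
    query = urlencode(params)
    return f"{storage_url}/{container}" + (f"?{query}" if query else "")
```

To page through a large container, a client would typically repeat the request with `marker` set to the last name of the previous page until an empty page comes back.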
So let's scroll up and you'll see what in the world happened there. Oh, fun little command-line things. Sorry about that. You can see that I did a plain and simple listing right here. I get my headers, and then I got back all 16 of my objects. Everything good? Now I'm going to give it a marker request. Remember the marker B/ right there? Now I only get back B/01 through B/08, only 8. It started at B/, which obviously doesn't exist, and returned anything that sorted after that. The opposite here, the prefix A/: I only get back 8. Again, it stopped when the prefix did not match anymore. So you can easily use these things to page through your listings and figure out what's in your cluster. End marker: again, I'm stopping at B/, so I only got the As. In this case, I gave it a limit of 5, so I only get back 5, even though I would normally have gotten back the full 16. And then finally, JSON format: I get one object back, limit of 1, but I ask for JSON. JSON gives you a little more information than just the object name. Obviously you get the object name, but you also get the hash, which is the MD5 sum of the content of the object. You get back the content type. You get back how big it is, one byte. And you also get back when it was last updated. So you can ask for all of the things that start with A and then build interesting dashboard-style information from that, or do other things with it. So, that was 6. Let's see what 7 does. We have 10 of these. Okay, dynamic large objects. Remember, we still have 16 things. Dynamic large objects are based on container listings. A dynamic large object is a zero-byte object inside of Swift that has a special header on it. In this case, the header is called X-Object-Manifest.
And you can see, it's wrapping a little bit here, this header, -H X-Object-Manifest, whose value starts with the container name and then A/. In this case, I'm going to create that zero-byte object. Now, anytime I read that object, Swift is going to internally do a container listing and return the concatenated contents of anything that matches that prefix. Remember, those are one-byte objects, and the As are 1 through 8, so we should get back an 8-byte object when we request this. This is the fun part about live demos. Okay, so, 07. There we go, it worked. We created a new container, C2. Inside of that container, I created an object called dynamic large and added the X-Object-Manifest header to it. You can see that was created just fine. Now I'm going to go read it: I go to C2, dynamic large, and I get back a content length of 8. And here are the contents right there: 1, 2, 3, 4, 5, 6, 7, 8. So everything worked. That's the way you can tie things together. If you have a dynamically changing set of data that needs to be referred to by one particular name, you can refer to it that way and not have to update the manifest object every time the data changes. I've got a really cool use case for that, and I'm going to come to it in just a minute; let me see what my time's like. So now, 8: static large objects. With a static large object, instead of dynamically determining what is returned, you explicitly refer to very specific objects and say, this object is made up of the contents of this object, followed by the contents of this one, followed by this one, et cetera. In this particular case, you can see that I'm referring to 1, 3, 5, 7 and 1, 3, 5, 7 in the A and B prefixed names. It just shows that I'm skipping over things and they're going to be aggregated together.
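The dynamic large object read can be pictured with a small in-memory model: the manifest stores only a container/prefix value, and a read concatenates whatever currently matches it. This is a simplified sketch of the behavior described, not Swift's code; the `store` dictionary and both function names are hypothetical.

```python
def dlo_put_headers(token, container, prefix):
    """Headers for the zero-byte PUT that creates a dynamic manifest."""
    return {
        "X-Auth-Token": token,
        "X-Object-Manifest": f"{container}/{prefix}",
        "Content-Length": "0",
    }


def read_dlo(store, manifest_value):
    """Concatenate every object whose name matches the manifest prefix.

    store maps container name -> {object name: bytes}. The segments are
    looked up at read time, in sorted name order, so later uploads show up
    without touching the manifest.
    """
    container, _, prefix = manifest_value.partition("/")
    objects = store.get(container, {})
    return b"".join(objects[name] for name in sorted(objects)
                    if name.startswith(prefix))
```

With the 16 demo objects in container C, `read_dlo(store, "C/A/")` yields the eight A bytes in order, and adding an A/09 later changes the result without rewriting the manifest.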
The contents of the static manifest are a JSON description of the particular objects, each referred to by its ETag, the hash of its content, so that you know it's exactly this one. If somebody else overwrites a segment, the manifest will not return that content anymore. So, that is 8, static large objects. In this case, we created it. I just recreated that container, I created my static large object, and I got back 1, 3, 5, 7, A, C, E, G. So it worked exactly like we expected it to, and now I can tie specific things together. This is a really great feature. Say I've got a 20-gigabyte video. Let's chop it into 1-gigabyte chunks or 100-megabyte chunks, and I know exactly what it's going to look like. Let's create this static manifest and tie them together. Now, if I wanted to later on, I could concurrently download all twenty of those 1-gigabyte chunks and do transcoding and things like that and get aggregate throughput, and that's really good. Or I can just have this one URL that says hilarious cat video and refers to this one 20-gigabyte object. Must be really funny cats. When it's downloaded to the client, the end user will never know that it's actually chunked up into different pieces. So that's a really great way to tie together these large objects. Now, two more things I want to show you. What is it, two more? TempURLs are cool. A TempURL allows me to create a signed URL for a particular object. Say Clay writes an awesome Swift application, as he is accustomed to do, and I don't have credentials on his Swift cluster. He can give me a URL that's only good for a certain amount of time and for a certain set of verbs. So you can say, with this particular URL, you can read this object for the next 30 seconds. And that's what you do. To create this, I'm using the swift-temp-url command that is included with the Swift source code.
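A static large object manifest like the one described can be generated from the segments themselves. The sketch below assumes the manifest is a JSON list of entries with `path`, `etag`, and `size_bytes` keys, uploaded with a `multipart-manifest=put` query parameter; the helper names and the example object name are hypothetical.

```python
import hashlib
import json


def slo_manifest(segments):
    """Build the JSON manifest body for a static large object.

    segments is an ordered list of (path, data) pairs, where path looks like
    "/container/object" and data is the segment's bytes. Each entry pins the
    segment by the MD5 of its content (its ETag) and its size.
    """
    entries = [
        {
            "path": path,
            "etag": hashlib.md5(data).hexdigest(),
            "size_bytes": len(data),
        }
        for path, data in segments
    ]
    return json.dumps(entries)


def slo_put_url(storage_url, container, name):
    """URL for the PUT that uploads the manifest itself."""
    return f"{storage_url}/{container}/{name}?multipart-manifest=put"
```

For the demo's pattern, `segments` would list A/01, A/03, A/05, A/07 followed by B/01, B/03, B/05, B/07, which matches the 1, 3, 5, 7, A, C, E, G read back in the talk.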
It's just a little helper thing. It's all based on an... Oh, yeah, it's in the Swift client now, too. It is based on an HMAC-SHA1, which means that it's locally computable. You don't have to make network requests to generate these, which means they're super, super fast. So we can do this, TempURL, and we're adding a shared secret to the account metadata. I'm posting it into the account's Temp-URL-Key, and it's "secret". And then I create a... I didn't actually print that, but you can see up here I was saying this GET request is going to be good for 30 seconds on this particular path, using this shared secret. I think it might actually be more than 30 seconds, but... So, this should fail. Escape my ampersand. Unauthorized already, but... I'm going to do it again, and this time I'm going to do it fast. Go, go, go, go. See, it worked. 200 OK. So you can see it's a time-limited thing, and it works pretty well. Now, here's my last one, and it's my favorite. So, 10: bulk uploads and manifests. This is where you start putting stuff together. It's really cool. One of the neat things about Swift is that we support bulk operations. You can delete a lot of stuff inside the cluster at once. Great. But I don't want you to delete your Swift stuff. I want you to add stuff to Swift. So you can also put multiple files into Swift at one time, and this is really cool. You can take a tar file, or even a .tgz file, and upload that to Swift, and it'll explode it into all of its constituent objects and store those independently. That's kind of cool, because you can say, here's my server, pipe it through tar, and send that to Swift. So what I did is I created a directory called sample logs. Inside of that is 2012, and I have months 1 through 12. Inside of each month, I have five log files, and each log file has one line in it.
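Because the signature is a plain HMAC-SHA1, a TempURL can be computed locally with the standard library. This is a minimal sketch assuming the signed string is the method, the expiry timestamp, and the object path joined by newlines, with the result carried in `temp_url_sig` and `temp_url_expires` query parameters; the `temp_url` function name and its `now` parameter (added to make the output reproducible) are hypothetical.

```python
import hmac
import time
from hashlib import sha1


def temp_url(path, key, method="GET", ttl=30, now=None):
    """Return path plus a signed query string valid for ttl seconds.

    path is the object path, e.g. "/v1/AUTH_test/C/A/01", and key is the
    shared secret stored in the account's Temp-URL-Key metadata.
    No network request is needed to compute the signature.
    """
    start = int(time.time()) if now is None else int(now)
    expires = start + ttl
    body = f"{method}\n{expires}\n{path}"
    signature = hmac.new(key.encode(), body.encode(), sha1).hexdigest()
    return f"{path}?temp_url_sig={signature}&temp_url_expires={expires}"
```

The returned path and query are appended to the cluster's host; once `temp_url_expires` has passed, the same URL would be expected to be rejected, as in the demo's first attempt.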
It just says, number one in 2012.08, number two in 2012.08, number two in 2012.05. That's a good pattern, so we should be able to see how that works. So I can do this. Let's just cat it. Okay. I'll scroll up and show you what happens. First I create two more containers: I create logs and logs roll up. Before I upload any data, inside of my logs roll up container I'm going to create dynamic large object manifests. In this case, I'm going to say that this object is anything in the logs container that starts with 2012/. We're going to call it 2012 logs. Now I'm also going to create one for every month: January, February, all the way through December. The December 2012 logs manifest matches anything that starts with 2012/12/. Now I'm going to show you what's in the tar file, the .tgz file itself, upload it, and tell Swift to extract the archive. Then I'll download a sample month roll-up. I'm going to ask for the August logs, and we'll see what happens. So, here we go. I created those containers, made the dynamic manifest objects, and got a 201 success on all of those. Then here's what's inside the tar file; this is exactly what I showed you earlier, the whole contents. The bulk upload was successful and created 60 files. Awesome. Now, the sample month roll-up. Here we go: I'm accessing the 2012 logs, and the body of my response is 1 through 5 in 2012.08, because I was looking at August. And similarly, if I want to just say, Matt, give me a month. June. Boom, there it is. So that's how it works. You can aggregate those things together. So, that being said, here we are. That's my little live demo of cool things you can do with the Swift API, so I'm going to breeze through the rest of this. Every single piece of code I just ran is available right there: my account on GitHub is notmyname, and the repo is swift_examples.
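The bulk-upload demo can be approximated offline: build the 60-file archive in memory and compute the roll-up manifests that would point at it. This sketch assumes the archive is sent with an `extract-archive=tar.gz` query parameter and that each manifest value is a container/prefix pair; the file names inside the archive, the manifest object names, and all helper names are hypothetical.

```python
import io
import tarfile


def build_sample_logs(year=2012, per_month=5):
    """Return a .tar.gz (as bytes) with per_month one-line logs per month."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for month in range(1, 13):
            for n in range(1, per_month + 1):
                line = f"number {n} in {year}.{month:02d}\n".encode()
                member = tarfile.TarInfo(name=f"{year}/{month:02d}/{n}.log")
                member.size = len(line)
                tar.addfile(member, io.BytesIO(line))
    return buffer.getvalue()


def rollup_manifests(year=2012, container="logs"):
    """Map manifest object name -> X-Object-Manifest value (year + months)."""
    manifests = {f"{year}_logs": f"{container}/{year}/"}
    for month in range(1, 13):
        manifests[f"{year}_{month:02d}_logs"] = f"{container}/{year}/{month:02d}/"
    return manifests


def bulk_upload_url(storage_url, container):
    """URL for the PUT that uploads and expands the archive."""
    return f"{storage_url}/{container}?extract-archive=tar.gz"
```

Because the manifests are dynamic, they can be created before any logs exist; once the archive is extracted, reading the August manifest would return the five August lines concatenated.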
You can access those there. So, rules for using Swift better. They're not so much rules as general guidelines. Build for concurrency. This one is very, very important. You should all write these down if you're writing Swift apps. Build for concurrency. Swift is optimized for massive concurrency across the entire data set. We don't optimize for 10,000 requests to one particular object. We optimize for access to 10,000 objects at the same time. That's the kind of thing we work with. We go wide, which means you can splay your data across lots of containers and splay your requests across lots of different objects at a time. That makes it really great for web and mobile apps and for backing up your entire server farm, putting those in different places inside of the Swift cluster, different pieces of the namespace. Use response code classes. Don't assume that the only thing that means good is 200. If you look at the HTTP specs, the client should respond to response code classes. That doesn't mean we're going to be changing response codes on you or anything like that, but it does mean that you are future-proofing yourself against certain problems. What happens if they come out with a new response code and we implement something for it? It's some new way to say client error, or a new way to say success. At that point, we are in compliance with the spec, and you also did not break. So that's a very good thing to do. The 200 series means everything was good. The 400 series means that you messed up. The 500 series means that we messed up. And what do you do when those things happen? Well, if you messed up, fix it. If we messed up, retry. So if you send us a request to store some data and you get a 503 back or something like that, then we don't know what happened. Something really bad happened. So send it again.
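The response-class guideline translates into very little code. The following is a minimal sketch, with hypothetical names, in which `send` is any zero-argument callable that performs the request and returns a `(status, body)` pair; only the 500 series triggers a retry.

```python
import time


def classify(status):
    """Map an HTTP status code to its response class."""
    classes = {2: "success", 4: "client_error", 5: "server_error"}
    return classes.get(status // 100, "other")


def with_retries(send, attempts=3, delay=0.0):
    """Call send() until it returns something other than a 5xx.

    A 2xx means done, a 4xx means the request itself must be fixed, and a
    5xx means the cluster could not complete it, so the same request is
    sent again. attempts must be at least 1.
    """
    for attempt in range(attempts):
        status, body = send()
        if classify(status) != "server_error":
            return status, body
        if attempt < attempts - 1:
            time.sleep(delay)   # optional pause before the next try
    return status, body         # still failing after the last attempt
```

Checking the class rather than an exact code means a 201, 202, or 204 all count as success, and an unfamiliar future 2xx code would too.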
Basically, a 503 means that Swift cannot guarantee the durability of your data on a write. So if you're trying to write data and we're telling you, I don't know what happened, I can't guarantee that durability, send it again. Swift will almost certainly choose different places to put the data, work around the current failures in the cluster, and durably write your data, and you're going to be completely okay. Again, you don't have to think about those kinds of problems. All you have to think is, oh, I got a 500 series response, okay, let me try that operation again. The next thing is that sometimes you're going to get a 401. 400 series: what does that mean? You messed up, right? A 401 is Unauthorized. Reuse your tokens. Don't try to go get a new auth token every single time. That's just going to add pain to your life. It's going to be slow. You're going to have to make that network request, or do some sort of math to compute it right then, and then you're going to have to talk to Swift, and then you're going to do it all over again. But your auth token doesn't last for only one request. The Keystone default used to be 24 hours. Now I think it's one hour. Even so, I hope you're doing more than one request per hour. So get an auth token and use it in a loop. If you get a 401, get a new auth token. If you don't get a 401, use the same auth token. That's all it means. So just keep retrying. That's the one thing that's really going to remove a lot of potential overhead. I've seen some clients written against Swift that get a new token every time, and it just kills your performance. Mostly it's the network transit between the different pieces. It also means that we can't cache that token. When you get a token and we validate it, we say, oh yeah, that's going to be good for another hour, let's just put that in the cache. Then a new token comes in. I've never seen this one. Let's go check and see how long it's good for.
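The token-reuse loop described above might look roughly like this. It is a sketch with hypothetical names: `authenticate` stands for whatever call fetches a token, and `send` for whatever performs a request with a given token and returns the HTTP status.

```python
class TokenSession:
    """Reuse one auth token until the cluster answers 401, then refresh."""

    def __init__(self, authenticate, send):
        self._authenticate = authenticate   # () -> token string
        self._send = send                   # (token, request) -> status code
        self._token = None
        self.auth_calls = 0                 # how many times we fetched a token

    def _fresh_token(self):
        """Fetch a new token and remember it for later requests."""
        self.auth_calls += 1
        self._token = self._authenticate()
        return self._token

    def request(self, req):
        """Send req with the cached token, re-authenticating once on 401."""
        token = self._token or self._fresh_token()
        status = self._send(token, req)
        if status == 401:
            # Token expired or was revoked: get a new one and retry once.
            status = self._send(self._fresh_token(), req)
        return status
```

With this shape, a hundred requests cost one auth round trip instead of a hundred, and the server side gets to keep the validated token in its cache.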
You're doing more work to get the tokens, and we're doing more work to validate them. So use retries. Don't assume that the request is going to fail, but be robust to changing network conditions, response code classes, and things like that. So, there are tools to help. Do not go alone. One of the tools you can use is the Swift command-line client, the CLI. You can do interesting things with it. The first example I did was the curl request to /info. Here it's basically a swift stat. Actually, not info: swift stat is like an account HEAD. Sorry, stat is a HEAD. So instead of doing that account HEAD request by hand, this way we get it out nicely formatted, and you can see some information about what's going on. You can use environment variables to set your auth credentials so you only type them in once. Basically, it saves your fingers a lot of typing compared with typing in curl requests. So stat is one of the things you can do. An upload request looks like this: you give it the upload command, tell it what container you want to put the object in and your local file, and it responds that it wrote this one object. I had a one-megabyte object to put in there. Now that you've done that, you can run stat against your particular container. You get out something that looks very similar, but in this case it's the container, it has an object in it, and this is how big it is: one megabyte, and so on and so forth. Not only do we have the command-line client, but there's also the SDK piece. This one is for Python. There are some things in the ecosystem that are not Python inside of OpenStack. That's what we have. And there's lots of various work going on here.
But the basic idea here is that you create a connection, give it your credentials, and then you can start putting objects. In this case you can see I'm just opening up the file locally and passing it to the library, and it's taking care of the rest. Now, the really great thing about the SDK: remember all those rules I told you about retrying and using response code classes and things like that? The SDK already does those. It puts things in a retry loop. It manages your tokens for you so it doesn't get new tokens all the time, and stuff like that. So this is something that can really help you get started very quickly. And get the book. Unfortunately they scheduled this right at the end, after the expo hall has closed, so I can't say go to the booth and get books. But if you go to swiftstack.com/book, there's an O'Reilly book that was just recently released. It has all the latest information about writing applications for Swift. It also has a lot of info on building Swift clusters, how to put things together, and guidance on all kinds of stuff. I would highly recommend you peruse it. It's available in dead-tree and electronic form, because it's O'Reilly. You can go try it out. Now, somebody can say you should go do this and this is how you do it, and you think, well, great, but I can't really learn it until I'm doing it. The reality is that me standing up here for 30 or 40 minutes is not going to teach you how to write code for Swift. You really have to sit behind a keyboard and figure it out. So you want to get started on your own. There are two ways I would suggest to do that very, very quickly. One, if you want to go nuts and bolts, there's a repo inside of the SwiftStack GitHub account called vagrant-swift-all-in-one.
Basically it uses Vagrant and VirtualBox to set up a Swift cluster very, very quickly, and it has a little config file where you can choose how many drives it will provision and how many replicas you want to use. For testing purposes, setting that to one actually works really well. It's very quick. It's just kind of a one-time thing. Clay wrote it. It's awesome. And number two, if you want something that's a little more on rails and a little more polished, you can go get a free trial for dev and test work from SwiftStack. You can download that for free, and it's more of the management controller that SwiftStack has. You can use that, and from there you have a box running, either virtualized or bare metal, where it's kind of a one- or two-click install and you're ready to go. And then you have a real Swift cluster to play with. So those are the ways to go try it out. Now, how are we on time? What time is it? 10. What time are we done? 4:10? Well, I have 4:09, so I might have time for one question. IRC channel: #openstack-swift on Freenode. I'm on the mic, so that'll be on the video. Any questions? Yes. My GitHub repo? That one? There you go. You're welcome. GitHub.com. That's true. I didn't. I had about 15 seconds left. So there are other things you can do. Yes, question here. I think it supports bzip, if I remember correctly. I can guarantee you we don't support your custom file format right now. The really cool thing about it is that functionality... I have a third talk, and after this gap, they put all three of my talks on the same afternoon, so I'm talking about extending Swift. Yeah, it's back in the other room.
So it is implemented as middleware in Swift, which kind of gets away from the dev side of things, but it is absolutely possible to modify that, or to put in your own middleware that says, I understand this thing coming in, and I know how to split it up, and stuff like that. I worked on that, and I think they presented on that in Atlanta, using Swift as storage for Git repos, and they were storing all the binary blobs inside of Swift. You should definitely look those talks up online. They are recorded. Talk to some of the eNovance guys who are around, as Christian and Shmuel did that. I guess they're Red Hat guys now. Anyway, that's that. Thank you very much. Several of us are here. I can answer a few questions for you if you'd like, and we're doing the Design Summit tomorrow and Friday. Thank you for your time.