Let's welcome Kristian and another Swift talk.
All right. Good afternoon, everybody, and welcome to my talk on developing applications using OpenStack Swift as a storage backend. My name is Kristian Schwede. I'm a software engineer working at Red Hat, and I also work on OpenStack Swift. I had a look before this talk: it's nearly five years ago that I started contributing to Swift, so for quite some time already. I want to give you a short introduction on how to use Swift for application development.
Before I start talking about the REST API that is used to interact with OpenStack Swift, I want to give you a very brief summary of what OpenStack Swift is, not as detailed as the earlier talks today from Tiago and Alistair. If you're interested in those, videos should be available soon. We are in room H.2213. So please have a look at those talks if you want a few more details about how Swift works internally.
Swift is an object storage system. That means I don't mount a file system and I don't mount a block device. I basically just have a URL that returns, for example, an object that I stored earlier. It uses a very flat namespace. I have a personal account within my Swift cluster, this account contains containers, and containers are just collections of objects. Because of this flat namespace, it's mostly suited for unstructured data. So you should never try, for example, to mount Swift object storage, even if that is possible, and put your database server files on it. It's really more suited for videos, images and other large binary data sets. Especially for videos, images and other multimedia files, it's nice that you have a URL where you can access these objects, because you can then just use it within your browser.
Swift itself is a very scalable and very durable system. Most people are using it with three replicas, meaning that in the backend you have three copies of every object located on your storage servers. Swift takes care that each object is actually replicated and that, even after years, there is, for example, no bit rot. That was too quick. One second. All right, sorry for that. It has been in production for more than eight years now. It's one of the founding projects of OpenStack itself. It was invented by the original developers at Rackspace, and that's the same amount of time that some Swift clusters have already been running. There are known Swift clusters with more than 75 petabytes of storage within a single cluster.
As an application developer, using object storage is really nice because it separates the application logic from the data path logic. When you write an application and you want to store some big data files, for example video files, you don't need to care about the data path or upload data through your application server. You can upload the data to the Swift cluster and only store metadata within your application, for example using a database. Or you just store references, for example with Elasticsearch, and let Swift itself handle the large data sets.
When you're accessing a Swift cluster, you're using a simple REST API. These are basically HTTP requests: GET, PUT, HEAD (DELETE should be there, too) and POST requests. As the end user or application developer, you're always talking to the Swift proxy. And the proxy then in the background talks to your storage servers.
And for example, in this case, it makes sure that every object is replicated three times, or rather, it sends the object three times to the storage nodes, where it is stored in a durable way. So when I talk about a Swift cluster in the following slides, I'm basically talking about interacting with a Swift proxy, because that's the endpoint for the application.
Even a simple GET request is a simple REST API request. You have an endpoint that looks something like this, and it has these parts: AUTH_test, for example, is your account name, and public in the first example is your container name. By requesting that, you get a list of objects back, at least if this is a publicly readable container. If the container is publicly readable, you can also download the objects by just appending the object name at the end of the URL. And of course, you can also upload objects. But for uploading objects, you need some kind of authorization. In the lower example, we have an X-Auth-Token that is sent along with the request. This token authenticates you to the Swift cluster, making sure that you're allowed to write data there. There's a little more information in the reference link down below.
Before I continue with the remaining part, some general information on how to use Swift. When you send headers, there's a distinction between custom metadata and system metadata. For example, I have sent two different headers along with a request. The first one is system metadata, X-Delete-After, which we'll come back to in a few minutes. The second is custom metadata, and custom metadata at the object level always starts with X-Object-Meta-, followed by some key, and then a value. So you can store custom metadata alongside your objects.
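
The URL layout and the two kinds of headers described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: the host name `swift.example.com`, the `/v1/` path prefix, and the helper names `object_url` and `upload_headers` are assumptions chosen for the example.

```python
from urllib.parse import quote


def object_url(endpoint, account, container, obj=None):
    """Build <endpoint>/v1/<account>/<container>[/<object>] (assumed layout)."""
    parts = [account, container]
    if obj is not None:
        parts.append(obj)
    return endpoint.rstrip("/") + "/v1/" + "/".join(quote(p) for p in parts)


def upload_headers(token, delete_after=None, **custom_meta):
    """Combine the auth token, system metadata and custom object metadata."""
    headers = {"X-Auth-Token": token}
    if delete_after is not None:
        # System metadata: object expires this many seconds after upload.
        headers["X-Delete-After"] = str(delete_after)
    for key, value in custom_meta.items():
        # Custom metadata always carries the X-Object-Meta- prefix.
        headers["X-Object-Meta-" + key.replace("_", "-").title()] = value
    return headers


print(object_url("https://swift.example.com", "AUTH_test", "public", "cat.jpg"))
print(upload_headers("example-token", delete_after=60, caption="A cat"))
```

The resulting URL and headers could then be passed to any HTTP client for a PUT request.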
For example, that might be some reference to another object or some information about the video file: captions, authors, titles, whatever.
There are different ways to easily start interacting with a Swift cluster. Probably one of the simplest is to use python-swiftclient, this one. It has a very useful option, --debug, which includes examples of how to do the same request using curl. So when you run swift --debug list on a container, somewhere in the output you will see a very similar curl command that you can use directly. It gives you some idea of how to start interacting with the REST API directly, without using the swift command line interface. There are two different ways to send metadata with the swift command line interface. The first one is -H, uppercase H, which simply sends a header as is. And there's a lowercase -m, where you can set object metadata directly using the swift command line interface. All right, so that's that.
I mentioned earlier that we need some authentication in most cases, depending on your application: for storing new data, and sometimes for reading data back. So let's talk a little bit about this. When you start developing applications with Swift, you might want to run Swift only. You don't need the full OpenStack environment with Keystone, a database and whatnot to start testing Swift. We have an included middleware called tempauth. We're using it for development purposes as well as for demoing and showcasing. It's really meant for showcasing and testing. Don't use it in production, because the credentials that you're using here are stored in plain text in the proxy server configuration file. So it shouldn't be used in production, of course.
However, using this one shows very easily how authentication works. You just send two headers, X-Auth-User and X-Auth-Key. X-Auth-User is, in this case, an account name and a username within that tenant or account, and X-Auth-Key is your key. Swift returns a storage URL and an authentication token, and you use this storage URL later on together with the authentication token. If you're the owner of that account, you can upload as much data as you like, until your cluster is full.
When you want to go into a production or more production-like environment, in many cases you're using Keystone, which is the OpenStack identity project. It works very similarly, but instead of sending headers, you're sending a JSON blob. You can see there's a password in it and there's a user in it. You send it to the Keystone server itself. Keystone will return a token for you. It will also return a URL for you, and you then need to do a second query to get the actual endpoint for Swift. That's because with Keystone, you typically have multiple endpoints for multiple services. For example, you run OpenStack Nova, OpenStack Swift, Cinder, Glance, whatnot, and each of these services has its own endpoint that you need to query from Keystone.
All right, so you have a token that you're sending as a header to some URL. This is nice if you do so using, for example, a command line interface or curl. But if you want to upload data with your browser, your browser typically doesn't send custom headers along with a request. So we need some way to send authenticated data using a browser. There are two middlewares that work very similarly to each other. The first one that I want to introduce is tempURL. It uses pre-computed signed URLs, and these signed URLs are only valid for a very specific action and a short amount of time.
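
The tempauth exchange described above (two request headers in, a storage URL and token out) might look roughly like this in Python. It is only a sketch: the function names are hypothetical, the credentials `test`, `tester` and `testing` are placeholder values, and the actual HTTP call is left out so the example stays self-contained.

```python
def tempauth_request_headers(account, user, key):
    """Headers for a tempauth login: X-Auth-User is '<account>:<user>'."""
    return {"X-Auth-User": f"{account}:{user}", "X-Auth-Key": key}


def parse_auth_response(response_headers):
    """Extract (storage_url, token) from the auth response headers."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in response_headers.items()}
    return lowered["x-storage-url"], lowered["x-auth-token"]


# Placeholder credentials and a hand-written response for illustration only.
request_headers = tempauth_request_headers("test", "tester", "testing")
storage_url, token = parse_auth_response({
    "X-Storage-Url": "https://swift.example.com/v1/AUTH_test",
    "X-Auth-Token": "example-token",
})
print(request_headers, storage_url, token)
```

Every later request would then go to the returned storage URL with the token in an `X-Auth-Token` header.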
So to use this, you need a key that is later on used to sign the request. You store this key as metadata within Swift itself, either on the account, in which case it's valid for all the containers within that account, or on the container. The example in the first line there is a way to set it. When you have this key, you can compute these URLs, and it's pretty easy. You define the method that you want to use, for example a GET request or a PUT request. You define how long the URL should be valid. And then you give the full path, for example to the object, that this request should be valid for. Then you do some HMAC stuff within Python and you get back a signature. You append the signature to the full URL that you can see below. You might wonder why we're using SHA-1 checksums here. There is some good news: just this week, we merged a patch that allows you to use SHA-256 and SHA-512 checksums as well. So you don't need to use SHA-1 checksums anymore when you run the latest Swift versions.
All right. So these are tempURL URLs. We have a very similarly working middleware called FormPost. As the name implies a little bit, it's for HTML forms. An HTML form might have some hidden input fields, and in this case we make use of that. One of the hidden input fields gets the signature, and there are a few more fields that you can use here. There's a field for redirecting the request: when your browser has finished the upload, Swift returns a redirect, and your browser will hopefully follow it. You can also limit the maximum file size and the number of files for this HTML form. Only if all of these parameters are met is this a valid request.
Oh, I should mention something here. When you do an upload using an HTML form, you somehow need the file name that the browser sent.
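
The tempURL computation described above (method, expiry time and path, signed with the stored key) can be sketched as follows. This is a minimal illustration: `make_temp_url`, the host name and the key are assumptions for the example, and the `digest` parameter defaults to SHA-1 as on the slide. A cluster that allows it could be given `hashlib.sha256` instead.

```python
import hashlib
import hmac
import time


def make_temp_url(host, path, key, method="GET", ttl=60,
                  digest=hashlib.sha1, now=None):
    """Return a signed URL valid for `method` on `path` for `ttl` seconds."""
    expires = int((time.time() if now is None else now) + ttl)
    # The signed message is method, expiry timestamp and object path,
    # separated by newlines.
    message = f"{method}\n{expires}\n{path}"
    signature = hmac.new(key.encode(), message.encode(), digest).hexdigest()
    return f"{host}{path}?temp_url_sig={signature}&temp_url_expires={expires}"


url = make_temp_url("https://swift.example.com",
                    "/v1/AUTH_test/private/video.mp4",
                    key="example-secret-key", method="GET", ttl=60)
print(url)
```

Anyone holding this URL can perform exactly that one method on that one object until the expiry time passes, without an authentication token.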
What you typically do is upload to a container and use a custom prefix, for example a random UUID. The browser will append the file name to the end of the URL that you used for signing the request, as a path name. So your application needs to take care of that a little bit. After the upload, it's in many cases useful to have some kind of action at the redirect URL that updates, for example, some internal location for your Swift object in your application itself.
I mentioned earlier that it's also possible to have publicly readable containers. Again, these are simply metadata settings on a container. In this case, we're making a container publicly readable, which is given by this asterisk, and the .rlistings element is responsible for enabling listings for publicly readable containers. When you have an account with OpenStack Keystone, for example, you have a tenant, and within the tenant, different users. You can also differentiate between these users on a per-container basis. For example, you could have one container per user, and each user is only allowed to write into their own container within that account. If you want to have a look at the ACLs that are currently set, you can, for example, use the swift stat command followed by the container name, and then you get back the read and write ACLs.
All right. I should mention that some of these mechanisms only apply to the object level. That's especially true for tempURL and FormPost requests. Both of them only apply to the object level, really only to upload and download data. Anonymous requests are valid as long as you have a publicly readable container. And authentication tokens are, in most cases, because you're the owner of the account, valid on the account, container and object level. You should take this into account when you write applications that use Swift as a storage backend.
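
The FormPost fields discussed above can be computed in much the same way as a tempURL signature. The sketch below assumes the signed message consists of the upload path prefix, redirect URL, maximum file size, maximum file count and expiry time, one per line. The helper name `formpost_fields`, the key, and all example values are hypothetical.

```python
import hashlib
import hmac
import uuid


def formpost_fields(path, key, redirect, max_file_size, max_file_count, expires):
    """Return the hidden input fields for a signed HTML upload form."""
    message = f"{path}\n{redirect}\n{max_file_size}\n{max_file_count}\n{expires}"
    signature = hmac.new(key.encode(), message.encode(), hashlib.sha1).hexdigest()
    return {
        "redirect": redirect,
        "max_file_size": str(max_file_size),
        "max_file_count": str(max_file_count),
        "expires": str(expires),
        "signature": signature,
    }


# A random UUID prefix keeps uploads apart; the browser's file name
# is appended to this prefix by the upload itself.
prefix = f"/v1/AUTH_test/uploads/{uuid.uuid4()}/"
fields = formpost_fields(prefix, "example-secret-key",
                         redirect="https://app.example.com/uploaded",
                         max_file_size=10 * 1024 * 1024,
                         max_file_count=1, expires=1893456000)
for name, value in fields.items():
    print(f'<input type="hidden" name="{name}" value="{value}">')
```

The form's `action` attribute would point at the cluster host plus the same prefix, and the handler behind the redirect URL is where the application can record the new object's location.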
Let's assume you have some references to Swift objects within a database, and you give out an authentication token to the application that is running on the client side. If the client has an authentication token, it might modify data inside Swift without the entries in your application being updated. So when you have references to Swift objects, in many cases it makes a lot of sense to only use tempURL or FormPost requests, because then the client can't do any harm to any other objects.
All right. Let's have a look at a few API features. We have some modifiers for listings, and most of them apply to the container listing as well as the account listing. I'm focusing here on the container listings. When you simply append a query string, for example using the limit=2 parameter, you can limit the number of returned entries. In this case, it would give you only the first two entries, so two object names are returned. These modifiers are especially useful if you paginate over a container with a lot of objects. Let's assume you have a container with 100,000 objects in it. You probably don't want to show that to the user on a single page. You want to iterate over it using multiple pages. You do so by using markers and limits, by just saying, OK, I'm starting here at entry number 1,000 and continuing on the next page with entry number 2,000, for example.
I can also use a modifier to only list a specific subset of objects, for example using this prefix. Let's assume you store thumbnails and high resolution pictures in the same container, but I only want to get a list of thumbnails with my request. Then I could use a prefix to filter for objects whose names start with, in this case, thumb, for example. And as a developer, I need some parsable data.
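
The pagination with limit, marker and prefix described above can be sketched like this. It is a minimal illustration under a few assumptions: the listing is requested as JSON with a `name` key per entry, the marker is the name of the last object on the previous page, and `fetch_page` is a hypothetical callable standing in for the actual HTTP GET.

```python
from urllib.parse import parse_qs, urlencode


def listing_query(limit=None, marker=None, prefix=None, fmt="json"):
    """Build the query string for a container listing request."""
    params = {"format": fmt, "limit": limit, "marker": marker, "prefix": prefix}
    return urlencode({k: v for k, v in params.items() if v is not None})


def iterate_listing(fetch_page, limit=1000, prefix=None):
    """Yield all entries, requesting one page at a time."""
    marker = None
    while True:
        page = fetch_page(listing_query(limit, marker, prefix))
        if not page:
            return
        yield from page
        # The next page starts after the last name seen on this one.
        marker = page[-1]["name"]


# Stand-in for the HTTP request: serves pages from an in-memory name list.
NAMES = [f"thumb/{i:03d}.jpg" for i in range(5)]


def fake_fetch_page(query):
    params = parse_qs(query)
    marker = params.get("marker", [""])[0]
    limit = int(params["limit"][0])
    return [{"name": n} for n in NAMES if n > marker][:limit]


for entry in iterate_listing(fake_fetch_page, limit=2, prefix="thumb"):
    print(entry["name"])
```

In a real application, `fetch_page` would issue the GET request against the container URL and decode the JSON response.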
So what is typically done: Swift itself will return the object listing with just one entry per line. But you can also say, well, I need a JSON object or an XML object, and you can do that, too.
Expiring objects is another useful feature. When you upload an object, you can specify a time after which the object becomes unavailable. That might be either seconds from now or a Unix epoch timestamp. What goes on in the background inside Swift: Swift will immediately stop returning the object once the time has expired. And a little bit delayed, there's a process running in the background that really deletes the object on the cluster.
What we didn't mention so far today, I think, is that we limit Swift objects in size, by default to five gigabytes. There's a reason for that. Let's assume you have a very large high resolution video, and sometimes users are doing that with terabytes for a single file. When you look at the underlying level, you have, of course, a couple of disks. By limiting object size and splitting objects into multiple segments or chunks, you spread the load and the data across multiple disks. That's one of the reasons why we're splitting these objects, or rather why you have to split them. So we limit that, but we have a concept of static large objects. We also have dynamic large objects, but I'm focusing on static large objects here. We're using a manifest, and this manifest defines where my chunks are located. There is another popular public object storage which uses a slightly different concept. When you upload chunks there, you need to send a manifest later on, too, but in that case it combines all these chunks into a single object. That is not done in Swift. Swift really keeps the chunks, and you can later on make use of this, for example if only a few chunks within your file changed. Let's assume you have a large video file.
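
Splitting a file into segments and describing them in a static large object manifest, as outlined above, might look like the following sketch. The manifest layout assumed here is a JSON list with `path`, `etag` (the MD5 of the segment) and `size_bytes` per segment. The function name, the `chunk-` naming scheme and the tiny segment size are illustrative assumptions only.

```python
import hashlib
import json


def build_slo_manifest(data, segment_size, container, name):
    """Split `data` into segments and return (segments, manifest_json)."""
    segments = []
    manifest = []
    for index, offset in enumerate(range(0, len(data), segment_size)):
        chunk = data[offset:offset + segment_size]
        segment_name = f"{name}/chunk-{index:08d}"
        segments.append((segment_name, chunk))
        manifest.append({
            "path": f"/{container}/{segment_name}",
            "etag": hashlib.md5(chunk).hexdigest(),
            "size_bytes": len(chunk),
        })
    return segments, json.dumps(manifest)


# Tiny example: 10 bytes split into segments of 4 bytes.
segments, manifest_json = build_slo_manifest(
    b"abcdefghij", 4, "video_segments", "movie.mp4")
for segment_name, chunk in segments:
    print(segment_name, len(chunk))
print(manifest_json)
```

Each segment would be uploaded as an ordinary object, followed by the manifest under the final object name. Because every entry carries the segment's checksum, an application can compare checksums and re-upload only the segments that changed.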
And you cut your video file, for example, or update some metadata within it. You don't need to upload the whole object again, which might be terabytes of data. You only upload the chunks that really changed.
Range requests. Again, video is one of my favorite topics within this talk. It sounds simple, but you especially want this for videos. It's just general HTTP stuff, where you define the range where you want to start and end, and use that in your requests. The nice thing with videos is that most browsers that I know of or use, which are Chrome, Firefox and Safari, support this out of the box. So you have a very simple HTML file with this video element, and you define, for example, a Swift object as its source, Big Buck Bunny in this case. When you do so, your browser generates a preview for you. If you look at the second line there, it's a GET request. The original video file is more than 600 megabytes in size, but your browser will only retrieve the first few megabytes and create a preview for you. The same is true when you seek in the video using your browser. It won't load all the data. It just jumps within the file. This is supported out of the box by Swift, which makes it really easy to use for video and other kinds of data.
Versioning is another very helpful feature of Swift. When you have a container, you can specify a location, another container, to use as an archive. Whenever you overwrite data, for example an existing object, the older version will still be stored in your archive container. The same is true for delete requests. If you send a delete request to a container where versioning is enabled, it will store the last version of your object in the archive container as well, plus an additional delete marker. So one of these two objects here that both have timestamps is actually an empty object, a 0-byte-sized object.
And it has a content type that marks it as a deleted object.
When you're writing applications that run in the browser, for example using AngularJS, and you're serving the Angular site from a different domain than the one your Swift cluster runs on, then you need to enable a feature called CORS, which is cross-origin resource sharing. Let's assume Swift runs on one domain or subdomain, and your static file with the Angular stuff is served from a different one. By default, it's not possible for this application running in your browser to retrieve or use the data that is returned, for example, from a container listing. To enable this, you again set a special metadata flag on a container that makes it possible. All right, a little bit of a rough voice.
Let's have a look at a few examples. I mentioned Angular earlier. What we have here is a base URL, which in this case is a publicly readable container. I use a prefix called image and do an HTTP request, which retrieves a list of objects in that container. Once this request has completed, I store the list of images somewhere in my application and call a function called showImage. This showImage function does an additional HEAD request, because I want to retrieve some metadata for the image. You can have a look at the full example at the given URL below later on. I'll share that with you. It's a very simple Angular application that is a basic image gallery built on top of Swift. What it does is show an image that is stored in Swift, and the HEAD request that I just mentioned happens in the background. There is a metadata field called X-Object-Meta-Caption, and the caption value is then shown below the image. Very simple, but quite powerful if you want to build on top of that. So how do you get the data into Swift?
Well, it turns out I use a software called Adobe Lightroom, which is unfortunately not open source. But it's easily extendable. You can write your own plugins for it. They are not written in Python or something similar. They are written in Lua. But the concept of tempURL makes it really easy to include that within this application as well, or within other applications. I just need to pre-compute this signed URL and reuse it. What I'm running at home is a small plugin that is also available on my GitHub account, where you just use a storage URL with a tempURL key, and you can directly export your pictures from Lightroom to OpenStack Swift.
If you want to use Python, probably the simplest way is to use python-swiftclient, which is the swift command line interface itself. But it also includes reusable parts, libraries that you can use within your application. Again, what you're doing there is you get an authentication token and the storage URL using your username and password. From then on, you have actions like get account, put container, put object, list container, stuff like that. It's really easy to use within your application.
When your application wants to give out tempURL URLs, you need the tempURL key metadata. It's a good approach to first check whether a key already exists on the container or account and, if not, create a random one and set that on the account, for example. If you want to have a closer look at how to do this with Python, there's an application called django-swiftbrowser. As the name implies, it's written with Django and Python. It uses all these concepts, like tempURLs, FormPost for uploading data directly to Swift, listings with prefixes, public URLs, whatnot. And I think, at least, that it's really easy to read Python code. So please have a look at that one.
All right. So how do I get started? There are different ways.
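
The "check for an existing key, otherwise create a random one" approach mentioned above can be sketched as follows. The sketch assumes `conn` behaves like a python-swiftclient `Connection`, with `head_account()` returning a dict of lower-cased response headers and `post_account(headers=...)` setting account metadata. The function name `ensure_temp_url_key` is hypothetical.

```python
import secrets

TEMP_URL_KEY_HEADER = "x-account-meta-temp-url-key"


def ensure_temp_url_key(conn):
    """Return the account's tempURL key, creating a random one if missing."""
    headers = conn.head_account()
    key = headers.get(TEMP_URL_KEY_HEADER)
    if not key:
        # No key set yet: generate a random one and store it on the account.
        key = secrets.token_hex(32)
        conn.post_account(headers={"X-Account-Meta-Temp-URL-Key": key})
    return key
```

With a real cluster, `conn` might be created along the lines of `swiftclient.client.Connection(authurl=..., user=..., key=...)`, and the returned key would then be used to sign tempURL and FormPost requests.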
And for years, we used a concept called Swift All In One installations, which we are still using as Swift developers. But it's probably a little bit overkill if you just want to start out, because it's a very long document and you need to do a lot of stuff. Based on that, we have some Vagrant environments that you can also use. But you can make your life a little bit simpler if you just want to try out some things, for example the API. Docker Swift is something that we worked on half a year ago, just as a proof of concept. But it turned out it runs pretty well for showcasing. It's a very simple environment where you run everything in a single Docker container, and you can just start interacting with the REST API. For the REST API itself, I would encourage you to use python-swiftclient. If you use the credentials on the slide together with this Docker Swift environment, you can easily start using and playing with Swift and the Swift REST API.
All right, so that's it. It was a little bit faster than I thought, but we have more time for questions, which is also great. So any questions? Everything unclear? Everything clear? Yeah.
So you showed that files greater than 5G have to get split? Yes. But then you also said the browsers are going to make this? Yes. It now needs to know a different object ID, because it's not sharing the same object ID?
No. So the question is: what happens when you split up, for example, your 50 gigabyte video file and you want to do range requests, do you need a different object ID? No, you don't need to. Let's assume you name your chunks like this one, chunk and then an increasing number. So you have all these chunk objects there, and finally you upload a manifest object using this object name. If you just use this object name, Swift itself, the Swift proxy server, will take care of accessing the other chunks of that file.
So if you just stream it from the proxy server, the proxy server will make sure that you only need the single URL, and everything is taken care of for you. So you don't need to deal with that. Yeah.
Is there a tour? Once again, you first. Yes. And we need to put it in a cache. Yes. To give access, signature, and so on. But how to put this logic of tempURL in a cache? Because the cache stands first and the proxy server is next.
OK, so the question is: if you have a temporary signed URL and you have a cache in front of the Swift proxy server, is there a best practice for this? Well, the Swift proxy itself will send headers along with the response saying not to cache this object. Your question is more like: how do I ensure that this object, which might be accessed very frequently and after some time runs into its expiration time, is still accessible? Well, that's a good question. There are different ways to do so. If this object should always be publicly readable, then I would just put it in a publicly readable container. If not, sorry?
Yes. But those files at the same time are very popular.
OK. So to repeat this part: the client wants to give out a special URL for a private object that might be accessed very frequently. Well, in that case, I wouldn't give out the temporary URL itself to share, but a URL pointing to your application, and your application then creates temporary URLs per request that are valid, for example, for 60 seconds. The reason for this is: if you give out a temporary URL that is valid, let's say, for a month, and your client decides later on that it wasn't a good idea to share this temporary URL, you can't revoke it. Well, you could overwrite the temporary URL key in your metadata, of course, to invalidate it.
But it might be much easier to handle this in an application that simply returns a redirect to the actual signed temporary URL. Make sense?
So actually, what you had in mind is part of the Swift browser. I'm using a very similar approach there. I'm generating a random UUID, and it is stored inside the Swift browser. If this is accessed, it creates a temporary URL for you and redirects the browser directly. So from a browser point of view, I'm just accessing this URL, and the browser follows the redirect and downloads the object. But that does require at least a little bit of logic in your client application.
More questions? All right. Then? Oh, just one short notice. The slides are available on the FOSDEM website in the talk details. So if you want to have a look at the links, please feel free to do so. Thank you.