OK, it's two o'clock. Welcome to my session. In this session we will talk about application development with OpenStack Swift: programming against OpenStack so that it works for our applications and our scenarios. I am a software engineer at O-Storage, a startup that provides object storage solutions for enterprises. The company is located in Shenzhen, China, and the core of the O-Storage product is OpenStack Swift.

So why did we choose Swift? Swift is very mature. It was born at the very beginning of OpenStack. OpenStack now has tens, even hundreds, of projects, but at the very beginning there were only two: Swift and Nova. Swift also runs at very large scale. This morning we heard about 75 petabytes, and at Rackspace it actually runs on hundreds of petabytes. So it runs at very large scale and provides very high durability, scalability, and availability.

Yesterday there was a session delivered by SwiftStack, which, as we know, is the main contributor to the upstream OpenStack Swift source code. They explained why Swift performs this well, why Swift is cool, the architecture, and how to install Swift, so I will not repeat that part. If you are interested in what is under the hood of OpenStack Swift, you can refer to yesterday's presentation by SwiftStack.

Today we focus on how to build applications on top of OpenStack Swift. Unlike other parts of OpenStack, Swift is not closely tied to virtual machines, and many real-world applications run on it. Let me give some examples. The first one is eBay: all these pictures are stored in OpenStack Swift. Another is iQiyi, an online video site in China similar to YouTube.
The source files of these pictures and videos are stored in OpenStack Swift clusters. These are internet applications. Wikipedia is also a user of OpenStack Swift.

The one on the right is interesting. In China, you have to apply for a debit card on site. You cannot apply remotely; you need to go to the bank, and the process used to be very long. It would cost you about an hour to get a card. This bank, China commercial bank, is a very large bank in China, and it developed a system for this. This machine is not an ATM. It is located in the bank's branches. When you go to the bank to apply for a debit card, you no longer need to spend so much time. You stand in front of this machine, and a staff member sees you remotely on the screen. She talks to you and asks you for some information, and the system captures the video and records your voice, perhaps together with your fingerprint and an image of your signature, and stores them in an object storage system. You can get a debit card in less than 10 minutes. So the time decreases from about one hour to 10 minutes, thanks to cloud computing and cloud storage. This is very interesting.

I intended to show you a mobile app on iPhone, but unfortunately I cannot access my company's Git server. I don't know why; I cannot log in and access my code, so I cannot demonstrate the iPhone application. Instead I wrote something very simple that is similar to it. It is command-line based, but it will help illustrate how to develop an application on OpenStack Swift.

So, talking about application development: OpenStack Swift is object storage.
What is the difference between developing an application on object storage and on traditional storage such as a file system? The APIs are different. Object storage exposes HTTP APIs, while traditional storage such as a file system uses the POSIX APIs. They are very different. For example, in a file system we can change just one part of a file. With object storage, we upload a file to the object store, and we call it an object. We can upload an object, download an object, or overwrite it, but we cannot change only some part of it. In a word, we cannot perform random writes on object storage. The other thing is that we access our storage over HTTP, which is very different from traditional storage systems. Because it speaks HTTP, object storage is very suitable for internet applications. That is the first difference.

The next thing I want to talk about is the namespace. Object storage has a flat namespace. We only have buckets. Some object storage systems now present directories, but that is not a true hierarchical namespace. We cannot build directories inside directories in a hierarchy and put a file in one of them. We only have buckets. Why do we do this? Because in a very large-scale storage system there are too many files, tens of millions of them, and a hierarchical namespace would be very slow. A flat namespace is fast, and the performance grows with the scale. This is very attractive nowadays, in what we call the big data era. That is the second way object storage differs from traditional storage.

Now let's look more closely at the APIs. The first part is the endpoint, or the URI, of the cloud storage service.
It is http:// followed by the IP address and port 8080. The second part is the account name in Swift. Usually an account corresponds to a tenant. A user, or an app, that belongs to a tenant and logs in only sees containers and does not see the account, because the account corresponds to the tenant. You can think of it as equal to a tenant. For a storage system, an account is not what we usually mean by an account with a username and password; it is a storage space that corresponds to a tenant. The third part is the container name. Within an account we can create containers, and we put objects into those containers. So this is the overview of the OpenStack Swift APIs. In the following, we will see some details about how to use them.

First of all, we need to install an OpenStack Swift environment. There are several ways to do this. One is to follow the OpenStack installation guide and install OpenStack and Swift manually. But that is very complex, and we know it always goes wrong, right? So we usually use automatic deployment tools. Our company provides automatic deployment tools for OpenStack Swift for enterprise users, and so does SwiftStack. Our company is very similar to SwiftStack, but it is in China, for Chinese OpenStack users. The third way is to install an SAIO environment. SAIO is Swift All In One. It is important to note that it is only for development, not for production. It installs Swift from source code. And speaking of installing from source code, OpenStack has the DevStack project, so we can also install Swift with DevStack. I will show that. This is a virtual machine running on VMware Fusion. It is Ubuntu.
If you are interested and have an Ubuntu-based environment, you can follow along and try it. It is a normal DevStack environment: just download the DevStack code from GitHub. The difference is in the configuration file: you need to edit local.conf for Swift. If you want an environment only for Swift, you can disable all other services and enable only services such as s-proxy and s-object. The services whose names begin with "s-" are the Swift services. It is not a must to install Keystone and MySQL, so we can remove them as well. You may also need to do something else. Swift stores data with replicas, usually three. In a testing or development environment we often do not need so many replicas, so here we set it to one. Also, DevStack installs Swift on a loopback device, not an actual hard disk or SSD; it is just a file that emulates a disk. The default size of this file is 1 gigabyte, which is sometimes too small, so we can modify it. Here, for example, we use 20 gigabytes. Then we can run stack.sh. I'll just mention that here we use the liberty branch.

Yes, I would like to repeat the question. The question is: if we change the configuration file, do we need to stack again? The answer is that this is a new environment, so we do not need to do that. However, if some services are already running, we need to unstack first.

Now we can see it has finished. Unlike what we usually see after DevStack runs, we do not see the username, password, and other things, just the IP address, because we did not enable Keystone.
Swift does not rely on Keystone to perform authentication. We will explain this later. Sometimes we need to develop an authentication service according to our customers' needs.

To verify this installation, we can perform some operations. First, let me minimize this; that is better. Here is another virtual machine, which we will use as the client, and the one that runs DevStack is the server. Most of our operations will be performed on the client, because we are talking about developing applications, not about operations and administration.

I will type in some commands and show the results, and then explain the results. First, swift stat. swift stat gives us the basic information about a Swift cluster; actually, it is about an account in a Swift cluster. The account name now is TEMPAUTH_test. Next, swift list lists the containers under this account. The first time we run swift list, we get nothing. Then we run swift post, which creates a container in that account. When we run swift list again, we get A, which is the container. We can upload an object to this container. Then we run swift list again, this time with the name of the container on the command line, and we can see the file we just uploaded, named testrc, in that container.

Why does it work like this? First of all, we sourced testrc, so let's look at what is in it. There are the Swift variables ST_USER, ST_KEY, and ST_AUTH. This auth URL is different from Keystone's. If there were a Keystone service, it would be on port 5000, with version 2.0, tokens, and so on. But this is different: in our environment we did not install Keystone.
Our authentication is performed by another authentication service named TempAuth. We can think of TempAuth as a demo authentication service; following its example, we can develop our own authentication service and integrate it into Swift. We will talk about this later.

Now let's see how these command lines call the APIs we mentioned before. These are the command lines, and these are the APIs. What is the relation between them?

OK, the question is how we configured the client to connect to the server, right? We just source this file, which exports three environment variables: the Swift authentication URL, the name of the Swift user, and the password of the Swift user (ST_KEY is the password). The command line looks for these environment variables and uses them to log in and connect to Swift.

We add the debug option here, and we can see the API call. Compared to the earlier slide, this is the URL of the storage service, this is the account name, and this is the container name. The action is GET. What we get, in the body of the response, is a list of the objects in that container. So this is the API call underneath the command line. If we just copy it and paste it directly, we can see clearly that there is one object named testrc, the one we just uploaded. Do we have a laser pointer? Nope. I am standing to the side here, so I cannot see it clearly. This is the object name; it is the same as the file we just uploaded. And these are the headers of the response. You can think of them as metadata of the container A.

The key point is that a user knows his user name, the tenant name, and the location of the storage cluster.
This is the location of the storage cluster. The user also knows the tenant name and the user name. But, and this is very important, in this API call we do not see the tenant name directly, and we do not see the user name. The account is related to the tenant, but let's look at the tenant name: the tenant name is test, and the user name is tester. This is the format used by the TempAuth authentication service: in the Swift user variable, the first part is the tenant name and the second part is the user name. A user or a developer knows this information, including the password, but he does not know the actual name of the account. And in this API call there is no user name and no password.

So the question is: how does a developer get the account name, and how does he get authenticated? OpenStack services use tokens, so the question becomes how a developer can get the token.

Let's look at this command again: swift stat. We said that with this command one can get basic information about his account. We add the -v parameter here, so we get some more information. Actually, it performs another API call first, which is here. The command line calls this HTTP API first and gets two variables: the first is the storage URL, and the second is the auth token. Subsequent operations use these two variables. So how does swift stat get the other information here? It calls this API to get the storage URL and the auth token, and then uses the storage URL and the auth token to perform the next steps. In our own development, we follow the same routine.
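The two-step routine just described (call the auth URL, then use the returned storage URL and token) can be sketched in Python roughly as follows. This is a minimal sketch, not the program shown in the talk: it uses the standard library's urllib.request instead of requests so that it has no dependencies, the function names are made up for illustration, and the address and password in the usage comment are placeholders.

```python
import urllib.request


def tempauth_headers(tenant, user, key):
    """Build the v1 auth request headers; TempAuth expects 'tenant:user'."""
    return {"X-Auth-User": f"{tenant}:{user}", "X-Auth-Key": key}


def parse_auth_response(headers):
    """Extract (storage_url, token) from the auth response headers."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return lowered["x-storage-url"], lowered["x-auth-token"]


def authenticate(auth_url, tenant, user, key):
    """Step 1: call the auth URL and return (storage_url, token)."""
    request = urllib.request.Request(
        auth_url, headers=tempauth_headers(tenant, user, key))
    with urllib.request.urlopen(request) as response:
        return parse_auth_response(response.headers)


def list_containers(storage_url, token):
    """Step 2: GET the storage URL, which already ends in the account name."""
    request = urllib.request.Request(
        storage_url, headers={"X-Auth-Token": token})
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8").splitlines()


# Usage against a running cluster (placeholder address and password):
#   storage_url, token = authenticate(
#       "http://192.0.2.10:8080/auth/v1.0", "test", "tester", "secret")
#   print(list_containers(storage_url, token))
```

Note that the developer never types the account name: it arrives inside the storage URL returned by the authentication call.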
The routine is: first get authenticated, get the token and the storage URL, and then use these variables to perform the next steps. Let's look at another command. We create a container named Austin, and we can see that it does the same things: first it calls this API to get the storage URL and auth token, and then it uses them in the next steps.

In this command we can also see an error. That is because it first checks whether the container already exists; if it already exists, it will not create a container with the same name. So it checks first, gets Not Found, which means the container Austin does not exist, and then creates the container Austin in this step.

So we can try that. We should remember that at the very beginning we only know the auth URL, the tenant name, and the user name, and we use just these three pieces of information to get the others.

OK. This is a short program. We can see the source code, and I will demo its functions here. In this window I need to make some modifications, because this version of the program is for Keystone. We can see that to create a container, we first need to get a token. Here is how we get a token and the storage URL with the TempAuth authentication mechanism. Here is the auth URL. The program puts the user name and the tenant name in the request headers, sends the request to the server, and gets the response. The response contains the storage URL and the auth token. This is what happens when we do not use Keystone.

What about when we use Keystone? TempAuth is just a demo authentication service that helps us develop our own authentication mechanisms, but Keystone can be used in production. So let's see what happens if we use Keystone. This is the virtual machine where we ran DevStack before. We modify the configuration to add Keystone.
So now we need to unstack and then stack again. Very soon a Swift environment with Keystone will come up, although the Keystone installation takes more time. TempAuth is a very, very simple authentication middleware. We add the user name and tenant name in the headers of our requests, and once a token is issued, it does not change. So it cannot be used in production. OK, finished.

Back to the client machine. It seems I forgot something. OK, that's fine. I also need to enter the tenant name and user name; this is what the program asks for. The first function, get a token, just shows that this works. In the other functions we can see how the token is used. We create a container and list all the containers, and we find that the test container is in our account.

We can see something different here. With Keystone, the storage URL and the token are a little more complicated. We still use this command, and we can see that the storage URL has changed. It is still version 1, but the storage URL is more complicated and looks like some random number. Actually, this is the tenant ID provided by Keystone.

Let's see how it works. Where is my pointer? OK, here. For example, if we need to list all the containers or create a container, we first need to get the token and the storage URL, so we call the function at the top here. This is a very simple program, but it is a little complicated in that you need to call the auth URL to get the token and the storage URL, and then use them to actually perform operations on the storage cluster. Get the storage URL and the auth token first, and then call the storage URL with the token. These are just RESTful API calls, and you can use whatever framework you like. I use requests here.
Requests is a library that helps us make HTTP calls in Python. It is very simple. If you have used something like urllib or urllib2 before, you will find this simpler. So we just perform HTTP calls.

This is the routine. First get the storage URL. Remember that a developer does not know the storage URL; he needs to get it from the authentication service and then call the actual APIs of the storage service. Why is it designed like this? Because it allows a Swift cluster to use more than one authentication mechanism. For example, it can use TempAuth and Keystone at the same time. The prefix in the storage URL will then be different: if we use Keystone, the prefix is AUTH, but if we use TempAuth, the prefix is TEMPAUTH. If we have a third authentication mechanism, we can use another prefix.

If we need to create a container, we first get the storage URL, ask the user to input the name of the container, and then create the container. Before we create it, we check whether the container already exists. That is the routine. But there are things we should take care of. For example, if we fail to connect to the server because of a network interruption, do we need to try again? How many times? And if we get a status code indicating that something went wrong, how do we handle it? These are the questions we need to take care of. So when we develop an application with direct API calls, it becomes more complicated than what we see here. We need to make our application robust, so we need to do more work, and sometimes the code gets more and more complicated and still has bugs. So in production code we usually use SDKs or libraries rather than direct API calls.
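To make the retry question concrete, here is one possible shape for the extra code that direct API calls need. It is only a sketch under assumptions that are not from the talk: three attempts, a delay that doubles each time, retrying on connection errors and 5xx responses but not on 4xx. The helper name call_with_retries is hypothetical.

```python
import time
import urllib.error


def call_with_retries(operation, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Run operation() and retry on failures that may be temporary.

    The delay doubles after every failed attempt (0.5 s, 1 s, 2 s, ...).
    Once all attempts are used up, the last error is re-raised.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        last_attempt = attempt == attempts - 1
        try:
            return operation()
        except urllib.error.HTTPError as error:
            # The server answered. A 4xx will not improve on retry, so give
            # up at once; a 5xx may be temporary, so try again.
            if error.code < 500 or last_attempt:
                raise
        except OSError:
            # Connection refused, timeout, DNS failure and similar errors.
            if last_attempt:
                raise
        sleep(base_delay * (2 ** attempt))
```

A call would then look like `call_with_retries(lambda: urllib.request.urlopen(request))`. Even this small helper leaves open questions, such as whether a retried PUT is safe to repeat, which is part of why a library is attractive.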
We can write programs that make direct API calls to help us understand how this works, but in production we use libraries and SDKs. For Swift, of course, there are Python bindings, and also Java libraries, .NET libraries, and maybe JavaScript; I do not remember clearly. Ruby, yes, Ruby SDKs. So in production we use SDKs, libraries, and bindings.

Here we will show what it looks like with the Python library, the official Swift Python library that OpenStack provides. Let me see. In the menu, I implemented two items with the same name, create a container, but the underlying code is different. A moment ago we saw that with direct API calls the code is a little complicated. With the library, only these lines, from here to here, do everything needed to create a container. We do not need to call the authentication API. We do not need to store the storage URL and authentication token in a variable or anywhere else. And we do not need to think about whether to try again if we fail to connect or fail to send the HTTP request to the server, because the library takes care of these things. So it is very simple; actually, only three lines. First, make a connection to the Swift cluster. Then put the container, just using this function, put_container. We do not need to write that much code. After that, we close the connection, and the container is created.

Let's compare them so we can see this clearly. To do the same thing with direct API calls takes this much code, and there may be things we did not take care of. With the library, it takes only three lines to do all of it.
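The three steps (connect, put the container, close) might look like this with python-swiftclient. This is a sketch rather than the speaker's code: the wrapper function and its connection_factory parameter are additions for illustration, and auth_version="1" assumes a TempAuth-style v1 auth URL.

```python
def create_container(name, authurl, user, key, connection_factory=None):
    """Create a container using python-swiftclient's Connection class."""
    if connection_factory is None:
        # Imported lazily so this sketch also loads where python-swiftclient
        # is not installed, for example when a fake is injected for testing.
        from swiftclient.client import Connection
        connection_factory = Connection
    # 1. Connect. The library authenticates and keeps the storage URL and
    #    token internally, and it retries failed requests on its own.
    conn = connection_factory(
        authurl=authurl, user=user, key=key, auth_version="1")
    try:
        # 2. Create the container.
        conn.put_container(name)
    finally:
        # 3. Close the connection.
        conn.close()
```

With TempAuth, the user argument takes the same tenant:user form as ST_USER, for example `create_container("austin", auth_url, "test:tester", password)`.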
So this is the difference between direct API calls and programming with SDKs. On this slide we can see it more clearly: with the client library, we only need to write three lines to do everything needed to create a container. If you want more information about the Python client library for Swift, visit that URL. Let's have a look at this page. This document is generated from the code, so it is not very readable, but read it carefully and you will get it. These are the basic functions that we use in development: create a container, read data, write data; put, get, and delete; delete the data, delete the containers. These are the basic operations.

But there is something beyond that. Take this photo, for example. I took it yesterday at the keynote session. On the right there is some information: the device I took the photo with, which is an iPhone, the size of the photo, and the location where I took it. We call this metadata. In a file system, metadata usually refers to the directories, the tree of the file system. In object storage, metadata usually refers to this kind of information. It is usually very useful, because a photo alone is not enough; we need more information to describe it. So if we need to implement something like this when programming with Swift, how can we do it? Swift provides an API, or something like an API, called metadata headers. We put a header that begins with X, with Meta in the middle, and in this way we can add metadata to accounts, containers, and objects.

Let's see how it works with the command line. Oh, we always see this warning. That is because, in Liberty, the Swift client uses an API of the Keystone client that Keystone does not recommend using, so it always triggers this warning.
It is not because of our code or our command line; it is because the Swift client does not use the recommended API of the Keystone client.

In this container, test, there is an object, demo. We can use swift stat on an account, and this command is very useful when we debug programs. We can also use it on a container and on an object. These are the details about this container and about this object. We can add some metadata to the object. The format of the metadata is shown here. We add a location, and now in the information here there should be a location. Yes, location appears here.

Yes, if you want to highlight the location on a map, you need to do more programming with an LBS (location-based service) or a map service. Here we just add some metadata. We can also do something more. Yes, you can add metadata to photos, to documents, to anything you upload to Swift, and you can use it with other related services. For example, you can feed this location into a map service, and it will highlight the point on the map. Here we can see that there is also a metadata item called topic. So we can use different metadata to describe our objects.

That was the command line. How do we do it when programming? We can see it with debug here. We just add a header to the request. The header starts with X-Object, which means this is metadata for an object, then Meta, and then the key of the metadata. So you can add OpenStack as the topic of this object. This operation is performed with POST. We use PUT to upload an object into Swift.
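The metadata header format described above can be sketched as follows. This is an illustration, not the command-line client's own code: it only builds the POST request with the standard library, the function names are hypothetical, and the storage URL in the usage comment is a placeholder.

```python
import urllib.parse
import urllib.request


def object_metadata_headers(metadata):
    """Turn {'topic': 'OpenStack'} into {'X-Object-Meta-Topic': 'OpenStack'}."""
    return {
        "X-Object-Meta-" + key.title(): str(value)
        for key, value in metadata.items()
    }


def build_metadata_post(storage_url, container, obj, token, metadata):
    """Build (but do not send) the POST that updates an object's metadata."""
    url = "/".join([
        storage_url.rstrip("/"),
        urllib.parse.quote(container, safe=""),
        urllib.parse.quote(obj),
    ])
    headers = {"X-Auth-Token": token}
    headers.update(object_metadata_headers(metadata))
    # An object POST carries headers only; the object data is not touched.
    return urllib.request.Request(url, headers=headers, method="POST")


# Usage (placeholder storage URL and token):
#   request = build_metadata_post(
#       "http://192.0.2.10:8080/v1/TEMPAUTH_test", "test", "demo",
#       token, {"topic": "OpenStack", "location": "Austin"})
#   urllib.request.urlopen(request)
```

One thing to keep in mind is that a POST to an object replaces the user metadata already set on it, so every item that should be kept has to be sent again.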
We use GET to read an object out of Swift, DELETE to delete an object, and POST to update the metadata. We should be careful: with POST we can only update metadata; we cannot overwrite the object. If we need to overwrite or modify the object, we need to use a PUT request, not a POST request. The POST request only modifies metadata.

Metadata can actually do more. For example, quotas for containers. If we add this metadata to a container, it limits the size of the container: how many bytes or how many objects you can put into it. This is useful because in some scenarios users may put millions of files into one container, which makes the container very slow. So sometimes we need quotas to limit these things.

Sometimes we also need to extend the functionality of OpenStack Swift, because the APIs Swift provides do not satisfy our needs. This is not very hard, because the Swift API is implemented on WSGI. In WSGI, we can think of it as a pipeline. A request comes into the pipeline, is modified by one filter after another, and finally reaches the proxy server, which returns the response. The response then goes back through all these filters to the end of the pipeline and to the user or client. So why can we use Keystone? Because we add this filter, which is Keystone. We can authenticate with Keystone, we can authenticate with TempAuth, and we can also add our own authentication mechanism to this pipeline. In a Swift cluster we can use more than one authentication method. In some scenarios, some containers need to be accessed with, for example, fingerprint or iris authentication, and others with just a code. That's OK.
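As an illustration of the pipeline idea, a filter is just a WSGI callable that wraps the next application. The sketch below is a toy gate, not a real Swift authentication system (a real one also has to issue tokens and map users to accounts). The header X-Demo-Passcode, the class name, and the default passcode are all invented for this example; only the filter_factory shape follows the paste.deploy convention that Swift pipelines use.

```python
class PasscodeGate:
    """WSGI filter: pass a request on only if it carries the passcode."""

    def __init__(self, app, passcode):
        self.app = app            # next element in the pipeline
        self.passcode = passcode

    def __call__(self, environ, start_response):
        # WSGI exposes the request header X-Demo-Passcode under this key.
        if environ.get("HTTP_X_DEMO_PASSCODE") != self.passcode:
            body = b"Unauthorized\n"
            start_response("401 Unauthorized", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Hand the request to the rest of the pipeline; the response
        # travels back through this filter on its way to the client.
        return self.app(environ, start_response)


def filter_factory(global_conf, **local_conf):
    """paste.deploy-style factory that builds the filter from config."""
    passcode = local_conf.get("passcode", "changeme")

    def factory(app):
        return PasscodeGate(app, passcode)

    return factory
```

Such a filter would be registered in the proxy server's configuration and named in its pipeline setting, ahead of the proxy server application itself.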
So we can support different authentication methods by adding our own middleware, or filter, here; which name you use depends on the WSGI context.

Here are some other things I want to mention. Swift nowadays supports not only small files but also very large files very well. The throughput can reach about one gigabyte per second or more, depending on the network. How do we upload big files at this high throughput? We split a file into segments. The segment size can be specified on the command line with the segment size parameter. And how do we use it when programming? Here we upload this 50-megabyte file and split it into segments, so the parts can be uploaded in parallel. We also make it output the debug information. We can see that Swift uploads every segment as a separate object, and at the end it uploads a single object of zero bytes. We call it the manifest; it describes which segments the object actually consists of.

Let me see. There is another container, test_segments, and the data is actually stored in that container, not in the container test. These are the data: there are six of them, and each is about 10 megabytes. In the test container, the object 50m.file is zero bytes in size, because it is just a manifest indicating that the data are actually stored in another container. This is how we upload and store big objects.

Also, the data in Swift can be accessed not only via the proxy server but also directly from a storage server. The APIs are very similar. So we can read this object directly from the storage node.
This is very flexible, because sometimes we want our data processing program to run directly on the server where the data are stored rather than going through the proxy server API. In other words, in Swift you can achieve data locality with this API by reading data from the storage node. There are many other features that we can use in application development. If you are interested, you can read the official Swift documentation on the OpenStack site. So that's all. Thank you very much. If you have any further questions, you can email me or contact me on WeChat.