Thank you, everyone, and thank you for coming back from the coffee break. The title today is a little different, maybe much different, from what you saw in the schedule; I will explain why later. I want to share something new, something we have been doing these months, some of it still work in progress, and I want to share the ideas and have some discussion. First of all, I would like to thank Seagate. Is anyone here from Seagate? This talk is about the gap between vendors and users. Seagate produces disks. Who uses disks? Developers and architects, and the vendors deliver OpenStack to the end users. But Seagate is not here, even though from one point of view this summit is a very good place for Seagate to promote their products. So this is the gap between vendors and users. This gap is everywhere; we face it every day. Still, I want to thank Seagate: the Seagate guys could not be here, but we did this work together. So, Swift and Ceph. Both are very popular in the OpenStack community. Swift was born at the dawn of OpenStack: OpenStack was born in 2010, and the development of Swift actually began in 2009. Ceph draws a lot of attention from the OpenStack community. Even though it is not part of OpenStack, many Stackers care about Ceph because it can provide unified storage: we can use Ceph as the Glance backend, the Nova backend, and the Cinder backend. It is very cool. But today I want to talk about object storage. First of all, what is an object store? Ceph is based on RADOS, the Reliable Autonomic Distributed Object Store, but that object store is not what we are talking about today. The object store we are talking about today is an object storage service: S3-like storage, where data is stored in buckets or containers, not in directories, not in file systems.
In Swift they are called containers; in Ceph and in S3 they are called buckets. And these systems provide RESTful HTTP APIs, so we can talk to the storage system using HTTP, the language of the internet. Let's see how they do this. Swift has proxy servers, account servers, container servers, and object servers. We put objects in containers, and containers in accounts, which are associated with tenants if we use Swift in a cloud environment, especially integrated with Keystone. In Ceph, we use the RADOS Gateway. The RADOS Gateway translates the RESTful object APIs into the native object API of RADOS, the Reliable Autonomic Distributed Object Store. The communication between the RADOS Gateway and the RADOS cluster uses sockets, while the RADOS Gateway provides RESTful APIs to the applications. One more thing about this picture: here the storage nodes run both the container server and the object server. In some cases we may want to put the container server on separate nodes, or put the container DBs on SSDs, because the container layer usually becomes the bottleneck of Swift. When I was preparing this talk, I found a very interesting thing. Let me ask: who in this room loves Ceph, and who loves Swift? I find that the ones who like Ceph almost never like Swift, and vice versa: the ones who like Swift do not like Ceph. This is very strange, and I do not think it is very clever, because clever people will not argue about which technology is good and which is not; they will choose technologies according to their use cases, their needs, their requirements. If you search Google, you can find many comparisons, and I do not want to repeat them, but some of them are really good. I recommend this one: it is from Mirantis, a talk from the summit in Vancouver, available on YouTube. It was a very good talk. He shared Mirantis's experience from a real case with a multi-region deployment
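As an illustration of the RESTful interface just mentioned: a Swift object is addressed by a URL of the form /v1/&lt;account&gt;/&lt;container&gt;/&lt;object&gt;, so an upload is simply an HTTP PUT to that path. A minimal sketch of building such request paths; the endpoint, account, and object names here are made up for illustration:

```python
def swift_object_url(endpoint, account, container, obj):
    """Swift addresses an object as /v1/<account>/<container>/<object>."""
    return f"{endpoint}/v1/{account}/{container}/{obj}"

def s3_object_url(endpoint, bucket, obj):
    """S3 (and the RADOS Gateway's S3 API) address it as /<bucket>/<object>."""
    return f"{endpoint}/{bucket}/{obj}"

# Hypothetical endpoint and names, for illustration only:
url = swift_object_url("http://proxy.example.com:8080",
                       "AUTH_demo", "photos", "cat.jpg")
# An upload is then: PUT <url> with the object bytes as the body, plus an
# auth token header (X-Auth-Token when Swift is integrated with Keystone).
```

The same photo stored behind the RADOS Gateway's S3 API would live at a bucket-style path instead, but both are plain HTTP, which is why any internet client can talk to them.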
that used both Ceph and Swift. We can deploy a multi-region storage system with Swift easily, but if we also want block storage, block storage for Nova and for Cinder, Swift cannot do that; we need Ceph. So in that case he used Ceph in every site, and at the same time used Swift to share some data between the data centers. This is very interesting, and it is a very good case showing how to choose technologies according to requirements instead of just comparing one with the other. Here are some questions that show the respective strengths of each system. If you want to deploy a multi-region storage system, Swift is easier. Ceph can also do it, but writes can only go to one of the regions, so it is not a true multi-region deployment. On the other hand, Swift does not provide immediate consistency. If you use Swift, you may face the problem we call eventual consistency: some writes have happened, but reads do not yet see the latest version; reads may return older versions, and only after a period will all reads return the latest version. Especially in a multi-region deployment, you may face this problem with Swift. If you want immediate consistency, strong consistency, you should use Ceph. For unified storage, you use Ceph for block storage, for Cinder, for Nova, for Glance; that is the core strength of Ceph. But consider user-defined functions for an object storage service: we may want to do on-the-fly compression, on-the-fly deduplication, or on-the-fly virus scanning of user-uploaded data. These can be implemented with Swift easily, because the Swift proxy is based on the WSGI framework, and we can put middleware into that pipeline. It is pluggable; it is very easy. Swift also provides what we call advanced storage features, for example object expiration and object versioning.
In fact, all of these features, and others, are implemented as WSGI middleware. Ceph is not based on WSGI because it is not written in Python, so for now Ceph cannot provide these features this way. Another question: you may need to take into account how many servers are available to build the system. We know that to make a distributed storage system reliable, you need three or more nodes; actually, in production, you may need at least four or five. So if you do not have many servers, if it is a small cloud and you want to use OpenStack and also provide an object storage service, you had better choose Ceph. There are other considerations, and I will not list all of them, but I do want to address some misunderstandings. We often hear that Swift is not good at storing large objects and that Ceph is not good at storing small files, but that is not really true. Swift supports large objects with a dedicated series of APIs; the details can be found on docs.openstack.org. As for Ceph and small files, there is an interesting report from a university lab. It is not a paper, but it contains detailed information about their tests on a small Ceph cluster, and the results show that Ceph can handle small files very well. But I need to say that the cluster was small. So these two questions are not really about the systems themselves: a lot depends on how you use your cluster and how you tune your cluster. For example, if we do not use the large-object APIs in Swift, the performance of storing large objects is bad; but if we use them correctly, it is much better. I did some tests.
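Because the Swift proxy is a WSGI pipeline, a feature like the on-the-fly checks mentioned above can be dropped in as middleware that wraps the next application in the chain. This is a toy illustration of that wrapping pattern, not Swift's actual middleware code (Swift registers its middlewares through paste.deploy filter factories); the size-cap policy here is invented for the example:

```python
def reject_large_uploads(app, max_bytes=1024 * 1024):
    """Toy WSGI middleware: reject PUTs whose Content-Length exceeds a cap.

    Swift's real middlewares (object expiration, versioning, etc.) follow
    the same pattern of wrapping the downstream WSGI application.
    """
    def middleware(environ, start_response):
        if environ.get("REQUEST_METHOD") == "PUT":
            length = int(environ.get("CONTENT_LENGTH") or 0)
            if length > max_bytes:
                start_response("413 Request Entity Too Large",
                               [("Content-Type", "text/plain")])
                return [b"object too large\n"]
        return app(environ, start_response)  # pass through to the next app
    return middleware

# A trivial backend app standing in for the rest of the proxy pipeline:
def backend(environ, start_response):
    start_response("201 Created", [("Content-Type", "text/plain")])
    return [b"stored\n"]

pipeline = reject_large_uploads(backend)
```

Compression, deduplication, or virus scanning would hook into the request body the same way, which is why adding such features to Swift is mostly a configuration change to the proxy pipeline.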
The bandwidth of the proxy server almost reached the limit of its 10-gigabit network, so throughput is not a problem for large objects. Now let's get back to object storage; I want to talk about something more interesting. This is a picture I took this noon: there was a talk from SwiftStack, and the title was that Swift is not just for OpenStack, but for many environments and apps. And there are some other numbers. I do not think they are based on a very comprehensive investigation, but the trend shows that funding in object storage is rising exponentially. What really drives this money? Why do people spend it on object storage? Providing storage for Glance? Obviously not; Glance is not worth this money. Object storage speaks the language of the internet, as we said before: HTTP. So what is happening on the internet? How many tweets have you pushed these two days? How many photos have you posted on Facebook? Tens of photos, everybody, I think. These are the real drivers of object storage: mobile apps and social applications, Twitter, eBay, Taobao in China, and WeChat, which Chinese users use. Tens or even hundreds of billions of images are stored on Taobao.com today; I do not have the exact numbers, but the number of images is very, very large. These are the dominant workloads on the internet and the dominant workloads for object storage. So let's look at this picture. How can we model the real-world workloads? We may test object storage with COSBench or ssbench.
Those tools will tell you the limit of your storage system, but what do users, what do customers, really care about? Requests from the internet obey the Poisson distribution. Let me give an example: suppose on average you receive 10 emails every day. That does not mean exactly 10 today and exactly 10 tomorrow; it is the average number. This is the characteristic of internet access: requests come in following a Poisson distribution, not at a constant rate. In this picture, lambda is the expected value, the average. We can see the distribution for lambda equal to 1, 4, and 10, where k is the number of arrivals per unit of time, for example per second. It is not constant; it is a probability distribution. So what does the customer really care about? When the average rate is 10 or 100 per second, what is the average latency from a request to its response? If we use a mobile app and want to pull a photo, we cannot wait 10 seconds or even 5 seconds. Actually, a mobile app needs to deliver the photo in one second or less. So the latency between request and response is very important. This is what internet applications and mobile applications really need, not the limit, not for example the maximum IOPS or requests per second the storage system can support. Instead: when the average workload is, for example, 100 requests per second, what is the average latency from request to response? What is the maximum latency? What is the 90th-percentile latency? The 80th-percentile latency? These are the real numbers.
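The arrival pattern described here, an average rate lambda but not a constant rate, is a Poisson process, and it is easy to simulate by drawing exponentially distributed gaps between consecutive arrivals. A minimal sketch, not the actual benchmark tool from the talk:

```python
import random

def poisson_arrivals(rate_per_sec, duration_sec, seed=42):
    """Generate request arrival timestamps for a Poisson process.

    In a Poisson process with average rate lambda, the gaps between
    consecutive arrivals are exponentially distributed with mean 1/lambda.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    t, arrivals = 0.0, []
    while True:
        t += rng.expovariate(rate_per_sec)  # draw the next inter-arrival gap
        if t >= duration_sec:
            break
        arrivals.append(t)
    return arrivals

# Example: an average of 10 requests/second over 60 seconds. Any single
# second may see more or fewer than 10 arrivals; only the average holds.
arrivals = poisson_arrivals(rate_per_sec=10, duration_sec=60)
observed_rate = len(arrivals) / 60
```

A load generator built this way fires each request at its drawn timestamp instead of at a fixed interval, which is what makes the measured latencies representative of bursty internet traffic.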
So, the next question is how we can get that number, how we can test an object storage system to measure the latency. These tests ran on Kinetic drives, which I will explain later. Let's assume there is a mobile social application with 100,000 users. Every day, 30% of the users are active. One active user opens this app 30 times every day, and each time he opens it, it performs five reads; think about pulling images from Facebook or from Twitter. And 80% of the workload happens within two hours, maybe on the way to work or during lunch. So we can compute the number: about 500 requests per second on average to the object storage system serving the images for this mobile social application. And we developed a small benchmark tool together with Seagate. First, I want to explain what Kinetic is. The Kinetic Open Storage Project is a project of the Linux Foundation. The project was launched this year, but Kinetic drives were developed two or three years ago. On the left is a Kinetic drive. Different from conventional drives, conventional disks, this drive does not provide a block interface, but a key-value interface, and it is connected via Ethernet, not via a SAS bus or Fibre Channel. Every drive has its own IP address, and you access these drives through the Kinetic libraries. This is very similar to what Ceph is doing. What does Ceph do? In Ceph there are OSDs, object storage daemons, and there are libraries that perform reads and writes to the OSDs, and the OSDs essentially provide a key-value interface. So a Kinetic drive is a kind of object storage device, I think. Then what is Kinetic-based Swift? We can see the architecture here.
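The back-of-envelope estimate above can be written out explicitly. All inputs are the assumed numbers from the example in the talk, not measured data:

```python
users = 100_000
daily_active = 0.30          # 30% of users are active on a given day
sessions_per_user = 30       # each active user opens the app 30 times/day
reads_per_session = 5        # five reads every time the app is opened
peak_share = 0.80            # 80% of the traffic lands in a 2-hour window
peak_window_sec = 2 * 3600

reads_per_day = users * daily_active * sessions_per_user * reads_per_session
peak_rate = reads_per_day * peak_share / peak_window_sec
print(reads_per_day)  # → 4,500,000 reads per day
print(peak_rate)      # → 500.0 requests/second during the busy window
```

That 500 requests/second becomes the lambda fed into the Poisson load generator, which is the whole point: the customer supplies numbers he actually knows, and the benchmark translates them into a realistic arrival rate.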
In conventional Swift, the account server, container server, and object server run on storage nodes, which are x86 PC servers. For Kinetic-based Swift, we still have PC servers, but far fewer of them: only the proxy server nodes. Actually, we run the account server, the container server, and the object server on those proxy nodes as well, so we also call them PACO servers, because Proxy, Account, Container, and Object all run on the same node. But the object server does not store data on the local disks of those servers; it stores data on the Kinetic drives. So we do not need dedicated storage servers here. We can save a lot of servers, and the storage density rises. There was a detailed talk about this at the 2014 summit; if you are interested, you can find it on YouTube. This is the environment we tested. We can use the benchmark tool to test both Kinetic-based Swift and conventional Swift; both are planned, but the work is in progress, so we did Kinetic-based Swift first. These are the PACO servers, and this is what we call a Kinetic box or Kinetic enclosure. It provides 10-gigabit network access for reading and writing the data on these disks, and it is one rack unit with 12 drives, every drive four terabytes, so the storage density is higher than with servers. One PACO node can connect to several Kinetic enclosures; in this test we only used one, but usually it can be five or even ten, depending on your use case and your workloads. The benchmark tool also runs on this server and generates load that obeys the Poisson distribution. I can show a demo here.
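To make the key-value idea concrete: the object server talks to each Kinetic drive through put/get/delete operations rather than a filesystem. This toy class only mimics the shape of that interface in memory; it is not the real Kinetic protocol or library, and the address and keys are invented for the example:

```python
class FakeKineticDrive:
    """In-memory stand-in for one Kinetic drive's key-value interface.

    A real Kinetic drive is reached over Ethernet at its own IP address;
    this sketch only illustrates the put/get/delete interface shape.
    """
    def __init__(self, address):
        self.address = address   # a real drive would be dialed via this IP
        self._store = {}

    def put(self, key: bytes, value: bytes):
        self._store[key] = value

    def get(self, key: bytes) -> bytes:
        return self._store[key]

    def delete(self, key: bytes):
        self._store.pop(key, None)

# An object server could map a Swift object path to a drive key:
drive = FakeKineticDrive("192.0.2.10")  # hypothetical drive address
drive.put(b"/AUTH_demo/photos/cat.jpg", b"...jpeg bytes...")
blob = drive.get(b"/AUTH_demo/photos/cat.jpg")
```

Because the drive itself speaks this interface, the PACO node only routes and replicates keys; no intermediate storage server has to translate between HTTP objects and block addresses, which is where the server savings come from.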
I'll try it. In this version, the load is generated against one container, named NECO. We can see the benchmark running, and here is a number: operations outstanding on arrival. It means, when a request arrives, how many earlier requests have not yet been responded to. We can see this concurrency is not constant; it comes from the distribution. This test is very short, but we can run it for minutes or for hours. The average latency is about 100 milliseconds, the minimum latency is about 12 milliseconds, and the maximum latency is about 400 milliseconds. So this is a very simple benchmark, and we are improving it; that demo ran on my laptop. In the real test environment, the object size was 1 megabyte, and we got a maximum latency of about 500 milliseconds and an average latency of about 66 milliseconds. The result is acceptable. We will try some other parameters, because in the real world the object sizes are not all the same. Here all of the objects were 1 megabyte, but in the real world the sizes differ; maybe they are normally distributed. And the workload should be hybrid: this test was reads only, but in the real world there will not be only reads or only writes, there will be a mix. So this work is in progress. The key point is that these input numbers are what customers really know. A customer knows how many users he has or will have, and he can get the daily active percentage of his users. Some of the other numbers depend on how he designed his application, and some depend on how users actually use it. But these are numbers the customer can really get, and from them he derives the request rate.
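Summary statistics like the ones just quoted (average, minimum, maximum, 90th-percentile latency) are straightforward to compute from the recorded per-request latencies. A sketch using made-up sample data, not measurements from the talk:

```python
def percentile(samples, p):
    """Nearest-rank percentile of a list of latency samples."""
    ordered = sorted(samples)
    rank = max(1, round(p / 100 * len(ordered)))  # nearest-rank method
    return ordered[rank - 1]

# Hypothetical per-request latencies in milliseconds from one short run.
latencies_ms = [12, 20, 35, 40, 55, 66, 80, 95, 120, 400]

avg_ms = sum(latencies_ms) / len(latencies_ms)
worst_ms = max(latencies_ms)                # the single slowest request
p90_ms = percentile(latencies_ms, 90)       # 90% of requests were faster
```

Note how one slow outlier (400 ms) drags the maximum far above the 90th percentile; this is exactly why the talk argues for reporting percentile latencies rather than just the average or the worst case.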
And then he knows how many devices and servers he needs to buy, and how much he needs to spend on them. So, that's all. Thank you.
Great presentation. Have you benchmarked Seagate's Kinetic drive with Swift?
Only with Swift.
You haven't tested with Ceph?
No, not with Ceph.
Last year, at the Vancouver Summit, the Toshiba guys were benchmarking their KV drives with Ceph. It's pretty cool. The KV drive from Toshiba has 256 GB of cache in it.
Yeah, maybe the interfaces of the devices are different.
No, it's the Kinetic API. The Toshiba guys haven't released it as a product yet, obviously. But you could take the Seagate drive and test it with Ceph.
Okay, something to look into. Thank you.
Again, I think this is one of the best presentations I've seen in a long time.
Thank you.
And I like the tool that you're creating. Is it available somewhere so we can help you?
Yes, it is on GitHub, but the code on GitHub is not the latest version. I'll upload the latest code in the next month.
Okay, I look forward to it.
There are a few bugs I feel bad about and want to fix.
Thank you very much.
You're welcome.
How do we find it?
You can search on GitHub for Knobz.
Oh, Knobz, okay.
Maybe the best way is not a standalone benchmark tool, but to rewrite the load generator code in COSBench or ssbench.
Yeah, correct.
About Swift with Kinetic drives: Kinetic also supports Ceph, but I don't know how it will perform. If you integrate it with Ceph, you still need some servers, so it is much more complex to develop a Kinetic-based Ceph than a Kinetic-based Swift. This is simpler: you just change these drives to Kinetic drives.
I do not know much about the marketing or the sales, but I heard that they are not very much more expensive than ordinary drives; a little more expensive, but not much. So you save servers, you save nodes. Yes, it is based on the fact that the computational workload is not very high for object storage. They use ARM: Intel CPUs and bridge chips make for a very expensive architecture, but ARM is cheap. You're welcome.